Fixing Slow Hive on Tez Performance

DodaTech Updated 2026-06-24 3 min read

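This tutorial shows how to diagnose and fix slow Hive queries on Tez. It covers reading the Tez UI, sizing containers, adjusting split size, reusing Tez sessions, handling data skew, enabling vectorization and related optimizations, and tuning the reducer count.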

Symptom: Hive queries running on Tez progress slowly or fail, with messages such as:

Hive on Tez: Vertex failed, vertex killed due to container limit
Status: Running (Tez) — Slow progress

Tez is the default execution engine in many Hive distributions (hive.execution.engine=tez). Slow performance is usually caused by insufficient container resources, too few reducers, data skew in Tez vertices, or suboptimal Tez session settings. The Tez UI (port 8080 in the examples here; the actual port depends on your deployment) provides detailed DAG visualizations.

Step-by-Step Fix

1. Check the Tez UI

RIGHT — analyze the DAG:

http://tez-ui-host:8080/tez-ui/#/view/dags

Look for:

- Individual vertex run times (skew = one runs much longer)
- Shuffle errors or retries
- Container allocation delays
- Memory limits hit

2. Tune Tez container sizes

WRONG — default container sizes too small for production:

-- Default (often too low)
SET hive.tez.container.size=1024;  -- 1GB
SET hive.tez.java.opts=-Xmx768m;

RIGHT — increase for production workloads:

SET hive.tez.container.size=4096;  -- 4GB
SET hive.tez.java.opts=-Xmx3276m;  -- 80% of container size
SET hive.auto.convert.join.noconditionaltask.size=536870912;  -- 512MB for map-join (value is in bytes)
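
The 80% heap rule can be checked with a few lines of Python. This is a minimal sketch and not part of Hive: the helper names `tez_heap_opts` and `tez_container_settings` are hypothetical, and the 0.8 fraction is simply the ratio used in this guide.

```python
def tez_heap_opts(container_mb: int, heap_fraction: float = 0.8) -> str:
    """Return a -Xmx flag sized as a fraction of the container (in MB)."""
    if container_mb <= 0 or not 0 < heap_fraction < 1:
        raise ValueError("container_mb must be positive and heap_fraction in (0, 1)")
    heap_mb = int(container_mb * heap_fraction)  # round down to stay inside the container
    return f"-Xmx{heap_mb}m"


def tez_container_settings(container_mb: int) -> list:
    """Build the two SET statements that should always change together."""
    return [
        f"SET hive.tez.container.size={container_mb};",
        f"SET hive.tez.java.opts={tez_heap_opts(container_mb)};",
    ]


if __name__ == "__main__":
    for line in tez_container_settings(4096):
        print(line)
```

Generating both values from one number avoids raising the container size while leaving the JVM heap at its old, smaller value.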

3. Reduce input split size for parallelism

WRONG — default 256MB splits create too few tasks:

-- Default 256MB splits
SET mapreduce.input.fileinputformat.split.maxsize=268435456;

RIGHT — smaller splits for better parallelism:

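SET mapreduce.input.fileinputformat.split.maxsize=67108864;  -- 64MB splits
SET mapreduce.input.fileinputformat.split.minsize=33554432;   -- 32MB min
-- On Tez, splits are then grouped per task; these bounds control the grouped size
SET tez.grouping.min-size=33554432;   -- 32MB
SET tez.grouping.max-size=67108864;   -- 64MB
SET hive.tez.cpu.vcores=4;  -- CPU allocation per container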
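
To see why a smaller split size raises parallelism, the arithmetic can be sketched as below. This is only a rough estimate under stated assumptions: the function name `estimated_map_tasks` and the 10 GB input are hypothetical, and the real task count also depends on file boundaries, file format, and Tez split grouping.

```python
MB = 1024 * 1024
GB = 1024 * MB


def estimated_map_tasks(input_bytes: int, max_split_bytes: int) -> int:
    """Rough task count: ceiling of input size divided by the maximum split size."""
    if input_bytes < 0 or max_split_bytes <= 0:
        raise ValueError("input_bytes must be >= 0 and max_split_bytes > 0")
    return (input_bytes + max_split_bytes - 1) // max_split_bytes  # integer ceiling


if __name__ == "__main__":
    table_bytes = 10 * GB  # hypothetical input size
    for split_mb in (256, 64):
        tasks = estimated_map_tasks(table_bytes, split_mb * MB)
        print(f"{split_mb}MB splits -> about {tasks} tasks")
```

More tasks only help while the cluster has free containers to run them, so compare the estimate with the available capacity before shrinking splits further.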

4. Enable Tez session reuse

WRONG — creating a new Tez session per query:

-- Each query creates a new Tez session (slow)

RIGHT — reuse sessions:

-- The session pool is created when HiveServer2 starts, so a per-session SET
-- does not enable it. Configure the pool in hive-site.xml and restart HiveServer2.

In hive-site.xml:

<property>
    <name>hive.server2.tez.initialize.default.sessions</name>
    <value>true</value>
</property>
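<property>
    <name>hive.server2.tez.default.queues</name>
    <value>default</value>
</property>
<!-- Sessions kept open per queue; 4 is an example value, size it to the expected concurrency -->
<property>
    <name>hive.server2.tez.sessions.per.default.queue</name>
    <value>4</value>
</property>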

5. Handle data skew in Tez

WRONG — skewed data causing reducer hot spots:

SELECT /*+ MAPJOIN(dim) */ fact.key, COUNT(*)
FROM fact JOIN dim ON fact.key = dim.key
GROUP BY fact.key;

RIGHT — enable skew handling:

SET hive.groupby.skewindata=true;  -- Handle GROUP BY skew
SET hive.tez.dynamic.partition.pruning=true;  -- Dynamic partition pruning

-- Or salt the key and aggregate in two stages (partial counts, then totals):
SELECT key, SUM(partial_cnt) AS cnt
FROM (
  SELECT key, salt, COUNT(*) AS partial_cnt
  FROM (SELECT key, CAST(FLOOR(RAND() * 10) AS INT) AS salt FROM my_table) s
  GROUP BY key, salt
) p
GROUP BY key;
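
The effect of salt keys can be simulated outside Hive. The sketch below is illustrative only: `salted_counts`, the 10 buckets, and the sample data are assumptions. It shows that a hot key is spread over several partial groups, and that a second aggregation is needed to merge the partial counts back into one total per key.

```python
import random
from collections import Counter


def salted_counts(keys, buckets=10, seed=0):
    """Count keys in two stages: per (key, salt) group, then per key."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    # Stage 1: each row gets a random salt, so one hot key lands in many groups.
    partial = Counter((key, rng.randrange(buckets)) for key in keys)
    # Stage 2: merge the partial counts back into one total per original key.
    totals = Counter()
    for (key, _salt), count in partial.items():
        totals[key] += count
    return partial, totals


if __name__ == "__main__":
    rows = ["hot"] * 1000 + ["cold"] * 10  # hypothetical skewed column
    partial, totals = salted_counts(rows)
    largest = max(n for (k, _s), n in partial.items() if k == "hot")
    print("largest partial group for 'hot':", largest)
    print("final totals:", dict(totals))
```

Without salting, a single reducer would receive all 1000 "hot" rows; with it, the largest stage-one group is only a fraction of that.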

6. Enable vectorization and optimizations

-- Enable vectorized query execution
SET hive.vectorized.execution.enabled=true;
SET hive.vectorized.execution.reduce.enabled=true;

-- Enable CBO and statistics
SET hive.cbo.enable=true;
SET hive.compute.query.using.stats=true;
SET hive.stats.fetch.column.stats=true;

-- Enable ORC predicate pushdown
SET hive.optimize.index.filter=true;

-- Enable LLAP (Live Long and Process); requires LLAP daemons to be running on the cluster
SET hive.llap.execution.mode=auto;
SET hive.llap.io.enabled=true;

7. Tune reducer count

-- Use auto-reducer detection
SET hive.exec.reducers.bytes.per.reducer=67108864;  -- 64MB per reducer
SET hive.exec.reducers.max=100;
SET hive.tez.auto.reducer.parallelism=true;

-- Or set manually (0.95 * containers)
SET mapreduce.job.reduces=50;
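
How these settings interact can be sketched numerically. The helpers below are hypothetical (`estimated_reducers`, `manual_reducers`) and only approximate the starting estimate: data size divided by bytes per reducer, capped at the maximum. With hive.tez.auto.reducer.parallelism enabled, Tez may still adjust the count at runtime.

```python
MB = 1024 * 1024
GB = 1024 * MB


def estimated_reducers(data_bytes: int,
                       bytes_per_reducer: int = 64 * MB,
                       max_reducers: int = 100) -> int:
    """Ceiling of data size over bytes per reducer, clamped to [1, max_reducers]."""
    if data_bytes < 0 or bytes_per_reducer <= 0 or max_reducers <= 0:
        raise ValueError("sizes and limits must be positive")
    wanted = (data_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    return max(1, min(max_reducers, wanted))


def manual_reducers(containers: int) -> int:
    """The 0.95 * containers rule of thumb, using integer arithmetic."""
    return max(1, containers * 95 // 100)


if __name__ == "__main__":
    print(estimated_reducers(1 * GB))   # small shuffle: well under the cap
    print(estimated_reducers(10 * GB))  # large shuffle: limited by the cap
    print(manual_reducers(53))
```

If the estimate keeps hitting the cap, either raise hive.exec.reducers.max or increase the bytes per reducer, depending on whether the cluster has spare containers.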

Expected result: Hive on Tez queries complete noticeably faster. To confirm, re-run the query and compare vertex run times in the Tez UI before and after the changes.

Prevention

  • Set hive.tez.container.size to at least 4GB for production.
  • Enable Tez session pooling for consistent performance.
  • Use ORC format with vectorization for analytic queries.
  • Monitor Tez UI for vertex-level performance bottlenecks.
  • Enable CBO and keep table statistics updated.

FAQ

What is the difference between MapReduce and Tez execution?

Tez creates a single DAG (Directed Acyclic Graph) for the entire query, avoiding the intermediate HDFS writes that MapReduce requires between stages. Tez reuses containers and pipelines data between vertices in memory, making it much faster for multi-stage queries.

What is LLAP in Hive?

LLAP (Live Long and Process) is a persistent query execution daemon that keeps containers running across queries. It caches data in memory and provides faster query responses. Enable with SET hive.llap.execution.mode=all for interactive workloads.

How do I debug slow Tez vertices?

Check the Tez UI vertex details:

  1. "Time spent" — is the vertex CPU-bound or IO-bound?
  2. "Number of tasks" — are there enough parallel tasks?
  3. "Shuffle errors" — network problems between vertices?
  4. "Failed tasks" — container limits or data errors?
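
For the skew case specifically, task durations noted from the Tez UI can be compared numerically. This is a rough sketch under stated assumptions: the durations are typed in by hand, and `skew_ratio`, `is_skewed`, and the threshold of 3.0 are illustrative choices rather than anything Tez defines.

```python
from statistics import median


def skew_ratio(task_seconds):
    """Ratio of the slowest task to the median task duration."""
    if not task_seconds:
        raise ValueError("need at least one task duration")
    mid = median(task_seconds)
    if mid <= 0:
        raise ValueError("median duration must be positive")
    return max(task_seconds) / mid


def is_skewed(task_seconds, threshold=3.0):
    """Flag a vertex when one task runs far longer than the typical task."""
    return skew_ratio(task_seconds) >= threshold


if __name__ == "__main__":
    reducer_tasks = [10, 11, 9, 10, 95]  # hypothetical durations in seconds
    print("skew ratio:", skew_ratio(reducer_tasks))
    print("skewed:", is_skewed(reducer_tasks))
```

A high ratio points toward the skew handling in step 5, while uniformly slow tasks point toward container sizing or parallelism instead.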
