Hadoop MapReduce Job Slow Fix

DodaTech Updated 2026-06-24 3 min read

This guide walks through diagnosing and fixing a slow Hadoop MapReduce job, with a configuration or code example for each step.

A MapReduce job runs much slower than expected:

Map 100% reduce 0%  (Stuck at shuffle phase)
Elapsed time: 45 minutes and counting

Slow MapReduce jobs are usually caused by data skew (one reducer gets most of the data), too few reducers, excessive spills to disk, or missing or poorly chosen compression. On YARN clusters, the ResourceManager web UI (port 8088 by default) shows the progress of each task; on Hadoop 1 clusters this was the JobTracker UI.

Step-by-Step Fix

1. Check the ResourceManager (JobTracker) UI

RIGHT — analyze the running job:

http://jobtracker-host:8088/cluster

Look at:

- Map phase duration: Are all mappers taking similar time?
- Reduce phase: Is one reducer at 99% while others are done?
- Shuffle bytes: One reducer shuffling far more bytes than the others indicates data skew
- Spilled records: Spilled records well above map output records = inefficient memory use
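
To put a number on the skew check above, one option is to compare the largest per-reducer shuffle size with the average. The sketch below is plain Java with no Hadoop dependency; the class name, method name, and sample values are hypothetical, and the input is assumed to be the per-reducer shuffle byte counts copied from the job's counters.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Hadoop. Feed it the per-reducer
// shuffle byte counts read from the job's counters page.
class ShuffleSkew {
    /** Ratio of the largest reducer input to the mean reducer input. */
    static double skewRatio(long[] shuffleBytesPerReducer) {
        if (shuffleBytesPerReducer.length == 0) {
            throw new IllegalArgumentException("need at least one reducer");
        }
        long max = Arrays.stream(shuffleBytesPerReducer).max().getAsLong();
        double mean = Arrays.stream(shuffleBytesPerReducer).average().getAsDouble();
        // An all-zero input has no skew to report
        return mean == 0 ? 1.0 : max / mean;
    }

    public static void main(String[] args) {
        long[] balanced = {100, 110, 90, 100};   // made-up sample values
        long[] skewed = {100, 100, 100, 700};    // one reducer gets most of the data
        System.out.println("balanced ratio: " + skewRatio(balanced));
        System.out.println("skewed ratio:   " + skewRatio(skewed));
    }
}
```

A ratio close to 1 suggests the reducers are evenly loaded; a ratio of several times the mean points at the skew fix in the next step. The threshold at which to act is a judgment call, not a Hadoop setting.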

2. Fix data skew with custom partitioner

WRONG — using default hash partitioner on a skewed key:

// Default: job.setPartitionerClass(HashPartitioner.class);

RIGHT — implement a custom partitioner:

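import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.partition.HashPartitioner;

// Extends HashPartitioner (not the abstract Partitioner) so super.getPartition() exists
public class SkewAwarePartitioner extends HashPartitioner<Text, IntWritable> {
    private int nextHotPartition = 0;

    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        String k = key.toString();
        if (k.startsWith("hot_key")) {
            // Distribute hot keys across multiple partitions (round-robin).
            // Each reducer then emits a partial result for the hot key,
            // so the partials must be merged in a follow-up step.
            nextHotPartition = (nextHotPartition + 1) % numPartitions;
            return nextHotPartition;
        }
        return super.getPartition(key, value, numPartitions);
    }
}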

job.setPartitionerClass(SkewAwarePartitioner.class);
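
A related technique, named here as an alternative rather than something this guide prescribes, is key salting: the mapper appends a small suffix to a hot key so that the ordinary hash partitioner spreads it out, and a second pass strips the suffix and merges the partial results. The sketch below is plain Java without Hadoop; the class, the `#` separator, and the bucket count are assumptions for illustration, and it applies the hash-and-modulo formula to `String.hashCode()` rather than `Text.hashCode()`.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of key salting; not a Hadoop class.
class KeySalting {
    /** Hash-and-modulo partitioning, applied here to String.hashCode(). */
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /** Appends a salt in [0, buckets) so one hot key becomes several distinct keys. */
    static String salt(String key, long recordIndex, int buckets) {
        return key + "#" + (recordIndex % buckets);
    }

    /** Removes the salt again; a second aggregation pass would group on this. */
    static String unsalt(String saltedKey) {
        int i = saltedKey.lastIndexOf('#');
        return i < 0 ? saltedKey : saltedKey.substring(0, i);
    }

    public static void main(String[] args) {
        int numPartitions = 8;
        Set<Integer> plain = new HashSet<>();
        Set<Integer> salted = new HashSet<>();
        for (long i = 0; i < 1000; i++) {
            plain.add(partitionFor("hot_key", numPartitions));
            salted.add(partitionFor(salt("hot_key", i, 16), numPartitions));
        }
        System.out.println("partitions used without salt: " + plain.size());
        System.out.println("partitions used with salt:    " + salted.size());
    }
}
```

The trade-off is the same as with any scheme that splits one key: the first job only produces partial aggregates for the hot key, so a small follow-up job (or a final merge step) is needed to combine them.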

3. Increase reducers for large data

WRONG — default 1 reducer:

# Default: job.setNumReduceTasks(1);

RIGHT — set appropriate reducer count:

# General formula: 0.95 * (node_count * max_reducers_per_node)
# -D is only honored if MyJob parses generic options (for example via ToolRunner)
hadoop jar myjob.jar MyJob -D mapreduce.job.reduces=50

Or in Java:

job.setNumReduceTasks(50);
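
The general formula above can be worked out in a few lines. This is a minimal sketch in plain Java; the class and method names are hypothetical, and the node and per-node figures in `main` are made-up inputs, not recommendations.

```java
// Hypothetical helper for the 0.95 * (node_count * max_reducers_per_node) rule.
class ReducerCount {
    /** Applies the 0.95 factor with integer arithmetic and never returns less than 1. */
    static int suggestedReducers(int nodeCount, int maxReducersPerNode) {
        if (nodeCount <= 0 || maxReducersPerNode <= 0) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        return Math.max(1, nodeCount * maxReducersPerNode * 95 / 100);
    }

    public static void main(String[] args) {
        // Example cluster: 13 nodes, 4 concurrent reducers per node (assumed values)
        System.out.println("suggested reducers: " + suggestedReducers(13, 4));
    }
}
```

The result can be passed to `job.setNumReduceTasks(...)` or to `-D mapreduce.job.reduces=...`. The 0.95 factor keeps the reducer count just under the cluster's capacity so that all reducers can start in a single wave.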

4. Enable compression

WRONG: leaving map output uncompressed, which increases disk and network I/O during the shuffle.

RIGHT — enable map output compression:

-D mapreduce.map.output.compress=true
-D mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec

And job output compression:

-D mapreduce.output.fileoutputformat.compress=true
-D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec
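
To get a feel for why compression cuts shuffle I/O, the sketch below compresses a buffer of repetitive key/value records, similar in shape to typical map output. It uses the JDK's `GZIPOutputStream` only because it needs no extra libraries; it is not the Snappy codec configured above, and real ratios depend on the data and codec. The class name and the sample record layout are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo; shows the size effect of compression on repetitive records.
class CompressionEffect {
    /** Builds tab-separated "key<TAB>1" lines with only 50 distinct keys. */
    static byte[] sampleMapOutput(int records) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < records; i++) {
            sb.append("user_").append(i % 50).append('\t').append(1).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Compresses the bytes in memory with gzip. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = sampleMapOutput(10_000);
        byte[] compressed = gzip(raw);
        System.out.println("raw bytes:        " + raw.length);
        System.out.println("compressed bytes: " + compressed.length);
    }
}
```

Map output with many repeated keys tends to compress well, which is why enabling `mapreduce.map.output.compress` usually pays off even though it costs some CPU.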

5. Tune memory and JVM settings

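WRONG: leaving the default 1 GB of memory per mapper when tasks spill heavily.

RIGHT: increase mapper and reducer container memory, and keep each JVM heap (-Xmx) at roughly 80% of its container so there is headroom for non-heap memory:
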
-D mapreduce.map.memory.mb=2048
-D mapreduce.reduce.memory.mb=4096
-D mapreduce.map.java.opts="-Xmx1638m"
-D mapreduce.reduce.java.opts="-Xmx3276m"
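
The heap values above work out to about 80% of each container size (1638 of 2048 MB, 3276 of 4096 MB). The sketch below reproduces that arithmetic; the 80% ratio is inferred from these numbers rather than being a fixed Hadoop rule, and the class and method names are hypothetical.

```java
// Hypothetical helper that derives a -Xmx value from a container size in MB.
class HeapSizing {
    /** Returns a JVM option with the heap set to 80% of the container, rounded down. */
    static String javaOptsFor(int containerMb) {
        if (containerMb <= 0) {
            throw new IllegalArgumentException("container size must be positive");
        }
        int heapMb = containerMb * 8 / 10;
        return "-Xmx" + heapMb + "m";
    }

    public static void main(String[] args) {
        System.out.println("map:    " + javaOptsFor(2048));
        System.out.println("reduce: " + javaOptsFor(4096));
    }
}
```

Keeping the heap below the container size matters because the container limit also covers non-heap memory; a heap equal to the container size risks the task being killed for exceeding its allocation.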

6. Use a Combiner for map-side aggregation

WRONG — all aggregation done in reducer:

// No combiner set — all data goes to shuffle

RIGHT — add a combiner:

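job.setCombinerClass(MyReducer.class);
// The combiner is the reducer class run on the map side
// for local aggregation, reducing shuffle data.
// Reusing the reducer is only safe when its operation is commutative and
// associative (sum, max), because Hadoop may run the combiner zero or more times.

// Or write a dedicated combiner:
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MyCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) sum += val.get();
        // Emit one partial sum per key instead of one record per occurrence
        context.write(key, new IntWritable(sum));
    }
}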
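
To see how much a sum combiner can shrink one mapper's output, here is a small simulation in plain Java without Hadoop. It assumes each map output record is a key with a count of 1, as in a word count; the class name and sample keys are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of map-side aggregation; not a Hadoop class.
class CombinerEffect {
    /** Sums a count of 1 per occurrence, as a sum combiner would for one mapper. */
    static Map<String, Integer> combine(List<String> mapOutputKeys) {
        Map<String, Integer> partialSums = new LinkedHashMap<>();
        for (String key : mapOutputKeys) {
            partialSums.merge(key, 1, Integer::sum);
        }
        return partialSums;
    }

    public static void main(String[] args) {
        List<String> mapOutputKeys = Arrays.asList("a", "b", "a", "a", "c", "b");
        Map<String, Integer> combined = combine(mapOutputKeys);
        System.out.println("records before combine: " + mapOutputKeys.size());
        System.out.println("records after combine:  " + combined.size());
        System.out.println("partial sums: " + combined);
    }
}
```

The fewer distinct keys a mapper sees relative to its record count, the larger the reduction in shuffle data.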

Expected output: the MapReduce job completes in significantly less time.

Prevention

Analyze data distribution before running jobs.
Use custom partitioners for skewed data.
Always enable intermediate compression for production jobs.
Set appropriate reducer count (0.95 * node capacity).
Use Combiners to reduce shuffle data volume.

Common Mistakes with Slow MapReduce Jobs

Leaving the job at the default of 1 reducer, so a single task processes all of the shuffle data
Keeping the default hash partitioner on a skewed key, so one reducer receives most of the records
Running without a combiner or map output compression, so every map output record is written and shuffled uncompressed

These mistakes appear frequently in real-world Hadoop code. DodaTech's contributors have identified these patterns through analysis of open-source projects and production systems.

Practice Exercise

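Take a job that sums counts per key and run it on input where one key dominates. Run it once with the defaults, then again with a combiner and map output compression enabled, and compare the shuffle bytes and spilled records of the two runs.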

This exercise reinforces the concepts covered in this guide. Try implementing it before checking online solutions.

FAQ

What is the ideal number of reducers?

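A good rule of thumb is 0.95 * (cluster_nodes * maximum concurrent reducers per node; on clusters that run TaskTrackers this is mapreduce.tasktracker.reduce.tasks.maximum), or 1 reducer per 1-2 GB of data. Too many reducers cause scheduling overhead and many small output files; too few limit parallelism and leave each reducer with too much data.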
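
The per-GB rule of thumb can be sketched as a ceiling division. The class and method names below are hypothetical, and the 100 GB input is a made-up figure.

```java
// Hypothetical helper for the "1 reducer per 1-2 GB of data" rule of thumb.
class ReducersBySize {
    static final long GB = 1024L * 1024L * 1024L;

    /** Number of reducers needed so each handles at most bytesPerReducer. */
    static int reducersFor(long inputBytes, long bytesPerReducer) {
        if (inputBytes < 0 || bytesPerReducer <= 0) {
            throw new IllegalArgumentException("invalid sizes");
        }
        long count = (inputBytes + bytesPerReducer - 1) / bytesPerReducer; // ceiling
        return (int) Math.max(1, count);
    }

    public static void main(String[] args) {
        long input = 100 * GB; // assumed shuffle volume
        System.out.println("at 2 GB per reducer: " + reducersFor(input, 2 * GB));
        System.out.println("at 1 GB per reducer: " + reducersFor(input, 1 * GB));
    }
}
```

The two results bracket a reasonable range; the node-based formula then indicates whether the cluster can run that many reducers in one wave.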

Why is my job stuck at "reduce 99%"?

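Reduce progress counts the copy, sort, and reduce phases as roughly a third each, so at 99% the shuffle and sort are done and the job is waiting on the last reducers to finish their reduce phase. This usually happens when one reducer receives far more data than the others (data skew). Check whether one reducer's input is much larger than the rest. Fix with a custom partitioner or increase the reducer count.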

Does speculative execution help slow jobs?

Speculative execution launches duplicate tasks on different nodes. It helps when nodes are slow due to hardware issues (slow disks, failing CPUs). Speculation adds overhead and should be disabled if the cluster is healthy:

-D mapreduce.map.speculative=false
-D mapreduce.reduce.speculative=false
