Hadoop YARN Memory Allocation Fix


DodaTech Updated 2026-06-24 3 min read

This tutorial shows how to fix YARN containers that are killed for exceeding their memory limits. It covers the relevant settings, worked configuration examples, and practices that prevent the problem.

A MapReduce or Spark job fails with:

Container [pid=12345,containerID=container_123] is running beyond virtual memory limits.
Killing container.

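YARN enforces two limits on every container: physical memory and virtual memory. The NodeManager kills a container that exceeds either one. This usually means the application requested less memory than it actually needs, or the cluster's memory configuration is too tight. The message above is the virtual memory case, which is governed by yarn.nodemanager.vmem-pmem-ratio.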

Step-by-Step Fix

1. Increase container memory

WRONG — default container memory is often too low (1GB):

# Defaults:
# yarn.scheduler.minimum-allocation-mb = 1024
# yarn.scheduler.maximum-allocation-mb = 8192

RIGHT — increase memory allocation:

# For the job:
-D mapreduce.map.memory.mb=2048
-D mapreduce.reduce.memory.mb=4096
-D mapreduce.map.java.opts="-Xmx1638m"
-D mapreduce.reduce.java.opts="-Xmx3276m"
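The heap sizes above are 80% of the container memory, which leaves room for non-heap JVM memory. A minimal Python sketch of that arithmetic follows; the function name `heap_opts` and the 0.8 default are illustrative assumptions, not Hadoop settings.

```python
def heap_opts(container_mb: int, heap_fraction: float = 0.8) -> str:
    """Return a -Xmx option sized as a fraction of the container memory.

    heap_fraction=0.8 mirrors the values used above; it is a rule of thumb,
    not a value that Hadoop enforces.
    """
    if container_mb <= 0:
        raise ValueError("container_mb must be positive")
    if not 0 < heap_fraction < 1:
        raise ValueError("heap_fraction must be between 0 and 1")
    heap_mb = int(container_mb * heap_fraction)  # round down to whole MB
    return f"-Xmx{heap_mb}m"


print(heap_opts(2048))  # -Xmx1638m
print(heap_opts(4096))  # -Xmx3276m
```

Rounding down keeps the heap safely under the container limit.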

In yarn-site.xml:

<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>16384</value>
</property>

2. Disable or adjust virtual memory check

WRONG — strict virtual memory check kills valid containers:

<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>true</value>  <!-- Default -->
</property>
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>  <!-- Default ratio -->
</property>

RIGHT — increase the ratio or disable for development:

<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>  <!-- Allow 4x virtual memory vs physical -->
</property>

Or disable entirely (development only):

<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
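To see why the ratio matters, note that the virtual memory limit is the container's physical memory multiplied by yarn.nodemanager.vmem-pmem-ratio. The sketch below only illustrates that arithmetic; the function names are hypothetical and the NodeManager's real check runs internally.

```python
def vmem_limit_mb(container_mb: int, vmem_pmem_ratio: float = 2.1) -> float:
    """Virtual memory limit for a container, in MB."""
    return container_mb * vmem_pmem_ratio


def exceeds_vmem_limit(vmem_used_mb: float, container_mb: int,
                       vmem_pmem_ratio: float = 2.1) -> bool:
    """True if the given virtual memory usage would breach the limit."""
    return vmem_used_mb > vmem_limit_mb(container_mb, vmem_pmem_ratio)


# A 1024 MB container that maps 3000 MB of virtual memory:
print(exceeds_vmem_limit(3000, 1024))       # default ratio 2.1
print(exceeds_vmem_limit(3000, 1024, 4.0))  # ratio raised to 4
```

With the default ratio the limit is about 2150 MB, so the first call reports a breach. With a ratio of 4 the limit is 4096 MB and the same container survives.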

3. Calculate cluster memory capacity

# Total cluster memory = sum of all node memory
# Each node's YARN memory = total_ram - OS_overhead - non-YARN_services

# Example: 8 nodes, 32GB RAM each, 25% overhead
# Available: 32 * 0.75 = 24GB per node
# Total cluster: 8 * 24 = 192GB

<!-- Per node -->
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>24576</value>  <!-- 24GB per node -->
</property>
<property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>8</value>
</property>
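The capacity arithmetic above can be checked with a short Python sketch. The function names are illustrative, and the 25% overhead is the example's assumption rather than a fixed rule.

```python
def node_yarn_memory_mb(total_ram_gb: float, overhead_fraction: float) -> int:
    """Memory one node can give to YARN, in MB, after reserving overhead."""
    if not 0 <= overhead_fraction < 1:
        raise ValueError("overhead_fraction must be in [0, 1)")
    return int(total_ram_gb * (1 - overhead_fraction) * 1024)


def cluster_yarn_memory_gb(nodes: int, total_ram_gb: float,
                           overhead_fraction: float) -> float:
    """Total YARN memory across identical nodes, in GB."""
    return nodes * node_yarn_memory_mb(total_ram_gb, overhead_fraction) / 1024


# 8 nodes, 32 GB RAM each, 25% reserved for the OS and other services
print(node_yarn_memory_mb(32, 0.25))        # 24576
print(cluster_yarn_memory_gb(8, 32, 0.25))  # 192.0
```

The first result is the value to put in yarn.nodemanager.resource.memory-mb.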

4. Configure Spark executor memory

For Spark on YARN:

--conf spark.executor.memory=4g
--conf spark.executor.memoryOverhead=1g
--conf spark.driver.memory=4g
# Spark releases before 2.3 use the older property name instead:
# --conf spark.yarn.executor.memoryOverhead=1024
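Each Spark executor asks YARN for a container sized as executor memory plus overhead, so that sum must fit under yarn.scheduler.maximum-allocation-mb. The sketch below estimates the request. When no overhead is set, it assumes Spark's documented default of 10% of executor memory with a 384 MB floor. The round-up to a multiple of the minimum allocation is a simplification of how the scheduler normalizes requests. The function name is hypothetical.

```python
import math
from typing import Optional


def executor_container_mb(executor_mb: int,
                          overhead_mb: Optional[int] = None,
                          min_allocation_mb: int = 1024) -> int:
    """Estimate the YARN container size requested for one Spark executor."""
    if overhead_mb is None:
        # Assumed default: 10% of executor memory, at least 384 MB
        overhead_mb = max(384, int(executor_mb * 0.10))
    requested = executor_mb + overhead_mb
    # Simplified normalization: round up to a multiple of the minimum allocation
    return math.ceil(requested / min_allocation_mb) * min_allocation_mb


# 4g executor with 1g overhead, as configured above
print(executor_container_mb(4096, 1024))  # 5120
```

If the result exceeds yarn.scheduler.maximum-allocation-mb, YARN cannot grant the container.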

5. Check for memory leaks

WRONG: raising memory limits repeatedly while containers slowly consume more memory over time.

RIGHT: find the cause in the container logs:

# Check container logs for OOM errors
yarn logs -applicationId application_123_456 -containerId container_123

# Look for:
# GC overhead limit exceeded
# OutOfMemoryError: Java heap space
# Container killed on request. Exit code is 143
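When the logs are long, counting the messages listed above can be scripted. This is a minimal sketch that works on log text already saved from `yarn logs`. The pattern names and the `scan_log` function are illustrative, and the message strings are the ones quoted in this article.

```python
import re

# Messages quoted in this article; extend the table as needed
PATTERNS = {
    "gc_overhead": re.compile(r"GC overhead limit exceeded"),
    "heap_space": re.compile(r"OutOfMemoryError: Java heap space"),
    "exit_143": re.compile(r"Exit code is 143"),
    "vmem_limit": re.compile(r"running beyond virtual memory limits"),
}


def scan_log(text: str) -> dict:
    """Count how many log lines match each known memory-related message."""
    counts = {name: 0 for name in PATTERNS}
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


sample = (
    "java.lang.OutOfMemoryError: Java heap space\n"
    "Container killed on request. Exit code is 143\n"
)
print(scan_log(sample))
```

A count that grows from one run to the next, or from early tasks to late ones, suggests a leak, not an undersized container.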

For memory leaks in MapReduce:

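// In the Mapper class: a static field is shared by every task that runs in the same JVM.
// "cache" is an illustrative name; it needs java.util.Map and java.util.HashMap.
private static final Map<String, String> cache = new HashMap<>();

@Override
protected void setup(Context context) {
    // Clear any static caches before this task starts filling them
    cache.clear();
}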

6. Use fair or capacity scheduler for better allocation

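# Submit the job to the high_mem queue
-D mapreduce.job.queuename=high_mem

In capacity-scheduler.xml (the capacities of sibling queues must sum to 100, so the default queue takes the remaining 50):
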
<property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,high_mem</value>
</property>
<property>
    <name>yarn.scheduler.capacity.root.high_mem.capacity</name>
    <value>50</value>
</property>
<property>
    <name>yarn.scheduler.capacity.root.high_mem.maximum-capacity</name>
    <value>100</value>
</property>
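CapacityScheduler expects the capacity values of sibling queues under one parent to add up to 100. The snippet above sets only high_mem, so the default queue needs the other 50. A small check like the following can catch a mismatch before the file is deployed; the function name and the dictionary input are illustrative assumptions, not part of any Hadoop tool.

```python
def capacities_valid(capacities: dict, tolerance: float = 1e-6) -> bool:
    """True if sibling queue capacities (in percent) sum to 100."""
    if any(value < 0 for value in capacities.values()):
        return False
    return abs(sum(capacities.values()) - 100) <= tolerance


print(capacities_valid({"default": 50, "high_mem": 50}))   # True
print(capacities_valid({"default": 100, "high_mem": 50}))  # False
```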

Expected output: containers run within memory limits without being killed.

Prevention

Set realistic container memory based on job requirements.
Monitor YARN UI (port 8088) for container kill reasons.
Enable virtual memory checks with a reasonable ratio (3-4x).
Reserve OS memory (25% of total RAM) for system processes.
Use yarn.nodemanager.resource.memory-mb to cap per-node allocation.

Common Mistakes with YARN Memory

Setting the JVM heap (-Xmx in mapreduce.map.java.opts) equal to or larger than the container memory, leaving no room for non-heap memory
Disabling yarn.nodemanager.vmem-check-enabled in production instead of raising yarn.nodemanager.vmem-pmem-ratio
Giving all of a node's RAM to yarn.nodemanager.resource.memory-mb, leaving nothing for the OS and other services

These mistakes appear frequently in real-world Hadoop clusters.

Practice Exercise

Write a pure function that safely divides a node's YARN memory by a number of concurrent containers, returning no result instead of failing when the count is zero, then test it with edge cases like zero containers and negative values.

This exercise reinforces the concepts covered in this guide. Try implementing it before checking online solutions.

FAQ

What's the difference between physical and virtual memory in YARN?

Physical memory is the actual RAM used. Virtual memory includes mapped files, shared libraries, and allocated-but-not-touched memory. Java processes often commit more virtual memory than physical. The default vmem-pmem-ratio of 2.1 is too low for some Java workloads.

Why does YARN kill containers even with enough memory?

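Check the yarn.nodemanager.vmem-pmem-ratio setting. The JVM reserves virtual memory up front, so a container can exceed its virtual memory limit (physical limit × ratio) while its physical usage is still well within bounds. Also check mapreduce.map.java.opts: the -Xmx heap size must be smaller than mapreduce.map.memory.mb.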

How do I calculate the optimal container size?

Container size = (total_yarn_memory_per_node - slack) / max_concurrent_containers. For a node with 24GB allotted to YARN running 4 concurrent containers and no further slack: container size = 24 / 4 = 6GB. Any slack held back for the OS and other services reduces that figure.
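The formula translates directly into code. This sketch uses a hypothetical function name and rejects inputs for which the formula has no sensible answer.

```python
def container_size_gb(yarn_memory_gb: float, slack_gb: float,
                      max_containers: int) -> float:
    """Container size = (YARN memory per node - slack) / concurrent containers."""
    if max_containers <= 0:
        raise ValueError("max_containers must be positive")
    if slack_gb < 0 or slack_gb >= yarn_memory_gb:
        raise ValueError("slack_gb must be in [0, yarn_memory_gb)")
    return (yarn_memory_gb - slack_gb) / max_containers


print(container_size_gb(24, 0, 4))  # 6.0
print(container_size_gb(24, 2, 4))  # 5.5
```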
