Apache Spark Out of Memory Fix

DodaTech Updated 2026-06-24 2 min read

This guide walks through four steps for fixing out-of-memory errors in Apache Spark: checking the memory configuration, increasing executor memory, salting skewed keys, and broadcasting small tables. It also lists common mistakes and ways to prevent the problem.

Your Spark job fails with java.lang.OutOfMemoryError: Java heap space or Container killed by YARN for exceeding memory limits. In both cases an executor ran out of memory, usually because of large partitions, data skew, or inefficient serialization.

Step-by-Step Fix

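1. Check the current memory configuration

The Spark UI's Environment tab lists the settings a job actually runs with. To make the starting point explicit, pass the values on the command line:

spark-submit --conf spark.executor.memory=2g --conf spark.driver.memory=2g my_job.py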

2. Increase executor memory

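from pyspark.sql import SparkSession

# Wrong: the default executor memory (1g) is too low for large datasets
spark = SparkSession.builder.appName("myapp").getOrCreate()

# Right: allocate more memory, including off-heap overhead.
# Note: under spark-submit in client mode the driver JVM is already running
# here, so also pass --driver-memory 4g on the command line.
spark = SparkSession.builder \
    .appName("myapp") \
    .config("spark.executor.memory", "4g") \
    .config("spark.executor.memoryOverhead", "1g") \
    .config("spark.driver.memory", "4g") \
    .getOrCreate()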
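
To see how these settings add up, the sketch below estimates the memory each executor container requests from the cluster manager: heap plus overhead. It assumes Spark's documented default overhead (10% of executor memory, with a 384 MiB minimum) when spark.executor.memoryOverhead is not set. The helpers parse_memory_mib and container_memory_mib are hypothetical names for illustration, and the parser only handles simple values such as 4g or 512m.

```python
from typing import Optional

# MiB per unit for simple Spark-style memory strings (assumed subset).
_UNITS_MIB = {"m": 1, "g": 1024, "t": 1024 * 1024}


def parse_memory_mib(value: str) -> int:
    """Convert a string such as '4g' or '512m' to whole MiB."""
    text = value.strip().lower()
    number, unit = text[:-1], text[-1:]
    if unit not in _UNITS_MIB or not number:
        raise ValueError(f"unsupported memory string: {value!r}")
    return int(float(number) * _UNITS_MIB[unit])


def container_memory_mib(executor_memory: str,
                         overhead: Optional[str] = None) -> int:
    """Estimate heap + overhead for one executor container, in MiB."""
    heap = parse_memory_mib(executor_memory)
    if overhead is None:
        # Assumed default: 10% of the heap, but never less than 384 MiB.
        overhead_mib = max(384, int(heap * 0.10))
    else:
        overhead_mib = parse_memory_mib(overhead)
    return heap + overhead_mib


if __name__ == "__main__":
    # The configuration from step 2: 4g heap plus 1g explicit overhead.
    print(container_memory_mib("4g", "1g"))
    # The same heap with the default overhead.
    print(container_memory_mib("4g"))
```

If the total exceeds what the cluster manager allows per container, the container request is rejected or the container is killed, so it is worth checking this sum against the YARN limits before raising spark.executor.memory.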

3. Handle data skew with salting

# Wrong: groups directly on the skewed key
df.groupBy("city").count().show()

# Right: salt the key, aggregate per (city, salt), then combine per city
from pyspark.sql.functions import floor, rand, sum as sum_

salted = df.withColumn("salt", floor(rand() * 10))
partial = salted.groupBy("city", "salt").count()
counts = partial.groupBy("city").agg(sum_("count").alias("count"))
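
The effect of salting can be checked without a cluster. The following plain-Python simulation (not Spark code) counts a skewed list of cities in two stages: first per (city, salt) pair, then per city. The function name salted_count, the sample data, and the choice of 10 salts are assumptions made for illustration.

```python
import random
from collections import Counter


def salted_count(keys, num_salts=10, seed=0):
    """Two-stage count: per (key, salt) first, then merged per key."""
    rng = random.Random(seed)
    # Stage 1: each row gets a random salt, so a hot key is split
    # across up to num_salts smaller groups.
    partial = Counter((key, rng.randrange(num_salts)) for key in keys)
    # Stage 2: drop the salt and add the partial counts back together.
    final = Counter()
    for (key, _salt), n in partial.items():
        final[key] += n
    return partial, final


if __name__ == "__main__":
    cities = ["London"] * 9000 + ["Oslo"] * 500 + ["Lima"] * 500
    partial, final = salted_count(cities)
    print("largest group without salting:", max(Counter(cities).values()))
    print("largest group with salting:   ", max(partial.values()))
    print("final counts:", dict(final))
```

The second stage only has to combine at most num_salts partial rows per city, so the final totals match an unsalted count while no single group holds all the rows of the hot key.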

4. Use broadcast joins for small tables

# Wrong — causes large shuffle
result = large_df.join(small_df, "key")

# Right — broadcast the small table
from pyspark.sql.functions import broadcast

result = large_df.join(broadcast(small_df), "key")
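
Why a broadcast join avoids the shuffle is easier to see in a conceptual model. The sketch below is plain Python, not Spark's implementation: the small table is turned into an in-memory lookup that every worker would hold a copy of, and the large table is streamed past it row by row. The function name broadcast_hash_join and the row format (a list of dicts) are assumptions for illustration.

```python
def broadcast_hash_join(large_rows, small_rows, key):
    """Inner-join two lists of dicts by building a lookup on the small side."""
    # Build phase: index the small table by join key. In Spark this copy
    # is what gets shipped to every executor.
    lookup = {}
    for row in small_rows:
        lookup.setdefault(row[key], []).append(row)

    # Probe phase: stream the large table; no row of it has to move.
    for row in large_rows:
        for match in lookup.get(row[key], []):
            yield {**match, **row}


if __name__ == "__main__":
    orders = [
        {"key": 1, "amount": 30},
        {"key": 2, "amount": 12},
        {"key": 3, "amount": 7},
    ]
    customers = [{"key": 1, "name": "Ada"}, {"key": 2, "name": "Lin"}]
    for joined in broadcast_hash_join(orders, customers, "key"):
        print(joined)
```

The trade-off is that the whole small table must fit in memory on the driver and on each executor, so broadcasting a table that is not actually small can itself cause an out-of-memory error.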

Common Mistakes

Mistake	Fix
Too few partitions	Repartition with `df.repartition(200)`
Too many partitions causing overhead	Coalesce with `df.coalesce(50)`
Using `groupBy` on highly skewed column	Use salting or bucketing to distribute data
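Kryo serialization not enabled	Set `spark.serializer=org.apache.spark.serializer.KryoSerializer`
Disk spill due to insufficient memory	Increase `spark.executor.memory` or `spark.memory.fraction` (`spark.shuffle.memoryFraction` is a legacy setting that predates the unified memory manager introduced in Spark 1.6)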
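
The partition counts in the table (200 and 50) are examples, not universal values. A rough way to pick a number is to divide the dataset size by a target partition size. The sketch below assumes a 128 MiB target, which sits inside the 100-200 MB range recommended in the FAQ; target_partitions and resize_action are hypothetical helper names.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3


def target_partitions(total_bytes, target_partition_bytes=128 * MIB):
    """Number of partitions needed to keep each one near the target size."""
    return max(1, math.ceil(total_bytes / target_partition_bytes))


def resize_action(current, target):
    """Suggest which DataFrame call moves from current to target."""
    if target > current:
        return "repartition"  # full shuffle, can increase the count
    if target < current:
        return "coalesce"     # merges partitions without a full shuffle
    return "keep"


if __name__ == "__main__":
    wanted = target_partitions(50 * GIB)
    print(wanted, resize_action(200, wanted))
```

The result would then be passed to df.repartition(n) or df.coalesce(n) as shown in the table.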

Prevention

Monitor the Spark UI's Storage and Executors tabs for memory usage.
Use column pruning and filter pushdown to reduce data volume.
Choose an appropriate cluster shape: 5 executors with 4g each is usually better than 1 executor with 20g, because smaller heaps keep garbage-collection pauses shorter.
Set spark.dynamicAllocation.enabled=true for variable workloads.
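
When monitoring, skew usually shows up as one task reading far more data than the rest. The sketch below turns that check into a number: the ratio of the largest partition to the median. The per-partition sizes are assumed to be copied from the Spark UI or collected with a query such as df.groupBy(spark_partition_id()).count(); skew_ratio, is_skewed, and the threshold of 5 are illustrative assumptions, not Spark settings.

```python
from statistics import median


def skew_ratio(partition_sizes):
    """Largest partition divided by the median partition size."""
    sizes = list(partition_sizes)
    if not sizes:
        raise ValueError("no partition sizes given")
    largest = max(sizes)
    if largest == 0:
        return 1.0  # every partition is empty, so nothing is skewed
    mid = median(sizes)
    return float("inf") if mid == 0 else largest / mid


def is_skewed(partition_sizes, threshold=5.0):
    """Flag a stage whose largest partition dwarfs the typical one."""
    return skew_ratio(partition_sizes) > threshold


if __name__ == "__main__":
    balanced = [100, 120, 80, 110]
    hot_key = [100, 110, 90, 5000]
    print(is_skewed(balanced), is_skewed(hot_key))
```

A stage flagged this way is a candidate for salting rather than for more executor memory.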

FAQ

What is the difference between executor memory and executor memoryOverhead?

Executor memory is the heap of the Spark executor JVM. Memory overhead is off-heap memory for things like Python processes (PySpark), JVM internals, and native libraries.

How do I know how much memory each partition needs?

Check the Spark UI > Stages > Shuffle Read Size / Records. Aim for partitions of 100-200 MB each. Total memory = partition size * (2-3) for shuffle buffers.

Why does increasing memory sometimes not fix OOM?

OOM can be caused by data skew (one partition has all the data). Adding memory only masks the problem; fixing the skew with salting is the real solution.
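
As a worked example of the rule of thumb above (partition size * 2-3), the sketch below estimates the working memory for one task and for one executor. Multiplying by the number of concurrently running tasks is an added assumption, based on tasks in the same executor sharing one heap; the function names are hypothetical.

```python
def task_memory_mb(partition_mb, buffer_factor=3.0):
    """Memory for one task: partition size times a 2-3x buffer factor."""
    return partition_mb * buffer_factor


def executor_working_set_mb(partition_mb, concurrent_tasks, buffer_factor=3.0):
    """Assumed total for an executor running several tasks at once."""
    return task_memory_mb(partition_mb, buffer_factor) * concurrent_tasks


if __name__ == "__main__":
    # 200 MB partitions, a 3x buffer, and 4 tasks running at the same time.
    print(task_memory_mb(200))
    print(executor_working_set_mb(200, 4))
```

If the estimate is larger than the executor heap, the options are smaller partitions, fewer concurrent tasks per executor, or more executor memory.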
