Git Garbage Collection — Repo Cleanup Guide
This tutorial covers what Git garbage collection does, when to run it, and how to verify the result, with practical examples and best practices.
Git repositories grow over time as they accumulate loose objects, unreachable commits, and many small pack files. git gc (garbage collection) compresses and consolidates these objects and removes the ones nothing references, reducing disk usage and improving clone and fetch performance.
The Problem
du -sh .git
Shows:
2.3G .git
But the actual working tree is only 100MB. The repository has accumulated loose objects, leftover refs, and unreachable commits from rebases, force pushes, and long history.
git count-objects -v
Shows:
count: 45231
size: 523478
in-pack: 102344
packs: 47
prune-packable: 12301
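To make these numbers actionable, here is a minimal sketch that reads git count-objects -v output on standard input and reports whether the loose-object count or the pack count exceeds Git's documented auto-gc defaults (gc.auto = 6700, gc.autoPackLimit = 50). The function name gc_hint is hypothetical, and the comparison only approximates what git gc --auto checks internally.

```shell
# gc_hint: hypothetical helper, not part of Git.
# Reads `git count-objects -v` output on stdin and prints a one-line verdict.
# Optional arguments override the defaults for gc.auto (6700) and
# gc.autoPackLimit (50).
gc_hint() {
  awk -v loose_limit="${1:-6700}" -v pack_limit="${2:-50}" '
    $1 == "count:" { loose = $2 }   # number of loose objects
    $1 == "packs:" { packs = $2 }   # number of pack files
    END {
      if (loose > loose_limit)     print "gc recommended: " loose " loose objects"
      else if (packs > pack_limit) print "gc recommended: " packs " packs"
      else                         print "no gc needed"
    }'
}
```

Typical usage would be git count-objects -v | gc_hint. For the sample output above (count: 45231, packs: 47) the loose-object count is well past the default threshold, so the helper recommends a gc run.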
Wrong Approach
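# WRONG: aggressive gc without understanding the impact
git gc --aggressive --prune=now
# --aggressive recomputes deltas from scratch, which can run for hours on a large repository and may not help much
# --prune=now deletes unreachable objects immediately, skipping the default two-week grace period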
Right Approach
# Standard garbage collection
git gc
Typical output (exact lines and numbers vary by Git version and repository):
Counting objects: 12345, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (1234/1234), done.
Writing objects: 100% (12345/12345), done.
Total 12345 (delta 6789), reused 12345 (delta 6789), pack-reused 0
Step-by-Step Guide
Step 1: Check repository size
git count-objects -vH
Step 2: Run standard garbage collection
git gc
Step 3: Run auto gc (lighter; only does work when thresholds such as gc.auto are exceeded)
git gc --auto
Step 4: Run a more thorough cleanup (slow; use only if Step 2 did not reclaim enough space)
git gc --prune=now --aggressive
Step 5: Prune unreachable objects
git reflog expire --expire-unreachable=now --all
git gc --prune=now
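The reason Step 5 expires the reflog first is that reflog entries count as references, so a plain git gc keeps abandoned commits alive. The sketch below illustrates this in a throwaway repository created with mktemp; the file names, commit messages, and identity are placeholders chosen for the demo.

```shell
# Throwaway repository so nothing here touches real work.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1
git config user.name "Demo"                # placeholder identity for the demo
git config user.email "demo@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -am "second"

orphan=$(git rev-parse HEAD)               # the commit we are about to abandon
git reset -q --hard HEAD~1                 # "second" is no longer on any branch

# The reflog still references the abandoned commit, so a plain gc keeps it.
git gc -q
if git cat-file -e "$orphan" 2>/dev/null; then before=present; else before=gone; fi

# Expire reflog entries for unreachable commits, then prune immediately.
git reflog expire --expire-unreachable=now --all
git gc -q --prune=now
if git cat-file -e "$orphan" 2>/dev/null; then after=present; else after=gone; fi

echo "before: $before, after: $after"
```

On a default Git setup the abandoned commit should survive the first gc and disappear after the second. This is also why the sequence is destructive: once it has run, the commit can no longer be recovered from the reflog.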
Step 6: Clean up large files from history
# List the 10 largest blobs in history (candidates to purge with git filter-repo)
git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | awk '/^blob/ {print $3, $4}' | sort -rn | head -10
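One limitation of the awk filter above is that printing $4 truncates paths containing spaces. A possible variant, sketched here as a hypothetical helper named top_blobs, takes the same batch-check output on standard input and keeps the whole path.

```shell
# top_blobs: hypothetical helper, not part of Git.
# Reads lines of "<type> <object-name> <size> <path>" on stdin and prints
# the N largest blobs as "<size><TAB><path>" (N defaults to 10).
top_blobs() {
  awk '$1 == "blob" {
         size = $3
         sub(/^[^ ]+ [^ ]+ [^ ]+ ?/, "")   # drop type, name, size; keep full path
         print size "\t" $0
       }' |
    sort -rn |
    head -n "${1:-10}"
}

# Intended usage inside a repository:
#   git rev-list --objects --all |
#     git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
#     top_blobs 10
```

Sizes are reported in bytes, as produced by %(objectsize). The listing only identifies candidates; removing them from history still requires a rewrite tool such as git filter-repo.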
Step 7: Verify the reduction
du -sh .git
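To put a number on the reduction rather than comparing two du readings by eye, a small helper can compute the percentage saved. This is a sketch; reduction_pct is a hypothetical name, and it expects sizes in kilobytes as reported by du -sk.

```shell
# reduction_pct: hypothetical helper.
# Prints the percentage saved, given the size in kilobytes before and after
# cleanup. Uses integer arithmetic, so the result is rounded down.
reduction_pct() {
  before_kb=$1
  after_kb=$2
  if [ "$before_kb" -eq 0 ]; then
    echo 0                                  # avoid dividing by zero
  else
    echo $(( (before_kb - after_kb) * 100 / before_kb ))
  fi
}

# Intended usage inside a repository:
#   before=$(du -sk .git | cut -f1)
#   git gc
#   after=$(du -sk .git | cut -f1)
#   echo "saved $(reduction_pct "$before" "$after")%"
```

Capturing the size before running any cleanup matters here, since the earlier figure cannot be recovered afterwards.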
Prevention Tips
- Run git gc --auto periodically (Git runs it automatically based on object count)
- Set gc.auto = 250 (trigger gc when loose objects exceed 250, default 6700)
- Use git filter-repo to purge large files from history
- Avoid committing large binaries directly
- Use Git LFS for large files and binaries
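The gc.auto tip can be applied with git config. The sketch below sets it in a scratch repository created with mktemp and reads the value back; in a real repository you would run the git config line from inside it, or add --global to apply the setting to every repository for your user.

```shell
repo=$(mktemp -d)                           # scratch repository for the demo
git init -q "$repo"

git -C "$repo" config gc.auto 250           # per-repository threshold
threshold=$(git -C "$repo" config --get gc.auto)
echo "gc.auto is now $threshold"

# With the lower threshold in place, auto gc starts packing sooner.
# In this empty repository it has nothing to do and returns immediately.
git -C "$repo" gc --auto
```

A lower threshold means more frequent but smaller gc runs. Whether 250 is a good value depends on how quickly the repository accumulates loose objects.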
Common Mistakes with gc cleanup
- Reaching for git gc --aggressive --prune=now first: it is slow on large repositories and removes unreachable objects immediately, with no grace period for recovery
- Expecting git gc to shrink a repository whose large files are still reachable from history; those need a history rewrite (for example with git filter-repo) before gc can drop them
- Running git gc --prune=now without expiring the reflog first, so objects that only the reflog references survive the cleanup
Practice Exercise
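Clone a repository into a scratch directory, record the output of git count-objects -vH and du -sh .git, run git gc, and compare the numbers. Then abandon a commit with git reset --hard and check whether it survives git gc with and without expiring the reflog first.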
This exercise reinforces the concepts covered in this guide. Try implementing it before checking online solutions.
Built by the developers of DodaTech
Doda Browser, DodaZIP & Durga Antivirus Pro