Git Garbage Collection — Repo Cleanup Guide

DodaTech Updated 2026-06-24 3 min read

This tutorial explains what Git garbage collection does, when it is worth running by hand, and how to keep a repository from bloating in the first place.

Git repositories grow over time as loose objects, unreachable commits, and redundant pack files accumulate. git gc (garbage collection) packs and prunes these, reducing disk usage and improving clone and fetch performance.

The Problem

du -sh .git

Shows:

2.3G    .git

Yet the working tree is only 100 MB. The rest is loose objects, stale refs, and unreachable commits left behind by rebases, force pushes, and a long history.

git count-objects -v

Shows:

count: 45231
size: 523478
in-pack: 102344
packs: 47
prune-packable: 12301
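
The fields that matter most for gc are count (loose objects), packs, and prune-packable (loose objects that already exist in a pack). A minimal sketch for pulling them out in a script, assuming the `key: value` layout shown above. The helper name `summarize_counts` is hypothetical, and the sample figures are the ones from this guide:

```shell
#!/bin/sh
# summarize_counts (hypothetical helper): read `git count-objects -v`
# output on stdin and print the three fields most relevant to gc.
summarize_counts() {
    awk -F': ' '
        $1 == "count"          { loose = $2 }
        $1 == "packs"          { packs = $2 }
        $1 == "prune-packable" { dup = $2 }
        END { printf "loose=%d packs=%d prune_packable=%d\n", loose, packs, dup }
    '
}

# Demo with the sample figures from this guide.
# In a real repository: git count-objects -v | summarize_counts
summary=$(summarize_counts <<'EOF'
count: 45231
size: 523478
in-pack: 102344
packs: 47
prune-packable: 12301
EOF
)
echo "$summary"
```

A high loose count together with dozens of packs, as here, is the pattern that a plain git gc is designed to fix.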

Wrong Approach

# WRONG: aggressive gc without understanding the impact
git gc --aggressive --prune=now
# Can run for hours on a large repository and may not shrink it much;
# --prune=now also skips the grace period for unreachable objects

Right Approach

# Standard garbage collection
git gc

Expected output (exact lines and numbers vary by Git version and repository):

Counting objects: 12345, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (1234/1234), done.
Writing objects: 100% (12345/12345), done.
Total 12345 (delta 6789), reused 12345 (delta 6789), pack-reused 0

Step-by-Step Guide

Step 1: Check repository size

git count-objects -vH

Step 2: Run standard garbage collection

git gc

Step 3: Run auto gc (lighter; does nothing unless the thresholds are exceeded)

git gc --auto

Step 4: Run a more thorough cleanup (only if standard gc was not enough)

git gc --prune=now --aggressive

Step 5: Prune unreachable objects

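# Expire reflog entries for unreachable commits immediately (they can no longer be recovered via the reflog)
git reflog expire --expire-unreachable=now --all
# Delete unreachable objects right away instead of waiting for the grace period
git gc --prune=now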
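
Because this step permanently discards anything that only the reflog was keeping alive, it can help to preview what is at stake first. A minimal sketch, assuming a throwaway repository in a temporary directory (the file name `note.txt`, the wrapper `g`, and the demo identity are made up for illustration). `git fsck --unreachable --no-reflogs` lists the objects that no branch or tag reaches once reflogs are ignored:

```shell
#!/bin/sh
# Work in a throwaway repository so nothing real is at risk.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .

# g (hypothetical wrapper): run git with a fixed demo identity.
g() { git -c user.name=Demo -c user.email=demo@example.invalid -c commit.gpgsign=false "$@"; }

echo "first draft" > note.txt
g add note.txt
g commit -q -m "first draft"

# Amending replaces the commit; the original is now kept alive only by the reflog.
echo "second draft" > note.txt
g commit -q -a --amend -m "second draft"

# Preview: objects that no branch or tag reaches when reflogs are ignored.
unreachable=$(git fsck --unreachable --no-reflogs 2>/dev/null || true)
echo "$unreachable"
```

Anything listed here that you still want should be given a branch or tag before expiring the reflog.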

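Step 6: Find large files in history

# List the 10 largest blobs in history (size in bytes, then path)
git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | awk '/^blob/ {print $3, $4}' | sort -rn | head -10

gc cannot remove these while they are still reachable from a branch or tag. Removing them means rewriting history, for example with git filter-repo.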

Step 7: Verify the reduction

du -sh .git
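
The whole before/after comparison can be rehearsed safely in a throwaway repository. A minimal sketch, assuming a temporary directory and made-up file names; real repositories will show different numbers:

```shell
#!/bin/sh
# Throwaway repository with a handful of commits (all names are illustrative).
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .

# g (hypothetical wrapper): run git with a fixed demo identity.
g() { git -c user.name=Demo -c user.email=demo@example.invalid -c commit.gpgsign=false "$@"; }

for i in 1 2 3 4 5; do
    echo "revision $i" > "file$i.txt"
    g add "file$i.txt"
    g commit -q -m "add file$i.txt"
done

# Loose object count before and after standard garbage collection.
loose_before=$(git count-objects -v | awk -F': ' '$1 == "count" { print $2 }')
git gc --quiet
loose_after=$(git count-objects -v | awk -F': ' '$1 == "count" { print $2 }')
packs_after=$(git count-objects -v | awk -F': ' '$1 == "packs" { print $2 }')

echo "loose objects: $loose_before -> $loose_after (packs: $packs_after)"
du -sh .git
```

Comparing the loose-object and pack counts alongside du gives a clearer picture than disk usage alone, since a tiny repository may not shrink noticeably.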

Prevention Tips

  • Run git gc --auto periodically (Git runs it automatically based on object count)
  • Set gc.auto = 250 (trigger gc when loose objects exceed 250, default 6700)
  • Use git filter-repo to purge large files from history
  • Avoid committing large binaries directly
  • Use Git LFS for large files and binaries
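
The gc.auto tip above can be applied per repository with git config. A minimal sketch in a throwaway repository (the temporary directory is only for illustration; in practice, run the `git config` line inside your own repository):

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .

# Lower the loose-object threshold for this repository only.
git config gc.auto 250

# Read the value back to confirm it was stored in .git/config.
configured=$(git config --get gc.auto)
echo "gc.auto is now $configured"
```

A lower threshold means auto gc runs more often but has less work to do each time.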

Common Mistakes with gc cleanup

  1. Reaching for git gc --aggressive --prune=now first, which can run for hours on a large repository and often saves little over a plain git gc
  2. Expiring the reflog and pruning with --prune=now without realizing that commits dropped by a rebase or reset can no longer be recovered afterwards
  3. Expecting gc to shrink a repository whose size comes from large files still reachable in history; those need git filter-repo, not gc

These mistakes come up frequently in real-world Git repositories.

Practice Exercise

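In a throwaway clone of one of your repositories, record the output of git count-objects -vH and du -sh .git, run git gc, and compare the numbers. Then run git gc --auto and explain, using the gc.auto and gc.autoPackLimit thresholds, why it did or did not do any work.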

This exercise reinforces the concepts covered in this guide. Try implementing it before checking online solutions.

FAQ

How often does Git run garbage collection automatically?

Git automatically runs git gc --auto when the number of loose objects exceeds gc.auto (default 6700) or pack files exceed gc.autoPackLimit (default 50). Most repositories never need manual gc.
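
The two thresholds can be turned into a rough check. A minimal sketch, assuming the default values quoted above; `would_auto_gc` is a hypothetical helper, and Git's own check estimates the loose-object count rather than counting exactly, so treat the result as an approximation:

```shell
#!/bin/sh
# would_auto_gc LOOSE PACKS [GC_AUTO] [PACK_LIMIT]
# Prints which threshold, if any, the given counts exceed.
would_auto_gc() {
    loose=$1
    packs=$2
    gc_auto=${3:-6700}      # default quoted in this guide
    pack_limit=${4:-50}     # default quoted in this guide
    if [ "$loose" -gt "$gc_auto" ]; then
        echo "yes: loose objects ($loose) exceed gc.auto ($gc_auto)"
    elif [ "$packs" -gt "$pack_limit" ]; then
        echo "yes: packs ($packs) exceed gc.autoPackLimit ($pack_limit)"
    else
        echo "no"
    fi
}

# Sample figures from the count-objects output earlier in this guide.
verdict=$(would_auto_gc 45231 47)
echo "$verdict"
```

With the sample figures, the loose-object count alone is enough to trigger auto gc; the 47 packs are still just under the pack limit.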

What is the difference between git gc and git gc --aggressive?

git gc packs loose objects, consolidates existing packs, and prunes old unreachable objects, reusing the delta compression already computed. git gc --aggressive throws away existing deltas and recomputes them from scratch with a deeper, wider search, which can reduce disk usage but takes significantly longer.

Does garbage collection delete my commits?

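No, not the ones you can still reach. git gc only removes unreachable objects, meaning objects not referenced by any branch, tag, or reflog entry, and by default only once they are older than the prune grace period (two weeks). Commits reachable from branches and tags are never deleted. Objects referenced only by the reflog are retained until the reflog entry expires.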
