Git Clean & GC: Housekeeping for Your Repository

DodaTech Updated 2026-06-22 5 min read

Git clean removes untracked files from your working directory while git gc compresses Git's internal storage, keeping your repository healthy.

In this tutorial, you'll learn how to use git clean and git gc, two essential housekeeping commands. Untracked files accumulate in the working directory over time, and Git's object database grows with every commit. By the end, you'll be able to keep working directories clean and repositories compact.

flowchart TD
  A[git clean] --> B{What to remove?}
  B --> C[Untracked files]
  B --> D[Untracked + ignored files]
  B --> E[Dry run - show what would be removed]
  
  F[git gc] --> G[Compress loose objects]
  G --> H[Remove unreachable objects]
  H --> I[Optimize packfiles]
  I --> J[Smaller, faster repository]

Git Clean: Removing Untracked Files

# Dry run - show what would be removed
git clean -n

# Remove untracked files
git clean -f

# Remove untracked files and directories
git clean -fd

# Remove untracked files AND ignored files (e.g., build artifacts)
git clean -fx

# Interactive mode - choose what to remove
git clean -i

Expected output (for example, from git clean -fd):

Removing temp.log
Removing build/
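To compare how far each flag reaches without deleting anything, the flags can be combined with -n. The following is a minimal sketch in a throwaway repository; the temporary path and every file name are illustrative assumptions, not part of any real project.

```shell
set -eu
# Throwaway repository so nothing real is at risk (all names are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
printf 'build/\n*.log\n' > .gitignore
git add .gitignore
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Add .gitignore"

# One untracked file, one untracked directory, and two ignored paths.
touch notes.txt debug.log
mkdir scratch build
touch scratch/draft.txt build/app.o

# -n only previews; pairing it with other flags shows what each would delete.
plain=$(git clean -n)            # untracked files only
with_dirs=$(git clean -nd)       # plus untracked directories
with_ignored=$(git clean -ndx)   # plus ignored files and directories

printf '%s\n' "-n:" "$plain" "-nd:" "$with_dirs" "-ndx:" "$with_ignored"
```

Swapping -n for -f in any of these commands performs the deletion that the preview listed.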

Working with git clean Safely

Start with a dry run:

$ git clean -n
Would remove temp.pyc
Would remove logs/debug.log

Then force clean:

git clean -f

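To keep certain untracked files while removing others, either list them in .gitignore (git clean skips ignored files unless you pass -x) or exclude them with -e followed by a pattern: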

git clean -f -e config.env
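The preview, exclude, and delete steps can be tried end to end in a sandbox. This sketch assumes a fresh temporary repository; the file names simply reuse the examples from this section.

```shell
set -eu
repo=$(mktemp -d)   # sandbox repository; path and file names are illustrative
cd "$repo"
git init -q
touch config.env temp.pyc
mkdir logs
touch logs/debug.log

# 1. Preview with the same flags you plan to use for the real run.
preview=$(git clean -nd -e config.env)
printf '%s\n' "$preview"

# 2. Only after reviewing the preview, delete for real.
git clean -fd -q -e config.env

# 3. config.env survives; the other untracked paths are gone.
ls -A
```

Keeping the flags identical between the dry run and the real run is the point of the exercise: a preview made with different flags says little about what the real command will delete.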

Git Garbage Collection

Several Git commands run git gc --auto for you once enough loose objects accumulate, but you can also trigger a collection manually:

git gc

Expected output:

Enumerating objects: 150, done.
Counting objects: 100% (150/150), done.
Delta compression using up to 8 threads
Compressing objects: 100% (120/120), done.
Writing objects: 100% (150/150), done.
Total 150 (delta 30), reused 100 (delta 20), pack-reused 0
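One way to observe what gc did is to compare git count-objects -v before and after. The sketch below assumes a small throwaway repository with a handful of commits; the exact counts depend on the repository.

```shell
set -eu
repo=$(mktemp -d)   # sandbox repository (illustrative)
cd "$repo"
git init -q
for i in 1 2 3 4 5; do
  echo "revision $i" > file.txt
  git add file.txt
  git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Commit $i"
done

# "count" is the number of loose objects; "in-pack" is the number of packed ones.
loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')
git gc -q
loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
packed_after=$(git count-objects -v | awk '/^in-pack:/ {print $2}')

echo "loose before: $loose_before, loose after: $loose_after, packed: $packed_after"
```

In this sandbox every object is reachable, so all loose objects move into a packfile and none are discarded.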

Aggressive Garbage Collection

For repositories that have grown very large:

git gc --aggressive

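# Prune unreachable objects immediately instead of waiting out the two-week grace period
git gc --prune=now

Warning: --prune=now permanently removes unreachable objects right away, regardless of how recently they were created. Only use it when you're sure you don't need to recover anything.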
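The difference between the default grace period and --prune=now can be seen with a deliberately orphaned object. This is a sketch in a throwaway repository; the blob is written with git hash-object so that no commit, ref, or reflog entry points to it.

```shell
set -eu
repo=$(mktemp -d)   # sandbox repository (illustrative)
cd "$repo"
git init -q
echo "kept" > file.txt
git add file.txt
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Initial commit"

# Write a blob that nothing references: an unreachable object.
orphan=$(echo "orphaned content" | git hash-object -w --stdin)

# Default gc keeps recent unreachable objects around.
git gc -q
if git cat-file -e "$orphan" 2>/dev/null; then after_default=present; else after_default=gone; fi

# --prune=now drops them immediately.
git gc -q --prune=now
if git cat-file -e "$orphan" 2>/dev/null; then after_prune_now=present; else after_prune_now=gone; fi

echo "after git gc: $after_default"
echo "after git gc --prune=now: $after_prune_now"
```

The committed file.txt is reachable from a branch, so neither run touches it.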

Repository Size Analysis

Check your repository's size and identify space hogs:

# Check .git folder size
du -sh .git

# Find largest objects
git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | awk '/^blob/ {print $0}' | sort --numeric-sort --key=3 | tail -10

Expected output (largest object last; hashes abbreviated here, Git prints them in full):

blob b2c3d4e  5242880 another-big-file.iso
blob a1b2c3d 10485760 large-file.zip
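For repeated use, the pipeline can be wrapped in a small function that takes the number of results as an argument. The helper name largest_blobs and the sample files are hypothetical; the pipeline itself mirrors the one-liner above.

```shell
set -eu
# largest_blobs N: print the N largest blobs reachable from any ref, biggest last.
# (Hypothetical helper name; defaults to 10 results.)
largest_blobs() {
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    awk '$1 == "blob"' |
    sort -n -k3 |
    tail -n "${1:-10}"
}

# Sandbox repository with one 64 KiB file and one tiny file.
repo=$(mktemp -d)
cd "$repo"
git init -q
dd if=/dev/zero of=big.bin bs=1024 count=64 2>/dev/null
echo "small" > small.txt
git add big.bin small.txt
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Add sample files"

top=$(largest_blobs 1)
echo "$top"
```

The sizes reported are uncompressed object sizes, so they indicate which files to investigate rather than the exact space each one occupies on disk.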

Housekeeping Checklist

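| Task | Command | Frequency |
| --- | --- | --- |
| Dry run clean | git clean -n | Before every clean |
| Remove untracked | git clean -fd | Weekly |
| Garbage collect | git gc | Monthly |
| Aggressive GC | git gc --aggressive | Quarterly |
| Check size | du -sh .git | Monthly |
| Find large objects | git rev-list --objects --all (piped through git cat-file, as in Repository Size Analysis) | Quarterly |
| Prune remote tracking | git remote prune origin | Monthly |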

Common Errors

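| Error | Cause | Fix |
| --- | --- | --- |
| fatal: clean.requireForce defaults to true | git clean was run without -f, -n, or -i | Add -f (after a dry run with -n) |
| File not removed by git clean -f | The file is ignored, or the path is an untracked directory | Add -x to remove ignored files; add -d to remove directories |
| gc --prune=now removed objects you wanted | The objects were unreachable (no branch, tag, or reflog entry led to them) | Reachable objects are never pruned; avoid --prune=now while you may still need to recover dropped commits |
| Another git process seems to be running in this repository | A lock file such as .git/index.lock exists | Confirm no Git process is still running, then remove the stale lock file |
| Cannot rewrite packs | Repository corruption | Use git fsck to check integrity |
| git clean -fd removed the wrong files | No dry run was done first | Always run git clean -n before -f |
| Repository size not reducing after gc | The large objects are still reachable from history | gc only removes unreachable objects |
| warning: reflog of 'HEAD' pruned | Normal GC behavior | Reflog entries older than expiry were removed |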

Practice Questions

What is the difference between git clean and git rm?

git clean removes untracked files from the working directory (files not in Git's index). git rm removes tracked files from both the index and the working directory and stages the deletion for your next commit. Use clean for clutter like compiled files; use rm for tracked files that no longer belong in the project. Note that git rm does not erase a file from earlier commits.
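The contrast is easy to see side by side. A minimal sketch in a throwaway repository, with illustrative file names:

```shell
set -eu
repo=$(mktemp -d)   # sandbox repository (illustrative)
cd "$repo"
git init -q
echo "tracked" > tracked.txt
git add tracked.txt
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Add tracked.txt"
echo "scratch" > untracked.tmp

# git clean only touches files Git does not track.
git clean -f -q
if [ -f tracked.txt ]; then tracked_after_clean=present; else tracked_after_clean=missing; fi
echo "tracked.txt after git clean: $tracked_after_clean"

# git rm deletes a tracked file and stages that deletion for the next commit.
git rm -q tracked.txt
status=$(git status --short)
echo "$status"
```

The staged deletion shows up in git status until it is committed or undone.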

What does git gc do?

git gc (garbage collection) compresses Git's object database. It consolidates loose objects into packfiles, removes unreachable objects (such as orphaned commits left behind by rebases or resets) once they are older than the prune grace period (two weeks by default) and no reflog entry refers to them, and optimizes the packfile structure. This usually shrinks the .git folder and improves performance.

Is git clean dangerous?

Yes, if used without care. Always run git clean -n first for a dry run. The -f flag permanently deletes files (they bypass the trash). The -x flag also removes ignored files that might be important (like .env files). Start with -n, then verify what will be deleted.

How often should I run git gc?

Git runs gc automatically when there are enough loose objects. Manual gc is needed rarely — monthly for active repos, quarterly for others. The --aggressive option is for repos that have grown very large and should be used sparingly (it's CPU-intensive).
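To check how close a repository is to an automatic collection, the loose-object count can be compared with the gc.auto setting. This is a rough sketch: Git estimates the loose-object count internally rather than counting exactly, so treat the comparison as an approximation. The sandbox repository is illustrative.

```shell
set -eu
repo=$(mktemp -d)   # sandbox repository (illustrative)
cd "$repo"
git init -q
echo "hello" > file.txt
git add file.txt
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Initial commit"

# gc.auto is the loose-object threshold; Git falls back to 6700 when it is unset.
threshold=$(git config --get gc.auto || echo 6700)
loose=$(git count-objects -v | awk '/^count:/ {print $2}')
echo "loose objects: $loose (auto gc threshold: $threshold)"

# git gc --auto is a no-op unless Git judges that housekeeping is needed.
git gc --auto -q
loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
echo "loose objects after git gc --auto: $loose_after"
```

With only a few loose objects, git gc --auto leaves the repository untouched, which is why manual runs are rarely necessary.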

Can git gc delete important data?

git gc with default options only removes unreachable objects (objects that no branch, tag, or reflog entry leads to), and only once they are older than the grace period (two weeks by default). It never removes reachable objects. However, git gc --prune=now removes unreachable objects immediately, so if you prune too aggressively you may be unable to recover a recently lost commit.

Challenge

Create a repository with 50 commits, including some large files (create with dd). Run git clean -n then git clean -fd. Check the .git folder size. Run git gc and compare the size. Then run git gc --aggressive and compare again. Finally, find the 5 largest objects in the repository.

Real-World Task

A CI/CD pipeline is running out of disk space because the repository accumulates temporary build artifacts. Configure a post-build cleanup step in the CI pipeline that runs git clean -fdx to remove all untracked and ignored files (compiled binaries, dependencies, logs). Schedule a monthly git gc for the repository using a cron job. This optimization is standard at DodaTech for CI pipelines building Durga Antivirus Pro, where build artifacts can exceed 10 GB per pipeline run.
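One possible shape for the cleanup step is a small shell function called at the end of the build, plus a crontab entry for the monthly gc. Everything named here (the ci_cleanup function, the paths, the schedule) is a hypothetical placeholder to adapt to your CI system, not DodaTech's actual configuration.

```shell
set -eu
# ci_cleanup DIR: remove every untracked and ignored path in the checkout at DIR.
# (Hypothetical function name; call it as the last step of the build job.)
ci_cleanup() {
  git -C "$1" clean -fdx -q
}

# Example crontab entry for the monthly gc (03:00 on the 1st; path is a placeholder):
#   0 3 1 * * git -C /path/to/repo gc --quiet

# Sandbox run to show the effect on a checkout with build output.
repo=$(mktemp -d)
git -C "$repo" init -q
printf 'dist/\n' > "$repo/.gitignore"
git -C "$repo" add .gitignore
git -C "$repo" -c user.name=Demo -c user.email=demo@example.com commit -q -m "Add .gitignore"
mkdir "$repo/dist" "$repo/tmp"
touch "$repo/dist/app.bin" "$repo/tmp/build.log"

ci_cleanup "$repo"
ls -A "$repo"
```

Because -x also deletes ignored files, this step belongs only in disposable CI checkouts, never in a developer's working copy that holds local configuration.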


Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
