
Git Archive — Exporting Projects

DodaTech Updated 2026-06-24 6 min read

Git archive packages a snapshot of your repository at a specific commit into a tar or zip archive, leaving out version control metadata (.git) so the result is a clean deployment artifact.

In this tutorial, you'll learn how to export repository snapshots as tar or zip archives, include submodules, customize output formats, and integrate archive creation into your CI/CD pipeline. Archive exports are useful for deployments, sharing source code without history, creating release artifacts, and distributing projects without the .git directory. By the end, you'll be able to automate archive creation for every release.

Real-world use: DodaZIP ships source archives for every release using git archive. Durga Antivirus Pro uses git archive in CI to package deployment artifacts without the repository's full history.

flowchart LR
  A[git archive] --> B[Repository]
  B --> C[Select Format]
  C --> D[tar]
  C --> E[zip]
  C --> F[tar.gz]
  D & E & F --> G[Select Commit]
  G --> H[HEAD]
  G --> I[Tag]
  G --> J[Commit hash]
  H & I & J --> K[Output Archive]
  K --> L[--output file.tar]
  K --> M[stdout stream]

Basic Archive Creation

Create an archive of the current HEAD. Git infers the format from the --output file extension and falls back to tar when it cannot.

# Create a tar archive of the current HEAD
git archive --output=project.tar HEAD

# Create a gzipped tar
git archive --output=project.tar.gz HEAD

# Create a zip archive
git archive --output=project.zip HEAD

Expected output:

$ ls -la project.tar
-rw-r--r-- 1 user user 245760 Jun 24 10:00 project.tar

$ tar tf project.tar | head -5
src/
src/index.js
src/utils/
src/utils/helpers.js
README.md
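To check what an archive would contain before writing it to disk, the tar stream can be piped straight into tar. The following is a minimal sketch, assuming a tar binary is on the PATH; the function names list_archive and archive_is_clean are made up for illustration.

```shell
#!/bin/bash
# list_archive and archive_is_clean are hypothetical helper names, not git commands.

# Prints every path that `git archive` would export for the given tree-ish.
list_archive() {
  local ref=${1:-HEAD}
  # With no --output and no --format, git archive writes an uncompressed tar to stdout
  git archive "${ref}" | tar -tf -
}

# Succeeds (exit 0) when no exported path lives under .git/
archive_is_clean() {
  ! list_archive "${1:-HEAD}" | grep -q '^\.git/'
}
```

Running list_archive v3.2.0 inside a repository prints the file list for that tag without leaving an archive behind.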

Exporting Specific Commits

Archive any commit, tag, or branch.

# Archive by tag (for releases)
git archive --output=release-v3.2.0.tar.gz v3.2.0

# Archive by commit hash
git archive --output=snapshot-a1b2c3d.tar.gz a1b2c3d

# Archive a specific branch
git archive --output=develop-snapshot.tar.gz develop

# Archive the commit HEAD pointed to one day ago (resolved from the local reflog)
git archive --output=yesterday.tar.gz 'HEAD@{1.day.ago}'
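Because any ref resolves to a commit, the archive name can be derived from that commit instead of being typed by hand. A small sketch under that assumption; snapshot_ref is a hypothetical helper name, and the snapshot-<hash> naming simply mirrors the example above.

```shell
#!/bin/bash
# snapshot_ref is a hypothetical helper: archives any ref under a name
# derived from the commit it resolves to, and prints the file name.
snapshot_ref() {
  local ref=$1 short
  # ^{commit} peels tags and rejects anything that is not a commit
  short=$(git rev-parse --verify --quiet --short "${ref}^{commit}") || {
    echo "not a commit: ${ref}" >&2
    return 1
  }
  git archive --output="snapshot-${short}.tar.gz" "${ref}"
  echo "snapshot-${short}.tar.gz"
}
```

Calling snapshot_ref develop and snapshot_ref v3.2.0 then yields files that can be traced back to an exact commit, even after the branch moves on.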

Adding a Prefix Directory

Wrap all files in a directory to avoid name collisions when extracting.

# Without prefix: contents extract into current directory
git archive --output=app.tar.gz HEAD

# With prefix: all files extract into app-v3.2.0/
git archive \
  --output=app-v3.2.0.tar.gz \
  --prefix=app-v3.2.0/ \
  v3.2.0

Expected extraction:

$ tar xzf app-v3.2.0.tar.gz
$ ls app-v3.2.0/
src/  README.md  package.json  config/
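One detail that is easy to miss: --prefix is prepended to each path verbatim, so the trailing slash is what turns it into a directory. The sketch below wraps that rule in a function; archive_release and its argument order are assumptions made for illustration.

```shell
#!/bin/bash
# archive_release is a hypothetical helper name.
# Usage: archive_release <name> <ref>   e.g. archive_release app v3.2.0
archive_release() {
  local name=$1 ref=$2
  local prefix="${name}-${ref}"
  # Without the slash, "app-v3.2.0" would be glued onto file names
  # (app-v3.2.0README.md), so the slash is added here in one place.
  git archive --output="${prefix}.tar.gz" --prefix="${prefix}/" "${ref}"
}
```

Keeping the archive name and the prefix in sync this way means the extracted directory always matches the file it came from.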

Archiving a Subdirectory

Export only a portion of the repository.

# Archive only the src/ directory at HEAD
git archive --output=src-only.tar.gz HEAD:src/

# Archive the docs/ directory at a specific tag
git archive --output=docs-v3.2.0.tar.gz v3.2.0:docs/

# Archive multiple paths: one tree-ish followed by the paths to keep
git archive --output=partial.tar.gz HEAD src/ config/
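The two ways of selecting a subdirectory produce different paths inside the archive, which matters when the result is extracted. The sketch below lists both side by side; the function names are hypothetical and the src directory is taken from the examples above.

```shell
#!/bin/bash
# Hypothetical helper names, shown side by side for comparison.

# Tree-ish form: archives the src tree itself, so paths are relative to src/
paths_from_subtree() {
  git archive "HEAD:src" | tar -tf -
}

# Path-limit form: archives the commit but keeps only src/,
# so every path still starts with src/
paths_from_pathspec() {
  git archive HEAD src/ | tar -tf -
}
```

For a repository containing src/index.js, the first function lists index.js and the second lists src/index.js.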

Using git archive in CI/CD

Automate archive creation in your CI pipeline for deployment artifacts.

# .github/workflows/release-archive.yml
name: Create Release Archive
on:
  push:
    tags:
      - 'v*'

jobs:
  archive:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Create archive
        run: |
          TAG=${GITHUB_REF#refs/tags/}
          git archive \
            --output=release-${TAG}.tar.gz \
            --prefix=${TAG}/ \
            ${TAG}

      - name: Upload to release
        uses: softprops/action-gh-release@v2
        with:
          files: release-*.tar.gz

Expected result for tag v3.2.0 (git archive itself prints nothing on success):

Archive created: release-v3.2.0.tar.gz
Files extract into: v3.2.0/
Asset attached to the v3.2.0 GitHub Release
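A release archive is easier to verify when a checksum file ships next to it. The following sketch could run as an extra step after the archive is created; write_checksum is a hypothetical helper name and it assumes sha256sum from GNU coreutils is available on the runner.

```shell
#!/bin/bash
# write_checksum is a hypothetical helper; assumes GNU coreutils sha256sum.
# Usage: write_checksum release-v3.2.0.tar.gz
write_checksum() {
  local archive=$1
  # Run from the archive's directory so the .sha256 file records a bare file name
  ( cd "$(dirname "${archive}")" && \
    sha256sum "$(basename "${archive}")" > "$(basename "${archive}").sha256" )
}

# Verify later, from the directory holding both files:
#   sha256sum -c release-v3.2.0.tar.gz.sha256
```

Recording a bare file name keeps the checksum file valid wherever the pair is downloaded.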

Archive with Submodules

By default, git archive skips submodule content. To include it, archive each submodule separately and merge the results.

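#!/bin/bash
# archive-with-submodules.sh
# Creates an archive that includes submodule content.
# Run from the top level of the superproject; requires GNU tar (-A).
set -euo pipefail

TAG=${1:-HEAD}
OUTPUT=${2:-archive-with-submodules.tar.gz}
PREFIX=${3:-}   # if set, end it with a slash, e.g. app-v3.2.0/

TEMP_DIR=$(mktemp -d)
trap 'rm -rf "${TEMP_DIR}"' EXIT
export TEMP_DIR PREFIX   # read by the foreach subshells below

# Archive the main repository
git archive --output="${TEMP_DIR}/main-repo.tar" --prefix="${PREFIX}" "${TAG}"

# Archive each submodule at its checked-out commit, prefixed with its path.
# Single quotes defer expansion to git, which sets $sha1 and $displaypath.
git submodule foreach --recursive \
  'git archive --output="${TEMP_DIR}/sub-${sha1}.tar" --prefix="${PREFIX}${displaypath}/" HEAD'

# Append each submodule archive to the main one
for sub in "${TEMP_DIR}"/sub-*.tar; do
  [ -e "${sub}" ] || continue   # glob stays literal when there are no submodules
  tar -Af "${TEMP_DIR}/main-repo.tar" "${sub}"
done

# Compress
gzip -c "${TEMP_DIR}/main-repo.tar" > "${OUTPUT}"

echo "Created ${OUTPUT} with submodules"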
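Before merging anything, it helps to know which paths in a commit are submodules at all. Git records them as tree entries with mode 160000, so they can be listed without initializing them. A minimal sketch; list_gitlinks is a hypothetical helper name.

```shell
#!/bin/bash
# list_gitlinks is a hypothetical helper name.
# Prints the path of every submodule entry (mode 160000) recorded in a commit.
list_gitlinks() {
  local ref=${1:-HEAD}
  # ls-tree lines look like: "<mode> <type> <object><TAB><path>"
  git ls-tree -r "${ref}" | awk -F'\t' '$1 ~ /^160000 / { print $2 }'
}
```

An empty result means the plain git archive output is already complete for that commit.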

Streaming Archives over SSH

Generate archives remotely without cloning.

# Stream an archive from the remote repository into a local file
ssh user@server "cd /repo && git archive --format=tar.gz HEAD" > deploy.tar.gz

# Stream to another server (compress on one end, decompress on the other)
ssh repo-server "cd /project && git archive --format=tar.gz HEAD" | \
  ssh deploy-server "cd /var/www && tar xzf -"

# Using git archive --remote (if enabled on server)
git archive --remote=ssh://git@server/repo.git \
  --output=remote-archive.tar.gz \
  HEAD
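The producing and consuming ends of a streamed archive must agree on compression, and that is easiest to check locally before a network hop is involved. The sketch below runs the same kind of pipeline on one machine; export_to_dir is a hypothetical helper name.

```shell
#!/bin/bash
# export_to_dir is a hypothetical helper: the streaming pipeline without SSH,
# so both ends can be tested on one machine first.
# Usage: export_to_dir <ref> <destination-directory>
export_to_dir() {
  local ref=$1 dest=$2
  mkdir -p "${dest}"
  # Producer compresses with --format=tar.gz, so the consumer decompresses with -z
  git archive --format=tar.gz "${ref}" | tar -xzf - -C "${dest}"
}
```

Once this works locally, each side of the pipe can be wrapped in its own ssh command.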

Customizing Archive Content with .gitattributes

Control exactly which files are included or excluded from archives.

# .gitattributes — archive settings

# Exclude development files from archives
.gitignore          export-ignore
.gitattributes      export-ignore
.env.example        export-ignore
tests/              export-ignore
docs/               export-ignore
.editorconfig       export-ignore
Dockerfile          export-ignore
docker-compose.yml  export-ignore

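# Build artifacts that are gitignored are not tracked, and git archive only
# packs tracked content. No attribute pulls them in; add them with
# --add-file when building the archive (see below).

# File modes in the archive come from the tracked executable bit
# (adjusted by the tar.umask config setting), not from .gitattributes.

Now build your archive. The untracked dist/bundle.js is added explicitly with --add-file, which requires a Git version that supports that option. The --prefix=dist/ before it places the file under dist/, and the empty --prefix= resets the prefix for the tracked files:

git archive --output=clean-release.tar.gz \
  --prefix=dist/ --add-file=dist/bundle.js \
  --prefix= \
  HEAD

Expected content:

$ tar tf clean-release.tar.gz
README.md
package.json
src/
src/index.js
dist/bundle.js
# tests/ and .gitignore are excluded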
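When a rule does not seem to apply, git check-attr shows how Git resolves the attribute for a given path. A minimal sketch; is_export_ignored is a hypothetical helper name. Note that check-attr reads .gitattributes from the working tree, so it also reflects rules that are not committed yet, and that files inside an ignored directory report the attribute as unspecified because git archive excludes them by skipping the directory itself.

```shell
#!/bin/bash
# is_export_ignored is a hypothetical helper name.
# Succeeds when the export-ignore attribute is set for the given path.
is_export_ignored() {
  # check-attr prints lines such as ".editorconfig: export-ignore: set"
  git check-attr export-ignore -- "$1" | grep -q ': export-ignore: set$'
}
```

For example, is_export_ignored .editorconfig && echo "excluded" confirms that a file-level rule matches.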

Archive Size Comparison

Compare archive sizes for different formats.

#!/bin/bash
# compare-archive-formats.sh

TAG=$(git describe --tags --abbrev=0)
echo "Archiving tag: ${TAG}"

git archive --format=tar "${TAG}" | wc -c | numfmt --to=iec
# Expected: 2.3M

git archive --format=tar.gz "${TAG}" | wc -c | numfmt --to=iec
# Expected: 480K

git archive --format=zip "${TAG}" | wc -c | numfmt --to=iec
# Expected: 520K
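Compression level is a second knob next to the format: git archive accepts -0 through -9 for zip and tar.gz output. The sketch below measures the byte count for any combination; archive_bytes is a hypothetical helper name, and the actual numbers depend entirely on the repository.

```shell
#!/bin/bash
# archive_bytes is a hypothetical helper name.
# Usage: archive_bytes <ref> <format> [level]   e.g. archive_bytes v3.2.0 tar.gz 9
archive_bytes() {
  local ref=$1 format=$2 level=${3:-}
  # -<digit> sets the compression level; plain tar has no levels, so omit it there
  git archive --format="${format}" ${level:+-"${level}"} "${ref}" | wc -c
}
```

Comparing archive_bytes HEAD tar.gz 1 with archive_bytes HEAD tar.gz 9 shows how much size the slower setting saves for a particular codebase.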

Common Errors

  1. export-ignore not working: git archive reads .gitattributes from the tree being archived, so the rule must be committed in that commit. To apply uncommitted rules from the working tree, pass --worktree-attributes.
  2. Empty archives: If export-ignore excludes everything, the archive has zero files. List the contents first with git archive HEAD | tar tf - to see what would be included. (git archive --list prints the supported formats, not the files.)
  3. Submodule content missing: git archive does not include submodule content by default. Use the script above or a CI step to merge submodule archives.
  4. Corrupted streamed archives: SSH pipes are binary-safe, but forcing a TTY (ssh -t) or remote startup scripts that print to stdout will corrupt the stream. Also match compression on both ends: pair git archive --format=tar.gz with tar xzf, or plain git archive with tar xf.
  5. --remote disabled on server: Many Git hosts disable git archive --remote for security. Use SSH to the server or clone-then-archive instead.
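The empty-archive case can be caught automatically before anything is published. A minimal sketch of such a guard; require_nonempty_archive is a hypothetical helper name.

```shell
#!/bin/bash
# require_nonempty_archive is a hypothetical helper name.
# Fails when the archive for <ref> would contain no files.
require_nonempty_archive() {
  local ref=${1:-HEAD} count
  # Directory entries end in "/", so drop them before counting.
  # grep -c exits non-zero on a zero count, hence the "|| true".
  count=$(git archive "${ref}" | tar -tf - | grep -vc '/$' || true)
  if [ "${count}" -eq 0 ]; then
    echo "archive for ${ref} would be empty; check export-ignore rules" >&2
    return 1
  fi
  echo "${count} files"
}
```

Placed ahead of the archive step in a release script, the guard stops the run before an empty artifact is uploaded.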

Practice Questions

What does git archive do?

git archive creates a tar or zip archive (optionally compressed, as with tar.gz) of a repository at a specific commit, tag, or branch. It excludes the .git directory, making it ideal for deployments, releases, and sharing snapshots of the source code.

How do I exclude files from git archive?

Add export-ignore to .gitattributes for files or directories you want to exclude. For example, tests/ export-ignore excludes the test directory. The .gitattributes file must be committed to the repository.

Can git archive include submodules?

Not by default. git archive only archives the main repository and records each submodule path as an empty directory. To include submodules, use a script that runs git archive on each submodule separately and combines the results.

What is the difference between git archive and git bundle?

git archive creates a source code snapshot (no history) for distribution. git bundle creates a single file that packs Git objects and refs, including history, so it can be cloned or fetched from like a repository. Use archive for releases, bundle for backup or offline transfer.

How do I use git archive in a CI/CD pipeline?

After checking out the repository, run git archive --output=release.tar.gz HEAD. Upload the resulting file as a build artifact or release asset. Use the --prefix option to wrap files in a versioned directory for clean extraction.

Challenge

Write a script that automates release artifact creation. The script should: take a tag name as input, create a tar.gz archive with a versioned prefix, exclude all development files (tests, docs, CI configs), include submodule content merged into the archive, upload the archive to GitHub Releases, and print the final archive size and file count.

Real-World Task

Set up automated archive generation for DodaZIP releases. Configure .gitattributes to exclude tests/, docs/, ci/, Dockerfile, and local config files from release archives. Create a CI workflow that triggers on version tags, builds the archive with git archive, adds submodule content for shared libraries, and uploads the artifact to the GitHub Release page with a checksum file for verification.



Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
