@vadimkantorov
Last active April 18, 2025 13:30
A simple Git LFS dedup implementation that uses hard links so data object files are not stored twice on disk (suitable for read-only cloned repos such as models/datasets from HuggingFace; leaves the repo in an invalid state)
# Usage: bash git_lfs_clone_dedup.sh https://huggingface.co/deepseek-ai/DeepSeek-V3-0324 ~/DeepSeek-V3-0324
# Usage: bash git_lfs_clone_dedup.sh [email protected]:deepseek-ai/DeepSeek-V3-0324 ~/DeepSeek-V3-0324
# https://github.com/git-lfs/git-lfs/discussions/6029
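# Clone with LFS pointer files only (no smudge), so no large file is checked out yet.
GIT_LFS_SKIP_SMUDGE=1 git clone "$1" "$2"
cd "$2" || exit 1
# Download the LFS objects for the current ref into .git/lfs/objects.
git lfs fetch
# Replace each pointer file with a hard link to its object; skip the file if the object is missing.
git lfs ls-files -l | while read -r SHA DASH FILEPATH; do OBJ=".git/lfs/objects/${SHA:0:2}/${SHA:2:2}/$SHA"; [ -f "$OBJ" ] && rm "$FILEPATH" && ln "$OBJ" "$FILEPATH"; done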
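# Alternative (disabled): move the objects instead of linking them, which empties the local LFS store.
#git lfs ls-files -l | while read -r SHA DASH FILEPATH; do mv ".git/lfs/objects/${SHA:0:2}/${SHA:2:2}/$SHA" "$FILEPATH"; done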
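
After running the script, it can be useful to confirm that each LFS-tracked file really shares its data with the object in `.git/lfs/objects`. The following is a minimal sketch of such a check, not part of the original gist: the helper names `lfs_object_path`, `same_inode`, and `verify_lfs_links` are hypothetical, and it assumes bash plus the same object layout the script above relies on.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for checking the result of the dedup script (assumed names, not from the gist).

# Map a SHA-256 OID to its location in the local LFS store,
# using the same two-level directory layout as the script above.
lfs_object_path() {
  local sha=$1
  printf '.git/lfs/objects/%s/%s/%s\n' "${sha:0:2}" "${sha:2:2}" "$sha"
}

# Succeeds when both paths refer to the same device and inode,
# i.e. they are hard links to one file and the data exists only once.
same_inode() {
  [ "$1" -ef "$2" ]
}

# Run from the repository root: print every LFS-tracked file that is
# not hard-linked to its object, and return non-zero if any is found.
verify_lfs_links() {
  local sha dash filepath status=0
  git lfs ls-files -l | {
    while read -r sha dash filepath; do
      if ! same_inode "$(lfs_object_path "$sha")" "$filepath"; then
        echo "not linked: $filepath"
        status=1
      fi
    done
    return "$status"
  }
}
```

To use it, source the snippet, `cd` into the cloned repository, and call `verify_lfs_links`. No output and a zero exit status mean every listed file is a hard link to its LFS object.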