@Sherex
Last active April 22, 2025 20:04
A script to delete files and directories from archives specified with a glob pattern

Borg Delete Paths

A simple utility to delete paths from archives to save space in Borg repositories.

Warning

Although I have used it a few times on one of my repositories, it should still be considered untested. There is NO guarantee that it is bug-free, and it should be treated accordingly. Be careful, take precautions, and keep multiple backups.

I managed to back up several hundred GB of podman layers over several months... (I should monitor my backups more closely.) So I needed a tool to delete the whole podman directory from my archives, while also making sure that only the files I expected were deleted.

Overview

The workflow of the script is:

  1. Retrieve all archives in the specified repository.
  2. Keep only the archives matching the glob.
  3. Present the user with a list of the archives that matched.
  4. Loop through each archive:
    a. Use borg list to get all files matching the exclude patterns.
    b. Present the user with the count and size of the files (skip the archive if the size is 0).
    c. Ask the user whether to recreate the archive excluding the files listed earlier:
      1. Using p, the user can preview the files.
      2. Using y, the user can accept, and the archive will be recreated without those files (using borg recreate).
      3. Using a, the user can accept this and all subsequent recreation requests (i.e. it will go through all the matched archives and recreate them).
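
Steps 1 and 2 can be illustrated without a Borg repository. The following is a minimal sketch, not the script itself: `filter_by_glob` is a hypothetical helper and the archive names are made up. It uses `case`, which matches an unquoted pattern as a glob in the same way as the `[[ "$archive" == $ARCHIVE_GLOB ]]` test in the script.

```shell
#!/usr/bin/env bash
# Sketch: keep only the archive names that match a glob.
# filter_by_glob is a hypothetical helper; the archive names are invented.
filter_by_glob() {
  local glob=$1 name
  shift
  for name in "$@"; do
    # An unquoted pattern in `case` is matched as a glob, not as a literal string.
    case "$name" in
      $glob) printf '%s\n' "$name" ;;
    esac
  done
}

filter_by_glob 'Archy-safe-2025*' \
  Archy-safe-2025-01-01 Archy-home-2025-01-01 Archy-safe-2024-12-31
```

Of the three sample names, only `Archy-safe-2025-01-01` matches the glob, so it is the only line printed.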

Usage

  [BORG=<borg-executable-path>] ./borg-delete-paths.sh <repository> [archive_glob] <exclude_pattern1> [<exclude_pattern2> ...]

Description:
  Deletes specified paths from one or more Borg archives using exclude patterns.

Arguments:
  BORG                   Optional. Path to a Borg executable (default: "borg").
  <repository>           Target Borg repository. Use "::" for the default repository.
  [archive_glob]         Optional. Glob pattern to match archive names (e.g., "MyArchive-*").
                         If omitted, all archives in the repository will be considered.
                         NOTE: This option is detected by checking for the existence of a "*" wildcard in the 2nd argument.
                               Horrible, I know; maybe a future version will use GNU-style options.
  <exclude_patternN>     One or more path patterns to delete from the matched archives.

Example:
  BORG=borg-job-safe ./borg-delete-paths.sh :: 'Archy-safe-2025*' persistent/safe/home/sherex/{.cache,containers}
  ./borg-delete-paths.sh :: 'Archy-safe-2025-01-01*' persistent/safe/home/sherex/.cache persistent/safe/home/sherex/.cargo
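
The wildcard heuristic from the NOTE above can be sketched in isolation. This is an illustration under assumptions, not the script's code: `detect_glob` is a hypothetical helper that only reports which archive glob would be used for a given argument list.

```shell
#!/usr/bin/env bash
# Sketch of the argument heuristic: the 2nd argument is treated as an archive
# glob only if it contains a '*'. detect_glob is a hypothetical helper.
detect_glob() {
  # $1 = repository, $2 = archive glob or first exclude pattern
  case "$2" in
    *'*'*) printf 'glob=%s\n' "$2" ;;   # contains a literal '*': use it as the glob
    *)     printf 'glob=*\n' ;;         # no wildcard: fall back to all archives
  esac
}

detect_glob :: 'Archy-safe-2025*' persistent/safe/home/sherex/.cache
detect_glob :: persistent/safe/home/sherex/.cache
```

One consequence of this heuristic: if the archive glob is omitted and the first exclude pattern itself contains a `*`, that pattern is taken as the archive glob. Passing an explicit `'*'` as the 2nd argument avoids the ambiguity.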
#!/usr/bin/env bash
BORG=${BORG:-borg}
if [ "$#" -lt 2 ]; then
  echo "Usage: $0 <repository> [archive_glob] <exclude_pattern1> [<exclude_pattern2> ...]"
  echo "If the archive_glob is omitted, '*' is assumed."
  exit 1
fi
REPO="$1"
shift
if [[ "$1" == *"*"* ]]; then
  ARCHIVE_GLOB="$1"
  shift
else
  ARCHIVE_GLOB="*"
fi
if [[ "$REPO" == "::" ]]; then
  REPO=""
fi
if [ "$#" -lt 1 ]; then
  echo "You must provide at least one --exclude pattern."
  exit 1
fi
EXCLUDE_PATTERNS=( "$@" )
echo "Using Borg executable: $BORG"
echo "Repository: $REPO"
echo "Archive glob filter: $ARCHIVE_GLOB"
echo "Exclude patterns:"
for EX in "${EXCLUDE_PATTERNS[@]}"; do
  echo " $EX"
done
echo "--------------------------------------------"
echo "Retrieving archives from repository..."
# Quote "$BORG" so that an executable path containing spaces is not word-split.
ARCHIVES_RAW=$("$BORG" list "$REPO" --short 2>/dev/null)
if [ $? -ne 0 ] || [ -z "$ARCHIVES_RAW" ]; then
  echo "Error: Unable to list archives from repository $REPO"
  exit 1
fi
readarray -t ALL_ARCHIVES <<< "$ARCHIVES_RAW"
MATCHED_ARCHIVES=()
for archive in "${ALL_ARCHIVES[@]}"; do
  # The right-hand side is deliberately unquoted so it is matched as a glob.
  if [[ "$archive" == $ARCHIVE_GLOB ]]; then
    MATCHED_ARCHIVES+=( "$archive" )
  fi
done
if [ ${#MATCHED_ARCHIVES[@]} -eq 0 ]; then
  echo "No archives in repository '$REPO' match the glob '$ARCHIVE_GLOB'."
  exit 0
fi
echo "The following archives will be processed (repository::archive):"
for archive in "${MATCHED_ARCHIVES[@]}"; do
  echo " $REPO::$archive"
done
echo "--------------------------------------------"
# -r keeps backslashes in the answer from being interpreted.
read -r -p "Proceed with processing these archives? [y/N]: " confirm
if [[ ! "$confirm" =~ ^[Yy]$ ]]; then
  echo "Aborted by user."
  exit 0
fi
# Convert a byte count to a human-readable size (integer division, rounds down).
human_readable_size() {
  local size=$1
  local units=(B KB MB GB TB PB)
  local i=0
  while ((size >= 1024 && i < ${#units[@]} - 1)); do
    size=$((size / 1024))
    ((i++))
  done
  echo "$size ${units[$i]}"
}
ALWAYS_RUN=false
for archive in "${MATCHED_ARCHIVES[@]}"; do
  echo "--------------------------------------------"
  echo "Processing archive: $REPO::$archive"
  # Build the recreate command as an array so each pattern stays one argument.
  CMD=( "$BORG" "recreate" "--progress" )
  for EXCLUDE in "${EXCLUDE_PATTERNS[@]}"; do
    CMD+=( "--exclude" "$EXCLUDE" )
  done
  FULL_ARCHIVE="$REPO::$archive"
  CMD+=( "$FULL_ARCHIVE" )
  echo "Retrieving file list for the exclude patterns above..."
  FILE_LIST=$("$BORG" list "$FULL_ARCHIVE" "${EXCLUDE_PATTERNS[@]}" 2>/dev/null)
  FILE_COUNT=$(echo "$FILE_LIST" | wc -l)
  # Sum the size column; printf "%.0f" keeps the total a plain integer, since
  # some awk implementations may print large sums in exponent notation.
  TOTAL_SIZE_BYTES=$(echo "$FILE_LIST" | awk '{sum+=$4} END{printf "%.0f\n", sum}')
  TOTAL_SIZE_HR=$(human_readable_size "$TOTAL_SIZE_BYTES")
  if [ "$TOTAL_SIZE_BYTES" -eq 0 ]; then
    echo "Total size of files matching pattern: $TOTAL_SIZE_HR"
    echo "Pattern did not match any files. Skipping..."
    continue
  fi
  echo "Command to be executed (affects $FILE_COUNT files, total size $TOTAL_SIZE_HR):"
  printf ' %q ' "${CMD[@]}"
  echo
  if [ "$ALWAYS_RUN" = false ]; then
    while true; do
      read -r -p "Execute this recreate command? [y/N/a (always)/p (preview)]: " answer
      case "$answer" in
        [Yy]* ) break ;;
        [Aa]* )
          ALWAYS_RUN=true
          echo "Setting option 'always' for all remaining archives."
          break ;;
        [Pp]* )
          echo "$FILE_LIST" | less
          continue ;;
        * )
          echo "Skipping archive: $FULL_ARCHIVE"
          # Leave the prompt loop and move on to the next archive.
          continue 2 ;;
      esac
    done
  fi
  echo "Executing command for $FULL_ARCHIVE ..."
  "${CMD[@]}"
  RETVAL=$?
  if [ $RETVAL -eq 0 ]; then
    echo "Command executed successfully for $FULL_ARCHIVE."
  else
    echo "Error executing command for $FULL_ARCHIVE (exit code $RETVAL)."
  fi
done
echo "All selected archives have been processed."
echo "You should now run:"
echo "$BORG compact --progress --cleanup-commits $REPO"
echo "This might take a long time (hours) depending on how many files were deleted and how long it has been since the last compact."
# Untested (I manually ran compact first and created this while waiting, use with caution)
#COMPACT_CMD=("$BORG" compact --progress --cleanup-commits "$REPO")
#echo "Command to be executed:"
#printf ' %q ' "${COMPACT_CMD[@]}"
#echo
#read -p "Compact repository? [y/N]: " should_execute_compact
#if [[ "$should_execute_compact" =~ ^[Yy]$ ]]; then
# echo "Compacting repository.."
# "${COMPACT_CMD[@]}"
#fi