Last active
May 11, 2024 21:14
Concatenates code files from a directory and its subdirectories into a single output file.
#!/usr/bin/env bash

: <<'END'
Script Name: code_concatenator.bash

Purpose:
This script concatenates all code files within a specified directory and its
subdirectories into a single output. The output contains the file path and
contents of each code file, with the contents enclosed by a delimiter (```).
This is particularly useful for preparing code files for analysis or processing
by other tools or services that require a single file input.

Usage:
./code_concatenator.bash <CODE_DIR>

Arguments:
- CODE_DIR: The directory containing the code files to be concatenated.

Features:
- Respects the .gitignore file if the current working directory is a Git repository.
- Recursively traverses the specified directory and its subdirectories.
- Includes files regardless of extension (e.g., .js, .py, .java, .cpp); no
  extension filter is applied.
- Handles files with unbalanced backticks or other special characters.
- Outputs the concatenated file contents to the console (can be redirected to a file).

Background:
The script uses a combination of Bash built-in commands and utilities. It checks
whether the current working directory is a Git repository and, if so, uses the
`git ls-files` command to list tracked files, so files excluded by .gitignore
are skipped. If the current working directory is not a Git repository, the
script falls back to a recursive traversal of CODE_DIR. To avoid problems when
the ``` delimiter appears in the source code, the script writes a custom
intermediate delimiter (###) and converts it to ``` at the very end.
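
Example:
The placeholder technique described above can be sketched in isolation. This is
a minimal illustration, not the script's own code: the helper name
wrap_with_fences is hypothetical, and the sketch assumes the input file ends
with a newline. It swaps only lines that consist solely of ###, so a source
line that is exactly ### would be converted as well.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Build the three-backtick fence from octal escapes (\140 is a backtick)
# so this example never has to spell the fence out literally.
fence=$(printf '\140\140\140')

# Hypothetical helper: print a file's path, then its contents wrapped in
# ### placeholder lines, and finally swap each placeholder for a fence.
wrap_with_fences() {
  local file="$1"
  {
    printf '%s\n' "$file"
    printf '###\n'
    cat "$file"
    printf '###\n'
  } | LC_ALL=C sed "s/^###\$/$fence/"
}

# Demo: a throwaway file whose contents mention the fence characters.
demo_file=$(mktemp)
printf 'echo "uses a %s fence"\n' "$fence" > "$demo_file"
wrap_with_fences "$demo_file"
rm -f "$demo_file"
```

Because the swap runs once over the finished stream, the writing step never
needs to emit a fence itself.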

Dependencies:
- Bash: The script is written for Bash shell environments found in Linux and macOS
  systems.
- Git: The script requires Git to be installed if the current working directory is
  a Git repository.

Note:
While this script is designed to handle a wide range of code files, it may not work
as expected for files with extremely large sizes or specific encoding issues. It's
recommended to review the output and adjust the script as needed for your specific
use case.

Note:
This script was written with the assistance of the "claude-3-opus-20240229" model
developed by Anthropic.

Author: Jonathan Mohrbacher (github.com/johnnymo87)
Date: 2024-04-13
END

set -euo pipefail

# Function to concatenate files: appends each file's path and contents to
# the output file, wrapping the contents in ### placeholder delimiters.
concatenate_files() {
  local dir="$1"
  local output_file="$2"
  local repo_root file file_path

  # Check if the current working directory is a Git repository
  if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    # Get the root directory of the Git repository
    repo_root=$(git rev-parse --show-toplevel)
    # Use git ls-files to list tracked files, respecting .gitignore.
    # --full-name prints paths relative to the repository root, so the
    # $repo_root prefix stays valid when the script runs from a subdirectory.
    git ls-files --full-name -- "$dir" | while IFS= read -r file; do
      file_path="$repo_root/$file"
      printf '%s\n###\n' "$file_path" >> "$output_file"
      # "|| true" keeps an unreadable file from aborting the run under set -e
      cat "$file_path" >> "$output_file" 2>/dev/null || true
      printf '###\n\n' >> "$output_file"
    done
  else
    # Traverse the directory recursively
    for file in "$dir"/*; do
      if [ -d "$file" ]; then
        concatenate_files "$file" "$output_file"
      elif [ -f "$file" ]; then
        printf '%s\n###\n' "$file" >> "$output_file"
        cat "$file" >> "$output_file" 2>/dev/null || true
        printf '###\n\n' >> "$output_file"
      fi
    done
  fi
}
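
# Check if a directory is provided. Test the argument count rather than "$1",
# because set -u treats an unset $1 as an error before the check can run.
if [ "$#" -lt 1 ]; then
  echo "Usage: $0 <CODE_DIR>" >&2
  exit 1
fi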

# Create a temporary file for output and remove it on exit, even on failure
output_file=$(mktemp)
trap 'rm -f "$output_file"' EXIT

# Call the recursive function
concatenate_files "$1" "$output_file"

# Print the contents of the output file, replacing ### with ```.
LC_ALL=C sed 's/###$/```/' "$output_file"
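
The script reads Git's file list line by line and walks the non-Git fallback with a `"$dir"/*` glob, so filenames containing newlines can be split and dotfiles are skipped by the glob. The following is a hedged sketch of one alternative, not the author's method: it swaps in `find -print0` and a NUL-delimited `read` loop. The helper name `list_files_nul` is hypothetical, and the sketch does not consult .gitignore.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper, not part of the original script: list regular files
# under a directory with NUL separators so any filename survives intact.
list_files_nul() {
  local dir="$1"
  find "$dir" -type f -print0
}

# Demo: build a throwaway tree with a space in a name and a dotfile.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
mkdir -p "$demo_dir/sub dir"
printf 'a\n' > "$demo_dir/.hidden"
printf 'b\n' > "$demo_dir/sub dir/two words.txt"

# read -d '' consumes one NUL-terminated name per iteration.
list_files_nul "$demo_dir" | while IFS= read -r -d '' file; do
  printf 'found: %s\n' "$file"
done
```

The order in which `find` reports files is not guaranteed, so a caller that needs stable output would have to sort the list separately.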