Of course! Here is a well-commented and robust shell script that accomplishes the task. It's designed to be safe, handling filenames with spaces or special characters, and provides clear user feedback.
Key Features of this Script:
Robust File Handling: Uses find -print0 and read -d '' to safely handle filenames containing spaces, newlines, or other special characters.
Clear Argument Parsing: It checks for the required arguments and provides a helpful usage message if they are incorrect.
Dynamic find Command: The script builds the find command dynamically and safely, whether you provide extensions or not.
Relative Paths: Correctly calculates and displays the path relative to the starting directory.
Well-Commented: Each section is explained so you can understand what it's doing.
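As a rough illustration of the null-delimited technique named in the first feature, here is a minimal sketch. The helper name count_files is hypothetical and not part of the script; it uses process substitution rather than a pipe so that the counter survives the loop.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of concat_files.sh): count regular files
# under a directory without being fooled by odd filenames.
count_files() {
    local dir="$1" count=0 file
    # -print0 ends each path with a NUL byte; read -d '' splits on that byte,
    # so spaces and even newlines inside a name stay part of one entry.
    while IFS= read -r -d '' file; do
        count=$((count + 1))
    done < <(find "$dir" -type f -print0)
    echo "$count"
}

# Demo: three files, one of which has a newline in its name.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/with space.txt" "$demo_dir"/$'two\nlines.txt'
echo "null-delimited count: $(count_files "$demo_dir")"
echo "line-based count:     $(find "$demo_dir" -type f | wc -l)"
rm -rf "$demo_dir"
```

The line-based count comes out higher than the real number of files because the embedded newline is counted as a record boundary.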
The Script: concat_files.sh

#!/bin/bash
set -euo pipefail  # Exit on error, undefined variable, or pipe failure

# Print usage information and exit.
usage() {
    echo "Usage: $0 <source_directory> <output_file> [extension1 extension2 ...]"
    echo
    echo "Arguments:"
    echo "  <source_directory>  The directory to search for files recursively."
    echo "  <output_file>       The file to write the concatenated content to."
    echo "  [extension1 ...]    (Optional) A list of file extensions to include (e.g., .txt .md .py)."
    echo "                      If not provided, all files will be included."
    echo
    echo "Example (all files):"
    echo "  $0 ./my_project combined_output.txt"
    echo
    echo "Example (specific extensions):"
    echo "  $0 ./my_code combined_code.txt .js .css .html"
    exit 1
}

# Require at least the source directory and the output file.
if [ "$#" -lt 2 ]; then
    echo "Error: Missing required arguments." >&2
    usage
fi

SOURCE_DIR="$1"
OUTPUT_FILE="$2"
shift 2  # Remove the first two arguments, leaving only the extensions

if [ ! -d "$SOURCE_DIR" ]; then
    echo "Error: Source directory '$SOURCE_DIR' not found." >&2
    exit 1
fi

# Resolve the output path to an absolute one before changing directory.
temp_output_file="$OUTPUT_FILE"
if [[ "$OUTPUT_FILE" != /* ]]; then
    temp_output_file="$PWD/$OUTPUT_FILE"
fi
OUTPUT_FILE_ABS="$temp_output_file"

# Work from inside the source directory so find reports relative paths.
cd "$SOURCE_DIR" || exit

find_args=(".")           # Start search from the current directory "."
find_args+=("-type" "f")  # We only want files

if [ "$#" -gt 0 ]; then
    find_args+=("(")  # Start a group of OR conditions
    first_ext=true
    for ext in "$@"; do
        if [ "$first_ext" = false ]; then
            find_args+=("-o")  # Add "or" between conditions
        fi
        # Ensure the extension has a dot, but don't add one if it's already there.
        [[ "$ext" != .* ]] && ext=".$ext"
        find_args+=("-name" "*$ext")
        first_ext=false
    done
    find_args+=(")")  # End the group
fi

echo "Searching in '$SOURCE_DIR' and writing to '$OUTPUT_FILE_ABS'..."

# Create the output file, or empty it if it already exists.
: > "$OUTPUT_FILE_ABS"

find "${find_args[@]}" -print0 | while IFS= read -r -d '' file; do
    # Skip the output file itself if it lives inside the source directory.
    if [ "$file" -ef "$OUTPUT_FILE_ABS" ]; then
        continue
    fi

    relative_path="${file#./}"
    filename=$(basename -- "$relative_path")

    echo "Processing: $relative_path"

    {
        echo "filename: $filename"
        echo "path: $relative_path"
        echo 'content: """'  # Use single quotes to prevent shell expansion inside
        # Add a check for non-readable files
        if [ -r "$relative_path" ]; then
            cat -- "$relative_path"
        else
            echo "!!! ERROR: Could not read file !!!"
        fi
        echo '"""'
        echo  # The empty line separator
    } >> "$OUTPUT_FILE_ABS"
done

echo
echo "✅ Done. All content has been concatenated into '$OUTPUT_FILE_ABS'"
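To see why the script assembles its find arguments in an array, the sketch below isolates that step. The function name build_find_args and the global FIND_ARGS are hypothetical names chosen for illustration, and the sketch assumes each extension already starts with a dot.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of concat_files.sh): fill the global array
# FIND_ARGS with find arguments for the given extensions.
# Assumption: each extension already starts with a dot.
build_find_args() {
    FIND_ARGS=(. -type f)
    if [ "$#" -eq 0 ]; then
        return 0  # no filter: match every regular file
    fi
    local ext joiner=""
    FIND_ARGS+=("(")
    for ext in "$@"; do
        # ${joiner:+...} expands to nothing on the first pass, then to -o.
        FIND_ARGS+=(${joiner:+"$joiner"} -name "*$ext")
        joiner="-o"
    done
    FIND_ARGS+=(")")
}

build_find_args .py .md
# Print one argument per line to show that each stays a separate word.
printf '%s\n' "${FIND_ARGS[@]}"
```

Because every element is stored as its own array entry, the parentheses and the *.py style patterns reach find untouched, with no need for backslash escaping or eval.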
How to Use
Save the script: Save the code above into a file named concat_files.sh.
Make it executable: Open your terminal and run the following command:
chmod +x concat_files.sh
Run the script:
To concatenate ALL files in a directory named my_project into a file all_content.txt:
./concat_files.sh ./my_project all_content.txt
To concatenate only specific file types (e.g., .py and .md files) from a directory named my_code into code_docs.txt:
./concat_files.sh ./my_code code_docs.txt .py .md
(Note: You can provide the extensions with or without the leading dot; the script handles both.)
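The dot handling mentioned in the note can be sketched on its own as follows. The function name normalize_ext is hypothetical; the script performs the same check inline inside its loop.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of concat_files.sh): return the extension
# with exactly one leading dot added when it is missing.
normalize_ext() {
    local ext="$1"
    # The pattern .* matches any string that already begins with a dot.
    [[ "$ext" == .* ]] || ext=".$ext"
    printf '%s\n' "$ext"
}

normalize_ext py   # prints .py
normalize_ext .md  # prints .md
```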
Example Output (all_content.txt)
If you had a directory structure like this:
my_project/
├── main.py
└── docs/
    └── guide.md
The generated file all_content.txt would look like this:
filename: main.py
path: main.py
content: """
print("Hello, World!")
"""

filename: guide.md
path: docs/guide.md
content: """
This is a guide for my project.
"""
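For reference, the layout of a single record can be reproduced in isolation with a small sketch like the one below. The function name emit_record is hypothetical, and it assumes the path is given the way find reports it, with a leading "./".

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of concat_files.sh): print one record for a
# path as find reports it, e.g. ./docs/guide.md.
emit_record() {
    local file="$1"
    local relative_path="${file#./}"       # strip the leading "./"
    local filename="${relative_path##*/}"  # keep only the last path component
    printf 'filename: %s\npath: %s\ncontent: """\n' "$filename" "$relative_path"
    cat -- "$file"
    printf '"""\n\n'                       # closing quotes plus the blank separator line
}

# Demo: build a tiny tree in a temporary directory and print one record.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/docs"
printf 'This is a guide for my project.\n' > "$demo_dir/docs/guide.md"
(cd "$demo_dir" && emit_record ./docs/guide.md)
rm -rf "$demo_dir"
```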