@kakaroto
Created January 13, 2025 23:23
Script to transcribe an audio or video file using whisper.cpp
#!/bin/bash
FFMPEG="./ffmpeg"
WHISPER="./whisper-cli"
# Function to display usage instructions
usage() {
    echo "Usage: $0 <input-file>"
    exit 1
}
# Check if the user provided an argument
if [ $# -eq 0 ]; then
    echo "Error: No input file provided."
    usage
fi
# Get the input file
INPUT_FILE="$1"
# Check if the file exists
if [ ! -f "$INPUT_FILE" ]; then
    echo "Error: File '$INPUT_FILE' does not exist."
    exit 1
fi
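# Check that the ffmpeg and whisper-cli binaries are available at the configured paths
if ! command -v "${FFMPEG}" &> /dev/null; then
    echo "Error: ffmpeg not found at '${FFMPEG}'. Place the binary there or update FFMPEG at the top of this script."
    exit 1
fi
if ! command -v "${WHISPER}" &> /dev/null; then
    echo "Error: whisper-cli not found at '${WHISPER}'. Place the binary there or update WHISPER at the top of this script."
    exit 1
fi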
# Check if the models folder exists, and the required model file is available
MODEL_DIR="models"
MODEL_FILE="$MODEL_DIR/ggml-small.en.bin"
MODEL_DOWNLOAD_SCRIPT="$MODEL_DIR/download-ggml-model.sh"
MODEL_DOWNLOAD_URL="https://raw.githubusercontent.com/ggerganov/whisper.cpp/refs/heads/master/models/download-ggml-model.sh"
if [ ! -d "$MODEL_DIR" ] || [ ! -f "$MODEL_FILE" ]; then
    echo "Models folder or required model file not found."
    echo "Setting up the models folder and downloading the required model..."
    # Create the models folder if it doesn't exist
    mkdir -p "$MODEL_DIR"
    # Download the model download script
    # (-f: fail on HTTP errors instead of saving an error page, -L: follow redirects)
    echo "Downloading the model download script..."
    if ! curl -fL -o "$MODEL_DOWNLOAD_SCRIPT" "$MODEL_DOWNLOAD_URL"; then
        echo "Error: Failed to download the model download script."
        exit 1
    fi
    # Make the script executable
    chmod +x "$MODEL_DOWNLOAD_SCRIPT"
    # Run the script to download the small.en model
    echo "Downloading the small.en model..."
    "$MODEL_DOWNLOAD_SCRIPT" small.en
    # Check if the model was successfully downloaded
    if [ ! -f "$MODEL_FILE" ]; then
        echo "Error: Failed to download the model file."
        exit 1
    fi
    echo "Model setup completed successfully."
fi
# Extract the file extension
EXT="${INPUT_FILE##*.}"
FILENAME="${INPUT_FILE%.*}"
OUTPUT_FILE="${FILENAME}.wav"
# Check if the input file is a valid format
if ! "${FFMPEG}" -v error -i "$INPUT_FILE" -f null - &> /dev/null; then
    echo "Error: Invalid input file format."
    exit 1
fi
# Check if the file is already a .wav
if [ "$EXT" != "wav" ]; then
    # Convert to a 16 kHz, mono, 16-bit PCM .wav
    echo "Converting '$INPUT_FILE' to '$OUTPUT_FILE'..."
    if "${FFMPEG}" -i "$INPUT_FILE" -ar 16000 -ac 1 -c:a pcm_s16le "$OUTPUT_FILE"; then
        echo "Conversion successful! Output file: '$OUTPUT_FILE'"
    else
        echo "Error: Conversion failed."
        exit 1
    fi
fi
# Transcribe the .wav file to SRT subtitles and remember the exit status
"${WHISPER}" -t 6 -m "${MODEL_FILE}" -osrt -f "${OUTPUT_FILE}"
STATUS=$?
# Remove the temporary .wav whether or not transcription succeeded
if [ "$EXT" != "wav" ]; then
    echo "Deleting temporary wav file"
    rm -f "${OUTPUT_FILE}"
fi
if [ "$STATUS" -ne 0 ]; then
    echo "Error: Transcription failed."
    exit 1
fi
echo "Transcribed subtitles are in ${OUTPUT_FILE}.srt"
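
The script compares the extension against `wav` exactly, so an input named `talk.WAV` is treated as needing conversion. Below is a minimal sketch of a case-insensitive check that could replace the `[ "$EXT" != "wav" ]` tests. The helper names `is_wav` and `wav_path_for` are hypothetical and not part of the original script; `wav_path_for` only mirrors the script's own `${INPUT_FILE%.*}` logic.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Succeeds when the path ends in .wav in any letter case.
is_wav() {
    case "$1" in
        *.[wW][aA][vV]) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: the .wav path the script would derive for an input,
# mirroring its FILENAME="${INPUT_FILE%.*}" logic.
wav_path_for() {
    printf '%s.wav\n' "${1%.*}"
}

# Example usage: report what would happen to each input file
for f in "talk.WAV" "clip.mp4"; do
    if is_wav "$f"; then
        echo "$f: already a wav, no conversion needed"
    else
        echo "$f: would be converted to $(wav_path_for "$f")"
    fi
done
```

The `case` pattern is used instead of bash's `${var,,}` lowercasing so that the check should also work on older bash versions.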