Last active
January 8, 2025 12:48
Wrapper for whisper.cpp (https://github.com/ggerganov/whisper.cpp) that generates transcriptions from a given input media file
#!/bin/bash

# Default values
MODEL="$HOME/src/whisper.cpp/models/ggml-large-v3-turbo-q5_0.bin"
INPUT_FILE=""
# TEMP is often unset on Linux/macOS, so fall back to TMPDIR and then /tmp
OUTPUT_FILE="${TEMP:-${TMPDIR:-/tmp}}/whisper-$(date "+%Y%m%d%H%M%S").wav"
WHISPER_ARGS="-m $MODEL -l en -t 12 -pp -pc -otxt -ovtt -osrt"
# Function to display usage information
usage() {
    echo "Usage: $0 -i <input_file> [-o <output_file>] [-w <whisper-args>]"
    echo "  -i, --input-file    The path to the input media file"
    echo "  -o, --output-file   Optional path to the temporary output audio file (should have .wav extension, defaults to a timestamped whisper-<date>.wav in the temp directory)"
    echo "  -w, --whisper-args  Arguments for whisper-cli (defaults to '$WHISPER_ARGS')"
}
# Parse named arguments
while [[ "$#" -gt 0 ]]; do
    case $1 in
        -i|--input-file) INPUT_FILE="$2"; shift ;;
        -o|--output-file) OUTPUT_FILE="$2"; shift ;;
        -w|--whisper-args) WHISPER_ARGS="$2"; shift ;;
        *) echo "Unknown parameter passed: $1"; usage; exit 1 ;;
    esac
    shift
done
# Check if the input file is provided
if [ -z "$INPUT_FILE" ]; then
    echo "Error: Input file is required."
    usage
    exit 1
fi

# Check if FFmpeg is installed
if ! command -v ffmpeg &> /dev/null; then
    echo "Error: FFmpeg is not installed. Please install it first."
    exit 1
fi

# Check if the input file exists
if [ ! -f "$INPUT_FILE" ]; then
    echo "Error: Input file '$INPUT_FILE' does not exist."
    exit 1
fi

# Ensure the output file has a .wav extension
if [[ "$OUTPUT_FILE" != *.wav ]]; then
    OUTPUT_FILE="${OUTPUT_FILE}.wav"
    echo "Note: Output file extension changed to .wav ($OUTPUT_FILE)"
fi

# Check if the output file already exists and delete it if so
if [ -f "$OUTPUT_FILE" ]; then
    echo "Warning: Output file '$OUTPUT_FILE' already exists. Deleting the existing file..."
    rm -f "$OUTPUT_FILE"
fi
# Convert the audio file using FFmpeg (16 kHz, mono, 16-bit PCM)
echo "Converting '$INPUT_FILE' to '$OUTPUT_FILE'..."
ffmpeg -i "$INPUT_FILE" -ar 16000 -ac 1 -c:a pcm_s16le "$OUTPUT_FILE"

# Check if the conversion was successful
if [ $? -eq 0 ]; then
    echo "Conversion successful."
else
    echo "Error: FFmpeg encountered an issue during conversion."
    exit 1
fi

# Execute whisper-cli on the output WAV file
# $WHISPER_ARGS is deliberately left unquoted so the shell splits it into separate arguments
echo "Executing whisper-cli with arguments '$WHISPER_ARGS' on '$OUTPUT_FILE'..."
"$HOME/src/whisper.cpp/build/bin/whisper-cli" $WHISPER_ARGS -of "${INPUT_FILE%.*}" "$OUTPUT_FILE"

# Check if whisper-cli execution was successful
if [ $? -eq 0 ]; then
    echo "whisper-cli execution successful."
    rm "$OUTPUT_FILE"
else
    echo "Error: whisper-cli encountered an issue during execution."
    exit 1
fi
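
The whisper-cli call in the script expands `$WHISPER_ARGS` unquoted so that the shell splits it into separate arguments. That same splitting also breaks any value containing a space, for example a model path under a home directory with a space in its name. One possible alternative, sketched here on the assumption that bash arrays are acceptable, keeps the fixed defaults in an array and only word-splits the user-supplied string. The names `DEFAULT_ARGS`, `EXTRA_ARGS`, and `split_args` are hypothetical and do not appear in the original script.

```bash
#!/bin/bash
# Sketch only: DEFAULT_ARGS, EXTRA_ARGS and split_args are illustrative names.
MODEL="$HOME/src/whisper.cpp/models/ggml-large-v3-turbo-q5_0.bin"

# Each array element stays a single argument, even if $MODEL contains spaces.
DEFAULT_ARGS=(-m "$MODEL" -l en -t 12 -pp -pc -otxt -ovtt -osrt)

# Split a user-supplied string (as passed via -w) on whitespace into EXTRA_ARGS.
# Quotes inside the string are NOT interpreted; this is plain whitespace splitting.
split_args() {
    read -r -a EXTRA_ARGS <<< "$1"
}

split_args "-t 16 -l fr"
printf 'extra arg: %s\n' "${EXTRA_ARGS[@]}"

# The call could then look like this (commented out so the sketch runs without whisper-cli):
# "$HOME/src/whisper.cpp/build/bin/whisper-cli" "${DEFAULT_ARGS[@]}" "${EXTRA_ARGS[@]}" -of "${INPUT_FILE%.*}" "$OUTPUT_FILE"
```

Appending the extra arguments after the defaults assumes that whisper-cli lets a later flag override an earlier one; that behavior is an assumption here and should be checked before relying on it.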
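Examples
Example 1: Basic Usage with Default Settings
./whisper-wrapper.sh -i /path/to/input_video.mp4
Explanation:
- The command specifies the input file (`input_video.mp4`) and uses the default output file path (a timestamped `whisper-<date>.wav` in the temporary directory).
- The script will convert `input_video.mp4` to WAV format and then use `whisper-cli` with the default arguments (`-m $MODEL -l en -t 12 -pp -pc -otxt -ovtt -osrt`) to transcribe it.
Example 2: Custom Whisper-CLI Arguments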
./whisper-wrapper.sh -i /path/to/input_video.mp4 -w "-t 16 -pp -pc -l fr"
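Explanation:
- The command specifies the input file (`input_video.mp4`) and custom arguments for `whisper-cli` (`-t 16 -pp -pc -l fr`).
- The script will convert `input_video.mp4` to WAV format and then use the specified `whisper-cli` arguments to transcribe it, with the language set to French (`-l fr`).
- Note that `-w` replaces the whole default argument string rather than extending it, so the defaults `-m $MODEL` and `-otxt -ovtt -osrt` are dropped unless they are repeated inside the `-w` value.
Script Explanation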
This script automates converting a media file to WAV audio and then transcribing that audio with the `whisper-cli` tool. Here's a step-by-step summary:
1. Usage Information: The script defines a `usage` function that explains how to run it, including which arguments are required and which are optional.
2. Argument Parsing: It uses a `while` loop with a `case` statement and `shift` to parse the named arguments `-i`, `-o`, and `-w` (or their long forms) for the input file, output file, and `whisper-cli` arguments respectively. If `-o` or `-w` is not provided, a default is used (e.g., the default output file is a timestamped `whisper-<date>.wav` in the temporary directory).
3. Input Validation: The script checks that an input file was given, that FFmpeg is installed, and that the input file exists. It then ensures the output file has a `.wav` extension and removes any existing file with the same name.
4. Audio Conversion: Using FFmpeg, the script converts the input media file into 16 kHz mono 16-bit PCM WAV, which is suitable for transcription by `whisper-cli`.
5. Transcription: The script then runs `whisper-cli` with the specified arguments to transcribe the audio file. The output format and other settings can be customized through the `-w` argument.
6. Final Steps: After transcription, the script checks whether `whisper-cli` succeeded. On success it removes the temporary WAV file; on failure it exits with status 1 and leaves the WAV file in place.
In essence, this script streamlines converting any media file into a format suitable for transcription and then transcribing it, with options to customize both steps.
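
The argument-parsing step can be tried in isolation by wrapping the same `while`/`case`/`shift` pattern in a function. This is a minimal sketch rather than the script's exact code: the `parse_args` and `need_value` names are hypothetical, and the check that an option is followed by a value is an addition the original script does not have.

```bash
#!/bin/bash
# Sketch only: need_value and parse_args are hypothetical helpers, not in the original script.

# Fail if option "$1" is the last argument, i.e. no value follows it.
# "$2" is the number of arguments still to be parsed, including the option itself.
need_value() {
    if [[ "$2" -lt 2 ]]; then
        echo "Error: $1 requires a value." >&2
        return 1
    fi
}

# Same while/case/shift pattern as the script, wrapped in a function.
parse_args() {
    while [[ "$#" -gt 0 ]]; do
        case $1 in
            -i|--input-file)   need_value "$1" "$#" || return 1; INPUT_FILE="$2"; shift ;;
            -o|--output-file)  need_value "$1" "$#" || return 1; OUTPUT_FILE="$2"; shift ;;
            -w|--whisper-args) need_value "$1" "$#" || return 1; WHISPER_ARGS="$2"; shift ;;
            *) echo "Unknown parameter passed: $1" >&2; return 1 ;;
        esac
        shift   # drop the option name; the value was dropped by the shift above
    done
}

# Defaults are set first, then overridden by whatever the caller passes.
# These placeholder defaults are for the demo only.
INPUT_FILE=""
OUTPUT_FILE="/tmp/example.wav"
WHISPER_ARGS="-l en"

parse_args -i talk.mp4 -w "-t 16 -l fr" &&
    echo "input=$INPUT_FILE output=$OUTPUT_FILE args=$WHISPER_ARGS"
```

Returning a status from the function instead of calling `exit` lets the caller decide whether to print the usage text and stop.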
Submitted to the project: ggml-org/whisper.cpp#2703
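
One detail of the transcription step is easy to miss: the script passes `-of "${INPUT_FILE%.*}"` to `whisper-cli`, so the transcripts are named after the input file, not after the temporary WAV file. The sketch below shows what that parameter expansion produces. `transcript_paths` is a hypothetical helper, and the `.txt`, `.vtt`, and `.srt` suffixes assume the default `-otxt -ovtt -osrt` flags are in effect.

```bash
#!/bin/bash
# Sketch only: transcript_paths is an illustrative helper, not part of the script.
transcript_paths() {
    local input="$1"
    local base="${input%.*}"   # remove the shortest trailing ".something"
    local ext
    for ext in txt vtt srt; do
        printf '%s.%s\n' "$base" "$ext"
    done
}

transcript_paths "/path/to/input_video.mp4"
```

Because `%.*` matches the last dot anywhere in the string, an input without an extension inside a dotted directory (say `/data/v1.2/clip`) would be cut at the directory name, so inputs are best given with an extension.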