@int128
Last active March 31, 2025 23:39
Run faster-whisper on Amazon EC2 g4dn.xlarge instance

Spin up a g4dn.xlarge instance with the Deep Learning Base OSS Nvidia Driver GPU AMI (Amazon Linux 2023). Log in to the instance via Session Manager.

Make sure the GPU is available.

nvidia-smi
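If you prefer to check this from a script, the following is a minimal sketch that wraps the same tool. It assumes `nvidia-smi` is on the PATH and uses its `--query-gpu` and `--format=csv,noheader` options. The helper names `parse_gpu_lines` and `list_gpus` are made up for this example.

```python
import shutil
import subprocess

# Ask nvidia-smi for one CSV line per GPU: "<name>, <total memory>"
QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"]


def parse_gpu_lines(output):
    """Turn the CSV output of the query above into a list of dicts."""
    gpus = []
    for line in output.strip().splitlines():
        name, _, memory = line.partition(",")
        gpus.append({"name": name.strip(), "memory": memory.strip()})
    return gpus


def list_gpus():
    """Return the visible GPUs, or an empty list if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []
    return parse_gpu_lines(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus:
        for gpu in gpus:
            print(f"{gpu['name']}: {gpu['memory']}")
    else:
        print("No GPU visible (nvidia-smi missing or returned an error)")
```

On a g4dn.xlarge this should list a single NVIDIA T4.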

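Install faster-whisper (https://github.com/SYSTRAN/faster-whisper), then install the cuBLAS and cuDNN libraries for CUDA 12 and add their directories to LD_LIBRARY_PATH.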

pip3 install faster-whisper

pip3 install nvidia-cublas-cu12 'nvidia-cudnn-cu12==9.*'
export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`
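The export only affects the current shell, so it is worth confirming that the path really points at the shared libraries before running the model. Below is a rough sketch using only the standard library. The function name `find_cuda_libs` and the file-name prefixes `libcublas` and `libcudnn` are assumptions for this example; the exact file names depend on the installed wheel versions.

```python
import os
from pathlib import Path


def find_cuda_libs(ld_library_path, prefixes=("libcublas", "libcudnn")):
    """Map each prefix to the matching shared-library file names found
    in the directories listed in ld_library_path."""
    found = {prefix: [] for prefix in prefixes}
    for entry in ld_library_path.split(os.pathsep):
        directory = Path(entry)
        # Skip empty entries and directories that do not exist
        if not entry or not directory.is_dir():
            continue
        for prefix in prefixes:
            matches = sorted(p.name for p in directory.glob(prefix + "*.so*"))
            found[prefix].extend(matches)
    return found


if __name__ == "__main__":
    libs = find_cuda_libs(os.environ.get("LD_LIBRARY_PATH", ""))
    for prefix, names in libs.items():
        status = ", ".join(names) if names else "NOT FOUND"
        print(f"{prefix}: {status}")
```

If either prefix reports NOT FOUND, re-run the export in the shell that will launch the script.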

Run the example script from https://github.com/SYSTRAN/faster-whisper?tab=readme-ov-file#faster-whisper, saved here as test.py with a timestamp printed at the start and at the end.

python3 test.py

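Here is an example log of the time taken to transcribe 30 minutes of audio. The script prints one timestamp before loading the model and another after the last segment.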

2025-03-30 06:21:31.513916
2025-03-30 06:23:53.890056
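To turn the two timestamps into a duration and a speed relative to real time, a small calculation like the one below is enough. It assumes the audio is exactly 30 minutes long, which the log itself does not confirm. The measured interval also includes model loading, not only transcription.

```python
from datetime import datetime

# Timestamps copied from the log above
START = "2025-03-30 06:21:31.513916"
END = "2025-03-30 06:23:53.890056"

# Assumption: the audio is exactly 30 minutes long
AUDIO_SECONDS = 30 * 60


def elapsed_seconds(start, end):
    """Seconds between two timestamps in datetime's default string format."""
    return (datetime.fromisoformat(end) - datetime.fromisoformat(start)).total_seconds()


elapsed = elapsed_seconds(START, END)
speedup = AUDIO_SECONDS / elapsed

print(f"elapsed: {elapsed:.1f} s")
print(f"speed: {speedup:.1f}x real time")
```

For this log the run took about 142 seconds, roughly 12.6 times faster than real time under the 30-minute assumption.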

test.py

from datetime import datetime
from faster_whisper import WhisperModel

print(datetime.now())

model_size = "large-v3"

# Run on GPU with FP16
model = WhisperModel(model_size, device="cuda", compute_type="float16")

segments, info = model.transcribe("audio.mp3", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

print(datetime.now())
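
The script above reports one overall interval, which mixes model loading with transcription. As a sketch of how the two phases could be timed separately, the variant below wraps each phase in a small context manager. The names `timed` and `run` are invented for this example; the `WhisperModel` and `transcribe` calls are the same ones used in the script. Note that `segments` is a generator, so the transcription work only happens while it is being iterated.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def run(audio_path="audio.mp3"):
    # Imported here so the timing helper works even without faster-whisper installed
    from faster_whisper import WhisperModel

    timings = {}

    with timed("load", timings):
        model = WhisperModel("large-v3", device="cuda", compute_type="float16")

    with timed("transcribe", timings):
        segments, info = model.transcribe(audio_path, beam_size=5)
        # Iterating the generator is what actually runs the transcription
        texts = [segment.text for segment in segments]

    return timings, info, texts
```

Calling `timings, info, texts = run("audio.mp3")` on the instance and printing `timings` gives the load and transcribe durations in seconds.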