@simonw
Forked from ivanfioravanti/mlx_whisper_realtime.py
Last active April 3, 2025 05:37
mlx-whisper real time audio
# /// script
# dependencies = [
# "SpeechRecognition",
# "mlx-whisper",
# "pyaudio",
# ]
# ///
import speech_recognition as sr
import numpy as np
import mlx_whisper

r = sr.Recognizer()
mic = sr.Microphone(sample_rate=16000)
print("Listening...")

try:
    with mic as source:
        r.adjust_for_ambient_noise(source)
        while True:
            audio = r.listen(source)
            # Convert audio to numpy array
            audio_data = np.frombuffer(audio.get_raw_data(), dtype=np.int16).astype(np.float32) / 32768.0
            # Process audio with Apple MLXWhisper model
            result = mlx_whisper.transcribe(audio_data, path_or_hf_repo="mlx-community/whisper-large-v3-turbo")["text"]
            # Print the transcribed text
            print(result)
except KeyboardInterrupt:
    print("Stopped listening.")
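
The conversion line in the script turns raw 16-bit PCM bytes into float32 samples before handing them to mlx_whisper. A minimal sketch of that one step in isolation, assuming the microphone delivers native-endian signed 16-bit samples as the script does; the helper name pcm16_to_float32 is hypothetical and not part of the gist:

```python
import numpy as np


def pcm16_to_float32(raw: bytes) -> np.ndarray:
    """Scale signed 16-bit PCM bytes to float32 samples in [-1.0, 1.0).

    Hypothetical helper: mirrors the np.frombuffer(...) / 32768.0 line
    in the script above.
    """
    samples = np.frombuffer(raw, dtype=np.int16)
    # Dividing by 32768 maps -32768 to -1.0 and 32767 to just under 1.0.
    return samples.astype(np.float32) / 32768.0


if __name__ == "__main__":
    # Four hand-picked samples: silence, half scale, and both extremes.
    raw = np.array([0, 16384, -32768, 32767], dtype=np.int16).tobytes()
    print(pcm16_to_float32(raw))
```

Dividing by 32768 rather than 32767 keeps the most negative sample at exactly -1.0, at the cost of the most positive sample landing slightly below 1.0.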
simonw commented Nov 3, 2024

With that inline script metadata comment at the top of the file (the # /// script block), you can run this using:

uv run mlx_whisper_realtime.py

@ivanfioravanti
🔝

mtwebb commented Nov 13, 2024

On a clean install I first needed to install PortAudio with brew install portaudio.