Here we use the ffmpeg command, which can be installed on Ubuntu / Debian with apt-get install ffmpeg.
The following commands can be combined into one, though I prefer to keep each step separate.
- Extract the audio track from the video (mp4 with aac audio).
ffmpeg -i video.mp4 -vn -acodec copy video.aac
- Convert aac to wav.
ffmpeg -i video.aac audio.wav
- Split the wav into parts 60 seconds long.
ffmpeg -i audio.wav -f segment -segment_time 60 -c copy part%03d.wav
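If you would rather combine the three steps, a single ffmpeg call can drop the video stream and write the 60-second WAV parts directly. Below is a minimal sketch that builds such a command from Python. The helper names build_split_command and split_video are hypothetical, and the sketch assumes ffmpeg's default PCM encoding for .wav output is acceptable. It leaves out -c copy because the aac audio has to be decoded to produce wav.

```python
import subprocess


def build_split_command(video_path, segment_seconds=60, pattern="part%03d.wav"):
    """Build one ffmpeg call that drops video and writes fixed-length WAV parts.

    Hypothetical helper: it only assembles the argument list, it does not run it.
    """
    return [
        "ffmpeg",
        "-i", video_path,
        "-vn",                                   # ignore the video stream
        "-f", "segment",                         # use ffmpeg's segment muxer
        "-segment_time", str(segment_seconds),   # length of each part in seconds
        pattern,                                 # .wav extension selects WAV output
    ]


def split_video(video_path, segment_seconds=60):
    """Run the combined command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_split_command(video_path, segment_seconds), check=True)
```

Calling split_video("video.mp4") should then produce part000.wav, part001.wav and so on in the current directory, assuming ffmpeg is on the PATH.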
For the speech-to-text conversion we'll use the pretrained model jonatasgrosman/wav2vec2-xls-r-1b-russian.
Install the HuggingSound package and start the Python interpreter.
pip install huggingsound
python
from huggingsound import SpeechRecognitionModel
n = 165  # index of the last part file (part165.wav); adjust to your recording
model = SpeechRecognitionModel("jonatasgrosman/wav2vec2-xls-r-1b-russian")
audio_paths = ["part%03d.wav" % i for i in range(0,n + 1)]
transcriptions = model.transcribe(audio_paths)
transcriptions
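The hard-coded n = 165 only fits one particular recording. As a rough sketch, the part files can be discovered on disk instead, and the per-part results joined into one text. The names find_parts and join_transcriptions are hypothetical, and the sketch assumes each result is a dict with a "transcription" key, as in the output HuggingSound documents.

```python
from pathlib import Path


def find_parts(directory=".", pattern="part*.wav"):
    """Return the segment files in name order (part000.wav, part001.wav, ...).

    Zero-padded names sort correctly as plain strings.
    """
    return sorted(str(p) for p in Path(directory).glob(pattern))


def join_transcriptions(transcriptions):
    """Concatenate the "transcription" field of each result into one text."""
    return " ".join(t["transcription"].strip() for t in transcriptions)
```

With these helpers, audio_paths = find_parts() can replace the list built from n, and join_transcriptions(transcriptions) gives the whole text as a single string.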