I’m using Common Voice to train a real-time spoken language identification system. To extract audio from the tar.gz files, I’m using the following Python code:
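import os
import subprocess
import tarfile

def language_reader(path):
    def get_language_data(code):
        # Open the archive for one language, e.g. <path>/br.tar.gz
        with tarfile.open('{}/{}.tar.gz'.format(path, code)) as tar:
            clips = [clip for clip in tar.getmembers()
                     if clip.name.endswith('.mp3')]
            for clip in clips:
                # Extract the clip to disk, then stream its samples
                tar.extract(clip)
                for sample in open_mp3(clip.name):
                    yield sample
    return get_language_data

def open_mp3(data):
    # Decode to 16 kHz mono 16-bit PCM and write the result to stdout
    mp3 = subprocess.Popen(['ffmpeg', '-i', data,
                            '-f', 'wav', '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '16000', '-'],
                           stdout=subprocess.PIPE)
    Running = True
    while Running:
        # Read one 16-bit sample (2 bytes) at a time
        sample = mp3.stdout.read(2)
        if sample == b'':
            Running = False
        else:
            yield int.from_bytes(sample, byteorder='little', signed=True)
    # Delete the extracted file once it has been fully read
    os.remove(data)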
This works fine most of the time. However, for common_voice_br_17332422.mp3 in the Breton dataset, ffmpeg freezes up and the whole system hangs waiting for it to send the data. Does anyone know how to fix this? The file plays normally in VLC.
Examining the ffmpeg process with System Monitor shows “Waiting channel: pipe_wait”.
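In case the two-byte reads are relevant, a chunked version of the read loop could look like the sketch below. `iter_pcm_samples` and `chunk_bytes` are hypothetical names, not part of the code above, and this has not been tried against the problematic file, so it may have no effect on the hang. It only assumes that the stream delivers signed 16-bit little-endian samples, as requested with `pcm_s16le`.

```python
import io
import struct

def iter_pcm_samples(stream, chunk_bytes=4096):
    """Yield signed 16-bit little-endian samples from a binary stream.

    Hypothetical helper: reads in larger chunks instead of 2 bytes at a time.
    """
    leftover = b''
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:
            break  # End of stream
        data = leftover + chunk
        # Only decode an even number of bytes; carry any odd byte over
        usable = len(data) - (len(data) % 2)
        if usable:
            for (sample,) in struct.iter_unpack('<h', data[:usable]):
                yield sample
        leftover = data[usable:]

# Small in-memory check, using an odd chunk size to exercise the carry-over
demo = io.BytesIO(struct.pack('<4h', 0, 1, -1, 32767))
print(list(iter_pcm_samples(demo, chunk_bytes=3)))  # [0, 1, -1, 32767]
```

In `open_mp3`, this would be called as `iter_pcm_samples(mp3.stdout)` in place of the `while` loop.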