I have noticed that DeepSpeech 0.4.1 seems to perform particularly badly when the user’s mouth is not directly in front of the microphone.
If you listen to the clips on the validation page, they all seem to have been recorded with the microphone right in front of the speaker’s mouth.
This makes the model harder to use for far-field cases like a smart assistant that you talk to from across the room.
While some small improvement could perhaps be made by adding artificial reverb to the clips (rough sketch below), are there plans to encourage more diverse recording environments in future?
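For context, by "artificial reverb" I mean something like convolving each training clip with a room impulse response. Here is a minimal sketch of the idea; the synthetic impulse response, function names, and file names are just placeholders of mine, not anything from the actual DeepSpeech training pipeline:

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve

def synthetic_room_ir(sample_rate, rt60=0.4):
    """Crude synthetic room impulse response: exponentially decaying
    white noise, falling by 60 dB over rt60 seconds."""
    length = int(sample_rate * rt60)
    t = np.arange(length) / sample_rate
    decay = np.exp(-6.91 * t / rt60)  # ln(1000) ~= 6.91, so -60 dB at t = rt60
    ir = np.random.randn(length) * decay
    return ir / np.max(np.abs(ir))

def add_reverb(in_path, out_path, rt60=0.4):
    """Convolve a mono 16-bit WAV clip with the synthetic impulse response."""
    sample_rate, clip = wavfile.read(in_path)
    clip = clip.astype(np.float32) / 32768.0
    wet = fftconvolve(clip, synthetic_room_ir(sample_rate, rt60))
    wet /= max(np.max(np.abs(wet)), 1e-9)  # normalise to avoid clipping
    wavfile.write(out_path, sample_rate, (wet * 32767).astype(np.int16))

# add_reverb("clip.wav", "clip_reverb.wav")  # hypothetical file names
```

Real recorded room impulse responses, or actually collecting clips in varied environments, would obviously beat this synthetic decay, which is part of why I’m asking about plans on the data-collection side.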