Downloading raw audio data

Hello forum,

I just found Common Voice, and I think it’s an amazing thing!

The power you give to speech researchers is wonderful 🙂

My problem is that I just want raw audio data (I don’t care about validated transcriptions) in as many languages as possible for my research. Is it possible to download your audio data for all languages, not just the ones that have finished validation?

Best,
Martin


Hi Martin,

Yup, we make all audio available when we publish: validated, invalidated, and yet-to-be-validated clips. You can find all three in the currently published English dataset.

Thanks,
Michael

That’s nice! I thought only validated audio was downloadable.

What I really want is all audio (all languages, all speakers), and I don’t need text at all, just the audio. Is it possible to download all samples, not just English?

Best,
Martin

Once we publish data in new languages (which we hope to do by the end of the year), you will be able to download all samples for those languages as well.
