Hello,
I have successfully trained the DeepSpeech 0.4.1 model on my own dataset. These are the steps I followed:
I downloaded the Mozilla Common Voice 22 GB corpus for English.
I was going to create my own new TSV, but I could not figure out what the client ID in the corpus TSV is.
So I just overwrote the sentences in the corpus TSV with my own and replaced the MP3 files at the corresponding TSV paths with my own recordings, for 15 samples.
This time I am going to create a larger dataset of around 600 samples, so my question is: what is the client ID in the Mozilla corpus, and how do I create a large dataset for this model? Is there a script available for this?
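To make the question concrete, here is a minimal sketch of the kind of script I have in mind. It writes a TSV from a list of (audio path, sentence) pairs. The column names in `COLUMNS` are an assumption about the corpus TSV header and should be checked against the actual file; `speaker_id` is a made-up placeholder value for the client ID column, since I do not know what belongs there.

```python
import csv

# Assumed header of the Common Voice TSV; adjust to match the real file.
COLUMNS = ["client_id", "path", "sentence", "up_votes",
           "down_votes", "age", "gender", "accent"]


def build_rows(samples, speaker_id="my_speaker_0001"):
    """Turn (audio_path, sentence) pairs into TSV rows.

    speaker_id is a hypothetical placeholder for the client ID column.
    Columns with no known value are left empty.
    """
    rows = []
    for path, sentence in samples:
        row = dict.fromkeys(COLUMNS, "")
        row.update(client_id=speaker_id, path=path, sentence=sentence.strip())
        rows.append(row)
    return rows


def write_tsv(rows, fileobj):
    """Write the rows as tab-separated text with a header line."""
    writer = csv.DictWriter(fileobj, fieldnames=COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
```

For example, `write_tsv(build_rows([("sample_001.mp3", "turn on the light")]), open("train.tsv", "w", newline=""))` would produce a two-line file: the header and one sample row.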
-
Accuracy on Indian accents is very low. Will it help if I retrain the model using only the Mozilla Indian-accent samples, which were already used to train the released 0.4.1 model?
-
Is there any preprocessing that needs to be done to minimize noise when giving a WAV file as input to the model for prediction? I am using PyAudio with these settings:
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000
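For context, this is roughly how a recording with these settings ends up as a WAV file. It is a sketch, not my exact code: the `record` and `save_wav` helpers are names made up for illustration, and no noise reduction is applied anywhere.

```python
import wave

CHUNK = 1024
CHANNELS = 1
RATE = 16000
SAMPLE_WIDTH = 2  # bytes per sample for pyaudio.paInt16 (16-bit)


def record(seconds):
    """Capture raw 16-bit mono audio from the default input device."""
    import pyaudio  # imported here so save_wav works without PyAudio

    audio = pyaudio.PyAudio()
    stream = audio.open(format=pyaudio.paInt16, channels=CHANNELS,
                        rate=RATE, input=True, frames_per_buffer=CHUNK)
    try:
        n_chunks = int(RATE / CHUNK * seconds)
        frames = [stream.read(CHUNK) for _ in range(n_chunks)]
    finally:
        stream.stop_stream()
        stream.close()
        audio.terminate()
    return frames


def save_wav(frames, path):
    """Write the captured chunks to a 16 kHz, mono, 16-bit WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(RATE)
        wav.writeframes(b"".join(frames))
```

Calling `save_wav(record(5), "input.wav")` would record about five seconds and save it; that file is what I then pass to the model.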