Error rate is not decreasing

I’m using version 0.4.1 for transfer learning. I have around 100 hours of data; about 15 hours of it is from YouTube (noisy data) and the rest is Indian English data.

As I run it, the major issue I see is that the train loss starts at around 40, then settles at around 82 after about 20 epochs, while the validation loss settles around 180.

I’ve tried a range of learning rates from 0.0005 to 0.00095, but it didn’t help much.

Could you kindly suggest what the issues could be, and how to mitigate them?

I’m targeting a training loss of around 40 and a validation loss of around 80.

Did you turn off early stopping? It should stop after 4 epochs of minimal learning. You could do fine-tuning with the checkpoint file that last produced a loss lower than the previous one; you basically substitute the DeepSpeech 0.4.1 checkpoints with your own. Also, since you’re mixing audio sources, make sure all the audio is 16-bit PCM at 16 kHz. Finally, the learning rates you tried don’t seem too different from each other, and batch size is sensitive to learning rate: if you have a very large batch, try a very small learning rate.
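Roughly, a fine-tuning run bootstrapped from the released checkpoints looks something like the following; treat the paths, batch sizes and the negative --epoch value as placeholders from my own setup, and check the flag names against your 0.4.1 DeepSpeech.py:

# Fine-tune from a copy of the released 0.4.1 checkpoints instead of training
# from scratch; in 0.4.1 a negative --epoch value means "this many additional
# epochs on top of what the checkpoint has already seen".
python3 DeepSpeech.py \
    --checkpoint_dir /path/to/deepspeech-0.4.1-checkpoint \
    --train_files data/train.csv \
    --dev_files data/dev.csv \
    --test_files data/test.csv \
    --train_batch_size 24 \
    --dev_batch_size 24 \
    --learning_rate 0.0001 \
    --epoch=-3

Reusing the same checkpoint_dir on later runs means training continues from whatever checkpoints the previous run wrote.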

Thanks a lot for your reply.

Did you turn off early stopping? It should stop after 4 epochs of minimal learning.

No, I did not. However, if it isn’t learning much, isn’t it more important to understand why than to disable early stopping? My reasoning is that without early stopping I think it would take forever for the loss to come down from 80 to somewhere in the 30s.

Yes, I’m doing that.

I’m using sox to resample everything to 16 kHz:

sox input.wav -r 16000 output.wav
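In case some of the YouTube clips aren’t already 16-bit mono PCM, I understand sox can force that in the same call with its standard -b, -e and -c options, something like:

# Resample to 16 kHz and also force 16-bit signed PCM, mono output.
sox input.wav -b 16 -e signed-integer -c 1 -r 16000 output.wav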

I varied the batch size from large to small, and likewise the learning rate from small to large.

You will want to stop training before the loss consistently goes up. If you start at 40 and end up at 80, your learning rate is probably too high. What I have had some luck with is training at 0.0001 for a few epochs using 0.4.1, then bootstrapping from the last checkpoints, lowering the learning rate to 0.00005, and continuing like this until my loss is stable. This is sort of like manual learning rate decay.
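As a rough sketch (flag names, paths and epoch counts are placeholders from my own runs, not an exact recipe), the loop reuses the same checkpoint_dir so each run continues from the last:

# A few epochs at 0.0001, starting from a copy of the 0.4.1 checkpoints.
python3 DeepSpeech.py --checkpoint_dir my_checkpoints \
    --train_files train.csv --dev_files dev.csv \
    --learning_rate 0.0001 --epoch=-3

# Then continue from the checkpoints that run wrote, at half the learning rate.
python3 DeepSpeech.py --checkpoint_dir my_checkpoints \
    --train_files train.csv --dev_files dev.csv \
    --learning_rate 0.00005 --epoch=-3

Repeat, lowering the learning rate each time, until the dev loss stops improving.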

@tuttlebr

I started with 0.4.1

For the first epoch the learning rate was 0.0001 and the train loss was around 40; it dropped to around 20 within 2 runs, then shot up to around 60 and then around 90, finally settling at around 80.

The learning rate was changed from 0.0001 to 0.005 and to 0.00095 (on other attempts, each starting from the same DeepSpeech 0.4.1 checkpoint).