Is it possible to train DeepSpeech on a very small sample set, e.g., 30 audio clips of 5 to 10 seconds each? I tried different parameters, including n_hidden and learning_rate, but testing on the same sample set still gives very bad results, e.g., a WER of 0.98.
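To be clear about the metric: by WER I mean the word-level edit distance between the reference transcript and the decoded output, divided by the number of reference words, so 0.98 means almost every word is wrong. Here is a minimal sketch of that calculation, assuming plain whitespace tokenization. The function name `wer` is my own illustration and is not DeepSpeech's implementation, which may normalize text differently.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words.

    Illustrative helper only (hypothetical name, whitespace tokenization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

With this definition, an identical transcript scores 0.0 and an empty output scores 1.0, so a model that has memorized the 30 training clips should score close to 0 on them.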
If I want to train and test on the same small sample set, which parameters should I use, and how many epochs does it need to converge to a good result?
Thanks!