What is the training set of the pre-trained model?

jackhuang · December 19, 2017, 7:40am

I tested the pre-trained model on the TIMIT dataset and got a WER of about 27%. I'd like to know which data the pre-trained model was trained on, so that I can improve how I select training data. Can anyone help?
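
For anyone comparing numbers in this thread: WER (word error rate) is usually taken as the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. Below is a minimal sketch of that calculation, not DeepSpeech's own evaluation code. The function name `word_error_rate` is made up for illustration, and it assumes both strings are already normalized (same casing, no punctuation).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; assumes pre-normalized text.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = cur[j - 1] + 1   # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)
```

Note that a figure reported for a whole test set is normally the total number of word errors divided by the total number of reference words, which is not the same as averaging this value over utterances.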

kdavis · December 19, 2017, 11:16am

The pre-trained model was trained on Fisher, Switchboard, and Librivox training data sets.

jackhuang · December 19, 2017, 11:34am

So these three sets were combined and then split into the training set and the validation set?

kdavis · December 19, 2017, 12:50pm

No.

The training set was the Fisher, Switchboard, and Librivox training data sets.
The validation set was the clean Librivox validation data set.
The test set was the clean Librivox test data set.

jackhuang · December 20, 2017, 3:57am

Thank you, and is the language model the 4-gram language model with a 30,000 word vocabulary trained on the Fisher and Switchboard transcriptions as the paper says?

kdavis · December 20, 2017, 9:37am

No. We didn’t try to exactly reproduce the paper’s results.

We created a KenLM language model based on the Fisher, Switchboard, and Librivox training data sets as well as part of Wikipedia.

rao_jjn · December 20, 2017, 12:02pm

Is it possible to train this model further using other voice datasets?

yv001 · December 20, 2017, 12:27pm

I don’t think that’s possible until the TensorFlow checkpoints are also published. A frozen out_graph.pb cannot be used for further training, AFAIK.

jackhuang · December 22, 2017, 6:32am

Does the Common Voice data set include other data sets such as Librivox, Switchboard, or Fisher?

kdavis · December 22, 2017, 7:28am

No. Common Voice, Librivox, Switchboard, and Fisher are separate, distinct data sets.

jackhuang · December 25, 2017, 3:45am

Thank you. May I ask what the WER is on the clean Librivox test data set?

jackhuang · December 27, 2017, 3:40am

And did you use the 4-gram model to train the language model?

kdavis · December 27, 2017, 8:12am

The WER is 6.0 percent on the clean Librivox test data set.

kdavis · December 27, 2017, 8:13am

Not sure what you’re asking here.

Do you mean “Is the language model a 4-gram language model?”

jackhuang · December 27, 2017, 8:20am

Yes, that’s what I meant.

panybj · December 29, 2017, 12:50pm

Where can I download Fisher and Switchboard datasets?

kdavis · January 2, 2018, 9:34am

You have to purchase Fisher and Switchboard from LDC.

srikar · September 11, 2018, 10:46am

So, the Common Voice data is not factored into the pre-trained model?