I am getting Validation Loss: inf

Hi DeepSpeech,

I am training a model on the Common Voice dataset. I have been running the following command.

python -u DeepSpeech.py \
--train_files /home/javi/train/train.csv \
--dev_files /home/javi/train/dev.csv \
--test_files /home/javi/train/test.csv \
--train_batch_size 80 \
--dev_batch_size 80 \
--test_batch_size 40 \
--n_hidden 375 \
--epoch 33 \
--validation_step 1 \
--early_stop True \
--earlystop_nsteps 6 \
--estop_mean_thresh 0.1 \
--estop_std_thresh 0.1 \
--dropout_rate 0.22 \
--learning_rate 0.00095 \
--report_count 100 \
--use_seq_length False \
--export_dir /home/javi/speech/tools/backup/export_modal/ \
--checkpoint_dir /home/javi/speech/tools/backup/checkout/ \
--alphabet_config_path /home/javi/speech/tools/backup/DeepSpeech/data/alphabet.txt \
--lm_binary_path /home/javi/speech/tools/backup/DeepSpeech/data/lm.binary \

Here is my output. Could someone please help me understand what is happening with my model training? It has been taking too long.
Are the commands above correct for training the model?
I am getting Validation Loss: inf. Is that an error? If so, what kind of error?

Please help.

Output:
Instructions for updating:
Use tf.cast instead.
I Initializing variables…
I STARTING Optimization
Epoch 0 | Training | Elapsed Time: 21:24:41 | Steps: 60592 | Loss: 129.07832
Epoch 0 | Validation | Elapsed Time: 0:20:00 | Steps: 12229 | Loss: inf | Dataset: /home/javi/train/dev.csv
Epoch 1 | Training | Elapsed Time: 21:01:41 | Steps: 60592 | Loss: 129.14967
Epoch 1 | Validation | Elapsed Time: 0:16:15 | Steps: 12229 | Loss: inf | Dataset: /home/javi/train/dev.csv
Epoch 2 | Training | Elapsed Time: 06:00:55 | Steps: 27698 | Loss: 100.14967

This means your development/validation set contains one or more files that generate an inf loss.

If you’re using the v0.5.1 release, modify your files as mentioned here: How to find which file is making the loss inf

Run a separate training on your /home/javi/train/dev.csv file and check the printed output for any lines saying

The following files caused an infinite (or NaN) loss: … .wav

then remove those WAV files from your data.
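If scanning the retraining log is slow, a rough pre-check of the CSV can narrow down candidates first. Below is a minimal sketch (not part of DeepSpeech itself), assuming the standard DeepSpeech CSV columns (wav_filename, wav_filesize, transcript), PCM WAV input, and a 20 ms feature step; clips whose transcript is longer than the number of feature frames are the usual source of an infinite CTC loss, so those are the ones it flags. Adjust the step size if your setup differs.

import csv
import wave

def n_feature_frames(wav_path, window_step_ms=20):
    # Rough number of feature frames for this clip at a 20 ms step.
    with wave.open(wav_path, "rb") as w:
        duration_s = w.getnframes() / float(w.getframerate())
    return int(duration_s * 1000 / window_step_ms)

def flag_suspect_rows(csv_path):
    # Flag clips whose transcript has more characters than there are
    # feature frames; CTC cannot align those, so the loss goes to inf.
    suspects = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if n_feature_frames(row["wav_filename"]) < len(row["transcript"]):
                suspects.append(row["wav_filename"])
    return suspects

if __name__ == "__main__":
    for path in flag_suspect_rows("/home/javi/train/dev.csv"):
        print(path)

This only catches the too-short-audio case; any file the retraining log reports as causing an infinite (or NaN) loss should still be removed even if this check passes.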
