Every time I train my model, there is a final testing step which outputs WER.
During training, "watch -n 0.5 nvidia-smi" shows that my GPU is being used. But during the final testing, only the CPU is used. As a result, testing takes almost as long as the whole training process, which is very painful.
This happens right after the "FINISHED Optimization" message, between "Starting batch…" and "Finished batch step", so I suppose it is not caused by some other CPU-heavy task. What's more, the CPU load is even lower than during GPU training, since only a single thread seems to be used.
I am using the following command:
CUDA_VISIBLE_DEVICES=0 LD_LIBRARY_PATH=native_client/ python -u DeepSpeech.py \
    --log_level 1 \
    --train_files data/train/list.csv \
    --dev_files data/dev/list.csv \
    --test_files data/test/list.csv \
    --checkpoint_dir ${out_dir}/checkpoints \
    --summary_dir ${out_dir}/tensor_board \
    --alphabet_config_path data/alphabet.txt \
    --lm_binary_path data/LE_MONDE_full.utf8.binary_lm \
    --lm_trie_path data/LE_MONDE_full.utf8.deep_speech_trie \
    --use_seq_length False \
    --validation_step 2 \
    --n_hidden 300 \
    --train_batch_size 100 \
    --dev_batch_size 100 \
    --test_batch_size 100 \
    --epoch 40
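In case it matters: my understanding of the CUDA_VISIBLE_DEVICES=0 prefix is that it only restricts which GPUs the CUDA runtime exposes to the process, it does not force any op onto the GPU. A minimal sketch of that interpretation (the `visible_gpus` helper is just mine for illustration, not part of DeepSpeech or TensorFlow):

```python
import os

# For illustration only: set the variable the way the command line above does.
# In the real run it is set by the shell, not in Python.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

def visible_gpus():
    """Return the GPU indices CUDA will expose to the process.

    An empty variable (CUDA_VISIBLE_DEVICES="") would hide all GPUs,
    which is one classic way a job silently falls back to the CPU.
    """
    value = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    return [int(i) for i in value.split(",") if i.strip().isdigit()]

print(visible_gpus())  # → [0]
```

So the variable itself looks fine here; as far as I can tell, whether the test ops actually land on the GPU is decided by TensorFlow's device placement, not by this prefix.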
I am new to TensorFlow, so I don’t know what could cause this behavior…