Hi. I am following the instructions in training.rst with the CommonVoice dataset named ga-IE (Irish). When I run:
(deepspeech-train-venv) root@tuan-X450LD:~/DeepSpeech-master# bin/import_cv2.py --filter_alphabet data/alphabet.txt ~/ga-IE/
The result is:
Loading TSV file: /home/tuan/ga-IE/train.tsv
Saving new DeepSpeech-formatted CSV file to: /home/tuan/ga-IE/clips/train.csv
Importing mp3 files…
Progress |#####################################################| 100% completedWriting CSV file for DeepSpeech.py as: /home/tuan/ga-IE/clips/train.csv
Progress |# | 100% completed
Imported 0 samples.
Skipped 522 samples that failed upon conversion.
Final amount of imported audio: 0:00:00.
Loading TSV file: /home/tuan/ga-IE/test.tsv
Saving new DeepSpeech-formatted CSV file to: /home/tuan/ga-IE/clips/test.csv
Importing mp3 files…
Progress |#####################################################| 100% completedWriting CSV file for DeepSpeech.py as: /home/tuan/ga-IE/clips/test.csv
Progress |# | 100% completed
Imported 0 samples.
Skipped 482 samples that failed upon conversion.
Final amount of imported audio: 0:00:00.
Loading TSV file: /home/tuan/ga-IE/dev.tsv
Saving new DeepSpeech-formatted CSV file to: /home/tuan/ga-IE/clips/dev.csv
Importing mp3 files…
Progress |#################################################### | 98% completedWriting CSV file for DeepSpeech.py as: /home/tuan/ga-IE/clips/dev.csv
Progress |# | 100% completed
Imported 0 samples.
Skipped 462 samples that failed upon conversion.
Final amount of imported audio: 0:00:00.
Loading TSV file: /home/tuan/ga-IE/validated.tsv
Saving new DeepSpeech-formatted CSV file to: /home/tuan/ga-IE/clips/validated.csv
Importing mp3 files…
Progress |#####################################################| 100% completedWriting CSV file for DeepSpeech.py as: /home/tuan/ga-IE/clips/validated.csv
Progress |# | 100% completed
Imported 0 samples.
Skipped 2529 samples that failed upon conversion.
Final amount of imported audio: 0:00:00.
Loading TSV file: /home/tuan/ga-IE/other.tsv
Saving new DeepSpeech-formatted CSV file to: /home/tuan/ga-IE/clips/other.csv
Importing mp3 files…
Progress |#################################################### | 98% completedWriting CSV file for DeepSpeech.py as: /home/tuan/ga-IE/clips/other.csv
Progress |# | 100% completed
Imported 0 samples.
Skipped 1033 samples that failed upon conversion.
Final amount of imported audio: 0:00:00.
Progress |#####################################################| 100% completed
Progress |#####################################################| 100% completed
Progress |#####################################################| 100% completed
Progress |#####################################################| 100% completed
Progress |#####################################################| 100% completed
So I only get empty *.csv files. What should I do to fix this?
The same problem is reported here, but it did not help me: Import_cv2 : all files failed to convert
It’s saying they failed the conversion, but the importer swallows all errors silently. Are you sure you have SoX installed? You can try taking out the try-except block in _maybe_convert_wav to see the errors. Just call transformer.build without the exception handling.
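To make that concrete, here is a minimal sketch of the idea: run one conversion and print the real error instead of silently counting it as a failure. The names `check_sox` and `convert_verbosely` are made up for illustration and are not part of the importer; `convert` stands in for whatever callable does the actual work (in the importer that would be the transformer.build call).

```python
import shutil
import sys
import traceback


def check_sox():
    """Return the path of the `sox` executable, or None if it is not on PATH."""
    return shutil.which("sox")


def convert_verbosely(convert, mp3_path, wav_path):
    """Call convert(mp3_path, wav_path) and report any exception.

    Hypothetical helper: `convert` is any callable that performs the
    conversion. Returns True on success and False on failure, but unlike
    a bare try/except it prints the error and the traceback first.
    """
    try:
        convert(mp3_path, wav_path)
        return True
    except Exception as err:
        print("Conversion failed for {}: {!r}".format(mp3_path, err),
              file=sys.stderr)
        traceback.print_exc()
        return False


if __name__ == "__main__":
    print("sox found at:", check_sox())
```

Running a single clip through something like this should show the underlying SoX error, which is usually enough to tell a missing install apart from a format or version problem.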
I have just checked as you said. When I removed sox 14.4.2 and reinstalled sox 14.4.1, it worked. Here is the result:
(deepspeech-train-venv) root@tuan-X450LD:~/DeepSpeech-master# bin/import_cv2.py --filter_alphabet data/alphabet.txt ~/ga-IE/
Loading TSV file: /home/tuan/ga-IE/train.tsv
Saving new DeepSpeech-formatted CSV file to: /home/tuan/ga-IE/clips/train.csv
Importing mp3 files…
Progress |#####################################################| 100% completedWriting CSV file for DeepSpeech.py as: /home/tuan/ga-IE/clips/train.csv
Progress |#####################################################| 100% completed
Imported 522 samples.
Skipped 460 samples that failed on transcript validation.
Final amount of imported audio: 0:25:33.
Loading TSV file: /home/tuan/ga-IE/test.tsv
Saving new DeepSpeech-formatted CSV file to: /home/tuan/ga-IE/clips/test.csv
Importing mp3 files…
Progress |#####################################################| 100% completedWriting CSV file for DeepSpeech.py as: /home/tuan/ga-IE/clips/test.csv
Progress |#####################################################| 100% completed
Imported 482 samples.
Skipped 426 samples that failed on transcript validation.
Final amount of imported audio: 0:30:19.
Loading TSV file: /home/tuan/ga-IE/dev.tsv
Saving new DeepSpeech-formatted CSV file to: /home/tuan/ga-IE/clips/dev.csv
Importing mp3 files…
Progress |#################################################### | 98% completedWriting CSV file for DeepSpeech.py as: /home/tuan/ga-IE/clips/dev.csv
Progress |#####################################################| 100% completed
Imported 462 samples.
Skipped 410 samples that failed on transcript validation.
Final amount of imported audio: 0:26:33.
Loading TSV file: /home/tuan/ga-IE/validated.tsv
Saving new DeepSpeech-formatted CSV file to: /home/tuan/ga-IE/clips/validated.csv
Importing mp3 files…
Progress |#####################################################| 100% completedWriting CSV file for DeepSpeech.py as: /home/tuan/ga-IE/clips/validated.csv
Progress |#####################################################| 100% completed
Imported 2529 samples.
Skipped 2215 samples that failed on transcript validation.
Final amount of imported audio: 2:17:12.
Loading TSV file: /home/tuan/ga-IE/other.tsv
Saving new DeepSpeech-formatted CSV file to: /home/tuan/ga-IE/clips/other.csv
Importing mp3 files…
Progress |#################################################### | 98% completedWriting CSV file for DeepSpeech.py as: /home/tuan/ga-IE/clips/other.csv
Progress |#####################################################| 100% completed
Imported 1033 samples.
Skipped 909 samples that failed on transcript validation.
Final amount of imported audio: 1:03:55.
Progress |#####################################################| 100% completed
Progress |#####################################################| 100% completed
Progress |#####################################################| 100% completed
Progress |#####################################################| 100% completed
Progress |#####################################################| 100% completed
And I also see the .wav and .csv files (with text) as I expected. Thanks for your advice.
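For anyone who lands here with the same symptom: since the fix in this case came down to which SoX version was installed, it may help to check the version before running the importer. Below is a rough sketch, assuming `sox --version` prints a line such as `sox:      SoX v14.4.1`; the function names are hypothetical.

```python
import re
import shutil
import subprocess


def parse_sox_version(text):
    """Extract a dotted version such as '14.4.1' from `sox --version` output.

    Returns None if no version-like token is found.
    """
    match = re.search(r"v(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def installed_sox_version():
    """Return the installed SoX version string, or None if sox is not on PATH."""
    sox_path = shutil.which("sox")
    if sox_path is None:
        return None
    result = subprocess.run([sox_path, "--version"],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return parse_sox_version(result.stdout)


if __name__ == "__main__":
    print("Installed SoX version:", installed_sox_version())
```

If this prints None, SoX is either not installed or not on the PATH of the shell that runs bin/import_cv2.py.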