btw you can use erogol’s notebook to perform a health check
I could be wrong, but I think this notebook might be what @carlfm01 had in mind: https://github.com/mozilla/TTS/blob/master/dataset_analysis/AnalyzeDataset.ipynb
(it’s linked to here: https://github.com/mozilla/TTS/wiki/Dataset in the wiki)
It lets you look over the dataset to see things like audio length per character, which might highlight instances of bad audio in your dataset
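For illustration, here is a minimal sketch of the kind of per-clip check that notebook performs, assuming an LJSpeech-style metadata.csv ("id|transcription") next to a wavs/ folder; the outlier thresholds are arbitrary guesses you would tune for your dataset:

```python
import librosa

# Flag clips whose seconds-per-character ratio looks abnormal: too high often
# means long silences or a too-short transcription, too low means clipped audio.
with open("metadata.csv", encoding="utf-8") as f:
    for line in f:
        file_id, text = line.strip().split("|")[:2]
        wav, sr = librosa.load(f"wavs/{file_id}.wav", sr=None)
        secs_per_char = (len(wav) / sr) / max(len(text), 1)
        if secs_per_char > 0.3 or secs_per_char < 0.02:  # assumed thresholds
            print(file_id, round(secs_per_char, 3))
```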
Hello @carlfm01.
Indeed, they were faulty audio files. Thank you for your suggestion.
Now I receive the following error; I don’t know if you’re familiar with it:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape [32,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node Tacotron_model/inference/decoder/while/CustomDecoderStep/decoder_LSTM/decoder_LSTM/multi_rnn_cell/cell_1/dropout_1/random_uniform/RandomUniform (defined in /home/manuel_garcia02/Tacotron-2/tacotron/models:13)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[node Tacotron_model/clip_by_global_norm/mul_38 (defined in /home/manuel_garcia02/Tacotron-2/tacotron/models/tacotron.py:429)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
I already set tacotron_batch_size to 4, but it still fails.
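(As an aside, the hint in that traceback can be acted on directly. A hedged sketch for TF1-style code like this Tacotron-2 fork; the session-call names are illustrative:)

```python
import tensorflow as tf

# Ask TensorFlow to dump the list of allocated tensors whenever an OOM happens.
run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)

# Then pass it to the training session call (names here are illustrative):
# sess.run([train_op, loss], feed_dict=feed, options=run_options)
```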
Out of memory
What’s your GPU model?
Do not change the batch_size; instead, sort the train.txt generated by preprocess.py and start removing the longest entries from the file.
Remove a group, then try again, and repeat if it still fails with an OOM (a sketch follows below).
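A rough sketch of that pruning step, assuming the Tacotron-2 train.txt layout where one pipe-separated field holds the mel frame count; FRAME_COL and N_DROP are assumptions to adjust for your copy of the repo:

```python
FRAME_COL = 4   # assumed index of the mel-frame-count field in train.txt
N_DROP = 50     # how many of the longest clips to drop per round (a guess)

with open("training_data/train.txt", encoding="utf-8") as f:
    lines = f.readlines()

# Sort shortest-first by frame count, then drop the N_DROP longest entries.
lines.sort(key=lambda l: int(l.split("|")[FRAME_COL]))
with open("training_data/train.txt", "w", encoding="utf-8") as f:
    f.writelines(lines[:-N_DROP])
```

If it still OOMs, run it again to shave off the next group.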
I have an NVIDIA Tesla V100. OK, I’ll try it and let you know. Thank you.
You are correct, thanks for adding the link
Thanks, the 16GB or 32GB version?
Hello @carlfm01.
Thank you for your answers; Tacotron training has now started normally. I have another question.
Is it normal for the audio generated during Tacotron training to sound like this?
http://www.mediafire.com/file/r7p3ggsbfqrqpud/step-2500-wave-from-mel.wav/file
(I’m still a new user)
Thank you.
Please share your attention plot; it sounds like the attention is broken.
Can you share a single file created by this script so we can see if it is correct? Try to use Mozilla Send.
What about the silence at the beginning and at the end? Long silences can hurt performance; see the trimming sketch below.
Did you change something?
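(One possible way to do that trimming before preprocessing, sketched with librosa.effects.trim; the top_db threshold is an assumption to tune for your recording setup:)

```python
import librosa
import soundfile as sf

# Load a clip at its native sample rate, cut leading/trailing silence,
# and write the trimmed version back out.
wav, sr = librosa.load("wavs/example.wav", sr=None)
trimmed, _ = librosa.effects.trim(wav, top_db=40)  # assumed threshold
sf.write("wavs/example_trimmed.wav", trimmed, sr)
```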
@carlfm01, this is one of the audio files.
I can’t upload the file with Mozilla Send; I’m still a new user.
https://transfer.sh/5kb14/audio-archivo-156579483968273.npy
I didn’t change anything.
audio-archivo-156579483968273.zip (164.8 KB)
I’m able to synthesize your training file, so your training format is correct. It may be a transcription/audio quality problem, I mean wrong transcriptions or empty audio, like the last ones you removed.
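(If you want to sanity-check a generated .npy yourself before sharing it, a quick inspection like this is usually enough; the filename is just the one from this thread:)

```python
import numpy as np

# An all-zero array, NaNs, or an unexpected shape would point to a bad clip.
feats = np.load("audio-archivo-156579483968273.npy")
print(feats.shape, feats.dtype, feats.min(), feats.max())
```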
The last things I removed were audios with overly long sentences. I used erogol’s notebook and removed those audios. The removed clips did contain audio, but they could not be processed; I don’t know what the cause could be. After that, I had already trained Tacotron without LPCNet, and this was the result.
Could it be that I resumed from a training run that was saved before I deleted the long sentences?
I think the problem is resuming training with different files. For now, I have started a new training run from scratch.
I’ll discuss the attention plot later, when the two training runs are at the same training step.
Ok, let’s wait.
Yes, you need to delete the model trained with the wrong sentences.
Hello @carlfm01
In fact, my attention plot doesn’t look like it used to. This is the current one.
But the audio still has the same noise as before.
Can you share the generated feature to test? It looks like it needs silence trimming at the end.