How to train a model?

shad94 · November 25, 2019, 10:22am

It returned ‘0’ in both cases using SUDO, so it is FALSE in both cases (no execution); still, the model doesn’t work

alchemi5t · November 25, 2019, 11:12am

I am not so sure if virtual env would help if you couldn’t do it with sudo.

shad94 · November 25, 2019, 11:13am

So, any other suggestion?

alchemi5t · November 25, 2019, 11:31am

I don’t have enough information to go on here. The only thing I can offer is my time. If you don’t mind, give me access to a vm or something safe so I can try to set it up in your environment.

shad94 · November 26, 2019, 10:42am

Problem solution (thank you, @alchemi5t):

Reinstall all packages and requirements.
I was lacking pip (I only had pip3).
Of course, Python 3.x is necessary.
No CUDA is required, although I have ‘only’ 12 GB of RAM, which might be an obstacle, since training uses the whole ‘arsenal’.
I shouldn’t use Jupyter Notebook, since it has obvious problems with access.
PYTHONPATH - new test environment
TTS from GitHub is under test
The LS data set can be anywhere; it just needs to be ‘linked’ in config.json. Of course, metadata.csv has to be divided into training and validation data sets.
I am attaching my config.json file: config.json
The last two lines should point to the training and validation sets:
"meta_file_train": "metadata_train.csv",
"meta_file_val": "metadata_val.csv",
but I haven’t split them yet.
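
For reference, a minimal sketch of such a split. It assumes metadata.csv holds one utterance per line (as in LJSpeech-style metadata) and writes the two file names mentioned above next to it; the function name `split_metadata`, the 10% validation share and the fixed seed are illustrative choices, not something TTS prescribes.

```python
import random
from pathlib import Path


def split_metadata(meta_path, val_fraction=0.1, seed=0):
    """Split a one-utterance-per-line metadata file into train and validation files.

    Writes metadata_train.csv and metadata_val.csv next to the input file and
    returns (number of training lines, number of validation lines).
    """
    meta_path = Path(meta_path)
    # Keep every non-empty line; each one describes a single utterance.
    lines = [ln for ln in meta_path.read_text(encoding="utf-8").splitlines() if ln.strip()]
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(lines)
    n_val = max(1, round(len(lines) * val_fraction))
    val_lines, train_lines = lines[:n_val], lines[n_val:]
    meta_path.with_name("metadata_train.csv").write_text(
        "\n".join(train_lines) + "\n", encoding="utf-8")
    meta_path.with_name("metadata_val.csv").write_text(
        "\n".join(val_lines) + "\n", encoding="utf-8")
    return len(train_lines), len(val_lines)
```

Calling `split_metadata("path/to/metadata.csv")` (the path is a placeholder) leaves the original file untouched, so the split can be redone with another seed or fraction.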

shad94 · December 17, 2019, 4:04am

EDIT: it crashes from the 68th epoch on, but I don’t know why…

Several notifications, among them:

| > Synthesizing test sentences
 !! Error creating Test Sentence - 0
....
OSError: [Errno 12] Cannot allocate memory
 | > Training Loss: 0.06097   Validation Loss: 0.07554

I have 8 threads and ~12 GB of RAM.
Below you can find the config.json file, this time for the Polish language.
My data set contains 1271 samples and it is split into 1144 and 127 for training and evaluation.
config.json

alchemi5t · December 17, 2019, 9:35am

Hi shad,

I’ve read your DM. I’ll try to help you online first before I get working on it myself. Could you post a little bit more of the log?

alchemi5t · December 17, 2019, 9:37am

Also, if it’s only for training for your thesis, I might be able to train and give you a model for the config you give along with the data.

shad94 · December 17, 2019, 10:11pm

@alchemi5t

Hi there. Here is some ‘news’ from the log:

 > Epoch 96/100
   | > Step:17/141  GlobalStep:13650  PostnetLoss:0.06650  DecoderLoss:0.07570  StopLoss:0.47290  AlignScore:0.0880  GradNorm:0.24001  GradNormST:0.32928  AvgTextLen:21.8  AvgSpecLen:165.5  StepTime:0.73  LoaderTime:0.29  LR:0.000054
   | > Step:42/141  GlobalStep:13675  PostnetLoss:0.05865  DecoderLoss:0.06999  StopLoss:0.27385  AlignScore:0.0626  GradNorm:0.13811  GradNormST:0.12845  AvgTextLen:30.4  AvgSpecLen:180.6  StepTime:0.88  LoaderTime:0.31  LR:0.000054
   | > Step:67/141  GlobalStep:13700  PostnetLoss:0.05929  DecoderLoss:0.06929  StopLoss:0.58532  AlignScore:0.0471  GradNorm:0.16014  GradNormST:0.47606  AvgTextLen:40.6  AvgSpecLen:228.6  StepTime:1.02  LoaderTime:0.35  LR:0.000054
   | > Step:92/141  GlobalStep:13725  PostnetLoss:0.06289  DecoderLoss:0.07570  StopLoss:0.26959  AlignScore:0.0346  GradNorm:0.21866  GradNormST:0.09893  AvgTextLen:56.6  AvgSpecLen:307.8  StepTime:1.32  LoaderTime:0.44  LR:0.000054
   | > Step:117/141  GlobalStep:13750  PostnetLoss:0.06960  DecoderLoss:0.08432  StopLoss:0.33354  AlignScore:0.0241  GradNorm:0.18223  GradNormST:0.12412  AvgTextLen:80.1  AvgSpecLen:484.0  StepTime:1.74  LoaderTime:0.60  LR:0.000054
   | > EPOCH END -- GlobalStep:13774  AvgTotalLoss:0.06121  AvgPostnetLoss:0.07169  AvgDecoderLoss:0.39223  AvgStopLoss:0.05383  EpochTime:187.22  AvgStepTime:1.32  AvgLoaderTime:0.43

 > Validation
   | > TotalLoss: 1.18051   PostnetLoss: 0.07172 - 0.07172  DecoderLoss:0.08944 - 0.08944 StopLoss: 1.01935 - 1.01935  AlignScore: 0.0957 : 0.0957
   | > TotalLoss: 0.62240   PostnetLoss: 0.07491 - 0.07226  DecoderLoss:0.09224 - 0.08981 StopLoss: 0.45526 - 0.68541  AlignScore: 0.0224 : 0.0452
warning: audio amplitude out of range, auto clipped.
 | > Synthesizing test sentences
 !! Error creating Test Sentence - 0
Traceback (most recent call last):
  File "train.py", line 482, in evaluate
    style_wav=style_wav)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 103, in synthesis
    inputs = text_to_seqvec(text, CONFIG, use_cuda)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 12, in text_to_seqvec
    CONFIG.enable_eos_bos_chars),
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 57, in phoneme_to_sequence
    to_phonemes = text2phone(clean_text, language)
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 31, in text2phone
    ph = phonemize(text, separator=seperator, strip=False, njobs=1, backend='espeak', language=language)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/phonemize.py", line 149, in phonemize
    logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 42, in __init__
    super(self.__class__, self).__init__(language, logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/base.py", line 43, in __init__
    'initializing backend %s-%s', self.name(), self.version())
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 104, in version
    long_version = cls.long_version()
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 92, in long_version
    '{} --help'.format(cls.espeak_exe()), posix=False)).decode(
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1295, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
 !! Error creating Test Sentence - 1
Traceback (most recent call last):
  File "train.py", line 482, in evaluate
    style_wav=style_wav)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 103, in synthesis
    inputs = text_to_seqvec(text, CONFIG, use_cuda)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 12, in text_to_seqvec
    CONFIG.enable_eos_bos_chars),
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 57, in phoneme_to_sequence
    to_phonemes = text2phone(clean_text, language)
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 31, in text2phone
    ph = phonemize(text, separator=seperator, strip=False, njobs=1, backend='espeak', language=language)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/phonemize.py", line 149, in phonemize
    logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 42, in __init__
    super(self.__class__, self).__init__(language, logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/base.py", line 43, in __init__
    'initializing backend %s-%s', self.name(), self.version())
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 104, in version
    long_version = cls.long_version()
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 92, in long_version
    '{} --help'.format(cls.espeak_exe()), posix=False)).decode(
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1295, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
 !! Error creating Test Sentence - 2
Traceback (most recent call last):
  File "train.py", line 482, in evaluate
    style_wav=style_wav)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 103, in synthesis
    inputs = text_to_seqvec(text, CONFIG, use_cuda)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 12, in text_to_seqvec
    CONFIG.enable_eos_bos_chars),
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 57, in phoneme_to_sequence
    to_phonemes = text2phone(clean_text, language)
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 31, in text2phone
    ph = phonemize(text, separator=seperator, strip=False, njobs=1, backend='espeak', language=language)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/phonemize.py", line 149, in phonemize
    logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 42, in __init__
    super(self.__class__, self).__init__(language, logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/base.py", line 43, in __init__
    'initializing backend %s-%s', self.name(), self.version())
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 104, in version
    long_version = cls.long_version()
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 92, in long_version
    '{} --help'.format(cls.espeak_exe()), posix=False)).decode(
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1295, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
 !! Error creating Test Sentence - 3
Traceback (most recent call last):
  File "train.py", line 482, in evaluate
    style_wav=style_wav)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 103, in synthesis
    inputs = text_to_seqvec(text, CONFIG, use_cuda)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 12, in text_to_seqvec
    CONFIG.enable_eos_bos_chars),
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 57, in phoneme_to_sequence
    to_phonemes = text2phone(clean_text, language)
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 31, in text2phone
    ph = phonemize(text, separator=seperator, strip=False, njobs=1, backend='espeak', language=language)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/phonemize.py", line 149, in phonemize
    logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 42, in __init__
    super(self.__class__, self).__init__(language, logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/base.py", line 43, in __init__
    'initializing backend %s-%s', self.name(), self.version())
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 104, in version
    long_version = cls.long_version()
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 92, in long_version
    '{} --help'.format(cls.espeak_exe()), posix=False)).decode(
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1295, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
 | > Training Loss: 0.06121   Validation Loss: 0.07319
 > Number of outputs per iteration: 5

 > Epoch 97/100
   | > Step:0/141  GlobalStep:13775  PostnetLoss:0.06069  DecoderLoss:0.07308  StopLoss:0.82969  AlignScore:0.2031  GradNorm:0.23467  GradNormST:0.34403  AvgTextLen:9.1  AvgSpecLen:114.1  StepTime:0.57  LoaderTime:0.23  LR:0.000054
   | > Step:25/141  GlobalStep:13800  PostnetLoss:0.05430  DecoderLoss:0.06320  StopLoss:0.39068  AlignScore:0.0853  GradNorm:0.20467  GradNormST:0.28177  AvgTextLen:23.5  AvgSpecLen:165.0  StepTime:0.81  LoaderTime:0.29  LR:0.000054
   | > Step:50/141  GlobalStep:13825  PostnetLoss:0.06497  DecoderLoss:0.07878  StopLoss:0.68616  AlignScore:0.0598  GradNorm:0.12579  GradNormST:0.48869  AvgTextLen:33.1  AvgSpecLen:205.6  StepTime:0.88  LoaderTime:0.33  LR:0.000054
   | > Step:75/141  GlobalStep:13850  PostnetLoss:0.05773  DecoderLoss:0.06606  StopLoss:0.25092  AlignScore:0.0441  GradNorm:0.15950  GradNormST:0.20841  AvgTextLen:44.0  AvgSpecLen:245.8  StepTime:1.13  LoaderTime:0.37  LR:0.000054
   | > Step:100/141  GlobalStep:13875  PostnetLoss:0.06377  DecoderLoss:0.07580  StopLoss:0.26661  AlignScore:0.0319  GradNorm:0.13651  GradNormST:0.13730  AvgTextLen:61.8  AvgSpecLen:354.8  StepTime:1.47  LoaderTime:0.48  LR:0.000054
   | > Step:125/141  GlobalStep:13900  PostnetLoss:0.06776  DecoderLoss:0.08022  StopLoss:0.24092  AlignScore:0.0209  GradNorm:0.21876  GradNormST:0.12476  AvgTextLen:93.2  AvgSpecLen:496.5  StepTime:1.91  LoaderTime:0.62  LR:0.000054
   | > EPOCH END -- GlobalStep:13916  AvgTotalLoss:0.06130  AvgPostnetLoss:0.07161  AvgDecoderLoss:0.39027  AvgStopLoss:0.05399  EpochTime:187.54  AvgStepTime:1.32  AvgLoaderTime:0.43

 > Validation
   | > TotalLoss: 1.20810   PostnetLoss: 0.07182 - 0.07182  DecoderLoss:0.08552 - 0.08552 StopLoss: 1.05076 - 1.05076  AlignScore: 0.0975 : 0.0975
   | > TotalLoss: 0.54540   PostnetLoss: 0.07758 - 0.07427  DecoderLoss:0.09452 - 0.09072 StopLoss: 0.37330 - 0.63299  AlignScore: 0.0227 : 0.0458
warning: audio amplitude out of range, auto clipped.
 | > Synthesizing test sentences
 !! Error creating Test Sentence - 0
Traceback (most recent call last):
  File "train.py", line 482, in evaluate
    style_wav=style_wav)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 103, in synthesis
    inputs = text_to_seqvec(text, CONFIG, use_cuda)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 12, in text_to_seqvec
    CONFIG.enable_eos_bos_chars),
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 57, in phoneme_to_sequence
    to_phonemes = text2phone(clean_text, language)
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 31, in text2phone
    ph = phonemize(text, separator=seperator, strip=False, njobs=1, backend='espeak', language=language)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/phonemize.py", line 149, in phonemize
    logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 42, in __init__
    super(self.__class__, self).__init__(language, logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/base.py", line 43, in __init__
    'initializing backend %s-%s', self.name(), self.version())
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 104, in version
    long_version = cls.long_version()
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 92, in long_version
    '{} --help'.format(cls.espeak_exe()), posix=False)).decode(
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1295, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
 !! Error creating Test Sentence - 1
Traceback (most recent call last):
  File "train.py", line 482, in evaluate
    style_wav=style_wav)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 103, in synthesis
    inputs = text_to_seqvec(text, CONFIG, use_cuda)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 12, in text_to_seqvec
    CONFIG.enable_eos_bos_chars),
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 57, in phoneme_to_sequence
    to_phonemes = text2phone(clean_text, language)
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 31, in text2phone
    ph = phonemize(text, separator=seperator, strip=False, njobs=1, backend='espeak', language=language)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/phonemize.py", line 149, in phonemize
    logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 42, in __init__
    super(self.__class__, self).__init__(language, logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/base.py", line 43, in __init__
    'initializing backend %s-%s', self.name(), self.version())
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 104, in version
    long_version = cls.long_version()
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 92, in long_version
    '{} --help'.format(cls.espeak_exe()), posix=False)).decode(
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1295, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
 !! Error creating Test Sentence - 2
Traceback (most recent call last):
  File "train.py", line 482, in evaluate
    style_wav=style_wav)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 103, in synthesis
    inputs = text_to_seqvec(text, CONFIG, use_cuda)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 12, in text_to_seqvec
    CONFIG.enable_eos_bos_chars),
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 57, in phoneme_to_sequence
    to_phonemes = text2phone(clean_text, language)
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 31, in text2phone
    ph = phonemize(text, separator=seperator, strip=False, njobs=1, backend='espeak', language=language)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/phonemize.py", line 149, in phonemize
    logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 42, in __init__
    super(self.__class__, self).__init__(language, logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/base.py", line 43, in __init__
    'initializing backend %s-%s', self.name(), self.version())
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 104, in version
    long_version = cls.long_version()
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 92, in long_version
    '{} --help'.format(cls.espeak_exe()), posix=False)).decode(
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1295, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
 !! Error creating Test Sentence - 3
Traceback (most recent call last):
  File "train.py", line 482, in evaluate
    style_wav=style_wav)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 103, in synthesis
    inputs = text_to_seqvec(text, CONFIG, use_cuda)
  File "/home/marta/Desktop/inz/test/TTS/utils/synthesis.py", line 12, in text_to_seqvec
    CONFIG.enable_eos_bos_chars),
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 57, in phoneme_to_sequence
    to_phonemes = text2phone(clean_text, language)
  File "/home/marta/Desktop/inz/test/TTS/utils/text/__init__.py", line 31, in text2phone
    ph = phonemize(text, separator=seperator, strip=False, njobs=1, backend='espeak', language=language)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/phonemize.py", line 149, in phonemize
    logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 42, in __init__
    super(self.__class__, self).__init__(language, logger=logger)
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/base.py", line 43, in __init__
    'initializing backend %s-%s', self.name(), self.version())
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 104, in version
    long_version = cls.long_version()
  File "/home/marta/.local/lib/python3.6/site-packages/phonemizer/backend/espeak.py", line 92, in long_version
    '{} --help'.format(cls.espeak_exe()), posix=False)).decode(
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1295, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
 | > Training Loss: 0.06130   Validation Loss: 0.07542
 > Number of outputs per iteration: 5

In my thesis I want to present results from different settings, but I still have problems with memory, and Google Cloud offers (for free) only 13 GB, also with 8 threads.

How we exchange my dataset is up to you.

shad94 · December 17, 2019, 10:11pm

Ugh, the thing is, my time is very limited.
I should be done with training within 3 weeks.

alchemi5t · December 18, 2019, 4:10am

Oh, 3 weeks is going to be hard. I am tied up with my work right now. I thought you’d at least have a couple of months.

shad94 · December 18, 2019, 5:29am

It doesn’t have to be perfect though.
But yeah, it might be tough. However, my set is 10 times smaller than LJ-speech. So, the sooner we start, the better.
I should have started earlier, in November, and approached you then. I’m sorry.

shad94 · December 18, 2019, 9:02am

Anyway, I believe I should also have changed best_model_config.json, since it is meant more for English than for Polish and has fs = 20000, not 16000.
Where else should I make changes? In TTS there is a folder called TTS/tests,
and I believe I should make major updates there, too.
And this one: TTS/mozilla_us_phonemes <- should I create my own phoneme folder?
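
One way to check the sampling rate question before editing the config is to read the rate from the WAV headers directly. This is only a sketch under assumptions: it uses Python's standard `wave` module (so it expects uncompressed PCM WAV files), and the helper name `find_sample_rate_mismatches` and the directory layout are hypothetical.

```python
import wave
from pathlib import Path


def find_sample_rate_mismatches(wav_dir, expected_rate):
    """Return {file name: sample rate} for WAV files whose rate differs from expected_rate."""
    mismatches = {}
    for wav_path in sorted(Path(wav_dir).glob("*.wav")):
        # The stdlib wave module reads the header of uncompressed PCM WAV files.
        with wave.open(str(wav_path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != expected_rate:
            mismatches[wav_path.name] = rate
    return mismatches
```

An empty result for `find_sample_rate_mismatches("path/to/wavs", 16000)` would mean every file is at 16000 Hz, in which case the sampling rate in the config is the value to adjust rather than the audio.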

alchemi5t · December 18, 2019, 11:27am

Yes, you’ll have to adapt the config to your dataset.

No changes are needed in the tests folder.

I am sorry my replies are taking this long. I am really tied up at work. I’ll try and help you out as soon as I get some breathing space.

The phonemes are created in the first epoch, so you don’t have to create your own.

shad94 · December 18, 2019, 9:24pm

Anyway, I tried changing best_model_config.json, and the results are still poor (just noise), and there is still a problem with creating test sentences. I am making a mistake somewhere, but I don’t know where.
With 100 epochs, a batch size of 6 and a test batch size of 2, the model now takes ~9 hours to train.

alchemi5t · December 19, 2019, 2:41pm

Oh! A batch_size of 6 isn’t going to get you anywhere. The config has a comment saying that anything less than 32 has a hard time converging. Also, your dataset is key to training a TTS model: it needs to be clean and consistent.

shad94 · December 20, 2019, 2:32am

As you might remember, I cannot use a batch that large, because it is too much for my computer. I can share my dataset with you, which (I believe) I prepared well. (The password is alchemist; the link will be ‘alive’ for 7 days.)
TTS dataset in Polish

Btw, I only have 4 GB of GPU memory on my computer.

alchemi5t · December 20, 2019, 11:31am

I’ve tried downloading it but the download refuses to start. I’ve tried on different networks and browsers, no luck anywhere.

erogol · December 20, 2019, 11:45am

What you can also do for GPUs with little memory is gradient aggregation. It is not implemented in TTS, but it is quite easy to add, and it’d be a good PR as well.

To be clearer: you run your small batches for N iterations and accumulate the gradients. After N batches, you update the model once with the accumulated gradients.
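
The idea can be sketched on a toy problem. The code below is not TTS code: it is a plain-Python linear regression with hand-written gradients, and the names `mse_gradients` and `train_with_accumulation` are made up for illustration. It shows that summing the gradients of N equally sized small batches, each scaled by 1/N, gives the same update as one batch N times larger. In PyTorch the same pattern usually amounts to calling `backward()` on each small-batch loss (scaled by 1/N) and calling `optimizer.step()` and `optimizer.zero_grad()` only every N iterations.

```python
def mse_gradients(w, b, batch):
    """Gradients of the mean squared error of y = w*x + b over one batch."""
    n = len(batch)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in batch) / n
    return grad_w, grad_b


def train_with_accumulation(data, micro_batch_size, accum_steps, lr=0.1, epochs=100):
    """Gradient descent that updates the weights once per accum_steps micro-batches."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        # Accumulators are reset each epoch; leftover micro-batches that do
        # not fill a whole group of accum_steps are dropped in this sketch.
        acc_w, acc_b, seen = 0.0, 0.0, 0
        for start in range(0, len(data), micro_batch_size):
            micro_batch = data[start:start + micro_batch_size]
            grad_w, grad_b = mse_gradients(w, b, micro_batch)
            # Scale by 1/accum_steps so the sum equals the mean over the large batch.
            acc_w += grad_w / accum_steps
            acc_b += grad_b / accum_steps
            seen += 1
            if seen == accum_steps:
                # One weight update for accum_steps micro-batches.
                w -= lr * acc_w
                b -= lr * acc_b
                acc_w, acc_b, seen = 0.0, 0.0, 0
    return w, b
```

With 12 samples, `train_with_accumulation(data, 3, 4)` and `train_with_accumulation(data, 12, 1)` follow the same trajectory up to floating-point rounding, while the first only ever holds 3 samples in a batch. The memory saving comes at the cost of more forward and backward passes per update.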

shad94 · December 20, 2019, 12:22pm