I need some clarification on the ignore_longer_outputs_than_inputs flag

@kdavis @reuben I was training DeepSpeech 0.5.0 on data I scraped from YouTube, using its closed captions (CC, a.k.a. VTT subtitles) as transcripts, when I got this error:

Not enough time for target transition sequence (required: 102, available: 0)
You can turn this error into a warning by using the flag ignore_longer_outputs_than_inputs

I passed ignore_longer_outputs_than_inputs=True to tf.nn.ctc_loss and the model started training again, but I need some clarification on this.

What does this error mean?

Why do I get this error? It might be true that my transcripts are not a 100% match to the audio, but I remember giving this model a completely wrong transcript before and it still trained on it.
And how can I tell how many training samples it is ignoring after setting this flag? What if it is skipping over all of the samples? I am not seeing even the slightest effect on the model after training all day…

So far there’s no better solution than either filtering on min/max length and/or doing a binary search to find the offending samples.

How do I filter on min/max length? Sorry, I did not fully understand that. :roll_eyes: :grimacing:
How do I find the offending samples? The error does not say anything about which sample it is stuck on…

You can look at the data directly. If the audio is too short for its transcript, it won’t work. Audio windows have a 20ms step between them, so to get the number of windows from an audio file you can just divide its duration by 20ms, and then compare that with the length of the transcript.
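
For reference, a minimal sketch of that check, assuming a DeepSpeech-style CSV with wav_filename and transcript columns; the train.csv file name is an assumption:

```python
import csv
import wave

WINDOW_STEP_MS = 20  # feature windows are 20 ms apart

def num_windows(wav_path):
    """Approximate number of feature windows in a WAV file."""
    with wave.open(wav_path, 'rb') as w:
        duration_ms = 1000.0 * w.getnframes() / w.getframerate()
    return int(duration_ms / WINDOW_STEP_MS)

too_short = []
with open('train.csv') as f:  # hypothetical file name
    for row in csv.DictReader(f):
        windows = num_windows(row['wav_filename'])
        if windows < len(row['transcript']):
            too_short.append((row['wav_filename'], windows, len(row['transcript'])))

print('Offending samples: %d' % len(too_short))
for path, windows, chars in too_short:
    print('%s: %d windows < %d characters' % (path, windows, chars))
```

Anything this prints is a candidate for the min/max length filtering mentioned above, and its count answers the "how many samples is it ignoring" question without relying on the flag.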


Good answer. However, as far as I know, the CTC loss calculation inserts a blank character ‘-’ between repeated characters of the transcript, or something like that… that makes comparing against the transcript length just an indicator, not an exact check. @reuben, what do you think?

I don’t think CTC blanks are relevant here.
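
For completeness, my reading of the check behind this error (an interpretation, not official TensorFlow documentation) is that the required length is the transcript length plus one extra frame for each pair of identical adjacent labels, so the repeats can be accounted for with a small helper:

```python
def min_required_frames(transcript):
    """Minimum number of CTC time steps for a transcript: one per
    label, plus one blank between each pair of identical adjacent
    labels (e.g. the 'll' in 'hello')."""
    repeats = sum(1 for a, b in zip(transcript, transcript[1:]) if a == b)
    return len(transcript) + repeats

print(min_required_frames('hello'))  # 6: 5 labels + 1 repeated pair
```

In practice the difference is small, which is why the plain transcript length works fine as an indicator.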

@reuben, @lissyx: I am using DeepSpeech v0.5.0, and I am also encountering this error. I have set ignore_longer_outputs_than_inputs=True:

total_loss = tf.nn.ctc_loss(labels=batch_y, inputs=logits, sequence_length=batch_seq_len, ignore_longer_outputs_than_inputs=True)

Now, when I run training, my training loss is always infinity. Kindly guide me on how to resolve it.

Epoch 0 | Training | Elapsed Time: 0:12:42 | Steps: 1142 | Loss: inf
Epoch 0 | Validation | Elapsed Time: 0:01:39 | Steps: 163 | Loss: 146.396210 | Dataset: …/german-speech-corpus/data_mailabs/dev.csv
I Saved new best validating model with loss 146.396210 to: /home/agarwal/.local/share/deepspeech/checkpoints/best_dev-1142
Epoch 1 | Training | Elapsed Time: 0:12:32 | Steps: 1142 | Loss: inf
Epoch 1 | Validation | Elapsed Time: 0:00:58 | Steps: 163 | Loss: 131.277453 | Dataset: …/german-speech-corpus/data_mailabs/dev.csv
WARNING:tensorflow:From /home/agarwal/python-environments/env/lib/python3.5/site-packages/tensorflow/python/training/saver.py:966: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.
I Saved new best validating model with loss 131.277453 to: /home/agarwal/.local/share/deepspeech/checkpoints/best_dev-2284
Epoch 2 | Training | Elapsed Time: 0:12:33 | Steps: 1142 | Loss: inf
Epoch 2 | Validation | Elapsed Time: 0:00:58 | Steps: 163 | Loss: 125.264005 | Dataset: …/german-speech-corpus/data_mailabs/dev.csv
I Saved new best validating model with loss 125.264005 to: /home/agarwal/.local/share/deepspeech/checkpoints/best_dev-3426
Epoch 3 | Training | Elapsed Time: 0:12:34 | Steps: 1142 | Loss: inf
Epoch 3 | Validation | Elapsed Time: 0:00:58 | Steps: 163 | Loss: 128.504051 | Dataset: …/german-speech-corpus/data_mailabs/dev.csv
Epoch 4 | Training | Elapsed Time: 0:08:50 | Steps: 918 | Loss: inf
(env) agarwal@wika:~/DeepSpeech$

@lissyx, could you please help with the above issue? Even after setting the flag, it didn’t work.

The training loss is inf while the validation loss is decreasing. I am using the German M-AILABS dataset.
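
For anyone landing here: one way to apply the min/max length filtering suggested earlier is to drop the offending rows from the training CSV up front instead of relying on the flag. A minimal sketch, assuming a DeepSpeech-style CSV with wav_filename and transcript columns and the 20 ms window step mentioned above (file names are assumptions):

```python
import csv
import wave

WINDOW_STEP_MS = 20

def num_windows(wav_path):
    # approximate number of feature windows in a WAV file
    with wave.open(wav_path, 'rb') as w:
        return int(1000.0 * w.getnframes() / w.getframerate() / WINDOW_STEP_MS)

def min_required_frames(transcript):
    # transcript length plus one frame per identical adjacent pair
    repeats = sum(1 for a, b in zip(transcript, transcript[1:]) if a == b)
    return len(transcript) + repeats

def filter_csv(in_path, out_path):
    with open(in_path) as fin, open(out_path, 'w', newline='') as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
        writer.writeheader()
        kept = dropped = 0
        for row in reader:
            if num_windows(row['wav_filename']) >= min_required_frames(row['transcript']):
                writer.writerow(row)
                kept += 1
            else:
                dropped += 1
    print('kept %d rows, dropped %d offending rows' % (kept, dropped))

filter_csv('train.csv', 'train_filtered.csv')  # hypothetical file names
```

This keeps the bad samples out of the loss calculation entirely, so they can no longer poison the reported training loss.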