Maximum length

sibtainraza158 · May 25, 2019, 4:09pm

DeepSpeech can transcribe audio files of any length. I used the pre-trained model to transcribe a 16-minute audio file, and the output it gave was correspondingly longer. My question is: how does it change the shape of its input and output tensors?
Is the tensor size fixed or dynamic?

pete · May 27, 2019, 8:32am

Hello, isn't the output tensor always the same size, namely the number of characters in the alphabet? So that's not going to change depending on the audio file size. The input, however, is padded to match the length of the audio, so it's dynamic. Correct me if I'm wrong.
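
More precisely, as far as I understand it, the width of the output at each time step is fixed by the alphabet, while the number of time steps follows the length of the audio. Here is a rough sketch in plain Python of how the shapes would scale. It is not DeepSpeech code: the function names `num_frames` and `tensor_shapes` are made up for illustration, and the values (16 kHz audio, 32 ms window, 20 ms step, 26 features per frame, a 28-character alphabet plus one CTC blank) are assumptions that may not match the model you are using.

```python
def num_frames(num_samples, sample_rate=16000, win_ms=32, step_ms=20):
    """Count the feature frames a sliding window yields (assumed framing)."""
    win = sample_rate * win_ms // 1000    # window length in samples
    step = sample_rate * step_ms // 1000  # hop between windows in samples
    if num_samples < win:
        return 0
    return 1 + (num_samples - win) // step


def tensor_shapes(duration_s, n_features=26, alphabet_size=28,
                  sample_rate=16000):
    """Return (input_shape, output_shape) for one audio clip.

    Both shapes share a time dimension that depends on the clip length.
    The last dimension of each is fixed: features per frame on the input
    side, alphabet size plus one CTC blank on the output side (assumed).
    """
    t = num_frames(int(duration_s * sample_rate), sample_rate)
    input_shape = (t, n_features)
    output_shape = (t, alphabet_size + 1)
    return input_shape, output_shape


if __name__ == "__main__":
    for seconds in (1, 60, 16 * 60):
        inp, out = tensor_shapes(seconds)
        print(f"{seconds:>4} s  input={inp}  output={out}")
```

Under these assumptions only the first dimension changes between a 1-second clip and the 16-minute file, which would explain why a longer recording gives a longer transcript without any change to the alphabet dimension.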