Deep speech Uni/Bidirectional LSTM?

s.h.katebi97 · December 31, 2019, 12:09pm

Hi
Is the LSTM layer in Deep Speech v0.6 Unidirectional or Bidirectional?
According to Baidu’s paper, it is bidirectional, but it seems to be unidirectional in Mozilla’s Deep Speech.
Is that correct?

lissyx · December 31, 2019, 2:09pm

That’s correct; this is what we document at https://deepspeech.readthedocs.io/en/v0.6.0/DeepSpeech.html. Please send a PR or open an issue if you think the wording of the documentation needs improvement.

s.h.katebi97 · January 7, 2020, 1:21pm

Thank you for your fast response.

That’s correct

That leads to another question.
Why can’t Deep Speech generate output in real time?
Why does it need the whole audio file before it generates output?

reuben · January 7, 2020, 1:26pm

DeepSpeech is capable of streaming and can generate output faster than real time with appropriate hardware. We have extensive documentation on the streaming API as well as several examples.
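To connect this with the earlier answer: a unidirectional recurrent layer only looks at past input, so it can be fed audio chunk by chunk as long as its state is carried over between chunks. The snippet below is a toy sketch of that idea only. It is not DeepSpeech code and does not use the DeepSpeech API; `run_recurrence`, the weights, and the sample values are all made up for illustration.

```python
import math


def run_recurrence(frames, state=0.0, w_in=0.5, w_rec=0.9):
    """Toy unidirectional recurrent layer (hypothetical, not an LSTM).

    Each output depends only on the current frame and the state
    carried over from earlier frames, never on future frames.
    """
    outputs = []
    for x in frames:
        state = math.tanh(w_in * x + w_rec * state)
        outputs.append(state)
    return outputs, state


# Made-up "audio features", one number per frame.
audio = [0.1, -0.3, 0.7, 0.2, -0.5, 0.4, 0.0, 0.9]

# Offline: process the whole sequence in one call.
full_outputs, _ = run_recurrence(audio)

# Streaming: process small chunks and carry the state between calls.
streamed_outputs = []
state = 0.0
chunk_size = 3
for start in range(0, len(audio), chunk_size):
    chunk = audio[start:start + chunk_size]
    chunk_outputs, state = run_recurrence(chunk, state)
    streamed_outputs.extend(chunk_outputs)

# Both ways of feeding the data give the same per-frame outputs.
assert streamed_outputs == full_outputs
```

A bidirectional layer would also need a backward pass starting from the last frame, so it could not emit output for a frame until the whole input had been seen.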