I’m on Windows, and I don’t believe I can use the deepspeech package there, so I downloaded the pretrained model and loaded it in my script using Keras. My question is how to actually use the model. What type of input does it expect? How should the audio be preprocessed? What kind of output does the model give? I also don’t quite understand how the output graph works.
Apologies if this seems like a basic problem.