How can I extract an intermediate-layer embedding for an audio file from a trained DeepSpeech model?
The easiest way to get started is to add the layer of interest, e.g. layers['layer_5'], to the list of fetches passed as the first argument of session.run, and add a corresponding variable on the left-hand side of the = to receive its value.
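As a minimal sketch of that fetch pattern: the toy two-layer network below is a stand-in, not the real DeepSpeech graph, and the `layers['layer_5']` key just mirrors the naming used above. The point is only how an extra tensor is added to the `session.run` fetch list and bound on the left of the `=`.

```python
import numpy as np
import tensorflow as tf

# DeepSpeech's training code is TF1-style graph code; emulate that here.
tf.compat.v1.disable_eager_execution()

# Toy stand-in graph with one named intermediate layer (not DeepSpeech itself).
inputs = tf.compat.v1.placeholder(tf.float32, shape=[None, 26], name="input_features")
w1 = tf.Variable(tf.random.normal([26, 2048]))
b1 = tf.Variable(tf.zeros([2048]))
hidden = tf.nn.relu(tf.matmul(inputs, w1) + b1)
w2 = tf.Variable(tf.random.normal([2048, 29]))
b2 = tf.Variable(tf.zeros([29]))
logits = tf.matmul(hidden, w2) + b2
layers = {"layer_5": hidden, "output": logits}

with tf.compat.v1.Session() as session:
    session.run(tf.compat.v1.global_variables_initializer())
    features = np.random.rand(1, 26).astype("float32")
    # Add the layer of interest to the fetch list (first argument of
    # session.run) and bind a matching variable on the left of the `=`:
    embedding, out = session.run(
        [layers["layer_5"], layers["output"]],
        feed_dict={inputs: features},
    )
print(embedding.shape)  # the intermediate-layer embedding for this input
```

In the real code you would reuse the `layers` dict returned when the DeepSpeech graph is built, rather than constructing a toy graph.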
Is there code to run inference on an audio file with a DeepSpeech model without using the deepspeech binary?
Sure, have a look at the FLAGS.one_shot_infer flag in the DeepSpeech.py file.
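A rough sketch of how that flag is invoked from the training repo; the paths are placeholders, and depending on your checkout you may need additional flags (e.g. for the alphabet or scorer) for it to run:

```shell
# Run single-file inference via the training code instead of the
# deepspeech binary. Paths below are placeholders for your own setup.
python DeepSpeech.py \
  --checkpoint_dir /path/to/checkpoint_dir \
  --one_shot_infer /path/to/audio.wav
```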