Hello. Where can I watch a video or read a description of what happens to the signal in the steps before the features are computed? How are the features of the recognized signal computed, and on what principles? For example, as in my system, ERS VCRS.
Which algorithms and models are used after the features are extracted and the neural network is trained? I need all of this to understand how it differs from other speech recognition systems such as CMU Sphinx, so that I can get started quickly and make my experiments more productive.