Hello everybody,
When I use the pre-trained model, it outputs the following:
```
Loaded model in 1.678s.
Loading language model from files models/lm.binary models/trie
Loaded language model in 5.425s.
Running inference.
Inference took 21.084s for 3.990s audio file.
```
Is there a way to load the model (and ideally the language model as well) only once, before submitting audio files? For example, when 10 users use the service one after the other, the model currently gets loaded 10 times.
I am building a transcription service on top of the DeepSpeech library, so loading the models once up front could yield a significant performance improvement.
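To make my question concrete, this is the load-once pattern I have in mind: the expensive model setup runs on the first request and every later request reuses the cached instance. The actual `deepspeech.Model(...)` call and language-model setup are stubbed out with a placeholder here so the sketch stays self-contained.

```python
import functools

@functools.lru_cache(maxsize=1)
def get_model():
    """Load the acoustic + language model exactly once.

    In the real service this would construct deepspeech.Model(...)
    and attach the language model; here it returns a placeholder
    object so the sketch is runnable without DeepSpeech installed.
    """
    print("loading model ...")   # happens only on the first call
    return object()              # placeholder for the loaded model

def transcribe(audio_bytes):
    model = get_model()          # cached after the first request
    # A real implementation would call something like model.stt(...)
    # on the decoded audio; we just echo the input size.
    return "transcript of %d bytes" % len(audio_bytes)
```

With this, ten consecutive users would trigger a single model load instead of ten.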
Regards,
Niklas