Android project with .pb files instead of .tflite

Details on https://github.com/mozilla/DeepSpeech/issues/2612


Thank you, I will check it in a while and post the transcription results for a .wav file.

About the lm_alpha parameter: I see that in this project it is read from the JSON file included inside the .zip with the models (anyway, I can hardcode it to 2.0 instead of 0.75), BUT in my project it just isn't used anywhere!! :grinning: :grinning: I will have to check the original code and find out where I omitted this :blush:
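
For illustration, a minimal sketch of reading such a value from a bundled JSON; the file name info.json and the key lm_alpha are assumptions for the sketch, not confirmed names from that project:

    import org.json.JSONObject;
    import java.io.File;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;

    // Hypothetical sketch: "info.json" and the "lm_alpha" key are assumed names.
    static float readLmAlpha(File modelDir) throws Exception {
        byte[] raw = Files.readAllBytes(new File(modelDir, "info.json").toPath());
        JSONObject info = new JSONObject(new String(raw, StandardCharsets.UTF_8));
        // Fall back to the 0.6.0 default of 0.75 if the key is missing.
        return (float) info.optDouble("lm_alpha", 0.75);
    }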

Don't test the LM hack, just make a new export from current master with the released v0.6.0 checkpoints.

You need it when you initialize the language model. Not enabling the language model will yield worse results, and might increase inference time.
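
A minimal sketch of where that initialization happens in the 0.6.x Java bindings (the paths are placeholders, 0.75/1.85 are the 0.6.0 release defaults for lm_alpha/lm_beta, and the exact signatures should be verified against your bindings version):

    import org.mozilla.deepspeech.libdeepspeech.DeepSpeechModel;

    // Sketch only: model and LM paths are illustrative.
    static DeepSpeechModel loadModel(String dir) {
        final int BEAM_WIDTH = 500;
        DeepSpeechModel model = new DeepSpeechModel(dir + "/output_graph.tflite", BEAM_WIDTH);
        // lm_alpha only has an effect here, when the LM decoder is enabled.
        model.enableDecoderWithLM(dir + "/lm.binary", dir + "/trie", 0.75f, 1.85f);
        return model;
    }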


Yes I can confirm that the transcription is correct now with the last "fix" from here

Now I see "why should one halt on the way" correctly!!

Thank you again…

Tomorrow I will look into the lm_alpha parameter on Android and Colab!

Again, you donā€™t need to change from the default value.


Ok… just experimenting a bit.

I am focusing on good-quality audio with noise reduction. Until now I had to turn up the volume programmatically to get better results while creating the recording:

    // Boost microphone volume: scale each 16-bit sample by a fixed gain,
    // clamping to the short range so loud input saturates instead of wrapping.
    int readBytes = mRecorder.read(buffer, 0, BUFFER_SIZE);
    if (readBytes > 0) {
        for (int i = 0; i < readBytes; ++i) {
            int boosted = (int) (buffer[i] * 5.7);
            buffer[i] = (short) Math.max(Short.MIN_VALUE, Math.min(boosted, Short.MAX_VALUE));
        }
    }

Speak louder? :slight_smile:

You should have a look at the deepspeech implementation in mozillaspeechlibrary from the androidspeech repo you linked earlier; there's a parameter to set the input source that can have an impact.


That: https://github.com/mozilla/androidspeech/commit/2bf0774519fa58249e214bfc34b72b1e742d50a1
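
For context, that commit touches the capture source. Below is a minimal sketch of configuring AudioRecord with the VOICE_RECOGNITION source, assuming the 16 kHz mono 16-bit format the released DeepSpeech models expect:

    import android.media.AudioFormat;
    import android.media.AudioRecord;
    import android.media.MediaRecorder;

    // VOICE_RECOGNITION asks the platform for a recognition-tuned capture path,
    // which often gives cleaner input than the default MIC source.
    static AudioRecord createRecorder() {
        int sampleRate = 16000;  // released DeepSpeech models expect 16 kHz mono
        int minBuf = AudioRecord.getMinBufferSize(sampleRate,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
        return new AudioRecord(MediaRecorder.AudioSource.VOICE_RECOGNITION,
                sampleRate, AudioFormat.CHANNEL_IN_MONO,
                AudioFormat.ENCODING_PCM_16BIT, minBuf * 2);
    }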


Good morning @lissyx,

After running some examples on Colab, here are the results of my findings.

A 26-second .wav file takes about 13 seconds to transcribe on CPU and 7 seconds on GPU using the .pb file. On the other hand, when I use the same audio file with the .tflite model, the transcription takes 34 seconds (and the transcription is worse)!

I am lost here! I thought .tflite was for faster results.

Check also our previous messages… where you posted your example with the .pb and .tflite model files' overall processing time!

Am I wrong somewhere?

Could you please give feedback on what I requested? A bit worse transcription is expected, but I need to know if the issue we fixed improved things for you.

A transcription time of 34 s seems high, but that also depends on the CPU itself. The TensorFlow runtime leverages several cores; TFLite only uses one core at a time.

And I insist: given your report, and given what I have been able to reproduce and verify after the fix, it's very likely that the major source of discrepancy is fixed.


The transcriptions of the .wav files you provide are fine and accurate.
Now I am trying to get better results with my own recordings. With your help and your comments the background noise is cut dramatically and I now get better transcriptions.

I just made the comment about the processing time because I find it odd for the .tflite model to take longer than the .pb file… and that is what makes me wonder whether I will be able to build a project using the .pb file inside Android…

I just cannot sleep thinking about that… :sweat_smile:

Well, what's the hardware? You still have not replied to that.

Here are the specs of the Colab CPU:

The thing is that with the same beam_width (500), transcription takes the same amount of time on my phone (8 cores, 4 GB RAM). And it also uses the same CPU for the .pb file.

Not sure what you mean here: does it mean you get the same good TFLite execution time on your own CPU as well as on your phone, but it's slower on Colab?

I don't know the details of Colab; there's hardly anything we can do from here: who knows how the resources are shared amongst VMs.

Also, that's not very descriptive. As documented, we only tested and verified faster-than-realtime performance on Snapdragon 820 (Sony Xperia Z) and 835 (Google Pixel 2); you might not get the same behavior on other hardware.

Anyway, I am happy we have solved the issue with the 0.6.0 .tflite file.
About 2 minutes ago I updated the dependency of the deepspeech (android) module to 0.6.1.alpha0. Everything still transcribes fine!
A 2-second file takes 1.5 seconds to transcribe… so if I run the transcription in a loop, the current file is transcribed before the new .wav file is created!
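
For illustration, a minimal sketch of that overlap, transcribing the previous chunk on a worker thread while the next one records; isRecording() and recordChunk() are hypothetical stand-ins for the capture loop above, and model.stt(short[], int) is the 0.6.x Java-bindings call:

    import android.util.Log;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import org.mozilla.deepspeech.libdeepspeech.DeepSpeechModel;

    // Sketch only: pipeline recording and inference so they overlap.
    static void transcribePipelined(DeepSpeechModel model) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        while (isRecording()) {                  // hypothetical helper
            short[] chunk = recordChunk();       // hypothetical: ~2 s of 16 kHz samples
            // Off-load inference so the next chunk records while this one decodes;
            // at ~1.5 s of compute per 2 s of audio this keeps up with real time.
            worker.submit(() -> Log.d("DeepSpeech", model.stt(chunk, chunk.length)));
        }
        worker.shutdown();
    }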

Sweet!

You mean to 0.6.0 final?

So does it mean your issues are mostly solved and you now have an experience closer to what we expect?

That does sound good. TFLite runtime, I hope?

Yes, sorry… I meant 0.6.1.alpha0 (I edited my previous comment).

I also found inside native_client how I can build the .so file without runtime=tflite… BUT I will not do that :sweat_smile: :sweat_smile:

Yes, .tflite runtime… I will keep experimenting with good audio recordings.

I will keep monitoring next releases of yours.

Thanks


Thanks for taking the time to investigate and help us find that issue on the model.
