Hello, I am wondering if it's possible to split the GPU memory up into fractions so that I can run multiple DeepSpeech instances.
In TensorFlow you can configure the session object to use only a fraction of the available memory. Example here. Is there any way to configure the same thing in DeepSpeech?
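For context, the TensorFlow mechanism I mean is roughly this (the TF 1.x session-config API; the 0.3 fraction is just an example value):

```python
import tensorflow as tf

# TF 1.x: cap this process at roughly 30% of the GPU's memory
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.3
sess = tf.Session(config=config)
```

With a cap like that, several processes can share one GPU instead of the first one grabbing all the memory.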
For reference, I am using the Python DeepSpeech client.