There seems to be a consensus that meaningless sentences are not a problem (apart from the discomfort they cause the reader).
This suggests that the intended speech-to-text programs would rely only on the phonetics of words, not on sentence coherence.
In that case, why not use simple word lists,
such as lists ordered by frequency of use in each language?
(For French, the Ministry of Education publishes such lists: