I created a script to get word usage info from the sentence files in the voice-web repository:
It’s an easy way to spot misspelled words and words that need more coverage. You can also link it to a dictionary and it will tell you which dictionary words don’t exist in the corpus.
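For anyone curious how this kind of check works, here is a minimal sketch of the idea in Python. It is not the script itself: the function names, the tokenizing regex, and the `max_count` threshold are all assumptions made for illustration. It counts how often each word appears in a list of sentences, lists the words that appear only rarely (likely typos or words needing more coverage), and reports dictionary words that never appear.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of letters, optionally joined by apostrophes
# (so "don't" stays one word). A real script may tokenize differently.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")

def count_words(sentences):
    """Count case-insensitive word occurrences across all sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(WORD_RE.findall(sentence.lower()))
    return counts

def rare_words(counts, max_count=1):
    """Words seen at most max_count times: possible typos or low coverage."""
    return sorted(word for word, n in counts.items() if n <= max_count)

def missing_from_corpus(counts, dictionary_words):
    """Dictionary words that never occur in the counted sentences."""
    wanted = {word.lower() for word in dictionary_words}
    return sorted(wanted.difference(counts))

# Made-up sample data, just to show the calls.
sentences = ["The cat sat.", "The dog sat too.", "Teh cat ran."]
counts = count_words(sentences)
print(rare_words(counts))
print(missing_from_corpus(counts, ["cat", "dog", "bird", "The"]))
```

In practice the sentences would be read line by line from the sentence files rather than from a hard-coded list, and the rare-word list still needs a human look, since a word that appears once may be a typo or simply uncommon.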