Interplay between review, validation and actual use in Common Voice?

hi there.

i am a total newbie in the community and wonder how i can contribute most effectively. i have written loads of sentences so far but wonder what the whole “end-to-end process” looks like. please help me understand, and correct me if i’m wrong:

  1. i write a sentence
  2. the sentence has to be reviewed by others (including me, if i want to)
  3. the sentence needs at least 2 out of 3 votes in favor to be validated
  4. the sentence becomes “validated”
  5. ???

how does the sentence make it to the website, where users or visitors can record and listen to it?

is there a regular (daily?) batch of validated sentences that gets released to the website?

thanks a lot in advance for clarification!

best, tschoerman

  5. The sentence gets exported into the voice.mozilla.org repository (currently done manually, about once a week)
  6. With the next deployment of the voice.mozilla.org website the new sentence will be added to its database (not 100% sure here to be honest)
  7. The sentence now appears for others to record
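
In case it helps to see the whole lifecycle in one place, here is a rough Python sketch of the flow as described in this thread. It is only an illustration: the names (`Sentence`, `weekly_export`, `deploy`) and the status labels are made up for this example and are not taken from the Sentence Collector or voice.mozilla.org code.

```python
from dataclasses import dataclass, field
from typing import List

VOTES_NEEDED = 2  # approvals required, per the thread ("2 out of 3")
MAX_VOTES = 3     # assumption: a sentence collects at most three votes


@dataclass
class Sentence:
    """Hypothetical model of one submitted sentence."""
    text: str
    votes: List[bool] = field(default_factory=list)
    exported: bool = False   # copied into the website repository
    deployed: bool = False   # loaded into the website database

    def review(self, approve: bool) -> None:
        """Record one reviewer's vote."""
        if len(self.votes) >= MAX_VOTES:
            raise ValueError("sentence already has three votes")
        self.votes.append(approve)

    @property
    def validated(self) -> bool:
        # True counts as 1, so sum() is the number of approvals.
        return sum(self.votes) >= VOTES_NEEDED

    @property
    def status(self) -> str:
        if self.deployed:
            return "recordable"
        if self.exported:
            return "exported"
        if self.validated:
            return "validated"
        return "pending"


def weekly_export(sentences: List[Sentence]) -> List[Sentence]:
    """Stand-in for the manual export: mark validated sentences as exported."""
    batch = [s for s in sentences if s.validated and not s.exported]
    for s in batch:
        s.exported = True
    return batch


def deploy(sentences: List[Sentence]) -> None:
    """Stand-in for a website deployment: exported sentences become recordable."""
    for s in sentences:
        if s.exported:
            s.deployed = True
```

A sentence with two approvals is "validated", but it only becomes "recordable" after both `weekly_export` and `deploy` have run, which is where the waiting time comes from.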

thanks for explaining.

i wonder why this process is manual at step 5? wouldn’t it be feasible and valuable to continuously feed the website queue with validated sentences from the sentence collector?

actually, with the german version of common voice we ran out of sentences for recording, and now old stuff is re-recorded over and over again! the bottleneck is obviously the person/process behind steps 5 and 6, which lags behind the actual recording/listening activity.

anyway, didn’t want to complain too much but would like to speed up collecting and generating voice data!

best, tschoerman

The sentence collector is currently a community-driven project that was created to solve some quality issues we found in the existing sentences. The Common Voice team hasn’t had the time and resources to do a full integration of this feature in the main site yet, but it’s on our future roadmap.
