2019 is quickly coming to a close, and we are analyzing data to see what worked in 2019 and what we can do better in 2020. We are also deciding the immediate and future direction of the product based on technical needs, community requests and dataset quality. Thank you to everyone who has been a part of the community in 2019. We look forward to working with you next year!
Community
Campaigns
In H2 we were able to collect hundreds of hours of data through campaigns run by the community team. They were able to test different methods and pull the right levers to ensure contribution across many languages. Below, you can see the number of hours validated in the second half of 2019.
Language and Accent strategy
The language and accent strategy is currently with the Mozilla Legal team, and work on it is expected to start in February 2020.
App Update
Roadmap
The app team is meeting in the first week of January to solidify and scope what the first half of 2020 will look like, and we will be able to share a roadmap shortly afterward.
In early 2020 we will be working on dataset optimization and on defining the parameters of a quality dataset. This requires us to work with the machine learning team to ensure that we are collecting data in a way that will be useful for everyone who needs it.
Partner Challenge
For those who have been following along with the Open Voice Data Challenge Pilot, we now have some results and are deciding how to move forward once we have addressed the app infrastructure needs.
Below you can see two of the metrics we looked at: week-over-week engagement and level of contribution. Overall, the challenge was a success, though a few tweaks need to be made before we release it to a wider audience.
- Contributors who are part of a challenge are much more likely to come back to the Common Voice site and contribute week over week.
- Contributors who were part of a challenge were also more likely to be classified as core contributors. Currently, 2% of Common Voice contributors speak or listen to 250 or more clips. In the challenge, we found that the number jumped to 29% for those involved in the pilot program.