Common Voice Project Update - November 5th, 2019

The Open Innovation Voice team has started biweekly meetings to review the work we are doing and call out anything blocking our progress. We are excited to share updates with the community as part of this series and keep everyone up to date on what we are working on.

Engineering Support

The previous engineer working on Common Voice has moved to another team, and we are excited to bring Jenny Zhang onto the project as lead engineer and Riley Shaw as a contractor. You will see them start to show up on GitHub and other voice channels as they get settled in.

Community

Metrics

We are actively working with a small group of contributors to build a community metrics dashboard that will give active members of the community a view into many aspects of the project's metrics while still adhering to the privacy practices outlined in our terms of service.

This dashboard is being set up in Kibana and will provide information such as:

  • Current data splits in real time; e.g. sex, age, accent distribution
  • Contribution impact based on time frames or events; e.g. past campaigns set as milestones
  • Quality of contribution:
    • Overall number of validated hours
    • Visualize how many validated hours are repetitions of the same sentence
    • Identify voice clip rejection rate
    • Identify how many clips have been reported, including filter by report attribute
    • Identify clips with multiple reports (from Listen)
  • Sentence health:
    • Visualize the number of sentences a language has left for contribution
    • Identify how many sentences have been reported, including report attribute
    • Identify sentences with multiple reports (from Record)
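
As a rough illustration of how a few of the quality metrics above could be computed, here is a minimal Python sketch. It is not the dashboard's actual implementation: the `Clip` record, its field names, and the `quality_summary` function are all hypothetical, and it assumes "repetitions" means every validated clip whose sentence has more than one validated recording.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Clip:
    # Hypothetical record layout; the real Common Voice schema may differ.
    sentence_id: str
    duration_s: float
    status: str  # "validated", "rejected", or "pending"
    reports: int = 0


def quality_summary(clips):
    """Summarise contribution quality for a list of Clip records."""
    validated = [c for c in clips if c.status == "validated"]
    rejected = [c for c in clips if c.status == "rejected"]
    reviewed = len(validated) + len(rejected)

    # Count validated recordings per sentence to find repeated sentences.
    per_sentence = Counter(c.sentence_id for c in validated)
    repeated_s = sum(
        c.duration_s for c in validated if per_sentence[c.sentence_id] > 1
    )

    return {
        "validated_hours": sum(c.duration_s for c in validated) / 3600,
        "repeated_validated_hours": repeated_s / 3600,
        # Pending clips are excluded from the rejection rate.
        "rejection_rate": len(rejected) / reviewed if reviewed else 0.0,
        "clips_with_multiple_reports": sum(1 for c in clips if c.reports > 1),
    }
```

Calling `quality_summary` on a list of `Clip` records returns a plain dictionary, which would be straightforward to index into a tool such as Kibana. Whether pending clips should count toward the rejection rate is a design choice; this sketch leaves them out.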

We are improving how we capture stats directly in the Common Voice app so communities can fully understand the impact of their individual events and campaigns in a way everyone can visualize.

Campaigns

Konstantina and Ruben ran an extremely successful campaign which brought in over 60,000 new contributors and a huge number of new hours.

The October 14th campaign (email, snippets (the banner at the bottom of the Firefox new tab), and social) for English, German, French, and Spanish was a success:

  • German: +18 recorded +15 validated (5x)
  • English: +65 recorded +48 validated (7x)
  • French: +48 recorded +31 validated (6x)
  • Spanish: +50 recorded +30 validated (15x)
  • 11x increase in account creations
  • Organic growth during week 2 is still higher: English 2x, Spanish 4x, French 1.5x, German 2.5x

App Roadmap

Due to low engineering resourcing over the past 2-3 months, we have been unable to get all of the work done that we would like. We are ramping up our new engineers and engaging the community to help move things forward. By combining these efforts, we expect to regain our momentum in the coming months.

Partner Challenge

Much of the work we are doing right now is building out functionality for partner challenges. We are currently working on a pilot to see how different companies can work together to increase the velocity of data collection.

This will allow us to roll out tested advanced features to Common Voice in the future. We are currently working with small teams from SAP and IBM to implement a pilot for an initial challenge. This will show us what teams want from the experience and what support is needed for future iterations leading up to a full challenge release. We will keep you up to date on how this comes together and on next steps as we move forward.

Internal IT Support

You may have noticed a Common Voice outage during the campaign two weeks ago, due in part to the huge influx of traffic. We discovered that we needed an upgrade to our internal support tier and have worked with IT infrastructure to ensure we don’t have the same problem again.

Email Implementation

We are working with contractors as well as our internal email team to implement features such as contribution reminders for custom goals and multi-language email support so we can start to email people in a larger set of languages.

Community Campaigns

We’re creating two more campaign opportunities this year as we work to grow the dataset across languages and improve the rate of contribution to Common Voice overall.

Dataset Release

Due to resourcing, we are currently reviewing when the next dataset release will be and will update the community soon on the timeline.

Voice Research

Emma Irwin, our colleague on the Open Innovation team, did great research on how people are using the data and the improvements we could make. We expect to be able to release the research to the community by the end of November.

Thanks for the update!

What does “active” mean in this context? Do you have to be invited by a Mozilla employee, or is it determined automatically by login frequency, number of recordings/validations, etc.?

Also, has a date been set for the next marketing campaign? I’d like to clean up the wiki sentences a little more before the next one if I have time.

We are aiming for November 18th.

When the community metrics dashboard is released, you will need a Common Voice profile to use it. Contributors without a profile will not have access.