Add Tamil Speech

Please add support for Tamil speech. There is a high need for Indic language support.

We will consider it! We are ramping up to localize the site very soon, and are investigating which languages to look at.

You could help us by looking for a large collection (at least thousands) of Tamil sentences that people can read into Common Voice. The restriction is that these sentences must already be in the public domain (i.e., CC0 or no license at all).

Once we have that, it becomes very easy to deploy a Tamil version of Common Voice.

That’s great! I assume it will be the same for all languages other than English. Do you have any plans to include other languages besides English? Diversity and localisation are the things I love about Mozilla <3

How about the Tamil Wikipedia texts? They seem to be licensed under a CC license.

@mhenretty, how about Vietnamese speech? I tried, but it did not work.

I was about to say this. Also, do check Tamil Wikisource. While Tamil Wikipedia is relevant for contemporary text, Wikisource might have older text. That will help balance the content, as you need both.

http://projectmadurai.org/ has a lot of public domain text in the formal Tamil language, which is used in news media, legal documents, Firefox, computers, etc. This, however, differs from the simplified version of Tamil used in informal everyday speech.

There is a website with many CC0 texts called freetamilebooks.com/. A very large number of Tamil sentences can be extracted from there. Please let me know what I can do to help bring about support for Tamil speech. For example, I can help extract text if an administrator tells me what formatting to use.
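
In case it helps, here is a rough sketch of how the extraction could work. It is only a guess: the output format (a list of sentences, to be written one per line), the word limits, and the function name `extract_sentences` are my own assumptions, not anything Common Voice has specified.

```python
import re

# The Tamil Unicode block runs from U+0B80 to U+0BFF.
TAMIL_CHAR = re.compile(r"[\u0B80-\u0BFF]")


def extract_sentences(text, min_words=3, max_words=14):
    """Split raw text into candidate sentences for reading aloud.

    min_words and max_words are hypothetical limits chosen for
    illustration; they are not an official requirement.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    pieces = re.split(r"(?<=[.?!])\s+", text)
    seen = set()
    sentences = []
    for piece in pieces:
        # Collapse line breaks and repeated spaces into single spaces.
        sentence = " ".join(piece.split())
        word_count = len(sentence.split())
        if not (min_words <= word_count <= max_words):
            continue  # too short or too long to read comfortably
        if not TAMIL_CHAR.search(sentence):
            continue  # contains no Tamil characters at all
        if sentence in seen:
            continue  # drop exact duplicates
        seen.add(sentence)
        sentences.append(sentence)
    return sentences


if __name__ == "__main__":
    sample = "இது ஒரு சோதனை வாக்கியம். Hello there my friend. இது ஒரு சோதனை வாக்கியம்."
    for line in extract_sentences(sample):
        print(line)
```

The license of each source text would still need to be checked by hand before anything is submitted; the script only filters and deduplicates.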