Add new French sentences

Hello,

I am making a list of French sentences, which mainly concern French cities, towns and villages.

How can I add them to the French language project?
Can I post them on GitHub?

Thanks,
Lulucmy

Make sure that your sentences are your own. Usually you can’t use sentences taken from books, articles, websites, etc.

To answer your question, I think you can make a pull request to add a new text file to the Common Voice GitHub project (https://github.com/mozilla/voice-web/tree/master/server/data/fr). If you don’t know how, it’s explained here: https://help.github.com/articles/creating-a-pull-request/ . Otherwise, someone here could do it for you.

Yep, as @J-b said, open a PR with a new file, named after your nickname for example.

Thanks!
I took the sentences from an 18th-century book, which is now in the public domain.
I’ll just make a pull request on GitHub :wink:

Do I have to put my sources somewhere?

@lulucmy: If you’re French, French copyright law requires you to cite the name of the original author (see “droit de paternité”), even if the book is in the public domain and the author died a long time ago.

I don’t know if the Common Voice team has any particular suggestions regarding this, but I guess you can mention the author’s name in the comment field when you make the pull request. Another solution may be to put the author’s name in the file name.

Yes, I checked, and it’s OK as far as the rights are concerned.

Good question. Since the sentences are all in the public domain, there are no requirements for this, and we don’t have any “best practice” suggestions either. I like @J-b’s suggestion of putting the author’s name in the file name. You could also add a README file to the fr folder mentioning your source. Up to you.

Thanks for your contributions!

BTW @lulucmy @J-b, since I have already seen some people opening issues about the French sentences, I’d like to suggest that you join https://github.com/mozfr/besogne/wiki/Common-Voice-Fr :-). It would be better to work together and fix things at the root instead of downstream.

I’d be very interested in participating! I just joined the Telegram group. Is there anything else I need to do to join the effort?

BTW, I’m working on some scripts to automatically detect non-French sentences and sentences with spelling mistakes. They won’t be perfect (particularly on short texts like these), and manual checks will still be necessary, but they may still help improve the overall quality of the sentences.
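
To give a rough idea of how non-French sentences could be flagged, here is a minimal sketch. It is not the actual script mentioned above: it assumes a simple heuristic (the share of very common French function words in a sentence), and the word list, the function names `french_score` and `flag_suspicious`, and the 0.15 threshold are all hypothetical choices for illustration. It does not check spelling at all.

```python
import re

# Hand-picked, deliberately short list of common French function words.
# This is an assumption for illustration; a real check would need a
# fuller list or a proper language-identification model.
FRENCH_STOPWORDS = {
    "le", "la", "les", "un", "une", "des", "du", "de", "et", "est",
    "en", "au", "aux", "dans", "sur", "pour", "par", "qui", "que",
    "il", "elle", "ce", "se", "ne", "pas", "à", "son", "sa", "ses",
}

# Matches runs of letters (including accented ones), skipping digits
# and punctuation such as apostrophes and hyphens.
WORD_RE = re.compile(r"[^\W\d_]+")


def french_score(sentence: str) -> float:
    """Return the fraction of words that are common French function words."""
    words = [w.lower() for w in WORD_RE.findall(sentence)]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in FRENCH_STOPWORDS)
    return hits / len(words)


def flag_suspicious(sentences, threshold=0.15):
    """Return the sentences whose score falls below the (arbitrary) threshold."""
    return [s for s in sentences if french_score(s) < threshold]


if __name__ == "__main__":
    samples = [
        "La ville de Lyon est située dans le sud-est de la France.",
        "The city of York is in the north of England.",
    ]
    for sentence in flag_suspicious(samples):
        print("Check manually:", sentence)
```

As noted above, short sentences make this kind of heuristic unreliable (a valid French sentence with few function words would be flagged), so the output is only a list of candidates for manual review.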