ranulo (Show profile) August 13, 2019, 16:45:31
You can do two things there:
- Donate your voice: you record short sentences in a language of your choice. It is very important that this project gets a diverse collection of recordings. Right now, only 10% of the audio in the Esperanto dataset comes from female donors, and the project is also looking for contributors younger than 18 and older than 40.
- Listen to the recordings and validate that they are correct. This part is great for training your listening skills in Esperanto, and you learn a lot about different Esperanto accents from different countries.
This could also help sites like Lernu or Duolingo. Many language versions of Duolingo support speech recognition, but the Esperanto tree doesn't. I assume that this is because of the complete lack of Esperanto speech recognition software. The Common Voice project is the first project I know of that could really change this situation. A lot of startups and big companies will use this dataset simply because it is free. If they find a good dataset for Esperanto, there is a chance that at least a few of them will also train their systems with this data. (For example, Google already supports Esperanto in Google Translate.)
So in the end we might get Esperanto speech recognition in a few services. Plus, this dataset could also be very useful for speech synthesizers that use machine learning.
I find it a lot of fun to donate some time to this project every now and then. What do you think about it?