
Esperanto text-to-speech

by jonsd, 2007-01-31

Messages: 3

Language: English

jonsd, 2007-01-31 16:14:27

eSpeak is an open source speech synthesizer which can speak text from several languages (some better than others). Some languages have been improved by feedback and assistance from native speakers.
I've included an Esperanto voice. I don't speak Esperanto myself, so it's based only on what I've read. But since Esperanto has regular spelling and pronunciation rules, that may be OK.
If anyone's interested, it's available in Windows and Linux versions, free from:
http://espeak.sourceforge.net/
I'd be interested in any feedback. Are there sounds which it pronounces wrongly? Is the rhythm OK? Is it useful for speaking text passages for learning purposes?

Kwekubo, 2007-01-31 17:01:47

For someone who's only read through an introduction to the language, I don't think you've done a bad job at getting the sounds. I do have a few suggestions though:

-The rhythm is a little fast; the voice reads at a fast pace and doesn't tend to stop "to take breaths" where a live speaker would.

-The program tends to elide the letter j when it occurs at the end of a word - I think that this might be a side effect of the quite fast speed.

-If you could make the vowel sounds a little "purer", it might help - try using Italian-style vowel sounds.

jonsd, 2007-01-31 17:46:01

Kwekubo:
-The rhythm is a little fast; the voice reads at a fast pace and doesn't tend to stop "to take breaths" where a live speaker would.
Thanks for some interesting comments.
You can of course alter the speaking rate and make it slower, but that's probably not what you meant. Pauses are an important point. A simple text-to-speech program doesn't understand what it's speaking, but eSpeak does make a short pause before certain words (e.g. conjunctions, some prepositions, some correlatives). I've probably missed some words where a pause would be good.
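For anyone who wants to try a slower rate from a script, here is a minimal Python sketch. It assumes the espeak command-line program is installed and on the PATH; `-v` selects the voice and `-s` sets the speed in words per minute. The helper names and the value 120 are only illustrative choices, not recommended settings.

```python
import shutil
import subprocess


def build_espeak_command(text, voice="eo", words_per_minute=120):
    """Return the argument list for one espeak call at the given rate."""
    # -v selects the voice, -s sets the speed in words per minute.
    return ["espeak", "-v", voice, "-s", str(words_per_minute), text]


def speak(text, **kwargs):
    """Speak the text if espeak is installed; return False if it is not found."""
    if shutil.which("espeak") is None:
        return False
    subprocess.run(build_espeak_command(text, **kwargs), check=True)
    return True
```

Calling `speak("Saluton, mondo", words_per_minute=100)` would then read the phrase more slowly, which makes it easier to hear where pauses are missing.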
You can see this in the file: dictsource/eo_list (in the Windows version this is in the espeakedit download). Some words have a $pause or $brk (a very short pause) attribute. Also some common words have a $u (unstressed) attribute.
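To see at a glance which words carry which attribute, something like the following Python sketch could be used. It assumes a simplified layout for the list file: one entry per line, the word first, attributes marked with `$`, and `//` starting a comment. The sample entries are made up for illustration and are not copied from the real eo_list.

```python
from collections import defaultdict


def words_by_attribute(lines):
    """Map each $attribute to the list of words that carry it."""
    result = defaultdict(list)
    for raw in lines:
        # Drop comments (assumed to start with //) and surrounding whitespace.
        line = raw.split("//", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        word = fields[0]
        for field in fields[1:]:
            if field.startswith("$"):
                result[field].append(word)
    return dict(result)


# Hypothetical sample entries in the assumed layout.
sample = [
    "// made-up entries for illustration",
    "kaj    $u $pause",
    "sed    $pause",
    "la     $u",
    "de     $u $brk",
]

attributes = words_by_attribute(sample)
```

To inspect the real file, pass `open("dictsource/eo_list", encoding="utf-8")` instead of `sample`; entries that do not fit the assumed layout may need extra handling.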
Another possibility is that the stressed (penultimate) syllables should be longer. They are already longer than unstressed syllables. Perhaps they should be more so?
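As a small illustration of the penultimate-stress rule, the Python sketch below finds the stressed vowel of a single Esperanto word. It is not how eSpeak does it internally; it only assumes that the vowels are a, e, i, o, u (j and ŭ do not count as vowels) and that the word is written with precomposed accented letters.

```python
VOWELS = set("aeiou")


def stressed_vowel_index(word):
    """Return the index of the stressed vowel, or None if the word has no vowel."""
    positions = [i for i, ch in enumerate(word.lower()) if ch in VOWELS]
    if not positions:
        return None
    # Stress falls on the penultimate vowel; a one-vowel word keeps its only vowel.
    return positions[-2] if len(positions) >= 2 else positions[0]
```

For "esperanto" this picks the "a", and for "ankaŭ" the first "a", since "aŭ" is a diphthong with only one vowel in it. A lengthening experiment would then apply only to the syllable around that index.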
Kwekubo:
-The program tends to elide the letter j when it occurs at the end of a word - I think that this might be a side effect of the quite fast speed.
What exactly do you mean by "elide"? "aj", "oj", etc. are diphthongs. Do you mean that the second part of the diphthong (the "i" part) should be longer, or the whole diphthong should be longer (compared with single vowels), or that there should be a break before the next word if it begins with a vowel?
Kwekubo:
-If you could make the vowel sounds a little "purer", it might help - try using Italian-style vowel sounds.
What do you mean by "purer"? Which vowels? All of them?
The vowels "a", "e", "o", "u" are indeed pure monophthongs. "i" closes slightly at the end, but that's probably natural. Perhaps you mean that the actual sound of the vowels should be slightly different (e.g. the "u" should be more "back")? Do the vowels in the Italian voice sound any better?
If anybody is interested in this sort of thing, there's a lot of experimenting you can do by changing vowel sounds, adjusting their lengths, etc. In the "espeakedit" download there's a large selection of vowels and diphthongs you can try, or even make your own. Look at phsource/vowelcharts for a diagram of the vowel positions for each language (many of these are guesses), or phsource/vowel/vowelchart.png for the available set of vowels.
