
Accented Character Encoding

Harun, May 28, 2009

Messages: 6

Language: English

Harun, May 28, 2009, 4:34:05 PM

Hello, all.

I'm trying to study PDF books on LERNU with my screen reader. It can read Esperanto out loud using the X-system. However, I have problems with the accented letters. This happens whether I use my screen reader to read the file or copy the text into an editor.

For example, I'm trying to read the alphabet section of the Intro to Esperanto (http://en.lernu.net/dosiero.php?id=/komunaj/elsxut...).

The uppercase letter "CX" is displayed as the copyright symbol, ©. The lowercase "cx" is a paragraph marker, ¶. These letters seem to be displayed properly on the screen, but perhaps there is an encoding problem?

Does anyone know which encoding is used, and how to convert these characters?

Thanks in advance for your help.
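
As a stopgap for the two substitutions described above (© in place of Ĉ, ¶ in place of ĉ), text copied out of the PDF could be repaired with a small lookup table and then rewritten in the X-system for the screen reader. This is only a minimal sketch: the names `PDF_SYMBOL_TO_ESPERANTO`, `repair_pdf_text`, and `to_x_system` are made up for illustration, and the repair table holds only the two pairs reported in this thread. The other accented letters would need their own entries once their replacement symbols are known.

```python
# Assumption: only the two substitutions reported in this thread are known.
# Entries for the other accented letters must be added once identified.
PDF_SYMBOL_TO_ESPERANTO = {
    "\u00a9": "\u0108",  # copyright sign (©) -> Ĉ
    "\u00b6": "\u0109",  # pilcrow sign (¶)   -> ĉ
}

# Conventional X-system spellings of the six accented Esperanto letters.
ESPERANTO_TO_X = {
    "Ĉ": "Cx", "ĉ": "cx",
    "Ĝ": "Gx", "ĝ": "gx",
    "Ĥ": "Hx", "ĥ": "hx",
    "Ĵ": "Jx", "ĵ": "jx",
    "Ŝ": "Sx", "ŝ": "sx",
    "Ŭ": "Ux", "ŭ": "ux",
}


def repair_pdf_text(text: str) -> str:
    """Replace the known stand-in symbols with the intended letters."""
    return text.translate(str.maketrans(PDF_SYMBOL_TO_ESPERANTO))


def to_x_system(text: str) -> str:
    """Spell accented Esperanto letters in the X-system."""
    return text.translate(str.maketrans(ESPERANTO_TO_X))


if __name__ == "__main__":
    pasted = "\u00b6u"  # what a copy of "ĉu" might look like
    print(to_x_system(repair_pdf_text(pasted)))
```

Note that a blanket replacement like this would also change any genuine © or ¶ in the text, so it is only safe on passages where those symbols are known to be stand-ins.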

jchthys, May 28, 2009, 6:09:21 PM

Is your screen reader set to an ASCII encoding instead of Unicode?

Harun, May 28, 2009, 8:22:12 PM

My screen reader (NVDA) uses Unicode, as far as I can tell.

I've tried converting the ASCII text to Unicode with no effect. The problem seems to be mostly with PDFs, like the ELIBRO books and LERNU PDFs.

The "Fratoj Grimm" content is read correctly.
http://en.lernu.net/biblioteko/rakontoj/gfabeloj/i...
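
One way to check what the pasted text really contains is to print the code point and Unicode name of each character. If the copied text shows up as COPYRIGHT SIGN and PILCROW SIGN, it is already valid Unicode that simply holds the wrong characters, which would be consistent with an ASCII-to-Unicode conversion having no effect. The helper name `describe_characters` is hypothetical; this is a sketch, not a diagnosis of these particular files.

```python
import unicodedata


def describe_characters(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


if __name__ == "__main__":
    # The two symbols seen after copying, next to the letters that were intended.
    for code_point, name in describe_characters("\u00a9\u00b6\u0108\u0109"):
        print(code_point, name)
```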

jchthys, May 28, 2009, 9:35:56 PM

Yes, I noticed that when I tried to copy and paste text from the PDF of Gerda Malaperis! all the letters with hats became different symbols. I think that the PDFs are not encoded properly.

tommjames, May 29, 2009, 12:29:01 PM

I too get the problem when copying and pasting the text from Gerda Malaperis. In Adobe Reader, I notice the encoding of the fonts is set to "Ansi", which presumably is wrong. You can view the encoding in the document properties, Font tab.

One other PDF from lernu that works correctly, though, is the Detala Gramatiko. In that file the encoding is set to "Built-in". I have no idea what that means, but perhaps it helps.

ceigered, May 30, 2009, 5:45:10 AM

tommjames: I too get the problem when copying and pasting the text from Gerda Malaperis. In Adobe Reader, I notice the encoding of the fonts is set to "Ansi", which presumably is wrong. You can view the encoding in the document properties, Font tab.

One other PDF from lernu that works correctly, though, is the Detala Gramatiko. In that file the encoding is set to "Built-in". I have no idea what that means, but perhaps it helps.

Perhaps the 'built-in' setting means that the encoding is defined at the application level (e.g. by the PDF reader) as opposed to the system level (e.g. encodings such as Unicode, ASCII, et al.).

And I don't think 'application level' and 'system level' are correct terminology but nonetheless...
