Dictionary Proposal

by Qwertie, 2005-August-02

Messages: 19

Language: English

Qwertie 2005-August-02 19:53:11

(this is the English version of a message I posted in the Esperanto-language forum)

It's been bothering me for a long time that I don't have a good Esperanto-English dictionary.  I have no paper one, so I've been using only the one here at Lernu, which, if I may say so, is very poor.  It lacks even the most basic information, such as verb transitivity and the grammatical classes of roots (/a, /o, /i, etc.).

Despite the dictionary here, I believe I've learned to speak Esperanto fairly well.  Still, the meaning of some words eludes me.  I can easily believe that the Esperanto movement suffers because of this dearth of good dictionaries.  Although (I've heard) the Esperanto-Esperanto dictionaries are pretty good, beginners need a good dictionary for translating between Esperanto and their native language.  I mean, how can you understand an all-Esperanto dictionary until you already know most of the words?

For a long time now, I've wanted a way to improve some internet dictionary.  Have you seen Wikipedia (http://www.wikipedia.org)?  I'd certainly want something similar, so that anybody knowledgeable can add their knowledge of translations to the national-language dictionaries.

Of course, a pure (standard) wiki wouldn't be good for creating a dictionary.  I've seen the "Wiktionary" at http://eo.wiktionary.org and I'm not impressed.  It seems to me that:

1. It's too difficult to edit the dictionary (since it's just a wiki, not specifically designed for dictionaries)
2. It isn't easy enough to find words in it.  I really like how you can simply click a word on Lernu.net and get the definition!  It should be a standard feature.
3. The wiki doesn't enforce any particular format for the data, like a database would.  By enforcing a specific format, the data can be more easily extracted and manipulated by computer programs, and it can be presented in different ways according to (for example) the taste of the user.  As a programmer I can tell you that highly structured data can make computer manipulation/analysis much easier.
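
To make point 3 concrete, here is a minimal sketch of what an enforced entry format could look like. The class names, the field names and the "tr"/"ntr" transitivity codes are assumptions made up for this illustration; nothing here is an existing Lernu or Wiktionary format. The point is only that one validated record can be shown in more than one way.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical vocabularies; a real project would define its own.
ROOT_CLASSES = {"a", "o", "i"}
TRANSITIVITY = {None, "tr", "ntr"}


@dataclass
class Sense:
    translation: str
    example: str = ""
    example_translation: str = ""


@dataclass
class Entry:
    headword: str                      # e.g. "mal-nov/a"
    root_class: str                    # grammatical class of the root
    senses: List[Sense] = field(default_factory=list)
    transitivity: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject data that does not fit the format, as a database would.
        if self.root_class not in ROOT_CLASSES:
            raise ValueError(f"unknown root class: {self.root_class!r}")
        if self.transitivity not in TRANSITIVITY:
            raise ValueError(f"unknown transitivity: {self.transitivity!r}")


def render_compact(entry: Entry) -> str:
    """One-line view, e.g. for a pop-up."""
    return f"{entry.headword}: " + "; ".join(s.translation for s in entry.senses)


def render_full(entry: Entry) -> str:
    """Multi-line view with numbered senses and examples."""
    lines = [f"{entry.headword} (/{entry.root_class})"]
    for number, sense in enumerate(entry.senses, start=1):
        line = f'{number}. "{sense.translation}"'
        if sense.example:
            line += f': "{sense.example}" => "{sense.example_translation}"'
        lines.append(line)
    return "\n".join(lines)


malnova = Entry("mal-nov/a", "a", [Sense("old", "malnova papero", "old paper")])
print(render_compact(malnova))
print(render_full(malnova))
```

Because the data is held separately from its presentation, the same entry can feed a short pop-up view and a full page view without anyone retyping it.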

However, no doubt there is someone out there who has the ability and desire to create some kind of good system to let people improve the Vortaro.

I would like to make a request to the editors at Lernu.net: Please open this site's dictionary, and release it under the conditions of a free license, such as the GNU Free Documentation License (GFDL, see http://www.gnu.org/copyleft/fdl.html) or one of the Creative Commons licenses (CC, see http://creativecommons.org/).  Of course, since those licenses are in English, it's probably a good idea to translate the chosen license into Esperanto.

Once the vortaro is made free, someone can create a system under which people can improve it.  I would like to create that system myself, but my knowledge of internet programming isn't very good.  Still, I could probably help someone else, someone with superior knowledge, to make such a system.

It's worth understanding that an internet dictionary can become fuller, better, more useful and more interesting than a paper dictionary.  The length of definitions is not important, since no one has to pay for paper to put them on; so an internet vortaro can express everything much more fully.  Plus, an internet vortaro can find definitions instantly with a simple click.

And now I hereby publish my personal idea of what a good new dictionary might look like.

-------------------------------------

[_] Prefer Esperanto explanations
[_] Let me make changes and comments
[Esperanto-Angla_____]
[malnova_______] [Go!]

EO
   mal-nov/a:
   1. "old": "malnova papero" => "old paper"
   Note: "malnova" is not used to describe people or animals.
      "old man" <=> "maljuna viro"
   See also: maljuna

English
   No hits.

-------------------------------------

[_] Prefer Esperanto explanations
[X] Let me make changes and comments
[Angla-Esperanto_____]
[with__________] [Ek!]

English
   with:
   prep.1. (to express togetherness) "kun": "he came with her" => "li venis kun ŝi"
   prep.2. (to express the means by which something is done) "per": "to hit with a hammer" => "bati per martelo"
   (_3_comments_ - _edit_)

   with child: vidu "pregnant"
   (_make_comment_ - _edit_)

EO
   No hits.

-------------------------------------
[_] Prefer Esperanto explanations
[_] Let me make changes and comments
[Esperanto-Angla_____]
[prov________] [Ek!]

EO
   prov/i:
   netr.1. try, attempt, test [1]
   Difference between "provi" and "peni": (the explanation of the difference would go here, but I myself don't know the difference!  In any case, the explanation of the difference between two [or more] words would appear automatically in the definition of both words.)
   See also: peni, klopodi, entrepreni
   [1] This definition is from the old dictionary
   (_1_comment_)

   provo:
   1. ....

English
   proverb:
   n.1. ...

   provide:
   v.tr.1. ...

   province:
   n.1. ...

   ...

   provided that: vidu "if"

-------------------------------------

I have more ideas that I've not expressed above... for example, I'd like to see symbols (or something) to mark the type or language-level of a word - to specify which words are basic, intermediate, advanced; which are neologisms, which are official, which are not official yet often used, and words that are peculiar to a particular field of study (e.g. medical jargon that only an Esperantist doctor would know.)
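
As a rough sketch of the marking idea, each word could carry a level plus a set of status flags that a program can filter on. The names `Level`, `Status` and `TaggedWord` are hypothetical, and the tags given to the sample words are invented purely for illustration; they are not authoritative classifications.

```python
from dataclasses import dataclass
from enum import Enum, Flag
from typing import List


class Level(Enum):
    BASIC = 1
    INTERMEDIATE = 2
    ADVANCED = 3


class Status(Flag):
    OFFICIAL = 1
    NEOLOGISM = 2
    SPECIALIST = 4      # peculiar to a particular field of study


@dataclass(frozen=True)
class TaggedWord:
    word: str
    level: Level
    status: Status = Status(0)   # no flags set
    field_of_study: str = ""


def words_up_to(words: List[TaggedWord], max_level: Level) -> List[str]:
    """Return the words a learner at max_level would be shown."""
    return [w.word for w in words if w.level.value <= max_level.value]


# Illustrative tags only.
SAMPLE = [
    TaggedWord("malnova", Level.BASIC, Status.OFFICIAL),
    TaggedWord("provi", Level.BASIC, Status.OFFICIAL),
    TaggedWord("klopodi", Level.INTERMEDIATE),
]

print(words_up_to(SAMPLE, Level.BASIC))
```

A beginner's view could then hide everything above the basic level, while an advanced view shows all entries along with their markers.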

With some care, the dictionary can become much better than it is now.

Who wants to help me?

BreakALegge 2005-August-30 21:02:43

I have started down this same road, and I agree that it is needed.  I am able to program a system like this, and I have already started working on a database schema.  We should collaborate.

trojo 2005-August-30 22:40:24

The E-o Wiktionary isn't very good, possibly because its structure isn't amenable to Esperanto's root system. Some sort of non-wiki collaborative dictionary project designed with Esperanto's unique structure in mind would be good.

La Esperanto-Vikivortaro ne estas tre bona, eble ĉar ĝia konstruo ne bone kongruas al la radiko-sistemo de Esperanto. Ia kunlabora vortaro-projekto planata laŭ la unika konstruo de Esperanto estus bona.

Qwertie 2005-September-03 23:55:56

The wiktionary is a poor dictionary for many reasons, most of which have nothing to do with Esperanto.

1. It has no enforced structure - the dictionary entries could have any format at all, and the dictionary conventions differ from one wiktionary to another.
2. Flexible search queries of the sort you could do on a DB-based dictionary are impossible.  Even a basic word lookup is relatively inconvenient, regardless of the language in question.  To find out the Esperanto version of the word "dictionary" for instance, you must go to http://en.wiktionary.org/wiki/dictionary and search for "dictionary", scroll down a couple of pages through the 100 or so cross-language entries, and look around until you see "Esperanto".
3. Data presentation cannot be customized to users' desires.
4. Making changes is less convenient than it needs to be.
5. It's inefficient to view entries - a whole web page needs to load to get the definition of one word.

Under these limitations, I'm amazed at how much work has already been put into these wiktionaries.  We definitely need something better.

logixoul 2005-September-04 13:45:00

@Qwertie

This is not a defect of the wiki itself. The solution (which, I believe, will be implemented at some point in the future) is to give
1. common guidelines/rules concerning the structure of the page
2. tags for marking up things like etymology, definition, etc.
Then all of your listed problems can be fixed almost trivially. Have a look at a related proposal, http://meta.wikimedia.org/wiki/Semantic_MediaWiki .

Qwertie 2005-September-05 02:29:18

Can this "semantic wiki" enforce structure?  Can it display the same information in two or more entries? e.g.

Difference between taŭgi and konveni: <btw, I don't know the answer>
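
To illustrate what I mean by showing the same information in two or more entries, here is a small sketch in which a usage note is stored once and attached to every word it mentions. The class `UsageNotes` and its methods are hypothetical names, and the note text is a placeholder, since I don't know the actual difference.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Note = Tuple[Tuple[str, ...], str]


class UsageNotes:
    """Stores each comparison note once and indexes it under every word."""

    def __init__(self) -> None:
        self._notes_by_word: Dict[str, List[Note]] = defaultdict(list)

    def add(self, words: Iterable[str], text: str) -> None:
        note = (tuple(sorted(words)), text)
        for word in note[0]:
            self._notes_by_word[word].append(note)

    def for_word(self, word: str) -> List[str]:
        return [
            f"Difference between {' and '.join(words)}: {text}"
            for words, text in self._notes_by_word.get(word, [])
        ]


notes = UsageNotes()
notes.add(["taŭgi", "konveni"], "(explanation goes here)")
print(notes.for_word("taŭgi"))
print(notes.for_word("konveni"))
```

Editing the note in one place would change what appears under both words, which is hard to guarantee when each wiki page holds its own copy of the text.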

Can you set it up so that when one hovers the mouse over an arbitrary word on a webpage, a minimal definition pops up?

If a user types in a search query (or clicks on a word) such as, oh, let's say, intereskaptajn, can the wiki break it down into component morphemes and define them individually?

In the editing mode, is it simply a plain text box that expects the user to know the expected data format?  How can a wiki help the user input entries in the correct format?

logixoul 2005-September-05 11:33:24

>Can this "semantic wiki" enforce structure?

It doesn't need to. All that's needed is to tell everybody to use a particular structure. And when somebody violates the rules, others wikify the structure of the particular page. Just like in Wikipedia you need to enter {{stub}} to indicate a generic stub article. And if somebody ignorantly slaps at the bottom of the page something like "Please add information to this article because it's incomplete", when I see it, I'll wikify it to {{stub}}.

>Can you set it up so that when one hovers the mouse over an arbitrary word on a webpage, a minimal definition pops up?

This will be possible in the future. SemanticWiki is currently just a proposal.

>If a user types in a search query (or clicks on a word) such as, oh, let's say, intereskaptajn, can the wiki break it down into component morphemes and define them individually?

...and have the users guess at the meaning of the compound? No, thanks, I'd prefer a direct translation. Also, there are many words which can't be broken into their parts in just one way. Consider dieto (a diet) vs. dieto (a small god).

>In the editing mode, is it simply a plain text box that expects the user to know the expected data format?

It is simply a plain text box with a link to a short tutorial on top.

>How can a wiki help the user input entries in the correct format?

By providing a tutorial. God, if one can't sacrifice 5 minutes from his invaluable time to read a tutorial, exactly how's he going to be a contributor worth caring about?

Qwertie 2005-September-06 15:36:48

>>If a user types in a search query (or clicks on a word) such as, oh, let's say, intereskaptajn, can the wiki break it down into component morphemes and define them individually?

>...and have the users guess at the meaning of the compound? No, thanks, I'd prefer a direct translation.

I can't figure out what you're thinking, dude.  Haven't you ever used the dictionary at lernu.net?  The definitions might not be very good, but the lookup system is.  If a compound word has no definition, it is broken up into component morphemes which are defined individually.  Without this feature, the user would get no definition at all.  Esperanto has far too many compound words to define them all in the dictionary, and speakers routinely make up new ones as they see fit.  It would be silly to require that every possible compound word be given a separate dictionary entry.

> Also, there are many words which can't be broken into their parts in just one way. Consider dieto (a diet) vs. dieto (a small god).

Obviously, in this case, the dictionary should display the "diet" definition.  If somebody has also input a definition for di-et-o, it would be displayed too.  One possible rule is to choose the breakdown with the fewest components; another is to just ask the Lernu management what they're doing, and copy it.
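
Here is a minimal sketch of the "fewest components" rule. The tiny morpheme table and the function names are made up for the example, and this is not how Lernu's lookup actually works (I don't know what they do). It lists every way to split a word into known morphemes and prefers the split with the fewest pieces.

```python
from functools import lru_cache
from typing import List, Optional, Tuple

# Toy lexicon, just enough for the two words discussed in this thread.
MORPHEMES = {
    "interes": "interest",
    "kapt": "catch, capture",
    "diet": "diet",
    "di": "god",
    "et": "(diminutive suffix)",
    "a": "(adjective ending)",
    "o": "(noun ending)",
    "j": "(plural)",
    "n": "(accusative)",
}


def segmentations(word: str) -> List[Tuple[str, ...]]:
    """Return every split of word into morphemes found in MORPHEMES."""

    @lru_cache(maxsize=None)
    def split(start: int) -> Tuple[Tuple[str, ...], ...]:
        if start == len(word):
            return ((),)
        results = []
        for end in range(start + 1, len(word) + 1):
            piece = word[start:end]
            if piece in MORPHEMES:
                for rest in split(end):
                    results.append((piece,) + rest)
        return tuple(results)

    return list(split(0))


def best_breakdown(word: str) -> Optional[Tuple[str, ...]]:
    """Pick the split with the fewest components, or None if there is none."""
    candidates = segmentations(word.lower())
    if not candidates:
        return None
    return min(candidates, key=len)


for sample in ("dieto", "intereskaptajn"):
    parts = best_breakdown(sample)
    print(sample, "->", [(p, MORPHEMES[p]) for p in parts])
```

With this table, "dieto" has two possible splits (diet-o and di-et-o), and the rule picks diet-o. When two splits have the same number of pieces, `min` simply keeps the first one found, so a real system would need a better tie-breaker, and it could still list the other splits underneath.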

Qwertie 2005-September-06 15:41:58

BreakALegge, did you get my private message?

BreakALegge 2005-September-07 19:54:12

I have now (and replied)... 

I think the only thing about Wiki that you're interested in is its community-editable pages.  I think we could fairly easily create a database format that would work for what you want it to do and still allow it to be editable by everyone.
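
As a very rough sketch of that idea (not the schema I have actually started on; the table and column names here are assumptions for illustration), a small SQLite database could keep entries and senses in separate tables and record who contributed each sense:

```python
import sqlite3
from typing import List, Tuple

SCHEMA = """
CREATE TABLE entry (
    id       INTEGER PRIMARY KEY,
    language TEXT NOT NULL,            -- e.g. 'eo' or 'en'
    headword TEXT NOT NULL,
    UNIQUE (language, headword)
);
CREATE TABLE sense (
    id          INTEGER PRIMARY KEY,
    entry_id    INTEGER NOT NULL REFERENCES entry(id),
    position    INTEGER NOT NULL,      -- order of the sense within the entry
    translation TEXT NOT NULL,
    editor      TEXT NOT NULL,         -- who contributed this sense
    edited_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_dictionary() -> sqlite3.Connection:
    """Create an empty in-memory dictionary database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_sense(conn: sqlite3.Connection, language: str, headword: str,
              translation: str, editor: str) -> None:
    """Append a sense to an entry, creating the entry if needed."""
    conn.execute(
        "INSERT OR IGNORE INTO entry (language, headword) VALUES (?, ?)",
        (language, headword),
    )
    (entry_id,) = conn.execute(
        "SELECT id FROM entry WHERE language = ? AND headword = ?",
        (language, headword),
    ).fetchone()
    (position,) = conn.execute(
        "SELECT COUNT(*) + 1 FROM sense WHERE entry_id = ?", (entry_id,)
    ).fetchone()
    conn.execute(
        "INSERT INTO sense (entry_id, position, translation, editor) "
        "VALUES (?, ?, ?, ?)",
        (entry_id, position, translation, editor),
    )
    conn.commit()


def lookup(conn: sqlite3.Connection, language: str,
           headword: str) -> List[Tuple[int, str]]:
    """Return (position, translation) pairs for a headword, in order."""
    return conn.execute(
        "SELECT s.position, s.translation FROM sense s "
        "JOIN entry e ON e.id = s.entry_id "
        "WHERE e.language = ? AND e.headword = ? ORDER BY s.position",
        (language, headword),
    ).fetchall()


db = open_dictionary()
add_sense(db, "en", "with", "kun", "editor_a")
add_sense(db, "en", "with", "per", "editor_b")
print(lookup(db, "en", "with"))
```

Anyone could add a sense through a form that calls something like `add_sense`, and the fixed columns keep the data in a shape that search and display code can rely on.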
