
Esperanto collocations

captainzhang, January 9, 2014

Messages: 86

Language: English

Eltwish, January 14, 2014, 21:38:16

This is a very interesting question. From a linguistic perspective, part of what makes Esperanto interesting to me is the sense of confidence one can have that, with correct grammar and logical construction, one is unlikely to make an unnatural or unacceptable utterance. That is, I can't find many examples of phrases that "make perfect sense but you just don't say that," as one can find in any natural language. Does the Tekstaro offer an API for search and analysis?

One example of a collocation (or perhaps even an idiom) that comes to mind is "tie kaj tie," meaning here and there / all over. I suspect there will be situations where stari is statistically preferred over esti when either would be acceptable. Prepositions would also be a good place to look, especially to see when je is most used - I suspect mostly with time, though kredi je is arguably a phrasal verb and frapi je la pordo is a phrase I suspect has been somewhat reinforced by usage, where je might not be otherwise as likely.

captainzhang, January 14, 2014, 22:02:00

Thanks for all the input so far. Although I originally had collocations in mind, it would be interesting if I were to find a significant number of idiomatic phrases as well. Let me make clear that collocations are simply pairs or groups of words that appear together more often than you would expect from chance alone. It's actually quite interesting to study because it happens in degrees. Take a specific phrase: there will be variations on part of its words, each variation more or less frequent, down to some variation that a native would never use.
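To make "more often than you would expect from chance alone" concrete, here is a minimal Python sketch using pointwise mutual information (PMI), one common way of scoring collocations. The function name `pmi_scores`, the `min_count` cutoff, and the toy sentence are all invented for illustration; nothing here comes from the Tekstaro.

```python
import math
from collections import Counter

def pmi_scores(tokens, min_count=2):
    """Return {(w1, w2): PMI} for adjacent word pairs seen at least min_count times."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable scores
        p_pair = count / n_bi
        # Probability of the pair if the two words were independent.
        p_independent = (unigrams[w1] / n_uni) * (unigrams[w2] / n_uni)
        scores[(w1, w2)] = math.log2(p_pair / p_independent)
    return scores

# Toy text, made up for this example (not real corpus data).
toy = "li frapis je la pordo kaj ŝi frapis je la fenestro kaj li kredas je la vero".split()
scores = pmi_scores(toy)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

A positive score means the pair turns up more often than independent word frequencies would predict. On a text this small the numbers mean very little; a real study would need a large corpus and probably a significance test as well.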

Though not that relevant to Esperanto, collocations are of the utmost importance if you wish to sound native, or at least more native-like, in a natural language. More important still, saying something the way a native does doesn't simply make you sound better; it actually makes you more comprehensible, because saying something in a way a native doesn't expect often confuses them, even if they eventually understand you.

sudanglo, January 15, 2014, 10:57:29

Eltwish, there is a sort of language for interrogating the Tekstaro. You have to click on Klarigoj kaj helpo and then scroll down to Specialaj serĉsimboloj.

Some searches though are very simple. If you wanted to find out the environments for the occurrence of 'je' you can just type in space je space.
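If you have some text saved locally, the same kind of question (what turns up around 'je'?) can be roughed out in a few lines of Python. This is only a sketch: it does not query the Tekstaro, the function name `neighbours` is made up, and the sample sentences are invented for the example.

```python
import re
from collections import Counter

def neighbours(text, target="je"):
    """Count the words immediately before and after each occurrence of target."""
    words = re.findall(r"\w+", text.lower())
    before, after = Counter(), Counter()
    for i, word in enumerate(words):
        if word != target:
            continue
        if i > 0:
            before[words[i - 1]] += 1
        if i + 1 < len(words):
            after[words[i + 1]] += 1
    return before, after

# Invented sample sentences, not corpus data.
sample = "Mi venos je la tria. Li kredas je Dio. Ŝi frapis je la pordo."
before, after = neighbours(sample)
print(before.most_common())
print(after.most_common())
```

Note that this simple version ignores sentence boundaries, so a word at the end of one sentence can be counted as the neighbour of a 'je' that starts the next.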

The assumption that je occurs most frequently with time expressions (eg je la tria) is not borne out by the first 1000 hits.

The texts of the corpus are in date order so that you can also compare early usage with later usage.

The rule with je is that you use it when no other preposition quite gets the meaning (or you can use it if you have some doubts about the appropriate preposition). And the Tekstaro search seems to bear this out.

An interesting comparison is the incidence of plena de versus plena je. Can you spot a difference?

But as you rightly say it is characteristic of Esperanto that you can't find many examples of phrases that "make perfect sense but you just don't say that," as one can find in any natural language.

And that is totally congruent with the purpose of Esperanto.

Rikat, February 2, 2014, 17:39:55

There is a book that discusses collocations in Esperanto among other things. It's

Gledhill, Christopher.
The Grammar of Esperanto. A Corpus-based description.
(Languages of the World / Materials 190). München : Lincom Europa.

Here is a quotation from the book that shows how knowledge of collocations might be helpful to people trying to attain fluency rapidly:

The difference between disipi (to waste) and malŝpari (to waste, to un-save) becomes clearer on the basis of contextual evidence based on the corpus... disipi always in fact appears as a noun, collocating with words for strength or national resources (disipo de fortoj waste of strength, disipo de riĉofontoj waste of sources of wealth, disipo de monrimedoj waste of wealth, disipado de la fortostreĉoj waste of resources) while malŝpari is used exclusively as a verb and is reserved for wasted time or breath (malŝparis vortojn wasted words, malŝpariĝis la kritiko criticism was wasted, malŝparis siajn penojn wasted their efforts, malŝparas tempon wastes time).

Of course that kind of guidance might be included in the most excellent Esperanto-Nacilingvo dictionaries, but Esperanto lexicographers face a lot of difficult choices in the area of prescriptivism vs descriptivism.

Another short quotation from the book, to show what a powerful tool corpus analysis can be in the search for Truth:

Zamenhof used [danki] por (for) consistently in his early writing, although he maintained that [danki] pro (because of, due to) was a possible alternative. In this case, the corpus contains uses of both, although there are twice as many occurrences of pro (28) as of por (17). This kind of ambivalent evidence is a useful corrective to those commentators who attempt to claim that usage is wholly consistent, or those who appeal to a sense of 'general usage' without consulting a corpus of authentic texts.

sudanglo, February 3, 2014, 09:55:49

The linguist's error again! - treating Esperanto as though it were a national language.

You simply can't have a rule in Esperanto that a particular root be used in a particular part of speech. This undermines the simplicity of Esperanto and its flexibility.

It may certainly be alright if you want to speak a national language like the natives to recommend that you only use a particular word as a verb or a noun, or in certain contexts. But such recommendations have little place with regard to Esperanto.

Just think of the complexity that this imports.

Anyway, to equate disipi with malsxpari just because in some contexts the two words may have the same translation in English (i.e. waste) is to ignore the actual difference in meaning.

La varmdisipaj turoj of a power station are not there to malsxpari the heat.

It's not the same meaning with different usage. It's a different meaning. The same goes for the difference between danki por and danki pro.

Yes, in English you have to say 'fast food' and not 'quick food'. You have to learn the collocation. But Esperanto is different.

sudanglo, February 3, 2014, 10:06:52

"disipi always in fact appears as a noun"
Not in the Tekstaro! Only 2 out of the 27 hits are for 'disipo'. The rest are verbal, or participle (verbal), or adjectival.
"while malŝpari is used exclusively as a verb"
Equally dubious! If you just select the articles in Monato and Ondo (as representative of modern usage in Esperanto), then it's 5 hits for the verb and 10 hits for the noun. Even if you search all texts, one quarter are noun usage (15 OF versus 46 VF).

Jeez, linguists!!

Rikat, February 3, 2014, 15:31:10

sudanglo:

Jeez, linguists!!
I believe Mr. Gledhill's book was published in the year 2000 and he was looking at a smaller corpus than is now available. Regardless, his observations are illustrative of the sorts of things one might investigate with statistical analysis.

bartlett22183, February 3, 2014, 19:25:58

sudanglo: The linguist's error again! - treating Esperanto as though it were a national language.

You simply can't have a rule in Esperanto that a particular root be used in a particular part of speech. This undermines the simplicity of Esperanto and its flexibility.

It may certainly be alright if you want to speak a national language like the natives to recommend that you only use a particular word as a verb or a noun, or in certain contexts. But such recommendations have little place with regard to Esperanto.

Just think of the complexity that this imports.
No, just the opposite. The idea that a root (or at least most of them) has at base the characteristic of a particular part of speech is a simplicity, not a complexity. To me, the idea that a root fundamentally relates to a particular part of speech and that the standard affixes help make the conversion is a very great strength of Esperanto. Here sudanglo and I simply disagree.

sudanglo, February 4, 2014, 11:24:45

Hi Bartlett.

As I understand the quoted passage, the author was suggesting that some conclusion could be drawn from the higher instance of noun usage versus verb usage as to a sort of collocational guide to how to speak Esperanto.

I did not understand that he was invoking, or drawing any conclusion about, the theory that each root belongs to a unique grammatical class.

It seems to me that whether roots of a particular class (under the well-known theory) are actually used more often to form words of that class than words of a different grammatical class is something requiring separate statistical investigation.

I would guess that bona (for example) is more common than bono or bonas or bone, but in many cases I would expect not that much imbalance in the parts of speech of words containing a certain root. In any case, such statistical imbalance cannot be seen as imposing any restriction on which part of speech you should use.
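A guess like the one about bona could be checked roughly along these lines in Python. This is a sketch under loud assumptions: the ending table is simplified, only forms made of the root plus a bare grammatical ending are counted (so derived words and participles are skipped), and the word list is invented rather than taken from a corpus. The names `CLASSES` and `class_counts` are made up for the example.

```python
from collections import Counter

# Simplified table of grammatical endings; an assumption of this sketch.
CLASSES = {
    "noun": ("o", "oj", "on", "ojn"),
    "adjective": ("a", "aj", "an", "ajn"),
    "adverb": ("e", "en"),
    "verb": ("i", "as", "is", "os", "us", "u"),
}
ENDING_TO_CLASS = {end: cls for cls, ends in CLASSES.items() for end in ends}

def class_counts(words, root):
    """Count word classes among forms made of root plus a bare grammatical ending."""
    counts = Counter()
    for word in words:
        word = word.lower()
        if word.startswith(root):
            label = ENDING_TO_CLASS.get(word[len(root):])
            if label:  # anything with extra affixes is skipped
                counts[label] += 1
    return counts

# Invented word list, not corpus data.
toy = "bona bono bonas bone bonan bona bonaj bonega bonvolu".split()
counts = class_counts(toy, "bon")
total = sum(counts.values())
for label, n in counts.most_common():
    print(f"{label}: {n} of {total}")
```

Whatever proportions a real corpus gives, they describe what people have written, not what anyone is allowed to write.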

Incidentally, if the author had consulted PIV he would have found that both disip and sxpar are classed as verbal roots. So if he had had this in mind he might have expected verb usage to dominate in both cases, or perhaps even suggested that the classification of disip was wrong (if disipo actually was more frequent than disipi).

Corpus analysis can be valuable, but not for interfering with my right to use 'bonas' if I want to, simply because 'bona' is more common.

erinja, February 4, 2014, 16:33:33

Eltwish: I suspect there will be situations where stari is statistically preferred over esti when either would be acceptable.
Stari and esti are totally different words: "to stand" versus "to be". They aren't that similar, and there aren't many situations where they overlap.

I suppose you could say "the vase is in the corner" or "the vase stands in the corner", but aside from a few of these situations, those two words would be a poor choice for a statistical comparison of similar words.
