Differences
This shows you the differences between two versions of the page.
| | user:zeman:interset:brainstorming [2007/10/03 13:20] zeman | | user:zeman:interset:brainstorming [2010/04/14 10:41] (current) zeman Pronoun hierarchy in connection with the ongoing discussion in ISOcat. |
|---|---|---|---|
| Line 16: | Line 16: | ||
| * collective (hromadné) //(každý, všechen)// | * collective (hromadné) //(každý, všechen)// | ||
| * negative (záporné) | * negative (záporné) | ||
| + | |||
| ==== Types of numerals ==== | ==== Types of numerals ==== | ||
| Line 44: | Line 45: | ||
| (Both personal and possessive pronouns can be reflexive. Possessiveness can apply not only to a possessive pronoun but also to a relative pronoun (" | (Both personal and possessive pronouns can be reflexive. Possessiveness can apply not only to a possessive pronoun but also to a relative pronoun (" | ||
| - | We probably cannot merge the categories of definiteness and negation if negation is also supposed to serve nouns, adjectives and verbs, because nouns and adjectives can be definite and indefinite at the same time. At most we could duplicate the negation information (it would appear both under definiteness and in a separate category), but that is probably nonsense. | + | We probably cannot merge the categories of definiteness and negation if negation is also supposed to serve nouns, adjectives and verbs, because nouns and adjectives can be definite and negative at the same time. At most we could duplicate the negation information (it would appear both under definiteness and in a separate category), but that is probably nonsense. |
| I have already merged the categories of definiteness and relativeness, and so far it has not caused any problems. | I have already merged the categories of definiteness and relativeness, and so far it has not caused any problems. | ||
| Line 134: | Line 135: | ||
| Possessiveness must remain a separate feature (we cannot merge it with subpos), because | Possessiveness must remain a separate feature (we cannot merge it with subpos), because | ||
| + | |||
| + | |||
| Line 144: | Line 147: | ||
| Prontype mostly encodes what we previously called definiteness. The definiteness feature would be retained. However, now it would be used only for definite/ | Prontype mostly encodes what we previously called definiteness. The definiteness feature would be retained. However, now it would be used only for definite/ | ||
| - | pos and subpos | + | ==== pos and subpos |
| * noun (personal pronouns, some demonstrative pronouns (alternating with adjectives)) | * noun (personal pronouns, some demonstrative pronouns (alternating with adjectives)) | ||
| * possessive adjective (possessive pronouns) | * possessive adjective (possessive pronouns) | ||
| + | * cardinal number (how many) | ||
| + | * ordinal number/ | ||
| * location adverb (where) | * location adverb (where) | ||
| * time adverb (when) | * time adverb (when) | ||
| * manner adverb (how) | * manner adverb (how) | ||
| * other adverb (why) | * other adverb (why) | ||
| + | |||
| + | ==== (pre)determiners ==== | ||
| + | |||
| + | Now de-facto subposes of adjectives. Cannot collide with possessiveness; | ||
| + | |||
| + | * det ... determiner | ||
| + | * pdt ... predeterminer | ||
| + | |||
| + | ==== poss ==== | ||
| + | * Now de-facto subpos of adjectives (normal or referencing). | ||
| + | |||
| + | |||
| + | ==== reflex ==== | ||
| + | * Attribute of referencing (pronominal) nouns and adjectives. Marks a reflexive reference. Does not apply to numerals and adverbs. Czech examples: sebe, se, sobě, si, sebou (personal), svůj (possessive), | ||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | ==== definiteness ==== | ||
| + | |||
| + | Bulgarian seems to be the reason why we need to separate the lexical definiteness (or demonstrativeness) from the morphological one. Most Bulgarian nouns, adjectives and pronouns allow for suffixes (-at, -ta, -to, -te) that change the default indefinite word forms to definite ones. Even indefinite pronouns (lexical definiteness = indefinite) can distinguish the two states. Thus, we have lexically indefinite morphologically indefinite word forms (нещо, едно), and lexically indefinite morphologically definite word forms (едната, | ||
| + | |||
| + | On the other hand, definiteness is not the same as demonstrativeness. Although I currently do not have an example of a demonstrative that is clearly morphologically indefinite, and although most demonstratives are semantically definite, there are demonstratives that describe a referent without necessarily having one particular (definite) entity in mind. Examples: the Czech demonstrative pronouns takový (such), týž, tentýž (same as). | ||
| + | |||
| + | Since having two definitenesses creates room for confusion, we ought to set both in decoders of all " | ||
| + | |||
| + | ==== numtype ==== | ||
| + | * < | ||
| + | * card ... cardinal numbers (most special of all; resemble adjectives; but might deserve their own pos class) | ||
| + | * digit ... Arabic numerals, not words (special case of cardinals or ordinals) | ||
| + | * roman ... Roman numerals, not words (special case of ordinals, very rarely cardinals) | ||
| + | * ord ... ordinal numbers (adjectives) | ||
| + | * mult ... multiplicative numbers (adverbs: kolikrát, pokolikáté) | ||
| + | * gen ... generic cardinals (kolikero) or adjectives (kolikerý) | ||
| + | * frac ... fractions (nouns: polovina, čtvrtina, sedmina) | ||
| + | |||
| Line 162: | Line 204: | ||
| * dem ... demonstrative pronoun (this, that) or adverb (here, there, now) | * dem ... demonstrative pronoun (this, that) or adverb (here, there, now) | ||
| * two to three levels of distance, similar to persons 1/2/3 (this/that, aqui/ | * two to three levels of distance, similar to persons 1/2/3 (this/that, aqui/ | ||
| - | * 0: distance neutral (?, to, takový, ten/onen, toho, tolik, ?, tehdy/ | + | * 0: distance neutral (?, to, takový, ten/onen, toho, tolik, ?, ?, ?, ?, tehdy/ |
| - | * 1: close to me (?, ?, takovýto/ | + | * 1: close to me (?, ?, takovýto/ |
| * 2: close to you | * 2: close to you | ||
| - | * 3: close to none of us (?, ?, ?, tamten, tamtoho, ?, tam, ?, ?) | + | * 3: close to none of us (?, ?, ?, tamten, tamtoho, ?, tam, tamodtud/ |
| + | * 4: distance neutral, same as something (?, ?, tentýž/ | ||
| * ind ... indefinite pronoun (somebody, something), selective adjective (některý), | * ind ... indefinite pronoun (somebody, something), selective adjective (některý), | ||
| * several levels indicating how many out of the total are included (few, a few, some, several, many) | * several levels indicating how many out of the total are included (few, a few, some, several, many) | ||
| - | * none: NEGATIVE PRONOUNS: no quantity (nikdo, nic, nijaký, žádný, ničí, nula, nikde, nikdy, nijak) | + | * none: NEGATIVE PRONOUNS: no quantity (nikdo, nic, nijaký, žádný, ničí, nula, nikde, odnikud, nikudy, nikam, nikdy, ?, ?, nijak) |
| - | * few: | + | * few: |
| - | * some: quantitatively neutral (někdo, něco, nějaký, některý, něčí, několik, někde, někdy, nějak) | + | * some: quantitatively neutral (někdo, něco, nějaký, některý, něčí, několik, někde, odněkud, někudy, někam, někdy, odněkdy, doněkdy, nějak) |
| - | * many: suggesting large quantity (leckdo, lecco, lecjaký, leckterý, lecčí, hodně/ | + | * many: suggesting large quantity (leckdo, lecco, lecjaký, leckterý, lecčí, hodně/ |
| - | * arb: any you pick (not necessarily all at once, although the distinction is fuzzy) (kdokoli, cokoli, jakýkoli, kterýkoli, číkoli, kolik si vzpomenete, kdekoli, kdykoli, jakkoli) | + | * any: |
| - | * all: | + | * all: |
| * col/tot ... collective/ | * col/tot ... collective/ | ||
| Line 189: | Line 232: | ||
| * It is not certain whether the //synpos// value of numerals always follows from //subpos//. For now we distinguish both for numerals. | * It is not certain whether the //synpos// value of numerals always follows from //subpos//. For now we distinguish both for numerals. | ||
| + | ===== Pronouns and adverbs of degree, or indefinite and other numerals ===== | ||
| + | |||
| + | Portuguese: | ||
| + | |||
| + | # (Indefinite) quantifier pronoun or adverb. | ||
| + | # independent pronouns: algo, tudo, nada | ||
| + | # independent relative pronouns: todo_o_que | ||
| + | # determiners (pronouns): algum, alguma, alguns, algumas, uns, umas, vários, várias, | ||
| + | # qualquer, pouco, poucos, muitos, mais, | ||
| + | # todo, todo_o, todos, todas, ambos, ambas | ||
| + | # adverbs: pouco, menos, muito, mais, mais_de, quase, tanto, mesmo, demais, bastante, suficiente, bem | ||
| + | # demonstrative adverbs: t~ao | ||
| + | # This is not the class of indefinite pronouns. This class contains pronouns and adverbs of quantity. | ||
| + | # The pronouns and adverbs in this class can be indefinite (algo), total (todo), negative (nada), demonstrative (tanto, tao), | ||
| + | # interrogative (quanto), relative (todo_o_que). Many are indefinite, but not all. | ||
| + | |||
| + | All of this could be captured in a numtype feature (analogous to prontype) with values such as card, ord, mult, etc. However, it seems a bit odd to me to label indefinite numerals as cardinal numbers. Another option is advtype (or reftype), which would have qnt (quantity) alongside loc, tim and man. The problem with naming the feature is that Czech has generic numerals such as kolikerý, which we do not use much, | ||
| + | |||
| + | ===== Numerals ===== | ||
| + | |||
| + | ===== Approaches taken in various tagsets ===== | ||
| + | |||
| + | ==== cs::pdt ==== | ||
| + | |||
| + | Many types of numerals. Numeral types (e.g. cardinal vs. ordinal) and pronoun types (e.g. indefinite, interrogative) are mixed together. There are the following subclasses: | ||
| + | |||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | |||
| + | ==== cs::multext ==== | ||
| + | |||
| + | There are two orthogonal sets of subclasses: | ||
| + | |||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | |||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | |||
| + | ==== bg::conll ==== | ||
| + | |||
| + | Interrogative, | ||
| + | |||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | |||
| + | ==== en::penn ==== | ||
| + | |||
| + | Only cardinal numbers have their own tag. Ordinals (" | ||
| + | |||
| + | '' | ||
| + | |||
| + | ==== de::stts ==== | ||
| + | |||
| + | Only cardinal numbers have their own tag. Ordinals (" | ||
| + | |||
| + | '' | ||
| + | |||
| + | ==== da::conll ==== | ||
| + | |||
| + | No top-level class for numerals. They are tagged as a subclass of adjectives. Interrogative numerals are probably classified as pronouns. | ||
| + | |||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | |||
| + | ==== sv::hajic ==== | ||
| + | |||
| + | '' | ||
| + | '' | ||
| + | |||
| + | ==== sv::mamba ==== | ||
| + | |||
| + | Interrogative numerals are probably tagged as pronouns. | ||
| + | |||
| + | '' | ||
| + | '' | ||
| + | |||
| + | ==== pt::conll ==== | ||
| + | |||
| + | Interrogative numerals (" | ||
| + | |||
| + | '' | ||
| + | '' | ||
| + | |||
| + | ==== ar::conll ==== | ||
| + | |||
| + | The tag '' | ||
| + | |||
| + | '' | ||
| + | |||
| + | ==== zh::conll ==== | ||
| + | |||
| + | Determiners and cardinal numbers are in the same group ('' | ||
| + | |||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | |||
| + | ===== Main difference between Interset and Saša's hierarchy for Intercorp ===== | ||
| + | |||
| + | If I look at the Polish word " | ||
| + | |||
| + | In addition, he has three views on word classification there: lexical (semantic), | ||
| + | |||
| + | ===== ISOcat and the hierarchy of pronoun types ===== | ||
| + | |||
| + | * pronoun | ||
| + | * adverbialInterrogativeRelativePronoun (de:: | ||
| + | * affixedPersonalPronoun (???) | ||
| + | * allusivePronoun (???) | ||
| + | * conditionalPronoun (???) | ||
| + | * demonstrativePronoun | ||
| + | * attributiveDemonstrativePronoun (de:: | ||
| + | * substitutingDemonstrativePronoun (de:: | ||
| + | * emphaticPronoun (???) | ||
| + | * exclamativePronoun (???) | ||
| + | * impersonalPronoun (???) | ||
| + | * indefinitePronoun | ||
| + | * attributiveIndefinitePronounWithDeterminer (de:: | ||
| + | * attributiveIndefinitePronounWithoutDeterminer (de:: | ||
| + | * substitutingIndefinitePronoun (de:: | ||
| + | * interrogativePronoun | ||
| + | * attributiveInterrogativePronoun (de:: | ||
| + | * substitutingInterrogativePronoun (de:: | ||
| + | * negativePronoun (DZ: although the distinction is not made in de::stts, there are also attributive vs. substituting subclasses) | ||
| + | * personalPronoun | ||
| + | * irreflexivePersonalPronoun (de:: | ||
| + | * reflexivePersonalPronoun (de:: | ||
| + | * strongPersonalPronoun (???) | ||
| + | * weakPersonalPronoun (???) | ||
| + | * possessivePronoun | ||
| + | * attributivePossessivePronoun (de:: | ||
| + | * substitutingPossessivePronoun (de:: | ||
| + | * reflexivePossessivePronoun (DZ; this could be either attributive or substituting) | ||
| + | * relativePossessivePronoun (DZ; this is probably only attributive) | ||
| + | * reciprocalPronoun | ||
| + | * reflexivePronoun (not personal??? | ||
| + | * relativePronoun | ||
| + | * attributiveRelativePronoun (de:: | ||
| + | * substitutingRelativePronoun (de:: | ||
| + | * existentialTherePronoun (en:: | ||
| + | * collectivePronoun (bg:: | ||
| + | * prepositionWithPronoun (cs: " | ||
| + | * pronounWithAuxiliary (cs: " | ||
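
The pronoun hierarchy listed above can be tried out as a simple child-to-parent map. The following is only a minimal sketch under stated assumptions: the nesting is inferred from the category names (the indentation of the original list is not preserved here), only a subset of the categories is included, and the names `PARENT`, `ancestors` and `is_a` are hypothetical helpers, not part of ISOcat or Interset.

```python
# Hypothetical sketch: a subset of the pronoun categories as a child -> parent map.
# ASSUMPTION: the nesting is inferred from the category names, not taken from ISOcat.
PARENT = {
    "demonstrativePronoun": "pronoun",
    "attributiveDemonstrativePronoun": "demonstrativePronoun",
    "substitutingDemonstrativePronoun": "demonstrativePronoun",
    "indefinitePronoun": "pronoun",
    "substitutingIndefinitePronoun": "indefinitePronoun",
    "negativePronoun": "pronoun",
    "personalPronoun": "pronoun",
    "irreflexivePersonalPronoun": "personalPronoun",
    "reflexivePersonalPronoun": "personalPronoun",
    "possessivePronoun": "pronoun",
    "reflexivePossessivePronoun": "possessivePronoun",
    "relativePossessivePronoun": "possessivePronoun",
}


def ancestors(category):
    """Return the chain from the category up to the root, most specific first."""
    chain = [category]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def is_a(category, ancestor):
    """True if `ancestor` is the category itself or lies on its path to the root."""
    return ancestor in ancestors(category)


if __name__ == "__main__":
    print(ancestors("reflexivePossessivePronoun"))
    print(is_a("attributiveDemonstrativePronoun", "pronoun"))
```

A single-parent map like this cannot express a category that belongs under two parents at once (for example, a reflexive possessive pronoun that is both reflexive and possessive), which is one reason to prefer orthogonal features such as poss and reflex over a strict tree.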
