
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

user:zeman:interset:features [2017/01/16 19:44]
zeman Generic numerals are dead.
user:zeman:interset:features [2021/03/01 08:37] (current)
zeman [variant]
Line 101: Line 101:
 | digit | number written using digits ("14") | | digit | number written using digits ("14") |
 | roman | number written using Roman numerals ("XIV") | | roman | number written using Roman numerals ("XIV") |
 +| combi | number written using digits and a suffix ("2009-ųjų") |
  
 ===== numvalue ===== ===== numvalue =====
Line 232: Line 233:
  
 | **Value** | **Description** | | **Value** | **Description** |
-poss | possessive |+yes | possessive |
  
 ===== reflex ===== ===== reflex =====
Line 239: Line 240:
  
 | **Value** | **Description** | | **Value** | **Description** |
-reflex | reflexive |+yes | reflexive |
  
 ===== polarity ===== ===== polarity =====
Line 257: Line 258:
 | **Value** | **Description** | | **Value** | **Description** |
 | ind | indefinite | | ind | indefinite |
 +| spec | specific indefinite ("a certain stick") |
 | def | definite | | def | definite |
 | cons | reduced: used in [[http://en.wikipedia.org/wiki/Status_constructus|construct state]] in Arabic. If two nouns are in genitive relation, the first one (the "nomen regens") has "reduced definiteness," the second is the genitive and can be either definite or indefinite. Reduced form has neither the definite morpheme (article), nor the indefinite morpheme (nunation). For instance: indefinite state: حلوَةٌ //ḥulwatun// “a sweet”; definite state: الحلوَةُ //al-ḥulwatu// “the sweet”; حلوَةُ //ḥulwatu// “sweet of”. | | cons | reduced: used in [[http://en.wikipedia.org/wiki/Status_constructus|construct state]] in Arabic. If two nouns are in genitive relation, the first one (the "nomen regens") has "reduced definiteness," the second is the genitive and can be either definite or indefinite. Reduced form has neither the definite morpheme (article), nor the indefinite morpheme (nunation). For instance: indefinite state: حلوَةٌ //ḥulwatun// “a sweet”; definite state: الحلوَةُ //al-ḥulwatu// “the sweet”; حلوَةُ //ḥulwatu// “sweet of”. |
Line 264: Line 266:
  
 | **Value** | **Description** | | **Value** | **Description** |
-foreign | foreign word (not a loan word but a citation in a foreign language — e.g., the title of a foreign book) |+yes | foreign word (not a loan word but a citation in a foreign language — e.g., the title of a foreign book) |
  
 ===== gender ===== ===== gender =====
Line 306: Line 308:
 | sing | singular | | sing | singular |
 | dual | dual | | dual | dual |
 +| tri  | trial |
 +| pauc | paucal |
 +| grpa | greater paucal |
 | plur | plural | | plur | plural |
 +| grpl | greater plural |
 +| inv  | inverse |
 | ptan | plurale tantum | | ptan | plurale tantum |
 | coll | collective / mass / singulare tantum | | coll | collective / mass / singulare tantum |
 +| count | “counting form”, “count plural” or “quantitative plural” in Bulgarian and Macedonian (Sussex and Cubberley 2006, p. 324). It is a special plural form that nouns take after numerals. (The form originates in the Proto-Slavic dual but it should not be marked as dual because 1. the dual vanished from Bulgarian and 2. the form is no longer semantically tied to the number two.) |
  
 //Pluralia tantum// is a special case of plural, occurring e.g. in Czech. It applies to words that do not have singular forms. They use grammatical plural regardless of semantic number. Czech example: //nůžky// "scissors": //Papír rozstříhejte nůžkami.// "Use scissors to cut the paper to pieces." (semantic singular) vs. //Koupil jsem si dvoje nůžky.// "I bought two pairs of scissors." (semantic plural) //Pluralia tantum// is a special case of plural, occurring e.g. in Czech. It applies to words that do not have singular forms. They use grammatical plural regardless of semantic number. Czech example: //nůžky// "scissors": //Papír rozstříhejte nůžkami.// "Use scissors to cut the paper to pieces." (semantic singular) vs. //Koupil jsem si dvoje nůžky.// "I bought two pairs of scissors." (semantic plural)
Line 390: Line 398:
 | del | delative | Used, chiefly [[http://www.hungarianreference.com/Nouns/r%C3%B3l-rol-delative.aspx|in Hungarian]], to express the movement from the surface of something (like "moved off the table"). hu: az asztalról = off the table. The direction is "from, away from", but the case is also used in other meanings (e.g. "about something"). hu: Budapestről vagyok = I am from Budapest, I come from Budapest | | del | delative | Used, chiefly [[http://www.hungarianreference.com/Nouns/r%C3%B3l-rol-delative.aspx|in Hungarian]], to express the movement from the surface of something (like "moved off the table"). hu: az asztalról = off the table. The direction is "from, away from", but the case is also used in other meanings (e.g. "about something"). hu: Budapestről vagyok = I am from Budapest, I come from Budapest |
 | lat | lative | Denotes movement towards/to/into/onto something. Similar case in Basque is called directional allative (Spanish //adlativo direccional//). However, lative is typically thought of as a union of allative, illative and sublative, while in Basque it is derived from allative, which also exists independently. eu: beherantz = down (behe = low) | | lat | lative | Denotes movement towards/to/into/onto something. Similar case in Basque is called directional allative (Spanish //adlativo direccional//). However, lative is typically thought of as a union of allative, illative and sublative, while in Basque it is derived from allative, which also exists independently. eu: beherantz = down (behe = low) |
 +| per | perlative | Denotes movement along something. Used in Warlpiri: yurutu = road; yurutuwana = along the road. Andrews (pp. 161-164) in Shopen: Language Typology vol. 1 |
 | tem | temporal | Specifies time. hu: hétkor = at seven, éjfélkor = at midnight, karácsonykor = at Christmas | | tem | temporal | Specifies time. hu: hétkor = at seven, éjfélkor = at midnight, karácsonykor = at Christmas |
 | ter | terminative | Specifies where something ends in space or time. Similar case in Basque is called terminal allative (Spanish //adlativo terminal//). ee: jõeni = down to the river; ee: kella kuueni = till six o'clock; hu: a házig = up to the house; hu: hat óráig = till six o'clock; eu: erdiraino = up to the half (erdi = half) | | ter | terminative | Specifies where something ends in space or time. Similar case in Basque is called terminal allative (Spanish //adlativo terminal//). ee: jõeni = down to the river; ee: kella kuueni = till six o'clock; hu: a házig = up to the house; hu: hat óráig = till six o'clock; eu: erdiraino = up to the half (erdi = half) |
Line 396: Line 405:
 | cau | causative / motivative | Noun in this case is the cause of something. hu: Hálás leszek érte. eu: jokaeragatik = because of behavior (jokaera = behavior) | | cau | causative / motivative | Noun in this case is the cause of something. hu: Hálás leszek érte. eu: jokaeragatik = because of behavior (jokaera = behavior) |
 | ben | benefactive / destinative | Corresponds to the preposition "for". eu: mutilarentzat = for boys (mutil = boy) | | ben | benefactive / destinative | Corresponds to the preposition "for". eu: mutilarentzat = for boys (mutil = boy) |
 +| cns | considerative | Denotes something that is given in exchange for something else. Used in Warlpiri: miyi = food; miyiwanawana = in exchange for food. Andrews (pp. 161-164) in Shopen: Language Typology vol. 1 |
 +| equ | equative | “X-like”, “similar to X”, “same as X”. It marks the standard of comparison and it differs from the equative degree, which marks the property being compared. tr: bence = like me (ben = I) |
 +| cmp | comparative | “than X”. It marks the standard of comparison and it differs from the comparative degree, which marks the property being compared. It occurs in Dravidian and Northeast-Caucasian languages. |
  
   * Fine grained locative cases (Uralic languages)   * Fine grained locative cases (Uralic languages)
Line 430: Line 442:
 | sup | superlative, third degree | | sup | superlative, third degree |
 | abs | absolute superlative | | abs | absolute superlative |
 +| equ | equative ("same quality as the other object") |
 | dim | diminutive (used for nouns e.g. in Dutch: "stoeltje", "huisje", "nippertje") | | dim | diminutive (used for nouns e.g. in Dutch: "stoeltje", "huisje", "nippertje") |
 | aug | augmentative (for nouns, opposite of diminutive; both dim and aug are used in the Freeling tagset of Portuguese) | | aug | augmentative (for nouns, opposite of diminutive; both dim and aug are used in the Freeling tagset of Portuguese) |
Line 436: Line 449:
  
 | **Value** | **Description** | | **Value** | **Description** |
 +| 0 | zero / impersonal construction |
 | 1 | first (I, we) | | 1 | first (I, we) |
 | 2 | second (you) | | 2 | second (you) |
 | 3 | third (he, she, it, they) | | 3 | third (he, she, it, they) |
 +| 4 | fourth (i.e. another third person, morphologically distinguished from the main third person) |
  
 Note that this feature is also used for possessive pronouns, where it denotes the person of the possessor. E.g. "my" has person=1, "your" has person=2, "their" has person=3. Note that this feature is also used for possessive pronouns, where it denotes the person of the possessor. E.g. "my" has person=1, "your" has person=2, "their" has person=3.
Line 450: Line 465:
 | 2 | second (your) | | 2 | second (your) |
 | 3 | third (his, her, its, their) | | 3 | third (his, her, its, their) |
 +
 +===== clusivity =====
 +
 +| **Value** | **Description** |
 +| in | inclusive we = I + you (+ optionally they) (Indonesian "kita") |
 +| ex | exclusive we = I + they (excluding you) (Indonesian "kami") |
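The inclusive/exclusive distinction in the table above can be restated as a small decision rule. The following Python sketch is only an illustration: the function name `clusivity` and the referent labels `"speaker"`, `"addressee"` and `"other"` are hypothetical and are not part of Interset.

```python
def clusivity(referents):
    """Return 'in' or 'ex' for a first-person plural ("we") reference.

    `referents` is a set of hypothetical labels describing who is covered
    by "we": "speaker", "addressee", "other".
    """
    if "speaker" not in referents:
        raise ValueError("a first-person form must include the speaker")
    # Inclusive: I + you (+ optionally they). Exclusive: I + they, without you.
    return "in" if "addressee" in referents else "ex"


print(clusivity({"speaker", "addressee"}))  # Indonesian "kita"
print(clusivity({"speaker", "other"}))      # Indonesian "kami"
```

The only thing that decides the value is whether the addressee is included; additional third persons do not change it.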
  
 ===== polite ===== ===== polite =====
Line 491: Line 512:
 | part | participle (present ("doing"), past ("done"), passive (Czech "udělán" distinguished from adjective "udělaný" by variant=short)), gerundive | | part | participle (present ("doing"), past ("done"), passive (Czech "udělán" distinguished from adjective "udělaný" by variant=short)), gerundive |
 | conv | converb, transgressive, adverbial participle (modifies other verbs, behaves like adverb; Czech present "dělaje", past "udělav"; some authors also call it gerund!) | | conv | converb, transgressive, adverbial participle (modifies other verbs, behaves like adverb; Czech present "dělaje", past "udělav"; some authors also call it gerund!) |
-ger | [[http://en.wikipedia.org/wiki/Gerund|gerund]] ([[http://en.wikipedia.org/wiki/Verbal_noun|verbal noun]]). Latin //gerundium:// "amare" => genitive "amandi", dative "amando", accusative "(ad) amandum", ablative "amando". |+vnoun | [[http://en.wikipedia.org/wiki/Verbal_noun|verbal noun]] 
 +| ger | [[http://en.wikipedia.org/wiki/Gerund|gerund]]. Deprecated in cases which are traditionally called gerund but could be plausibly called verbal noun (see above). Latin //gerundium:// "amare" => genitive "amandi", dative "amando", accusative "(ad) amandum", ablative "amando". |
 | gdv | [[http://en.wikipedia.org/wiki/Gerundive|gerundive]] ([[http://en.wikipedia.org/wiki/Attributive_verb|verbal adjective]]). Latin //gerundivum:// "portāre" => "portandus, portanda, portandum" | | gdv | [[http://en.wikipedia.org/wiki/Gerundive|gerundive]] ([[http://en.wikipedia.org/wiki/Attributive_verb|verbal adjective]]). Latin //gerundivum:// "portāre" => "portandus, portanda, portandum" |
  
Line 503: Line 525:
 | sub | subjunctive (conjunctive) | | sub | subjunctive (conjunctive) |
 | jus | jussive (expressing a wish) | | jus | jussive (expressing a wish) |
 +| prp | purposive (in order to) |
 | qot | quotative (Estonian: denotes direct speech) | | qot | quotative (Estonian: denotes direct speech) |
 | opt | optative (Turkish; "May you have a long life! If only I were rich!") | | opt | optative (Turkish; "May you have a long life! If only I were rich!") |
 | des | desiderative (Turkish; "He wants to come.") | | des | desiderative (Turkish; "He wants to come.") |
 | nec | necessitative (Turkish; "He must come. He should come.") | | nec | necessitative (Turkish; "He must come. He should come.") |
 +| adm | admirative (Albanian; expresses surprise, irony or doubt) |
 ===== tense ===== ===== tense =====
  
Line 520: Line 543:
 | aor | aorist | | aor | aorist |
 | imp | imperfect | | imp | imperfect |
-| nar | narrative (Turkish //miş//-past) | 
 | pqp | pluperfect | | pqp | pluperfect |
  
Line 530: Line 552:
 | imp | imperfect | | imp | imperfect |
 | perf | perfect | | perf | perfect |
-pro | prospective |+prosp | prospective |
 | prog | progressive | | prog | progressive |
 +| hab | habitual |
 +| iter | iterative, frequentative |
  
 ===== voice ===== ===== voice =====
Line 541: Line 565:
 | rcp | reciprocal (Turkish "karıştı", "tutuştular") | | rcp | reciprocal (Turkish "karıştı", "tutuştular") |
 | cau | causative (Turkish "karıştırıyor" ("is confusing")) | | cau | causative (Turkish "karıştırıyor" ("is confusing")) |
 +| antip | antipassive |
 +| dir | direct |
 +| inv | inverse |
  
 {{:user:zeman:treebanks:ttbankkl.pdf|Documentation}} of the METU Sabanci treebank classifies causative as voice (page 26). Note that this is a feature of verbs. Some languages also have a causative case of nouns. {{:user:zeman:treebanks:ttbankkl.pdf|Documentation}} of the METU Sabanci treebank classifies causative as voice (page 26). Note that this is a feature of verbs. Some languages also have a causative case of nouns.
  
 +===== evident =====
 +
 +Evidentiality: what is the speaker's source of information?
 +
 +| **Value** | **Description** |
 +| fh | firsthand |
 +| nfh | nonfirsthand |
 ===== abbr ===== ===== abbr =====
  
Line 549: Line 583:
  
 | **Value** | **Description** | | **Value** | **Description** |
-abbr | abbreviation |+yes | abbreviation |
  
 ===== hyph ===== ===== hyph =====
Line 556: Line 590:
  
 | **Value** | **Description** | | **Value** | **Description** |
-hyph | hyphenated prefix ("anglo-" in "anglo-saxon") |+yes | hyphenated prefix ("anglo-" in "anglo-saxon") |
  
 ===== echo ===== ===== echo =====
Line 588: Line 622:
  
 | **Value** | **Description** | | **Value** | **Description** |
-typo | typo, bad spelling, error |+yes | typo, bad spelling, error 
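Several changes in this comparison rename the single value of a boolean-like feature to ''yes'' (the tables for possessive, reflexive, foreign, abbreviation, hyphenated prefix and typo), and the prospective value changes from ''pro'' to ''prosp''. Data annotated with the older value names could be converted with a lookup table. The Python sketch below is an illustration only: the name `migrate_features` is hypothetical, and the feature names ''poss'', ''foreign'', ''typo'' and ''aspect'' are assumptions, because their section headings are not visible in this diff.

```python
# Hypothetical migration table: (feature, old value) -> new value.
# Feature names are assumed; only the value renames come from the diff.
RENAMED_VALUES = {
    ("poss", "poss"): "yes",
    ("reflex", "reflex"): "yes",
    ("foreign", "foreign"): "yes",
    ("abbr", "abbr"): "yes",
    ("hyph", "hyph"): "yes",
    ("typo", "typo"): "yes",
    ("aspect", "pro"): "prosp",
}


def migrate_features(features):
    """Return a copy of a feature dict with old value names replaced."""
    return {
        name: RENAMED_VALUES.get((name, value), value)
        for name, value in features.items()
    }


old = {"poss": "poss", "aspect": "pro", "number": "sing"}
print(migrate_features(old))
```

Values that are not listed pass through unchanged. The value ''ger'' is deliberately left alone, because the newer revision keeps it alongside the new ''vnoun'' value, so choosing between the two needs a per-language decision rather than a mechanical rename.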
 + 
 +===== strength ===== 
 + 
 +Distinguishes between strong and weak forms of adjectives or pronouns. Used e.g. in Romanian UD. See also the ''variant'' feature. Some tagsets use ''variant=long'' instead of ''strength=strong'', and ''variant=short'' instead of ''strength=weak''. However, the ''strength'' feature has been tentatively added to Interset because it is slightly more specific and also because we want to be able to seamlessly read the features from the UD corpora that use it. 
 + 
 +| **Value** | **Description** | 
 +| weak   | weak form    | 
 +| strong | strong form  |
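The correspondence mentioned above (''variant=long'' for ''strength=strong'', ''variant=short'' for ''strength=weak'') could be applied when harmonizing tags from different sources. The Python sketch below is a minimal illustration; the function name `variant_to_strength` is hypothetical, and whether a given tagset's long/short forms really correspond to strong/weak forms is an assumption to be checked for each tagset.

```python
# Correspondence stated in the text above; applying it is a per-tagset decision.
VARIANT_TO_STRENGTH = {"long": "strong", "short": "weak"}


def variant_to_strength(features):
    """Return a copy of `features` with variant=long/short moved to strength."""
    result = dict(features)
    variant = result.get("variant")
    # Convert only if no explicit strength value is already present.
    if variant in VARIANT_TO_STRENGTH and "strength" not in result:
        result["strength"] = VARIANT_TO_STRENGTH[variant]
        del result["variant"]
    return result


print(variant_to_strength({"pos": "adj", "variant": "short"}))
```

Numbered or lettered ''variant'' values are left untouched, since the text relates only ''long'' and ''short'' to the ''strength'' feature.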
  
 ===== variant ===== ===== variant =====
Line 607: Line 649:
 | 8 | variant form 8 | | 8 | variant form 8 |
 | 9 | variant form 9 | | 9 | variant form 9 |
 +| a | variant form a (abbreviation in PDT-C) | 
 +| b | variant form b (abbreviation in PDT-C) | 
 +| c | variant form c (abbreviation in PDT-C) |
 ===== tagset, other ===== ===== tagset, other =====
  
