
Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:interset:features [2014/09/09 17:05] popel
user:zeman:interset:features [2014/11/17 00:29] zeman
Line 15: Line 15:
 | part | particle |
 | int | interjection |
-| punc | punctuation or symbol |
 +| punc | punctuation |
 +| sym | symbol |
 + 
 +The difference between punctuation and symbols is that punctuation delimits parts of the sentence, while a symbol can stand in for a word. For example, //$// is not punctuation; it is another way of writing the noun //dollar//. See also [[http://universaldependencies.github.io/docs/u/pos/SYM.html|the definition of SYM]] in Universal Dependencies.
  
 ===== nountype =====
Line 52: Line 55:
  
 ===== adjtype =====
 +
 +A deprecated feature. The only value that has not yet been moved elsewhere is ''pdt''.
  
 | **Value** | **Description** |
-| pdt | predeterminer (adjectival word that can stand before an article, such as "all" in "all the flowers") |
 +| pdt | predeterminer (a special form of determiner: an adjectival word that can stand before an article, such as "all" in "all the flowers") |
-| det | determiner (function word modifying a noun phrase: English "this", "that"); regarded as an indefinite/demonstrative pronoun in some tagsets; includes articles (see below) in some tagsets |
-| art | article, i.e. determiner bearing only the feature of definiteness or indefiniteness and nothing more (English "a", "an", "the", German "der", "die", "das", Portuguese "um", "uma", "o", "a", "os", "as") |
  
 ===== prontype =====
  
-This is a new (September 2007) feature applied first to the Bulgarian CoNLL tag set. It takes over the pronoun classification that has been so far kept in the definiteness feature. See the [[brainstorming]] section for further details on lexical and morphological definiteness.
-
-Although it reads as "pronoun type" (and we use the word "pronoun" for simplicity), it is also applied to words that are usually not considered pronouns, such as interrogative/indefinite adverbs (where, there, when, then, how, why).
 +Although it reads as "pronoun type" (and we use the word "pronoun" for simplicity), it is also applied to words that are usually not considered pronouns, such as determiners, interrogative/indefinite adverbs (where, there, when, then, how, why) etc.
  
 | **Value** | **Description** |
 | | Empty value means that this is not a pronoun but a real noun, adjective, adverb etc. |
-| prn | The word is pronominal but we do not know the exact type. |
 +| prn | The word is pronominal (or determiner) but we do not know the exact type. |
 | prs | Personal or possessive pronoun. Possessives are recognizable by the value of their poss feature. |
 | rcp | Reciprocal pronoun (German "einander", Danish "hinanden"). Similar to personal pronouns but occurs as special case in object position. |
-| int | Interrogative pronoun ("who", "what", "which"). |
-| rel | Relative pronoun. Many interrogative pronouns in many languages can also be used as relative pronouns. However, in some languages there are pronouns that fall in one of the categories but not both (Czech "jenž" is only relative; in Bulgarian, relatives are completely separated from interrogatives). For words that can be both interrogative and relative, "int" is the default value. |
-| dem | Demonstrative pronoun ("this", "that"). Being a demonstrative pronoun is not the same as being definite (definiteness=def), although the two feature-values are similar. |
-| neg | Negative pronoun ("nobody, nothing, none"). This is not the same as the negativeness feature. Unlike e.g. negative and positive adjectives or verbs, negative pronouns are not complements of some "positive" pronouns. Instead, they usually correspond to zero, nothing. |
-| ind | Indefinite pronoun ("somebody", "something", "anybody", "anything"). Being an indefinite pronoun is not the same as being morphologically indefinite (definiteness=ind). For instance, in Bulgarian there are morphologically definite lexically indefinite pronouns ("едната", "едното", "едните", "нещата"). In some languages, we could subclassify the indefinite pronouns into "few" ("málokdo"), "ind" ("někdo"), "mny" ("leckdo"), "any" ("kdokoli" - anybody you pick but you pick only one, not all at once; this is the difference from the totality pronouns) |
-| tot | Totality (universal) pronoun ("everybody", "everything") |
 +| art | Article, i.e. determiner bearing only the feature of definiteness or indefiniteness and nothing more (English "a", "an", "the", German "der", "die", "das", Portuguese "um", "uma", "o", "a", "os", "as"). |
 +| int | Interrogative pronoun / determiner / adverb ("who", "what", "which"). |
 +| rel | Relative pronoun / determiner / adverb. Many interrogative pronouns in many languages can also be used as relative pronouns. However, in some languages there are pronouns that fall in one of the categories but not both (Czech "jenž" is only relative; in Bulgarian, relatives are completely separated from interrogatives). For words that can be both interrogative and relative, "int" is the default value. |
 +| dem | Demonstrative pronoun / determiner / adverb ("this", "that"). Being a demonstrative pronoun is not the same as being definite (definiteness=def), although the two feature-values are similar. |
 +| neg | Negative pronoun / determiner / adverb ("nobody, nothing, none"). This is not the same as the negativeness feature. Unlike e.g. negative and positive adjectives or verbs, negative pronouns are not complements of some "positive" pronouns. Instead, they usually correspond to zero, nothing. |
 +| ind | Indefinite pronoun / determiner / adverb ("somebody", "something", "anybody", "anything"). Being an indefinite pronoun is not the same as being morphologically indefinite (definiteness=ind). For instance, in Bulgarian there are morphologically definite lexically indefinite pronouns ("едната", "едното", "едните", "нещата"). In some languages, we could subclassify the indefinite pronouns into "few" ("málokdo"), "ind" ("někdo"), "mny" ("leckdo"), "any" ("kdokoli" - anybody you pick but you pick only one, not all at once; this is the difference from the totality pronouns) |
 +| tot | Totality (universal) pronoun / determiner / adverb ("everybody", "everything") |
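
In this revision the ''art'' value disappears from ''adjtype'' and appears under ''prontype''. Existing feature structures that still carry ''adjtype=art'' could be migrated roughly as sketched below. This is only an illustration: it assumes features are held in a plain Python dict, the function name ''migrate_article'' is hypothetical, and it does not touch the removed ''det'' value because the diff does not say where that value went.

```python
def migrate_article(features):
    """Return a copy of `features` with the article value moved to prontype.

    Assumption: `features` is a plain dict of feature name -> value.
    Only adjtype=art is handled; any other adjtype value is left as it is.
    An existing prontype value would be overwritten.
    """
    migrated = dict(features)
    if migrated.get("adjtype") == "art":
        del migrated["adjtype"]
        migrated["prontype"] = "art"
    return migrated


# Hypothetical feature structure for a definite article.
old = {"adjtype": "art", "definiteness": "def"}
new = migrate_article(old)
```

The function copies its input, so the caller's original structure stays unchanged and can still be compared with the migrated one.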
  
 ===== numtype =====
Line 188: Line 190:
 | semi | semicolon |
 | dash | dash |
-| symb | symbol |
 | root | artificial sentence root node, beginning of sentence |
  
Line 262: Line 263:
 | **Value** | **Description** |
 | foreign | foreign word (not a loan word but a citation in a foreign language, e.g. the title of a foreign book) |
 +| fscript | foreign word written in a foreign script, e.g. "सगरमाथा" in English text |
 +| tscript | foreign word transcribed from a foreign script, e.g. "Sagaramāthā" in English text |
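
The ''fscript'' case can often be detected mechanically, because the word contains letters from a script other than that of the surrounding text. The sketch below uses Python's standard ''unicodedata'' module and assumes the surrounding text is written in Latin script; the function name ''uses_foreign_script'' is hypothetical. It cannot tell ''tscript'' from plain ''foreign'', since a transcribed word is already in the native script.

```python
import unicodedata


def uses_foreign_script(word, native_script="LATIN"):
    """Return True if any letter of `word` is outside `native_script`.

    The script is guessed from the Unicode character name, e.g.
    "DEVANAGARI LETTER SA" versus "LATIN SMALL LETTER A WITH MACRON".
    Non-letters (digits, punctuation, combining marks) are ignored.
    """
    for ch in word:
        if ch.isalpha() and not unicodedata.name(ch, "").startswith(native_script):
            return True
    return False


in_devanagari = uses_foreign_script("सगरमाथा")      # candidate for fscript
transcribed = uses_foreign_script("Sagaramāthā")    # not fscript
```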
  
 ===== gender =====
Line 300: Line 303:
 | sing | singular |
 | dual | dual |
-| plu | plural |
 +| plur | plural |
 | ptan | plurale tantum |
 | coll | collective / mass / singulare tantum |
Line 315: Line 318:
 | sing | singular |
 | dual | dual |
-| plu | plural |
 +| plur | plural |
  
 It applies e.g. to possessive pronouns and it can be different from their grammatical number, which is governed by agreement with the modified (possessed) noun phrase. Czech example: //můj pes// "my dog" (grammatical singular, possessor singular), //mí psi// "my dogs" (grammatical plural, possessor singular), //náš pes// "our dog" (grammatical singular, possessor plural), //naši psi// "our dogs" (grammatical plural, possessor plural).
Line 326: Line 329:
 | sing | singular |
 | dual | dual |
-| plu | plural |
 +| plur | plural |
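
The plural value is renamed from ''plu'' to ''plur'' in all three number tables touched by this revision. Data tagged under the old name could be updated with something like the following sketch. It assumes features are stored in a plain dict. Only ''number'' is named in the visible part of this page; ''possnumber'' and ''possednumber'' are assumed names for the possessor's and the possessed number, and the helper ''rename_plural'' is hypothetical.

```python
# Features whose plural value is renamed. Only "number" is confirmed by the
# visible text; the other two names are assumptions made for illustration.
NUMBER_FEATURES = ("number", "possnumber", "possednumber")


def rename_plural(features, number_features=NUMBER_FEATURES):
    """Return a copy of `features` with the old value "plu" replaced by "plur".

    Only the features listed in `number_features` are rewritten, so an
    unrelated feature that happens to hold "plu" is left alone.
    """
    return {
        name: ("plur" if name in number_features and value == "plu" else value)
        for name, value in features.items()
    }


updated = rename_plural({"number": "plu", "politeness": "pol"})
```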
  
 In Hungarian, possession can be marked on the possessor or on the possessed. It is possible, though rare, that a noun has three distinct number features: its own grammatical number, number of its possessor and number of its possession. Examples from the Multext-East Hungarian lexicon:
Line 449: Line 452:
 | inf | informal (Czech "ty/vy", German "du/ihr", Spanish "tú/vosotros") |
 | pol | polite (Czech "vy", German "Sie", Spanish "usted") |
 +
 +===== (abs|erg|dat)(person|number|politeness|gender) =====
 +
 +In quite a few languages, finite verb forms agree in person and number with the subject. In Basque, a subset of verbs agree with up to three arguments: one in the absolutive case, one in ergative and one in dative. To distinguish the different values of person, number (and politeness and rarely even gender), there are special features for each of the three arguments. Their names contain the three-letter code of the case of the argument: ''absperson'', ''absnumber'', ''ergperson'', ''ergnumber'' etc. The value range is identical to the base features. That is, ''absnumber'', ''ergnumber'' and ''datnumber'' may get the same values as ''number''.
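
The naming scheme described above is regular enough to generate. The sketch below builds the twelve feature names from the three case codes and the four base features, and looks up the value range of an argument feature through its base feature. The helper names and the ''BASE_VALUES'' table are illustrative assumptions; the values listed are only those visible in the tables on this page and may be incomplete.

```python
from itertools import product

CASES = ("abs", "erg", "dat")
BASE_FEATURES = ("person", "number", "politeness", "gender")

# Partial value ranges copied from the tables shown on this page (assumption:
# the real inventory may contain more values and more base features).
BASE_VALUES = {
    "number": {"sing", "dual", "plur", "ptan", "coll"},
    "politeness": {"inf", "pol"},
}


def argument_features():
    """Map each per-argument feature name to its base feature,
    e.g. "ergnumber" -> "number"."""
    return {case + base: base for case, base in product(CASES, BASE_FEATURES)}


def allowed_values(feature, base_values=BASE_VALUES):
    """Return the value range of `feature`, following the rule that an
    argument feature shares the range of its base feature."""
    base = argument_features().get(feature, feature)
    return base_values[base]


names = sorted(argument_features())
```

Deriving the ranges from the base features keeps a rename such as ''plu'' to ''plur'' in one place instead of four.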
  
 ===== subcat =====
