Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:interset:to-do [2008/03/31 18:10]
zeman Removed values of definiteness that are now in prontype.
user:zeman:interset:to-do [2008/04/04 13:20]
zeman New feature: prepcase.
Line 14: Line 14:
   * Normalize processing of pronouns, determiners, interrogative adverbs etc. Old drivers use a different approach from the new ones (beginning with Bulgarian). Pronoun as an independent part of speech will cease to exist.
     * Remove ''pos="det"''. Instead, ''det'' will be a ''subpos'' of adjectives, similarly to ''pdt''. Setting ''prontype'' or leaving it empty determines how determiners will be treated in tagsets where there is no such category. With empty ''prontype'', they will become adjectives. If ''prontype'' is set, they will become pronouns.
-    * Remove ''pos="pron"''. Distribute pronouns to nouns, adjectives and adverbs. When encoding into a tagset that distinguishes pronouns, detect pronouns by non-empty ''prontype''.
+    * Remove ''pos="pron"''. Distribute pronouns to nouns, adjectives and adverbs. When encoding into a tagset that distinguishes pronouns, detect pronouns by non-empty ''prontype''. Remove subposes of pronouns (''pers'', ''clit''...)
     * Move ''subpos=clit'' to an independent feature so that it is easier to ask whether a pronoun is personal. Or remove the feature. This is connected to the problem of changed processing of pronouns, and of the processing of contracted word forms (see below).
   * Find more fine-grained classification of punctuation and symbols. Danish has punctuation proper, symbols (+, $), and strange strings like "U-21".
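
The encoding rule described in the items above (determiners kept as adjectives with ''subpos = det'', pronouns detected by a non-empty ''prontype'') could be sketched roughly as follows. This is only an illustration in Python, not the actual driver code: the function ''encode_pos'', its two boolean parameters, and the sample ''prontype'' values are hypothetical, and the assumption that every word with a non-empty ''prontype'' maps to a pronoun is taken literally from the to-do text.

```python
# Hypothetical sketch of the proposed rule; not part of any real driver.

def encode_pos(features, tagset_has_det, tagset_has_pron):
    """Choose a part-of-speech label for a target tagset.

    `features` is a plain dict that may contain "pos", "subpos"
    and "prontype". The two flags say whether the target tagset
    has a determiner category and a pronoun category.
    """
    pos = features.get("pos", "")
    subpos = features.get("subpos", "")
    prontype = features.get("prontype", "")

    # Determiners are stored as adjectives with subpos "det".
    if pos == "adj" and subpos == "det" and tagset_has_det:
        return "det"

    # Assumption: a non-empty prontype marks a pronoun, whichever of
    # noun / adj / adv the word was distributed to.
    if prontype and tagset_has_pron:
        return "pron"

    # Otherwise fall back to the stored part of speech, so a determiner
    # with empty prontype ends up as an adjective.
    return pos


if __name__ == "__main__":
    determiner = {"pos": "adj", "subpos": "det"}
    demonstrative = {"pos": "adj", "subpos": "det", "prontype": "dem"}
    print(encode_pos(determiner, False, True))
    print(encode_pos(demonstrative, False, True))
```

The point of the design is that the target tagset, not the stored ''pos'', decides whether a word surfaces as a determiner, a pronoun, or a plain adjective.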
Line 29: Line 29:
   * Rename number = plu to plur?
   * Drop ''subpos = voc''. So far it is used only for vocalized forms of Czech prepositions in cs::pdt (and the derived cs::conll; nowhere else). ''variant = long'' could be used instead. Within the preposition classes, ''voc'' currently disrupts the division into prepositions, postpositions and circumpositions. **Problem:** both vocalized and non-vocalized prepositions also occur with ''variant = 1''. I cannot put both ''long'' and ''1'' into a single feature at the same time, nor can I say that ''1'' also implies vocalization.
 +
  
 ===== Specific drivers =====
  
   * cs::pdt - reimplement "type L" pronouns as collective pronouns (introduced due to Bulgarian)
 +  * cs::pdt - use the new feature ''prepcase'' (introduced due to Portuguese) in distinguishing pronoun forms "jemu" vs. "němu"
