
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

user:zeman:interset:to-do [2008/04/29 17:48]
zeman pos = det removed.
user:zeman:interset:to-do [2008/04/29 23:17]
zeman Subposes to remove.
Line 14:
   * Normalize processing of pronouns, determiners, interrogative adverbs etc. Old drivers use a different approach from the new ones (beginning with Bulgarian). Pronoun as an independent part of speech will cease to exist.
     * Remove ''pos="pron"''. Distribute pronouns to nouns, adjectives and adverbs. When encoding into a tagset that distinguishes pronouns, detect pronouns by non-empty ''prontype''. Remove subposes of pronouns (''pers'', ''clit''...).
 +    * Remove ''subpos = pers'' and ''subpos = recip''. These features should now be captured by ''prontype''.
     * Move ''subpos=clit'' to an independent feature so that it is easier to ask whether a pronoun is personal. Or remove the feature. This is connected to the problem of changed processing of pronouns, and of the processing of contracted word forms (see below).
   * Find a more fine-grained classification of punctuation and symbols. Danish has punctuation proper, symbols (+, $), and strange strings like "U-21".
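
The pronoun items above could be sketched roughly as follows. This is only an illustration in Python, not the actual driver code: the feature structure is modelled as a plain dict, the helper names ''is_pronoun'' and ''migrate'' are hypothetical, and the ''prontype'' values ''prs'', ''rcp'' and ''prn'' are assumed here for the sake of the example. How a given pronoun is assigned to noun, adjective or adverb is not specified in the list, so the sketch leaves that choice to the caller.

```python
# Sketch only: a feature structure is modelled as a plain dict.
# The prontype values below ("prs", "rcp", "prn") are assumptions
# made for this example, not values confirmed by the to-do list.
LEGACY_SUBPOS_TO_PRONTYPE = {"pers": "prs", "recip": "rcp"}
UNSPECIFIED_PRONTYPE = "prn"  # hypothetical "pronoun of unknown type"


def is_pronoun(fs):
    """Detect a pronoun by a non-empty prontype instead of pos="pron"."""
    return bool(fs.get("prontype"))


def migrate(fs, target_pos="noun"):
    """Return a copy of fs without pos="pron", subpos=pers or subpos=recip.

    target_pos is the part of speech the pronoun is redistributed to
    (noun, adj or adv); the caller has to decide which one applies.
    """
    new = dict(fs)  # do not modify the caller's structure
    subpos = new.get("subpos", "")
    if subpos in LEGACY_SUBPOS_TO_PRONTYPE:
        # Keep a prontype that is already set; otherwise derive it.
        if not new.get("prontype"):
            new["prontype"] = LEGACY_SUBPOS_TO_PRONTYPE[subpos]
        del new["subpos"]
    if new.get("pos") == "pron":
        new["pos"] = target_pos
        # Without any prontype the word would no longer be
        # recognizable as a pronoun, so mark it as unspecified.
        if not new.get("prontype"):
            new["prontype"] = UNSPECIFIED_PRONTYPE
    return new


if __name__ == "__main__":
    old = {"pos": "pron", "subpos": "pers"}
    new = migrate(old, target_pos="noun")
    print(new, is_pronoun(new))
```

Note that ''subpos=clit'' is left untouched by this sketch, because the list does not yet decide whether it becomes an independent feature or is removed.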
