user:zeman:interset:to-do [2008/04/04 16:50] zeman Verb forms, moods and tenses. |
user:zeman:interset:to-do [2008/04/29 23:17] zeman Subposes to remove. |
| |
* Normalize processing of pronouns, determiners, interrogative adverbs etc. Old drivers use a different approach from the new ones (beginning with Bulgarian). Pronoun as an independent part of speech will cease to exist. | * Normalize processing of pronouns, determiners, interrogative adverbs etc. Old drivers use a different approach from the new ones (beginning with Bulgarian). Pronoun as an independent part of speech will cease to exist. |
* Remove ''pos="det"''. Instead, ''det'' will be a ''subpos'' of adjectives, similarly to ''pdt''. Setting ''prontype'' or leaving it empty determines how determiners will be treated in tagsets where there is no such category. With empty ''prontype'', they will become adjectives. If ''prontype'' is set, they will become pronouns. | |
* Remove ''pos="pron"''. Distribute pronouns to nouns, adjectives and adverbs. When encoding into a tagset that distinguishes pronouns, detect pronouns by non-empty ''prontype''. Remove subposes of pronouns (''pers'', ''clit'', ...). | * Remove ''pos="pron"''. Distribute pronouns to nouns, adjectives and adverbs. When encoding into a tagset that distinguishes pronouns, detect pronouns by non-empty ''prontype''. Remove subposes of pronouns (''pers'', ''clit'', ...). |
| * Remove ''subpos = pers'' and ''subpos = recip''. These features should now be captured by ''prontype''. |
* Move ''subpos=clit'' to an independent feature so that it is easier to ask whether a pronoun is personal. Or remove the feature. This is connected to the problem of changed processing of pronouns, and of the processing of contracted word forms (see below). | * Move ''subpos=clit'' to an independent feature so that it is easier to ask whether a pronoun is personal. Or remove the feature. This is connected to the problem of changed processing of pronouns, and of the processing of contracted word forms (see below). |
* Find a more fine-grained classification of punctuation and symbols. Danish has punctuation proper, symbols (+, $), and strange strings like "U-21". | * Find a more fine-grained classification of punctuation and symbols. Danish has punctuation proper, symbols (+, $), and strange strings like "U-21". |
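The pronoun item above (detect pronouns by non-empty ''prontype'' when encoding) could look roughly like the following. This is only a sketch in Python, not the actual driver code: the function name ''target_pos'', the dictionary representation of a feature structure, the ''tagset_has_pronouns'' flag and the sample value ''prs'' are all assumptions made for illustration.

```python
def target_pos(features, tagset_has_pronouns):
    """Choose the part of speech to emit for a (hypothetical) target tagset.

    `features` is assumed to be a plain dict of feature names to values.
    Pronouns no longer have their own pos; they are stored as nouns,
    adjectives or adverbs and recognized by a non-empty prontype.
    """
    pos = features.get("pos", "")
    # Only tagsets that distinguish pronouns need the extra check.
    if tagset_has_pronouns and features.get("prontype"):
        return "pron"
    # Otherwise the word keeps the pos it was distributed to.
    return pos


# A pronoun stored as a noun with prontype set (value is illustrative).
personal = {"pos": "noun", "prontype": "prs"}
print(target_pos(personal, tagset_has_pronouns=True))
print(target_pos(personal, tagset_has_pronouns=False))
```

With an empty or missing ''prontype'' the function simply returns the stored ''pos'', so ordinary nouns, adjectives and adverbs are unaffected.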