Differences
This shows you the differences between two versions of the page.
user:zeman:interset:versions [2009/02/20 10:37] zeman created |
user:zeman:interset:versions [2014/06/11 10:52] zeman Interset 2.0. |
||
---|---|---|---|
Line 18: | Line 18: | ||
! Various maintenance changes took place, too. Version control has been migrated to a network-accessible (though not publicly accessible) SVN repository, together with a Trac project management interface. The website now includes information on [[License|licensing]], | ! Various maintenance changes took place, too. Version control has been migrated to a network-accessible (though not publicly accessible) SVN repository, together with a Trac project management interface. The website now includes information on [[License|licensing]], | ||
+ | |||
+ | ? 1.1 | ||
+ | ! 8 September 2009. Three new incarnations of Czech, English and German CoNLL tagsets, reflecting the 2009 changes in format. Most interestingly, | ||
+ | |||
+ | ? 1.2 | ||
+ | ! 27 June 2011. New drivers: Prague Spoken Corpus (Pražský mluvený korpus, PMK) long and short tags ('' | ||
+ | |||
+ | ! New test: For all tags in all drivers, it must now hold that deleting the value of the '' | ||
+ | |||
+ | ! New usage: Interset in Treex (TectoMT). | ||
+ | |||
+ | ? Changes since then | ||
+ | ! I am working on Interset 2.0, to be released in the second half of 2014. It will be a complete rewrite of Interset, using Moose, the object-oriented extension of Perl 5. I also plan exportable conversion tables that will bring Interset functionality to programming languages other than Perl. | ||
+ | |||
+ | ! Feature changes: | ||
+ | * The '' | ||
+ | * The '' | ||
+ | * I am considering removal of the feature '' | ||
+ | * The features '' | ||
+ * I am considering further changes to the partitioning of numerals, in the same spirit as for pronouns. Many words that are considered numerals in Czech are tagged as nouns, adjectives, pronouns, determiners or adverbs in other tagsets. I may decide to keep a separate part of speech for cardinal numbers, but I have not arrived at a clear opinion yet. |