Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:interset:versions [2011/06/27 17:03]
zeman Version 1.2. released.
user:zeman:interset:versions [2014/06/11 10:54]
zeman CPAN.
Line 30:
  
 ? Changes since then
 +! I am working on Interset 2.0, to be released in the second half of 2014. It will be a complete rewrite of Interset using Moose, the object-oriented extension of Perl 5, and it will be published on [[http://www.cpan.org/|CPAN]] as ''Lingua::Interset''. I also plan to provide exportable conversion tables that will bring Interset functionality to programming languages other than Perl.
  
 +! Feature changes:
 +  * The ''prep'' value of the ''pos'' feature (preposition) will be renamed to ''adp'' (adposition) because it covers prepositions, postpositions and circumpositions.
 +  * The ''subpos'' feature will be partially split into several new features that reflect the main part of speech: ''nountype'', ''adjtype'', ''verbtype'' and ''conjtype''. This is a logical extension of the previously created ''prontype'', ''advtype'' etc. I have not yet decided whether ''subpos'' will disappear completely or whether a small set of values will remain in it.
 +  * I am considering removing the ''synpos'' feature. I first need to investigate to what extent it is actually used, in which tagsets, and whether it overlaps with information stored elsewhere.
 +  * The features ''tense'' and ''subtense'' have been merged. Their separation in the early years of Interset was driven by problems with encoding tagsets that lacked specialized tenses; later, however, Interset gained algorithms for strict encoding and feature replacement. There are now other features whose values form a hierarchy, so it seems logical to treat tenses the same way.
 +  * I am considering further changes to the partitioning of numerals, in the same spirit as with pronouns. Many words that are considered numerals in Czech are tagged as nouns, adjectives, pronouns, determiners or adverbs in other tagsets. I may decide to keep a separate part of speech for cardinal numbers, but I have not arrived at a clear opinion yet.
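
The ''prep'' to ''adp'' rename and the ''tense''/''subtense'' merge listed above could be applied to an existing feature structure roughly as follows. This is only a sketch, written in Python for illustration; it is not the ''Lingua::Interset'' API. The function names, the plain-dict representation, the rule that a non-empty ''subtense'' overrides ''tense'', and the parent table used for the fallback are all assumptions made for the example.

```python
# Hypothetical hierarchy: each specialized tense points to a more general one.
# The table is an assumption for illustration, not taken from Interset itself.
TENSE_PARENT = {"aor": "past", "imp": "past", "pqp": "past"}


def migrate_features(old):
    """Return a copy of a 1.x-style feature dict in the planned 2.0 layout."""
    new = dict(old)  # do not modify the caller's dict

    # pos=prep is renamed to pos=adp (adposition).
    if new.get("pos") == "prep":
        new["pos"] = "adp"

    # subtense is merged into tense; we assume the more specific value wins.
    subtense = new.pop("subtense", "")
    if subtense:
        new["tense"] = subtense
    return new


def fallback_tense(value, permitted):
    """Climb the hierarchy until a value the target tagset permits is found.

    Returns the empty string if no ancestor is permitted.
    """
    while value not in permitted and value in TENSE_PARENT:
        value = TENSE_PARENT[value]
    return value if value in permitted else ""
```

With these assumptions, ''migrate_features({"pos": "verb", "tense": "past", "subtense": "aor"})'' yields a dict whose ''tense'' is ''aor'', and ''fallback_tense("aor", {"past", "pres"})'' backs off to ''past'' for a tagset that lacks a specialized aorist.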
