
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.


user:zeman:interset:how-to-write-a-driver [2008/03/14 10:15]
zeman Mutual positions of list() and BEGIN.
user:zeman:interset:how-to-write-a-driver [2008/03/14 10:21]
zeman Conversion testing.
Line 140: Line 140:
  
 See [[user:zeman:interset:Common Problems]] for a list of suggestions for phenomena difficult to match between tagsets and the Interset.
 +
  
  
Line 152: Line 153:
  
 <code>driver-test.pl ar::conll
-driver-test.pl -a</code>
+driver-test.pl -a
+driver-test.pl bg::conll cs::pdt
+driver-test.pl -A</code>
  
-Running ''driver-test.pl'' without arguments will list the drivers available on the system. Running it with the ''-a'' option will test all the drivers.
+Running ''driver-test.pl'' without arguments will list the drivers available on the system. Running it with the ''-a'' option will test all the drivers. Two arguments test both drivers separately and then conversions from driver A to driver B and vice versa. The ''-A'' option tests all conversions between all pairs of drivers.
  
 Note that only drivers implementing the ''list()'' function can be tested. Most testing involves generating the list of all possible tags and testing the driver on each tag separately.
  
-The following tests will be performed:
+The following tests will be performed for a single driver:
  
   * Decode each tag and check that only known features and values are set. In addition to a built-in list, every feature can have an empty value, and the features "tagset" and "other" can have any value.
   * Check for each tag that ''encode(decode($tag)) eq $tag''. While sometimes it can be annoying to try to preserve some obscure information hidden in the tags, this test can also reveal many unwanted bugs. Besides, you should preserve information of your own tagset because people may want to use your driver merely to //access// the tags, instead of //converting// them.
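
The two single-driver checks can be illustrated with a minimal sketch. It is written in Python purely for illustration (real drivers are Perl modules), and everything in it is an assumption: the two-character toy tagset, the feature inventory in ''KNOWN'', and the helper ''check_driver()'' are hypothetical stand-ins, not the actual ''driver-test.pl'' implementation.

```python
# Hypothetical toy driver: tags are two characters, e.g. "NS" = noun, singular.
POS = {"N": "noun", "V": "verb"}
NUM = {"S": "sing", "P": "plu", "-": ""}

# Assumed built-in inventory of known features and values.
KNOWN = {"pos": {"noun", "verb"}, "number": {"sing", "plu"}}
# Features that may carry any value.
FREE_FEATURES = {"tagset", "other"}


def list_tags():
    """Return every tag of the toy tagset (the analogue of list())."""
    return [p + n for p in POS for n in NUM]


def decode(tag):
    """Map a tag to a feature structure."""
    return {"tagset": "toy", "pos": POS[tag[0]], "number": NUM[tag[1]]}


def encode(features):
    """Map a feature structure back to a tag."""
    pos_code = {v: k for k, v in POS.items()}[features["pos"]]
    num_code = {v: k for k, v in NUM.items()}[features["number"]]
    return pos_code + num_code


def check_driver():
    """Run both single-driver checks on every tag and collect error messages."""
    errors = []
    for tag in list_tags():
        features = decode(tag)
        # Check 1: only known features and values (an empty value is always allowed).
        for name, value in features.items():
            if name in FREE_FEATURES:
                continue
            if name not in KNOWN:
                errors.append(f"{tag}: unknown feature {name}")
            elif value != "" and value not in KNOWN[name]:
                errors.append(f"{tag}: unknown value {name}={value}")
        # Check 2: encoding the decoded tag must reproduce the original tag.
        if encode(features) != tag:
            errors.append(f"{tag}: round trip gave {encode(features)}")
    return errors
```

An empty list from ''check_driver()'' means that every tag passed both checks; any entry names the offending tag.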
 +
 +The following tests will be performed for a pair of drivers:
 +
 +  * Decode every tag of the first driver, encode it using the second driver and check whether the result is a known tag in the second tagset.
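
The pair test can be sketched in the same spirit. This is a Python illustration under stated assumptions, not the real test script: the two toy tagsets, the deliberately naive ''encode_b()'', and the helper ''check_conversion()'' are all hypothetical.

```python
# Hypothetical driver A: three tags with their decoded features.
A_TAGS = {
    "NS": {"pos": "noun", "number": "sing"},
    "NP": {"pos": "noun", "number": "plu"},
    "VS": {"pos": "verb", "number": "sing"},
}

# Hypothetical driver B: its tagset does not mark number on verbs.
B_KNOWN = {"noun.sg", "noun.pl", "verb"}


def decode_a(tag):
    """Decode a tag of tagset A into a feature structure."""
    return dict(A_TAGS[tag])


def encode_b(features):
    """Naive encoder for tagset B that always appends a number suffix."""
    suffix = {"sing": "sg", "plu": "pl"}[features["number"]]
    return features["pos"] + "." + suffix


def check_conversion(tags, decode, encode, known):
    """Return (source tag, converted tag) pairs whose result is not a known target tag."""
    failures = []
    for tag in tags:
        converted = encode(decode(tag))
        if converted not in known:
            failures.append((tag, converted))
    return failures


failures = check_conversion(sorted(A_TAGS), decode_a, encode_b, B_KNOWN)
```

Here ''failures'' contains the single pair ''("VS", "verb.sg")'': the naive encoder produces a tag that tagset B does not contain, which is the kind of mismatch the pair test is meant to expose.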
  
