
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.


user:zeman:interset:how-to-write-a-driver [2007/03/07 10:33]
zeman
user:zeman:interset:how-to-write-a-driver [2007/10/01 13:50]
zeman Minor changes.
Line 11: Line 11:
 This function has one string argument, the tag. The function returns a reference to a hash of features (feature names are hash keys to the feature values).
  
-The decoder is not obliged to set any feature. If the decoder decides to set a feature, it should be one of the pre-defined values. This can be checked by a central procedure. However, it is not mandatory, so if the appropriate value is not available, you can use your own, but please do **[[zeman@ufal.mff.cuni.cz|let me know]]** so I can update the central value pool accordingly.
+The decoder is not obliged to set any feature. If the decoder decides to set a feature, it should be one of the pre-defined values. This can be checked by a central procedure. However, it is not mandatory, so if the appropriate value is not available, you can use your own, but please do **[[zeman@ufal.mff.cuni.cz|let me know]]** so I can update the [[features|central value pool]] accordingly. (If you set a value that is not documented as a part of the universal set, no one else can benefit from it. If you combine your driver with another driver to convert from your tag set to the other, the other driver's encode() will not take your invented value into account. It may even behave worse than if the value was empty.)
  
 If the tagset encodes features separately (e.g., each character is a value of a particular feature): The decoder should be tolerant to unexpected combinations of features (or should be able to be tolerant if asked for it).
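
A minimal sketch of a ''decode()'' that follows the contract described above: one string argument, a hash reference returned, no obligation to set any feature, and tolerance of unexpected combinations. The two-character tagset, the lookup tables and the feature names and values are assumptions made for illustration only; they are not taken from the central value pool.

```perl
use strict;
use warnings;

# Hypothetical positional tagset: character 0 = part of speech, character 1 = number.
# Codes, feature names and values are illustrative, not the official value pool.
my %POS    = (N => 'noun', V => 'verb', A => 'adj');
my %NUMBER = (S => 'sing', P => 'plu');

sub decode
{
    my $tag = shift;
    my %f; # features the tag does not determine are simply left unset
    my ($pos, $num) = split(//, $tag);
    # Each position is decoded independently, so an unexpected combination
    # (or an unknown character) does not make the whole tag fail.
    $f{pos}    = $POS{$pos}    if defined($pos) && exists($POS{$pos});
    $f{number} = $NUMBER{$num} if defined($num) && exists($NUMBER{$num});
    return \%f;
}
```

With these assumed tables, ''decode('NS')'' yields a hash with ''pos'' and ''number'' set, while ''decode('V-')'' sets only ''pos'' and leaves ''number'' absent.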
 +
  
 ===== encode() =====
Line 19: Line 20:
 This function has one argument, a reference to a hash of features (feature names are hash keys to the feature values). The function returns a string - the tag.
  
-The encoder should be able to process all possible values from the central pool. If the tagset does not recognize a value, the most appropriate substitute should be chosen.
+The encoder should be able to process all possible values from the [[features|central pool]]. If the tagset does not recognize a value, the most appropriate substitute should be chosen.
  
-Since any feature can in theory have an array of values instead of a single value, the encoder should either be prepared for arrays (more precisely: array references) anywhere, or call tagset::single_values() to get rid of the arrays (or some of them).
+Since any feature can in theory have an array of values instead of a single value, the encoder should either be prepared for arrays (more precisely: array references) anywhere, or call tagset::single_values() to get rid of the arrays (or some of them). See [[#Alternate values]] for more details.
  
-**WARNING:** Before modifying the contents of ''%f'', you should make a //deep// copy of it. You cannot assume that the user of the driver will not need the values in ''%f'' after encoding.
+**WARNING:** Before modifying the contents of ''%f'', you should make a //deep// copy of it. You cannot assume that the user of the driver will not need the values in ''%f'' after encoding. If you have called ''single_values()'', it made the copy for you.
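
The deep-copy warning and the handling of array values can be sketched as follows. This is only an illustration under stated assumptions: the toy tagset, the lookup tables and the fallback character ''X'' are hypothetical, and collapsing an array to its first element is a simplification chosen here, not a description of what ''tagset::single_values()'' does. The copy is made with ''dclone()'' from the core ''Storable'' module.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Hypothetical reverse tables for a two-character toy tagset (illustrative only).
my %POS_CODE    = (noun => 'N', verb => 'V', adj => 'A');
my %NUMBER_CODE = (sing => 'S', plu => 'P');

sub encode
{
    my $f0 = shift;
    # Deep copy: nested array references are duplicated as well,
    # so nothing done below can change the caller's hash.
    my $f = dclone($f0);
    # Assumed simplification: keep only the first alternative of an array value.
    foreach my $name (keys(%{$f}))
    {
        $f->{$name} = $f->{$name}[0] if ref($f->{$name}) eq 'ARRAY';
    }
    # Unknown or missing values fall back to a substitute code ('X' is hypothetical).
    my $pos = defined($f->{pos}) && exists($POS_CODE{$f->{pos}})
        ? $POS_CODE{$f->{pos}} : 'X';
    my $num = defined($f->{number}) && exists($NUMBER_CODE{$f->{number}})
        ? $NUMBER_CODE{$f->{number}} : 'X';
    return $pos.$num;
}
```

After a call such as ''encode({pos => 'noun', number => ['sing', 'plu']})'', the caller's hash still holds the two-element array under ''number'', because only the copy was modified.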
  
 ===== list() =====
Line 101: Line 102:
  
 To perform the test, run the script ''driver-test.pl'' in the ''tagset'' root folder. Note that the name of the driver to test is currently hard-coded into the source. In future, it will be changed to a command-line argument.
- 
