external:tectomt:tutorial [2009/01/21 11:09] kravalova |
external:tectomt:tutorial [2009/01/21 11:33] kravalova |
| |
TectoMT is a highly modular NLP (Natural Language Processing) software system implemented in the Perl programming language under Linux. It is primarily aimed at Machine Translation, making use of the ideas and technology created during the Prague Dependency Treebank project. At the same time, it is also hoped to significantly facilitate and accelerate the development of software solutions for many other NLP tasks, especially thanks to the re-usability of the numerous integrated processing modules (called blocks), which are equipped with uniform object-oriented interfaces. | TectoMT is a highly modular NLP (Natural Language Processing) software system implemented in the Perl programming language under Linux. It is primarily aimed at Machine Translation, making use of the ideas and technology created during the Prague Dependency Treebank project. At the same time, it is also hoped to significantly facilitate and accelerate the development of software solutions for many other NLP tasks, especially thanks to the re-usability of the numerous integrated processing modules (called blocks), which are equipped with uniform object-oriented interfaces. |
| |
| |
| |
| |
| |
* Your system is Linux | * Your system is Linux |
* Your shell is bash | * Your shell is bash |
* You have basic experience bash and you can read Perl | * You have basic experience with bash and can read basic Perl |
| |
| |
==== Installation and setup ==== | ==== Installation and setup ==== |
| |
* Checkout SVN repository. If you are running this installation in a computer lab in Prague, you have to checkout the repository into directory ''/home/BIG'' (because data quotas don't apply here): | * Checkout SVN repository. If you are running this installation in computer lab in Prague, you have to checkout the repository into directory ''/home/BIG'' (because data quotas don't apply here): |
| |
<code bash> | <code bash> |
* For debugging, a method returning surface word order of a node is useful: ''$node<nowiki>-></nowiki>get_attr('ord')''. It can be used to print out nodes sorted by attribute ''ord''. | * For debugging, a method returning surface word order of a node is useful: ''$node<nowiki>-></nowiki>get_attr('ord')''. It can be used to print out nodes sorted by attribute ''ord''. |
  * Once you have node ''$object'' and node ''$verb'', use method ''$object<nowiki>-></nowiki>shift_before_node($verb)''. This method takes the whole subtree under node ''$object'' and recomputes the ''ord'' attributes (surface word order) so that all nodes in the subtree under ''$object'' have a smaller ''ord'' than ''$verb''. That is, the method rearranges the surface word order from VO to OV. |   * Once you have node ''$object'' and node ''$verb'', use method ''$object<nowiki>-></nowiki>shift_before_node($verb)''. This method takes the whole subtree under node ''$object'' and recomputes the ''ord'' attributes (surface word order) so that all nodes in the subtree under ''$object'' have a smaller ''ord'' than ''$verb''. That is, the method rearranges the surface word order from VO to OV. |
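
The effect of the two hints above can be tried out without a TectoMT installation. The following is only a minimal sketch: ''ToyNode'' is a hypothetical stand-in class written for this illustration, not the real TectoMT node class, and its ''shift_before_node'' is just one plausible way to renumber ''ord''; the actual implementation may differ. The helper ''surface'' and the example sentence are likewise assumptions.

<code perl>
use strict;
use warnings;

# ToyNode: hypothetical stand-in for a TectoMT node (illustration only).
package ToyNode;
use Scalar::Util qw(refaddr);

sub new {
    my ($class, %args) = @_;
    return bless { %args, parent => undef, children => [] }, $class;
}

sub get_attr     { my ($self, $name) = @_; return $self->{$name}; }
sub get_children { my ($self) = @_; return @{ $self->{children} }; }

sub set_parent {
    my ($self, $parent) = @_;
    # Detach from the old parent first, if there is one.
    if (my $old = $self->{parent}) {
        $old->{children} =
            [ grep { refaddr($_) != refaddr($self) } @{ $old->{children} } ];
    }
    $self->{parent} = $parent;
    push @{ $parent->{children} }, $self;
    return;
}

sub get_descendants {
    my ($self) = @_;
    return map { ($_, $_->get_descendants) } $self->get_children;
}

# Assumed behaviour: move the whole subtree of $self right before $target
# in the surface order, then renumber 'ord' from 1 for the entire tree.
sub shift_before_node {
    my ($self, $target) = @_;
    my @moved = sort { $a->{ord} <=> $b->{ord} } ($self, $self->get_descendants);
    my %is_moved = map { (refaddr($_) => 1) } @moved;

    my $root = $self;
    $root = $root->{parent} while defined $root->{parent};

    my @new_order;
    for my $node (sort { $a->{ord} <=> $b->{ord} } ($root, $root->get_descendants)) {
        next if $is_moved{ refaddr($node) };
        push @new_order, @moved if refaddr($node) == refaddr($target);
        push @new_order, $node;
    }
    my $ord = 0;
    $_->{ord} = ++$ord for @new_order;
    return;
}

package main;

# Toy sentence "John reads old books": the verb governs subject and object.
my $verb    = ToyNode->new(form => 'reads', ord => 2);
my $subject = ToyNode->new(form => 'John',  ord => 1);
my $object  = ToyNode->new(form => 'books', ord => 4);
my $adj     = ToyNode->new(form => 'old',   ord => 3);
$subject->set_parent($verb);
$object->set_parent($verb);
$adj->set_parent($object);

# Debugging helper: word forms of the whole tree sorted by attribute 'ord'.
sub surface {
    my ($root) = @_;
    return join ' ',
        map  { $_->get_attr('form') }
        sort { $a->get_attr('ord') <=> $b->get_attr('ord') }
        ($root, $root->get_descendants);
}

print surface($verb), "\n";            # VO order
$object->shift_before_node($verb);
print surface($verb), "\n";            # OV order
</code>

In this toy tree the first line printed is ''John reads old books'' and the second is ''John old books reads'': the object and its modifier move together in front of the verb.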
| |
| |
| |
| |
You are going to need these new methods: | You are going to need these new methods: |
* ''my @children = $node<nowiki>-></nowiki>get_children'' | * ''my @children = $node<nowiki>-></nowiki>get_children()'' |
* ''my $parent = $node<nowiki>-></nowiki>get_parent'' | * ''my $parent = $node<nowiki>-></nowiki>get_parent()'' |
* ''$node<nowiki>-></nowiki>set_parent($parent)'' | * ''$node<nowiki>-></nowiki>set_parent($parent)'' |
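
How the three methods listed above fit together can be sketched with a toy tree. As before, ''ToyNode'' is a hypothetical stand-in written for this example, not the real TectoMT class; in particular, the assumption that ''set_parent'' also removes the node from its old parent's children is part of the sketch, not a documented guarantee.

<code perl>
use strict;
use warnings;

# ToyNode: hypothetical stand-in for a TectoMT node (illustration only).
package ToyNode;

sub new {
    my ($class, $name) = @_;
    return bless { name => $name, parent => undef, children => [] }, $class;
}

sub get_children { my ($self) = @_; return @{ $self->{children} }; }
sub get_parent   { my ($self) = @_; return $self->{parent}; }

sub set_parent {
    my ($self, $parent) = @_;
    # Assumed behaviour: a node has one parent, so detach from the old one.
    if (my $old = $self->{parent}) {
        $old->{children} = [ grep { $_ != $self } @{ $old->{children} } ];
    }
    $self->{parent} = $parent;
    push @{ $parent->{children} }, $self;
    return;
}

package main;

my $root = ToyNode->new('root');
my $mid  = ToyNode->new('mid');
my $leaf = ToyNode->new('leaf');
$mid->set_parent($root);
$leaf->set_parent($mid);

# Rehang $leaf one level up, under its grandparent.
my $parent = $leaf->get_parent();
$leaf->set_parent( $parent->get_parent() );

my @children = $root->get_children();
print join(', ', map { $_->{name} } @children), "\n";
</code>

After the rehanging, ''$root'' has the two children ''mid'' and ''leaf'', and ''$mid'' has none.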
| |