[ Skip to the content ]

Institute of Formal and Applied Linguistics Wiki


[ Back to the navigation ]

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
Next revision Both sides next revision
external:tectomt:tutorial [2009/01/22 11:05]
kravalova
external:tectomt:tutorial [2009/01/22 11:49]
kravalova
Line 1: Line 1:
 ====== TectoMT Tutorial ====== ====== TectoMT Tutorial ======
  
-Welcome at TectoMT Tutorial. This tutorial should take about hours.+Welcome at TectoMT Tutorial. This tutorial should take about hours. 
  
  
Line 7: Line 8:
 ===== What is TectoMT ===== ===== What is TectoMT =====
  
-TectoMT is a highly modular NLP (Natural Language Processing) software system implemented in Perl programming language under Linux. It is primarily aimed at Machine Translation, making use of the ideas and technology created during the Prague Dependency Treebank project. At the same time, it is also hoped to significantly facilitate and accelerate development of software solutions of many other NLP tasks, especially due to re-usability of the numerous integrated processing modules (called blocks), which are equipped with uniform object-oriented interfaces. +TectoMT is a highly modular NLP (Natural Language Processing) software system implemented in Perl programming language under Linux. It is primarily aimed at Machine Translation, making use of the ideas and technology created during the Prague Dependency Treebank project. At the same time, it is also hoped to facilitate and significantly accelerate development of software solutions of many other NLP tasks, especially due to re-usability of the numerous integrated processing modules (called blocks), which are equipped with uniform object-oriented interfaces. 
  
  
Line 20: Line 21:
   * Your shell is bash   * Your shell is bash
   * You have basic experience with bash and can read basic Perl   * You have basic experience with bash and can read basic Perl
 +
  
  
Line 34: Line 36:
 ==== Installation and setup ==== ==== Installation and setup ====
  
-  * Checkout SVN repository. If you are running this installation in computer lab in Prague, you have to checkout the repository into directory ''/home/BIG'' (because data quotas don't apply here):+  * Checkout SVN repository. If you are running this installation in computer lab in Prague, you have to checkout the repository into directory ''/home/BIG'' (because bigger disk quota applies here):
  
 <code bash> <code bash>
Line 343: Line 345:
 ==== Task ==== ==== Task ====
 A block which, given an analytical tree (''SEnglishA''), fills each ''a-node'' with boolean attribute ''is_clause_head'' which is set to ''1'' if the ''a-node'' corresponds to a finite verb, and to ''0'' otherwise. A block which, given an analytical tree (''SEnglishA''), fills each ''a-node'' with boolean attribute ''is_clause_head'' which is set to ''1'' if the ''a-node'' corresponds to a finite verb, and to ''0'' otherwise.
 +
  
  
Line 394: Line 397:
  
 //Note//: ''get_children()'' returns topological node children in a tree, while ''get_eff_children()'' returns node children in a linguistic sense. Mostly, these do not differ. If interested, see Figure 1 in [[http://ufal.mff.cuni.cz/pdt2.0/doc/tools/tred/bn-tutorial.html|btred tutorial]]. //Note//: ''get_children()'' returns topological node children in a tree, while ''get_eff_children()'' returns node children in a linguistic sense. Mostly, these do not differ. If interested, see Figure 1 in [[http://ufal.mff.cuni.cz/pdt2.0/doc/tools/tred/bn-tutorial.html|btred tutorial]].
 +
 +//Hint//: Finite clauses in English usually require grammatical subject to be present.
  
 ==== Advanced version ==== ==== Advanced version ====
Line 402: Line 407:
  
 ===== Your turn: more tasks ===== ===== Your turn: more tasks =====
 +
  
  
Line 424: Line 430:
 **Instructions**:  **Instructions**: 
  
 +  * You can use block template in ''libs/blocks/BlockTemplate.pm''
   * To find an object to a verb, look for objects among effective children of a verb (''$child<nowiki>-></nowiki>get_attr('afun') eq 'Obj' ''). That implies working on analytical layer.   * To find an object to a verb, look for objects among effective children of a verb (''$child<nowiki>-></nowiki>get_attr('afun') eq 'Obj' ''). That implies working on analytical layer.
   * For debugging, a method returning surface word order of a node is useful: ''$node<nowiki>-></nowiki>get_attr('ord')''. It can be used to print out nodes sorted by attribute ''ord''.   * For debugging, a method returning surface word order of a node is useful: ''$node<nowiki>-></nowiki>get_attr('ord')''. It can be used to print out nodes sorted by attribute ''ord''.
Line 429: Line 436:
  
 **Advanced version**: This solution shifts object (or more objects) of a verb just in front of that verb node. So f.e.: //Mr. Brown has urged MPs.// changes to: //Mr. Brown has MPs urged.// You can try to change this solution, so the final sentence would be: //Mr. Brown MPs has urged.// You may need a method ''$node->shift_after_subtree($root_of_that_subtree)''. Subjects should have attribute '''afun' eq 'Sb'''. **Advanced version**: This solution shifts object (or more objects) of a verb just in front of that verb node. So f.e.: //Mr. Brown has urged MPs.// changes to: //Mr. Brown has MPs urged.// You can try to change this solution, so the final sentence would be: //Mr. Brown MPs has urged.// You may need a method ''$node->shift_after_subtree($root_of_that_subtree)''. Subjects should have attribute '''afun' eq 'Sb'''.
 +
  
  
Line 479: Line 487:
 //Hint//:  //Hint//: 
   * On analytical layer, you can use this test to recognize prepositions: ''$node<nowiki>-></nowiki>get_attr('afun') eq 'AuxP' ''    * On analytical layer, you can use this test to recognize prepositions: ''$node<nowiki>-></nowiki>get_attr('afun') eq 'AuxP' '' 
-  * You can use block template in ''libs/blocks/BlockTemplate.pm'' 
   * To see the results, you can again use TrEd (''tmttred sample.tmt'')   * To see the results, you can again use TrEd (''tmttred sample.tmt'')
  

[ Back to the navigation ] [ Back to the content ]