Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

external:tectomt:tutorial [2009/01/20 12:37]
kravalova
external:tectomt:tutorial [2009/01/20 15:32]
kravalova
Line 8: Line 8:
  
 TectoMT is a highly modular NLP (Natural Language Processing) software system implemented in the Perl programming language under Linux. It is primarily aimed at Machine Translation, making use of the ideas and technology created during the Prague Dependency Treebank project. It is also intended to facilitate and accelerate the development of software solutions for many other NLP tasks, especially through the reusability of its numerous integrated processing modules (called blocks), which are equipped with uniform object-oriented interfaces.
 +
  
 ===== Prerequisites =====
 +
 +In this tutorial, we assume that:
 +
 +  * Your system is Linux
 +  * Your shell is bash
 +  * You have basic experience with bash and you can read Perl
 +
 +
 +
 +
  
  
Line 16: Line 27:
 ==== Installation and setup ====
  
-TODO describe the installation
+  * Check out the SVN repository. If you are running this installation in the computer lab in Prague, you have to check out the repository into the directory /home/BIG (because data quotas don't apply there):
  
-Before running any experiments with TectoMT, you have to set up your environment by running
+<code bash>
 +    cd ~/BIG 
 +    svn --username <username> co https://svn.ms.mff.cuni.cz/svn/tectomt_devel/trunk tectomt 
 +</code> 
 + 
 +  * In ''tectomt/install/'', run ''./install.sh'':
  
 <code bash>
-source config/init_devel_environ.sh
+    cd tectomt/install
 +    ./install.sh
 </code>
 +
 +  * In your ''.bashrc'' file, add the following line (or run it manually every time before experimenting with TectoMT):
 +
 +<code bash>
 +    source ~/BIG/tectomt/config/init_devel_environ.sh
 +</code>
 +
 +
 +
 +
  
  
  
  
-==== Theoretical background ==== 
  
-TODO picture
  
  
Line 36: Line 61:
  
  
-==== TrEd ==== 
  
-TODO a little about TrEd and a picture
  
  
Line 56: Line 79:
  
 This tutorial itself has its blocks in ''libs/blocks/Tutorial'' and the application in ''applications/tutorial''.
 +
  
  
Line 61: Line 85:
  
 ==== Layers of Linguistic Structures ====
 +
 +{{ external:tectomt:pyramid.gif?300x190|MT pyramid in terms of PDT layers}}
  
 The TectoMT block repository is stored in ''libs/blocks/''. In correspondence with ..., the blocks are located in directories describing their purpose.
Line 305: Line 331:
  
  
-==== SVO typology ==== 
  
-TODO+ 
 + 
 + 
 + 
 +==== SVO to SOV ==== 
 + 
 +**Motivation**: When translating from an SVO language (e.g. English) to an SOV language (e.g. Korean), we may need to change the word order from SVO to SOV.
 + 
 +**Task**: On the analytical layer, change the word order from SVO to SOV.
 + 
 +**Instructions**: To find the object of a verb, look for ''$afun eq 'Obj' '' among the effective children of the verb.
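
The reordering itself can be sketched outside TectoMT. The following is a minimal, self-contained illustration that uses plain hashes instead of TectoMT nodes. The helpers ''make_node'', ''in_subtree'' and ''svo_to_sov'' are hypothetical names invented for this sketch, the verb is assumed to carry the afun ''Pred'', and direct children stand in for effective children. Only the ''Obj'' test comes from the instructions above.

<code perl>
use strict;
use warnings;

# Toy stand-in for an analytical tree: each node is a hash with a word
# form, an afun, and a reference to its parent (undef for the root word).
# This is NOT the TectoMT node API; it only illustrates the reordering.
sub make_node {
    my ($form, $afun, $parent) = @_;
    return { form => $form, afun => $afun, parent => $parent };
}

# True if $node lies in the subtree rooted at $root (including $root itself).
sub in_subtree {
    my ($node, $root) = @_;
    while (defined $node) {
        return 1 if $node == $root;
        $node = $node->{parent};
    }
    return 0;
}

# Given the nodes in surface order, move the subtree of the first object
# of each verb so that it directly precedes that verb.
# Assumption: verbs are recognized by afun 'Pred'.
sub svo_to_sov {
    my @order = @_;
    for my $verb (grep { $_->{afun} eq 'Pred' } @order) {
        my ($obj) = grep {
            $_->{afun} eq 'Obj' && defined $_->{parent} && $_->{parent} == $verb
        } @order;
        next unless $obj;
        my @moved = grep {  in_subtree($_, $obj) } @order;
        my @rest  = grep { !in_subtree($_, $obj) } @order;
        # Re-insert the object subtree immediately before the verb.
        @order = map { $_ == $verb ? (@moved, $_) : $_ } @rest;
    }
    return @order;
}

# "John reads old books"
my $reads = make_node('reads', 'Pred', undef);
my $john  = make_node('John',  'Sb',   $reads);
my $books = make_node('books', 'Obj',  $reads);
my $old   = make_node('old',   'Atr',  $books);
my @sentence = ($john, $reads, $old, $books);

print join(' ', map { $_->{form} } svo_to_sov(@sentence)), "\n";
# prints "John old books reads"
</code>

In a real block, the same idea would be expressed with the node methods of TectoMT rather than with hashes and an explicit surface-order list.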
 + 
 + 
 + 
 + 
  
  
Line 323: Line 363:
 ==== Prepositions ====
  
-In dependency approach a question "where to hang prepositions" arises. In praguian style (PDT), prepositions are heads of the subtree and the noun/pronoun is dependent on the preposition. However, another ordering might be preferable: The noun/pronoun might be the head of subtree, while the preposition would take the role of a modifier.
+**Motivation**: In the dependency approach, the question of "where to hang prepositions" arises. In the Prague style (PDT), the preposition is the head of the subtree and the noun/pronoun depends on it. However, another arrangement might be preferable: the noun/pronoun might be the head of the subtree, while the preposition takes the role of a modifier.
  
 TODO picture
  
-The task is to rehang all prepositions as indicated at the picture. You may assume that prepositions have at most 1 child.
+**Task**: Rehang all prepositions as indicated in the picture. You may assume that prepositions have at most one child.
 + 
 +**Instructions**:
  
 You are going to need these new methods:
Line 334: Line 376:
   * ''$node->set_parent($parent)''
  
-You can use block template in ''libs/blocks/BlockTemplate.pm''. To see the results, you can again use TrEd (''tmttred sample.tmt'')
+//Hint//:
  * On the analytical layer, you can use this test to recognize prepositions: ''$node->get_attr('afun') eq 'AuxP' ''
  * You can use the block template in ''libs/blocks/BlockTemplate.pm''. To see the results, you can again use TrEd (''tmttred sample.tmt'').
  
-//Hint//: On analytical layer, you can use ''$afun eq 'AuxP'' test to recognize prepositions. 
  
 //Advanced version//: What happens with multiword prepositions such as ''because of'' or ''instead of''? Can you handle them?
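
The basic rehanging step (not the advanced version) can be tried out on a toy tree before writing a real block. The sketch below is self-contained and does not use TectoMT: ''ToyNode'' is a hypothetical stand-in class whose ''get_attr'' and ''set_parent'' mirror the method names used in this tutorial, while its constructor, ''get_parent'' and ''get_children'' are assumptions made only for this example. The order of the two ''set_parent'' calls matters: the child is moved up first, so the preposition is never attached below its own descendant.

<code perl>
use strict;
use warnings;

# Minimal stand-in for a tree node. get_attr and set_parent mirror the
# method names used in this tutorial; get_parent, get_children and the
# constructor are assumptions made for this sketch only.
package ToyNode;

sub new {
    my ($class, %attr) = @_;
    return bless { attr => {%attr}, parent => undef, children => [] }, $class;
}
sub get_attr     { $_[0]{attr}{ $_[1] } }
sub get_parent   { $_[0]{parent} }
sub get_children { @{ $_[0]{children} } }

sub set_parent {
    my ($self, $parent) = @_;
    # Remove the node from the child list of its previous parent, if any.
    if (my $old = $self->{parent}) {
        $old->{children} = [ grep { $_ != $self } @{ $old->{children} } ];
    }
    $self->{parent} = $parent;
    push @{ $parent->{children} }, $self if $parent;
    return;
}

package main;

# Swap each preposition with its single child: the child is attached to the
# preposition's former parent, and the preposition is hung under the child.
sub rehang_prepositions {
    my @nodes = @_;
    for my $prep (grep { $_->get_attr('afun') eq 'AuxP' } @nodes) {
        my ($child) = $prep->get_children;
        next unless $child;                     # no child: leave it alone
        $child->set_parent($prep->get_parent);  # move the child up first ...
        $prep->set_parent($child);              # ... so no cycle is created
    }
    return;
}

# "sits on chair": sits -> on (AuxP) -> chair
my $sits  = ToyNode->new(form => 'sits',  afun => 'Pred');
my $on    = ToyNode->new(form => 'on',    afun => 'AuxP');
my $chair = ToyNode->new(form => 'chair', afun => 'Adv');
$on->set_parent($sits);
$chair->set_parent($on);

rehang_prepositions($sits, $on, $chair);
print $chair->get_parent->get_attr('form'), "\n";  # prints "sits"
print $on->get_parent->get_attr('form'), "\n";     # prints "chair"
</code>

The sketch relies on the stated assumption that a preposition has at most one child. Multiword prepositions would need extra handling.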
