Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

external:tectomt:tutorial [2009/01/22 14:06]
kravalova
external:tectomt:tutorial [2009/04/01 10:45]
ptacek
Line 1: Line 1:
 ====== TectoMT Tutorial ======

-Welcome at TectoMT Tutorial. This tutorial should take about 3 hours.
+Welcome to the TectoMT Tutorial. This tutorial should take about 3 hours.
  
  
Line 21: Line 21:
   * Your shell is bash
   * You have basic experience with bash and can read basic Perl
+
  
  
Line 43: Line 44:
 <code bash>
     cd ~/BIG
-    svn --username mtm co https://svn.ms.mff.cuni.cz/svn/tectomt_devel/trunk tectomt
+    svn --username public co https://svn.ms.mff.cuni.cz/svn/tectomt_devel/trunk tectomt
 </code>
+
+  * accept the certificate and provide a password which is same as the username ie. : public

   * In ''tectomt/install/'' run ''./install.sh'':
Line 146: Line 149:
 Once you have TectoMT installed on your machine, you can find this tutorial in ''applications/tutorial/''. After you ''cd'' into this directory, you can see our plain text sample data in ''sample.txt''

-Most applications are defined in Makefiles, which describe sequence of blocks to be applied on our data. In our particular ''Makefile'', four blocks are going to be applied on our sample text: sentence segmentation, tokenization, tagging and lemmatization. Since we have our input text in plain text format, the file is going to be converted into ''tmt'' format beforehand (the ''in'' target in the Makefile).
+Most applications are defined in ''Makefiles'', which describe sequence of blocks to be applied on our data. In our particular ''Makefile'', four blocks are going to be applied on our sample text: sentence segmentation, tokenization, tagging and lemmatization. Since we have our input text in plain text format, the file is going to be converted into ''tmt'' format beforehand (the ''in'' target in the ''Makefile'').

 We can run the application:
Line 160: Line 163:
   * Each bundle contains tree shaped sentence representations on various linguistic layers. In our example ''sample.tmt'' we have morphological tree (''SEnglishM'') in each bundle. Later on, also an analytical layer (''SEnglishA'') will appear in each bundle as we proceed with our analysis.
   * Trees are formed by nodes and edges. Attributes can be attached only to nodes. Edge's attributes must be stored as the lower node's attributes. Tree's attributes must be stored as attributes of the root node.
+
  
  
Line 220: Line 224:
 </code>

-//Note//: Makefiles use tabulators to mark command lines. Make sure your lines start with a tabulator (or two tabulators) and not, for example, with 4 spaces.
+//Note//: ''Makefiles'' use tabulators to mark command lines. Make sure your lines start with a tabulator (or two tabulators) and not, for example, with 4 spaces.

 After running
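
The note in this hunk warns that command lines in a ''Makefile'' must begin with a tabulator rather than spaces. One way to spot offending lines before running ''make'' is sketched below. The function name ''find_space_indented'' is hypothetical and is not part of TectoMT, and the check is only a heuristic: it flags every line that starts with one or more spaces, which can also match legitimately indented continuation lines.

```bash
#!/usr/bin/env bash
# Heuristic check (not part of TectoMT): list lines of a Makefile that
# begin with spaces followed by text. Such lines are candidates for
# recipe lines that should start with a tab instead.
# Prints "lineno:text" for every hit and returns 1; returns 0 when
# grep finds nothing (or cannot read the file).
find_space_indented() {
    local file=$1 hits
    # grep exits non-zero when no line matches, so we stop here with success.
    hits=$(grep -nE '^ +[^[:space:]]' -- "$file") || return 0
    printf '%s\n' "$hits"
    return 1
}
```

Calling ''find_space_indented Makefile'' in the tutorial directory prints nothing when no line is indented with spaces.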
Line 252: Line 256:
 //Note//: For more information about tree editor TrEd, see [[http://ufal.mff.cuni.cz/~pajas/tred/ar01-toc.html|TrEd User's Manual]].

-If you are not familiar with Makefile syntax, another way of running a scenario in TectoMT is using ''.scen'' file (see ''applications/tutorial.scen''). This file lists the blocks to be run - one block on a single line.
+If you are not familiar with ''Makefile'' syntax, another way of running a scenario in TectoMT is using ''.scen'' file (see ''applications/tutorial.scen''). This file lists the blocks to be run - one block on a single line.

 <code bash>
-eval ${TMT_ROOT}/tools/format_convertors/plaintext_to_tmt/plaintext_to_tmt.pl English sample.txt
+$TMT_ROOT/tools/format_convertors/plaintext_to_tmt/plaintext_to_tmt.pl English sample.txt
 brunblocks -S -o --scen tutorial.scen -- sample.tmt
 </code>
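
The new revision calls the convertor directly through ''$TMT_ROOT'' instead of going through ''eval'', so the command only works when ''TMT_ROOT'' is set and the script is executable. A minimal pre-flight check could look like the sketch below. The function name ''check_convertor'' is hypothetical; only the script path is taken from the hunk above.

```bash
#!/usr/bin/env bash
# Pre-flight check (a sketch, not part of TectoMT): verify that TMT_ROOT
# is set and that the plaintext-to-tmt convertor named in the tutorial
# is executable. Prints the full script path on success.
check_convertor() {
    local root=${TMT_ROOT:-}
    local script="$root/tools/format_convertors/plaintext_to_tmt/plaintext_to_tmt.pl"
    if [ -z "$root" ]; then
        echo "TMT_ROOT is not set" >&2
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script" >&2
        return 1
    fi
    printf '%s\n' "$script"
}
```

It could then guard the two commands from the hunk, for example ''convertor=$(check_convertor) && "$convertor" English sample.txt && brunblocks -S -o --scen tutorial.scen -- sample.tmt''.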
Line 328: Line 332:
 </code>

-Our tutorial block ''Print_node_info.pm'' is ready to use. You only need to add this block to our scenario, e.g. as a new Makefile target:
+Our tutorial block ''Print_node_info.pm'' is ready to use. You only need to add this block to our scenario, e.g. as a new ''Makefile'' target:

 <code bash>
Line 366: Line 370:
 ==== Task ====
 A block which, given an analytical tree (''SEnglishA''), fills each ''a-node'' with boolean attribute ''is_clause_head'' which is set to ''1'' if the ''a-node'' corresponds to a finite verb, and to ''0'' otherwise.
+
  
  
