Differences
This shows you the differences between two versions of the page.
external:tectomt:tutorial [2009/01/22 11:55] kravalova |
external:tectomt:tutorial [2010/11/10 16:39] (current) popel SEnglishM_to_SEnglishA::Clone_MTree is needed now |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== TectoMT Tutorial ====== | ====== TectoMT Tutorial ====== | ||
- | Welcome | + | Welcome |
Line 9: | Line 7: | ||
TectoMT is a highly modular NLP (Natural Language Processing) software system implemented in the Perl programming language under Linux. It is primarily aimed at Machine Translation, | TectoMT is a highly modular NLP (Natural Language Processing) software system implemented in the Perl programming language under Linux. It is primarily aimed at Machine Translation, | ||
Line 21: | Line 16: | ||
* Your shell is bash | * Your shell is bash | ||
* You have basic experience with bash and can read basic Perl | * You have basic experience with bash and can read basic Perl | ||
==== Installation and setup ==== | ==== Installation and setup ==== | ||
- | * Check out the SVN repository. If you are running this installation in the computer lab in Prague, you have to check out the repository into the directory '' | + | * Check out the SVN repository. If you are running this installation in the computer lab in Prague, you have to check out the repository into the directory '' |
<code bash> | <code bash> | ||
cd ~/BIG | cd ~/BIG | ||
- | svn --username | + | svn --username |
</ | </ | ||
+ | * Accept the certificate and provide the password, which is the same as the username, i.e. //public//
* In '' | * In '' | ||
Line 63: | Line 47: | ||
source .bashrc | source .bashrc | ||
</ | </ | ||
===== TectoMT Architecture ===== | ===== TectoMT Architecture ===== | ||
==== Blocks, scenarios and applications ==== | ==== Blocks, scenarios and applications ==== | ||
Line 95: | Line 55: | ||
In TectoMT, there is the following hierarchy of processing units (software components that process data): | In TectoMT, there is the following hierarchy of processing units (software components that process data): | ||
- | * The basic units are blocks. They perform very limited, well-defined, and often linguistically interpretable tasks (e.g., tokenization, | + | * The basic units are **blocks**. They perform very limited, well-defined, and often linguistically interpretable tasks (e.g., tokenization, |
- | * To solve a more complex task, selected blocks can be chained into a block sequence, called | + | * To solve a more complex task, selected blocks can be chained into a block sequence, called |
- | * The highest unit is called application. Applications correspond to end-to-end tasks, be they real end-user applications (such as machine translation), | + | * The highest unit is called |
This tutorial itself has its blocks in '' | This tutorial itself has its blocks in '' | ||
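The hierarchy of blocks, scenarios and applications can be pictured with a small, language-neutral sketch. It is written in Python rather than in TectoMT's Perl, and it does not use the TectoMT API: the names `tokenize`, `lowercase`, `run_scenario` and `application` are hypothetical stand-ins, chosen only to show how blocks chain into a scenario and how an application wraps a scenario.

```python
from typing import Any, Callable, Dict, List

# A "document" here is just a dict; real TectoMT documents are much richer.
Document = Dict[str, Any]
# A block is a small unit that takes a document and returns it modified.
Block = Callable[[Document], Document]


def tokenize(doc: Document) -> Document:
    """Hypothetical block: split the raw text into tokens."""
    doc["tokens"] = doc["text"].split()
    return doc


def lowercase(doc: Document) -> Document:
    """Hypothetical block: lowercase every token."""
    doc["tokens"] = [token.lower() for token in doc["tokens"]]
    return doc


def run_scenario(scenario: List[Block], doc: Document) -> Document:
    """A scenario is an ordered list of blocks, applied one after another."""
    for block in scenario:
        doc = block(doc)
    return doc


def application(texts: List[str]) -> List[List[str]]:
    """An application wraps a scenario into an end-to-end task."""
    scenario = [tokenize, lowercase]
    return [run_scenario(scenario, {"text": text})["tokens"] for text in texts]
```

Keeping each block tiny is what makes scenarios easy to rearrange: adding a processing step means appending one more block to the list, which mirrors how blocks are appended to the scenario later in this tutorial.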
Line 130: | Line 79: | ||
There are also directories for blocks with other purposes; for example, blocks which only print out some information go to '' | There are also directories for blocks with other purposes; for example, blocks which only print out some information go to '' | ||
Line 145: | Line 85: | ||
Once you have TectoMT installed on your machine, you can find this tutorial in '' | Once you have TectoMT installed on your machine, you can find this tutorial in '' | ||
- | Most applications are defined in Makefiles, which describe the sequence of blocks to be applied to our data. In our particular | + | Most applications are defined in '' |
We can run the application: | We can run the application: | ||
Line 156: | Line 96: | ||
* One physical '' | * One physical '' | ||
- | * A document consists of a sequence of bundles (''< | + | * A document consists of a sequence of bundles (element |
- | * Each bundle contains tree-shaped sentence representations on various linguistic layers. In our example '' | + | * Each bundle contains tree-shaped sentence representations on various linguistic layers. In our example '' |
* Trees are formed by nodes and edges. Attributes can be attached only to nodes. An edge's attributes must be stored as attributes of its lower node. A tree's attributes must be stored as attributes of its root node. | * Trees are formed by nodes and edges. Attributes can be attached only to nodes. An edge's attributes must be stored as attributes of its lower node. A tree's attributes must be stored as attributes of its root node. | ||
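The data layout described in the list above can be sketched as a few plain data classes. This is an illustration in Python, not the TectoMT API: the class names, the layer key `"layer_A"` and the attribute names `sentence`, `form` and `edge_label` are all assumptions made for the example. What it does show is the rule that every attribute lives on a node: edge attributes on the lower node, tree attributes on the root.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Node:
    # All attributes are stored on nodes (hypothetical attribute names).
    attrs: Dict[str, Any] = field(default_factory=dict)
    # Excluded from repr/compare to avoid infinite parent<->child recursion.
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list)

    def add_child(self, **attrs: Any) -> "Node":
        """Create a child node; attributes of the connecting edge go on the child."""
        child = Node(attrs=dict(attrs), parent=self)
        self.children.append(child)
        return child


@dataclass
class Bundle:
    # One bundle holds the trees of one sentence, keyed by a layer name.
    trees: Dict[str, Node] = field(default_factory=dict)


@dataclass
class Document:
    # A document is a sequence of bundles.
    bundles: List[Bundle] = field(default_factory=list)


doc = Document()
bundle = Bundle()
root = Node(attrs={"sentence": "John sleeps."})  # tree-level attribute on the root
verb = root.add_child(form="sleeps", edge_label="predicate")  # edge attribute on the lower node
verb.add_child(form="John", edge_label="subject")
bundle.trees["layer_A"] = root  # "layer_A" is a placeholder layer name
doc.bundles.append(bundle)
```

Because edges carry no data of their own, a block only ever needs to read and write node attributes, whichever layer it works on.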
===== Changing the scenario ===== | ===== Changing the scenario ===== | ||
- | We'll now add syntactic analysis (dependency parsing) to our scenario by adding | + | We'll now add syntactic analysis (dependency parsing) to our scenario by adding |
- | < | + | < |
- | analyze: | + | SEnglishW_to_SEnglishM:: |
- | brunblocks -S -o \ | + | SEnglishW_to_SEnglishM:: |
- | | + | SEnglishW_to_SEnglishM:: |
- | SEnglishW_to_SEnglishM:: | + | SEnglishW_to_SEnglishM:: |
- | SEnglishW_to_SEnglishM:: | + | |
- | SEnglishW_to_SEnglishM:: | + | |
- | -- sample.tmt | + | |
</ | </ | ||
Line 204: | Line 115: | ||
<code bash> | <code bash> | ||
- | analyze: | + | SEnglishW_to_SEnglishM:: |
- | brunblocks -S -o \ | + | SEnglishW_to_SEnglishM:: |
- | | + | SEnglishW_to_SEnglishM:: |
- | SEnglishW_to_SEnglishM:: | + | SEnglishW_to_SEnglishM:: |
- | SEnglishW_to_SEnglishM:: | + | SEnglishM_to_SEnglishA:: |
- | SEnglishW_to_SEnglishM:: | + | SEnglishM_to_SEnglishA:: |
- | SEnglishM_to_SEnglishA:: | + | SEnglishM_to_SEnglishA:: |
- | SEnglishM_to_SEnglishA:: | + | SEnglishM_to_SEnglishA:: |
- | SEnglishM_to_SEnglishA:: | + | SEnglishM_to_SEnglishA:: |
- | -- sample.tmt | + | SEnglishM_to_SEnglishA:: |
</ | </ | ||
- | //Note//: Makefiles use tab characters to mark command lines. Make sure your lines start with a tab (or two tabs) and not, for example, with 4 spaces. | |||
After running | After running | ||
Line 229: | Line 138: | ||
<code bash> | <code bash> | ||
- | SEnglishM_to_SEnglishA:: | + | SEnglishM_to_SEnglishA:: |
</ | </ | ||
Line 235: | Line 144: | ||
<code bash> | <code bash> | ||
- | SEnglishM_to_SEnglishA:: | + | SEnglishM_to_SEnglishA:: |
</ | </ | ||
Line 248: | Line 157: | ||
//Note//: For more information about the tree editor TrEd, see [[http:// | //Note//: For more information about the tree editor TrEd, see [[http:// | ||
+ | If you are not familiar with '' | ||
- | + | <code bash> | |
- | + | ./ | |
- | + | </ | |
Line 308: | Line 195: | ||
* '' | * '' | ||
- | Some interesting attributes on the morphological layer are '' | + | Some interesting attributes on the morphological layer are '' |
<code bash> | <code bash> | ||
Line 314: | Line 201: | ||
</ | </ | ||
- | Our tutorial block '' | + | Our tutorial block '' |
<code bash> | <code bash> | ||
print_info: | print_info: | ||
- | brunblocks | + | brunblocks -o Tutorial:: |
</ | </ | ||
Line 328: | Line 215: | ||
Try to change the block so that it prints out the information only for verbs. (You need to look at an attribute '' | Try to change the block so that it prints out the information only for verbs. (You need to look at an attribute '' | ||
===== Advanced block: finite clauses ===== | ===== Advanced block: finite clauses ===== | ||
==== Motivation ==== | ==== Motivation ==== | ||
It is assumed that finite clauses can be translated independently, | It is assumed that finite clauses can be translated independently, | ||
==== Task ==== | ==== Task ==== | ||
A block which, given an analytical tree ('' | A block which, given an analytical tree ('' | ||
==== Instructions ==== | ==== Instructions ==== | ||
Line 390: | Line 232: | ||
<code bash> | <code bash> | ||
finite_clauses: | finite_clauses: | ||
- | brunblocks -S -o \ | + | brunblocks -S -o Tutorial:: |
- | | + | |
- | | + | |
- | | + | |
</ | </ | ||
Line 403: | Line 242: | ||
* '' | * '' | ||
- | //Note//: '' | + | //Note//: '' |
//Hint//: Finite clauses in English usually require a grammatical subject to be present. | //Hint//: Finite clauses in English usually require a grammatical subject to be present. | ||
Line 410: | Line 249: | ||
The output of our block might still be incorrect in special cases: we don't handle coordination (see the second sentence in sample.txt) or subordinating conjunctions. | The output of our block might still be incorrect in special cases: we don't handle coordination (see the second sentence in sample.txt) or subordinating conjunctions. | ||
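The hint that a finite clause usually needs a grammatical subject can be turned into a simple heuristic, sketched here in Python rather than as a TectoMT block. Everything about the representation is an assumption made for the example: the `Token` class, the field names `tag`, `label` and `parent`, and the label values `"Sb"`, `"Pred"`, `"Obj"` and `"Aux"`. The only outside fact relied on is that Penn Treebank verb tags start with `VB`. It is not the tutorial's solution, only one way to read the hint.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    form: str
    tag: str               # part-of-speech tag, Penn Treebank style
    label: str             # dependency label (hypothetical values)
    parent: Optional[int]  # index of the governing token, None for the top node


def clause_heads(tokens: List[Token]) -> List[bool]:
    """Mark verbs that govern a subject as heads of finite clauses."""
    # Collect the indices of all tokens that have a subject among their children.
    has_subject = {
        tok.parent for tok in tokens
        if tok.label == "Sb" and tok.parent is not None
    }
    return [
        tok.tag.startswith("VB") and index in has_subject
        for index, tok in enumerate(tokens)
    ]


# "He wants to sleep": "wants" has a subject, the infinitive "sleep" does not.
sentence = [
    Token("He", "PRP", "Sb", 1),
    Token("wants", "VBZ", "Pred", None),
    Token("to", "TO", "Aux", 3),
    Token("sleep", "VB", "Obj", 1),
]
```

For this sentence only `wants` is marked. Like the block discussed above, the sketch makes no attempt at coordination, where two verbs may share a single subject.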
===== Your turn: more tasks ===== | ===== Your turn: more tasks ===== | ||
==== SVO to SOV ==== | ==== SVO to SOV ==== | ||
Line 444: | Line 266: | ||
* Once you have the node '' | * Once you have the node '' | ||
- | **Advanced version**: This solution shifts the object (or several objects) of a verb to just in front of that verb node. For example, //Mr. Brown has urged MPs.// changes to //Mr. Brown has MPs urged.// You can try to change this solution so that the final sentence is //Mr. Brown MPs has urged.// You may need a method '' | + | **Advanced version**: This solution shifts the object (or several objects) of a verb to just in front of that verb node. For example, //Mr. Brown has urged MPs.// changes to //Mr. Brown has MPs urged.// You can try to change this solution so that the final sentence is //Mr. Brown MPs has urged.// You may need a method '' |
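The basic reordering (object moved to just in front of its verb) can be sketched independently of TectoMT. The Python below is an illustration under stated assumptions: the `Token` class, the label value `"Obj"` and the function name `svo_to_sov` are invented for the example, and the sketch moves only the object node itself rather than its whole subtree, which a real solution would probably need to do.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    form: str
    label: str             # dependency label (hypothetical values)
    parent: Optional[int]  # index of the governing token, None for the top node


def svo_to_sov(tokens: List[Token]) -> List[str]:
    """Return the word forms with each object placed right before its verb."""
    order = list(range(len(tokens)))  # current surface order, as token indices
    for index, tok in enumerate(tokens):
        if tok.label != "Obj" or tok.parent is None:
            continue
        # Shift only objects that currently follow their governing verb.
        if order.index(index) > order.index(tok.parent):
            order.remove(index)
            order.insert(order.index(tok.parent), index)
    return [tokens[i].form for i in order]


# "Mr. Brown has urged MPs ." with "MPs" as the object of "urged".
sentence = [
    Token("Mr.", "Atr", 1),
    Token("Brown", "Sb", 3),
    Token("has", "AuxV", 3),
    Token("urged", "Pred", None),
    Token("MPs", "Obj", 3),
    Token(".", "AuxK", None),
]
```

Calling `svo_to_sov(sentence)` gives the word order of //Mr. Brown has MPs urged.// The advanced variant would need a different target position: in front of the auxiliary that belongs to the verb, not in front of the verb itself.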
Line 500: | Line 289: | ||
**Advanced version**: What happens in case of multiword prepositions? | **Advanced version**: What happens in case of multiword prepositions? | ||
===== Further information ===== | ===== Further information ===== | ||
- | * [[http://ufallab2.ms.mff.cuni.cz/ | + | * [[http://ufal.mff.cuni.cz/ |
* Questions? Ask '' | * Questions? Ask '' | ||
* Solutions to the tasks in this tutorial are in '' | * Solutions to the tasks in this tutorial are in '' | ||
* [[http:// | * [[http:// | ||
+ | If you are missing some files from //share//, you can download them from [[http://