Differences
This shows you the differences between two versions of the page.
external:tectomt:tutorial [2009/04/01 10:46] ptacek |
external:tectomt:tutorial [2010/11/10 16:39] (current) popel SEnglishM_to_SEnglishA::Clone_MTree is needed now |
||
---|---|---|---|
Line 2: | Line 2: | ||
Welcome to the TectoMT Tutorial. This tutorial should take about 3 hours. | Welcome to the TectoMT Tutorial. This tutorial should take about 3 hours. | ||
Line 9: | Line 7: | ||
TectoMT is a highly modular NLP (Natural Language Processing) software system implemented in the Perl programming language under Linux. It is primarily aimed at Machine Translation, | TectoMT is a highly modular NLP (Natural Language Processing) software system implemented in the Perl programming language under Linux. It is primarily aimed at Machine Translation, | ||
Line 21: | Line 16: | ||
* Your shell is bash | * Your shell is bash | ||
* You have basic experience with bash and can read basic Perl | * You have basic experience with bash and can read basic Perl | ||
Line 68: | Line 47: | ||
source .bashrc | source .bashrc | ||
</ | </ | ||
===== TectoMT Architecture ===== | ===== TectoMT Architecture ===== | ||
==== Blocks, scenarios and applications ==== | ==== Blocks, scenarios and applications ==== | ||
Line 100: | Line 55: | ||
In TectoMT, there is the following hierarchy of processing units (software components that process data): | In TectoMT, there is the following hierarchy of processing units (software components that process data): | ||
- | * The basic units are blocks. Each performs a very limited, well-defined, and often linguistically interpretable task (e.g., tokenization, | + | * The basic units are **blocks**. Each performs a very limited, well-defined, and often linguistically interpretable task (e.g., tokenization, |
- | * To solve a more complex task, selected blocks can be chained into a block sequence, called | + | * To solve a more complex task, selected blocks can be chained into a block sequence, called |
- | * The highest unit is called application. Applications correspond to end-to-end tasks, be they real end-user applications (such as machine translation), | + | * The highest unit is called |
This tutorial itself has its blocks in '' | This tutorial itself has its blocks in '' | ||
Line 135: | Line 79: | ||
There are also directories for blocks with other purposes; for example, blocks that only print out some information go to '' | There are also directories for blocks with other purposes; for example, blocks that only print out some information go to '' | ||
Line 150: | Line 85: | ||
Once you have TectoMT installed on your machine, you can find this tutorial in '' | Once you have TectoMT installed on your machine, you can find this tutorial in '' | ||
- | Most applications are defined in '' | + | Most applications are defined in '' |
We can run the application: | We can run the application: | ||
Line 161: | Line 96: | ||
* One physical '' | * One physical '' | ||
- | * A document consists of a sequence of bundles (''< | + | * A document consists of a sequence of bundles (element |
- | * Each bundle contains tree shaped sentence representations on various linguistic layers. In our example '' | + | * Each bundle contains tree shaped sentence representations on various linguistic layers. In our example '' |
* Trees consist of nodes and edges. Attributes can be attached only to nodes: an edge's attributes must be stored as attributes of its lower node, and a tree's attributes must be stored as attributes of the root node. | * Trees consist of nodes and edges. Attributes can be attached only to nodes: an edge's attributes must be stored as attributes of its lower node, and a tree's attributes must be stored as attributes of the root node. | ||
===== Changing the scenario ===== | ===== Changing the scenario ===== | ||
- | We'll now add syntactic analysis (dependency parsing) to our scenario by adding | + | We'll now add syntactic analysis (dependency parsing) to our scenario by adding |
- | < | + | < |
- | analyze: | + | SEnglishW_to_SEnglishM:: |
- | brunblocks -S -o \ | + | SEnglishW_to_SEnglishM:: |
- | | + | SEnglishW_to_SEnglishM:: |
- | SEnglishW_to_SEnglishM:: | + | SEnglishW_to_SEnglishM:: |
- | SEnglishW_to_SEnglishM:: | + | |
- | SEnglishW_to_SEnglishM:: | + | |
- | -- sample.tmt | + | |
</ | </ | ||
Line 213: | Line 115: | ||
<code bash> | <code bash> | ||
- | analyze: | + | SEnglishW_to_SEnglishM:: |
- | brunblocks -S -o \ | + | SEnglishW_to_SEnglishM:: |
- | | + | SEnglishW_to_SEnglishM:: |
- | SEnglishW_to_SEnglishM:: | + | SEnglishW_to_SEnglishM:: |
- | SEnglishW_to_SEnglishM:: | + | SEnglishM_to_SEnglishA:: |
- | SEnglishW_to_SEnglishM:: | + | SEnglishM_to_SEnglishA:: |
- | SEnglishM_to_SEnglishA:: | + | SEnglishM_to_SEnglishA:: |
- | SEnglishM_to_SEnglishA:: | + | SEnglishM_to_SEnglishA:: |
- | SEnglishM_to_SEnglishA:: | + | SEnglishM_to_SEnglishA:: |
- | -- sample.tmt | + | SEnglishM_to_SEnglishA:: |
</ | </ | ||
- | |||
- | //Note//: '' | ||
After running | After running | ||
Line 238: | Line 138: | ||
<code bash> | <code bash> | ||
- | SEnglishM_to_SEnglishA:: | + | SEnglishM_to_SEnglishA:: |
</ | </ | ||
Line 244: | Line 144: | ||
<code bash> | <code bash> | ||
- | SEnglishM_to_SEnglishA:: | + | SEnglishM_to_SEnglishA:: |
</ | </ | ||
Line 257: | Line 157: | ||
//Note//: For more information about the tree editor TrEd, see [[http:// | //Note//: For more information about the tree editor TrEd, see [[http:// | ||
- | If you are not familiar with '' | + | If you are not familiar with '' |
- | + | ||
- | <code bash> | + | |
- | $TMT_ROOT/ | + | |
- | brunblocks -S -o --scen tutorial.scen -- sample.tmt | + | |
- | </ | + | |
- | + | ||
- | Finally, yet another way is to use a simple '' | + | |
<code bash> | <code bash> | ||
./ | ./ | ||
</ | </ | ||
Line 337: | Line 205: | ||
<code bash> | <code bash> | ||
print_info: | print_info: | ||
- | brunblocks | + | brunblocks -o Tutorial:: |
</ | </ | ||
Line 347: | Line 215: | ||
Try to change the block so that it prints out the information only for verbs. (You need to look at an attribute '' | Try to change the block so that it prints out the information only for verbs. (You need to look at an attribute '' | ||
===== Advanced block: finite clauses ===== | ===== Advanced block: finite clauses ===== | ||
==== Motivation ==== | ==== Motivation ==== | ||
It is assumed that finite clauses can be translated independently, | It is assumed that finite clauses can be translated independently, | ||
==== Task ==== | ==== Task ==== | ||
A block which, given an analytical tree ('' | A block which, given an analytical tree ('' | ||
==== Instructions ==== | ==== Instructions ==== | ||
Line 412: | Line 232: | ||
<code bash> | <code bash> | ||
finite_clauses: | finite_clauses: | ||
- | brunblocks -S -o \ | + | brunblocks -S -o Tutorial:: |
- | | + | |
- | | + | |
- | | + | |
</ | </ | ||
Line 432: | Line 249: | ||
The output of our block might still be incorrect in special cases: we don't handle coordination (see the second sentence in sample.txt) or subordinating conjunctions. | The output of our block might still be incorrect in special cases: we don't handle coordination (see the second sentence in sample.txt) or subordinating conjunctions. | ||
- | |||
===== Your turn: more tasks ===== | ===== Your turn: more tasks ===== | ||
==== SVO to SOV ==== | ==== SVO to SOV ==== | ||
Line 467: | Line 267: | ||
**Advanced version**: This solution shifts the object (or objects) of a verb to just in front of that verb node. For example, //Mr. Brown has urged MPs.// becomes //Mr. Brown has MPs urged.// You can try to change this solution so that the final sentence is //Mr. Brown MPs has urged.// You may need a method '' | **Advanced version**: This solution shifts the object (or objects) of a verb to just in front of that verb node. For example, //Mr. Brown has urged MPs.// becomes //Mr. Brown has MPs urged.// You can try to change this solution so that the final sentence is //Mr. Brown MPs has urged.// You may need a method '' | ||
Line 522: | Line 289: | ||
**Advanced version**: What happens in the case of multiword prepositions? | **Advanced version**: What happens in the case of multiword prepositions? | ||
===== Further information ===== | ===== Further information ===== | ||
- | * [[http://ufallab2.ms.mff.cuni.cz/ | + | * [[http://ufal.mff.cuni.cz/ |
* Questions? Ask '' | * Questions? Ask '' | ||
* Solutions to this tutorial's tasks are in '' | * Solutions to this tutorial's tasks are in '' | ||
* [[http:// | * [[http:// | ||
+ | If you are missing some files from //share//, you can download them from [[http:// |
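
As a rough illustration of the hierarchy described under "Blocks, scenarios and applications", the following bash sketch chains toy "blocks" into a "scenario". It is only an analogy and makes several assumptions: the names ''block_tokenize'', ''block_lowercase'' and ''run_scenario'' are hypothetical and are not TectoMT blocks or commands, and the toy blocks work on plain text rather than on ''tmt'' documents. Real scenarios are run with ''brunblocks'' as shown in the tutorial.

```shell
#!/usr/bin/env bash
# Toy analogy only: these functions are NOT TectoMT blocks.
# Each hypothetical "block" reads text on stdin and writes its result to stdout.

# Hypothetical block: put every whitespace-separated token on its own line.
block_tokenize() {
    tr -s '[:space:]' '\n'
}

# Hypothetical block: lowercase all tokens.
block_lowercase() {
    tr '[:upper:]' '[:lower:]'
}

# Hypothetical "scenario" runner: applies the named blocks in the given order,
# feeding the output of each block into the next one.
run_scenario() {
    local data block
    data=$(cat)
    for block in "$@"; do
        data=$(printf '%s\n' "$data" | "$block")
    done
    printf '%s\n' "$data"
}

# A scenario is just an ordered list of block names.
scenario=(block_tokenize block_lowercase)

# Run the scenario on one sentence taken from the tutorial text.
result=$(printf 'Mr. Brown has urged MPs.\n' | run_scenario "${scenario[@]}")
printf '%s\n' "$result"
```

Reordering or extending the ''scenario'' array changes the processing without touching the blocks themselves, which is the idea behind keeping blocks small and chaining them into scenarios.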