
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

padt:start [2011/03/24 23:48]
smrz project completion guide
padt:start [2011/06/01 01:32]
smrz
Line 4:
  
 ===== Overview =====
 +
 +===== Setup =====
 +
 +Install [[http://ufal.mff.cuni.cz/~pajas/tred/|TrEd]] including the [[http://ufal.mff.cuni.cz/~pajas/tred/extensions/padt/documentation/|padt]] and [[http://ufal.mff.cuni.cz/~pajas/tred/extensions/elixir/documentation/|elixir]] extensions from the default TrEd repository http://ufal.mff.cuni.cz/~pajas/tred/extensions/.
 +
 +The SVN repository of the PADT project is https://svn.ms.mff.cuni.cz/svn/padt/. A working copy is accessible at /net/projects/ace/data/arabic/PADT/ on the UFAL network.
 +
 +The project's data are stored in the main subdirectory ''data'', which is split further into ''Prague'', ''Penn'', and ''ElixirFM'', explained below.
 +
 +To check that your setup is complete, run TrEd on the following files. They should automatically set their editing contexts and stylesheets to PADT::Morpho and PADT::Syntax, respectively:
 +
 +<code bash>
 +tred /net/projects/ace/data/arabic/PADT/data/Prague/AEP/UMH_ARB_20040407.0001.{morpho,syntax}.pml
 +</code>
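
If TrEd fails to open the files, it can help to first confirm that they are readable from your machine. The following is a minimal sketch, not part of the PADT tooling; the function name ''check_readable'' is hypothetical, and the paths in the usage comment are the ones from the command above.

```bash
# Hypothetical helper: print every given path that is not readable.
# Returns non-zero if at least one path is unreadable.
check_readable() {
    cr_status=0
    for cr_path in "$@"; do
        if [ ! -r "$cr_path" ]; then
            printf '%s\n' "$cr_path"
            cr_status=1
        fi
    done
    return "$cr_status"
}

# Usage (bash brace expansion, paths as in the tred command above):
# base=/net/projects/ace/data/arabic/PADT/data/Prague/AEP/UMH_ARB_20040407.0001
# check_readable "$base".{morpho,syntax}.pml && tred "$base".{morpho,syntax}.pml
```

An empty output means both files are readable, so any remaining problem is more likely in the TrEd extensions than in the data location.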
 +
 +For better display of the various scripts and tree types, you can use the following or similar settings in TrEd's config file:
 +
 +<file>
 +Font = "family:DejaVu Sans Condensed, size:14, weight:normal"
 +
 +NodeXSkip = 30;
 +NodeYSkip = 10;
 +</file>
  
 ===== Locations =====
 +
 +The SVN repository of the PADT project is https://svn.ms.mff.cuni.cz/svn/padt/. The main subdirectory ''data'' is split into ''ElixirFM'', ''Prague'', and ''Penn''. Further:
 +
 +data/ElixirFM/
 +
 +data/Penn/1v3/
 +data/Penn/2v2/
 +data/Penn/3v2/
 +data/Penn/4v1/
 +
 +data/Prague/AEP/
 +data/Prague/ASB/
 +data/Prague/EAT/
 +data/Prague/HYT/
 +data/Prague/NHR/
 +data/Prague/XIN/
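
The layout above can be checked against a local working copy with a short script. This is only a sketch under stated assumptions: ''PADT_DATA_DIRS'' and ''missing_padt_dirs'' are hypothetical names, and the directory list is copied from the listing above rather than read from the repository.

```bash
# Subdirectories of data/ as listed above (space-separated).
PADT_DATA_DIRS="ElixirFM
Penn/1v3 Penn/2v2 Penn/3v2 Penn/4v1
Prague/AEP Prague/ASB Prague/EAT Prague/HYT Prague/NHR Prague/XIN"

# Hypothetical helper: print every expected directory missing under ROOT/data.
# Returns non-zero if at least one directory is missing.
missing_padt_dirs() {
    mpd_root=$1
    mpd_status=0
    for mpd_dir in $PADT_DATA_DIRS; do
        if [ ! -d "$mpd_root/data/$mpd_dir" ]; then
            printf 'data/%s\n' "$mpd_dir"
            mpd_status=1
        fi
    done
    return "$mpd_status"
}

# Usage: missing_padt_dirs /path/to/your/working/copy
```

The function prints nothing and returns zero when the working copy contains all of the listed directories.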
 +
 +The project's contributors are ''smrz'', ''bielicky'', and ''zabokrtsky''; the rest of ''ufal'' have read-only access.
 +
 +There is also the ''tools'' directory, which contains some useful scripts.
 +
 +The code base for the PADT project, i.e. for annotation, display, and processing of the data, is TrEd's ''padt'' extension together with the ''elixir'' extension, which ''padt'' depends on.
  
 ===== Agenda =====
 +
 +Focus on paragraphs/sentences that lack PADT-Morpho annotation, especially non-annotated headlines:
 +
 +<code bash>
 +btred -QTe '@w = $this->children(); @n = grep { $_->children() } @w; print ThisAddress() . "\n" if @n < 0.9 * @w' Penn/???/*.morpho*.pml 
 +</code>
 +
 +
 +Focus on nodes in PADT-Syntax that do not have a valid ''afun'' annotation:
 +
 +<code bash>
 +btred -QTNe 'print ThisAddress() . "\n" if exists $this->{"afun"} and $this->{"afun"} eq "???"' Prague/???/*.syntax*.pml
 +</code>
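
Both commands above print one address per line, so their output can be piped into a small filter to see which files need the most attention. The sketch below is not part of the PADT tools; ''count_by_file'' is a hypothetical name, and it assumes that each address has the form ''FILE##POSITION'', which should be verified against the actual output of ''ThisAddress()''.

```bash
# Hypothetical helper: read address lines on stdin and print "COUNT FILE",
# one line per file, highest count first (ties ordered by file name).
# Assumption: everything before the first "##" on a line is the file name.
count_by_file() {
    awk -F'##' '{ n[$1]++ } END { for (f in n) print n[f], f }' |
        sort -k1,1nr -k2
}

# Usage: pipe the output of either btred command above into count_by_file.
```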
  
 ===== References =====
  
