Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

pml-haters [2007/05/28 00:25]
pajas
pml-haters [2007/06/05 06:35] (current)
bojar formatting only
Line 14: Line 14:
  
 Please strongly prefer SAX-based tools to DOM-based tools.
+
  
 ===== Validation =====
  
-Given a PML file, how do I validate it? I always forget... Please provide me with the one-liner to do the validation.
+Given a PML file, how do I validate it?
  
-See [[user:ptacek:tectomt|this snippet]] for some vague hints.
+For most purposes, a libxml2 (DOM) based validator
+<code>/f/common/exec/validate_pml --pml-dir /f/common/share/pml --path /f/common/share/tred file_to_validate</code>should work fine and fast.
  
-PP: there is a validation script at the [[http://ufal.mff.cuni.cz/jazz/pml/index_en.html|PML homepage]], but it uses DOM. A streaming variant that uses trang can be found in ''/home/pajas/bin/validate_pml_stream'' (no resource path support yet - you need to have your PML schemas next to your files).
+For huge files, use<code>/f/common/exec/validate_pml_stream --path /f/common/share/tred file_to_validate</code>which is based on Jing (SAX); Jing has no Zlib or stdin support, so some space in /tmp will be needed for temporary files
+Both scripts have decent user documentation. See inside the scripts if interested in the implementation details.
  
 ===== XSH Won't Work: Blame XML Namespaces =====
Line 102: Line 105:
 How do I create a suite of files with just the problematic sentence 345, i.e. files test-w.xml, test-m.xml, test-a.xml and test-t.xml, all properly referenced? A XML-Reader based script by Petr Pajas demonstrates that:
  
-<code>~pajas/projects/pml/separate_t_tree.pl file-t.xml 345
+<code>~pajas/projects/pml/tools/separate_t_tree.pl file-t.xml 345
 </code>
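
Editorial note, not part of either revision: the new Validation text names two validators, one for ordinary files and one for huge files. The sketch below shows one possible way to choose between them by file size. The script paths and options are copied from the revision text; the 100 MB threshold, the variable names, and the function names ''choose_validator'' and ''validate'' are assumptions made up for illustration, since the page does not say what counts as "huge".

```shell
#!/bin/sh
# Hypothetical wrapper around the two validators named in the Validation section.
# Paths and options come from the wiki text; the threshold is an assumption.

PML_DIR=/f/common/share/pml
TRED_PATH=/f/common/share/tred
THRESHOLD_BYTES=$((100 * 1024 * 1024))   # assumed cut-off for a "huge" file

# Print "stream" for sizes above the threshold, "dom" otherwise.
choose_validator() {
    if [ "$1" -gt "$THRESHOLD_BYTES" ]; then
        echo stream
    else
        echo dom
    fi
}

# Validate one PML file with whichever validator fits its size.
validate() {
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
    size=$(($(wc -c < "$1")))
    case $(choose_validator "$size") in
        stream)
            # Jing (SAX) based; needs some space in /tmp for temporary files
            /f/common/exec/validate_pml_stream --path "$TRED_PATH" "$1"
            ;;
        dom)
            # libxml2 (DOM) based
            /f/common/exec/validate_pml --pml-dir "$PML_DIR" --path "$TRED_PATH" "$1"
            ;;
    esac
}
```

After sourcing the file, ''validate file_to_validate'' would run the chosen script. Keeping the size decision in its own function makes the threshold easy to change without touching the command lines.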
  

[ Back to the navigation ] [ Back to the content ]