Differences
This shows you the differences between two versions of the page.
pml-haters [2007/05/28 00:16] pajas |
pml-haters [2007/05/31 14:43] pajas |
||
---|---|---|---|
Line 19: | Line 19: | ||
Given a PML file, how do I validate it? I always forget... Please provide me with the one-liner to do the validation. | Given a PML file, how do I validate it? I always forget... Please provide me with the one-liner to do the validation. | ||
- | See [[user: | + | For most purposes, |
- | + | < | |
- | PP: there is a validation script at the [[http://ufal.mff.cuni.cz/jazz/pml/index_en.html|PML homepage]], but it uses DOM. A streaming variant that uses trang can be found in '' | + | Both scripts have decent user documentation. Look inside the scripts if you are interested in the implementation details. |
===== XSH Won't Work: Blame XML Namespaces ===== | ===== XSH Won't Work: Blame XML Namespaces ===== | ||
Line 52: | Line 52: | ||
Most probably you'll still face problems when accessing attributes of XML elements, because namespace rules apply differently to attributes and elements: an unprefixed attribute is not placed in the default namespace, whereas an unprefixed element is. See the Namespaces in XML specification for details. | Most probably you'll still face problems when accessing attributes of XML elements, because namespace rules apply differently to attributes and elements: an unprefixed attribute is not placed in the default namespace, whereas an unprefixed element is. See the Namespaces in XML specification for details. | ||
- | |||
- | |||
- | |||
- | |||
- | |||
- | |||
- | |||
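The difference between element and attribute namespacing mentioned above can be seen in a few lines. This is a minimal Python sketch using only the standard library, not XSH; the namespace URI and the element and attribute names are hypothetical placeholders, not taken from any PML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used for illustration only.
NS = "http://example.org/ns"

# The root declares a default namespace; the child has an unprefixed attribute.
DOC = f'<root xmlns="{NS}"><s id="s1"/></root>'

root = ET.fromstring(DOC)
s = root[0]

# The unprefixed element inherits the default namespace...
print(s.tag)

# ...but the unprefixed attribute stays in no namespace at all.
print(s.attrib)

in_default_ns = s.get(f"{{{NS}}}id")  # lookup with the namespace: finds nothing
no_ns = s.get("id")                   # lookup without a namespace: finds the value
```

The same rule is why an XPath step for the element needs a registered prefix, while the attribute step must be written without one.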
===== Number of Sentences ===== | ===== Number of Sentences ===== | ||
Line 68: | Line 61: | ||
This XPath would quickly give you the number of sentences: | This XPath would quickly give you the number of sentences: | ||
- | < | + | < |
</ | </ | ||
Line 77: | Line 70: | ||
< | < | ||
| xsh -I - -C "regns pml http:// | | xsh -I - -C "regns pml http:// | ||
+ | </ | ||
+ | |||
+ | or just the following, if you have the regns command in your ~/.xsh2rc: | ||
+ | |||
+ | < | ||
+ | | xsh -I - -C " | ||
</ | </ | ||
Line 92: | Line 91: | ||
</ | </ | ||
+ | Here is a one-liner in Perl that does not load the whole file into memory: | ||
+ | < | ||
+ | </ | ||
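For comparison, the same streaming idea can be sketched in Python. This is not the Perl one-liner from this page, only an illustration of counting elements without building the whole tree. It assumes that sentences are elements named `s` in a single namespace; both the element name and the namespace URI below are hypothetical and must be replaced with the ones your files actually use.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; substitute the one declared in your files.
NS = "http://example.org/pml-ns"
# Assumption: each sentence is an <s> element in that namespace.
SENTENCE_TAG = f"{{{NS}}}s"


def count_sentences(source):
    """Count sentence elements while parsing incrementally."""
    count = 0
    # iterparse yields each element as soon as its end tag has been read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == SENTENCE_TAG:
            count += 1
        # Discard the element's text and children once it has been seen.
        elem.clear()
    return count


# Small in-memory sample standing in for a real file.
SAMPLE = f'<doc xmlns="{NS}"><s id="a"><w>x</w></s><s id="b"/></doc>'.encode()
print(count_sentences(io.BytesIO(SAMPLE)))  # prints 2
```

For a real file, pass the file name or an open binary file object to `count_sentences` instead of the in-memory sample.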
===== Restricting a Suite of PML Files to Contain only a Specific Sentence ===== | ===== Restricting a Suite of PML Files to Contain only a Specific Sentence ===== | ||
Line 101: | Line 102: | ||
How do I create a suite of files with just the problematic sentence 345, i.e. files test-w.xml, test-m.xml, test-a.xml and test-t.xml, all properly referenced? An XML-Reader-based script by Petr Pajas demonstrates how: | How do I create a suite of files with just the problematic sentence 345, i.e. files test-w.xml, test-m.xml, test-a.xml and test-t.xml, all properly referenced? An XML-Reader-based script by Petr Pajas demonstrates how: | ||
- | < | + | < |
</ | </ | ||