Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:joshua [2009/07/13 14:47]
zeman Link to Chris Callison-Burch's documentation.
user:zeman:joshua [2010/03/07 22:37]
zeman Troubleshooter.
Line 305:
     $HINDI/mert/zmert-config.txt \
     > $HINDI/mert/zmert.out</code>
 +
 +===== Troubleshooter =====
 +
 +==== Grammar extraction: Negative array size ====
 +
 +If you encounter this exception during corpus binarization or (in older releases of Joshua) during grammar extraction, check whether your alignment file matches your source and target corpus. Did you accidentally switch the translation direction? The alignment file must have the same number of lines as your source and target corpus, one line per sentence (segment) pair. The "tokens" on each line are pairs of numbers, such as "0-0 1-2 2-2 3-5". The first number in each pair is a token index into the source sentence (the first token has index 0) and the second number is a token index into the target sentence. If you switch the source and the target, some indices are likely to point beyond the end of the sentence, which can lead to this exception.
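The check described above can be sketched in a few lines of Python. This is only an illustration, not part of Joshua: the function names `check_alignment` and `check_files` are hypothetical, and the sketch assumes whitespace-tokenized corpora and alignment pairs written as `source-target` with 0-based indices, as in the example above.

```python
def check_alignment(src_lines, tgt_lines, align_lines):
    """Return a list of problem descriptions; an empty list means none were found."""
    problems = []
    # The three files must have one line per sentence pair.
    if not (len(src_lines) == len(tgt_lines) == len(align_lines)):
        problems.append(
            "line counts differ: source=%d target=%d alignment=%d"
            % (len(src_lines), len(tgt_lines), len(align_lines))
        )
    triples = zip(src_lines, tgt_lines, align_lines)
    for n, (src, tgt, ali) in enumerate(triples, start=1):
        src_len = len(src.split())
        tgt_len = len(tgt.split())
        for pair in ali.split():
            try:
                i, j = (int(x) for x in pair.split("-"))
            except ValueError:
                problems.append("line %d: malformed pair %r" % (n, pair))
                continue
            # Both indices are 0-based and must fall inside their sentence.
            if not (0 <= i < src_len and 0 <= j < tgt_len):
                problems.append(
                    "line %d: pair %s out of range (source has %d tokens, target has %d)"
                    % (n, pair, src_len, tgt_len)
                )
    return problems


def check_files(src_path, tgt_path, align_path):
    """Read the three files (paths are supplied by the caller) and check them."""
    def read(path):
        with open(path, encoding="utf-8") as handle:
            return handle.read().splitlines()
    return check_alignment(read(src_path), read(tgt_path), read(align_path))
```

If the check reports many out-of-range pairs, calling it again with the source and target arguments swapped is a quick way to test whether the translation direction was reversed.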
 +
 +==== ZMERT: corrupted temp file ====
 +
 +Hi all,
 +
 +does the following ZMERT exception look familiar to anyone? My only idea was that the nbest output from the decoder is somehow corrupted. However, I cannot find anything strange in it, such as a sequence of more than three "|||", etc.
 +
 +Thanks,
 +Dan
 +
 +zmert.out:
 +-----
 +<code>tmpDirPrefix: /ha/work/people/zeman/wmt/experiments/obo-max/mert/ZMERT.
 +Processed the following args array:
 + -dir /ha/work/people/zeman/wmt/experiments/obo-max/mert -s src.txt -r ref.txt -rps 1 -p params.txt -m BLEU 4 closest -maxIt 5 -ipi 20 -cmd ./decoder.pl -decOut nbest.txt -dcfg decoder-config.txt -N 300 -v 1 -seed 12341234
 +
 +----------------------------------------------------
 +Initializing...
 +----------------------------------------------------
 +
 +Random number generator initialized using seed: 12341234
 +
 +Number of sentences: 2051
 +Number of documents: 1
 +Optimizing BLEU
 +docSubsetInfo: {0, 1, 1, 1, 1, 0, 0}
 +Number of features: 5
 +Feature names: {"lm","phrasemodel pt 0","phrasemodel pt 1","phrasemodel pt 2","wordpenalty"}
 +
 +c    Default value    Optimizable?    Crit. val. range    Rand. val. range
 +1     1.0000         Yes         [0.1,Infinity]         [0.5,1.5]
 +2     1.0669         Yes         [-Infinity,Infinity]         [-1.0,1.0]
 +3     0.7522         Yes         [-Infinity,Infinity]         [-1.0,1.0]
 +4     0.5898         Yes         [-Infinity,Infinity]         [-1.0,1.0]
 +5     -2.8448         Yes         [-Infinity,Infinity]         [-5.0,0.0]
 +
 +Weight vector normalization method: weights will be scaled so that the "lm" weight has an absolute value of 1.0.
 +
 +----------------------------------------------------
 +
 +----------------------------------------------------
 +Z-MERT run started @ Sat Mar 06 23:52:57 CET 2010
 +----------------------------------------------------
 +
 +Initial lambda[]: {1.0, 1.066893, 0.752247, 0.589793, -2.844814}
 +
 +--- Starting Z-MERT iteration #1 @ Sat Mar 06 23:52:57 CET 2010 ---
 +Decoding using initial weight vector {1.0, 1.066893, 0.752247, 0.589793, -2.844814}
 +Running external decoder...
 +...finished decoding @ Sun Mar 07 00:02:33 CET 2010
 +Reading candidate translations from iterations 1-1
 +(and computing BLEU sufficient statistics for previously unseen candidates)
 + Progress:
 +Exception in thread "main" java.lang.NumberFormatException: For input string: "||||||"
 +   at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
 +   at java.lang.Integer.parseInt(Integer.java:449)
 +   at java.lang.Integer.parseInt(Integer.java:499)
 +   at joshua.zmert.MertCore.run_single_iteration(MertCore.java:1071)
 +   at joshua.zmert.MertCore.main(MertCore.java:3129)
 +Z-MERT exiting prematurely (MertCore returned 1)...</code>
 +----- 
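To narrow down which file contains the string that `Integer.parseInt` rejected, one option is to search candidate files for the literal `||||||` from the exception message. The following is a minimal sketch; the helper names `lines_containing` and `scan_file` are hypothetical, and the sketch makes no assumption about the file format beyond it being line-oriented text.

```python
def lines_containing(lines, needle="||||||"):
    """Return the 1-based numbers of the lines that contain the needle."""
    return [n for n, line in enumerate(lines, start=1) if needle in line]


def scan_file(path, needle="||||||"):
    """Apply lines_containing to a text file; undecodable bytes are replaced."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return lines_containing(handle, needle)
```

Running `scan_file` on the decoder's n-best output and on any other files in the MERT working directory shows where the offending sequence occurs.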
 +
 +Omar's response:
 +
 +Hi Dan,
 +
 +The "||||||" sequence is in a temp file, not the decoder's output.  If
 +there are any *temp* (or *tmp*) files in the folder from earlier
 +runs, make sure you delete them first, then try launching Z-MERT
 +again.  Such files are left over from runs that crash.  Z-MERT does
 +not delete them because they can be used to restart Z-MERT from the
 +point where it crashed.  But that assumes the crash is due to power
 +loss or an interrupted job, etc.  In your case, I think what happened
 +is that a prior run crashed because of an external problem in the
 +setup itself, which you fixed before trying to restart Z-MERT.  For that
 +reason, Z-MERT should not be using those temp files in the first
 +place, but when it sees them there, it assumes it can use them because
 +the user did not delete them.
 +
 +Let me know if that's not the case.
 +
 +O.Z.
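Before relaunching Z-MERT, the leftover files mentioned in the response can be listed with a short script. This is a sketch under assumptions: the function name `leftover_temp_files` is hypothetical, and the patterns `*temp*` and `*tmp*` are taken from the message above, so the actual temp file names may differ (the log reports a `tmpDirPrefix` ending in `ZMERT.`, which may also be worth checking). The script only lists files; review the list before deleting anything.

```python
from pathlib import Path


def leftover_temp_files(mert_dir, patterns=("*temp*", "*tmp*")):
    """Return the files directly inside mert_dir whose names match a pattern."""
    found = set()
    for pattern in patterns:
        # Path.glob is not recursive for these patterns; directories are skipped.
        found.update(p for p in Path(mert_dir).glob(pattern) if p.is_file())
    return sorted(found)


def print_leftovers(mert_dir):
    """Print one leftover file per line so the list can be reviewed by hand."""
    for path in leftover_temp_files(mert_dir):
        print(path)
```

For example, `print_leftovers(".")` lists candidates in the current directory; once the list looks right, each file can be removed with `Path.unlink()`.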
 +
