
Institute of Formal and Applied Linguistics Wiki



user:zeman:self-training — revision 2008/07/09 16:43 (current) by zeman
This page describes an experiment conducted by [[User:Zeman:start|Dan Zeman]] in November and December 2006.
  
I am trying to repeat the experiment of David McClosky, Eugene Charniak, and Mark Johnson ([[http://www.cog.brown.edu/~mj/papers/naacl06-self-train.pdf|NAACL 2006, New York]]) with self-training a parser. The idea is that you train a parser on small data, run it over big data, re-train it on its own output for the big data, and have a better-performing parser. The folks at Brown University used Charniak's reranking parser, i.e. a parser-reranker sequence. The big data was parsed by the whole reranking parser but only the first-stage parser was retrained on it. The reranker only saw the small data.
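The self-training loop described above can be sketched as follows. This is a minimal toy illustration, not the actual Charniak/Brown toolchain: ''train_parser'' here is a hypothetical stand-in that merely memorizes (sentence, tree) pairs, and the gold-data weighting mirrors the "5 × WSJ + NANTC" mixtures used in the results below.

```python
def train_parser(treebank):
    """Toy 'trainer': memorizes (sentence, tree) pairs and falls back to a
    flat bracketing for unseen sentences. A stand-in for retraining the
    first-stage parser, not the real Charniak training procedure."""
    model = dict(treebank)
    return lambda sentence: model.get(sentence, "(X %s)" % sentence)

def self_train(small_treebank, big_raw_sentences, gold_weight=5):
    # 1. Train the first-stage parser P0 on the small gold treebank.
    p0 = train_parser(small_treebank)
    # 2. Parse the big unlabeled corpus with P0.
    auto_trees = [(s, p0(s)) for s in big_raw_sentences]
    # 3. Retrain on weighted gold data plus the parser's own output
    #    (cf. the 5 x WSJ + NANTC mixtures in the results table).
    return train_parser(small_treebank * gold_weight + auto_trees)

gold = [("a b", "(S (A a) (B b))")]
p1 = self_train(gold, ["a b", "c d"])
print(p1("c d"))  # unseen sentence gets the fallback bracketing: (X c d)
```

In the real experiment, step 2 used the full parser-reranker pipeline while step 3 retrained only the first-stage parser; this sketch collapses both stages into one toy function.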
Note: I am going to move around some stuff, especially that in my home folder.
  
  * ''$PARSINGROOT'' - working copy of the parsers and related scripts. See [[:parsery|Parsing]] on how to create your own.
  * ''/fs/clip-corpora/ptb/processed'' - [[Penn Treebank]] (referred to as ''$PTB'')
  * ''/fs/clip-corpora/north_american_news'' - [[North American News Text Corpus]], including everything I made of it
  
See [[North American News Text Corpus]] for more information on the data and its preparation.
  
=====Parsing NANTC using P<sub>0</sub>=====
  
See [[:Parsery|here]] for more information on the Brown Reranking Parser. We parsed the LATWP part of NANTC on the C cluster using the following command:
  
<code>
| Brown | PTB WSJ | Brown | 5 × WSJ + 1750k NANTC |  |  89.9  |  92.1  |  90.0  |
| Brown | PTB WSJ | Brown | 5 × WSJ + 3143k NANTC |  |  90.3  |  |  90.5  |
  
