Differences

This shows you the differences between two versions of the page.

--- courses:rg:2012:segments [2012/12/29 15:25] bilek
+++ courses:rg:2012:segments [2013/01/03 22:38] (current) popel
@@ Line 21: / Line 21: @@
 The most important question, though, is why we do all this, given that the data from the PDT trees are more thorough than the segments that we want to create. So, what is the exact reason?
-) it is not because there are the training data
+) To prepare training data?
-) it is not because we use it as a testing data, because it has only 70% accuracy
+> Probably not, because they don't use any machine learning approach.
-) It can be to show that it is difficult to create from the analytical tree, too
+) To prepare testing data?
-) We can use it as a "oracle experiment" - how far can we go with plaintext?
+> No, because they already have some manually annotated sentences. Moreover, the described approach (using PDT gold a-trees as input) has only 70% accuracy.
+) As an "oracle experiment" - using gold a-trees is an upper bound for using plaintext only.
+> Probably not. There are better algorithms (with higher precision than 70%) exploiting gold a-trees.
+) To show some difficult cases in creating segmentation charts (even when gold a-trees are available).
+> Maybe.
 ) It can be just to fill up the space :)
+> ?
 ====How to Obtain Segments from Plain Text?====

Institute of Formal and Applied Linguistics Wiki