# Differences

This shows you the differences between two versions of the page.


courses:rg:2013:convolution-kernels [2013/03/11 18:31] dusek |
courses:rg:2013:convolution-kernels [2013/03/12 11:27] popel <latex>x</latex> was not rendered |
||
---|---|---|---|

Line 19: | Line 19: | ||

* They are able to " | * They are able to " | ||

* Examples: Naive Bayes, Mixtures of Gaussians, HMM, Bayesian Networks, Markov Random Fields | * Examples: Naive Bayes, Mixtures of Gaussians, HMM, Bayesian Networks, Markov Random Fields | ||

- | * **Diskriminative models** do everything in one step -- they learn the posterior < | + | * **Discriminative models** do everything in one step -- they learn the posterior < |

* They are simpler and can use many more features, but they cope poorly with missing inputs. | * They are simpler and can use many more features, but they cope poorly with missing inputs. | ||

- | * Examples: SVM, Logistic Regression, Neuron. sítě, k-NN, Conditional Random Fields | + | * Examples: SVM, Logistic Regression, Neural network, k-NN, Conditional Random Fields |

- | - | + | - Each CFG rule generates just one level of the derivation tree. Therefore, using " |

+ | * '' | ||

+ | * It could be modelled with an augmentation of the nonterminal labels. | ||

+ | * CFGs can't generate non-projective sentences. | ||

+ | * But they can be modelled using traces. | ||

+ | - The derivation is actually quite simple: | ||

+ | - < | ||

+ | - < | ||

+ | - < | ||

+ | - < | ||

+ | - < | ||

+ | - Convolution is defined like this: < | ||

+ | - There is a (tiny) error in the last formula of Section 3. You cannot actually multiply tree parses, so it should read: < | ||
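
For readers unfamiliar with the term used in the answers above, the standard discrete convolution of two finite sequences, (f * g)[n] = sum over k of f[k] · g[n − k], can be sketched as follows. This is only an illustration of the general textbook operation; it is not a reconstruction of the truncated formula in the revision, and the function name `convolve` is a hypothetical choice made for this example.

```python
def convolve(f, g):
    """Discrete convolution of two finite sequences.

    (f * g)[n] = sum_k f[k] * g[n - k], with terms outside either
    sequence treated as zero. `convolve` is an illustrative name only.
    """
    if not f or not g:
        return []
    # The result has len(f) + len(g) - 1 entries.
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            # Every pair (i, j) contributes to position n = i + j.
            out[i + j] += fi * gj
    return out


result = convolve([1, 2], [1, 3])
```

The double loop makes the "sum over all ways of splitting n into two parts" explicit, which is the same decomposition idea that convolution kernels apply to structured objects such as parse trees.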

+ | ==== Report ==== | ||

+ | |||

+ | We discussed the answers to the questions most of the time. Other issues raised in the discussion were: | ||

+ | |||

+ | * **Usability** -- the approach is only usable for // | ||

+ | * **Scalability** -- they only use 800 sentences and 20 candidates per sentence for training. We believe that for large data (millions of examples) this will become too complex. | ||

+ | * **Evaluation** -- it looks as if they used a non-standard evaluation metric to get " |