Comments

-We discussed the history of the agreement thresholds between .67 and .8 and why we are analyzing that range.
-We went through drawing the graphs of the relationship between relationship strength and accuracy. For true annotation, the greater the relationship strength, the greater the accuracy.
-The paper indicates that when disagreement is caused by random noise, it has little effect on the overall agreement.
-We also went into a lot of detail on the different ways of computing annotator agreement across the two sessions (see the sketch after this list).
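
To make the agreement measures we compared more concrete, here is a minimal sketch of two of the simpler ways to compute agreement between two annotators: raw observed agreement and Cohen's kappa. This is not the paper's setup; the toy "pos"/"neg" labels and the two annotators are made up for illustration.

```python
from collections import Counter

def observed_agreement(a, b):
    """Fraction of items on which the two annotators assign the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement,
    where chance is estimated from each annotator's own label distribution."""
    n = len(a)
    p_o = observed_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability both annotators independently pick the
    # same label, given their individual labelling frequencies.
    p_e = sum((counts_a[lab] / n) * (counts_b[lab] / n)
              for lab in set(a) | set(b))
    return (p_o - p_e) / (1 - p_e)

# Toy annotations from two hypothetical annotators (made-up data).
ann1 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
ann2 = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos"]

print(f"observed agreement = {observed_agreement(ann1, ann2):.2f}")  # 0.75
print(f"Cohen's kappa      = {cohens_kappa(ann1, ann2):.2f}")        # 0.50
```

The example shows why kappa sits below raw agreement: the annotators agree on 6 of 8 items, but half of that agreement is expected by chance under their label distributions.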

What do we dislike about the paper

-The paper was clear about its experiments on random noise, but it was less clear about how kappa was determined in the cases where an annotator overuses a label (a rough simulation sketch follows this list).
-The paper did not indicate whether, when values were changed, the system made sure the new values did not match the originals.
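
For the overuse question, here is a rough sketch of how one might simulate the two kinds of disagreement we contrasted, random noise versus systematically overusing one label, and score each against a clean annotator with Cohen's kappa. The label set, the 20% perturbation rate, and the use of scikit-learn's cohen_kappa_score are our own assumptions, not the paper's procedure. Note that the noise step below leaves open the same question raised above: a "changed" value is allowed to coincide with the original.

```python
import random
from sklearn.metrics import cohen_kappa_score  # standard two-rater kappa

random.seed(0)
LABELS = ["a", "b", "c"]
truth = [random.choice(LABELS) for _ in range(2000)]  # hypothetical gold labels

def random_noise(labels, p):
    """With probability p, replace an item's label with a uniformly random one.
    (The replacement may equal the original, mirroring the ambiguity noted above.)"""
    return [random.choice(LABELS) if random.random() < p else lab
            for lab in labels]

def overuse(labels, p, favourite="a"):
    """With probability p, replace an item's label with one over-used label."""
    return [favourite if random.random() < p else lab for lab in labels]

rater1 = list(truth)                 # one annotator follows the gold labels
noisy  = random_noise(truth, 0.2)    # disagreement from random noise
biased = overuse(truth, 0.2)         # disagreement from over-using label "a"

print("kappa, random noise :", round(cohen_kappa_score(rater1, noisy), 2))
print("kappa, label overuse:", round(cohen_kappa_score(rater1, biased), 2))
```

This only shows how the kappa values could be computed for the two conditions; it does not reproduce the paper's results.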

What do we like about the paper

-The paper was interesting because its results ran counter to what most of us would initially assume.