Native Language Identification Shared Task 2013
A shared task in Native Language Identification (NLI): identifying the native language of a writer based solely on a sample of their writing.
- Team: Barbora Hladka (contact person, related projects, data, ML), Martin Holub (algorithms, ML), Vincent Kríž (ML)
- Important Dates:
- January 13 Registration
- January 14 Training Data Release
- March 11 Test Data Release
- March 18 Submissions Due
- March 25 Results Announcement
- April 08 Papers Due
- April 10 Revision Requests Sent
- April 12 Camera Ready Version Due
- June 13 or 14 NLI Shared Task Presentations @ BEA8 Workshop, Atlanta, GA, USA
- Data: TBA
- !!! Brooke, Julian, Graeme Hirst. Robust lexicalized native language identification. COLING 2012. (pdf)
- Brooke, Julian, Graeme Hirst. Native language detection with 'cheap' learner corpora. In Proceedings of the Conference on Learner Corpus Research, Louvain-la-Neuve. 2011.
- learner corpora review;
- they discuss topic bias. Do we have to care about it in the NLI task?
- Feature set: [character|POS|word] n-grams, function words, features from machine translation, features from L1 corpora. How many features: ???
- Machine learning algorithm: ??
- Wong, Sze-Meng Jojo, Mark Dras. Exploiting parse structures for native language identification. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 1600-1610. 2011. (pdf)
- Wong, Sze-Meng Jojo, Mark Dras, Mark Johnson. Topic Modeling for Native Language Identification. In Proceedings of Australasian Language Technology Association Workshop, pp. 115-124. 2011. (pdf)
- !!! Wong, Sze-Meng Jojo, Mark Dras, Mark Johnson. Exploring Adaptor Grammars for Native Language Identification. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 699-709. 2012. (pdf)
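
The surface features listed in the notes on Brooke and Hirst (2011), namely character and word n-grams and function words, could be extracted roughly as sketched below. This is only an illustration under our own assumptions, not the method of any of the papers above: the names `char_ngrams`, `word_ngrams`, `function_word_counts` and `extract_features` are hypothetical, the function-word list is a tiny placeholder, and tokenization is plain whitespace splitting. POS n-grams, machine-translation features and L1-corpus features are left out because they need external tools.

```python
from collections import Counter

# Hypothetical, deliberately tiny function-word list (illustration only).
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "but", "that"}


def char_ngrams(text, n):
    """Count overlapping character n-grams in the text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(tokens, n):
    """Count overlapping word n-grams, keyed as tuples of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def function_word_counts(tokens):
    """Count only the tokens that appear in the function-word list."""
    return Counter(t for t in tokens if t in FUNCTION_WORDS)


def extract_features(text, char_n=3, word_n=2):
    """Merge all feature types into one sparse count vector.

    Keys are prefixed ("char:", "word:", "fw:") so the feature
    types cannot collide with each other.
    """
    lowered = text.lower()
    tokens = lowered.split()  # assumption: whitespace tokenization
    features = Counter()
    for gram, count in char_ngrams(lowered, char_n).items():
        features["char:" + gram] = count
    for gram, count in word_ngrams(tokens, word_n).items():
        features["word:" + " ".join(gram)] = count
    for word, count in function_word_counts(tokens).items():
        features["fw:" + word] = count
    return features


if __name__ == "__main__":
    feats = extract_features("the cat and the dog")
    print(len(feats), "distinct features")
    print(feats["fw:the"], feats["char:the"])
```

Counting `len(extract_features(essay))` per essay, or the size of the union of keys over a corpus, is one way to start answering the "How many features" question; the resulting count dictionaries can be fed to any classifier that accepts sparse bag-of-features input.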