
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.


user:zeman:interset:tagsets:urdu [2010/04/24 18:50]
zeman created
user:zeman:interset:tagsets:urdu [2010/05/05 13:32] (current)
zeman The tagger does not work either.
Line 7: Line 7:
   * http://www.crulp.org/Downloads/ling_resources/parallelcorpus/Urdu%20POS%20Tagset.pdf
   * POS Tagged Urdu Corpus: http://www.crulp.org/Downloads/ling_resources/parallelcorpus/Urdu%20Tagged%20Corpus%20(100k).zip
-  * Urdu Stemmer: http://www.crulp.org/software/langproc/UrduStemmer.htm
+  * [[http://www.crulp.org/software/langproc/UrduStemmer.htm|Urdu Stemmer.]] This is a Windows GUI program. It requires some files to be at a fixed path, but it works. However, its precision is questionable: for example, it segments "ناموں" as "نا|موں" (prefix|stem).
-  * Urdu Finite State Morphological Analyzer: http://www.crulp.org/software/langproc/MorphologicalAnalyzer.htm
+  * [[http://www.crulp.org/software/langproc/MorphologicalAnalyzer.htm|Urdu Finite State Morphological Analyzer.]] This is a Windows program. I have not been able to run it because it requires Microsoft Visual C++, particularly the ''mfc42ud.dll'' library (Unicode debug version). However, there is a text file with the lexicon that could potentially be converted for PC Kimmo.
-  * Urdu Statistical POS Tagger: http://www.crulp.org/software/langproc/POS_tagger.htm
+  * [[http://www.crulp.org/software/langproc/POS_tagger.htm|Urdu Statistical POS Tagger.]] This is a Windows program. I have not been able to run it on Emille data; it failed with an exception. However, there are text files with lexical data that could potentially be used to implement another tagger.
   * English-to-Urdu MT (based on LFG): http://www.crulp.org/software/langproc/E2UMachineTranslationSystem.htm
   * Hassan Sajjad, Helmut Schmid: Tagging Urdu Text with Parts of Speech: A Tagger Comparison (EACL 2009 Athens): http://portal.acm.org/citation.cfm?id=1609067.1609144, http://www.aclweb.org/anthology/E/E09/E09-1079.pdf
   * Urdu Emille POS Tagset: http://www.lancs.ac.uk/staff/hardiea/cl03_urdu.pdf
   * Urdu Tagging Challenges (presentation): http://www.panl10n.net/Presentations/Laos/RegionalConference/CorpusCollection/Tagset_and_Tagging_Urdu_Corpus.pdf
 +  * Mohmil words: http://www.crulp.org/Publication/Crulp_report/CR02_28E.pdf
 +  * http://www.crulp.org/Downloads/langproc/UrduPOStagger/UrduPOStagset.pdf
 +  * http://aclweb.org/aclwiki/index.php?title=List_of_resources_by_language#U
