Institute of Formal and Applied Linguistics Wiki


Hindi (hi)

Hyderabad Dependency Treebank (HyDT-Hindi)

Versions

There has been no official release of the treebank yet. Three as-is sample releases were made for the NLP tools contests in parsing Indian languages, attached to the ICON 2009 and ICON 2010 conferences and to the MTPIL workshop at COLING 2012.

Obtaining and License

There is no standard distribution channel for the treebank after the shared task evaluation period. Inquire at the LTRC (ltrc (at) iiit (dot) ac (dot) in) about the possibility of getting the data. The ICON 2010 and HPST 2012 license in short:

HyDT-Hindi is being created by members of the Language Technologies Research Centre, International Institute of Information Technology, Gachibowli, Hyderabad, 500032, India.

References

Domain

News domain corpus from ISI Kolkata.

Size

HyDT-Hindi contains dependencies on two levels: between chunks and inside chunks. The ICON 2009 CoNLL-formatted version contained only dependencies between chunks, thus the node/tree ratio was much lower than in other treebanks. The ICON 2009 version came with a data split into three parts: training, development and test:

Part Sentences Chunks Ratio
Training 1501 13779 9.18
Development 150 1250 8.33
Test 150 1156 7.71
TOTAL 1801 16185 8.99

The ICON 2010 version came with a data split into three parts: training, development and test. The intra-chunk dependencies have been added:

Part Sentences Words Ratio
Training 2972 64452 21.69
Development 543 12616 23.23
Test 321 6588 20.52
TOTAL 3836 83656 21.81

I have counted the sentences and tokens (words) in the .conll files; the numbers differ slightly from the statistics presented in (Husain et al., 2010).
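
Counts of this kind can be reproduced along the following lines. This is only a minimal sketch: it assumes that sentences in a .conll file are separated by blank lines and that every non-blank line is one token. The function name count_conll and the tiny sample string are illustrative and not part of the treebank distribution.

```python
def count_conll(lines):
    """Return (sentences, tokens, tokens per sentence) for CoNLL-style lines."""
    sentences = 0
    tokens = 0
    in_sentence = False
    for line in lines:
        if line.strip():
            tokens += 1
            if not in_sentence:
                # first token after a blank line (or file start) opens a sentence
                sentences += 1
                in_sentence = True
        else:
            in_sentence = False
    ratio = round(tokens / sentences, 2) if sentences else 0.0
    return sentences, tokens, ratio


# Hypothetical two-sentence input, not taken from the treebank.
sample = "1\ta\n2\tb\n\n1\tc\n"
print(count_conll(sample.splitlines()))
```

For a real file, the same function can be called on an open file handle, for example count_conll(open(path, encoding="utf-8")), where path is whatever file name the distribution uses.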

The HTB 0.5 (2012) version came with a data split into three parts: training, development and test. The intra-chunk dependencies have been added:

Part Sentences Words Ratio
Training 12041 268093 22.27
Development 1233 26416 21.42
Test
TOTAL

Inside

HTB 0.5 is distributed in Devanagari UTF-8 and in the WX encoding (see below), both in SSF and CoNLL formats, each with gold-standard and automatic morphology.

The rest of this section applies to the ICON datasets. It may or may not still be valid for HTB 0.5.

The text uses the WX encoding of Indian scripts. If the original script is known (Devanagari in this case), the WX encoding can be mapped back to the original characters in UTF-8. WX uses Latin letters, so any embedded English (or other string in Latin letters) will probably be lost during the conversion, because it cannot be told apart from encoded text. Note that the WX-encoded data contain broken characters (\x{FFFD} REPLACEMENT CHARACTER), and not infrequently; the correct characters cannot be recovered automatically.
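
The mapping from WX to Devanagari can be sketched as follows. This is a simplified illustration, not the converter that produced the samples on this page: the function name wx_to_devanagari is made up here, the tables cover only the basic Hindi letters, and the candra vowels written with Y (as in bOYlIvuda) are not handled. Characters outside the tables, including U+FFFD, are passed through unchanged.

```python
VIRAMA = "\u094d"  # joins consonants by suppressing the inherent vowel
NUKTA = "\u093c"   # WX writes the nukta as Z after the consonant

# WX letter -> (independent vowel, dependent vowel sign after a consonant)
VOWELS = {
    "a": ("अ", ""), "A": ("आ", "\u093e"), "i": ("इ", "\u093f"),
    "I": ("ई", "\u0940"), "u": ("उ", "\u0941"), "U": ("ऊ", "\u0942"),
    "q": ("ऋ", "\u0943"), "e": ("ए", "\u0947"), "E": ("ऐ", "\u0948"),
    "o": ("ओ", "\u094b"), "O": ("औ", "\u094c"),
}
CONSONANTS = {
    "k": "क", "K": "ख", "g": "ग", "G": "घ", "f": "ङ",
    "c": "च", "C": "छ", "j": "ज", "J": "झ", "F": "ञ",
    "t": "ट", "T": "ठ", "d": "ड", "D": "ढ", "N": "ण",
    "w": "त", "W": "थ", "x": "द", "X": "ध", "n": "न",
    "p": "प", "P": "फ", "b": "ब", "B": "भ", "m": "म",
    "y": "य", "r": "र", "l": "ल", "v": "व",
    "S": "श", "R": "ष", "s": "स", "h": "ह",
}
# anusvara, visarga, candrabindu
SIGNS = {"M": "\u0902", "H": "\u0903", "z": "\u0901"}


def wx_to_devanagari(text):
    """Convert a WX-encoded string to Devanagari (simplified sketch)."""
    out = []
    after_consonant = False  # True while a consonant still awaits its vowel
    for ch in text:
        if ch in CONSONANTS:
            if after_consonant:
                out.append(VIRAMA)  # consonant cluster, e.g. lm in Pilma
            out.append(CONSONANTS[ch])
            after_consonant = True
        elif ch == "Z" and after_consonant:
            out.append(NUKTA)
        elif ch in VOWELS:
            independent, sign = VOWELS[ch]
            out.append(sign if after_consonant else independent)
            after_consonant = False
        elif ch in SIGNS:
            out.append(SIGNS[ch])
            after_consonant = False
        else:
            if after_consonant:
                out.append(VIRAMA)
            out.append(ch)  # digits, punctuation, U+FFFD, unknown letters
            after_consonant = False
    if after_consonant:
        out.append(VIRAMA)
    return "".join(out)


for word in ("bAwa", "Pilma", "lAhOra", "aXikAriyoM"):
    print(word, wx_to_devanagari(word))
```

Because every Latin letter found in the tables is treated as WX, an embedded English word would be converted as well, which is the loss described above.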

Occasionally there are NULL nodes that do not correspond to any surface chunk or token. They represent elided participants.

The syntactic tags (dependency relation labels) are karaka relations, i.e. deep syntactic roles according to the Pāṇinian grammar. There are separate versions of the treebank with fine-grained and coarse-grained syntactic tags.

According to (Husain et al., 2010), in the ICON 2010 version, the chunk tags, POS tags, lemma, morphosyntactic features and inter-chunk dependencies (topology + tags) were annotated manually. The rest (intra-chunk dependencies, headword of chunk) was marked automatically. The tool for intra-chunk dependency parsing achieves about 96% accuracy.

Note: Some dependency structures in the Hindi part of HyDT have contained cycles, so they are not valid trees.
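
A check for such cycles could look like the sketch below. It assumes the heads of one sentence are given as a list in which position i holds the head of node i+1 and 0 stands for the artificial root, as in the HEAD column of the CoNLL samples on this page. The function name find_cycle is illustrative.

```python
def find_cycle(heads):
    """Return the node ids on a cycle, or [] if the structure has none.

    heads[i] is the head of node i + 1; 0 denotes the artificial root.
    """
    n = len(heads)
    state = [0] * (n + 1)  # 0 = unvisited, 1 = on current path, 2 = finished
    state[0] = 2           # the root never lies on a cycle
    for start in range(1, n + 1):
        path = []
        node = start
        # climb towards the root until we meet something already seen
        while state[node] == 0:
            state[node] = 1
            path.append(node)
            node = heads[node - 1]
        if state[node] == 1:
            # we ran back into the current path: that suffix is a cycle
            return path[path.index(node):]
        for visited in path:
            state[visited] = 2
    return []


# HEAD column of the first ICON 2010 training sentence shown on this page.
print(find_cycle([3, 3, 11, 0, 9, 9, 6, 6, 11, 11, 4, 11]))
# Artificial example in which nodes 2 and 3 point at each other.
print(find_cycle([0, 3, 2]))
```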

Sample

The first two sentences of the ICON 2010 training data (with fine-grained syntactic tags) in the Shakti format:

<document docid="hi">
<head>
<title>  </title>			
<author>			
<firstname>  </firstname>			
<middlename>    </middlename>			
<lastname></lastname>			
</author>			
<availability format="electronic" />			
<bibl>			
</bibl>			
<bytecount>8.0K</bytecount>			
<domain name="general" />			
<creation creationdate="19/06/2007" institutename="IIIT Hyderabad">			
<creatorname>			
<lastname>Dipti</lastname>			
<middlename>			
</middlename>			
<firstname>Sharma</firstname>			
</creatorname>			
</creation>			
<distributor>CLIA Consortia, DIT</distributor>			
<edition number="1.0" />			
<encodingdesc>			
<newencoding>Unicode(UTF-8)</newencoding>			
<originalencoding>UTF-8</originalencoding>			
</encodingdesc>			
<sentencemarker marker=".">Specify Marker</sentencemarker>			
<language name="hi" writingsystem="LTR" script="Devanagari" />			
<normalization normalized="no">			
<utilityname>xxx.exe</utilityname>			
</normalization>			
<projectdesc name="ILMT" />			
<pubaddress addresstype="web">			
</pubaddress>			
<pubdate>			
<dateofpublication></dateofpublication>			
</pubdate>			
<publicationstmt type="copyrightfree">			
</publicationstmt>			
<publisher>			
<name></name>			
<url>xxx.com</url>			
</publisher>			
<pubplace place="books" />			
<wordcount>2  </wordcount>			
<caption>xuvryavahAra se biParIM bipASA Pilma mahowsava se vApasa lOta gaI bipASA govA. </caption>			
</caption>			
 
<annotated-resource name="HyDT-Hindi" version="2.0" type="dep-words" layers="morph,pos,chunk,dep-word" language="hin" date-of-release="20100823">
    <annotation-standard>
        <morph-standard name="Anncorra-morph" version="1.31" date="20080920" />
        <pos-standard name="Anncorra-pos" version="" date="20061215" />
        <chunk-standard name="Anncorra-chunk" version="" date="20061215" />
        <intrachunk-dependency-standard name="Anncorra-intrachunk-dep" version="1.0" date="" dep-tagset-granularity="5" />
        <dependency-standard name="Anncorra-dep" version="2.0" date="" dep-tagset-granularity="6" />
    </annotation-standard>
</annotated-resource>
</head>
<body>
<tb number="1" segment="no" bullet="no">
<foreign language="select" writingsystem="LTR"></foreign>
<text>
<Sentence id="1">
1	bAwa	NN	<fs af='bAwa,n,f,sg,3,d,0,0' drel='k1:ho' posn='10' name='bAwa' chunkId='NP' chunkType='head:NP'>
2	galawa	JJ	<fs af='galawa,adj,any,any,,any,,' drel='k1s:ho' posn='20' name='galawa' chunkId='JJP' chunkType='head:JJP'>
3	ho	VM	<fs af='ho,v,any,any,any,,0,0' drel='vmod:hE' stype='declarative' posn='30' voicetype='active' name='ho' chunkId='VGF' chunkType='head:VGF'>
4	wo	CC	<fs af='wo,avy,,,,,,' posn='40' name='wo' chunkId='CCP' chunkType='head:CCP'>
5	gussA	NN	<fs af='gussA,n,m,sg,3,d,0,0' drel='pof:AnA' posn='50' name='gussA' chunkId='NP2' chunkType='head:NP2'>
6	selebritija	NN	<fs af='selebritija,unk,,,,,0_ko,' drel='k4a:AnA' posn='60' vpos='vib_2_RP' name='selebritija' chunkId='NP3' chunkType='head:NP3'>
7	ko	PSP	<fs af='ko,psp,,,,,,' posn='70' drel='lwg__psp:selebritija' chunkType='child:NP3' name='ko'>
8	BI	RP	<fs af='BI,avy,,,,,,' posn='80' drel='lwg__rp:selebritija' chunkType='child:NP3' name='BI'>
9	AnA	VM	<fs af='A,v,any,any,any,d,nA,nA' drel='k1:hE' posn='90' name='AnA' chunkId='VGNN' chunkType='head:VGNN'>
10	lAjamI	JJ	<fs af='lAjamI,adj,any,any,,,,' drel='pof:hE' posn='100' name='lAjamI' chunkId='JJP2' chunkType='head:JJP2'>
11	hE	VM	<fs af='hE,v,any,sg,3,,hE,hE' drel='ccof:wo' stype='declarative' posn='110' voicetype='active' name='hE' chunkId='VGF2' chunkType='head:VGF2'>
12	.	SYM	<fs af='.,punc,,,,,,' posn='120' drel='rsym:hE' chunkType='child:VGF2' name='.'>
</Sentence>
 
 
<Sentence id="2">
1	bqhaspawivAra	NNP	<fs af='bqhaspawivAra,n,m,sg,3,o,0_ko,0' drel='k7t:hue' posn='10' vpos='vib_2' name='bqhaspawivAra' chunkId='NP' chunkType='head:NP'>
2	ko	PSP	<fs af='ko,psp,,,,,,' posn='20' drel='lwg__psp:bqhaspawivAra' chunkType='child:NP' name='ko'>
3	jZI	NNP	<fs af='jI,n,m,sg,3,o,0_meM,0' drel='k7:hue' posn='30' vpos='vib_2' name='jZI' chunkId='NP2' chunkType='head:NP2'>
4	meM	PSP	<fs af='meM,psp,,,,,,' posn='40' drel='lwg__psp:jZI' chunkType='child:NP2' name='meM'>
5	SurU	NN	<fs af='SurU,n,m,sg,3,d,0,0' drel='pof:hue' posn='50' name='SurU' chunkId='NP3' chunkType='head:NP3'>
6	hue	VM	<fs af='ho,v,m,sg,any,,eM,eM' drel='nmod__k1inv:mahowsava' posn='60' name='hue' chunkId='VGNF' chunkType='head:VGNF'>
7	��veM	XC	<fs af='��veM,n,m,sg,3,d,0,0' posn='70' drel='mod:mahowsava' chunkType='child:NP4' name='��veM'>
8	aMwarrARtrIya	XC	<fs af='aMwarrARtrIya,n,m,sg,3,d,0,0' posn='80' drel='mod:mahowsava' chunkType='child:NP4' name='aMwarrARtrIya'>
9	Pilma	XC	<fs af='Pilma,n,f,sg,3,d,0,0' posn='90' drel='mod:mahowsava' chunkType='child:NP4' name='Pilma'>
10	mahowsava	NNP	<fs af='mahowsava,n,m,sg,,o,0_kA,0' drel='r6:raMga' posn='100' vpos='vib_5' name='mahowsava' chunkId='NP4' chunkType='head:NP4'>
11	ke	PSP	<fs af='kA,psp,m,sg,,o,,' posn='110' drel='lwg__psp:mahowsava' chunkType='child:NP4' name='ke'>
12	raMga	NN	<fs af='raMga,n,m,sg,3,o,0_meM,0' drel='k7:padZA' posn='120' vpos='vib_2' name='raMga' chunkId='NP5' chunkType='head:NP5'>
13	meM	PSP	<fs af='meM,psp,,,,,,' posn='130' drel='lwg__psp:raMga' chunkType='child:NP5' name='meM2'>
14	BaMga	JJ	<fs af='BaMga,adj,any,any,,any,,' drel='pof:padZA' posn='140' name='BaMga' chunkId='JJP' chunkType='head:JJP'>
15	usa	DEM	<fs af='vaha,pn,any,sg,3,o,,' posn='150' drel='nmod__adj:samaya' chunkType='child:NP6' name='usa'>
16	samaya	NN	<fs af='samaya,n,any,sg,3,d,0,0' drel='k7t:padZA' posn='160' name='samaya' chunkId='NP6' chunkType='head:NP6'>
17	padZA	VM	<fs af='pada,v,any,any,any,,yA,yA' stype='declarative' posn='170' voicetype='active' name='padZA' chunkId='VGF' chunkType='head:VGF'>
18	jaba	PRP	<fs af='jaba,pn,,,,,,' drel='k7t:kiyA' posn='180' coref='samaya' name='jaba' chunkId='NP7' chunkType='head:NP7'>
19	vahAM	PRP	<fs af='vahAz,pn,,,,,0_para,' drel='jjmod:wEnAwa' posn='190' vpos='vib_2' name='vahAM' chunkId='NP8' chunkType='head:NP8'>
20	para	PSP	<fs af='para,psp,,,,,,' posn='200' drel='lwg__psp:vahAM' chunkType='child:NP8' name='para'>
21	wEnAwa	JJ	<fs af='wEnAwa,adj,any,any,,o,,' drel='nmod:surakRAkarmiyoM' posn='210' name='wEnAwa' chunkId='JJP2' chunkType='head:JJP2'>
22	surakRAkarmiyoM	NN	<fs af='surakRAkarmI,n,m,pl,3,o,0_ne,0' drel='k1:kiyA' posn='220' vpos='vib_2' name='surakRAkarmiyoM' chunkId='NP9' chunkType='head:NP9'>
23	ne	PSP	<fs af='ne,psp,,,,,,' posn='230' drel='lwg__psp:surakRAkarmiyoM' chunkType='child:NP9' name='ne'>
24	bOYlIvuda	NN	<fs af='bOYlIvuda,n,m,sg,3,o,0_kA,0' drel='r6:basu' posn='240' vpos='vib_2' name='bOYlIvuda' chunkId='NP10' chunkType='head:NP10'>
25	kI	PSP	<fs af='kA,psp,f,sg,,o,,' posn='250' drel='lwg__psp:bOYlIvuda' chunkType='child:NP10' name='kI'>
26	aBinewrI	NN	<fs af='aBinewrI,n,f,sg,3,o,0,0' posn='260' drel='nmod:bipASA' chunkType='child:NP11' name='aBinewrI'>
27	bipASA	NN	<fs af='bipASA,n,f,sg,3,d,0,0' posn='270' drel='nmod:basu' chunkType='child:NP11' name='bipASA'>
28	basu	NNP	<fs af='basu,n,f,sg,3,o,0_ke_sAWa,0' drel='k2:kiyA' posn='280' vpos='vib_vib_vib_4_5' name='basu' chunkId='NP11' chunkType='head:NP11'>
29	ke	PSP	<fs af='ke,psp,,,,,,' posn='290' drel='lwg__psp:basu' chunkType='child:NP11' name='ke2'>
30	sAWa	NST	<fs af='sAWa,nst,m,sg,3,d,,' posn='300' drel='lwg__psp:basu' chunkType='child:NP11' name='sAWa'>
31	xuvyarvahAra	NN	<fs af='xuvyarvahAra,n,m,sg,3,d,0,0' drel='pof:kiyA' posn='310' name='xuvyarvahAra' chunkId='NP12' chunkType='head:NP12'>
32	kiyA	VM	<fs af='kara,v,m,sg,any,,yA,yA' drel='nmod__relc:samaya' stype='declarative' posn='320' voicetype='active' name='kiyA' chunkId='VGF2' chunkType='head:VGF2'>
33	.	SYM	<fs af='.,punc,,,,,,' posn='330' drel='rsym:kiyA' chunkType='child:VGF2' name='.'>
</Sentence>
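
The token lines of the sample above can be read with a few lines of Python. This is a sketch under assumptions: the function name parse_ssf_token is made up, the names given to the eight comma-separated af fields (lex, cat, gend, num, pers, case, vib, tam) are taken from the CoNLL conversion shown on this page, and the naive split on commas will misread an af value whose lemma is itself a comma.

```python
import re

# Names for the comma-separated fields of the af attribute (assumed order).
AF_FIELDS = ("lex", "cat", "gend", "num", "pers", "case", "vib", "tam")
# Matches attribute='value' pairs inside <fs ...>.
ATTR_RE = re.compile(r"([\w-]+)='([^']*)'")


def parse_ssf_token(line):
    """Parse one token line: index, form, POS tag and the <fs ...> structure."""
    index, form, pos, fs = line.split(None, 3)
    attrs = dict(ATTR_RE.findall(fs))
    token = {"id": int(index), "form": form, "pos": pos}
    # Unpack the morphological analysis into named fields.
    token.update(zip(AF_FIELDS, attrs.pop("af", "").split(",")))
    # drel has the shape 'label:name-of-head'; the root token has no drel.
    if "drel" in attrs:
        label, _, head_name = attrs.pop("drel").partition(":")
        token["deprel"] = label
        token["head_name"] = head_name
    token.update(attrs)  # posn, name, chunkId, chunkType, ...
    return token


line = ("1\tbAwa\tNN\t<fs af='bAwa,n,f,sg,3,d,0,0' drel='k1:ho' posn='10' "
        "name='bAwa' chunkId='NP' chunkType='head:NP'>")
print(parse_ssf_token(line))
```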

The same two sentences converted to the CoNLL format, WX characters decoded back to Devanagari in UTF-8:

1 बात बात NN n lex-bAwa|cat-n|gend-f|num-sg|pers-3|case-d|vib-0|tam-0|posn-10|name-bAwa|chunkId-NP|chunkType-head:NP 3 k1 _ _
2 गलत गलत JJ adj lex-galawa|cat-adj|gend-any|num-any|pers-|case-any|vib-|tam-|posn-20|name-galawa|chunkId-JJP|chunkType-head:JJP 3 k1s _ _
3 हो हो VM v lex-ho|cat-v|gend-any|num-any|pers-any|case-|vib-0|tam-0|stype-declarative|posn-30|voicetype-active|name-ho|chunkId-VGF|chunkType-head:VGF 11 vmod _ _
4 तो तो CC avy lex-wo|cat-avy|gend-|num-|pers-|case-|vib-|tam-|posn-40|name-wo|chunkId-CCP|chunkType-head:CCP 0 main _ _
5 गुस्सा गुस्सा NN n lex-gussA|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-50|name-gussA|chunkId-NP2|chunkType-head:NP2 9 pof _ _
6 सेलेब्रिटिज सेलेब्रिटिज NN unk lex-selebritija|cat-unk|gend-|num-|pers-|case-|vib-0_ko|tam-|posn-60|vpos-vib_2_RP|name-selebritija|chunkId-NP3|chunkType-head:NP3 9 k4a _ _
7 को को PSP psp lex-ko|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-70|chunkType-child:NP3|name-ko 6 lwg__psp _ _
8 भी भी RP avy lex-BI|cat-avy|gend-|num-|pers-|case-|vib-|tam-|posn-80|chunkType-child:NP3|name-BI 6 lwg__rp _ _
9 आना VM v lex-A|cat-v|gend-any|num-any|pers-any|case-d|vib-nA|tam-nA|posn-90|name-AnA|chunkId-VGNN|chunkType-head:VGNN 11 k1 _ _
10 लाजमी लाजमी JJ adj lex-lAjamI|cat-adj|gend-any|num-any|pers-|case-|vib-|tam-|posn-100|name-lAjamI|chunkId-JJP2|chunkType-head:JJP2 11 pof _ _
11 है है VM v lex-hE|cat-v|gend-any|num-sg|pers-3|case-|vib-hE|tam-hE|stype-declarative|posn-110|voicetype-active|name-hE|chunkId-VGF2|chunkType-head:VGF2 4 ccof _ _
12 . . SYM punc lex-.|cat-punc|gend-|num-|pers-|case-|vib-|tam-|posn-120|chunkType-child:VGF2|name-. 11 rsym _ _

1 बृहस्पतिवार बृहस्पतिवार NNP n lex-bqhaspawivAra|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_ko|tam-0|posn-10|vpos-vib_2|name-bqhaspawivAra|chunkId-NP|chunkType-head:NP 6 k7t _ _
2 को को PSP psp lex-ko|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-20|chunkType-child:NP|name-ko 1 lwg__psp _ _
3 ज़ी जी NNP n lex-jI|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_meM|tam-0|posn-30|vpos-vib_2|name-jZI|chunkId-NP2|chunkType-head:NP2 6 k7 _ _
4 में में PSP psp lex-meM|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-40|chunkType-child:NP2|name-meM 3 lwg__psp _ _
5 शुरू शुरू NN n lex-SurU|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-50|name-SurU|chunkId-NP3|chunkType-head:NP3 6 pof _ _
6 हुए हो VM v lex-ho|cat-v|gend-m|num-sg|pers-any|case-|vib-eM|tam-eM|posn-60|name-hue|chunkId-VGNF|chunkType-head:VGNF 10 nmod__k1inv _ _
7 ��वें ��वें XC n lex-��veM|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-70|chunkType-child:NP4|name-��veM 10 mod _ _
8 अंतर्राष्ट्रीय अंतर्राष्ट्रीय XC n lex-aMwarrARtrIya|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-80|chunkType-child:NP4|name-aMwarrARtrIya 10 mod _ _
9 फिल्म फिल्म XC n lex-Pilma|cat-n|gend-f|num-sg|pers-3|case-d|vib-0|tam-0|posn-90|chunkType-child:NP4|name-Pilma 10 mod _ _
10 महोत्सव महोत्सव NNP n lex-mahowsava|cat-n|gend-m|num-sg|pers-|case-o|vib-0_kA|tam-0|posn-100|vpos-vib_5|name-mahowsava|chunkId-NP4|chunkType-head:NP4 12 r6 _ _
11 के का PSP psp lex-kA|cat-psp|gend-m|num-sg|pers-|case-o|vib-|tam-|posn-110|chunkType-child:NP4|name-ke 10 lwg__psp _ _
12 रंग रंग NN n lex-raMga|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_meM|tam-0|posn-120|vpos-vib_2|name-raMga|chunkId-NP5|chunkType-head:NP5 17 k7 _ _
13 में में PSP psp lex-meM|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-130|chunkType-child:NP5|name-meM2 12 lwg__psp _ _
14 भंग भंग JJ adj lex-BaMga|cat-adj|gend-any|num-any|pers-|case-any|vib-|tam-|posn-140|name-BaMga|chunkId-JJP|chunkType-head:JJP 17 pof _ _
15 उस वह DEM pn lex-vaha|cat-pn|gend-any|num-sg|pers-3|case-o|vib-|tam-|posn-150|chunkType-child:NP6|name-usa 16 nmod__adj _ _
16 समय समय NN n lex-samaya|cat-n|gend-any|num-sg|pers-3|case-d|vib-0|tam-0|posn-160|name-samaya|chunkId-NP6|chunkType-head:NP6 17 k7t _ _
17 पड़ा पड VM v lex-pada|cat-v|gend-any|num-any|pers-any|case-|vib-yA|tam-yA|stype-declarative|posn-170|voicetype-active|name-padZA|chunkId-VGF|chunkType-head:VGF 0 main _ _
18 जब जब PRP pn lex-jaba|cat-pn|gend-|num-|pers-|case-|vib-|tam-|posn-180|coref-samaya|name-jaba|chunkId-NP7|chunkType-head:NP7 32 k7t _ _
19 वहां वहाँ PRP pn lex-vahAz|cat-pn|gend-|num-|pers-|case-|vib-0_para|tam-|posn-190|vpos-vib_2|name-vahAM|chunkId-NP8|chunkType-head:NP8 21 jjmod _ _
20 पर पर PSP psp lex-para|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-200|chunkType-child:NP8|name-para 19 lwg__psp _ _
21 तैनात तैनात JJ adj lex-wEnAwa|cat-adj|gend-any|num-any|pers-|case-o|vib-|tam-|posn-210|name-wEnAwa|chunkId-JJP2|chunkType-head:JJP2 22 nmod _ _
22 सुरक्षाकर्मियों सुरक्षाकर्मी NN n lex-surakRAkarmI|cat-n|gend-m|num-pl|pers-3|case-o|vib-0_ne|tam-0|posn-220|vpos-vib_2|name-surakRAkarmiyoM|chunkId-NP9|chunkType-head:NP9 32 k1 _ _
23 ने ने PSP psp lex-ne|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-230|chunkType-child:NP9|name-ne 22 lwg__psp _ _
24 बॉलीवुड बॉलीवुड NN n lex-bOYlIvuda|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_kA|tam-0|posn-240|vpos-vib_2|name-bOYlIvuda|chunkId-NP10|chunkType-head:NP10 28 r6 _ _
25 की का PSP psp lex-kA|cat-psp|gend-f|num-sg|pers-|case-o|vib-|tam-|posn-250|chunkType-child:NP10|name-kI 24 lwg__psp _ _
26 अभिनेत्री अभिनेत्री NN n lex-aBinewrI|cat-n|gend-f|num-sg|pers-3|case-o|vib-0|tam-0|posn-260|chunkType-child:NP11|name-aBinewrI 27 nmod _ _
27 बिपाशा बिपाशा NN n lex-bipASA|cat-n|gend-f|num-sg|pers-3|case-d|vib-0|tam-0|posn-270|chunkType-child:NP11|name-bipASA 28 nmod _ _
28 बसु बसु NNP n lex-basu|cat-n|gend-f|num-sg|pers-3|case-o|vib-0_ke_sAWa|tam-0|posn-280|vpos-vib_vib_vib_4_5|name-basu|chunkId-NP11|chunkType-head:NP11 32 k2 _ _
29 के के PSP psp lex-ke|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-290|chunkType-child:NP11|name-ke2 28 lwg__psp _ _
30 साथ साथ NST nst lex-sAWa|cat-nst|gend-m|num-sg|pers-3|case-d|vib-|tam-|posn-300|chunkType-child:NP11|name-sAWa 28 lwg__psp _ _
31 दुव्यर्वहार दुव्यर्वहार NN n lex-xuvyarvahAra|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-310|name-xuvyarvahAra|chunkId-NP12|chunkType-head:NP12 32 pof _ _
32 किया कर VM v lex-kara|cat-v|gend-m|num-sg|pers-any|case-|vib-yA|tam-yA|stype-declarative|posn-320|voicetype-active|name-kiyA|chunkId-VGF2|chunkType-head:VGF2 16 nmod__relc _ _
33 . . SYM punc lex-.|cat-punc|gend-|num-|pers-|case-|vib-|tam-|posn-330|chunkType-child:VGF2|name-. 32 rsym _ _
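
Comparing the two formats, the drel attribute of the Shakti format names the head by its name attribute, while the CoNLL format gives the head as a token index, with head 0 and the label main for the token that has no drel. The sketch below illustrates that mapping. It is not the conversion script that was actually used; the function name resolve_heads is made up, and it assumes that names are unique within a sentence, which the samples suggest (meM2, ke2).

```python
def resolve_heads(tokens):
    """Map SSF drel values ('label:head_name') to (id, head_id, label) rows.

    tokens is a list of dicts with a 'name' key and an optional 'drel' key,
    given in sentence order.
    """
    index_of = {tok["name"]: i for i, tok in enumerate(tokens, start=1)}
    rows = []
    for i, tok in enumerate(tokens, start=1):
        drel = tok.get("drel")
        if drel is None:
            rows.append((i, 0, "main"))  # no drel: attach to the root
        else:
            label, _, head_name = drel.partition(":")
            # raises KeyError if the named head is missing from the sentence
            rows.append((i, index_of[head_name], label))
    return rows


# Reduced version of the first sentence (five of its twelve tokens),
# so the indices differ from the full CoNLL sample.
tokens = [
    {"name": "bAwa", "drel": "k1:ho"},
    {"name": "galawa", "drel": "k1s:ho"},
    {"name": "ho", "drel": "vmod:hE"},
    {"name": "wo"},
    {"name": "hE", "drel": "ccof:wo"},
]
for row in resolve_heads(tokens):
    print(row)
```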

The first sentence of the ICON 2010 development data (with fine-grained syntactic tags) in the Shakti format:

<document docid="fullnews_id_2489467">
<head>
	<caption>jela meM svasWa hE sarabajIwa xo BArawIya aXikAriyoM ne mulAkAwa kI pre isalAmAbAxa.</caption>
	<language>Hindi </language>
	<domain_name>News Articles </domain_name>
	<word_count>524</word_count>
	<byte_count>64554</byte_count>
	<availability>
		<format>CML/SSF</format>
	<sentence_marker>.</sentence_marker>
	<normalization>No</normalization>
	</availability>
	<encoding_description>
		<original_encoding>ISO 8859</format>
		<new_encoding>Unicode UTF8</new_encoding>
	</encoding_description>
	<distributor>LTRC, IIIT Hyderabad</distributor>
	<project_description>NSF Hindi/Urdu Dependency Treebanking Project</place>
	<creation>
		</raw_corpus creation_date="" institute_name="IIIT Hyderabad">
		</annotated_corpus creation_date="06/01/2009" institute_name="IIIT Hyderabad">
	<edition_number>1.0</edition_number>
	</creation>
	<publication>
		<place>New Delhi</place>
		<date>30/5/2004</date>
		<type>Newspaper</type>
		<publisher>
			<name>Amar Ujala</name>
			<url>http://www.amarujala.com</url>
		</publisher>
	</publication>
 
<annotated-resource name="HyDT-Hindi" version="2.0" type="dep-words" layers="morph,pos,chunk,dep-word" language="hin" date-of-release="20100831">
    <annotation-standard>
        <morph-standard name="Anncorra-morph" version="1.31" date="20080920" />
        <pos-standard name="Anncorra-pos" version="" date="20061215" />
        <chunk-standard name="Anncorra-chunk" version="" date="20061215" />
        <intrachunk-dependency-standard name="Anncorra-intrachunk-dep" version="1.0" date="" dep-tagset-granularity="5" />
        <dependency-standard name="Anncorra-dep" version="2.0" date="" dep-tagset-granularity="6" />
    </annotation-standard>
</annotated-resource>
</head>
<body>
<tb number="1" segment="no" bullet="no">
<foreign language="select" writingsystem="LTR"></foreign>
<text>
<Sentence id="1">
1	kota	XC	<fs af='kota,n,m,sg,3,d,0,0' posn='10' drel='mod:lAhOra' chunkType='child:NP' name='kota'>
2	laKapawa	XC	<fs af='laKapawa,n,m,sg,3,d,0,0' posn='20' drel='mod:lAhOra' chunkType='child:NP' name='laKapawa'>
3	jela	XC	<fs af='jela,n,m,sg,3,d,0,0' posn='30' drel='mod:lAhOra' chunkType='child:NP' name='jela'>
4	lAhOra	NNP	<fs af='lAhOra,n,m,sg,3,o,0_meM,0' drel='jjmod:baMxa' posn='40' vpos='vib_5' name='lAhOra' chunkId='NP' chunkType='head:NP'>
5	meM	PSP	<fs af='meM,psp,,,,,,' posn='50' drel='lwg__psp:lAhOra' chunkType='child:NP' name='meM'>
6	baMxa	JJ	<fs af='baMxa,adj,any,any,,o,,' drel='nmod:siMha' posn='60' name='baMxa' chunkId='JJP' chunkType='head:JJP'>
7	sarabajIwa	XC	<fs af='sarabajIwa,n,m,sg,3,d,0,0' posn='70' drel='mod:siMha' chunkType='child:NP2' name='sarabajIwa'>
8	siMha	NNP	<fs af='siMha,n,m,sg,3,o,0_ne,0' drel='k1:xIM' posn='80' vpos='vib_3' name='siMha' chunkId='NP2' chunkType='head:NP2'>
9	ne	PSP	<fs af='ne,psp,,,,,,' posn='90' drel='lwg__psp:siMha' chunkType='child:NP2' name='ne'>
10	maMgalavAra	NNP	<fs af='maMgalavAra,n,m,sg,3,o,0_ko,0' drel='k7t:xIM' posn='100' vpos='vib_2' name='maMgalavAra' chunkId='NP3' chunkType='head:NP3'>
11	ko	PSP	<fs af='ko,psp,,,,,,' posn='110' drel='lwg__psp:maMgalavAra' chunkType='child:NP3' name='ko'>
12	BArawIya	JJ	<fs af='BArawIya,adj,any,any,,o,,' posn='120' drel='nmod__adj:xUwAvAsa' chunkType='child:NP4' name='BArawIya'>
13	xUwAvAsa	NN	<fs af='xUwAvAsa,n,m,sg,3,o,0_kA,0' drel='r6:aXikAriyoM' posn='130' vpos='vib_3' name='xUwAvAsa' chunkId='NP4' chunkType='head:NP4'>
14	ke	PSP	<fs af='kA,psp,m,pl,,o,,' posn='140' drel='lwg__psp:xUwAvAsa' chunkType='child:NP4' name='ke'>
15	xo	QC	<fs af='xo,num,any,pl,,o,,' posn='150' drel='nmod__adj:aXikAriyoM' chunkType='child:NP5' name='xo'>
16	aXikAriyoM	NN	<fs af='aXikArI,n,m,pl,3,o,0_ko,0' drel='k4:xIM' posn='160' vpos='vib_3' name='aXikAriyoM' chunkId='NP5' chunkType='head:NP5'>
17	ko	PSP	<fs af='ko,psp,,,,,,' posn='170' drel='lwg__psp:aXikAriyoM' chunkType='child:NP5' name='ko2'>
18	apane	PRP	<fs af='apanA,pn,any,sg,1,o,0_bAre_meM,0' drel='k7:xIM' posn='180' vpos='vib_2_3' name='apane' chunkId='NP6' chunkType='head:NP6'>
19	bAre	PSP	<fs af='bAre,psp,,,,,,' posn='190' drel='lwg__psp:apane' chunkType='child:NP6' name='bAre'>
20	meM	PSP	<fs af='meM,psp,,,,,,' posn='200' drel='lwg__psp:apane' chunkType='child:NP6' name='meM2'>
21	wamAma	JJ	<fs af='wamAma,adj,any,any,,d,,' posn='210' drel='nmod__adj:jAnakAriyAM' chunkType='child:NP7' name='wamAma'>
22	vyakwigawa	JJ	<fs af='vyakwigawa,adj,any,any,,d,,' posn='220' drel='nmod__adj:jAnakAriyAM' chunkType='child:NP7' name='vyakwigawa'>
23	jAnakAriyAM	NN	<fs af='jAnakAriyAM,n,f,pl,3,d,0,0' drel='k2:xIM' posn='230' name='jAnakAriyAM' chunkId='NP7' chunkType='head:NP7'>
24	xIM	VM	<fs af='xe,v,f,pl,3,,yA,yA' stype='declarative' posn='240' voicetype='active' name='xIM' chunkId='VGF' chunkType='head:VGF'>
25	ki	CC	<fs af='ki,avy,,,,,,' drel='rs:jAnakAriyAM' posn='250' name='ki' chunkId='CCP' chunkType='head:CCP'>
26	kina	WQ	<fs af='kOna,pn,any,pl,3,o,,' posn='260' drel='mod__wq:parisWiwiyoM' chunkType='child:NP8' name='kina'>
27	parisWiwiyoM	NN	<fs af='parisWiwi,n,f,pl,3,o,0_meM,0' drel='k7:kiyA' posn='270' vpos='vib_3' name='parisWiwiyoM' chunkId='NP8' chunkType='head:NP8'>
28	meM	PSP	<fs af='meM,psp,,,,,,' posn='280' drel='lwg__psp:parisWiwiyoM' chunkType='child:NP8' name='meM3'>
29	use	PRP	<fs af='vaha,pn,any,sg,3,o,ko,ko' drel='k2:kiyA' posn='290' name='use' chunkId='NP9' chunkType='head:NP9'>
30	giraPwAra	JJ	<fs af='giraPwAra,adj,any,any,,,,' drel='pof:kiyA' posn='300' name='giraPwAra' chunkId='JJP2' chunkType='head:JJP2'>
31	kiyA	VM	<fs af='kara,v,m,sg,3,,yA_jA+yA�,yA' drel='ccof:Ora' stype='declarative' posn='310' voicetype='passive' vpos='tam_2' name='kiyA' chunkId='VGF2' chunkType='head:VGF2'>
32	gayA	VAUX	<fs af='jA,v,m,sg,3,,yA�,yA1' posn='320' drel='lwg__vaux:kiyA' chunkType='child:VGF2' name='gayA'>
33	,	SYM	<fs af=',s,punc,,,,,' posn='330' drel='rsym:kiyA' chunkType='child:VGF2' name=','>
34	mukaxamA	NN	<fs af='mukaxamA,n,m,sg,3,d,0,0' drel='k1:calA' posn='340' name='mukaxamA' chunkId='NP10' chunkType='head:NP10'>
35	calA	VM	<fs af='cala,v,m,sg,3,,yA,yA' hlt='true' drel='ccof:Ora' stype='declarative' posn='350' voicetype='active' name='calA' chunkId='VGF3' chunkType='head:VGF3'>
36	Ora	CC	<fs af='Ora,avy,,,,,,' drel='ccof:ki' posn='360' name='Ora' chunkId='CCP2' chunkType='head:CCP2'>
37	sajA	NN	<fs af='sajA,n,f,sg,3,d,0,0' drel='k1:huI' posn='370' name='sajA' chunkId='NP11' chunkType='head:NP11'>
38	huI	VM	<fs af='ho,v,f,sg,3,,yA,yA' drel='ccof:Ora' stype='declarative' posn='380' voicetype='active' name='huI' chunkId='VGF4' chunkType='head:VGF4'>
39	.	SYM	<fs af='.,punc,,,,,,' posn='390' drel='rsym:huI' chunkType='child:VGF4' name='.'>
</Sentence>

And in the CoNLL format:

1 kota kota XC n lex-kota|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-10|chunkType-child:NP|name-kota 4 mod _ _
2 laKapawa laKapawa XC n lex-laKapawa|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-20|chunkType-child:NP|name-laKapawa 4 mod _ _
3 jela jela XC n lex-jela|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-30|chunkType-child:NP|name-jela 4 mod _ _
4 lAhOra lAhOra NNP n lex-lAhOra|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_meM|tam-0|posn-40|vpos-vib_5|name-lAhOra|chunkId-NP|chunkType-head:NP 6 jjmod _ _
5 meM meM PSP psp lex-meM|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-50|chunkType-child:NP|name-meM 4 lwg__psp _ _
6 baMxa baMxa JJ adj lex-baMxa|cat-adj|gend-any|num-any|pers-|case-o|vib-|tam-|posn-60|name-baMxa|chunkId-JJP|chunkType-head:JJP 8 nmod _ _
7 sarabajIwa sarabajIwa XC n lex-sarabajIwa|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-70|chunkType-child:NP2|name-sarabajIwa 8 mod _ _
8 siMha siMha NNP n lex-siMha|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_ne|tam-0|posn-80|vpos-vib_3|name-siMha|chunkId-NP2|chunkType-head:NP2 24 k1 _ _
9 ne ne PSP psp lex-ne|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-90|chunkType-child:NP2|name-ne 8 lwg__psp _ _
10 maMgalavAra maMgalavAra NNP n lex-maMgalavAra|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_ko|tam-0|posn-100|vpos-vib_2|name-maMgalavAra|chunkId-NP3|chunkType-head:NP3 24 k7t _ _
11 ko ko PSP psp lex-ko|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-110|chunkType-child:NP3|name-ko 10 lwg__psp _ _
12 BArawIya BArawIya JJ adj lex-BArawIya|cat-adj|gend-any|num-any|pers-|case-o|vib-|tam-|posn-120|chunkType-child:NP4|name-BArawIya 13 nmod__adj _ _
13 xUwAvAsa xUwAvAsa NN n lex-xUwAvAsa|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_kA|tam-0|posn-130|vpos-vib_3|name-xUwAvAsa|chunkId-NP4|chunkType-head:NP4 16 r6 _ _
14 ke kA PSP psp lex-kA|cat-psp|gend-m|num-pl|pers-|case-o|vib-|tam-|posn-140|chunkType-child:NP4|name-ke 13 lwg__psp _ _
15 xo xo QC num lex-xo|cat-num|gend-any|num-pl|pers-|case-o|vib-|tam-|posn-150|chunkType-child:NP5|name-xo 16 nmod__adj _ _
16 aXikAriyoM aXikArI NN n lex-aXikArI|cat-n|gend-m|num-pl|pers-3|case-o|vib-0_ko|tam-0|posn-160|vpos-vib_3|name-aXikAriyoM|chunkId-NP5|chunkType-head:NP5 24 k4 _ _
17 ko ko PSP psp lex-ko|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-170|chunkType-child:NP5|name-ko2 16 lwg__psp _ _
18 apane apanA PRP pn lex-apanA|cat-pn|gend-any|num-sg|pers-1|case-o|vib-0_bAre_meM|tam-0|posn-180|vpos-vib_2_3|name-apane|chunkId-NP6|chunkType-head:NP6 24 k7 _ _
19 bAre bAre PSP psp lex-bAre|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-190|chunkType-child:NP6|name-bAre 18 lwg__psp _ _
20 meM meM PSP psp lex-meM|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-200|chunkType-child:NP6|name-meM2 18 lwg__psp _ _
21 wamAma wamAma JJ adj lex-wamAma|cat-adj|gend-any|num-any|pers-|case-d|vib-|tam-|posn-210|chunkType-child:NP7|name-wamAma 23 nmod__adj _ _
22 vyakwigawa vyakwigawa JJ adj lex-vyakwigawa|cat-adj|gend-any|num-any|pers-|case-d|vib-|tam-|posn-220|chunkType-child:NP7|name-vyakwigawa 23 nmod__adj _ _
23 jAnakAriyAM jAnakAriyAM NN n lex-jAnakAriyAM|cat-n|gend-f|num-pl|pers-3|case-d|vib-0|tam-0|posn-230|name-jAnakAriyAM|chunkId-NP7|chunkType-head:NP7 24 k2 _ _
24 xIM xe VM v lex-xe|cat-v|gend-f|num-pl|pers-3|case-|vib-yA|tam-yA|stype-declarative|posn-240|voicetype-active|name-xIM|chunkId-VGF|chunkType-head:VGF 0 main _ _
25 ki ki CC avy lex-ki|cat-avy|gend-|num-|pers-|case-|vib-|tam-|posn-250|name-ki|chunkId-CCP|chunkType-head:CCP 23 rs _ _
26 kina kOna WQ pn lex-kOna|cat-pn|gend-any|num-pl|pers-3|case-o|vib-|tam-|posn-260|chunkType-child:NP8|name-kina 27 mod__wq _ _
27 parisWiwiyoM parisWiwi NN n lex-parisWiwi|cat-n|gend-f|num-pl|pers-3|case-o|vib-0_meM|tam-0|posn-270|vpos-vib_3|name-parisWiwiyoM|chunkId-NP8|chunkType-head:NP8 31 k7 _ _
28 meM meM PSP psp lex-meM|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-280|chunkType-child:NP8|name-meM3 27 lwg__psp _ _
29 use vaha PRP pn lex-vaha|cat-pn|gend-any|num-sg|pers-3|case-o|vib-ko|tam-ko|posn-290|name-use|chunkId-NP9|chunkType-head:NP9 31 k2 _ _
30 giraPwAra giraPwAra JJ adj lex-giraPwAra|cat-adj|gend-any|num-any|pers-|case-|vib-|tam-|posn-300|name-giraPwAra|chunkId-JJP2|chunkType-head:JJP2 31 pof _ _
31 kiyA kara VM v lex-kara|cat-v|gend-m|num-sg|pers-3|case-|vib-yA_jA+yA�|tam-yA|stype-declarative|posn-310|voicetype-passive|vpos-tam_2|name-kiyA|chunkId-VGF2|chunkType-head:VGF2 36 ccof _ _
32 gayA jA VAUX v lex-jA|cat-v|gend-m|num-sg|pers-3|case-|vib-yA�|tam-yA1|posn-320|chunkType-child:VGF2|name-gayA 31 lwg__vaux _ _
33 , , SYM s lex-|cat-s|gend-punc|num-|pers-|case-|vib-|tam-|posn-330|chunkType-child:VGF2|name-, 31 rsym _ _
34 mukaxamA mukaxamA NN n lex-mukaxamA|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-340|name-mukaxamA|chunkId-NP10|chunkType-head:NP10 35 k1 _ _
35 calA cala VM v lex-cala|cat-v|gend-m|num-sg|pers-3|case-|vib-yA|tam-yA|hlt-true|stype-declarative|posn-350|voicetype-active|name-calA|chunkId-VGF3|chunkType-head:VGF3 36 ccof _ _
36 Ora Ora CC avy lex-Ora|cat-avy|gend-|num-|pers-|case-|vib-|tam-|posn-360|name-Ora|chunkId-CCP2|chunkType-head:CCP2 25 ccof _ _
37 sajA sajA NN n lex-sajA|cat-n|gend-f|num-sg|pers-3|case-d|vib-0|tam-0|posn-370|name-sajA|chunkId-NP11|chunkType-head:NP11 38 k1 _ _
38 huI ho VM v lex-ho|cat-v|gend-f|num-sg|pers-3|case-|vib-yA|tam-yA|stype-declarative|posn-380|voicetype-active|name-huI|chunkId-VGF4|chunkType-head:VGF4 36 ccof _ _
39 . . SYM punc lex-.|cat-punc|gend-|num-|pers-|case-|vib-|tam-|posn-390|chunkType-child:VGF4|name-. 38 rsym _ _
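
The sixth column of the lines above packs all Shakti attributes into key-value pairs joined by hyphens and separated by vertical bars. A minimal sketch for unpacking it follows. The function names are illustrative, and the column names assume the usual ten-column CoNLL-X layout, which matches the sample. Values may themselves contain hyphens or be empty, so each pair is split at the first hyphen only.

```python
CONLL_COLUMNS = ("id", "form", "lemma", "cpostag", "postag",
                 "feats", "head", "deprel", "phead", "pdeprel")


def parse_feats(feats):
    """Turn 'lex-kota|cat-n|...' into {'lex': 'kota', 'cat': 'n', ...}."""
    result = {}
    for item in feats.split("|"):
        key, _, value = item.partition("-")  # split at the first hyphen only
        result[key] = value
    return result


def parse_conll_line(line):
    """Split one token line into the ten named columns."""
    values = line.split()
    if len(values) != len(CONLL_COLUMNS):
        raise ValueError(f"expected 10 columns, got {len(values)}")
    token = dict(zip(CONLL_COLUMNS, values))
    token["id"] = int(token["id"])
    token["head"] = int(token["head"])
    token["feats"] = parse_feats(token["feats"])
    return token


line = ("1 kota kota XC n lex-kota|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|"
        "tam-0|posn-10|chunkType-child:NP|name-kota 4 mod _ _")
token = parse_conll_line(line)
print(token["form"], token["head"], token["deprel"], token["feats"]["chunkType"])
```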

And after conversion of the WX encoding to the Devanagari script in UTF-8:

1 कोट कोट XC n lex-kota|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-10|chunkType-child:NP|name-kota 4 mod _ _
2 लखपत लखपत XC n lex-laKapawa|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-20|chunkType-child:NP|name-laKapawa 4 mod _ _
3 जेल जेल XC n lex-jela|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-30|chunkType-child:NP|name-jela 4 mod _ _
4 लाहौर लाहौर NNP n lex-lAhOra|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_meM|tam-0|posn-40|vpos-vib_5|name-lAhOra|chunkId-NP|chunkType-head:NP 6 jjmod _ _
5 में में PSP psp lex-meM|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-50|chunkType-child:NP|name-meM 4 lwg__psp _ _
6 बंद बंद JJ adj lex-baMxa|cat-adj|gend-any|num-any|pers-|case-o|vib-|tam-|posn-60|name-baMxa|chunkId-JJP|chunkType-head:JJP 8 nmod _ _
7 सरबजीत सरबजीत XC n lex-sarabajIwa|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-70|chunkType-child:NP2|name-sarabajIwa 8 mod _ _
8 सिंह सिंह NNP n lex-siMha|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_ne|tam-0|posn-80|vpos-vib_3|name-siMha|chunkId-NP2|chunkType-head:NP2 24 k1 _ _
9 ने ने PSP psp lex-ne|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-90|chunkType-child:NP2|name-ne 8 lwg__psp _ _
10 मंगलवार मंगलवार NNP n lex-maMgalavAra|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_ko|tam-0|posn-100|vpos-vib_2|name-maMgalavAra|chunkId-NP3|chunkType-head:NP3 24 k7t _ _
11 को को PSP psp lex-ko|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-110|chunkType-child:NP3|name-ko 10 lwg__psp _ _
12 भारतीय भारतीय JJ adj lex-BArawIya|cat-adj|gend-any|num-any|pers-|case-o|vib-|tam-|posn-120|chunkType-child:NP4|name-BArawIya 13 nmod__adj _ _
13 दूतावास दूतावास NN n lex-xUwAvAsa|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_kA|tam-0|posn-130|vpos-vib_3|name-xUwAvAsa|chunkId-NP4|chunkType-head:NP4 16 r6 _ _
14 के का PSP psp lex-kA|cat-psp|gend-m|num-pl|pers-|case-o|vib-|tam-|posn-140|chunkType-child:NP4|name-ke 13 lwg__psp _ _
15 दो दो QC num lex-xo|cat-num|gend-any|num-pl|pers-|case-o|vib-|tam-|posn-150|chunkType-child:NP5|name-xo 16 nmod__adj _ _
16 अधिकारियों अधिकारी NN n lex-aXikArI|cat-n|gend-m|num-pl|pers-3|case-o|vib-0_ko|tam-0|posn-160|vpos-vib_3|name-aXikAriyoM|chunkId-NP5|chunkType-head:NP5 24 k4 _ _
17 को को PSP psp lex-ko|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-170|chunkType-child:NP5|name-ko2 16 lwg__psp _ _
18 अपने अपना PRP pn lex-apanA|cat-pn|gend-any|num-sg|pers-1|case-o|vib-0_bAre_meM|tam-0|posn-180|vpos-vib_2_3|name-apane|chunkId-NP6|chunkType-head:NP6 24 k7 _ _
19 बारे बारे PSP psp lex-bAre|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-190|chunkType-child:NP6|name-bAre 18 lwg__psp _ _
20 में में PSP psp lex-meM|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-200|chunkType-child:NP6|name-meM2 18 lwg__psp _ _
21 तमाम तमाम JJ adj lex-wamAma|cat-adj|gend-any|num-any|pers-|case-d|vib-|tam-|posn-210|chunkType-child:NP7|name-wamAma 23 nmod__adj _ _
22 व्यक्तिगत व्यक्तिगत JJ adj lex-vyakwigawa|cat-adj|gend-any|num-any|pers-|case-d|vib-|tam-|posn-220|chunkType-child:NP7|name-vyakwigawa 23 nmod__adj _ _
23 जानकारियां जानकारियां NN n lex-jAnakAriyAM|cat-n|gend-f|num-pl|pers-3|case-d|vib-0|tam-0|posn-230|name-jAnakAriyAM|chunkId-NP7|chunkType-head:NP7 24 k2 _ _
24 दीं दे VM v lex-xe|cat-v|gend-f|num-pl|pers-3|case-|vib-yA|tam-yA|stype-declarative|posn-240|voicetype-active|name-xIM|chunkId-VGF|chunkType-head:VGF 0 main _ _
25 कि कि CC avy lex-ki|cat-avy|gend-|num-|pers-|case-|vib-|tam-|posn-250|name-ki|chunkId-CCP|chunkType-head:CCP 23 rs _ _
26 किन कौन WQ pn lex-kOna|cat-pn|gend-any|num-pl|pers-3|case-o|vib-|tam-|posn-260|chunkType-child:NP8|name-kina 27 mod__wq _ _
27 परिस्थितियों परिस्थिति NN n lex-parisWiwi|cat-n|gend-f|num-pl|pers-3|case-o|vib-0_meM|tam-0|posn-270|vpos-vib_3|name-parisWiwiyoM|chunkId-NP8|chunkType-head:NP8 31 k7 _ _
28 में में PSP psp lex-meM|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-280|chunkType-child:NP8|name-meM3 27 lwg__psp _ _
29 उसे वह PRP pn lex-vaha|cat-pn|gend-any|num-sg|pers-3|case-o|vib-ko|tam-ko|posn-290|name-use|chunkId-NP9|chunkType-head:NP9 31 k2 _ _
30 गिरफ्तार गिरफ्तार JJ adj lex-giraPwAra|cat-adj|gend-any|num-any|pers-|case-|vib-|tam-|posn-300|name-giraPwAra|chunkId-JJP2|chunkType-head:JJP2 31 pof _ _
31 किया कर VM v lex-kara|cat-v|gend-m|num-sg|pers-3|case-|vib-yA_jA+yA�|tam-yA|stype-declarative|posn-310|voicetype-passive|vpos-tam_2|name-kiyA|chunkId-VGF2|chunkType-head:VGF2 36 ccof _ _
32 गया जा VAUX v lex-jA|cat-v|gend-m|num-sg|pers-3|case-|vib-yA�|tam-yA1|posn-320|chunkType-child:VGF2|name-gayA 31 lwg__vaux _ _
33 , , SYM s lex-|cat-s|gend-punc|num-|pers-|case-|vib-|tam-|posn-330|chunkType-child:VGF2|name-, 31 rsym _ _
34 मुकदमा मुकदमा NN n lex-mukaxamA|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-340|name-mukaxamA|chunkId-NP10|chunkType-head:NP10 35 k1 _ _
35 चला चल VM v lex-cala|cat-v|gend-m|num-sg|pers-3|case-|vib-yA|tam-yA|hlt-true|stype-declarative|posn-350|voicetype-active|name-calA|chunkId-VGF3|chunkType-head:VGF3 36 ccof _ _
36 और और CC avy lex-Ora|cat-avy|gend-|num-|pers-|case-|vib-|tam-|posn-360|name-Ora|chunkId-CCP2|chunkType-head:CCP2 25 ccof _ _
37 सजा सजा NN n lex-sajA|cat-n|gend-f|num-sg|pers-3|case-d|vib-0|tam-0|posn-370|name-sajA|chunkId-NP11|chunkType-head:NP11 38 k1 _ _
38 हुई हो VM v lex-ho|cat-v|gend-f|num-sg|pers-3|case-|vib-yA|tam-yA|stype-declarative|posn-380|voicetype-active|name-huI|chunkId-VGF4|chunkType-head:VGF4 36 ccof _ _
39 . . SYM punc lex-.|cat-punc|gend-|num-|pers-|case-|vib-|tam-|posn-390|chunkType-child:VGF4|name-. 38 rsym _ _
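
The WX-to-Devanagari step can be illustrated with a minimal sketch. The mapping tables below are an assumption based on the commonly documented WX notation and cover only the basic Hindi letters (no nukta consonants, no digits); `wx_to_devanagari` is a hypothetical helper name, not the tool that was used to produce the sample above.

```python
# Minimal WX -> Devanagari sketch; the tables are assumed, not taken from the treebank tools.
CONSONANTS = {
    "k": "क", "K": "ख", "g": "ग", "G": "घ", "f": "ङ",
    "c": "च", "C": "छ", "j": "ज", "J": "झ", "F": "ञ",
    "t": "ट", "T": "ठ", "d": "ड", "D": "ढ", "N": "ण",
    "w": "त", "W": "थ", "x": "द", "X": "ध", "n": "न",
    "p": "प", "P": "फ", "b": "ब", "B": "भ", "m": "म",
    "y": "य", "r": "र", "l": "ल", "v": "व",
    "S": "श", "R": "ष", "s": "स", "h": "ह",
}
# WX vowel -> (independent letter, dependent vowel sign)
VOWELS = {
    "a": ("अ", ""), "A": ("आ", "ा"), "i": ("इ", "ि"), "I": ("ई", "ी"),
    "u": ("उ", "ु"), "U": ("ऊ", "ू"), "q": ("ऋ", "ृ"),
    "e": ("ए", "े"), "E": ("ऐ", "ै"), "o": ("ओ", "ो"), "O": ("औ", "ौ"),
}
MODIFIERS = {"M": "ं", "H": "ः", "z": "ँ"}
VIRAMA = "्"


def wx_to_devanagari(text):
    """Convert a WX-encoded string to Devanagari (basic letters only)."""
    out = []
    pending = False  # True while the last consonant has not received a vowel yet
    for ch in text:
        if ch in CONSONANTS:
            if pending:
                out.append(VIRAMA)  # consonant cluster: suppress the inherent vowel
            out.append(CONSONANTS[ch])
            pending = True
        elif ch in VOWELS:
            independent, sign = VOWELS[ch]
            # After a consonant use the vowel sign ("a" adds nothing), else the full letter.
            out.append(sign if pending else independent)
            pending = False
        else:
            if pending:
                out.append(VIRAMA)
            out.append(MODIFIERS.get(ch, ch))  # unknown characters pass through unchanged
            pending = False
    if pending:
        out.append(VIRAMA)  # word-final consonant without a written vowel
    return "".join(out)


print(wx_to_devanagari("lAhOra"), wx_to_devanagari("vyakwigawa"))
```

Characters outside the tables, such as punctuation or the digit suffixes in `name` values, are copied unchanged.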

The first sentence of the ICON 2010 test data (with fine-grained syntactic tags) in the Shakti format:

<document docid="fullnews_id_2484368">
<head>
	<caption>elaosI Kolane para hogI bAwacIwa pre isalAmAbAxa.</caption>
	<language>Hindi </language>
	<domain_name>News Articles </domain_name>
	<word_count>313</word_count>
	<byte_count>37563</byte_count>
	<availability>
		<format>CML/SSF</format>
	<sentence_marker>.</sentence_marker>
	<normalization>No</normalization>
	</availability>
	<encoding_description>
		<original_encoding>ISO 8859</format>
		<new_encoding>Unicode UTF8</new_encoding>
	</encoding_description>
	<distributor>LTRC, IIIT Hyderabad</distributor>
	<project_description>NSF Hindi/Urdu Dependency Treebanking Project</place>
	<creation>
		</raw_corpus creation_date="" institute_name="IIIT Hyderabad">
		</annotated_corpus creation_date="06/01/2009" institute_name="IIIT Hyderabad">
	<edition_number>1.0</edition_number>
	</creation>
	<publication>
		<place>New Delhi</place>
		<date>28/5/2004</date>
		<type>Newspaper</type>
		<publisher>
			<name>Amar Ujala</name>
			<url>http://www.amarujala.com</url>
		</publisher>
	</publication>
 
<annotated-resource name="HyDT-Hindi" version="2.0" type="dep-words" layers="morph,pos,chunk,dep-word" language="hin" date-of-release="20101013">
    <annotation-standard>
        <morph-standard name="Anncorra-morph" version="1.31" date="20080920" />
        <pos-standard name="Anncorra-pos" version="" date="20061215" />
        <chunk-standard name="Anncorra-chunk" version="" date="20061215" />
        <intrachunk-dependency-standard name="Anncorra-intrachunk-dep" version="1.0" date="" dep-tagset-granularity="5" />
        <dependency-standard name="Anncorra-dep" version="2.0" date="" dep-tagset-granularity="6" />
    </annotation-standard>
</annotated-resource>
</head>
<body>
<tb number="1" segment="no" bullet="no">
<foreign language="select" writingsystem="LTR"></foreign>
<text>
<Sentence id="1">
1	pAkiswAna	XC	<fs af='pAkiswAna,n,m,sg,3,d,0,0' posn='10' drel='mod:kaSmIra' chunkType='child:NP' name='pAkiswAna'>
2	aXikqwa	XC	<fs af='aXikqwa,adj,any,any,,o,,' posn='20' drel='mod:kaSmIra' chunkType='child:NP' name='aXikqwa'>
3	kaSmIra	NNP	<fs af='kaSmIra,n,m,sg,3,o,0_meM,0' drel='k7p:Ae' posn='30' vpos='vib_4' name='kaSmIra' chunkId='NP' chunkType='head:NP'>
4	meM	PSP	<fs af='meM,psp,,,,,,' posn='40' drel='lwg__psp:kaSmIra' chunkType='child:NP' name='meM'>
5	�	XC	<fs af='�,num,m,sg,3,d,,' posn='50' drel='mod:akwUbara' chunkType='child:NP2' name='�'>
6	akwUbara	NNP	<fs af='akwUbara,n,m,sg,3,o,0_ko,0' drel='k7t:Ae' posn='60' vpos='vib_3' name='akwUbara' chunkId='NP2' chunkType='head:NP2'>
7	ko	PSP	<fs af='ko,psp,,,,,,' posn='70' drel='lwg__psp:akwUbara' chunkType='child:NP2' name='ko'>
8	Ae	VM	<fs af='A,v,m,sg,any,,yA,yA' drel='nmod__k1inv:BUkaMpa' posn='80' name='Ae' chunkId='VGNF' chunkType='head:VGNF'>
9	BUkaMpa	NN	<fs af='BUkaMpa,n,m,sg,3,o,0_se,0' drel='rh:macI' posn='90' vpos='vib_2' name='BUkaMpa' chunkId='NP3' chunkType='head:NP3'>
10	se	PSP	<fs af='se,psp,,,,,,' posn='100' drel='lwg__psp:BUkaMpa' chunkType='child:NP3' name='se'>
11	macI	VM	<fs af='maca,v,f,sg,any,,yA,yA' drel='nmod__k1inv:wabAhI' posn='110' name='macI' chunkId='VGNF2' chunkType='head:VGNF2'>
12	wabAhI	NN	<fs af='wabAhI,n,f,sg,3,o,0_kA_bAxa,0' drel='k7t:kareMge' posn='120' vpos='vib_2_3' name='wabAhI' chunkId='NP4' chunkType='head:NP4'>
13	ke	PSP	<fs af='kA,psp,m,sg,3,o,,' posn='130' drel='lwg__psp:wabAhI' chunkType='child:NP4' name='ke'>
14	bAxa	NST	<fs af='bAxa,n,,,,,,' posn='140' drel='lwg__psp:wabAhI' chunkType='child:NP4' name='bAxa'>
15	BArawa	NNP	<fs af='BArawa,n,m,sg,3,d,0,0' drel='ccof:Ora' posn='150' name='BArawa' chunkId='NP5' chunkType='head:NP5'>
16	Ora	CC	<fs af='Ora,avy,,,,,,' drel='k1:kareMge' posn='160' name='Ora' chunkId='CCP' chunkType='head:CCP'>
17	pAkiswAna	NNP	<fs af='pAkiswAna,n,m,sg,3,d,0,0' drel='ccof:Ora' posn='170' name='pAkiswAna2' chunkId='NP6' chunkType='head:NP6'>
18	mAnavIya	JJ	<fs af='mAnavIya,adj,any,any,,o,,' posn='180' drel='nmod__adj:xqRtikoNa' chunkType='child:NP7' name='mAnavIya'>
19	xqRtikoNa	NN	<fs af='xqRtikoNa,n,m,sg,3,d,0,0' drel='k2:apanAwe' posn='190' name='xqRtikoNa' chunkId='NP7' chunkType='head:NP7'>
20	apanAwe	VM	<fs af='apanA,v,m,pl,any,,wA_ho+yA,wA' drel='vmod:kareMge' posn='200' vpos='tam_2' name='apanAwe' chunkId='VGNF3' chunkType='head:VGNF3'>
21	hue	VAUX	<fs af='ho,v,m,pl,any,,yA,yA' posn='210' drel='lwg__vaux:apanAwe' chunkType='child:VGNF3' name='hue'>
22	SanivAra	NNP	<fs af='SanivAra,n,m,sg,3,o,0_ko,0' drel='k7t:kareMge' posn='220' vpos='vib_2' name='SanivAra' chunkId='NP8' chunkType='head:NP8'>
23	ko	PSP	<fs af='ko,psp,,,,,,' posn='230' drel='lwg__psp:SanivAra' chunkType='child:NP8' name='ko2'>
24	islAmAbAxa	NNP	<fs af='isalAmAbAxa,n,m,sg,3,d,0_meM,0' drel='k7p:kareMge' posn='240' vpos='vib_2' name='islAmAbAxa' chunkId='NP9' chunkType='head:NP9'>
25	meM	PSP	<fs af='meM,psp,,,,,,' posn='250' drel='lwg__psp:islAmAbAxa' chunkType='child:NP9' name='meM2'>
26	niyaMwraNa	XC	<fs af='niyaMwraNa,n,m,sg,3,d,0,0' posn='260' drel='mod:reKA' chunkType='child:NP10' name='niyaMwraNa'>
27	reKA	NN	<fs af='reKA,n,f,sg,3,d,0,0' drel='k2:Kolane' posn='270' name='reKA' chunkId='NP10' chunkType='head:NP10'>
28	(	SYM	<fs af=',punc,,,,,,' posn='280' drel='rsym:elaosI' chunkType='child:NP11' name='('>
29	elaosI	NN	<fs af='elaosI,n,m,sg,3,d,0,0' drel='nmod:reKA' posn='290' name='elaosI' chunkId='NP11' chunkType='head:NP11'>
30	)	SYM	<fs af=',punc,,,,,,' posn='300' drel='rsym:elaosI' chunkType='child:NP11' name=')'>
31	Kolane	VM	<fs af='Kola,v,any,sg,any,o,nA_kA,nA' drel='r6:masale' posn='310' vpos='tam_2' name='Kolane' chunkId='VGNN' chunkType='head:VGNN'>
32	ke	PSP	<fs af='kA,psp,m,sg,,o,,' posn='320' drel='lwg__psp:Kolane' chunkType='child:VGNN' name='ke2'>
33	masale	NN	<fs af='masalA,n,m,sg,3,o,0_para,0' drel='k7:kareMge' posn='330' vpos='vib_2' name='masale' chunkId='NP12' chunkType='head:NP12'>
34	para	PSP	<fs af='para,psp,,,,,,' posn='340' drel='lwg__psp:masale' chunkType='child:NP12' name='para'>
35	bAwacIwa	NN	<fs af='bAwacIwa,n,f,sg,3,d,0,0' drel='pof:kareMge' posn='350' name='bAwacIwa' chunkId='NP13' chunkType='head:NP13'>
36	kareMge	VM	<fs af='kara,v,m,pl,3,,gA,gA' posn='360' name='kareMge' chunkId='VGF' chunkType='head:VGF'>
37	.	SYM	<fs af='.,punc,,,,,,' posn='370' drel='rsym:kareMge' chunkType='child:VGF' name='.'>
</Sentence>
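
In the Shakti sample above, each token carries a feature structure whose `drel` attribute appears to hold the relation label and the `name` of the parent token, while the CoNLL version uses numeric head indices. The following sketch shows one way to read such lines and resolve the parents. It assumes tab-separated fields and single-quoted attributes as in the sample; the function names and the `main` label for the token without `drel` are illustrative choices, not part of any official converter.

```python
import re

# attribute='value' pairs inside the <fs ...> feature structure
ATTR = re.compile(r"(\w+)='([^']*)'")


def parse_ssf_token(line):
    """Parse one token line: index, form, POS tag and the <fs ...> attributes."""
    idx, form, pos, fs = line.rstrip("\n").split("\t", 3)
    return {"id": int(idx), "form": form, "pos": pos, "attrs": dict(ATTR.findall(fs))}


def resolve_heads(tokens):
    """Turn drel='label:parentName' into (token id, head id, label) triples."""
    by_name = {t["attrs"]["name"]: t["id"] for t in tokens if "name" in t["attrs"]}
    arcs = []
    for t in tokens:
        drel = t["attrs"].get("drel")
        if drel is None:
            # A token without drel is treated as the root (assumption).
            arcs.append((t["id"], 0, "main"))
        else:
            label, _, parent = drel.partition(":")
            arcs.append((t["id"], by_name[parent], label))
    return arcs
```

Because parents are referenced by `name`, repeated word forms need distinct names, which is presumably why the sample contains values such as `pAkiswAna2` and `ko2`.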

And in the CoNLL format:

1 pAkiswAna pAkiswAna XC n lex-pAkiswAna|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-10|chunkType-child:NP|name-pAkiswAna 3 mod _ _
2 aXikqwa aXikqwa XC adj lex-aXikqwa|cat-adj|gend-any|num-any|pers-|case-o|vib-|tam-|posn-20|chunkType-child:NP|name-aXikqwa 3 mod _ _
3 kaSmIra kaSmIra NNP n lex-kaSmIra|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_meM|tam-0|posn-30|vpos-vib_4|name-kaSmIra|chunkId-NP|chunkType-head:NP 8 k7p _ _
4 meM meM PSP psp lex-meM|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-40|chunkType-child:NP|name-meM 3 lwg__psp _ _
5 XC num lex-�|cat-num|gend-m|num-sg|pers-3|case-d|vib-|tam-|posn-50|chunkType-child:NP2|name-� 6 mod _ _
6 akwUbara akwUbara NNP n lex-akwUbara|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_ko|tam-0|posn-60|vpos-vib_3|name-akwUbara|chunkId-NP2|chunkType-head:NP2 8 k7t _ _
7 ko ko PSP psp lex-ko|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-70|chunkType-child:NP2|name-ko 6 lwg__psp _ _
8 Ae A VM v lex-A|cat-v|gend-m|num-sg|pers-any|case-|vib-yA|tam-yA|posn-80|name-Ae|chunkId-VGNF|chunkType-head:VGNF 9 nmod__k1inv _ _
9 BUkaMpa BUkaMpa NN n lex-BUkaMpa|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_se|tam-0|posn-90|vpos-vib_2|name-BUkaMpa|chunkId-NP3|chunkType-head:NP3 11 rh _ _
10 se se PSP psp lex-se|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-100|chunkType-child:NP3|name-se 9 lwg__psp _ _
11 macI maca VM v lex-maca|cat-v|gend-f|num-sg|pers-any|case-|vib-yA|tam-yA|posn-110|name-macI|chunkId-VGNF2|chunkType-head:VGNF2 12 nmod__k1inv _ _
12 wabAhI wabAhI NN n lex-wabAhI|cat-n|gend-f|num-sg|pers-3|case-o|vib-0_kA_bAxa|tam-0|posn-120|vpos-vib_2_3|name-wabAhI|chunkId-NP4|chunkType-head:NP4 36 k7t _ _
13 ke kA PSP psp lex-kA|cat-psp|gend-m|num-sg|pers-3|case-o|vib-|tam-|posn-130|chunkType-child:NP4|name-ke 12 lwg__psp _ _
14 bAxa bAxa NST n lex-bAxa|cat-n|gend-|num-|pers-|case-|vib-|tam-|posn-140|chunkType-child:NP4|name-bAxa 12 lwg__psp _ _
15 BArawa BArawa NNP n lex-BArawa|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-150|name-BArawa|chunkId-NP5|chunkType-head:NP5 16 ccof _ _
16 Ora Ora CC avy lex-Ora|cat-avy|gend-|num-|pers-|case-|vib-|tam-|posn-160|name-Ora|chunkId-CCP|chunkType-head:CCP 36 k1 _ _
17 pAkiswAna pAkiswAna NNP n lex-pAkiswAna|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-170|name-pAkiswAna2|chunkId-NP6|chunkType-head:NP6 16 ccof _ _
18 mAnavIya mAnavIya JJ adj lex-mAnavIya|cat-adj|gend-any|num-any|pers-|case-o|vib-|tam-|posn-180|chunkType-child:NP7|name-mAnavIya 19 nmod__adj _ _
19 xqRtikoNa xqRtikoNa NN n lex-xqRtikoNa|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-190|name-xqRtikoNa|chunkId-NP7|chunkType-head:NP7 20 k2 _ _
20 apanAwe apanA VM v lex-apanA|cat-v|gend-m|num-pl|pers-any|case-|vib-wA_ho+yA|tam-wA|posn-200|vpos-tam_2|name-apanAwe|chunkId-VGNF3|chunkType-head:VGNF3 36 vmod _ _
21 hue ho VAUX v lex-ho|cat-v|gend-m|num-pl|pers-any|case-|vib-yA|tam-yA|posn-210|chunkType-child:VGNF3|name-hue 20 lwg__vaux _ _
22 SanivAra SanivAra NNP n lex-SanivAra|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_ko|tam-0|posn-220|vpos-vib_2|name-SanivAra|chunkId-NP8|chunkType-head:NP8 36 k7t _ _
23 ko ko PSP psp lex-ko|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-230|chunkType-child:NP8|name-ko2 22 lwg__psp _ _
24 islAmAbAxa isalAmAbAxa NNP n lex-isalAmAbAxa|cat-n|gend-m|num-sg|pers-3|case-d|vib-0_meM|tam-0|posn-240|vpos-vib_2|name-islAmAbAxa|chunkId-NP9|chunkType-head:NP9 36 k7p _ _
25 meM meM PSP psp lex-meM|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-250|chunkType-child:NP9|name-meM2 24 lwg__psp _ _
26 niyaMwraNa niyaMwraNa XC n lex-niyaMwraNa|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-260|chunkType-child:NP10|name-niyaMwraNa 27 mod _ _
27 reKA reKA NN n lex-reKA|cat-n|gend-f|num-sg|pers-3|case-d|vib-0|tam-0|posn-270|name-reKA|chunkId-NP10|chunkType-head:NP10 31 k2 _ _
28 ( ( SYM punc lex-|cat-punc|gend-|num-|pers-|case-|vib-|tam-|posn-280|chunkType-child:NP11|name-( 29 rsym _ _
29 elaosI elaosI NN n lex-elaosI|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-290|name-elaosI|chunkId-NP11|chunkType-head:NP11 27 nmod _ _
30 ) ) SYM punc lex-|cat-punc|gend-|num-|pers-|case-|vib-|tam-|posn-300|chunkType-child:NP11|name-) 29 rsym _ _
31 Kolane Kola VM v lex-Kola|cat-v|gend-any|num-sg|pers-any|case-o|vib-nA_kA|tam-nA|posn-310|vpos-tam_2|name-Kolane|chunkId-VGNN|chunkType-head:VGNN 33 r6 _ _
32 ke kA PSP psp lex-kA|cat-psp|gend-m|num-sg|pers-|case-o|vib-|tam-|posn-320|chunkType-child:VGNN|name-ke2 31 lwg__psp _ _
33 masale masalA NN n lex-masalA|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_para|tam-0|posn-330|vpos-vib_2|name-masale|chunkId-NP12|chunkType-head:NP12 36 k7 _ _
34 para para PSP psp lex-para|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-340|chunkType-child:NP12|name-para 33 lwg__psp _ _
35 bAwacIwa bAwacIwa NN n lex-bAwacIwa|cat-n|gend-f|num-sg|pers-3|case-d|vib-0|tam-0|posn-350|name-bAwacIwa|chunkId-NP13|chunkType-head:NP13 36 pof _ _
36 kareMge kara VM v lex-kara|cat-v|gend-m|num-pl|pers-3|case-|vib-gA|tam-gA|posn-360|name-kareMge|chunkId-VGF|chunkType-head:VGF 0 main _ _
37 . . SYM punc lex-.|cat-punc|gend-|num-|pers-|case-|vib-|tam-|posn-370|chunkType-child:VGF|name-. 36 rsym _ _
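
For readers who want to process such lines, here is a minimal sketch of splitting a token line into the ten CoNLL-X columns and unpacking the `key-value` pairs of the features column. The helper names are hypothetical, and the code assumes that fields are separated by whitespace and contain no spaces themselves.

```python
# Column names follow the CoNLL-X shared task layout.
CONLL_COLUMNS = ("id", "form", "lemma", "cpostag", "postag",
                 "feats", "head", "deprel", "phead", "pdeprel")


def parse_feats(feats):
    """Split 'lex-kota|cat-n|...' into a dict; '_' means no features."""
    if feats == "_":
        return {}
    parsed = {}
    for item in feats.split("|"):
        key, _, value = item.partition("-")  # split on the first hyphen only
        parsed[key] = value
    return parsed


def parse_conll_line(line):
    """Return one token as a dict with integer id/head and a feature dict."""
    cols = line.split()
    if len(cols) != len(CONLL_COLUMNS):
        raise ValueError(f"expected 10 columns, got {len(cols)}")
    token = dict(zip(CONLL_COLUMNS, cols))
    token["id"] = int(token["id"])
    token["head"] = int(token["head"])
    token["feats"] = parse_feats(token["feats"])
    return token
```

Splitting on the first hyphen only keeps values such as `0_meM` or `head:NP` intact and maps empty values (for example `gend-`) to empty strings.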

And after conversion of the WX encoding to the Devanagari script in UTF-8:

1 पाकिस्तान पाकिस्तान XC n lex-pAkiswAna|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-10|chunkType-child:NP|name-pAkiswAna 3 mod _ _
2 अधिकृत अधिकृत XC adj lex-aXikqwa|cat-adj|gend-any|num-any|pers-|case-o|vib-|tam-|posn-20|chunkType-child:NP|name-aXikqwa 3 mod _ _
3 कश्मीर कश्मीर NNP n lex-kaSmIra|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_meM|tam-0|posn-30|vpos-vib_4|name-kaSmIra|chunkId-NP|chunkType-head:NP 8 k7p _ _
4 में में PSP psp lex-meM|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-40|chunkType-child:NP|name-meM 3 lwg__psp _ _
5 XC num lex-�|cat-num|gend-m|num-sg|pers-3|case-d|vib-|tam-|posn-50|chunkType-child:NP2|name-� 6 mod _ _
6 अक्तूबर अक्तूबर NNP n lex-akwUbara|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_ko|tam-0|posn-60|vpos-vib_3|name-akwUbara|chunkId-NP2|chunkType-head:NP2 8 k7t _ _
7 को को PSP psp lex-ko|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-70|chunkType-child:NP2|name-ko 6 lwg__psp _ _
8 आए आ VM v lex-A|cat-v|gend-m|num-sg|pers-any|case-|vib-yA|tam-yA|posn-80|name-Ae|chunkId-VGNF|chunkType-head:VGNF 9 nmod__k1inv _ _
9 भूकंप भूकंप NN n lex-BUkaMpa|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_se|tam-0|posn-90|vpos-vib_2|name-BUkaMpa|chunkId-NP3|chunkType-head:NP3 11 rh _ _
10 से से PSP psp lex-se|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-100|chunkType-child:NP3|name-se 9 lwg__psp _ _
11 मची मच VM v lex-maca|cat-v|gend-f|num-sg|pers-any|case-|vib-yA|tam-yA|posn-110|name-macI|chunkId-VGNF2|chunkType-head:VGNF2 12 nmod__k1inv _ _
12 तबाही तबाही NN n lex-wabAhI|cat-n|gend-f|num-sg|pers-3|case-o|vib-0_kA_bAxa|tam-0|posn-120|vpos-vib_2_3|name-wabAhI|chunkId-NP4|chunkType-head:NP4 36 k7t _ _
13 के का PSP psp lex-kA|cat-psp|gend-m|num-sg|pers-3|case-o|vib-|tam-|posn-130|chunkType-child:NP4|name-ke 12 lwg__psp _ _
14 बाद बाद NST n lex-bAxa|cat-n|gend-|num-|pers-|case-|vib-|tam-|posn-140|chunkType-child:NP4|name-bAxa 12 lwg__psp _ _
15 भारत भारत NNP n lex-BArawa|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-150|name-BArawa|chunkId-NP5|chunkType-head:NP5 16 ccof _ _
16 और और CC avy lex-Ora|cat-avy|gend-|num-|pers-|case-|vib-|tam-|posn-160|name-Ora|chunkId-CCP|chunkType-head:CCP 36 k1 _ _
17 पाकिस्तान पाकिस्तान NNP n lex-pAkiswAna|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-170|name-pAkiswAna2|chunkId-NP6|chunkType-head:NP6 16 ccof _ _
18 मानवीय मानवीय JJ adj lex-mAnavIya|cat-adj|gend-any|num-any|pers-|case-o|vib-|tam-|posn-180|chunkType-child:NP7|name-mAnavIya 19 nmod__adj _ _
19 दृष्टिकोण दृष्टिकोण NN n lex-xqRtikoNa|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-190|name-xqRtikoNa|chunkId-NP7|chunkType-head:NP7 20 k2 _ _
20 अपनाते अपना VM v lex-apanA|cat-v|gend-m|num-pl|pers-any|case-|vib-wA_ho+yA|tam-wA|posn-200|vpos-tam_2|name-apanAwe|chunkId-VGNF3|chunkType-head:VGNF3 36 vmod _ _
21 हुए हो VAUX v lex-ho|cat-v|gend-m|num-pl|pers-any|case-|vib-yA|tam-yA|posn-210|chunkType-child:VGNF3|name-hue 20 lwg__vaux _ _
22 शनिवार शनिवार NNP n lex-SanivAra|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_ko|tam-0|posn-220|vpos-vib_2|name-SanivAra|chunkId-NP8|chunkType-head:NP8 36 k7t _ _
23 को को PSP psp lex-ko|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-230|chunkType-child:NP8|name-ko2 22 lwg__psp _ _
24 इस्लामाबाद इसलामाबाद NNP n lex-isalAmAbAxa|cat-n|gend-m|num-sg|pers-3|case-d|vib-0_meM|tam-0|posn-240|vpos-vib_2|name-islAmAbAxa|chunkId-NP9|chunkType-head:NP9 36 k7p _ _
25 में में PSP psp lex-meM|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-250|chunkType-child:NP9|name-meM2 24 lwg__psp _ _
26 नियंत्रण नियंत्रण XC n lex-niyaMwraNa|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-260|chunkType-child:NP10|name-niyaMwraNa 27 mod _ _
27 रेखा रेखा NN n lex-reKA|cat-n|gend-f|num-sg|pers-3|case-d|vib-0|tam-0|posn-270|name-reKA|chunkId-NP10|chunkType-head:NP10 31 k2 _ _
28 ( ( SYM punc lex-|cat-punc|gend-|num-|pers-|case-|vib-|tam-|posn-280|chunkType-child:NP11|name-( 29 rsym _ _
29 एलओसी एलओसी NN n lex-elaosI|cat-n|gend-m|num-sg|pers-3|case-d|vib-0|tam-0|posn-290|name-elaosI|chunkId-NP11|chunkType-head:NP11 27 nmod _ _
30 ) ) SYM punc lex-|cat-punc|gend-|num-|pers-|case-|vib-|tam-|posn-300|chunkType-child:NP11|name-) 29 rsym _ _
31 खोलने खोल VM v lex-Kola|cat-v|gend-any|num-sg|pers-any|case-o|vib-nA_kA|tam-nA|posn-310|vpos-tam_2|name-Kolane|chunkId-VGNN|chunkType-head:VGNN 33 r6 _ _
32 के का PSP psp lex-kA|cat-psp|gend-m|num-sg|pers-|case-o|vib-|tam-|posn-320|chunkType-child:VGNN|name-ke2 31 lwg__psp _ _
33 मसले मसला NN n lex-masalA|cat-n|gend-m|num-sg|pers-3|case-o|vib-0_para|tam-0|posn-330|vpos-vib_2|name-masale|chunkId-NP12|chunkType-head:NP12 36 k7 _ _
34 पर पर PSP psp lex-para|cat-psp|gend-|num-|pers-|case-|vib-|tam-|posn-340|chunkType-child:NP12|name-para 33 lwg__psp _ _
35 बातचीत बातचीत NN n lex-bAwacIwa|cat-n|gend-f|num-sg|pers-3|case-d|vib-0|tam-0|posn-350|name-bAwacIwa|chunkId-NP13|chunkType-head:NP13 36 pof _ _
36 करेंगे कर VM v lex-kara|cat-v|gend-m|num-pl|pers-3|case-|vib-gA|tam-gA|posn-360|name-kareMge|chunkId-VGF|chunkType-head:VGF 0 main _ _
37 . . SYM punc lex-.|cat-punc|gend-|num-|pers-|case-|vib-|tam-|posn-370|chunkType-child:VGF|name-. 36 rsym _ _

The first sentence of the HPST 2012 training data in UTF8 SSF format with gold-standard morphology:

<Sentence id='1'>
1	गुजरात	NNP	<fs af='गुजरात,n,m,sg,3,o,0_का,0' name='गुजरात' posn='10' chunkId='NP' drel='r6:मुख्यमंत्री' vpos='vib_2' chunkType='head:NP'>
2	के	PSP	<fs af='का,psp,m,sg,,o,,' name='के' posn='20' drel='lwg__psp:गुजरात' chunkType='child:NP'>
3	मुख्यमंत्री	NNP	<fs af='मुख्यमंत्री,n,m,sg,3,o,0,0' name='मुख्यमंत्री' posn='30' chunkId='NP2' drel='nmod:मोदी' chunkType='head:NP2'>
4	नरेंद्र	NNPC	<fs af='नरेंद्र,n,m,sg,3,d,0,0' name='नरेंद्र' posn='40' drel='pof__cn:मोदी' chunkType='child:NP3'>
5	मोदी	NNP	<fs af='मोदी,n,m,sg,3,o,0_ने,0' name='मोदी' posn='50' chunkId='NP3' drel='k1:किया' vpos='vib_3' chunkType='head:NP3'>
6	ने	PSP	<fs af='ने,psp,,,,,,' name='ने' posn='60' drel='lwg__psp:मोदी' chunkType='child:NP3'>
7	मंगलवार	NNP	<fs af='मंगलवार,n,m,sg,3,o,0_को,0' name='मंगलवार' posn='70' chunkId='NP4' drel='k7t:किया' vpos='vib_2' chunkType='head:NP4'>
8	को	PSP	<fs af='को,psp,,,,,,' name='को' posn='80' drel='lwg__psp:मंगलवार' chunkType='child:NP4'>
9	गृह	NNPC	<fs af='गृह,n,m,sg,3,d,0,0' name='गृह' posn='90' drel='pof__cn:मंत्री' chunkType='child:NP5'>
10	मंत्री	NNP	<fs af='मंत्री,n,m,sg,3,d,0,0' name='मंत्री' posn='100' drel='nmod__adj:पाटिल' chunkType='child:NP5'>
11	शिवराज	NNPC	<fs af='शिवराज,n,m,sg,3,d,0,0' name='शिवराज' posn='110' drel='pof__cn:पाटिल' chunkType='child:NP5'>
12	पाटिल	NNP	<fs af='पाटिल,n,m,sg,3,o,0_से,0' name='पाटिल' posn='120' chunkId='NP5' drel='k4:किया' vpos='vib_vib_5' chunkType='head:NP5'>
13	से	PSP	<fs af='से,psp,,,,,,' name='से' posn='130' drel='lwg__psp:पाटिल' chunkType='child:NP5'>
14	मुलाकात	NN	<fs af='मुलाकात,n,f,sg,3,d,0,0' name='मुलाकात' posn='140' chunkId='NP6' drel='pof:कर' chunkType='head:NP6'>
15	कर	VM	<fs af='कर,v,any,any,any,,0,0' name='कर' posn='150' chunkId='VGNF' drel='vmod:किया' chunkType='head:VGNF'>
16	आईएएस	NNP	<fs af='आईएएस,n,m,sg,3,o,0,0' name='आईएएस' posn='160' chunkId='NP7' drel='ccof:और' chunkType='head:NP7'>
17	और	CC	<fs af='और,avy,,,,,,' name='और' posn='170' chunkId='CCP' drel='r6:तर्ज' chunkType='head:CCP'>
18	आईपीएस	NNP	<fs af='आईपीएस,n,m,sg,3,o,0_का,0' name='आईपीएस' posn='180' chunkId='NP8' drel='ccof:और' vpos='vib_2' chunkType='head:NP8'>
19	की	PSP	<fs af='का,psp,f,sg,,o,,' name='की' posn='190' drel='lwg__psp:आईपीएस' chunkType='child:NP8'>
20	तर्ज	NN	<fs af='तर्ज,n,f,sg,3,o,0_पर,0' name='तर्ज' posn='200' chunkId='NP9' drel='k7:किया' vpos='vib_2' chunkType='head:NP9'>
21	पर	PSP	<fs af='पर,psp,,,,,,' name='पर' posn='210' drel='lwg__psp:तर्ज' chunkType='child:NP9'>
22	राष्ट्रीय	JJ	<fs af='राष्ट्रीय,adj,any,any,,o,,' name='राष्ट्रीय' posn='220' drel='nmod__adj:स्तर' chunkType='child:NP10'>
23	स्तर	NN	<fs af='स्तर,n,m,sg,3,o,0_पर,0' name='स्तर' posn='230' chunkId='NP10' drel='k7:किया' vpos='vib_3' chunkType='head:NP10'>
24	पर	PSP	<fs af='पर,psp,,,,,,' name='पर2' posn='240' drel='lwg__psp:स्तर' chunkType='child:NP10'>
25	एक	QC	<fs af='एक,num,any,any,,any,,' name='एक' posn='250' drel='nmod__adj:सेवा' chunkType='child:NP11'>
26	खुफिया	JJ	<fs af='खुफिया,adj,any,any,,d,,' name='खुफिया' posn='260' drel='nmod__adj:सेवा' chunkType='child:NP11'>
27	सेवा	NN	<fs af='सेवा,n,f,sg,3,d,0,0' name='सेवा' posn='270' chunkId='NP11' drel='k2:करने' chunkType='head:NP11'>
28	शुरू	NN	<fs af='शुरू,n,m,sg,3,d,0,0' name='शुरू' posn='280' chunkId='NP12' drel='pof:करने' chunkType='head:NP12'>
29	करने	VM	<fs af='कर,v,any,sg,any,o,ना_का,nA' name='करने' posn='290' chunkId='VGNN' drel='r6-k2:अनुरोध' vpos='tam_2' chunkType='head:VGNN'>
30	का	PSP	<fs af='का,psp,m,sg,,d,,' name='का' posn='300' drel='lwg__psp:करने' chunkType='child:VGNN'>
31	अनुरोध	NN	<fs af='अनुरोध,n,m,sg,3,d,0,0' name='अनुरोध' posn='310' chunkId='NP13' drel='pof:किया' chunkType='head:NP13'>
32	किया	VM	<fs af='कर,v,m,sg,any,,या,yA' name='किया' posn='320' chunkId='VGF' chunkType='head:VGF' voicetype='active' stype='declarative'>
33	।	SYM	<fs af='।,punc,,,,,,' name='।' posn='330' chunkId='BLK' drel='rsym:किया' chunkType='head:BLK'>
</Sentence>

And the same in CoNLL format:

1 गुजरात गुजरात NNP n lex-गुजरात|cat-n|gen-m|num-sg|pers-3|case-o|vib-0_का|tam-0|chunkId-NP|chunkType-head|stype-|voicetype- 3 r6 _ _
2 के का PSP psp lex-का|cat-psp|gen-m|num-sg|pers-|case-o|vib-|tam-|chunkId-NP|chunkType-child|stype-|voicetype- 1 lwg__psp _ _
3 मुख्यमंत्री मुख्यमंत्री NNP n lex-मुख्यमंत्री|cat-n|gen-m|num-sg|pers-3|case-o|vib-0|tam-0|chunkId-NP2|chunkType-head|stype-|voicetype- 5 nmod _ _
4 नरेंद्र नरेंद्र NNPC n lex-नरेंद्र|cat-n|gen-m|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP3|chunkType-child|stype-|voicetype- 5 pof__cn _ _
5 मोदी मोदी NNP n lex-मोदी|cat-n|gen-m|num-sg|pers-3|case-o|vib-0_ने|tam-0|chunkId-NP3|chunkType-head|stype-|voicetype- 32 k1 _ _
6 ने ने PSP psp lex-ने|cat-psp|gen-|num-|pers-|case-|vib-|tam-|chunkId-NP3|chunkType-child|stype-|voicetype- 5 lwg__psp _ _
7 मंगलवार मंगलवार NNP n lex-मंगलवार|cat-n|gen-m|num-sg|pers-3|case-o|vib-0_को|tam-0|chunkId-NP4|chunkType-head|stype-|voicetype- 32 k7t _ _
8 को को PSP psp lex-को|cat-psp|gen-|num-|pers-|case-|vib-|tam-|chunkId-NP4|chunkType-child|stype-|voicetype- 7 lwg__psp _ _
9 गृह गृह NNPC n lex-गृह|cat-n|gen-m|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP5|chunkType-child|stype-|voicetype- 10 pof__cn _ _
10 मंत्री मंत्री NNP n lex-मंत्री|cat-n|gen-m|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP5|chunkType-child|stype-|voicetype- 12 nmod__adj _ _
11 शिवराज शिवराज NNPC n lex-शिवराज|cat-n|gen-m|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP5|chunkType-child|stype-|voicetype- 12 pof__cn _ _
12 पाटिल पाटिल NNP n lex-पाटिल|cat-n|gen-m|num-sg|pers-3|case-o|vib-0_से|tam-0|chunkId-NP5|chunkType-head|stype-|voicetype- 32 k4 _ _
13 से से PSP psp lex-से|cat-psp|gen-|num-|pers-|case-|vib-|tam-|chunkId-NP5|chunkType-child|stype-|voicetype- 12 lwg__psp _ _
14 मुलाकात मुलाकात NN n lex-मुलाकात|cat-n|gen-f|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP6|chunkType-head|stype-|voicetype- 15 pof _ _
15 कर कर VM v lex-कर|cat-v|gen-any|num-any|pers-any|case-|vib-0|tam-0|chunkId-VGNF|chunkType-head|stype-|voicetype- 32 vmod _ _
16 आईएएस आईएएस NNP n lex-आईएएस|cat-n|gen-m|num-sg|pers-3|case-o|vib-0|tam-0|chunkId-NP7|chunkType-head|stype-|voicetype- 17 ccof _ _
17 और और CC avy lex-और|cat-avy|gen-|num-|pers-|case-|vib-|tam-|chunkId-CCP|chunkType-head|stype-|voicetype- 20 r6 _ _
18 आईपीएस आईपीएस NNP n lex-आईपीएस|cat-n|gen-m|num-sg|pers-3|case-o|vib-0_का|tam-0|chunkId-NP8|chunkType-head|stype-|voicetype- 17 ccof _ _
19 की का PSP psp lex-का|cat-psp|gen-f|num-sg|pers-|case-o|vib-|tam-|chunkId-NP8|chunkType-child|stype-|voicetype- 18 lwg__psp _ _
20 तर्ज तर्ज NN n lex-तर्ज|cat-n|gen-f|num-sg|pers-3|case-o|vib-0_पर|tam-0|chunkId-NP9|chunkType-head|stype-|voicetype- 32 k7 _ _
21 पर पर PSP psp lex-पर|cat-psp|gen-|num-|pers-|case-|vib-|tam-|chunkId-NP9|chunkType-child|stype-|voicetype- 20 lwg__psp _ _
22 राष्ट्रीय राष्ट्रीय JJ adj lex-राष्ट्रीय|cat-adj|gen-any|num-any|pers-|case-o|vib-|tam-|chunkId-NP10|chunkType-child|stype-|voicetype- 23 nmod__adj _ _
23 स्तर स्तर NN n lex-स्तर|cat-n|gen-m|num-sg|pers-3|case-o|vib-0_पर|tam-0|chunkId-NP10|chunkType-head|stype-|voicetype- 32 k7 _ _
24 पर पर PSP psp lex-पर|cat-psp|gen-|num-|pers-|case-|vib-|tam-|chunkId-NP10|chunkType-child|stype-|voicetype- 23 lwg__psp _ _
25 एक एक QC num lex-एक|cat-num|gen-any|num-any|pers-|case-any|vib-|tam-|chunkId-NP11|chunkType-child|stype-|voicetype- 27 nmod__adj _ _
26 खुफिया खुफिया JJ adj lex-खुफिया|cat-adj|gen-any|num-any|pers-|case-d|vib-|tam-|chunkId-NP11|chunkType-child|stype-|voicetype- 27 nmod__adj _ _
27 सेवा सेवा NN n lex-सेवा|cat-n|gen-f|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP11|chunkType-head|stype-|voicetype- 29 k2 _ _
28 शुरू शुरू NN n lex-शुरू|cat-n|gen-m|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP12|chunkType-head|stype-|voicetype- 29 pof _ _
29 करने कर VM v lex-कर|cat-v|gen-any|num-sg|pers-any|case-o|vib-ना_का|tam-nA|chunkId-VGNN|chunkType-head|stype-|voicetype- 31 r6-k2 _ _
30 का का PSP psp lex-का|cat-psp|gen-m|num-sg|pers-|case-d|vib-|tam-|chunkId-VGNN|chunkType-child|stype-|voicetype- 29 lwg__psp _ _
31 अनुरोध अनुरोध NN n lex-अनुरोध|cat-n|gen-m|num-sg|pers-3|case-d|vib-0|tam-0|chunkId-NP13|chunkType-head|stype-|voicetype- 32 pof _ _
32 किया कर VM v lex-कर|cat-v|gen-m|num-sg|pers-any|case-|vib-या|tam-yA|chunkId-VGF|chunkType-head|stype-declarative'>|voicetype-active 0 main _ _
33 । । SYM punc lex-।|cat-punc|gen-|num-|pers-|case-|vib-|tam-|chunkId-BLK|chunkType-head|stype-|voicetype- 32 rsym _ _

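The same sentence with “automatically tagged” morphology. Apparently this means no morphology at all: the lemma and feature columns are empty and only the part-of-speech tag is filled in (it differs from the gold standard at token 26, NNC instead of JJ), so the contestants were probably expected to use their own taggers.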

1 गुजरात _ NNP _ _ 3 r6 _ _
2 के _ PSP _ _ 1 lwg__psp _ _
3 मुख्यमंत्री _ NNP _ _ 5 nmod _ _
4 नरेंद्र _ NNPC _ _ 5 pof__cn _ _
5 मोदी _ NNP _ _ 32 k1 _ _
6 ने _ PSP _ _ 5 lwg__psp _ _
7 मंगलवार _ NNP _ _ 32 k7t _ _
8 को _ PSP _ _ 7 lwg__psp _ _
9 गृह _ NNPC _ _ 10 pof__cn _ _
10 मंत्री _ NNP _ _ 12 nmod__adj _ _
11 शिवराज _ NNPC _ _ 12 pof__cn _ _
12 पाटिल _ NNP _ _ 32 k4 _ _
13 से _ PSP _ _ 12 lwg__psp _ _
14 मुलाकात _ NN _ _ 15 pof _ _
15 कर _ VM _ _ 32 vmod _ _
16 आईएएस _ NNP _ _ 17 ccof _ _
17 और _ CC _ _ 20 r6 _ _
18 आईपीएस _ NNP _ _ 17 ccof _ _
19 की _ PSP _ _ 18 lwg__psp _ _
20 तर्ज _ NN _ _ 32 k7 _ _
21 पर _ PSP _ _ 20 lwg__psp _ _
22 राष्ट्रीय _ JJ _ _ 23 nmod__adj _ _
23 स्तर _ NN _ _ 32 k7 _ _
24 पर _ PSP _ _ 23 lwg__psp _ _
25 एक _ QC _ _ 27 nmod__adj _ _
26 खुफिया _ NNC _ _ 27 nmod__adj _ _
27 सेवा _ NN _ _ 29 k2 _ _
28 शुरू _ NN _ _ 29 pof _ _
29 करने _ VM _ _ 31 r6-k2 _ _
30 का _ PSP _ _ 29 lwg__psp _ _
31 अनुरोध _ NN _ _ 32 pof _ _
32 किया _ VM _ _ 0 main _ _
33 । _ SYM _ _ 32 rsym _ _

The first sentence of the development data in the UTF8 SSF format with gold-standard morphology:

<Sentence id='1'>
1	भाजपा	NNP	<fs af='भाजपा,n,f,sg,3,o,0_ने,0' name='भाजपा' posn='10' chunkId='NP' drel='k1:लगाया' vpos='vib_2' chunkType='head:NP'>
2	ने	PSP	<fs af='ने,psp,,,,,,' name='ने' posn='20' drel='lwg__psp:भाजपा' chunkType='child:NP'>
3	केंद्र	NNPC	<fs name='केंद्र' chunkId='FRAGP' chunkType='head:'FRAGP' drel='ccof:और'>
4	और	CC	<fs af='और,avy,,,,,,' name='और' posn='40' chunkId='CCP' drel='nmod:सरकार' chunkType='head:CCP'>
5	केरल	NNPC	<fs name='केरल' chunkId='FRAGP2' chunkType='head:'FRAGP2' drel='ccof:और'>
6	सरकार	NNP	<fs af='सरकार,n,f,sg,3,o,0_पर,0' name='सरकार' posn='60' chunkId='NP2' drel='k7:लगाया' vpos='vib_2' chunkType='head:NP2'>
7	पर	PSP	<fs af='पर,psp,,,,,,' name='पर' posn='70' drel='lwg__psp:सरकार' chunkType='child:NP2'>
8	भारतीय	JJ	<fs af='भारतीय,adj,any,any,,o,,' name='भारतीय' posn='80' drel='nmod__adj:ड्राइवर' chunkType='child:NP3'>
9	ड्राइवर	NN	<fs af='ड्राइवर,n,m,sg,3,o,0,0' name='ड्राइवर' posn='90' chunkId='NP3' drel='nmod:कुट्टी' chunkType='head:NP3'>
10	एम.	NNPC	<fs af='एम.,n,m,sg,3,d,0,0' name='एम.' posn='100' drel='pof__cn:कुट्टी' chunkType='child:NP4'>
11	आर.	NNPC	<fs af='आर.,n,m,sg,3,d,0,0' name='आर.' posn='110' drel='pof__cn:कुट्टी' chunkType='child:NP4'>
12	कुट्टी	NNP	<fs af='कुट्टी,n,m,sg,3,o,0_का,0' name='कुट्टी' posn='120' chunkId='NP4' drel='r6:हत्या' vpos='vib_4' chunkType='head:NP4'>
13	की	PSP	<fs af='का,psp,f,sg,,o,,' name='की' posn='130' drel='lwg__psp:कुट्टी' chunkType='child:NP4'>
14	हत्या	NN	<fs af='हत्या,n,f,sg,3,o,0_के_लिए,0' name='हत्या' posn='140' chunkId='NP5' drel='jjmod:जिम्मेदार' vpos='vib_2_3' chunkType='head:NP5'>
15	के	PSP	<fs af='के,psp,,,,,,' name='के' posn='150' drel='lwg__psp:हत्या' chunkType='child:NP5'>
16	लिए	PSP	<fs af='लिए,psp,,,,,,' name='लिए' posn='160' drel='lwg__cont:हत्या' chunkType='child:NP5'>
17	जिम्मेदार	JJ	<fs af='जिम्मेदार,adj,any,any,,o,,' name='जिम्मेदार' posn='170' chunkId='JJP' drel='nmod:तालिबान' chunkType='head:JJP'>
18	तालिबान	NNP	<fs af='तालिबान,n,m,sg,3,o,0_के_साथ,0' name='तालिबान' posn='180' chunkId='NP6' drel='ras-k1:लगाया' vpos='vib_2_3' chunkType='head:NP6'>
19	के	PSP	<fs af='के,psp,,,,,,' name='के2' posn='190' drel='lwg__psp:तालिबान' chunkType='child:NP6'>
20	साथ	NST	<fs af='साथ,nst,m,sg,3,d,,' name='साथ' posn='200' drel='lwg__cont:तालिबान' chunkType='child:NP6'>
21	निपटने	VM	<fs af='निपट,v,any,any,any,o,ना_में,nA' name='निपटने' posn='210' chunkId='VGNN' drel='k7:लगाया' vpos='tam_2' chunkType='head:VGNN'>
22	में	PSP	<fs af='में,psp,,,,,,' name='में' posn='220' drel='lwg__psp:निपटने' chunkType='child:VGNN'>
23	ढिलाई	NN	<fs af='ढिलाई,n,f,sg,3,d,0,0' name='ढिलाई' posn='230' chunkId='NP7' drel='k2:बरतने' chunkType='head:NP7'>
24	बरतने	VM	<fs af='बरत,v,any,sg,any,o,ना_का,nA' name='बरतने' posn='240' chunkId='VGNN2' drel='r6:आरोप' vpos='tam_2' chunkType='head:VGNN2'>
25	का	PSP	<fs af='का,psp,m,sg,,d,,' name='का' posn='250' drel='lwg__psp:बरतने' chunkType='child:VGNN2'>
26	आरोप	NN	<fs af='आरोप,n,m,sg,3,d,0,0' name='आरोप' posn='260' chunkId='NP8' drel='k2:लगाया' chunkType='head:NP8'>
27	लगाया	VM	<fs af='लगा,v,m,sg,3,,या_है,yA' name='लगाया' posn='270' chunkId='VGF' chunkType='head:VGF' voicetype='active' vpos='tam_2' stype='declarative'>
28	है	VAUX	<fs af='है,v,any,sg,3,,है,hE' name='है' posn='280' drel='lwg__vaux:लगाया' chunkType='child:VGF'>
29	।	SYM	<fs af='।,punc,,,,,,' name='।' posn='290' chunkId='BLK' drel='rsym:लगाया' chunkType='head:BLK'>
</Sentence>
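
Each token line in the sample above carries four tab-separated fields: index, word form, POS tag and an `<fs ...>` feature structure with `key='value'` attributes. The following is a minimal sketch of reading one such line, not the official SSF reader. The function name `parse_ssf_token` and the returned dictionary layout are hypothetical. The sketch assumes, judging from the sample only, that `drel` has the shape `label:parent` and that `parent` matches the `name` attribute of the parent token.

```python
import re

# Matches key='value' pairs inside an SSF <fs ...> feature structure.
# Irregularly quoted attributes (e.g. chunkType on tokens 3 and 5 above)
# are picked up only as far as the first closing quote.
ATTR_RE = re.compile(r"(\w+)='([^']*)'")


def parse_ssf_token(line):
    """Split one SSF token line into index, form, tag and a feature dict.

    Expects a token line only, not the <Sentence> or </Sentence> lines.
    """
    index, form, tag, fs = line.rstrip("\n").split("\t", 3)
    feats = dict(ATTR_RE.findall(fs))
    # Assumption from the sample: drel is 'label:parent_name';
    # the root token (27 in the sample) has no drel at all.
    label, _, parent = feats.get("drel", "").partition(":")
    return {
        "index": int(index),
        "form": form,
        "tag": tag,
        "feats": feats,
        "deprel": label or None,
        "parent": parent or None,
    }
```

Because the parent is referenced by name rather than by index, a second pass over the sentence would be needed to map each `parent` value to the index of the token with that `name`.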

Parsing

Nonprojective attachments are rare in HyDT-Hindi. Only 862 of the 77068 words in the training+development data of the ICON 2010 version are attached nonprojectively (1.12%).
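
A count like the one above can be reproduced along the following lines. This is a sketch under the common definition that a node is attached nonprojectively if some node lying between it and its parent is not dominated by that parent; the page does not state which definition was actually used. The function name `nonprojective_nodes` and the input layout (a list of 1-based parent indices, 0 for the artificial root) are assumptions made for illustration.

```python
def nonprojective_nodes(heads):
    """Return the 1-based indices of nonprojectively attached nodes.

    heads[i - 1] is the parent index of node i; 0 marks the artificial root.
    The input is assumed to be a well-formed tree (no cycles).
    """
    def is_dominated(node, ancestor):
        # Walk up from node until the ancestor or the artificial root is hit.
        while node != 0:
            if node == ancestor:
                return True
            node = heads[node - 1]
        # Only the artificial root dominates everything.
        return ancestor == 0

    result = []
    for dep in range(1, len(heads) + 1):
        head = heads[dep - 1]
        lo, hi = sorted((head, dep))
        # Every node strictly between head and dependent must be
        # dominated by the head for the arc to be projective.
        if any(not is_dominated(k, head) for k in range(lo + 1, hi)):
            result.append(dep)
    return result


# Node 2 hangs on node 4, but node 3 in between is not dominated by node 4.
print(nonprojective_nodes([3, 4, 0, 3]))
```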

The results of the ICON 2009 NLP tools contest were published in (Husain, 2009). There were two evaluation rounds: the first used coarse-grained syntactic tags, the second fine-grained ones. To reward language independence, only systems that parsed all three languages were officially ranked. The following table presents the coarse-grained results for Hindi of the four officially ranked systems.

Parser (Authors) LAS UAS
Hyderabad (Ambati et al.) 79.33 90.22
Malt (Nivre) 78.20 89.36
Malt+MST (Zeman) 73.88 88.49
Mannem 76.90 88.06

The results of the ICON 2010 NLP tools contest were published in (Husain et al., 2010), page 6. These are the best results for Hindi with fine-grained syntactic tags:

Parser (Authors) LAS UAS
Attardi et al. 87.49 94.78
Kosaraju et al. 88.63 94.54
Kolachina et al. 86.22 93.25
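
The LAS and UAS columns in the tables above are labeled and unlabeled attachment scores: the percentage of tokens that received the correct parent and label, or the correct parent alone. A minimal sketch of the computation follows. The function name `attachment_scores` and the input layout are hypothetical, and any contest-specific conventions (for example the treatment of punctuation) are not described on this page and are not modeled here.

```python
def attachment_scores(gold, pred):
    """Return (LAS, UAS) in percent.

    gold and pred are equally long lists of (head, label) pairs,
    one pair per token, in the same token order.
    """
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and equally long")
    n = len(gold)
    # UAS: the head index alone must match.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS: both the head index and the dependency label must match.
    las_hits = sum(g == p for g, p in zip(gold, pred))
    return 100.0 * las_hits / n, 100.0 * uas_hits / n


gold = [(2, "k1"), (0, "root"), (2, "k2"), (3, "lwg__psp")]
pred = [(2, "k1"), (0, "root"), (2, "k7"), (2, "lwg__psp")]
las, uas = attachment_scores(gold, pred)
print(las, uas)
```

By construction LAS can never exceed UAS, which is consistent with every row of the two tables.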
