There is one treebank, versions of which have been known under different names at different times:
The dependency treebank Cat3LB was extracted automatically from an earlier constituent-based annotation (see Montserrat Civit, Ma. Antònia Martí, Núria Bufí: Cat3LB and Cast3LB: From Constituents to Dependencies. In: T. Salakoski et al. (eds.): FinTAL 2006, LNAI 4139, pp. 141–152, 2006, Springer, Berlin / Heidelberg)
The AnCora-CA corpus is meant to be freely downloadable from its website. However, the download works only for registered users who are signed in. The website lets you create a new account, but registration is not automatic: you have to wait for approval.
Republication of the two CoNLL versions by the LDC is planned but has not happened yet.
The CoNLL 2007 license in short:
AnCora-CA was created by members of the Centre de Llenguatge i Computació (CLiC), Universitat de Barcelona, Gran Via de les Corts Catalanes 585, E-08007 Barcelona, Spain.
Mostly newswire (EFE news, ACN Catalan news, Catalan version of El Periódico, 2000).
The CoNLL 2007 version contains 435,860 tokens in 15,125 sentences, yielding 28.82 tokens per sentence on average (CoNLL 2007 data split: 430,844 tokens / 14,958 sentences training, 5,016 tokens / 167 sentences test).
The CoNLL 2009 version contains 496,672 tokens in 16,786 sentences, yielding 29.59 tokens per sentence on average (CoNLL 2009 data split: 390,302 tokens / 13,200 sentences training, 53,015 tokens / 1,724 sentences development, 53,355 tokens / 1,862 sentences test).
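The averages and split totals quoted above can be re-derived with a few lines of Python. This is only a sanity-check sketch: the counts are copied from the text, while the dictionary layout and function names are illustrative assumptions.

```python
# Token and sentence counts copied from the text above.
# The dictionary layout is an illustrative choice, not part of the corpus.
SPLITS = {
    "CoNLL 2007": {"train": (430844, 14958), "test": (5016, 167)},
    "CoNLL 2009": {
        "train": (390302, 13200),
        "dev": (53015, 1724),
        "test": (53355, 1862),
    },
}


def totals(version):
    """Sum tokens and sentences over all splits of one version."""
    tokens = sum(t for t, _ in SPLITS[version].values())
    sentences = sum(s for _, s in SPLITS[version].values())
    return tokens, sentences


def tokens_per_sentence(version):
    """Average sentence length, rounded to two decimals."""
    tokens, sentences = totals(version)
    return round(tokens / sentences, 2)


for version in SPLITS:
    print(version, totals(version), tokens_per_sentence(version))
```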
The original morphosyntactic tags (EAGLES?) have been converted to fit into the three columns (CPOS, POS and FEAT) of the CoNLL 2006/7 format and the two columns (POS and FEAT) of the CoNLL 2009 format, respectively. Note that the missing CPOS column is not the only difference between the two conversion schemes: feature names and values in the FEAT column differ, too.
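To make the difference in the FEAT column concrete, the sketch below parses the feature strings of the auxiliary "ha" as they appear in the CoNLL 2007 and CoNLL 2009 example sentences on this page and compares the feature names. The helper name `parse_feats` is hypothetical, and the sketch assumes that every non-empty FEAT value is a `|`-separated list of `name=value` pairs, which holds for the examples shown here.

```python
def parse_feats(feat):
    """Turn a FEAT string such as 'num=s|gen=c' into a dict.

    An underscore marks an empty feature set.
    """
    if feat == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feat.split("|"))


# FEAT values of the auxiliary "ha", copied from the example sentences.
feats_2007 = parse_feats("num=s|per=3|mod=i|ten=p")
feats_2009 = parse_feats(
    "postype=auxiliary|gen=c|num=s|person=3|mood=indicative|tense=present"
)

# Feature names that occur in only one of the two conversion schemes.
only_2007 = sorted(set(feats_2007) - set(feats_2009))
only_2009 = sorted(set(feats_2009) - set(feats_2007))
print(only_2007)
print(only_2009)
```

For this one token, only `num` carries the same name in both schemes; person, mood and tense are renamed, and their values change from abbreviations to full words.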
The morphosyntactic tags have been disambiguated manually. The CoNLL 2009 version also contains automatically disambiguated tags.
Multi-word expressions have been collapsed into one token, using underscore as the joining character. This includes named entities (e.g. La_Garrotxa, Ajuntament_de_Manresa, dilluns_4_de_juny) and prepositional compounds (pel_que_fa_al, d'_acord_amb, la_seva, a_més_de). Empty (underscore) tokens have been inserted to represent missing subjects (Catalan is a pro-drop language).
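If the individual words of a collapsed token are needed, they can be recovered by splitting on the underscore. The following is a minimal sketch with a hypothetical helper name; it assumes that an underscore inside a form is always the joining character, and it treats a bare underscore as the inserted empty token rather than splitting it.

```python
def split_multiword(form):
    """Split a collapsed token into its parts.

    A bare underscore is the inserted empty token (elided subject),
    so it yields an empty list instead of being split.
    """
    if form == "_":
        return []
    return [part for part in form.split("_") if part]


for form in ["Ajuntament_de_Manresa", "d'_acord_amb", "vaga", "_"]:
    print(repr(form), split_multiword(form))
```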
The first sentence of the CoNLL 2007 training data:
1 | L' | el | d | da | num=s|gen=c | 2 | ESPEC | _ | _ |
2 | Ajuntament_de_Manresa | Ajuntament_de_Manresa | n | np | _ | 4 | SUJ | _ | _ |
3 | ha | haver | v | va | num=s|per=3|mod=i|ten=p | 4 | AUX | _ | _ |
4 | posat_en_funcionament | posar_en_funcionament | v | vm | num=s|mod=p|gen=m | 0 | S | _ | _ |
5 | tot | tot | d | di | num=s|gen=m | 7 | ESPEC | _ | _ |
6 | un_seguit_de | un_seguit_de | d | di | num=p|gen=c | 5 | DET | _ | _ |
7 | mesures | mesura | n | nc | num=p|gen=f | 4 | CD | _ | _ |
8 | , | , | F | Fc | _ | 10 | PUNC | _ | _ |
9 | la | el | d | da | num=s|gen=f | 10 | ESPEC | _ | _ |
10 | majoria | majoria | n | nc | num=s|gen=f | 7 | _ | _ | _ |
11 | informatives | informatiu | a | aq | num=p|gen=f | 10 | _ | _ | _ |
12 | , | , | F | Fc | _ | 10 | PUNC | _ | _ |
13 | que | que | p | pr | num=n|gen=c | 14 | SUJ | _ | _ |
14 | tenen | tenir | v | vm | num=p|per=3|mod=i|ten=p | 7 | SF | _ | _ |
15 | com_a | com_a | s | sp | for=s | 14 | CPRED | _ | _ |
16 | finalitat | finalitat | n | nc | num=s|gen=f | 15 | SN | _ | _ |
17 | minimitzar | minimitzar | v | vm | mod=n | 14 | CD | _ | _ |
18 | els | el | d | da | num=p|gen=m | 19 | ESPEC | _ | _ |
19 | efectes | efecte | n | nc | num=p|gen=m | 17 | SN | _ | _ |
20 | de | de | s | sp | for=s | 19 | SP | _ | _ |
21 | la | el | d | da | num=s|gen=f | 22 | ESPEC | _ | _ |
22 | vaga | vaga | n | nc | num=s|gen=f | 20 | SN | _ | _ |
23 | . | . | F | Fp | _ | 4 | PUNC | _ | _ |
The first sentence of the CoNLL 2007 test data:
1 | Tot_i_que | tot_i_que | c | cs | _ | 5 | SUBORD | _ | _ |
2 | ahir | ahir | r | rg | _ | 5 | CC | _ | _ |
3 | hi | hi | p | pp | num=n|per=3|gen=c | 5 | MORF | _ | _ |
4 | va | anar | v | va | num=s|per=3|mod=i|ten=p | 5 | AUX | _ | _ |
5 | haver | haver | v | va | mod=n | 15 | AO | _ | _ |
6 | una | un | d | di | num=s|gen=f | 7 | ESPEC | _ | _ |
7 | reunió | reunió | n | nc | num=s|gen=f | 5 | CD | _ | _ |
8 | de | de | s | sp | for=s | 7 | SP | _ | _ |
9 | darrera | darrer | a | ao | num=s|gen=f | 10 | SADJ | _ | _ |
10 | hora | hora | n | nc | num=s|gen=f | 8 | SN | _ | _ |
11 | , | , | F | Fc | _ | 5 | PUNC | _ | _ |
12 | no | no | r | rn | _ | 15 | MOD | _ | _ |
13 | es | es | p | p0 | _ | 15 | PASS | _ | _ |
14 | va | anar | v | va | num=s|per=3|mod=i|ten=p | 15 | AUX | _ | _ |
15 | aconseguir | aconseguir | v | vm | mod=n | 0 | S | _ | _ |
16 | acostar | acostar | v | vm | mod=n | 15 | SUJ | _ | _ |
17 | posicions | posició | n | nc | num=p|gen=f | 16 | SN | _ | _ |
18 | , | , | F | Fc | _ | 23 | PUNC | _ | _ |
19 | de_manera_que | de_manera_que | c | cs | _ | 23 | SUBORD | _ | _ |
20 | els | el | d | da | num=p|gen=m | 21 | ESPEC | _ | _ |
21 | treballadors | treballador | n | nc | num=p|gen=m | 23 | SUJ | _ | _ |
22 | han | haver | v | va | num=p|per=3|mod=i|ten=p | 23 | AUX | _ | _ |
23 | decidit | decidir | v | vm | num=s|mod=p|gen=m | 15 | AO | _ | _ |
24 | anar | anar | v | vm | mod=n | 23 | CD | _ | _ |
25 | a | a | s | sp | for=s | 24 | CREG | _ | _ |
26 | la | el | d | da | num=s|gen=f | 27 | ESPEC | _ | _ |
27 | vaga | vaga | n | nc | num=s|gen=f | 25 | SN | _ | _ |
28 | . | . | F | Fp | _ | 15 | PUNC | _ | _ |
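The rows above are rendered with `|` separators for display. The sketch below assumes that the distributed files follow the standard CoNLL 2006/7 layout, that is, ten tab-separated columns per token (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL). The function name and the lower-case dictionary keys are illustrative choices.

```python
CONLL_X_COLUMNS = (
    "id", "form", "lemma", "cpostag", "postag",
    "feats", "head", "deprel", "phead", "pdeprel",
)


def parse_token(line):
    """Parse one tab-separated CoNLL 2006/7 token line into a dict."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(CONLL_X_COLUMNS):
        raise ValueError(f"expected 10 columns, got {len(values)}")
    token = dict(zip(CONLL_X_COLUMNS, values))
    # ID and HEAD are token indices; HEAD 0 marks the root.
    token["id"] = int(token["id"])
    token["head"] = int(token["head"])
    return token


# Token 2 of the first training sentence, re-typed with tab separators.
sample = (
    "2\tAjuntament_de_Manresa\tAjuntament_de_Manresa\tn\tnp\t_\t4\tSUJ\t_\t_"
)
token = parse_token(sample)
print(token["form"], token["head"], token["deprel"])
```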
The first sentence of the CoNLL 2009 training data:
1 | El | el | el | d | d | postype=article|gen=m|num=s | postype=article|gen=m|num=s | 2 | 2 | spec | spec | _ | _ | _ | _ | _ | _ |
2 | Tribunal_Suprem | Tribunal_Suprem | Tribunal_Suprem | n | n | postype=proper|gen=c|num=c | postype=proper|gen=c|num=c | 7 | 7 | suj | suj | _ | _ | arg0-agt | _ | _ | _ |
3 | ( | ( | ( | f | f | punct=bracket|punctenclose=open | punct=bracket|punctenclose=open | 4 | 4 | f | f | _ | _ | _ | _ | _ | _ |
4 | TS | TS | TS | n | n | postype=proper|gen=c|num=c | postype=proper|gen=c|num=c | 2 | 2 | sn | sn | _ | _ | _ | _ | _ | _ |
5 | ) | ) | ) | f | f | punct=bracket|punctenclose=close | punct=bracket|punctenclose=close | 4 | 4 | f | f | _ | _ | _ | _ | _ | _ |
6 | ha | haver | haver | v | v | postype=auxiliary|gen=c|num=s|person=3|mood=indicative|tense=present | postype=auxiliary|gen=c|num=s|person=3|mood=indicative|tense=present | 7 | 7 | v | v | _ | _ | _ | _ | _ | _ |
7 | confirmat | confirmar | confirmar | v | v | postype=main|gen=m|num=s|mood=pastparticiple | postype=main|gen=m|num=s|mood=pastparticiple | 0 | 0 | sentence | sentence | Y | confirmar.a32 | _ | _ | _ | _ |
8 | la | el | el | d | d | postype=article|gen=f|num=s | postype=article|gen=f|num=s | 9 | 9 | spec | spec | _ | _ | _ | _ | _ | _ |
9 | condemna | condemna | condemna | n | n | postype=common|gen=f|num=s | postype=common|gen=f|num=s | 7 | 7 | cd | cd | _ | _ | arg1-pat | _ | _ | _ |
10 | a | a | a | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | 9 | 9 | sp | sp | _ | _ | _ | _ | _ | _ |
11 | quatre | quatre | quatre | d | d | postype=numeral|gen=c|num=p | postype=numeral|gen=c|num=p | 12 | 12 | spec | spec | _ | _ | _ | _ | _ | _ |
12 | anys | any | any | n | n | postype=common|gen=m|num=p | postype=common|gen=m|num=p | 10 | 10 | sn | sn | _ | _ | _ | _ | _ | _ |
13 | d' | de | de | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | 12 | 12 | sp | sp | _ | _ | _ | _ | _ | _ |
14 | inhabilitació | inhabilitació | inhabilitació | n | n | postype=common|gen=f|num=s | postype=common|gen=f|num=s | 13 | 13 | sn | sn | _ | _ | _ | _ | _ | _ |
15 | especial | especial | especial | a | a | postype=qualificative|gen=c|num=s | postype=qualificative|gen=c|num=s | 14 | 14 | s.a | s.a | _ | _ | _ | _ | _ | _ |
16 | i | i | i | c | c | postype=coordinating | postype=coordinating | 12 | 9 | coord | coord | _ | _ | _ | _ | _ | _ |
17 | una | un | un | d | d | postype=indefinite|gen=f|num=s | postype=numeral|gen=f|num=s | 18 | 18 | spec | spec | _ | _ | _ | _ | _ | _ |
18 | multa | multa | multa | n | n | postype=common|gen=f|num=s | postype=common|gen=f|num=s | 12 | 9 | sn | sn | _ | _ | _ | _ | _ | _ |
19 | de | de | de | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | 18 | 18 | sp | sp | _ | _ | _ | _ | _ | _ |
20 | 3,6 | 3.6 | 3,6 | z | n | _ | postype=proper|gen=c|num=c | 21 | 21 | spec | spec | _ | _ | _ | _ | _ | _ |
21 | milions | milió | milió | n | n | postype=common|gen=m|num=p | postype=common|gen=m|num=p | 19 | 19 | sn | sn | _ | _ | _ | _ | _ | _ |
22 | de | de | de | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | 21 | 21 | sp | sp | _ | _ | _ | _ | _ | _ |
23 | pessetes | pesseta | pesseta | z | n | postype=currency | postype=common|gen=f|num=p | 22 | 22 | sn | sn | _ | _ | _ | _ | _ | _ |
24 | per | per | per | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | 9 | 9 | sp | sp | _ | _ | _ | _ | _ | _ |
25 | a | a | a | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | 24 | 24 | sp | sp | _ | _ | _ | _ | _ | _ |
26 | quatre | quatre | quatre | d | d | postype=numeral|gen=c|num=p | postype=numeral|gen=c|num=p | 27 | 27 | spec | spec | _ | _ | _ | _ | _ | _ |
27 | veterinaris | veterinari | veterinari | n | n | postype=common|gen=m|num=p | postype=common|gen=m|num=p | 25 | 25 | sn | sn | _ | _ | _ | _ | _ | _ |
28 | gironins | gironí | gironí | a | a | postype=qualificative|gen=m|num=p | postype=qualificative|gen=m|num=p | 27 | 27 | s.a | s.a | _ | _ | _ | _ | _ | _ |
29 | , | , | , | f | f | punct=comma | punct=comma | 30 | 30 | f | f | _ | _ | _ | _ | _ | _ |
30 | per | per | per | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | 9 | 7 | sp | cc | _ | _ | _ | _ | _ | _ |
31 | haver | haver | haver | v | n | postype=auxiliary|gen=c|num=c|mood=infinitive | postype=common|gen=m|num=s | 33 | 33 | v | v | _ | _ | _ | _ | _ | _ |
32 | -se | ell | ell | p | p | gen=c|num=c|person=3 | gen=c|num=c|person=3 | 33 | 33 | morfema.pronominal | morfema.pronominal | _ | _ | _ | _ | _ | _ |
33 | beneficiat | beneficiar | beneficiat | v | a | postype=main|gen=m|num=s|mood=pastparticiple | postype=qualificative|gen=m|num=s|posfunction=participle | 42 | 30 | S | S | Y | beneficiar.a2 | _ | _ | _ | _ |
34 | dels | del | dels | s | s | postype=preposition|gen=m|num=p|contracted=yes | postype=preposition|gen=m|num=p|contracted=yes | 33 | 33 | creg | creg | _ | _ | _ | arg1-null | _ | _ |
35 | càrrecs | càrrec | càrrec | n | n | postype=common|gen=m|num=p | postype=common|gen=m|num=p | 34 | 34 | sn | sn | _ | _ | _ | _ | _ | _ |
36 | públics | públic | públic | a | a | postype=qualificative|gen=m|num=p | postype=qualificative|gen=m|num=p | 35 | 35 | s.a | s.a | _ | _ | _ | _ | _ | _ |
37 | que | que | que | p | p | postype=relative|gen=c|num=c | postype=relative|gen=c|num=c | 39 | 39 | cd | cd | _ | _ | _ | _ | arg1-pat | _ |
38 | _ | _ | _ | p | p | _ | _ | 39 | 39 | suj | suj | _ | _ | _ | _ | arg0-agt | _ |
39 | desenvolupaven | desenvolupar | desenvolupar | v | v | postype=main|gen=c|num=p|person=3|mood=indicative|tense=imperfect | postype=main|gen=c|num=p|person=3|mood=indicative|tense=imperfect | 35 | 35 | S | S | Y | desenvolupar.a2 | _ | _ | _ | _ |
40 | i | i | i | c | c | postype=coordinating | postype=coordinating | 42 | 33 | coord | coord | _ | _ | _ | _ | _ | _ |
41 | la_seva | el_seu | el_seu | d | d | postype=possessive|gen=f|num=s|person=3 | postype=possessive|gen=f|num=s|person=3 | 42 | 42 | spec | spec | _ | _ | _ | _ | _ | _ |
42 | relació | relació | relació | n | n | postype=common|gen=f|num=s | postype=common|gen=f|num=s | 30 | 33 | sn | cd | _ | _ | _ | _ | _ | _ |
43 | amb | amb | amb | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | 42 | 42 | sp | sp | _ | _ | _ | _ | _ | _ |
44 | les | el | el | d | d | postype=article|gen=f|num=p | postype=article|gen=f|num=p | 45 | 45 | spec | spec | _ | _ | _ | _ | _ | _ |
45 | empreses | empresa | empresa | n | n | postype=common|gen=f|num=p | postype=common|gen=f|num=p | 43 | 43 | sn | sn | _ | _ | _ | _ | _ | _ |
46 | càrniques | càrnic | càrnic | a | a | postype=qualificative|gen=f|num=p | postype=qualificative|gen=f|num=p | 45 | 45 | s.a | s.a | _ | _ | _ | _ | _ | _ |
47 | de | de | de | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | 45 | 45 | sp | sp | _ | _ | _ | _ | _ | _ |
48 | la | el | el | d | d | postype=article|gen=f|num=s | postype=article|gen=f|num=s | 49 | 49 | spec | spec | _ | _ | _ | _ | _ | _ |
49 | zona | zona | zona | n | n | postype=common|gen=f|num=s | postype=common|gen=f|num=s | 47 | 47 | sn | sn | _ | _ | _ | _ | _ | _ |
50 | en | en | en | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | 42 | 42 | sp | sp | _ | _ | _ | _ | _ | _ |
51 | oferir | oferir | oferir | v | v | postype=main|gen=c|num=c|mood=infinitive | postype=main|gen=c|num=c|mood=infinitive | 50 | 50 | S | S | Y | oferir.a32 | _ | _ | _ | _ |
52 | -los | ell | ell | p | p | postype=personal|gen=c|num=p|person=3 | postype=personal|gen=c|num=p|person=3 | 51 | 51 | ci | ci | _ | _ | _ | _ | _ | arg2-ben |
53 | serveis | servei | servei | n | n | postype=common|gen=m|num=p | postype=common|gen=m|num=p | 51 | 51 | cd | cd | _ | _ | _ | _ | _ | arg1-pat |
54 | particulars | particular | particular | a | a | postype=qualificative|gen=c|num=p | postype=qualificative|gen=c|num=p | 53 | 53 | s.a | s.a | _ | _ | _ | _ | _ | _ |
55 | . | . | . | f | f | punct=period | punct=period | 7 | 7 | f | f | _ | _ | _ | _ | _ | _ |
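The trailing columns of the CoNLL 2009 rows encode semantic roles: after the 14 fixed columns (ID, FORM, LEMMA, PLEMMA, POS, PPOS, FEAT, PFEAT, HEAD, PHEAD, DEPREL, PDEPREL, FILLPRED, PRED) there is one APRED column per predicate, in the order in which the predicates occur in the sentence. The sketch below reads these columns under the assumption that the files are tab-separated. The function names are hypothetical, and the sample consists of tokens 1, 3 and 5 of the first development sentence on this page, abridged only to keep the example short.

```python
CONLL09_FIXED = (
    "id", "form", "lemma", "plemma", "pos", "ppos", "feat", "pfeat",
    "head", "phead", "deprel", "pdeprel", "fillpred", "pred",
)


def parse_sentence(lines):
    """Parse tab-separated CoNLL 2009 token lines into dicts.

    Columns after PRED are the APRED columns, one per predicate.
    """
    tokens = []
    for line in lines:
        values = line.rstrip("\n").split("\t")
        token = dict(zip(CONLL09_FIXED, values))
        token["apreds"] = values[len(CONLL09_FIXED):]
        tokens.append(token)
    return tokens


def semantic_arguments(tokens):
    """Yield (predicate sense, argument form, role) triples.

    The k-th token marked FILLPRED=Y owns the k-th APRED column.
    """
    predicates = [t for t in tokens if t["fillpred"] == "Y"]
    for k, predicate in enumerate(predicates):
        for token in tokens:
            role = token["apreds"][k]
            if role != "_":
                yield predicate["pred"], token["form"], role


# Abridged sample: tokens 1, 3 and 5 of the development sentence.
PROPER = "postype=proper|gen=c|num=c"
PARTICIPLE = "postype=main|gen=m|num=s|mood=pastparticiple"
COMMON = "postype=common|gen=m|num=s"
NAME = "Fundació_Privada_Fira_de_Manresa"
rows = [
    ["1", NAME, NAME, NAME, "n", "n", PROPER, PROPER,
     "3", "3", "suj", "suj", "_", "_", "arg0-agt"],
    ["3", "fet", "fer", "fer", "v", "v", PARTICIPLE, PARTICIPLE,
     "0", "0", "sentence", "sentence", "Y", "fer.a2", "_"],
    ["5", "balanç", "balanç", "balanç", "n", "n", COMMON, COMMON,
     "3", "3", "cd", "cd", "_", "_", "arg1-pat"],
]
tokens = parse_sentence("\t".join(row) for row in rows)
for triple in semantic_arguments(tokens):
    print(triple)
```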
The first sentence of the CoNLL 2009 development data:
1 | Fundació_Privada_Fira_de_Manresa | Fundació_Privada_Fira_de_Manresa | Fundació_Privada_Fira_de_Manresa | n | n | postype=proper|gen=c|num=c | postype=proper|gen=c|num=c | 3 | 3 | suj | suj | _ | _ | arg0-agt |
2 | ha | haver | haver | v | v | postype=auxiliary|gen=c|num=s|person=3|mood=indicative|tense=present | postype=auxiliary|gen=c|num=s|person=3|mood=indicative|tense=present | 3 | 3 | v | v | _ | _ | _ |
3 | fet | fer | fer | v | v | postype=main|gen=m|num=s|mood=pastparticiple | postype=main|gen=m|num=s|mood=pastparticiple | 0 | 0 | sentence | sentence | Y | fer.a2 | _ |
4 | un | un | un | d | d | postype=numeral|gen=m|num=s | postype=numeral|gen=m|num=s | 5 | 5 | spec | spec | _ | _ | _ |
5 | balanç | balanç | balanç | n | n | postype=common|gen=m|num=s | postype=common|gen=m|num=s | 3 | 3 | cd | cd | _ | _ | arg1-pat |
6 | de | de | de | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | 5 | 5 | sp | sp | _ | _ | _ |
7 | l' | el | el | d | d | postype=article|gen=c|num=s | postype=article|gen=c|num=s | 8 | 8 | spec | spec | _ | _ | _ |
8 | activitat | activitat | activitat | n | n | postype=common|gen=f|num=s | postype=common|gen=f|num=s | 6 | 6 | sn | sn | _ | _ | _ |
9 | del | del | del | s | s | postype=preposition|gen=m|num=s|contracted=yes | postype=preposition|gen=m|num=s|contracted=yes | 8 | 8 | sp | sp | _ | _ | _ |
10 | Palau_Firal | Palau_Firal | Palau_Firal | n | n | postype=proper|gen=c|num=c | postype=proper|gen=c|num=c | 9 | 9 | sn | sn | _ | _ | _ |
11 | durant | durant | durant | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | 8 | 3 | sp | cc | _ | _ | _ |
12 | els | el | el | d | d | postype=article|gen=m|num=p | postype=article|gen=m|num=p | 15 | 15 | spec | spec | _ | _ | _ |
13 | primers | primer | primer | a | a | postype=ordinal|gen=m|num=p | postype=ordinal|gen=m|num=p | 12 | 12 | a | a | _ | _ | _ |
14 | cinc | cinc | cinc | d | d | postype=numeral|gen=c|num=p | postype=numeral|gen=c|num=p | 12 | 12 | d | d | _ | _ | _ |
15 | mesos | mes | mes | n | n | postype=common|gen=m|num=p | postype=common|gen=m|num=p | 11 | 11 | sn | sn | _ | _ | _ |
16 | de | de | de | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | 15 | 15 | sp | sp | _ | _ | _ |
17 | l' | el | el | d | d | postype=article|gen=c|num=s | postype=article|gen=c|num=s | 18 | 18 | spec | spec | _ | _ | _ |
18 | any | any | any | n | n | postype=common|gen=m|num=s | postype=common|gen=m|num=s | 16 | 16 | sn | sn | _ | _ | _ |
19 | . | . | . | f | f | punct=period | punct=period | 3 | 3 | f | f | _ | _ | _ |
The first sentence of the CoNLL 2009 test data:
1 | El | el | el | d | d | postype=article|gen=m|num=s | postype=article|gen=m|num=s | _ | _ | _ | _ | _ |
2 | darrer | darrer | darrer | a | a | postype=ordinal|gen=m|num=s | postype=ordinal|gen=m|num=s | _ | _ | _ | _ | _ |
3 | número | número | número | n | n | postype=common|gen=m|num=s | postype=common|gen=m|num=s | _ | _ | _ | _ | _ |
4 | de | de | de | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | _ | _ | _ | _ | _ |
5 | l' | el | el | d | d | postype=article|gen=c|num=s | postype=article|gen=c|num=s | _ | _ | _ | _ | _ |
6 | Observatori_del_Mercat_de_Treball_d'_Osona | Observatori_del_Mercat_de_Treball_d'_Osona | Observatori_del_Mercat_de_Treball_d'_Osona | n | n | postype=proper|gen=c|num=c | postype=proper|gen=c|num=c | _ | _ | _ | _ | _ |
7 | inclou | incloure | incloure | v | v | postype=main|gen=c|num=s|person=3|mood=indicative|tense=present | postype=main|gen=c|num=s|person=3|mood=indicative|tense=present | _ | _ | _ | _ | Y |
8 | un | un | un | d | d | postype=numeral|gen=m|num=s | postype=numeral|gen=m|num=s | _ | _ | _ | _ | _ |
9 | informe | informe | informe | n | n | postype=common|gen=m|num=s | postype=common|gen=m|num=s | _ | _ | _ | _ | _ |
10 | especial | especial | especial | a | a | postype=qualificative|gen=c|num=s | postype=qualificative|gen=c|num=s | _ | _ | _ | _ | _ |
11 | sobre | sobre | sobre | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | _ | _ | _ | _ | _ |
12 | la | el | el | d | d | postype=article|gen=f|num=s | postype=article|gen=f|num=s | _ | _ | _ | _ | _ |
13 | contractació | contractació | contractació | n | n | postype=common|gen=f|num=s | postype=common|gen=f|num=s | _ | _ | _ | _ | _ |
14 | a_través_de | a_través_de | a_través_de | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | _ | _ | _ | _ | _ |
15 | les | el | el | d | d | postype=article|gen=f|num=p | postype=article|gen=f|num=p | _ | _ | _ | _ | _ |
16 | empreses | empresa | empresa | n | n | postype=common|gen=f|num=p | postype=common|gen=f|num=p | _ | _ | _ | _ | _ |
17 | de | de | de | s | s | postype=preposition|gen=c|num=c | postype=preposition|gen=c|num=c | _ | _ | _ | _ | _ |
18 | treball | treball | treball | n | n | postype=common|gen=m|num=s | postype=common|gen=m|num=s | _ | _ | _ | _ | _ |
19 | temporal | temporal | temporal | a | a | postype=qualificative|gen=c|num=s | postype=qualificative|gen=c|num=s | _ | _ | _ | _ | _ |
20 | , | , | , | f | f | punct=comma | punct=comma | _ | _ | _ | _ | _ |
21 | les | el | el | d | d | postype=article|gen=f|num=p | postype=article|gen=f|num=p | _ | _ | _ | _ | _ |
22 | ETT | ETT | ETT | n | n | postype=proper|gen=c|num=c | postype=proper|gen=c|num=c | _ | _ | _ | _ | _ |
23 | . | . | . | f | f | punct=period | punct=period | _ | _ | _ | _ | _ |
Nonprojectivities in AnCora-CA are very rare. Only 487 of the 435,860 tokens in the CoNLL 2007 version are attached nonprojectively (0.11%). In the CoNLL 2009 version, there are no nonprojectivities at all.
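For reference, a token can be tested for nonprojective attachment from the HEAD column alone. The sketch below uses the common definition that an arc is nonprojective if some token strictly between the head and the dependent is not a descendant of the head. It is not necessarily the exact procedure used to obtain the counts above, and the function name is hypothetical.

```python
def nonprojective_tokens(heads):
    """Return the IDs of tokens whose incoming arc is nonprojective.

    heads[i - 1] is the HEAD of token i; 0 denotes the artificial root,
    which is taken to dominate every token.
    """
    def dominated_by(node, ancestor):
        # Walk up the HEAD chain; the step limit guards against cycles.
        for _ in range(len(heads) + 1):
            if node == ancestor:
                return True
            if node == 0:
                return False
            node = heads[node - 1]
        return False

    result = []
    for dep, head in enumerate(heads, start=1):
        lo, hi = sorted((dep, head))
        # Every token strictly between head and dependent must be
        # a descendant of the head for the arc to be projective.
        if any(not dominated_by(k, head) for k in range(lo + 1, hi)):
            result.append(dep)
    return result


# Toy tree: token 4 depends on token 1, but the arc spans the root (2).
print(nonprojective_tokens([2, 0, 2, 1]))  # prints [4]

# Share of nonprojectively attached tokens quoted in the text.
print(round(487 / 435860 * 100, 2))  # prints 0.11
```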
The results of the CoNLL 2007 shared task are available online. They have been published in (Nivre et al., 2007). The evaluation procedure was changed to include punctuation tokens. These are the best results for Catalan:
Parser (Authors) | LAS | UAS |
---|---|---|
Titov et al. | 87.40 | 93.40 |
Sagae | 88.16 | 93.34 |
Malt (Nilsson et al.) | 88.70 | 93.12 |
Nakagawa | 87.90 | 92.86 |
Carreras | 87.60 | 92.46 |
Malt (Hall et al.) | 87.74 | 92.20 |
The two Malt parser results of 2007 (single malt and blended) are described in (Hall et al., 2007) and the details about the parser configuration are described here.
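The two metrics in the table are defined as follows: UAS is the percentage of tokens that receive the correct head, and LAS is the percentage of tokens that receive both the correct head and the correct dependency label. The sketch below is not the official evaluation script; it counts every token, punctuation included, in line with the 2007 evaluation procedure mentioned above. The gold values are the first four tokens of the CoNLL 2007 training sentence, and the predicted values are invented for illustration.

```python
def attachment_scores(gold, predicted):
    """Return (LAS, UAS) in percent.

    Both arguments are parallel lists of (head, deprel) pairs,
    one pair per token; all tokens count, including punctuation.
    """
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and aligned")
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    las_hits = sum(g == p for g, p in zip(gold, predicted))
    total = len(gold)
    return 100.0 * las_hits / total, 100.0 * uas_hits / total


# Gold: tokens 1-4 of the first CoNLL 2007 training sentence.
gold = [(2, "ESPEC"), (4, "SUJ"), (4, "AUX"), (0, "S")]
# Predicted: hypothetical parser output with one label error (token 2)
# and one head error (token 4).
predicted = [(2, "ESPEC"), (4, "CD"), (4, "AUX"), (3, "S")]

las, uas = attachment_scores(gold, predicted)
print(f"LAS = {las:.2f}, UAS = {uas:.2f}")  # prints LAS = 50.00, UAS = 75.00
```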
The results of the CoNLL 2009 shared task are available online. They have been published in (Hajič et al., 2009). Unlabeled attachment score was not published. These are the best results for Catalan:
Parser (Authors) | LAS |
---|---|
Merlo | 87.86 |
Che | 86.56 |
Bohnet | 86.35 |
Chen | 85.88 |