dc.contributor.author | De Camillis, Flavia |
dc.contributor.author | Chiocchetti, Elena |
dc.contributor.author | Stemle, Egon W. |
dc.date.accessioned | 2023-06-18T18:33:02Z |
dc.date.available | 2023-06-18T18:33:02Z |
dc.date.issued | 2023-06-13 |
dc.identifier.uri | http://hdl.handle.net/20.500.12124/60 |
dc.description | MT@BZ is a translation corpus of 52 decrees published by the Autonomous Province of Bolzano (South Tyrol), aligned with their machine-translated versions. More precisely, it comprises 26 decrees in German and the same 26 in Italian in their official versions, machine-translated by the project team into Italian and into German, respectively. Of the 26 decrees, 10 are COVID-19 related and 16 are miscellaneous. Overall, the texts amount to around 130,000 words. The machine translation was carried out with a customized version of ModernMT. The corpus was first uploaded to the annotation platform WebAnno and later transferred to INCEpTION. Four annotators annotated the machine translation errors according to an ad hoc error taxonomy for quality assessment. Finally, the annotations were curated to create a gold standard corpus. |
dc.language.iso | ita |
dc.language.iso | deu |
dc.publisher | Institute for Applied Linguistics, Eurac Research |
dc.relation.isbasedon | https://gitlab.inf.unibz.it/commul/mt-bz/data/bundle/-/tags/v1.0 |
dc.relation.isreferencedby | https://events.tuni.fi/uploads/2023/06/11678752-proceedings-eamt2023.pdf |
dc.rights | Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by-nc/4.0/ |
dc.rights.label | PUB |
dc.source.uri | https://www.eurac.edu/it/institutes-centers/istituto-di-linguistica-applicata/projects/mtbz |
dc.subject | machine translation |
dc.subject | annotation |
dc.subject | translation errors |
dc.subject | accuracy |
dc.subject | fluency |
dc.subject | Italian |
dc.subject | German |
dc.subject | South Tyrolean German |
dc.subject | legal language |
dc.title | MT@BZ translation corpus v1.0 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | Lexicography, Terminology, and Translation |
contact.person | Elena Chiocchetti elena.chiocchetti@eurac.edu Eurac Research |
contact.person | Corpus Manager clarin@eurac.edu Eurac Research CLARIN Centre (ERCC) |
sponsor | Institute for Applied Linguistics, Eurac Research / Machine Translation at South Tyrolean Institutions ownFunds |
size.info | 52 texts |
size.info | 130,000 tokens |
files.size | 37159305 |
files.count | 5 |
Files in this item
Download all files in item (35.44 MB)
This item is publicly available and licensed under: Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
- Name: README.html
- Size: 15.01 KB
- Format: HTML
- Description: README.html
- MD5: d579908b48b7ba7904bdf0947dd4b644
- Name: CHANGELOG.html
- Size: 1.68 KB
- Format: HTML
- Description: CHANGELOG.html
- MD5: 601def2df881559f213b9eef66404868
- Name: decrees-v1.0.zip
- Size: 3.77 MB
- Format: application/zip
- Description: decrees-v1.0.zip
- MD5: 4a9a23e5677d85a1e6aa81f6d1f6fbc5
- decrees-v1.0
- CHANGELOG.md (512 B)
- LICENSE (18 kB)
- make_tsv.py (10 kB)
- README.md (11 kB)
- C
- C_global_DE-IT.xlsx (198 kB)
- DE-IT
- C9_trained_IT.txt (3 kB)
- C9_baseline_IT.txt (2 kB)
- C2_source_DE.txt (4 kB)
- C2_ref_IT.txt (4 kB)
- C13_baseline_IT.txt (3 kB)
- C8_trained_IT.txt (800 B)
- C7_source_DE.txt (2 kB)
- C4_baseline_IT.txt (3 kB)
- C7_trained_IT.txt (2 kB)
- C6_trained_IT.txt (1 kB)
- C4_ref_IT.txt (4 kB)
- C5_trained_IT.txt (2 kB)
- C13_source_DE.txt (3 kB)
- C4_trained_IT.txt (4 kB)
- C8_baseline_IT.txt (773 B)
- C6_ref_IT.txt (1 kB)
- C10_ref_IT.txt (3 kB)
- C4_source_DE.txt (4 kB)
- C3_trained_IT.txt (3 kB)
- C12_baseline_IT.txt (17 kB)
- C3_baseline_IT.txt (3 kB)
- C2_trained_IT.txt (4 kB)
- C8_ref_IT.txt (827 B)
- C16_trained_IT.txt (9 kB)
- C12_ref_IT.txt (18 kB)
- C9_source_DE.txt (3 kB)
- C1_trained_IT.txt (6 kB)
- C15_trained_IT.txt (3 kB)
- C10_source_DE.txt (3 kB)
- C14_trained_IT.txt (6 kB)
- C15_source_DE.txt (3 kB)
- C14_ref_IT.txt (6 kB)
- C13_trained_IT.txt (3 kB)
- C7_baseline_IT.txt (2 kB)
- C1_source_DE.txt (7 kB)
- C16_baseline_IT.txt (9 kB)
- C12_trained_IT.txt (18 kB)
- C16_ref_IT.txt (10 kB)
- C2_baseline_IT.txt (4 kB)
- C6_source_DE.txt (1 kB)
- C11_baseline_IT.txt (10 kB)
- C11_trained_IT.txt (10 kB)
- C1_ref_IT.txt (6 kB)
- C10_trained_IT.txt (2 kB)
- C12_source_DE.txt (18 kB)
- C3_ref_IT.txt (3 kB)
- C3_source_DE.txt (3 kB)
- C15_baseline_IT.txt (3 kB)
- C6_baseline_IT.txt (1 kB)
- C5_ref_IT.txt (2 kB)
- C10_baseline_IT.txt (2 kB)
- C8_source_DE.txt (716 B)
- C1_baseline_IT.txt (6 kB)
- C7_ref_IT.txt (2 kB)
- C11_ref_IT.txt (11 kB)
- C14_source_DE.txt (6 kB)
- C9_ref_IT.txt (3 kB)
- C5_baseline_IT.txt (2 kB)
- C13_ref_IT.txt (3 kB)
- C14_baseline_IT.txt (6 kB)
- C5_source_DE.txt (2 kB)
- C15_ref_IT.txt (3 kB)
- C11_source_DE.txt (10 kB)
- C16_source_DE.txt (10 kB)
- IT-DE
- C2_baseline_DE.txt (4 kB)
- C12_source_IT.txt (18 kB)
- C11_baseline_DE.txt (11 kB)
- C11_trained_DE.txt (11 kB)
- C1_ref_DE.txt (7 kB)
- C10_trained_DE.txt (3 kB)
- C3_ref_DE.txt (3 kB)
- C3_source_IT.txt (3 kB)
- C15_baseline_DE.txt (3 kB)
- C8_source_IT.txt (827 B)
- C6_baseline_DE.txt (1 kB)
- C5_ref_DE.txt (2 kB)
- C10_baseline_DE.txt (3 kB)
- C1_baseline_DE.txt (7 kB)
- C14_source_IT.txt (6 kB)
- C7_ref_DE.txt (2 kB)
- C11_ref_DE.txt (10 kB)
- C5_source_IT.txt (2 kB)
- C9_ref_DE.txt (3 kB)
- C5_baseline_DE.txt (2 kB)
- C13_ref_DE.txt (3 kB)
- C14_baseline_DE.txt (6 kB)
- C11_source_IT.txt (11 kB)
- C15_ref_DE.txt (3 kB)
- C16_source_IT.txt (10 kB)
- C2_source_IT.txt (4 kB)
- C9_trained_DE.txt (3 kB)
- C9_baseline_DE.txt (3 kB)
- C7_source_IT.txt (2 kB)
- C8_trained_DE.txt (734 B)
- C2_ref_DE.txt (4 kB)
- C13_baseline_DE.txt (3 kB)
- C4_baseline_DE.txt (3 kB)
- C7_trained_DE.txt (2 kB)
- C13_source_IT.txt (3 kB)
- C6_trained_DE.txt (1 kB)
- C4_ref_DE.txt (4 kB)
- C5_trained_DE.txt (2 kB)
- C4_trained_DE.txt (3 kB)
- C4_source_IT.txt (4 kB)
- C8_baseline_DE.txt (763 B)
- C6_ref_DE.txt (1 kB)
- C10_ref_DE.txt (3 kB)
- C3_trained_DE.txt (3 kB)
- C12_baseline_DE.txt (18 kB)
- C9_source_IT.txt (3 kB)
- C16_trained_DE.txt (10 kB)
- C3_baseline_DE.txt (3 kB)
- C10_source_IT.txt (3 kB)
- C2_trained_DE.txt (4 kB)
- C8_ref_DE.txt (716 B)
- C12_ref_DE.txt (18 kB)
- C15_source_IT.txt (3 kB)
- C1_trained_DE.txt (7 kB)
- C15_trained_DE.txt (3 kB)
- C14_trained_DE.txt (6 kB)
- C1_source_IT.txt (6 kB)
- C14_ref_DE.txt (6 kB)
- C13_trained_DE.txt (3 kB)
- C7_baseline_DE.txt (2 kB)
- C12_trained_DE.txt (17 kB)
- C16_baseline_DE.txt (10 kB)
- C6_source_IT.txt (1 kB)
- C16_ref_DE.txt (10 kB)
- C_global_IT-DE.xlsx (182 kB)
- tsv
- C_DE-IT_11.tsv (252 kB)
- C_IT-DE_06.tsv (33 kB)
- B_IT-DE_07.tsv (212 kB)
- C_IT-DE_14.tsv (169 kB)
- B_DE-IT_07.tsv (205 kB)
- C_DE-IT_06.tsv (33 kB)
- C_IT-DE_02.tsv (97 kB)
- B_IT-DE_03.tsv (221 kB)
- C_DE-IT_14.tsv (171 kB)
- C_IT-DE_10.tsv (69 kB)
- B_DE-IT_03.tsv (226 kB)
- C_IT-DE_09.tsv (69 kB)
- C_DE-IT_02.tsv (102 kB)
- C_DE-IT_10.tsv (68 kB)
- C_DE-IT_09.tsv (68 kB)
- C_IT-DE_05.tsv (48 kB)
- B_IT-DE_06.tsv (663 kB)
- C_IT-DE_13.tsv (80 kB)
- B_DE-IT_06.tsv (639 kB)
- C_DE-IT_05.tsv (48 kB)
- C_IT-DE_01.tsv (163 kB)
- B_IT-DE_02.tsv (219 kB)
- C_DE-IT_13.tsv (80 kB)
- B_DE-IT_02.tsv (223 kB)
- C_IT-DE_08.tsv (18 kB)
- B_IT-DE_10.tsv (819 kB)
- C_DE-IT_01.tsv (158 kB)
- B_IT-DE_09.tsv (375 kB)
- B_DE-IT_10.tsv (820 kB)
- C_IT-DE_16.tsv (225 kB)
- B_DE-IT_09.tsv (378 kB)
- C_DE-IT_08.tsv (18 kB)
- C_IT-DE_04.tsv (89 kB)
- B_IT-DE_05.tsv (148 kB)
- C_DE-IT_16.tsv (218 kB)
- C_IT-DE_12.tsv (430 kB)
- B_DE-IT_05.tsv (145 kB)
- C_DE-IT_04.tsv (91 kB)
- B_IT-DE_01.tsv (150 kB)
- C_DE-IT_12.tsv (435 kB)
- B_DE-IT_01.tsv (149 kB)
- C_IT-DE_07.tsv (64 kB)
- B_IT-DE_08.tsv (137 kB)
- C_IT-DE_15.tsv (80 kB)
- B_DE-IT_08.tsv (140 kB)
- C_DE-IT_07.tsv (66 kB)
- C_IT-DE_03.tsv (74 kB)
- B_IT-DE_04.tsv (342 kB)
- C_DE-IT_15.tsv (78 kB)
- C_IT-DE_11.tsv (258 kB)
- B_DE-IT_04.tsv (346 kB)
- C_DE-IT_03.tsv (72 kB)
- B
- B_global_DE-IT.xlsx (251 kB)
- DE-IT
- B7_ref_IT.txt (10 kB)
- B2_baseline_IT.txt (9 kB)
- B10_trained_IT.txt (35 kB)
- B2_source_DE.txt (10 kB)
- B10_baseline_IT.txt (34 kB)
- B9_ref_IT.txt (16 kB)
- B7_source_DE.txt (8 kB)
- B6_baseline_IT.txt (26 kB)
- B9_trained_IT.txt (16 kB)
- B1_baseline_IT.txt (6 kB)
- B8_trained_IT.txt (5 kB)
- B4_source_DE.txt (14 kB)
- B7_trained_IT.txt (9 kB)
- B6_trained_IT.txt (27 kB)
- B9_source_DE.txt (16 kB)
- B2_ref_IT.txt (9 kB)
- B5_trained_IT.txt (6 kB)
- B4_trained_IT.txt (13 kB)
- B5_baseline_IT.txt (5 kB)
- B4_ref_IT.txt (14 kB)
- B1_source_DE.txt (6 kB)
- B3_trained_IT.txt (10 kB)
- B2_trained_IT.txt (9 kB)
- B10_ref_IT.txt (36 kB)
- B6_source_DE.txt (26 kB)
- B6_ref_IT.txt (29 kB)
- B10_source_DE.txt (34 kB)
- B1_trained_IT.txt (6 kB)
- B9_baseline_IT.txt (15 kB)
- B8_ref_IT.txt (5 kB)
- B4_baseline_IT.txt (13 kB)
- B3_source_DE.txt (9 kB)
- B8_source_DE.txt (5 kB)
- B8_baseline_IT.txt (5 kB)
- B3_baseline_IT.txt (9 kB)
- B1_ref_IT.txt (6 kB)
- B5_source_DE.txt (6 kB)
- B3_ref_IT.txt (10 kB)
- B5_ref_IT.txt (6 kB)
- B7_baseline_IT.txt (8 kB)
- IT-DE
- B10_source_IT.txt (36 kB)
- B2_trained_DE.txt (9 kB)
- B10_ref_DE.txt (34 kB)
- B1_trained_DE.txt (6 kB)
- B6_ref_DE.txt (26 kB)
- B9_baseline_DE.txt (16 kB)
- B8_ref_DE.txt (5 kB)
- B4_baseline_DE.txt (14 kB)
- B3_source_IT.txt (10 kB)
- B8_source_IT.txt (5 kB)
- B8_baseline_DE.txt (5 kB)
- B1_ref_DE.txt (6 kB)
- B3_baseline_DE.txt (9 kB)
- B5_source_IT.txt (6 kB)
- B3_ref_DE.txt (9 kB)
- B5_ref_DE.txt (6 kB)
- B7_baseline_DE.txt (9 kB)
- B2_source_IT.txt (9 kB)
- B7_ref_DE.txt (8 kB)
- B2_baseline_DE.txt (9 kB)
- B10_trained_DE.txt (34 kB)
- B7_source_IT.txt (10 kB)
- B10_baseline_DE.txt (34 kB)
- B9_ref_DE.txt (16 kB)
- B6_baseline_DE.txt (28 kB)
- B4_source_IT.txt (14 kB)
- B9_trained_DE.txt (16 kB)
- B1_baseline_DE.txt (6 kB)
- B8_trained_DE.txt (5 kB)
- B9_source_IT.txt (16 kB)
- B7_trained_DE.txt (8 kB)
- B6_trained_DE.txt (27 kB)
- B2_ref_DE.txt (10 kB)
- B1_source_IT.txt (6 kB)
- B5_trained_DE.txt (6 kB)
- B4_trained_DE.txt (14 kB)
- B5_baseline_DE.txt (6 kB)
- B4_ref_DE.txt (14 kB)
- B3_trained_DE.txt (9 kB)
- B6_source_IT.txt (29 kB)
- B_global_IT-DE.xlsx (228 kB)
- Name: inception-v1.0.zip
- Size: 22.39 MB
- Format: application/zip
- Description: inception-v1.0.zip
- MD5: b3d0e0de8ad2e9ec7aa4ea1e20bf6670
- inception-v1.0
- CHANGELOG.md (512 B)
- LICENSE (18 kB)
- README.md (1 kB)
- mt_bz-v1_0.zip (23 MB)
- Name: gold-v1.0.zip
- Size: 9.26 MB
- Format: application/zip
- Description: gold-v1.0.zip
- MD5: 0e6d446d61d86ff50209f1e98ce08e27
- gold-v1.0
- CHANGELOG.md (512 B)
- LICENSE (18 kB)
- README.md (1 kB)
- annis
- ExtData
- mt.config (679 B)
- corpus.properties (85 B)
- mt.css (1 kB)
- text.annis (923 kB)
- resolver_vis_map.annis (589 B)
- component.annis (2 MB)
- corpus_annotation.annis (641 B)
- node.annis (21 MB)
- rank.annis (4 MB)
- edge_annotation.annis (0 B)
- corpus.annis (4 kB)
- annis.version (3 B)
- example_queries.annis (1 kB)
- node_annotation.annis (3 MB)
- ExtData
- tsv
- C_DE-IT_11.tsv
- CURATION_USER.tsv (381 kB)
- C_IT-DE_06.tsv
- CURATION_USER.tsv (41 kB)
- B_IT-DE_07.tsv
- CURATION_USER.tsv (315 kB)
- C_IT-DE_14.tsv
- CURATION_USER.tsv (256 kB)
- B_DE-IT_07.tsv
- CURATION_USER.tsv (306 kB)
- C_DE-IT_06.tsv
- CURATION_USER.tsv (50 kB)
- C_IT-DE_02.tsv
- CURATION_USER.tsv (148 kB)
- B_IT-DE_03.tsv
- CURATION_USER.tsv (333 kB)
- C_DE-IT_14.tsv
- CURATION_USER.tsv (255 kB)
- C_IT-DE_10.tsv
- CURATION_USER.tsv (103 kB)
- B_DE-IT_03.tsv
- CURATION_USER.tsv (346 kB)
- C_IT-DE_09.tsv
- CURATION_USER.tsv (104 kB)
- C_DE-IT_02.tsv
- CURATION_USER.tsv (155 kB)
- C_DE-IT_10.tsv
- CURATION_USER.tsv (103 kB)
- C_DE-IT_09.tsv
- CURATION_USER.tsv (103 kB)
- C_IT-DE_05.tsv
- CURATION_USER.tsv (73 kB)
- B_IT-DE_06.tsv
- CURATION_USER.tsv (1010 kB)
- C_IT-DE_13.tsv
- CURATION_USER.tsv (119 kB)
- B_DE-IT_06.tsv
- CURATION_USER.tsv (971 kB)
- C_DE-IT_05.tsv
- CURATION_USER.tsv (74 kB)
- C_IT-DE_01.tsv
- CURATION_USER.tsv (244 kB)
- B_IT-DE_02.tsv
- CURATION_USER.tsv (336 kB)
- C_DE-IT_13.tsv
- CURATION_USER.tsv (120 kB)
- B_DE-IT_02.tsv
- CURATION_USER.tsv (342 kB)
- C_IT-DE_08.tsv
- CURATION_USER.tsv (28 kB)
- B_IT-DE_10.tsv
- CURATION_USER.tsv (1 MB)
- C_DE-IT_01.tsv
- CURATION_USER.tsv (237 kB)
- B_IT-DE_09.tsv
- CURATION_USER.tsv (555 kB)
- B_DE-IT_10.tsv
- CURATION_USER.tsv (1 MB)
- C_IT-DE_16.tsv
- CURATION_USER.tsv (350 kB)
- B_DE-IT_09.tsv
- CURATION_USER.tsv (547 kB)
- C_DE-IT_08.tsv
- CURATION_USER.tsv (27 kB)
- C_IT-DE_04.tsv
- CURATION_USER.tsv (132 kB)
- B_IT-DE_05.tsv
- CURATION_USER.tsv (224 kB)
- C_DE-IT_16.tsv
- CURATION_USER.tsv (340 kB)
- C_IT-DE_12.tsv
- CURATION_USER.tsv (664 kB)
- B_DE-IT_05.tsv
- CURATION_USER.tsv (220 kB)
- C_DE-IT_04.tsv
- CURATION_USER.tsv (134 kB)
- B_IT-DE_01.tsv
- CURATION_USER.tsv (227 kB)
- C_DE-IT_12.tsv
- CURATION_USER.tsv (680 kB)
- B_DE-IT_01.tsv
- CURATION_USER.tsv (223 kB)
- C_IT-DE_07.tsv
- CURATION_USER.tsv (102 kB)
- B_IT-DE_08.tsv
- CURATION_USER.tsv (206 kB)
- C_IT-DE_15.tsv
- CURATION_USER.tsv (127 kB)
- B_DE-IT_08.tsv
- CURATION_USER.tsv (213 kB)
- C_DE-IT_07.tsv
- CURATION_USER.tsv (98 kB)
- C_IT-DE_03.tsv
- CURATION_USER.tsv (109 kB)
- B_IT-DE_04.tsv
- CURATION_USER.tsv (503 kB)
- C_DE-IT_15.tsv
- CURATION_USER.tsv (124 kB)
- C_IT-DE_11.tsv
- CURATION_USER.tsv (387 kB)
- B_DE-IT_04.tsv
- CURATION_USER.tsv (506 kB)
- C_DE-IT_03.tsv
- CURATION_USER.tsv (115 kB)
- C_DE-IT_11.tsv
- annis.pepper (478 B)
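
The MD5 values listed for the five files in this item can be used to check that a download arrived intact. The following is a minimal sketch of such a check, not part of the corpus distribution: the function names `md5_of` and `verify_downloads` are hypothetical, and the sketch assumes the files were saved under their listed names in a single local directory.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing of this record.
EXPECTED_MD5 = {
    "README.html": "d579908b48b7ba7904bdf0947dd4b644",
    "CHANGELOG.html": "601def2df881559f213b9eef66404868",
    "decrees-v1.0.zip": "4a9a23e5677d85a1e6aa81f6d1f6fbc5",
    "inception-v1.0.zip": "b3d0e0de8ad2e9ec7aa4ea1e20bf6670",
    "gold-v1.0.zip": "0e6d446d61d86ff50209f1e98ce08e27",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True (match), False (mismatch) or None (not found).

    Assumption: the files sit directly in `directory` under their listed names.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = md5_of(path) == expected
        else:
            results[name] = None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, outcome in verify_downloads().items():
        print(f"{name}: {labels[outcome]}")
```

Run from the download directory, the script prints one status line per file. A mismatch most likely indicates an incomplete or corrupted download and is a reason to fetch the file again.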