dc.contributor.author De Camillis, Flavia
dc.contributor.author Chiocchetti, Elena
dc.contributor.author Stemle, Egon W.
dc.date.accessioned 2023-06-18T18:33:02Z
dc.date.available 2023-06-18T18:33:02Z
dc.date.issued 2023-06-13
dc.identifier.uri http://hdl.handle.net/20.500.12124/60
dc.description MT@BZ is a translation corpus of 52 decrees published by the Autonomous Province of Bolzano (South Tyrol), aligned with their machine-translated versions. More precisely, it comprises 26 decrees in German and the same 26 in Italian, in their official versions, which the project team machine-translated into Italian and German, respectively. Ten of the 26 decrees are COVID-19-related, while 16 are miscellaneous. Overall, the texts amount to around 130,000 words. The machine translation was carried out with a customized version of ModernMT. The corpus was then uploaded to the annotation platform WebAnno and later transferred to INCEpTION. Four annotators annotated the translation errors made by the machine according to an ad hoc error taxonomy for quality assessment. Finally, the annotations were curated to create a gold standard corpus.
dc.language.iso ita
dc.language.iso deu
dc.publisher Institute for Applied Linguistics, Eurac Research
dc.relation.isbasedon https://gitlab.inf.unibz.it/commul/mt-bz/data/bundle/-/tags/v1.0
dc.relation.isreferencedby https://events.tuni.fi/uploads/2023/06/11678752-proceedings-eamt2023.pdf
dc.rights Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-nc/4.0/
dc.rights.label PUB
dc.source.uri https://www.eurac.edu/it/institutes-centers/istituto-di-linguistica-applicata/projects/mtbz
dc.subject machine translation
dc.subject annotation
dc.subject translation errors
dc.subject accuracy
dc.subject fluency
dc.subject Italian
dc.subject German
dc.subject South Tyrolean German
dc.subject legal language
dc.title MT@BZ translation corpus v1.0
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding Lexicography, Terminology, and Translation
contact.person Elena Chiocchetti elena.chiocchetti@eurac.edu Eurac Research
contact.person Corpus Manager clarin@eurac.edu Eurac Research CLARIN Centre (ERCC)
sponsor Institute for Applied Linguistics, Eurac Research / Machine Translation at South Tyrolean Institutions ownFunds
size.info 52 texts
size.info 130,000 tokens
files.size 37159305
files.count 5


 Files in this item

This item is publicly available and licensed under Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).
Name: README.html
Size: 15.01 KB
Format: HTML
Description: README.html
MD5: d579908b48b7ba7904bdf0947dd4b644
Name: CHANGELOG.html
Size: 1.68 KB
Format: HTML
Description: CHANGELOG.html
MD5: 601def2df881559f213b9eef66404868
Name: decrees-v1.0.zip
Size: 3.77 MB
Format: application/zip
Description: decrees-v1.0.zip
MD5: 4a9a23e5677d85a1e6aa81f6d1f6fbc5
File Preview
  • decrees-v1.0
    • CHANGELOG.md (512 B)
    • LICENSE (18 kB)
    • make_tsv.py (10 kB)
    • README.md (11 kB)
    • C
      • C_global_DE-IT.xlsx (198 kB)
      • DE-IT
        • C9_trained_IT.txt (3 kB)
        • C9_baseline_IT.txt (2 kB)
        • C2_source_DE.txt (4 kB)
        • C2_ref_IT.txt (4 kB)
        • C13_baseline_IT.txt (3 kB)
        • C8_trained_IT.txt (800 B)
        • C7_source_DE.txt (2 kB)
        • C4_baseline_IT.txt (3 kB)
        • C7_trained_IT.txt (2 kB)
        • C6_trained_IT.txt (1 kB)
        • C4_ref_IT.txt (4 kB)
        • C5_trained_IT.txt (2 kB)
        • C13_source_DE.txt (3 kB)
        • C4_trained_IT.txt (4 kB)
        • C8_baseline_IT.txt (773 B)
        • C6_ref_IT.txt (1 kB)
        • C10_ref_IT.txt (3 kB)
        • C4_source_DE.txt (4 kB)
        • C3_trained_IT.txt (3 kB)
        • C12_baseline_IT.txt (17 kB)
        • C3_baseline_IT.txt (3 kB)
        • C2_trained_IT.txt (4 kB)
        • C8_ref_IT.txt (827 B)
        • C16_trained_IT.txt (9 kB)
        • C12_ref_IT.txt (18 kB)
        • C9_source_DE.txt (3 kB)
        • C1_trained_IT.txt (6 kB)
        • C15_trained_IT.txt (3 kB)
        • C10_source_DE.txt (3 kB)
        • C14_trained_IT.txt (6 kB)
        • C15_source_DE.txt (3 kB)
        • C14_ref_IT.txt (6 kB)
        • C13_trained_IT.txt (3 kB)
        • C7_baseline_IT.txt (2 kB)
        • C1_source_DE.txt (7 kB)
        • C16_baseline_IT.txt (9 kB)
        • C12_trained_IT.txt (18 kB)
        • C16_ref_IT.txt (10 kB)
        • C2_baseline_IT.txt (4 kB)
        • C6_source_DE.txt (1 kB)
        • C11_baseline_IT.txt (10 kB)
        • C11_trained_IT.txt (10 kB)
        • C1_ref_IT.txt (6 kB)
        • C10_trained_IT.txt (2 kB)
        • C12_source_DE.txt (18 kB)
        • C3_ref_IT.txt (3 kB)
        • C3_source_DE.txt (3 kB)
        • C15_baseline_IT.txt (3 kB)
        • C6_baseline_IT.txt (1 kB)
        • C5_ref_IT.txt (2 kB)
        • C10_baseline_IT.txt (2 kB)
        • C8_source_DE.txt (716 B)
        • C1_baseline_IT.txt (6 kB)
        • C7_ref_IT.txt (2 kB)
        • C11_ref_IT.txt (11 kB)
        • C14_source_DE.txt (6 kB)
        • C9_ref_IT.txt (3 kB)
        • C5_baseline_IT.txt (2 kB)
        • C13_ref_IT.txt (3 kB)
        • C14_baseline_IT.txt (6 kB)
        • C5_source_DE.txt (2 kB)
        • C15_ref_IT.txt (3 kB)
        • C11_source_DE.txt (10 kB)
        • C16_source_DE.txt (10 kB)
      • IT-DE
        • C2_baseline_DE.txt (4 kB)
        • C12_source_IT.txt (18 kB)
        • C11_baseline_DE.txt (11 kB)
        • C11_trained_DE.txt (11 kB)
        • C1_ref_DE.txt (7 kB)
        • C10_trained_DE.txt (3 kB)
        • C3_ref_DE.txt (3 kB)
        • C3_source_IT.txt (3 kB)
        • C15_baseline_DE.txt (3 kB)
        • C8_source_IT.txt (827 B)
        • C6_baseline_DE.txt (1 kB)
        • C5_ref_DE.txt (2 kB)
        • C10_baseline_DE.txt (3 kB)
        • C1_baseline_DE.txt (7 kB)
        • C14_source_IT.txt (6 kB)
        • C7_ref_DE.txt (2 kB)
        • C11_ref_DE.txt (10 kB)
        • C5_source_IT.txt (2 kB)
        • C9_ref_DE.txt (3 kB)
        • C5_baseline_DE.txt (2 kB)
        • C13_ref_DE.txt (3 kB)
        • C14_baseline_DE.txt (6 kB)
        • C11_source_IT.txt (11 kB)
        • C15_ref_DE.txt (3 kB)
        • C16_source_IT.txt (10 kB)
        • C2_source_IT.txt (4 kB)
        • C9_trained_DE.txt (3 kB)
        • C9_baseline_DE.txt (3 kB)
        • C7_source_IT.txt (2 kB)
        • C8_trained_DE.txt (734 B)
        • C2_ref_DE.txt (4 kB)
        • C13_baseline_DE.txt (3 kB)
        • C4_baseline_DE.txt (3 kB)
        • C7_trained_DE.txt (2 kB)
        • C13_source_IT.txt (3 kB)
        • C6_trained_DE.txt (1 kB)
        • C4_ref_DE.txt (4 kB)
        • C5_trained_DE.txt (2 kB)
        • C4_trained_DE.txt (3 kB)
        • C4_source_IT.txt (4 kB)
        • C8_baseline_DE.txt (763 B)
        • C6_ref_DE.txt (1 kB)
        • C10_ref_DE.txt (3 kB)
        • C3_trained_DE.txt (3 kB)
        • C12_baseline_DE.txt (18 kB)
        • C9_source_IT.txt (3 kB)
        • C16_trained_DE.txt (10 kB)
        • C3_baseline_DE.txt (3 kB)
        • C10_source_IT.txt (3 kB)
        • C2_trained_DE.txt (4 kB)
        • C8_ref_DE.txt (716 B)
        • C12_ref_DE.txt (18 kB)
        • C15_source_IT.txt (3 kB)
        • C1_trained_DE.txt (7 kB)
        • C15_trained_DE.txt (3 kB)
        • C14_trained_DE.txt (6 kB)
        • C1_source_IT.txt (6 kB)
        • C14_ref_DE.txt (6 kB)
        • C13_trained_DE.txt (3 kB)
        • C7_baseline_DE.txt (2 kB)
        • C12_trained_DE.txt (17 kB)
        • C16_baseline_DE.txt (10 kB)
        • C6_source_IT.txt (1 kB)
        • C16_ref_DE.txt (10 kB)
      • C_global_IT-DE.xlsx (182 kB)
    • tsv
      • C_DE-IT_11.tsv (252 kB)
      • C_IT-DE_06.tsv (33 kB)
      • B_IT-DE_07.tsv (212 kB)
      • C_IT-DE_14.tsv (169 kB)
      • B_DE-IT_07.tsv (205 kB)
      • C_DE-IT_06.tsv (33 kB)
      • C_IT-DE_02.tsv (97 kB)
      • B_IT-DE_03.tsv (221 kB)
      • C_DE-IT_14.tsv (171 kB)
      • C_IT-DE_10.tsv (69 kB)
      • B_DE-IT_03.tsv (226 kB)
      • C_IT-DE_09.tsv (69 kB)
      • C_DE-IT_02.tsv (102 kB)
      • C_DE-IT_10.tsv (68 kB)
      • C_DE-IT_09.tsv (68 kB)
      • C_IT-DE_05.tsv (48 kB)
      • B_IT-DE_06.tsv (663 kB)
      • C_IT-DE_13.tsv (80 kB)
      • B_DE-IT_06.tsv (639 kB)
      • C_DE-IT_05.tsv (48 kB)
      • C_IT-DE_01.tsv (163 kB)
      • B_IT-DE_02.tsv (219 kB)
      • C_DE-IT_13.tsv (80 kB)
      • B_DE-IT_02.tsv (223 kB)
      • C_IT-DE_08.tsv (18 kB)
      • B_IT-DE_10.tsv (819 kB)
      • C_DE-IT_01.tsv (158 kB)
      • B_IT-DE_09.tsv (375 kB)
      • B_DE-IT_10.tsv (820 kB)
      • C_IT-DE_16.tsv (225 kB)
      • B_DE-IT_09.tsv (378 kB)
      • C_DE-IT_08.tsv (18 kB)
      • C_IT-DE_04.tsv (89 kB)
      • B_IT-DE_05.tsv (148 kB)
      • C_DE-IT_16.tsv (218 kB)
      • C_IT-DE_12.tsv (430 kB)
      • B_DE-IT_05.tsv (145 kB)
      • C_DE-IT_04.tsv (91 kB)
      • B_IT-DE_01.tsv (150 kB)
      • C_DE-IT_12.tsv (435 kB)
      • B_DE-IT_01.tsv (149 kB)
      • C_IT-DE_07.tsv (64 kB)
      • B_IT-DE_08.tsv (137 kB)
      • C_IT-DE_15.tsv (80 kB)
      • B_DE-IT_08.tsv (140 kB)
      • C_DE-IT_07.tsv (66 kB)
      • C_IT-DE_03.tsv (74 kB)
      • B_IT-DE_04.tsv (342 kB)
      • C_DE-IT_15.tsv (78 kB)
      • C_IT-DE_11.tsv (258 kB)
      • B_DE-IT_04.tsv (346 kB)
      • C_DE-IT_03.tsv (72 kB)
    • B
      • B_global_DE-IT.xlsx (251 kB)
      • DE-IT
        • B7_ref_IT.txt (10 kB)
        • B2_baseline_IT.txt (9 kB)
        • B10_trained_IT.txt (35 kB)
        • B2_source_DE.txt (10 kB)
        • B10_baseline_IT.txt (34 kB)
        • B9_ref_IT.txt (16 kB)
        • B7_source_DE.txt (8 kB)
        • B6_baseline_IT.txt (26 kB)
        • B9_trained_IT.txt (16 kB)
        • B1_baseline_IT.txt (6 kB)
        • B8_trained_IT.txt (5 kB)
        • B4_source_DE.txt (14 kB)
        • B7_trained_IT.txt (9 kB)
        • B6_trained_IT.txt (27 kB)
        • B9_source_DE.txt (16 kB)
        • B2_ref_IT.txt (9 kB)
        • B5_trained_IT.txt (6 kB)
        • B4_trained_IT.txt (13 kB)
        • B5_baseline_IT.txt (5 kB)
        • B4_ref_IT.txt (14 kB)
        • B1_source_DE.txt (6 kB)
        • B3_trained_IT.txt (10 kB)
        • B2_trained_IT.txt (9 kB)
        • B10_ref_IT.txt (36 kB)
        • B6_source_DE.txt (26 kB)
        • B6_ref_IT.txt (29 kB)
        • B10_source_DE.txt (34 kB)
        • B1_trained_IT.txt (6 kB)
        • B9_baseline_IT.txt (15 kB)
        • B8_ref_IT.txt (5 kB)
        • B4_baseline_IT.txt (13 kB)
        • B3_source_DE.txt (9 kB)
        • B8_source_DE.txt (5 kB)
        • B8_baseline_IT.txt (5 kB)
        • B3_baseline_IT.txt (9 kB)
        • B1_ref_IT.txt (6 kB)
        • B5_source_DE.txt (6 kB)
        • B3_ref_IT.txt (10 kB)
        • B5_ref_IT.txt (6 kB)
        • B7_baseline_IT.txt (8 kB)
      • IT-DE
        • B10_source_IT.txt (36 kB)
        • B2_trained_DE.txt (9 kB)
        • B10_ref_DE.txt (34 kB)
        • B1_trained_DE.txt (6 kB)
        • B6_ref_DE.txt (26 kB)
        • B9_baseline_DE.txt (16 kB)
        • B8_ref_DE.txt (5 kB)
        • B4_baseline_DE.txt (14 kB)
        • B3_source_IT.txt (10 kB)
        • B8_source_IT.txt (5 kB)
        • B8_baseline_DE.txt (5 kB)
        • B1_ref_DE.txt (6 kB)
        • B3_baseline_DE.txt (9 kB)
        • B5_source_IT.txt (6 kB)
        • B3_ref_DE.txt (9 kB)
        • B5_ref_DE.txt (6 kB)
        • B7_baseline_DE.txt (9 kB)
        • B2_source_IT.txt (9 kB)
        • B7_ref_DE.txt (8 kB)
        • B2_baseline_DE.txt (9 kB)
        • B10_trained_DE.txt (34 kB)
        • B7_source_IT.txt (10 kB)
        • B10_baseline_DE.txt (34 kB)
        • B9_ref_DE.txt (16 kB)
        • B6_baseline_DE.txt (28 kB)
        • B4_source_IT.txt (14 kB)
        • B9_trained_DE.txt (16 kB)
        • B1_baseline_DE.txt (6 kB)
        • B8_trained_DE.txt (5 kB)
        • B9_source_IT.txt (16 kB)
        • B7_trained_DE.txt (8 kB)
        • B6_trained_DE.txt (27 kB)
        • B2_ref_DE.txt (10 kB)
        • B1_source_IT.txt (6 kB)
        • B5_trained_DE.txt (6 kB)
        • B4_trained_DE.txt (14 kB)
        • B5_baseline_DE.txt (6 kB)
        • B4_ref_DE.txt (14 kB)
        • B3_trained_DE.txt (9 kB)
        • B6_source_IT.txt (29 kB)
      • B_global_IT-DE.xlsx (228 kB)
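
The text files in the listing above appear to follow a regular naming pattern: a series letter (B or C), a decree number, a role (source, ref, baseline, or trained), and a language code. The following is a minimal Python sketch, not part of the corpus, that parses such names and groups the files of one direction folder by decree. The pattern is inferred from the listing only; the helper names `parse_name` and `group_by_decree` are hypothetical, and the exact meaning of each role should be checked against the README.md shipped in the archive.

```python
import re
from collections import defaultdict

# Pattern inferred from the file listing, e.g. "C9_trained_IT.txt".
# Assumption: only these four roles and two language codes occur.
NAME_RE = re.compile(
    r"^(?P<series>[BC])(?P<number>\d+)"
    r"_(?P<role>source|ref|baseline|trained)"
    r"_(?P<lang>DE|IT)\.txt$"
)


def parse_name(filename):
    """Return the parts of a decree file name, or None if it does not match."""
    m = NAME_RE.match(filename)
    if m is None:
        return None
    return {
        "series": m["series"],
        "number": int(m["number"]),
        "role": m["role"],
        "lang": m["lang"],
    }


def group_by_decree(filenames):
    """Map a decree id such as "C9" to a {role: filename} dict.

    Intended for the files of a single direction folder (DE-IT or IT-DE);
    names that do not match the pattern are skipped.
    """
    groups = defaultdict(dict)
    for name in filenames:
        info = parse_name(name)
        if info is None:
            continue
        decree_id = f"{info['series']}{info['number']}"
        groups[decree_id][info["role"]] = name
    return dict(groups)
```

For example, `group_by_decree(["B1_source_DE.txt", "B1_ref_IT.txt"])` yields one entry for decree `B1` with its source and reference files. Files such as `C_global_DE-IT.xlsx` or the `.tsv` exports follow different patterns and are ignored by this sketch.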
Name: inception-v1.0.zip
Size: 22.39 MB
Format: application/zip
Description: inception-v1.0.zip
MD5: b3d0e0de8ad2e9ec7aa4ea1e20bf6670
File Preview
  • inception-v1.0
    • CHANGELOG.md (512 B)
    • LICENSE (18 kB)
    • README.md (1 kB)
    • mt_bz-v1_0.zip (23 MB)
Name: gold-v1.0.zip
Size: 9.26 MB
Format: application/zip
Description: gold-v1.0.zip
MD5: 0e6d446d61d86ff50209f1e98ce08e27
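
Each file in this record is listed with an MD5 checksum, which can be used to confirm that a download is complete and unaltered. The following is a minimal Python sketch of such a check using the standard `hashlib` module. The function names and the local paths passed to them are hypothetical; the expected values are the checksums given in this record.

```python
import hashlib

# MD5 checksums as listed for each file in this record.
EXPECTED_MD5 = {
    "README.html": "d579908b48b7ba7904bdf0947dd4b644",
    "CHANGELOG.html": "601def2df881559f213b9eef66404868",
    "decrees-v1.0.zip": "4a9a23e5677d85a1e6aa81f6d1f6fbc5",
    "inception-v1.0.zip": "b3d0e0de8ad2e9ec7aa4ea1e20bf6670",
    "gold-v1.0.zip": "0e6d446d61d86ff50209f1e98ce08e27",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, filename):
    """Return True if the file at `path` matches the checksum listed for `filename`."""
    return md5_of_file(path) == EXPECTED_MD5[filename]
```

A call such as `verify_download("downloads/gold-v1.0.zip", "gold-v1.0.zip")` (with a hypothetical local path) returns `True` only when the digests match. MD5 is suitable here as an integrity check against transfer errors, not as a security guarantee.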
