dc.contributor.author Dal Negro, Silvia
dc.contributor.author Ciccolone, Simone
dc.contributor.editor Ducceschi, Luca
dc.contributor.editor Franzini, Greta
dc.date.accessioned 2024-11-12T17:10:55Z
dc.date.available 2024-11-12T17:10:55Z
dc.date.issued 2024-06-10
dc.identifier.uri http://hdl.handle.net/20.500.12124/78
dc.description Kontatto is a corpus of transcribed and annotated spoken data collected by Silvia Dal Negro at the Free University of Bozen/Bolzano. It consists of almost 150,000 orthographic words in 55 recordings involving 97 different speakers, for a total of 18 hours of speech. The corpus is multilingual and contains a variety of spontaneously occurring code-mixing patterns. However, the languages are not evenly distributed: 80.4% of the words are Tyrolean, 11.5% are Italian, 2.6% were classified as Trentino, 0.8% belong to other languages (e.g. Ladin, English), and the remaining 4.7% cannot be confidently attributed to any particular language (e.g. proper names, widespread loanwords, some interjections). This repository contains the Kontatto-MT corpus subset. The data was collected using a collaborative Map Task, during which two speakers and an interviewer interacted to navigate a physical map in order to reach a given destination. This subcorpus documents a variety of languages and dialects in the Dolomite region, including some Tyrolean and Trentino dialects, Italian, Cimbrian and Ladin, usually combined within the same dialogue. At present it consists of 35,453 tokens, 73% of which are classified as local German dialect. Kontatto was created within the scope of two projects financed by the Autonomous Province of Bozen-Bolzano: “Italiano-tedesco: aree storiche di contatto in Sudtirolo e Trentino” (2011-2014) and “Germanico-Romanzo: discorsi e strutture in contatto nell’area dolomitica” (2016-2019). Over the years, many research assistants and students have contributed to the annotation of the data: Katrin Tartarotti, Mara Leonardi, Marta Ghilardi, Nicole Giaier, Adriana Rasa, Lucia Rossaro, Luigi Parisi and Jay Hevelone. The CLARIN deposit was prepared by Greta Franzini and Luca Ducceschi of Eurac Research.
dc.language.iso deu
dc.language.iso ita
dc.language.iso bar
dc.publisher Free University of Bolzano
dc.relation.isbasedon https://gitlab.inf.unibz.it/commul/kontatto/data/-/tags/v1.0
dc.relation.isreferencedby https://doi.org/10.1515/soci-2020-0014
dc.rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-nc-sa/4.0/
dc.rights.label PUB
dc.source.uri https://kontatti.projects.unibz.it/category/corpus/
dc.subject German dialects
dc.subject language contact
dc.subject South Tyrolean
dc.subject multilingualism
dc.title KONTATTO v1.0
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType audio
has.files yes
branding Various
contact.person Silvia Dal Negro silvia.dalnegro@unibz.it University of Bolzano
contact.person Corpus Manager clarin@eurac.edu Eurac Research CLARIN Centre (ERCC)
sponsor Free University of Bolzano I71J10000370003 Italiano-tedesco: aree storiche di contatto in Sudtirolo e Trentino ownFunds
size.info 1.50 gb
files.size 849214637
files.count 2


 Files in this item

Name README.html
Size 5.53 KB
Format HTML
Description README.html
MD5 f2a79b12725c61add167755490352c91

Name data-v1.0.zip
Size 809.87 MB
Format application/zip
Description data-v1.0
MD5 282c837ffd1436941683ed3712c9fae3
File Preview
  • data-v1.0
    • README.md (4 kB)
    • Kontatto-MT_metadata.csv (2 kB)
    • Map_Task_1B.pdf (431 kB)
    • Map_Task_1A.pdf (457 kB)
    • Map_Task_2.pdf (309 kB)
    • .gitattributes (43 B)
    • eaf
      • Kontatto_MT_IT_01.eaf (1 MB)
      • Kontatto_MT_TYR_11.eaf (2 MB)
      • Kontatto_MT_TYR_03.eaf (1 MB)
      • Kontatto_MT_TYR_08.eaf (798 kB)
      • Kontatto_MT_TYR_13.eaf (261 kB)
      • Kontatto_MT_TR_01.eaf (1 MB)
      • Kontatto_MT_TYR_05.eaf (1 MB)
      • Kontatto_MT_TYR_10.eaf (2 MB)
      • Kontatto_MT_TYR_15.eaf (3 MB)
      • Kontatto_MT_TYR_02.eaf (1 MB)
      • Kontatto_MT_TR_03.eaf (1 MB)
      • Kontatto_MT_TYR_07.eaf (3 MB)
      • Kontatto_MT_IT_02.eaf (2 MB)
      • Kontatto_MT_TYR_12.eaf (542 kB)
      • Kontatto_MT_TYR_04.eaf (870 kB)
      • Kontatto_MT_TYR_09.eaf (1 MB)
      • Kontatto_MT_TYR_14.eaf (3 MB)
      • Kontatto_MT_TYR_01.eaf (941 kB)
      • Kontatto_MT_TR_02.eaf (1 MB)
      • Kontatto_MT_TYR_06.eaf (2 MB)
    • flac
      • Kontatto_MT_TYR_05.flac (22 MB)
      • Kontatto_MT_IT_01.flac (33 MB)
      • Kontatto_MT_TR_01.flac (27 MB)
      • Kontatto_MT_TYR_06.flac (44 MB)
      • Kontatto_MT_IT_02.flac (69 MB)
      • Kontatto_MT_TR_02.flac (43 MB)
      • Kontatto_MT_TYR_07.flac (63 MB)
      • Kontatto_MT_TR_03.flac (21 MB)
      • Kontatto_MT_TYR_10.flac (80 MB)
      • Kontatto_MT_TYR_08.flac (25 MB)
      • Kontatto_MT_TYR_15.flac (97 MB)
      • Kontatto_MT_TYR_11.flac (56 MB)
      • Kontatto_MT_TYR_01.flac (46 MB)
      • Kontatto_MT_TYR_09.flac (40 MB)
      • Kontatto_MT_TYR_12.flac (12 MB)
      • Kontatto_MT_TYR_02.flac (12 MB)
      • Kontatto_MT_TYR_13.flac (6 MB)
      • Kontatto_MT_TYR_03.flac (24 MB)
      • Kontatto_MT_TYR_14.flac (74 MB)
      • Kontatto_MT_TYR_04.flac (10 MB)
    • Kontatto_MT_tagsets.csv (1 kB)
    • LICENSE (20 kB)
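
The MD5 values listed for README.html and data-v1.0.zip can be used to check that a download is complete and uncorrupted. The following is a minimal Python sketch, not part of the deposit: the function names `md5_of_file` and `verify_downloads` are hypothetical, and it assumes both files were saved under their listed names in one local folder.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from this item record.
EXPECTED_MD5 = {
    "README.html": "f2a79b12725c61add167755490352c91",
    "data-v1.0.zip": "282c837ffd1436941683ed3712c9fae3",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(folder, expected=EXPECTED_MD5):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, checksum in expected.items():
        path = Path(folder) / name
        results[name] = path.is_file() and md5_of_file(path) == checksum
    return results


if __name__ == "__main__":
    # Assumption: the downloaded files sit in the current working directory.
    for name, ok in verify_downloads(".").items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

Reading in chunks matters here because the archive is about 810 MB; loading it in one call would hold the whole file in memory.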
