dc.contributor.author | Bienati, Arianna |
dc.contributor.author | Frey, Jennifer-Carmen |
dc.contributor.author | Zanasi, Lorenzo |
dc.contributor.author | Stemle, Egon |
dc.contributor.author | Brasolin, Paolo |
dc.contributor.author | Vettori, Chiara |
dc.date.accessioned | 2025-07-05T07:23:17Z |
dc.date.available | 2025-07-05T07:23:17Z |
dc.date.issued | 2024-02-28 |
dc.identifier.uri | http://hdl.handle.net/20.500.12124/76 |
dc.description | The ITACA Corpus is a corpus of argumentative essays written in Italian by upper secondary school students from South Tyrol. It was created to investigate and describe the students’ textual competences, with a special focus on text coherence. The ITACA corpus consists of 635 texts collected during the school year 2021/2022 in schools with Italian as the language of instruction. The whole corpus has been automatically tokenized, lemmatized, and annotated for part-of-speech and dependency relations. A subset of 388 texts additionally contains annotations of textual features, such as punctuation, connectives, agreement, anaphora, argumentative structure, off-topics and contradictions. The corpus also provides metadata on the students’ age, gender, language background, reading and writing habits, and performance in a standardized language test, as well as holistic and analytic coherence evaluations for each text. |
dc.language.iso | ita |
dc.publisher | Eurac Research |
dc.rights | CLARIN ACADEMIC END-USER LICENCE (ACA-BY-NC-NORED 1.0) |
dc.rights.uri | https://gitlab.inf.unibz.it/commul/var/eurac-licenses/-/raw/v1.0/EULA-CLARIN-ACA-BY-NC-NORED.md |
dc.rights.label | ACA |
dc.source.uri | https://www.porta.eurac.edu/lci/itaca/ |
dc.subject | cohesion |
dc.subject | coherence |
dc.subject | L1 |
dc.subject | student writing |
dc.subject | argumentative essays |
dc.subject | Italian |
dc.subject | South Tyrol |
dc.subject | upper secondary school |
dc.subject | 12th grade |
dc.title | ITACA Corpus - Coherence in Italian Argumentative Essays v1.0 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
hidden | false |
hasMetadata | false |
has.files | yes |
branding | Learner Language |
demo.uri | https://commul.eurac.edu/annis/itaca |
contact.person | Jennifer-Carmen Frey, porta@eurac.edu, Institute for Applied Linguistics, Eurac Research |
contact.person | Arianna Bienati, porta@eurac.edu, Institute for Applied Linguistics, Eurac Research |
sponsor | Autonomous province of Südtirol/Alto Adige x ITACA - Coerenza nell’ITAliano Accademico nationalFunds |
size.info | 635 texts |
files.size | 49208791 |
files.count | 7 |
Files in this item

Name | Size | Format | Description | MD5 |
README.md | 3.94 KB | Unknown | README | a971e692ebda49733b18653876155188 |
CHANGELOG.md | 385 bytes | Unknown | CHANGELOG | cb6bdbfdce5f7f411efc44842dd355c5 |
docs-v1.0.zip | 1.3 MB | application/zip | documentation | 0c6b6a718f38c1f4657b75690232c5b6 |
metadata-v1.0.zip | 125.85 KB | application/zip | corpus metadata | 0c8440382ec5ec211cee46bfc7119974 |
txt-v1.0.zip | 1.12 MB | application/zip | Unknown | 480c98b4e3f0e0c67f112a519990cfe3 |
inception-v1.0.zip | 9.42 MB | application/zip | corpus data in webannoTSV format for INCEpTION annotation tool | 90d237f79d82c359c46ac91da0d4333c |
annis-v1.0.zip | 34.97 MB | application/zip | corpus data in ANNIS format for upload in search interface | e6d4b8325cf0a127663df6807ba78635 |
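
The MD5 values listed for each file can be used to check that a download arrived intact. The following is a minimal sketch in Python, not part of the record itself: the function names `md5_of` and `verify_downloads` are hypothetical, and it assumes the seven files have been saved under their listed names in a single local directory.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "README.md": "a971e692ebda49733b18653876155188",
    "CHANGELOG.md": "cb6bdbfdce5f7f411efc44842dd355c5",
    "docs-v1.0.zip": "0c6b6a718f38c1f4657b75690232c5b6",
    "metadata-v1.0.zip": "0c8440382ec5ec211cee46bfc7119974",
    "txt-v1.0.zip": "480c98b4e3f0e0c67f112a519990cfe3",
    "inception-v1.0.zip": "90d237f79d82c359c46ac91da0d4333c",
    "annis-v1.0.zip": "e6d4b8325cf0a127663df6807ba78635",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory):
    """Map each listed file name to 'ok', 'mismatch' or 'missing'.

    `directory` is a hypothetical local folder holding the downloaded files.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Calling `verify_downloads("itaca-downloads")` (the directory name is only an example) returns one status per file; any entry other than `ok` suggests the file should be downloaded again.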