dc.contributor.author | Abel, Andrea |
dc.contributor.author | Zanasi, Lorenzo |
dc.contributor.author | Nicolas, Lionel |
dc.contributor.author | Konecny, Christine |
dc.contributor.author | Autelli, Erica |
dc.date.accessioned | 2023-02-22T09:50:42Z |
dc.date.available | 2023-02-22T09:50:42Z |
dc.date.issued | 2021 |
dc.identifier.uri | http://hdl.handle.net/20.500.12124/33 |
dc.description | The LEKO corpora LEKO_Kolipsi and LEKO_Merlin provide lexical annotations for phraseological elements in Italian L2 writing, based on subsets of the texts of the Kolipsi-1 corpus and the Merlin corpus respectively. The annotations were jointly created by the University of Innsbruck (Austria) and Eurac Research Bolzano (Italy) within the project LEKO, whose aim was to describe the use of phrasemes in these texts. There are manual annotations for phraseme category, lexical errors, morpho-syntactic features and error explanations. LEKO_Kolipsi contains about 55 000 tokens in 282 texts from 141 pupils in the final year of upper secondary school, representing two different text types (email and letter, narrative and argumentative genre) as described in the Kolipsi-1 documentation. LEKO_Merlin contains about 9 000 tokens in 50 texts from 50 examinees who took part in an official language test (TELC) for Italian. The documents have been transcribed according to the Kolipsi-1 and Merlin transcription guidelines. Annotation guidelines for the lexical annotations are included in docs-v1.0.zip. Note: The LEKO corpora do not contain manual annotations for non-lexical errors, foreign word insertions, target language transcriptions, ambiguous writings or other annotations available in the base corpora Kolipsi-1 and Merlin. To retrieve any of those annotations and/or full target versions of the student writings, please consult the base corpora directly. |
dc.language.iso | ita |
dc.publisher | Institute for Applied Linguistics, Eurac Research |
dc.relation.isreferencedby | http://hdl.handle.net/10863/7683 |
dc.rights | CLARIN ACADEMIC END-USER LICENCE (ACA-BY-NC-NORED 1.0) |
dc.rights.uri | https://gitlab.inf.unibz.it/commul/var/eurac-licenses/-/raw/v1.0/EULA-CLARIN-ACA-BY-NC-NORED.md |
dc.rights.label | ACA |
dc.subject | Phraseology |
dc.subject | Phrasemes |
dc.subject | Lexical combinations |
dc.subject | learner language |
dc.subject | student writing |
dc.subject | non-standard language |
dc.title | LEKO v1.0 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
hidden | false |
hasMetadata | false |
has.files | yes |
branding | Eurac Research |
demo.uri | https://commul.eurac.edu/annis/leko |
contact.person | Aivars Glaznieks porta@eurac.edu Eurac Research |
contact.person | Jennifer-Carmen Frey porta@eurac.edu Eurac Research |
sponsor | Autonomous Province of Bozen/Bolzano 02/40.3 LeKo - Lexemkombinationen und typisierte Rede im mehrsprachigen Kontext. Authentische Sprachdaten für die Erarbeitung didaktischer Materialien zur italienischen Wortkombinatorik für deutschsprachige L2-Lerner nationalFunds |
size.info | 64 000 tokens |
size.info | 332 texts |
files.size | 8685842 |
files.count | 7 |
Files in this item
All files in this item: 8.28 MB
This item is labelled Academic Use and licensed under: CLARIN ACADEMIC END-USER LICENCE (ACA-BY-NC-NORED 1.0)
Name | Size | Format | Description | MD5 |
README.html | 6.68 KB | HTML | Readme | 8077b48d13a3c529cc123dfb07f065b7 |
CHANGELOG.html | 1.89 KB | HTML | Information on changes for different versions of the corpus | 6919db9dfc38c3695bdea6d13714a66b |
docs-v1.0.zip | 335.34 KB | application/zip | annotation guidelines for the lexical annotations specific to the LEKO subsets of the Kolipsi-1 and Merlin corpora | ee2fac510776d4bf322210bfe3b0e88c |
txt-v1.0.zip | 213.8 KB | application/zip | contains the transcriptions of the student productions in the corpus in unannotated plain text format | 42f342f5fb230f33779f686d9b6aaed9 |
metadata-v1.0.zip | 90.67 KB | application/zip | metadata for corpora, tasks, authors and texts as csv and xlsx | 349f559a32031686b14a5c8b54167d51 |
annis-v1.0.zip | 3.76 MB | application/zip | corpus for input in corpus query software ANNIS | 7a5b1802792e8a97c666968bf57eb339 |
mmax2-v1.0.zip | 3.89 MB | application/zip | corpus with stand-off annotations produced in MMAX2 | 505010f385ba2f5e7fe20a52d8e40960 |
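
The MD5 values listed for each file can be used to check that a download arrived intact. The following is a minimal sketch of such a check, not part of the LEKO distribution: the helper names `md5_of` and `verify` are hypothetical, and it assumes the seven files were saved under their listed names in a single local directory.

```python
import hashlib
from pathlib import Path

# MD5 values copied from the file listing of this item.
EXPECTED_MD5 = {
    "README.html": "8077b48d13a3c529cc123dfb07f065b7",
    "CHANGELOG.html": "6919db9dfc38c3695bdea6d13714a66b",
    "docs-v1.0.zip": "ee2fac510776d4bf322210bfe3b0e88c",
    "txt-v1.0.zip": "42f342f5fb230f33779f686d9b6aaed9",
    "metadata-v1.0.zip": "349f559a32031686b14a5c8b54167d51",
    "annis-v1.0.zip": "7a5b1802792e8a97c666968bf57eb339",
    "mmax2-v1.0.zip": "505010f385ba2f5e7fe20a52d8e40960",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected=EXPECTED_MD5):
    """Map each expected file name to 'ok', 'mismatch' or 'missing'."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == checksum:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Calling `verify("path/to/downloads")` returns one status per listed file. A `mismatch` usually points to an incomplete or corrupted download, so fetching that file again is the first thing to try.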