|Date of Submission||July 18, 2019, 4:26 p.m.|
|Resource Type||Annotated corpus|
|Size||276 MB (approx. 210,000 tokens)|
The Litkey Corpus is a richly annotated longitudinal corpus of written texts produced by primary school children in Germany from grades 2 to 4. It has been transcribed and annotated at several linguistic levels, including POS tags, features of word-internal structure (phonemes, syllables, morphemes), key orthographic features of the target words, and a categorization of spelling errors. Comprehensive evaluations show that high accuracy was achieved at all levels, making the Litkey Corpus a useful resource for corpus-based research on literacy acquisition among German primary school children and for developing NLP tools for educational purposes. The corpus is freely available at https://www.linguistics.rub.de/litkeycorpus/.
|Creator||Stefanie Dipper - Ruhr-Universität Bochum, Eva Belke - Ruhr-Universität Bochum, Ronja Laarmann-Quante - Ruhr-Universität Bochum|
|Distributor||Stefanie Dipper - Ruhr-Universität Bochum|
|Rights Holder||Stefanie Dipper - Ruhr-Universität Bochum, Eva Belke - Ruhr-Universität Bochum, Ronja Laarmann-Quante - Ruhr-Universität Bochum|