Khanty Corpus (North Khanty, Corpora and Translations) (UHLCS)

Full Official Name: Khanty Corpus (North Khanty, Corpora and Translations) (UHLCS)
Submission date: Sept. 1, 2014, 4 p.m.

The Khanty computer corpus contains the following sub-corpora:

Khanty, Atlym dialect, 519 words, 3967 characters
Khanty, Kazym dialect, 62766 words, 585659 characters
Khanty, Konda dialect, 1115 words, 10234 characters
Khanty, Nizjam dialect, 17681 words, 259732 characters
Khanty, Obdorsk dialect, 10939 words, 200358 characters
Khanty, Synja dialect, 10939 words, 200358 characters

The corpora of the Khanty dialects are samples taken from the following text collections:

Rédei, Károly (1968). Nord-ostjakische Texte (Kazym-Dialekt) mit Skizze der Grammatik. Gesammelt und herausgegeben von Károly Rédei. Abhandlungen der Akademie der Wissenschaften in Göttingen, philologisch-historische Klasse, dritte Folge 71. Göttingen.

Steinitz, Wolfgang (1989). Ostjakologische Arbeiten III. Texte aus dem Nachlass. Eds.: Hartung, Liselotte, Hauel, Petra, Sauer, Gert & Schulze, Birgitte. Janua Linguarum, Series Practica 256. Mouton de Gruyter, Berlin.

Vértes, Edith (1980). H. Paasonens südostjakische Textsammlungen. Suomalais-Ugrilaisen Seuran Toimituksia 175. Suomalais-Ugrilainen Seura, Helsinki.

The corpora are running texts, and several of them are morphologically analyzed. Morphologically encoded texts are in word-per-line format, and plain texts are in sentence-per-line format. In some texts, the clauses and sentences are also marked with information about their location in the text.

Khanty, Textbook:

Rugin, R.P. (1990). Shum jôxan sjun'öng xâtLöt. (Shchastlivye den'ki na Shum-jugane.) [Onnellisia päiviä Shum-joella.] Kniga dlja dopol'nitel'nogo chtenija v 3-4 klassax xantyjskix shkol (shuryshkarskij dialekt). Prosveshchenie, Leningrad.

The text is available in six versions: (1) the original text in the Cyrillic alphabet; (2) the same text converted to the Latin alphabet; translations of the text into (3) Finnish, (4) English and (5) Russian; and (6) the original text in the Latin alphabet, morphologically coded and translated into English.

Children's books:

Life of Jesus in Khanty (the Kazim dialect). (Trial edition). Translation: Nyomysova, Yevdokiya Andreyevna & Lozyamova, Zoya Nikiforovna. ISBN 952-9790-25-2, ISBN 91-88394-97-2. 63 pp. Institute for Bible Translation. Stockholm & Helsinki 1995.

Life of Jesus in Khanty (the Kazim dialect). (Second edition). Translation: Nyomysova, Yevdokiya Andreyevna & Lozyamova, Zoya Nikiforovna. ISBN 952-9790-40-6, ISBN 91-88794-83-0. 63 pp. Institute for Bible Translation. Stockholm & Helsinki 1997.

The computer corpora of the Khanty dialects and the textbook were compiled and edited by Merja Salo with the financial support of the Academy of Finland. The adaptation of the texts for public use was done with the financial support of the Department of General Linguistics, University of Helsinki. The children's books were donated to the University of Helsinki by the Institute for Bible Translation, Helsinki and Stockholm.

The Khanty Corpus is a part of the UHLCS corpus collection. UHLCS has many different IPR holders. Should you have any questions regarding the collection, please contact Pirkko Suihkonen (suihkonen.pirkko@gmail.com).

License details: http://www.csc.fi/english/research/software/a-lic
Detailed information: http://www.ling.helsinki.fi/uhlcs/
Instructions for applying for access: https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/KielipankkiAccessRights

The purpose of the resource use must be outlined in a research plan.
