PAROLE Irish Distributable Corpus

Full Official Name: PAROLE Irish Distributable Corpus
Submission date: Jan. 24, 2014, 4:30 p.m.

The PAROLE Irish Distributable Corpus consists of over 8 million words, a subset of the 15+ million word Irish Reference corpus. The text is marked up in accordance with the PAROLE encoding standard, which incorporates the Corpus Encoding Standard (CES) and the Text Encoding Initiative (TEI) Guidelines. All files are in SGML format, with a detailed header and the body of the text tagged to paragraph level. The header includes information such as title, author(s), number of words, ownership and publication details, as well as standard codes for the Medium, Topic and Genre categories. A subset of the Distributable Corpus is morpho-syntactically tagged. This distribution includes approximately 3,000 manually checked words.

Right Holder(s)