Full Official Name: The Serbian Cross-Level Semantic Similarity News Corpus
Submission date: April 28, 2022, 12:13 a.m.

The Serbian CLSS News Corpus consists of 1000 phrase-sentence pairs and 1000 sentence-paragraph pairs in Serbian, gathered from news sources on the web. Each pair was manually annotated with a fine-grained semantic similarity score on a 0-4 scale, and the final score for each pair is the average of the individual scores given by five annotators. A more detailed description of the corpus is available on its webpage, as well as in the reference paper: Vuk Batanović and Maja Miličević Petrović, "Cross-Level Semantic Similarity for Serbian Newswire Texts", in Proceedings of the 13th International Conference on Language Resources and Evaluation (LREC 2022), Marseille, France (2022).
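
As an illustration of the averaging step described above, the following minimal Python sketch computes a final score from five individual annotator scores. The function name `final_score` and the sample scores are hypothetical and are not taken from the corpus or its distribution files; the only facts assumed from the description are the 0-4 scale and the five annotators per pair.

```python
def final_score(annotator_scores):
    """Average five individual similarity scores given on the 0-4 scale.

    Hypothetical helper for illustration; not part of the corpus tooling.
    """
    if len(annotator_scores) != 5:
        raise ValueError("expected exactly five annotator scores")
    if any(not 0 <= s <= 4 for s in annotator_scores):
        raise ValueError("scores must lie on the 0-4 scale")
    # The final score is the arithmetic mean of the individual scores.
    return sum(annotator_scores) / len(annotator_scores)


# Made-up scores for one pair, used only to show the arithmetic.
example_scores = [3, 4, 3, 2, 4]
print(final_score(example_scores))  # 16 / 5 = 3.2
```

Because the final score is a mean of five values, it can fall between the points of the 0-4 scale even when the individual scores do not.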

Right Holder(s)