ClusterMWE

Full Official Name: Goldstandard Dataset of Russian Multi-word Expressions
Submission date: Oct. 21, 2023, 1:14 p.m.

The dataset contains 285 Russian multi-word expressions (MWEs) annotated against 15 lexical, grammatical and other criteria of idiomaticity drawn from theoretical books and papers on the concept. The MWEs were collected from the same theoretical sources as the criteria, and a group of experts in linguistics annotated each expression with these criteria. The dataset can be used to build a data-driven classification of MWEs and to facilitate their automatic extraction via feature engineering.
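As a rough illustration of the feature-engineering use mentioned above, the sketch below treats each MWE as a vector of criterion annotations and groups expressions that share the same profile. It rests on several assumptions not stated in this record: that each criterion is recorded as a binary value, and that the data can be loaded as a mapping from expression to criterion values. The criterion names and the three rows are hypothetical placeholders, not entries from the dataset, and only three of the 15 criteria are mocked up.

```python
from collections import defaultdict

# Hypothetical criterion names; the dataset's actual 15 criteria are not listed in this record.
CRITERIA = ["fixed_word_order", "non_compositional_meaning", "restricted_inflection"]

# Hypothetical placeholder annotations (1 = criterion holds, 0 = it does not).
# Binary coding is an assumption; the real annotation scheme may differ.
annotations = {
    "mwe_a": {"fixed_word_order": 1, "non_compositional_meaning": 1, "restricted_inflection": 0},
    "mwe_b": {"fixed_word_order": 1, "non_compositional_meaning": 1, "restricted_inflection": 0},
    "mwe_c": {"fixed_word_order": 0, "non_compositional_meaning": 0, "restricted_inflection": 1},
}


def to_vector(row):
    """Turn one annotation row into a feature tuple ordered by CRITERIA."""
    return tuple(row[name] for name in CRITERIA)


def hamming(u, v):
    """Count the criteria on which two feature vectors disagree."""
    return sum(a != b for a, b in zip(u, v))


def group_by_profile(rows):
    """Group expressions whose criterion vectors are identical."""
    groups = defaultdict(list)
    for mwe, row in rows.items():
        groups[to_vector(row)].append(mwe)
    return dict(groups)


profiles = group_by_profile(annotations)
distance_a_c = hamming(to_vector(annotations["mwe_a"]), to_vector(annotations["mwe_c"]))
```

Grouping by identical profile is only the simplest possible classification; with the full set of criteria, the same vectors could instead be passed to a clustering or classification library of the reader's choice.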

Creator(s)
Distributor(s)
Right Holder(s)