CMRC 2019 Dataset

Full Official Name: A Chinese Reading Comprehension Dataset for the 3rd Chinese Machine Reading Comprehension Evaluation (CMRC 2019)
Submission date: July 7, 2020, 3:24 p.m.

We propose a new task called Sentence Cloze-style Machine Reading Comprehension (SC-MRC). The task is to fill each blank in a passage with the correct candidate sentence. To make the task harder, we also add fake candidates that resemble the correct ones, so the machine must judge each candidate's correctness in context. The dataset contains over 100K blanks (questions) in over 10K passages drawn from Chinese narrative stories. To evaluate the dataset, we implement several baseline systems based on pre-trained models; the results show that the state-of-the-art model still underperforms humans by a large margin.
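
To make the task format concrete, the following is a minimal Python sketch of a single SC-MRC item and a per-blank accuracy check. The field names (context, choices, answers), the [BLANKn] marker syntax, the toy English passage, and the helper functions are all illustrative assumptions, not the official data schema or evaluation script.

```python
import re

# Hypothetical toy instance. Field names and the [BLANKn] marker are
# assumptions made for illustration, not the official release format.
example = {
    "context": (
        "The fox saw the grapes. [BLANK1] "
        "He jumped again and again. [BLANK2]"
    ),
    "choices": [
        "At last he gave up and walked away.",  # index 0: fits BLANK2
        "They hung high on the vine.",          # index 1: fits BLANK1
        "They lay low on the ground.",          # index 2: fake candidate
    ],
    # answers[i] is the index of the correct choice for blank i + 1.
    "answers": [1, 0],
}


def fill_blanks(context, choices, picks):
    """Replace each [BLANKn] marker with the choice selected for blank n."""
    def substitute(match):
        blank_number = int(match.group(1))
        return choices[picks[blank_number - 1]]

    return re.sub(r"\[BLANK(\d+)\]", substitute, context)


def blank_accuracy(predicted, gold):
    """Fraction of blanks whose predicted choice index matches the gold one."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# A system fooled by the fake candidate on the first blank.
prediction = [2, 0]

filled = fill_blanks(example["context"], example["choices"], example["answers"])
accuracy = blank_accuracy(prediction, example["answers"])
```

In this toy case the fake candidate differs from the correct sentence by only a few words, so a system has to read the surrounding passage (the fox jumping) to reject it.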

Creator(s)
Distributor(s)
Right Holder(s)