Resource: NLU Evaluation Corpora

Reference NLU Evaluation Corpora
Date of Submission March 28, 2018, 12:23 p.m.
Status accepted
ISLRN 165-571-578-116-6
Resource Type Primary Text
Media Type Text
Language English
Format/MIME Type text/json
Size 496 sentences

This project is a collection of three corpora that can be used to evaluate chatbots and other conversational interfaces. Two of the corpora were extracted from StackExchange and one from a Telegram chatbot.

If you use the data in a publication, please let us know and cite our SIGdial 2017 paper:
author = {Braun, Daniel and Hernandez-Mendez, Adrian and Matthes, Florian and Langen, Manfred},
title = {Evaluating Natural Language Understanding Services for Conversational Question Answering Systems},
booktitle = {Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue},
month = {August},
year = {2017},
address = {Saarbrücken, Germany},
publisher = {Association for Computational Linguistics},
pages = {174--185},
url = {}

All three corpora are released under the CC BY-SA 3.0 license.

Ask Ubuntu Corpus
190 questions and answers from
Five intents (MakeUpdate, SetupPrinter, ShutdownComputer, SoftwareRecommendation, None) and three entity types (Printer, Software, Version).

Web Applications Corpus
100 questions and answers from
Eight intents (ChangePassword, DeleteAccount, DownloadVideo, ExportData, FilterSpam, FindAlternative, SyncAccounts, None) and three entity types (WebService, OS, Browser).

Chatbot Corpus
206 questions from a Telegram chatbot for public transport in Munich.
Two intents (DepartureTime, FindConnection) and five entity types (StationStart, StationDest, Criterion, Vehicle, Line).
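
Because the corpora are distributed as JSON and annotated with intents and entity types, a first sanity check is often to tally how many sentences carry each label. The following is a minimal Python sketch of such a tally. The record layout is an assumption made for illustration: the field names "sentences", "text", "intent", "entities", and "entity" are hypothetical and should be checked against the actual files, and the sample sentences are made up rather than taken from the corpora.

```python
import json
from collections import Counter

# Made-up sample in an ASSUMED layout. The keys "sentences", "text",
# "intent", "entities", and "entity" are hypothetical and may differ
# from the real corpus files.
SAMPLE = """
{
  "sentences": [
    {"text": "How do I set up my HP printer?",
     "intent": "SetupPrinter",
     "entities": [{"entity": "Printer", "text": "HP printer"}]},
    {"text": "Install Canon printer on 16.04",
     "intent": "SetupPrinter",
     "entities": [{"entity": "Printer", "text": "Canon printer"},
                  {"entity": "Version", "text": "16.04"}]},
    {"text": "How can I shut down from the terminal?",
     "intent": "ShutdownComputer",
     "entities": []},
    {"text": "Why is the sky blue?",
     "intent": "None"}
  ]
}
"""


def count_labels(corpus):
    """Return (intent counts, entity type counts) for one corpus dict."""
    sentences = corpus["sentences"]
    intents = Counter(s["intent"] for s in sentences)
    # A sentence without an "entities" key is treated as having no entities.
    entities = Counter(
        e["entity"] for s in sentences for e in s.get("entities", [])
    )
    return intents, entities


corpus = json.loads(SAMPLE)
intent_counts, entity_counts = count_labels(corpus)

for intent, n in intent_counts.most_common():
    print(f"intent {intent}: {n}")
for entity, n in entity_counts.most_common():
    print(f"entity {entity}: {n}")
```

To run the same tally on a real corpus file, replace `json.loads(SAMPLE)` with `json.load` on the opened file and adjust the key names if they differ from the ones assumed here.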

*Contact Information*
If you have any questions, please contact:
Daniel Braun (Technical University of Munich)

Version 1.17
Creator Daniel Braun - Technical University of Munich
Distributor Daniel Braun - Technical University of Munich
Rights Holder Daniel Braun - Technical University of Munich