The Sad Boyz Corpus

Full Official Name: The Sad Boyz Corpus - an Irony-Annotated Corpus of Naturalistic Speech Data
Submission date: June 25, 2022, 12:21 a.m.

The Sad Boyz Corpus contains 5,812 utterances of naturalistic speech data in WAV format, captured from the Sad Boyz comedy podcast. Each sample is marked with an episode code, a speaker code, a numerical index indicating where the utterance occurs relative to other utterances from the same episode, and an irony label. The corpus contains data from 12 different speakers (7 female, 5 male). An additional test corpus of 1,356 samples from only two speakers is also provided, annotated in the same way as the main corpus.

Right Holder(s)