SmartWeb Motorbike Corpus (SMC)

Full Official Name: SmartWeb Motorbike Corpus (SMC)
Submission date: Jan. 24, 2014, 4:31 p.m.

The SmartWeb UMTS data collection was created within the publicly funded German SmartWeb project between 2004 and 2006. It comprises user queries to a naturally spoken Web interface, with the main focus on the soccer world series in 2006. The collection consists of three corpora: field recordings using a hand-held UMTS device (one speaker; SmartWeb Handheld Corpus SHC, ref. ELRA-S0278), field recordings with video capture of the primary speaker and a secondary speaker (SmartWeb Video Corpus SVC, ref. ELRA-S0279), and mobile recordings performed on a BMW motorbike (one speaker; SmartWeb Motorbike Corpus SMC, ref. ELRA-S0280).

This corpus is the SmartWeb Motorbike Corpus (SMC). It contains recordings of 36 speakers in a human-machine query situation on a running BMW motorcycle. Bikers were asked to solve several tasks with a spoken query system to the WWW, using an integrated system connected to a speech server via a UMTS connection. The recorded channels are the Bluetooth helmet microphone over UMTS (telephone quality) and, for part of the recordings, the Bluetooth helmet microphone and an additional neck microphone in high quality.

The corpus contains:
- Total number of recorded queries: 2,315
- Total duration of segmented speech: 377 minutes
- Formats: WAV 44.1 kHz, 16 bit; ALAW 8 kHz, 8 bit; Verbmobil transliteration; BAS Partitur Format (BPF)
- Segmentation: automatic segmentation into queries by the recording server
- Distribution: 3 DVD-R

See also ELRA-S0278 and ELRA-S0279.

Creator(s)
Distributor(s)
Right Holder(s)