Resource: TRAD Pashto Broadcast News Speech Corpus

Reference TRAD Pashto Broadcast News Speech Corpus
Date of Submission April 6, 2016, 4:51 p.m.
Status accepted
ISLRN 918-508-885-913-7
Resource Type Other
Media Type Audio
Language Pushto
Size 14.8 GB

This corpus contains transcribed broadcast news recordings in Pashto. The recordings were collected from five sources: Ashna TV, Azadi Radio, Deewa Radio, Mashaal Radio and Shamshad TV.

The corpus contains 108 hours of recordings covering more than 1,000 speakers. Transcriptions are provided together with the audio files and comprise about 46,000 segments and 1.1 million words.

Pashto is an Indo-Iranian language spoken by the Pashtun people, mainly in Pakistan and Afghanistan.

This corpus was produced by ELDA within the PEA TRAD project supported by the French Ministry of Defence (DGA).

Version 1.0
Distributor ELRA