ISLRN

MASRI Synthetic

Full Official Name: MASRI Synthetic

Submission date: Sept. 12, 2022, 11:41 p.m.

<h3>Introduction</h3> <p>MASRI (Maltese Automatic Speech Recognition I) Synthetic was developed by the <a href="https://www.um.edu.mt/projects/masri/">MASRI team</a> at the <a href="https://www.um.edu.mt/">University of Malta</a> and consists of approximately 99 hours of synthesized Maltese speech.</p>
<h3>Data</h3> <p>Source sentences were extracted from the <a href="https://mlrs.research.um.edu.mt/index.php?page=corpora">Maltese Language Resource Server</a> (MLRS) corpus, which comprises written or transcribed Maltese covering various genres, including parliamentary debates, news, law, opinion, sports, culture, academic writing, literature and religious texts. The text was processed through the CrimsonWing text-to-speech system to generate speech files. Synthesized speech was created with 210 voices (105 male and 105 female).</p> <p>Audio files are presented as 16 kHz, 16-bit, single-channel FLAC files. When decompressed, they yield PCM WAV files.</p> <p>Transcripts are contained in a single plain text file encoded as UTF-8.</p>
<h3>Samples</h3> <p>Please view the following samples:</p> <ul> <li><a href="desc/addenda/LDC2022S08.f.flac">Female Audio (FLAC)</a></li> <li><a href="desc/addenda/LDC2022S08.f.txt">Female Transcript (TXT)</a></li> <li><a href="desc/addenda/LDC2022S08.m.flac">Male Audio (FLAC)</a></li> <li><a href="desc/addenda/LDC2022S08.m.txt">Male Transcript (TXT)</a></li> </ul>
<h3>Updates</h3> <p>None at this time.</p>
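The audio format stated in the Data section (16 kHz, 16-bit, single channel) can be checked after decompression. The following is a minimal sketch, not part of the corpus distribution: it assumes a FLAC file has already been decoded to a PCM WAV file with an external tool (for example the reference decoder, `flac -d`), since Python's standard library does not read FLAC directly. The function name `check_wav_format` and the constant names are hypothetical.

```python
import wave

# Expected properties, taken from the corpus description (assumed to hold
# for every decoded file; this sketch only reports deviations).
EXPECTED_RATE_HZ = 16000   # 16 kHz sampling rate
EXPECTED_SAMPLE_WIDTH = 2  # 16-bit samples = 2 bytes per sample
EXPECTED_CHANNELS = 1      # single channel (mono)


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the documented format.

    An empty list means the file matches all three documented properties.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width {wav.getsampwidth() * 8} bits")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"{wav.getnchannels()} channels")
    return problems
```

Calling `check_wav_format("sample.wav")` on a decoded file (the filename here is a placeholder) returns an empty list when the file matches the documented format, and otherwise a short description of each property that differs.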

Creator(s)

Carlos Daniel Hernández Mena

Albert Gatt

Claudia Borg

Andrea DeMarco

Lonneke van der Plas

Distributor(s)

Linguistic Data Consortium

Right Holder(s)

Status: Accepted

ISLRN:

518-019-551-096-3

Version

1.0

Source

https://catalog.ldc.upenn.edu/LDC2022S08

Resource Type

Primary Text

Media Type

Sound

Text

Language(s)

Maltese

Access Medium

Web Download