ISLRN

Comprehensive Arabic Phonetic Database

Full Official Name: Comprehensive Arabic Phonetic Database

Submission date: March 21, 2025, 12:05 p.m.

The Comprehensive Arabic Phonetic Database is a detailed linguistic resource offering both phonemic and phonetic transcriptions that reflect how Modern Standard Arabic words are realized in actual speech. It is intended for speech technology applications.

The database covers over 329,000 entries, as follows:

1. Over 61,000 general vocabulary entries, including feminine and plural forms
2. Over 101,000 Arab personal names (given names and surnames)
3. Over 143,000 foreign personal names in Arabic
4. Over 21,000 worldwide place names, both Arab and non-Arab

Total entries: 329,012

Each entry consists of canonical forms, both vocalized and unvocalized (as in natural language), accompanied by phonetic transcriptions in IPA and X-SAMPA and by a phonemic transcription in the user-friendly CARS system. Additional features include explicit indication of vowel neutralization, accurate word stress, gender and number codes (singular or plural), and POS (part-of-speech) codes.

Close attention has been paid to the accuracy of the IPA transcription, particularly in representing vowel velarization and centralization. For instance, the word إِهْرَاقٌ is transcribed as [ʔih.ˈrˤɑˤː.qʊn], capturing both velarized and centralized vowels, while its phonemic transcription appears as /ʾihrā́qun/. Notably, even the pharyngealization of /r/ is explicitly shown as [rˤ].

Quantity and size: 329,012 lines / 30.8 MB
File format: flat TSV text file
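As a rough illustration of how a flat TSV lexicon of this kind might be consumed, the Python sketch below reads tab-separated rows and derives an unvocalized form by dropping combining marks from the vocalized form. This record does not document the actual column order or any header row, so the three-column layout in COLUMNS, the function names, and the in-memory sample row (built only from the example word quoted above) are assumptions made for illustration, not the distributed file's real schema.

```python
import csv
import io
import unicodedata

# ASSUMPTION: hypothetical column order. The real file's layout is not
# documented in this record and almost certainly has more columns.
COLUMNS = ["vocalized", "ipa", "cars"]


def strip_diacritics(text):
    """Drop combining marks (Unicode category Mn), such as Arabic vowel signs."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


def read_entries(handle):
    """Yield one dict per non-empty TSV row, keyed by the assumed column names."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:
            yield dict(zip(COLUMNS, row))


# One illustrative row built from the example word in the description.
SAMPLE = "إِهْرَاقٌ\tʔih.ˈrˤɑˤː.qʊn\tʾihrā́qun\n"

entries = list(read_entries(io.StringIO(SAMPLE)))
entry = entries[0]
unvocalized = strip_diacritics(entry["vocalized"])
print(entry["vocalized"], unvocalized, entry["ipa"], entry["cars"])
```

For the real file, io.StringIO(SAMPLE) would be replaced by a file opened with encoding="utf-8" and newline="", and COLUMNS would need to be adjusted to match the documentation shipped with the resource. The database already supplies unvocalized forms, so the stripping step is shown only to make the vocalized/unvocalized relationship concrete.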

Creator(s)

Distributor(s)

ELRA

Right Holder(s)

Status: Accepted

ISLRN:

511-751-240-544-8

Version

1.0

Source

http://catalog.elra.info/en-us/repository/browse/ELRA-S0493

Resource Type

Lexicon

Media Type

Text

Language(s)

Arabic

Access Medium