Iraqi Arabic - English Lexical Database

Full Official Name: Iraqi Arabic - English Lexical Database
Submission date: Dec. 12, 2024, 4:55 p.m.

The Iraqi Arabic - English Lexical Database (LDC2025L01) was developed by the Linguistic Data Consortium (LDC). It comprises six interrelated tables and presents each Iraqi Arabic word as an orthographic form in Arabic script and a pronunciation form in International Phonetic Alphabet (IPA) notation. This release contains over 67,000 Iraqi Arabic words in Arabic script and IPA notation, and more than 120,000 English tokens.

This lexical database is the result of a collaboration with Georgetown University Press (GUP) to enhance and update three dialectal Arabic dictionaries (Iraqi, Moroccan and Syrian) originally published in paper form in the 1960s by GUP. As part of that collaboration, LDC also undertook to develop a lexical database for each dialect. The Georgetown Dictionary of Iraqi Arabic was published in 2013. That work was based on and expanded two dictionaries: A Dictionary of Iraqi Arabic: English-Arabic (Clarity, Stowasser and Wolfe, eds., 2003) and A Dictionary of Iraqi Arabic: Arabic-English (Woodhead and Beene, eds., 2003).

The enhancements developed by LDC included facilitating comparisons across Arabic dialects and Modern Standard Arabic by providing Arabic script spellings and IPA pronunciations for Iraqi words and phrases; promoting ease of use by language learners and researchers by developing reasonable orthographic conventions for applying the Arabic alphabet to the dialect; and facilitating a user's understanding of morphological and lexical relations by adding information on the linguistic structures of Iraqi Arabic.

The documentation accompanying this release includes instructions for combining the tables in this corpus with the tables in the Moroccan Arabic - English Lexical Database (LDC2023L01) into one database.
The number of entries in each table is as follows:

Roots: 4,512
Lemmas: 17,224
Wordforms: 22,988
Multi-word Expressions: 261
Definitions: 23,834
Phrases: 15,714

Each table is presented as a UTF-8 encoded, tab-delimited file with Unix-style (line-feed only) line breaks.
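The file format described above (UTF-8, tab-delimited, line-feed line breaks) can be read with standard tools. The following is a minimal sketch in Python, not part of the release or its documentation. The function name read_table is hypothetical, and the sketch makes no assumption about file names, column names, or whether a table starts with a header row; those details should be taken from the documentation that accompanies the release.

```python
import csv


def read_table(path):
    """Read one tab-delimited UTF-8 table into a list of rows.

    Each row is returned as a list of strings, one per tab-separated field.
    Nothing is assumed about column names or a header row.
    """
    # newline="" leaves line-ending handling to the csv module.
    # QUOTE_NONE treats quotation marks as ordinary characters, which is
    # the safer choice for lexical data that may contain them.
    with open(path, encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [row for row in reader]
```

Calling read_table on one of the six table files returns its rows, so len(read_table(path)) can be compared with the entry counts listed above. If a table turns out to have a header row, the row count would be one higher than the listed figure.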
