Corpus of Arabic Recordings

Downloadable Recordings of Urban Jordanian Arabic

For this corpus, 12 native speakers of Urban Jordanian Arabic (6 female, 6 male) were recorded. Urban Jordanian Arabic (UJA) is spoken by people living in the major cities of Jordan, who make up more than two-thirds of the country's population.

While the focus of these recordings was on the distinction between plain and emphatic consonants, the speech files contain a variety of consonants differing in place and manner of articulation, as well as vowels differing in both quality and length. Mono-, bi-, and trisyllabic words and nonwords were recorded, with the target consonant in initial, medial, or final position. Three repetitions of each token are (typically) included. All tokens were produced in the carrier phrase ehki _____ keman merah ('say _____ once again').

This corpus is based upon work supported by the National Science Foundation under Grant No. 0518969 (Acoustic and perceptual correlates of Emphasis in Arabic, Allard Jongman, P.I.). Please acknowledge the source of these materials in any presentation or publication.

Technical details

All speakers were recorded at the Hashemite University in Zarqa, Jordan, using a Marantz PMD671 portable solid state recorder and an Electro-Voice N/D767a microphone.

Sampling rate: 22.05 kHz

Audio format: .WAV
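
As a quick sanity check after downloading, the header of a recording can be compared against the sampling rate stated above. The following is a minimal sketch using Python's standard `wave` module; the function name `describe_wav` and the constant `EXPECTED_RATE_HZ` are illustrative, and the sketch assumes the files are uncompressed PCM WAV, which this page does not state explicitly.

```python
import wave

# 22.05 kHz, as stated in the technical details above.
EXPECTED_RATE_HZ = 22050


def describe_wav(path):
    """Return basic header information for one WAV file.

    Assumes an uncompressed PCM file, which is what the standard
    `wave` module can read.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
            "matches_corpus_rate": rate == EXPECTED_RATE_HZ,
        }
```

Calling `describe_wav` on a file from an unzipped speaker folder returns a small dictionary; a `matches_corpus_rate` value of `False` would suggest the file was resampled or is not from this corpus.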

Key to files and symbols

Each speaker's recordings can be separately downloaded as a zip folder. The 6 male speakers constitute folders m1-m6; f1-f6 contain the data for the 6 female speakers.
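
For scripts that loop over all speakers, the folder naming scheme can be generated rather than typed out. This is a small sketch; the helper names `speaker_folders` and `speaker_sex` are hypothetical, and only the folder names m1-m6 and f1-f6 come from the description above.

```python
def speaker_folders():
    """Return the twelve speaker folder names, male speakers first."""
    return [f"{sex}{n}" for sex in ("m", "f") for n in range(1, 7)]


def speaker_sex(folder):
    """Map a folder name such as 'f3' to 'female' or 'male'."""
    return {"m": "male", "f": "female"}[folder[0]]
```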

Vowels are coded as follows:

There are three short (i, a, u) and three long vowels (ii, aa, uu).

Consonants: the less obvious codings include:

th voiced dental fricative

s voiceless alveolar fricative

x voiceless velar fricative

q voiceless uvular stop

7 voiceless pharyngeal fricative

NOTE: the number '3' indicates emphasis. For example, 'bas' and 'bas3' represent a minimal pair, with a word-final plain and emphatic alveolar fricative, respectively.
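
The coding conventions above lend themselves to simple string handling when sorting tokens into plain and emphatic sets or by vowel length. The sketch below is an illustration only: the function names are hypothetical, it operates on token labels such as 'bas' and 'bas3' (how those labels appear in the actual file names is not specified here), and it assumes that '3' is used solely as the emphasis mark and that vowels are limited to the six listed above.

```python
import re

EMPHASIS_MARK = "3"  # per the NOTE above: '3' indicates emphasis

# Long vowels are listed first so that 'aa' is not read as two short vowels.
VOWEL_PATTERN = re.compile(r"ii|aa|uu|[iau]")


def is_emphatic(label):
    """True if the token label contains the emphasis mark."""
    return EMPHASIS_MARK in label


def plain_counterpart(label):
    """Return the label with the emphasis mark removed."""
    return label.replace(EMPHASIS_MARK, "")


def vowels(label):
    """Return the vowel codes in the label, in order of appearance."""
    return VOWEL_PATTERN.findall(label)
```

With the minimal pair from the NOTE, `is_emphatic("bas3")` is true and `plain_counterpart("bas3")` gives `"bas"`, so the two members of a pair can be matched by their shared plain form.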

Data by Subject