Phonetic aspects of second language acquisition

Several projects involve the acoustics, perception, and on-line processing of foreign-accented speech.

Instead of training non-native speakers of English to improve their English pronunciation, we tried to improve native English speakers’ comprehension of Spanish-accented English (Wade, 2003 dissertation; Wade, Jongman, and Sereno, 2007). Results showed that three days of high-variability training had virtually no effect. Subsequent acoustic analysis of the vowels of the target words revealed that non-native speakers’ productions were consistently more variable (by about 33%) in both height and backness than those of native speakers. The most heavily overlapping categories were precisely the ones that seemed to cause the most comprehension errors after training. These observations suggest that variability, rather than deviation from native-produced categories, is to blame for listeners’ difficulty in learning the accented categories. This account, in which the confusion and increased category overlap caused by non-native variability is the primary source of difficulty, rather than non-native speakers’ absolute deviation from native-like sounds, was further supported by a training study that systematically manipulated the mean and variability of the categories to which participants were exposed.
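The comparison of variability in height and backness can be illustrated with a minimal sketch. It assumes that height is indexed by the first formant (F1) and backness by the second formant (F2), and that spread is summarized as a standard deviation per dimension; the function names and the formant values are made up for illustration and are not the measure or the data used in the study.

```python
from statistics import pstdev


def dispersion(tokens):
    """Return (SD of F1, SD of F2) for a list of (F1, F2) vowel tokens in Hz."""
    f1_values = [f1 for f1, _ in tokens]
    f2_values = [f2 for _, f2 in tokens]
    return pstdev(f1_values), pstdev(f2_values)


def relative_increase(native_tokens, nonnative_tokens):
    """Proportional increase in spread of non-native over native tokens.

    Returns one value per dimension: (height, backness).
    A value of 0.33 would correspond to 33% more variability.
    """
    native_f1, native_f2 = dispersion(native_tokens)
    nonnative_f1, nonnative_f2 = dispersion(nonnative_tokens)
    return (
        (nonnative_f1 - native_f1) / native_f1,
        (nonnative_f2 - native_f2) / native_f2,
    )


# Hypothetical tokens of one vowel category (NOT data from the study).
native_tokens = [(280, 2250), (300, 2300), (320, 2350)]
nonnative_tokens = [(270, 2220), (300, 2300), (330, 2380)]

height_increase, backness_increase = relative_increase(native_tokens, nonnative_tokens)
```

With real measurements, the same computation would be run per vowel category and per speaker group, so that the categories with the largest spread can be compared against the categories that produce the most comprehension errors.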

At the suprasegmental level, we examine the acquisition of English lexical stress by non-native learners through production and perception studies. In our study of Arabic learners of English (Zuraiq, 2005 dissertation; Zuraiq and Sereno, 2007), we investigated the production of lexical stress by native speakers of English as well as learners of English. Two experiments were conducted. The first experiment examined acoustic cues to lexical stress in English minimal pairs, comparing stressed vowels to unstressed reduced vowels. The minimal pairs were recorded by native speakers of English and Arabic learners of English. Four acoustic cues were examined: duration, fundamental frequency (f0), amplitude, and second formant frequency (F2). Results showed that native speakers of English consistently use all four cues to signal stress, with lower f0, shorter duration, lower amplitude, and more reduced vowel quality for unstressed syllables. The Arabic learners of English were similar to native speakers in their use of duration and amplitude cues. Interestingly, the learners used f0 cues to a greater extent than native English speakers. They also did not reduce unstressed vowels, showing little difference in F2 between stressed and unstressed vowels. In a second experiment, Arabic bisyllabic minimal pairs contrasting in stress placement were examined to observe the cues used by Arabic speakers in their native language. The results consistently showed that Jordanian Arabic speakers use duration, amplitude, and f0 to cue stress in Arabic but do not reduce vowels to cue stress. In English, Arabic speakers increase their use of amplitude and duration cues to resemble English speakers, but they do not appropriately reduce unstressed vowels. Instead, they over-use fundamental frequency cues.

In our study of Mandarin learners of English (Lai, 2009 dissertation; Lai and Sereno, 2007), we conducted an acoustic study of the implementation of mean F0, max F0, duration, intensity, and F2 in stressed and unstressed vowels in noun-verb word pairs contrasting in stress location (e.g. the noun OBject and the verb obJECT). The results from native English speakers (n=10) showed that all correlates were utilized to signal stress in nouns. In verbs, however, mean and max F0 were not utilized and duration cues were amplified. Implementation patterns for Mandarin L2 learners (beginning, n=9; advanced, n=9) were similar to those of native speakers in nouns. In verbs, however, learners used mean and max F0 as well. Reduction of unstressed vowels was inconsistent in learners compared to native speakers. A perceptual study using the disyllabic nonword ‘dada’, with resynthesized max F0, duration, and vowel quality, was conducted to evaluate the perceptual relevance of those cues in stress perception. Results from an identification task indicate that full vowels induce significantly stronger stress perception in all listener groups. In terms of max F0 and duration, beginning listeners (n=25) relied mainly on duration, advanced listeners (n=25) focused more on max F0, and native listeners (n=25) made use of both duration and max F0. These findings are discussed in terms of the similarities and differences between the prosodic systems of Mandarin and English, as well as the possible discrepancies between production and perception data in second language learning research.

Most recently, we have tried to tease apart the contribution of segmental and suprasegmental information to the perception of foreign accent (Lammers, 2010, honors thesis). This study examined the relative impact of segmentals and suprasegmentals on accentedness, comprehensibility, and intelligibility. Two English and two Korean speakers recorded 40 sentences from the CID Everyday Sentences list. These sentences were then manipulated by combining the segmentals from one speaker with the suprasegmentals (specifically pitch contour and duration) of another speaker. Four versions of each sentence were created: one English control, one Korean control, and two Korean-English combinations (one with Korean suprasegmentals and English segmentals and the other with English suprasegmentals and Korean segmentals). These sentences were presented to 40 native English speakers, who transcribed the sentences for intelligibility and rated their comprehensibility and accentedness. The study found that segmentals had a significant effect on accentedness, comprehensibility, and intelligibility, whereas suprasegmentals had a significant effect only on intelligibility. It therefore seems that native speakers rely on segmentals when determining how accented and comprehensible non-native speech is.
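The four sentence versions described above form a 2 x 2 crossing of segmental source and suprasegmental source. A minimal sketch of that design follows; the dictionary keys and the labels "control" and "hybrid" are our own illustrative names, not terminology from the thesis.

```python
from itertools import product

LANGUAGES = ("English", "Korean")


def stimulus_conditions():
    """Enumerate the four versions of each sentence.

    Each version pairs the segmentals of one language group with the
    suprasegmentals (pitch contour and duration) of the same or the
    other group.
    """
    conditions = []
    for segmentals, suprasegmentals in product(LANGUAGES, repeat=2):
        # Matching sources give a control; mismatched sources give a combination.
        kind = "control" if segmentals == suprasegmentals else "hybrid"
        conditions.append(
            {
                "segmentals": segmentals,
                "suprasegmentals": suprasegmentals,
                "kind": kind,
            }
        )
    return conditions


conditions = stimulus_conditions()
```

Crossing the two factors in this way is what allows the effect of segmentals to be separated from the effect of suprasegmentals in the listeners' ratings and transcriptions.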

In terms of on-line processing, we investigated the effects of accentedness and phonetic inventory on word recognition (McCall, 2001, honors thesis; Sereno, McCall, Jongman, Dijkstra, and Van Heuven, 2002; Jongman and Wade, 2007). Forty undergraduates from the University of Kansas participated in a lexical decision task in which both native English and Dutch-accented English stimuli were used. A group of 40 Dutch college students was tested with the same materials at the University of Nijmegen in The Netherlands. The phonetic make-up of the stimuli was manipulated such that half of the stimuli consisted of phonemes that were unique to English and the other half consisted of phonemes that were common to English and Dutch. A nonword condition was included for control purposes.

Results for the reaction time data revealed a significant Listener by Speaker interaction. Post-hoc tests revealed that, as expected, American listeners responded significantly faster to native English speech than to Dutch-accented speech. In contrast, the Dutch listeners showed the opposite pattern, with significantly faster reaction times to the Dutch-accented speech. A similar significant interaction was also found for the error data. American listeners made significantly more errors on the Dutch-accented speech than on the native speech, while no difference was observed between native and non-native speech for the Dutch listeners. It seems that in a speeded lexical decision task, American listeners preferred the native speech while the Dutch listeners preferred their own Dutch-accented variety.
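A Listener by Speaker interaction of this kind can be summarized as a difference of differences between cell means. The sketch below is only an illustration of that arithmetic, not the statistical test used in the study; the function name, the cell labels, and the reaction times are hypothetical placeholders and are not results from the experiment.

```python
def accent_costs(cell_means):
    """Return the accent cost (accented minus native RT, in ms) per listener group.

    cell_means maps (listener, speaker) pairs to mean reaction times in ms.
    """
    american = cell_means[("American", "accented")] - cell_means[("American", "native")]
    dutch = cell_means[("Dutch", "accented")] - cell_means[("Dutch", "native")]
    return american, dutch


def interaction_contrast(cell_means):
    """Difference of differences for a 2 x 2 Listener by Speaker design."""
    american, dutch = accent_costs(cell_means)
    return american - dutch


# Placeholder cell means in ms (NOT results from the study).
placeholder_means = {
    ("American", "native"): 900.0,
    ("American", "accented"): 950.0,
    ("Dutch", "native"): 980.0,
    ("Dutch", "accented"): 940.0,
}

american_cost, dutch_cost = accent_costs(placeholder_means)
contrast = interaction_contrast(placeholder_means)
```

A positive cost for one listener group together with a negative cost for the other corresponds to the crossover pattern described above, in which each group responds faster to its own variety.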

A second set of results involves the contrast between stimuli containing common versus unique phonemes. When listening to native English, American listeners did not distinguish between common-phoneme and unique-phoneme stimuli, either in reaction time or in error rate. This should not come as a surprise, since the common/unique distinction is of course meaningless to monolingual English speakers. However, when American listeners heard Dutch-accented speech, they made significantly more errors on unique-phoneme stimuli than on common-phoneme stimuli.

A different pattern obtains for the Dutch listeners listening to Dutch-accented speech: Unlike the Americans, they have comparable reaction times and error rates for unique phoneme stimuli and common phoneme stimuli. The Dutch do not seem to be hindered by the accented unique stimuli. However, when the Dutch listeners listen to native English, a difference between common and unique phonemes shows up in terms of reaction time. The Dutch listeners have more trouble with the unique than with the common phoneme stimuli when produced by a native speaker: they respond more slowly to the unique phoneme stimuli. Presumably, this reflects the fact that the native English stimuli with unique phonemes mismatch the Dutch listeners’ internal representations for those sounds.
