1. Introduction
It has been observed that when an English word ending in a stop is adapted to Korean, Korean speakers often insert a vowel after the final stop (e.g., rope→[lophɨ], knit→[nithɨ], peak→[pikhɨ]). This vowel insertion is interesting since native Korean words may end in stops (e.g., /pɑp/→[pɑp˺] ‘meal’, /kot/→[kot˺] ‘soon’, /mok/→[mok˺] ‘neck’). Thus, the original foreign forms ending in a stop would be pronounceable in Korean. This vowel insertion has been referred to as unnecessary repair by Peperkamp (2005), where a foreign structure is changed even when the original structure would have been legal in the recipient language (Golston & Yang, 2001; Kang, 2003; Peperkamp, 2005).
Previous studies have focused mainly on vowel insertion patterns per se, in which some words always take an epenthetic vowel (e.g., rope→[lophɨ]), some never do (e.g., group→[kɨrup˺]), and some vary between these two options (e.g., tape→[thɛip˺]~[thɛiphɨ]). Rhee & Choi (2001) conducted a statistical analysis of the frequency of variable vowel epenthesis, and Jun (2002) reported on a large-scale experiment involving 260 college students; however, both studies simply provided a list of factors affecting the likelihood of vowel insertion, and the former relied on standardized written loanword data. Boersma & Hamann (2009) accounted for the phenomenon in terms of an Optimality-theoretic grammar model. Kwon (2017) dealt with inter-speaker variation to see whether Korean listeners' experience with English affects online adaptation, which is not directly connected to the main interest of the present study. Importantly, none of these studies has provided a phonetic analysis of Korean speakers' productions. Thus, the contribution of this paper is that it pays close attention to acoustic details of the recipient language in seeking an explanation for unnecessary repair, which is one of the puzzling emergent patterns identified in the literature on loanword phonology.
Among earlier studies on unnecessary adaptations, Kang (2003) claims that this seemingly unmotivated vowel epenthesis is motivated by perceptual similarity between Korean and English forms. Kang discusses several perceptual factors promoting vowel epenthesis following English word-final postvocalic stops. One of those factors is release of final stops. Korean word-final stops are never released (Sohn, 1999), whereas word-final stops in English are variably released (Byrd, 1992; Crystal & House, 1988). Kang argues that vowel insertion may make the Korean output form perceptually similar to an English final released stop, noting that stop release in English and an epenthetic vowel in Korean are phonetically similar.
Kang (2003) describes the vowel insertion pattern in this position based on a survey of a loanword list compiled by the National Academy of the Korean Language (1991). Her list contained loanwords from English source words that ended in postvocalic stops. According to her report, the overall frequency of final vowel insertion was 50.3%, that of having no inserted vowel was 43.6%, and that of variable insertion was 6% (Kang, 2003: 229). However, this result might not reflect the recent tendency of loanwords borrowed from English, since her list was based on loanwords gathered from books published in 1990.
The current study has built a new corpus consisting of material compiled in more recent publications of the National Academy of the Korean Language (2001; 2002). The corpus data are based on 540 Korean loanwords borrowed from English whose source words end in a stop. Out of 540 English words with a final stop, 264 were consistently adapted with final vowel insertion and 214 were consistently adapted without final vowel insertion, while 62 were variably adapted both with and without vowel insertion. That is, the frequency of vowel insertion patterns in the corpus is 49% for vowel insertion, 40% for no vowel insertion, and 11% for optional vowel insertion. This finding shows that vowel insertion is more frequent than lack of insertion, even though final stops are permissible codas in Korean.
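The percentages follow directly from the counts reported above. A minimal Python check (the variable names are illustrative and not part of the study) is:

```python
# Counts reported in the text for the 540-word corpus.
counts = {
    "vowel insertion": 264,
    "no vowel insertion": 214,
    "optional vowel insertion": 62,
}

total = sum(counts.values())

# Share of each adaptation pattern, rounded to the nearest whole percent.
percentages = {pattern: round(100 * n / total) for pattern, n in counts.items()}

for pattern, pct in percentages.items():
    print(f"{pattern}: {counts[pattern]}/{total} = {pct}%")
```

Rounding to whole percentages reproduces the 49%, 40%, and 11% figures given in the text.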
The higher frequency of final vowel insertion in Korean loanwords borrowed from English might be attributed to the fact that the corpus focuses on English words that have already entered the Korean lexicon. In order to investigate whether Korean speakers really insert a vowel following the English final stop, this study conducted an online production task where Korean participants listened to English nonce words ending in a stop and repeated what they heard. As a control group, English speakers were recruited for the same task. The repetition task was essentially an L2 production experiment. This task was actually about both perception and production since participants first heard and then reproduced what they heard. If release causes them to hear a final vowel, they should produce a final vowel, and an illusory vowel is expected to be reflected in their production. This laboratory adaptation study clearly has an advantage over a production method like a reading task in that the repetition experiment taps into two different levels of perception and production.
Following the completion of the repetition task, to determine whether Korean speakers inserted a vowel after final stops, their productions were analyzed in terms of the duration of noise intervals following the closure of final stops. This experiment can also serve to compare the patterns in integrated loanwords of the corpus to Korean speakers’ online production of English nonce words that were completely new to Korean participants.
2. Method
Experimental items consisted of 36 English nonce forms: 12 monosyllabic, 12 disyllabic, and 12 trisyllabic forms. English nonce forms consisted of words with a lax pre-final vowel [ɛ]. The shape of the monosyllabic words was CVC; that of disyllabic words was C1V1C2V2C; and that of trisyllabic words was C1V1C2V2C3V3C. Disyllabic and trisyllabic items had final stressed syllables (e.g., goˈzɛp˺, ˈgomoˌzɛp˺). Items varied in terms of three different linguistic factors: (i) release of final stops, i.e., 18 items ending in unreleased stops (e.g., kɛp˺, kɛb˺) and 18 items ending in released stops (e.g., kɛp, kɛb); (ii) voicing of final stops, i.e., 18 items ending in voiceless stops (e.g., kɛp˺, kɛp) and 18 items ending in voiced stops (e.g., kɛb˺, kɛb); and (iii) place of final stops, i.e., 12 items ending in labial stops (e.g., kɛp˺, kɛb˺, kɛp, kɛb), 12 items ending in coronal stops (e.g., kɛt˺, kɛd˺, kɛt, kɛd) and 12 items ending in dorsal stops (e.g., fɛk˺, fɛg˺, fɛk, fɛg). The set of stimuli used in the experiment is given in Table 1.
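The item counts above are consistent with a fully crossed design (3 syllable counts × 2 release types × 2 voicing values × 3 places = 36). The sketch below assumes that crossing, which the text does not state explicitly, and uses illustrative labels to check the marginal counts:

```python
from collections import Counter
from itertools import product

# Factor levels as described in the text; the labels are illustrative.
SYLLABLES = (1, 2, 3)
RELEASE = ("unreleased", "released")
VOICING = ("voiceless", "voiced")
PLACE = ("labial", "coronal", "dorsal")

# Assumed fully crossed design: one item per combination of the four factors.
design = [
    {"syllables": s, "release": r, "voicing": v, "place": p}
    for s, r, v, p in product(SYLLABLES, RELEASE, VOICING, PLACE)
]


def cell_counts(factor):
    """Count how many items fall under each level of one factor."""
    return Counter(item[factor] for item in design)


print(len(design))
for factor in ("syllables", "release", "voicing", "place"):
    print(factor, dict(cell_counts(factor)))
```

Under this assumption each syllable count and each place has 12 items, and each release and voicing level has 18, matching the counts reported above.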
To create the auditory stimuli, a female speaker of American English produced the experimental items. The speaker was a linguist who was able to carefully control release of the English final stop. Praat (Boersma & Weenink, 2018) was used to check the presence/absence of release for the auditory stimuli. The speaker recorded the stimuli in a sound-attenuated booth using a Zoom H4n recorder at a 44.1 kHz sampling rate (16 bits per sample) and a Shure SM57 unidirectional dynamic microphone.
All of the auditory stimuli were analyzed in Praat to verify their phonetic characteristics related to release and voicing of final stops: (i) presence/absence of final stop release, (ii) length of final stop release, and (iii) closure voicing length of voiced final stops. All of these acoustic properties were hypothesized to influence vowel insertion by Korean speakers. Specifically, regarding stop release duration, Wilson et al. (2014) suggest that L2 speakers are more likely to interpret longer stop releases as having an epenthetic vowel due to the acoustic similarity between a longer release and a vowel. The duration of closure voicing for voiced stops also helps confirm that there is a phonetic difference between voiced and voiceless stops in the auditory stimuli.
Each stimulus classified as having a released final stop contained evidence of visible release on the waveform and spectrogram, and no visible release was observed for stops classified as unreleased. Figures 1 and 2, waveform and spectrogram for the stimuli [kɛp˺] and [kɛp], are representative. All the other stimuli also show similar release or non-release, which is consistent with this classification.
The stop release duration was measured for released final stops. The onset of stop release was defined as the point at which a pulse of acoustic energy for the release of the final stop appeared. The offset of stop release was the point at which acoustic energy of the stop release significantly decreased. As shown in Table 2, the mean duration of stop release was longer for voiceless final stops than for voiced final stops.
| (ms) | Voiceless | | | Voiced | | |
|---|---|---|---|---|---|---|
| Stops | Lab | Cor | Dor | Lab | Cor | Dor |
| Burst duration | 11 | 15 | 25 | 12 | 15 | 18 |
| Mean | 17 | | | 15 | | |
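The means in Table 2 can be recovered from the per-place values, assuming each mean is the unweighted average over the three places of articulation (an assumption that matches the reported figures). A minimal Python sketch:

```python
# Release (burst) durations in ms from Table 2, by place of articulation.
burst_ms = {
    "voiceless": {"labial": 11, "coronal": 15, "dorsal": 25},
    "voiced": {"labial": 12, "coronal": 15, "dorsal": 18},
}

# Unweighted mean across the three places for each voicing category.
mean_ms = {
    voicing: sum(by_place.values()) / len(by_place)
    for voicing, by_place in burst_ms.items()
}

print(mean_ms)
```

The difference between the two categories comes almost entirely from the dorsal stops (25 ms vs. 18 ms); the labial and coronal values are nearly identical across voicing.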
The stop closure voicing length was measured for released and unreleased voiced final stops. The onset of stop voicing was defined as the point at which acoustic energy of the preceding vowel significantly decreased and there was a change in periodicity that signaled the beginning of a stop closure. The offset of stop voicing during the closure was the point at which acoustic energy and periodicity ceased. The duration of voicing for unreleased and released voiced final stops is given in Tables 3 and 4. The results for voicing length confirmed that there was an acoustic difference between voiced and voiceless stops in the stimuli.
Ten Korean and ten English native speakers participated in the experiment. The Korean participants, five males and five females (Mean age=23.9, SD=2.0), were recruited from Sogang University in Seoul, Korea. Their average age of first exposure to English study was 10.2 years (SD=1.9). No participants had lived in an English-speaking country or majored in English at the time of the task. As a control group, ten native speakers of American English recruited from Stony Brook University participated in the repetition experiment, five males and five females (Mean age=26.3, SD=4.3). They were monolingual and had no experience with Korean. None of the participants reported any hearing or speech disorders. All were paid for their participation after completing the experiment.
Participants were asked to listen to auditory stimuli played from a laptop computer and to repeat what they heard. They were given no orthographic or other information, only aural information through headphones. Each frame consisted of repetition of a stimulus followed by the phrase "Please repeat". After this, participants were given three seconds to produce the stimulus. The participants were familiarized with the experimental task by taking a practice trial round with a couple of words that were not included in the test items. The recording of the Korean group was conducted in a sound-attenuated booth in the English Department at Sogang University, and that of the English group in the Linguistics Department at Stony Brook University. Both recordings were done using a Shure SM57 microphone and a Zoom H4n recorder at a 44.1 kHz sampling rate.
The perceptual similarity approach, proposed by Kang (2003) following Steriade (2001), assumes that Korean speakers accurately perceive the English forms, but they insert a vowel in their production to maintain perceptual similarity between the English and Korean forms. Thus, this approach predicts for the production experiment that Korean speakers will produce the English final stop as a stop followed by a vowel although they correctly perceive the stop as a final consonant.
Producing C as CV should result in noise intervals after the final consonant that are longer than those associated with producing C as C, even where C is released: producing C as C involves only the transient and frication of the stop, while producing C as CV involves aspiration and onset of voicing following the transient and frication. As will be discussed in the following section, noise intervals were defined as all noise produced following the closure of final stops. For the production task, Korean speakers are predicted to produce longer noise intervals than English speakers, who never insert a vowel after the final stop and simply release the stop. The vowel that is expected to be inserted by Korean speakers is predicted to be perceived as an epenthetic vowel by English listeners. The predictions given in (1) are tested by comparing the productions of Korean and English speakers and investigating the noise intervals of Korean speakers.
(1) Predictions for the production experiment
a. Korean speakers will produce significantly longer noise intervals after English final stops than English speakers.
b. The longer noise intervals of Korean speakers will be perceived by English listeners as an epenthetic vowel.
In the following section, I discuss the noise intervals after the stop closure of the final stops and check if noise intervals produced by Korean speakers are longer when compared to those of English speakers.
The productions of ten Korean and ten English speakers were measured using Praat (Boersma & Weenink, 2018). For each speaker, a noise interval following the closure of final stops was measured. I first discuss the definition of burst noise in the description of noise events of syllable-initial stops and then turn to the "noise intervals" that the current study addresses. Kent & Read (2002) describe a sequence of acoustic events associated with progression from a word-initial stop to a vowel: transient, frication, aspiration, and voicing. On the release, a pulse of energy is created as the air escapes. This plosion is called a transient because of its brevity and momentary character, although this terminology is not widely used (Kent & Read, 2002: 141). The transient is one of the shortest acoustic events in speech, no longer than 5 to 40 ms in duration. It is followed by frication, which is a turbulence noise created as the oral constriction is gradually released. Following the transient and frication, aspiration occurs in the case of word-initial stops. Aspiration is followed by onset of voicing, where vocal fold vibration for the vowel is initiated.
Unlike word-initial stop consonants, stops in word-final position, which are the focus of this study, may be either released or unreleased. When the stop is not released, the closure is maintained until after the utterance is finished and no burst such as transient and frication occurs. On the other hand, when the final stop is released, transient and frication appear, as in word-initial stops. This is where we expect to see differences between the productions of English and Korean speakers. English speakers who release the final stops should produce only transient and frication; however, Korean speakers are predicted to insert a vowel following the final released stop and hence produce transient, frication, aspiration and voicing of an epenthetic vowel, just like in word-initial stops. Thus, the duration of noise intervals after the stop closure is expected to be much longer in the productions of Korean speakers compared to those of English speakers since noise intervals of Korean speakers are predicted to include all of the acoustic events from transient through onset of voicing.
Measurements were conducted for items ending in released stops. The onset of noise intervals was defined as the point at which there was a pulse of acoustic energy for the release of the final stop. The offset of noise intervals was the point at which frication of the final stop significantly decreased. Here, only correct productions were included in the analysis, and error responses were excluded. Examples of incorrect responses were devoicing (b, d, g→p, t, k), voicing (p, t, k→b, d, g), and fricativization (b→v). Figures 3 through 6 are representative samples of how voiceless and voiced final stops were segmented.
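The measurement and exclusion steps described above can be sketched as follows. The `Token` record, the speaker codes, and all time values are hypothetical and serve only to illustrate the procedure; the actual segmentation was done by hand in Praat.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One production of a released-stop item (all values hypothetical)."""
    speaker: str
    target: str      # intended final stop, e.g. "b"
    produced: str    # final segment actually produced, e.g. "p"
    onset_s: float   # release pulse of the final stop, in seconds
    offset_s: float  # point where frication drops off, in seconds


def is_correct(token):
    """A response counts as correct only if the final segment matches the target."""
    return token.produced == token.target


def noise_interval_ms(token):
    """Duration from release onset to frication offset, in milliseconds."""
    return (token.offset_s - token.onset_s) * 1000.0


tokens = [
    Token("K01", "p", "p", 0.412, 0.598),
    Token("K01", "b", "p", 0.390, 0.455),  # devoicing error, excluded
    Token("E01", "p", "p", 0.405, 0.459),
    Token("E01", "b", "v", 0.380, 0.470),  # fricativization error, excluded
]

# Keep only correct productions, as in the analysis described above.
durations = [round(noise_interval_ms(t), 1) for t in tokens if is_correct(t)]
print(durations)
```

Only the two correct tokens contribute a duration; the devoiced and fricativized responses are dropped before any measurement is taken.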
Results
A statistical analysis was conducted using a linear mixed-effects model (Baayen et al., 2008) to examine the difference in noise intervals between the Korean and English groups. The analysis was carried out using the lmer function in the lme4 package (Bates et al., 2015) for R (R Core Team, 2017). The dependent variable was the duration of noise intervals following the final stops. The fixed-effect predictor was Group (Korean or English), coded using deviation coding (English=–0.5; Korean=0.5). Random effects included participants and items. The random-intercept model converged, and only random intercepts were included for participants and items.
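Written out, a random-intercept model of the kind described above has the following form. The symbols are notation introduced here for illustration and are not taken from the original analysis: $y_{ij}$ is the noise-interval duration for participant $i$ on item $j$, and $u_i$ and $w_j$ are the participant and item intercepts.

```latex
y_{ij} = \beta_0 + \beta_1\,\mathrm{Group}_i + u_i + w_j + \varepsilon_{ij},
\qquad
\mathrm{Group}_i =
\begin{cases}
-0.5 & \text{English} \\
+0.5 & \text{Korean}
\end{cases}

u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
w_j \sim \mathcal{N}(0, \sigma_w^2), \quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Because the two codes differ by exactly one unit, $\beta_1$ estimates the Korean minus English difference in mean noise-interval duration. In lme4 formula syntax such a model would be written along the lines of `duration ~ group + (1 | participant) + (1 | item)`, where the column names are hypothetical.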
The statistical model confirmed that Korean participants had significantly longer noise intervals than English participants (β=0.129, SE=0.009, t=13.81, p<0.001), which was consistent with the prediction about differences in noise intervals after stop closure of final stops between the two speaker groups. Table 5 shows that the mean duration of noise intervals for Korean speakers was 188 ms, while that of English speakers was 53 ms. Male speakers produced longer noise intervals than female speakers in both Korean and English participant groups.
We now turn to the next question: is this longer noise interval of Korean speakers heard as an epenthetic vowel by English listeners? This question is important in deciding whether the Korean participants were producing final released stops or whether they were actually inserting a vowel after the final stop. In the following section, I discuss how English speakers transcribed the productions of Korean participants to determine whether English speakers actually perceive productions of Korean speakers as having an epenthetic vowel.
To see if the longer noise intervals found in Korean speakers' productions were heard as epenthetic vowels by English listeners, the productions of Korean speakers were transcribed by two phonetically trained native English speakers. Transcribers were asked to decide whether the Korean participants were producing a vowel word-finally or whether they were just releasing the word-final stop. Forms on which the two transcribers did not agree were transcribed by a third transcriber. The results of the transcriptions showed that only 3% of total correct productions were heard as having an epenthetic vowel, i.e., 8 responses out of 231 were perceived as having a final vowel. As in the waveform analysis of noise intervals, only correct responses were included in the transcriptions; responses that were incorrectly produced (i.e., voiced segments produced as voiceless, voiceless segments as voiced, or stops produced as fricatives) were excluded from the analysis. The ten Korean participants produced 231 correct responses in total out of 360 trials (36 stimuli×10 participants); 180 of these trials presented items ending in released stops.
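The transcription procedure and the resulting rate can be sketched as below. The sketch assumes that the third transcription settles any disagreement, which the text implies but does not state outright; the example labels are hypothetical, and only the 8 and 231 come from the reported results.

```python
def adjudicate(first, second, third=None):
    """Return the agreed label; fall back to a third transcriber on disagreement."""
    if first == second:
        return first
    if third is None:
        raise ValueError("transcribers disagree and no third label was given")
    return third


# Hypothetical labels: "CV" = final vowel heard, "C" = released stop only.
labels = [
    adjudicate("C", "C"),
    adjudicate("CV", "C", third="C"),
    adjudicate("CV", "CV"),
]

# Rate reported in the text: 8 of 231 correct productions heard with a final vowel.
cv_rate = 8 / 231
print(labels, f"{cv_rate:.1%}")
```

With only two possible labels, the third transcriber necessarily sides with one of the first two, so this scheme amounts to a majority vote.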
Table 6 gives the numbers of tokens perceived as having an epenthetic vowel for each Korean participant and mean duration of final vowels for tokens heard as CV; Figure 7 gives the percent of tokens perceived as CV. As shown in the figure, even the highest CV rate (S5) was only 13% and six participants (S1, S2, S3, S4, S7, and S10) had no final vowel transcribed in any of their productions (CV=0%). Although the CV rate of male speakers was higher than that of female speakers, the mean rate for male speakers was still below 5%.
The first prediction for the production task was confirmed: Korean speakers produced significantly longer noise intervals after English final stops than English speakers. The second prediction, on the other hand, was not confirmed: the longer noise intervals of Korean participants were not perceived by English listeners as an epenthetic vowel.
3. Discussion
The fact that more than 95% of Korean participants' productions were perceived to include no epenthetic vowel was not consistent with the loanword data, where 49% of words showed vowel insertion. The result of the production task was also inconsistent with the prediction of the perceptual similarity approach. This view predicted that because Korean speakers correctly perceive an English final released stop as a final consonant, they would insert a vowel to make the English sound more similar to the Korean sound.
The difference in the results between the loanword analysis and the production task might have arisen from the fact that the corpus study was based on written integrated loanwords. Korean loans written in books tend to observe the guidelines of the National Academy of the Korean Language, where vowel insertion is required when certain conditions are satisfied. For example, the guidelines indicate that a word-final voiced stop shall be written with [ɨ] and that a word-final voiceless stop after a lax vowel shall be written as a coda while one after a tense vowel shall be followed by [ɨ] (http://www.korean.go.kr/). However, in the production experiment, Korean participants were asked to immediately repeat a series of English nonce words. The results from the online adaptation suggest that speakers were trying to imitate the release of the English final stop in an exaggerated manner by producing longer noise intervals after the stop closure. The longer noise interval was not identified as an epenthetic vowel by English listeners. That is, the productions of Korean participants as perceived by English speakers almost never included final vowel insertion. Therefore, the results of the production task were not predicted by the perceptual similarity approach.
There are other possible explanations for this unexpected finding. First, it is possible that the nature of the task was simply too different from actual loan adaptation, where listeners might have more competing demands on their attention. Here in the production task, participants heard and repeated a single word, whereas in loanword adaptation listeners might hear different words in different contexts while they are doing real processing and therefore be more likely to misperceive. It is also possible that Korean speakers did intend to produce a final vowel, but that English listeners failed to hear this vowel because Korean high vowels tend to be devoiced after aspirated stops (Jun & Beckman, 1994). The waveform analysis of participants' productions showed that some of their final vowels really tended to be devoiced following released stops, which suggests that English listeners might have perceived the Korean devoiced vowel as consonant release.
The latter possibility will be pursued in future research by examining the phonetic properties of epenthetic vowels in Korean and comparing them with those of lexical vowels. If it turns out that epenthetic vowels are acoustically different from lexical vowels and that they are phonetically close to devoiced vowels, that would account for the finding that English transcribers perceived Koreans' long noise intervals as stop release. Moreover, the mismatch between the loanword patterns and the production experiment raises the question of what actually happens in perception of English forms by Korean speakers. As the perceptual similarity approach assumes, Korean listeners might accurately perceive an English final released stop as a final consonant. On the other hand, they might incorrectly perceive it as a stop followed by a vowel (misperception approach, Boersma & Hamann, 2009; Broselow, 2009; de Jong & Park, 2012; Dupoux et al., 1999; Kwon, 2017; Silverman, 1992; among others). I will conduct a perception experiment in a future study to evaluate these two approaches and to investigate the perception of L2 speakers.