Phonetics

The relationship between vowel production and proficiency levels in L2 English produced by Korean EFL learners*

Seohee Lee1, Seok-Chae Rhee1,**
1Department of English Language and Literature, Yonsei University, Seoul, Korea
**Corresponding author : scrhee@yonsei.ac.kr

© Copyright 2019 Korean Society of Speech Sciences. This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: Mar 19, 2019; Revised: Jun 18, 2019; Accepted: Jun 23, 2019

Published Online: Jun 30, 2019

Abstract

This study explored the relationship between accurate vowel production and proficiency levels in L2 English produced by Korean EFL adult learners. To this end, nine English vowels /i, ɪ, ɛ, æ, ʌ, ɔ, ɑ, ʊ, u/ were selected and adjacent vowels paired up (e.g., /i/-/ɪ/, /u/-/ʊ/, /ɛ/-/æ/, /ʌ/-/ɔ/, /ɔ/-/ɑ/). The spectral features of the pairs were measured instrumentally, namely F1 (indicating tongue height) and F2 (indicating tongue backness). Meanwhile, the durations as well as spectral features of the tense and lax counterparts in /i/-/ɪ/ and /u/-/ʊ/ were measured, as both temporal and spectral features are important in distinguishing them. The findings of this study confirm that higher-rated speakers were better able to distinguish the contrasts in the front vowel pairs /i/-/ɪ/ and /ɛ/-/æ/ than lower-rated learners, but in the central and back vowel pairs /u/-/ʊ/ and /ʌ/-/ɔ/ (though not /ɔ/-/ɑ/), Korean EFL learners generally showed difficulty distinguishing adjacent vowels with spectral cues. On the other hand, the durations of the tense and lax vowels showed that the lower-rated speakers were less able to use the temporal feature to differentiate tense vowels from their lax counterparts, unlike previous studies that found that in general Korean learners depend excessively on the temporal cue to distinguish tense and lax vowels.

Keywords: L2 production; Korean EFL learners; proficiency effects; L2 vowel production

1. Introduction

It is challenging for English as a Foreign Language (EFL) learners to acquire English vowels. Many previous studies have demonstrated that non-native English speakers find it difficult to identify English vowels (Cho et al., 2013; Cutler et al., 2005; Flege et al., 1997; Franklin, 2009; Wang & van Heuven, 2006).

First and foremost, numerous earlier studies have established that language transfer contributes heavily to second/foreign language acquisition. Lado (1957) proposed the Contrastive Analysis Hypothesis (CAH) to explain the relationship between a native language (L1) and a second language (L2) in L2 acquisition: the greater the similarity between L1 and L2, the easier the L2 is to learn, and the greater the dissimilarity, the harder it is to learn. However, CAH has been criticized and disputed on the grounds that it is insufficient for predicting the degree and directionality of difficulty for second-language learners (Eckman, 1977); furthermore, it cannot ensure that similarity between two languages necessarily facilitates learning or that difference automatically inhibits the process.

Meanwhile, the Speech Learning Model (SLM), suggested by Flege (1987), holds that the greater the similarity between L1 and L2, the more difficult it is for second-language learners to establish a new category for the L2 system. Additionally, Flege (1995) proposed an improved version of SLM that regards L2 learning experience as one of the most important factors in L2 acquisition. Learning a non-native sound that is absent from the native inventory may seem difficult at first, but it can be learned with relative ease as learners extend their L2 experience/exposure. That is, L2 learners with extended L2 experience can reach native-like performance in L2 acquisition (Ho, 2010).

With regard to acquiring L2 English, vowels are more uncertain and equivocal for second-language learners than consonants. This is largely because the articulation of vowels is harder to describe than that of consonants, regardless of what the second language is (Franklin, 2009; Jones, 1960). Vowel articulation involves complex movements of the tongue and lips (e.g., tongue height, tongue backness, and lip rounding), making it difficult for non-native English speakers and even L1 English speakers to pronounce English vowels correctly (Jones, 1960; Ladefoged & Disner, 2012:129). In English, tenseness, the degree of tongue tension, also affects the articulation of vowels. For this reason, EFL learners whose L1 vowels are not distinguished by tenseness, such as Chinese or Korean learners, experience greater difficulty in identifying English vowels than those whose L1 distinguishes tense and lax vowels, such as Dutch or German learners (the relevant studies are reviewed in section 1.1).

Furthermore, compared to other languages in the world, the relatively high density of the English vowel system causes pronunciation difficulties for EFL learners insofar as they may encounter vowels in English that simply do not exist in their native language (Franklin, 2009; Maddieson, 1997). For instance, in the case of monophthongs, American English has approximately 14 or 15 vowel qualities /i, ɪ, u, ʊ, e, ɛ, æ, ɚ, ɝ, ə, ʌ, ɔ, o, ɑ, (a)/ in its vowel inventory (Kenyon & Knott, 1953; Reetz & Jongman, 2009). In fact, fewer than 10% of languages in the world contain 15 or more vowel phonemes in their vowel systems (Franklin, 2009). In addition to English, Dutch and German have 13 and roughly 13 to 17 plain vowels, respectively. However, approximately 60% of the world’s languages contain only six or fewer pure vowel qualities in their vowel systems (Maddieson, 1997). For example, Spanish, Japanese, and Mandarin Chinese (i.e., Beijing dialect) each have five pure vowel qualities in their vowel inventories (Franklin, 2009). Korean, on the other hand, has ten vowel phonemes /i, y, e, ø, ɛ, ɨ, ʌ, a, u, o/ (Yang, 1996), an inventory that is denser than those of Spanish, Japanese, and Mandarin Chinese but less dense than that of English (Franklin, 2009).

Based on the theoretical background outlined above, it can be predicted that EFL learners 1) whose native languages (L1) differ substantially from English, such as Korean, Chinese, and Japanese, and 2) who have different levels of L2 proficiency, as claimed by SLM, show varied patterns of English vowel production. Given these predictions, this paper concentrates on the relationship between English vowel production and L2 proficiency in L2 production by EFL learners, especially Korean EFL learners.

1.1. Non-Native Speakers’ English Vowels

There have been a variety of cross-linguistic studies on English vowel production by non-native English speakers with diverse native language (L1) backgrounds. For instance, Wang & van Heuven (2006) conducted research on ten English vowels /i, ɪ, u, ʊ, ɛ, ɔ, æ, ʌ, o, e/ produced by Chinese, Dutch, and American speakers. The study classified the English vowels into two subsets, the short/lax vowels (/ɪ, ɛ, ʌ, ʊ/) and the long/tense vowels (/i, e, æ, ɔ, o, u/), and then compared the two subsets in terms of spectral (i.e., formant frequencies) and temporal (i.e., vowel duration) features across the three L1 groups. According to the results, L1 English speakers separated the two subsets more accurately in both spectral and temporal aspects than the two non-native (i.e., Chinese and Dutch) groups. Chinese speakers largely failed to differentiate the short/lax vowels from their long/tense counterparts spectrally, while native Dutch speakers demonstrated clear spectral differences between the vowels. This is because Chinese does not treat vowel length (i.e., short/lax vs. long/tense) as a vowel feature at the phonological level, whereas Dutch, like English, distinguishes phonetically lax and tense vowels in its vowel system (Wang & van Heuven, 2006). However, as for the temporal feature (i.e., vowel duration), native Chinese speakers demonstrated a clearer temporal distinction between the short/lax and long/tense vowels than Dutch EFL learners. Dutch speakers, by contrast, scarcely distinguished the tense and lax vowel contrast /u/-/ʊ/. Although both English and Dutch have tense and lax vowel subsets in their vowel inventories, the Dutch vowel system lacks the /u/-/ʊ/ contrast, which prevents Dutch EFL speakers from separating it (Li & Lee, 2017; Wang & van Heuven, 2006).

Studies have also been conducted on the production of English vowels by Japanese learners. Ingram & Park (1997) investigated the perception and production of Australian English vowels /i, ɪ, ɛ, æ, a/ by Korean and Japanese learners of English. A noteworthy finding from the study was that Japanese learners of English clearly perceived the /æ/ vowel that was absent in their L1 inventory; furthermore, they distinguished the /æ/ vowel from its near neighbor /a/. The study concluded that this was, at least in part, because Japanese uses phonological length as a main acoustic cue (Tsujimura, 1996), suggesting that this L1 characteristic affected the perception of non-native vowels. Meanwhile, Korean speakers rarely separated the /ɛ/-/æ/ contrast since the recent phonological merger of /ɛ/ and /e/ in their L1 vowel system was reflected in their L2 production (Ingram & Park, 1997).

On the whole, these previous studies confirmed that L1 transfer causes a lack of fluency in the non-native pronunciation of L2 sounds that do not occur in the learners’ native language (Best, 1991). Other empirical studies regarding English vowels produced by L1 English speakers and Korean learners of English are discussed in section 1.2.

1.2. Korean EFL Learners’ English Vowels

English and Korean have basically disparate vowel systems. The two languages share /i, e, ɛ, o, u, ʌ/ in common. However, Korean does not have the four vowels /ɪ, æ, ʊ, ɑ/ (the vowel /ɑ/ in English is not identical to the vowel /a/ in Korean) that are present in the English vowel inventory but instead uses two front rounded vowels /ø, y/ and one high central vowel /ɨ/, which are not found in English (Franklin, 2009; Yang, 1996). In addition, as mentioned above, English has tense and lax vowel subsets in its inventory such as /i/-/ɪ/, /ɛ/-/æ/, and /u/-/ʊ/ whereas there are no Korean vowels that are distinguished by the degree of tongue tension (Cho et al., 2013; Flege et al., 1997; Ingram & Park, 1997; Li & Lee, 2017; Kim, 2007; Koo, 2000; Tsukada et al., 2005; Yang, 1996). Thus, it can be expected that Korean EFL learners may have difficulty in pronouncing vowels in English that do not exist in the Korean inventory. There are a number of prior studies concerning English vowel production by native speakers of English and Korean EFL speakers.

Flege et al. (1997) compared German, Spanish, Mandarin Chinese, and Korean EFL learners with native English speakers in the production of the front English vowel contrasts /i/-/ɪ/ and /ɛ/-/æ/. The results indicated that Korean speakers, in particular, depended unduly on vowel length in separating the contrasts compared to the other non-native groups. Yang (2008) examined the English tense and lax vowels /i/-/ɪ/ and /u/-/ʊ/ produced by Korean and American males. Both groups temporally separated the contrasts by producing the tense vowels much longer than their lax counterparts. By contrast, a spectral distinction between the pairs was not apparent in the Korean speakers’ productions but was apparent in those of the native English speakers, who showed a marked contrast between the pairs in the first formant (F1), which reflects tongue height.

All these findings confirm that Korean EFL learners barely separate English tense and lax contrasts spectrally because the Korean vowel system lacks a tense/lax distinction. Because the Korean inventory has fewer vowels than the English inventory, Korean learners of English produce a smaller English vowel space than native English speakers, which constrains their phonetic realization of English vowels, especially phonetically neighboring vowels (Franklin, 2009; Koo, 2000). In other words, Korean learners of English are comparable to Chinese and Japanese learners of English insofar as each group demonstrates negative L1 transfer in the production of L2 English vowels that are not present in the native language (L1) inventory.

However, according to Flege’s SLM (1995), L2 learners with extended L2 experience/exposure can improve their L2 performance. Namely, these instances of negative L1 transfer can be overcome as learners’ L2 experience/exposure grows. Indeed, there are numerous empirical studies to support the claim in SLM, illustrated in section 1.3.

1.3. Learner Factors Affecting L2 English Production

Numerous studies have classified EFL learners’ L2 experience according to several factors and investigated whether these factors affect L2 English production.

Flege et al. (1997) studied the effects of English-language experience on the production and perception of English front vowels /i, ɪ, ɛ, æ/ by non-native English speakers, including native Spanish, German, Mandarin, and Korean speakers. The EFL learners were divided into relatively experienced and inexperienced groups on the basis of their length of residence in the US (mean=7.3 vs. 0.7 years). In general, the experienced groups produced and perceived English vowels more accurately than the relatively inexperienced ones regardless of L1 background. Wang (1988) studied the perception and production of the English /i/-/ɪ/ and /ɛ/-/æ/ contrasts by Mandarin Chinese EFL learners. The Chinese learners were classified according to their length of stay in an English-speaking country (mean=1 vs. 5 years). The results suggested that the less experienced group pronounced the vowel /ɪ/ like /i/ and the vowel /æ/ like /ɛ/, while the relatively experienced group produced the vowels /ɪ/ and /æ/ similarly to native speakers.

Tsukada et al. (2005) examined the production and perception of eight English vowels /i, ɪ, e, ɛ, æ, ɑ, ʌ, u/ by native Korean adults and children. The two distinct age groups were compared to age-matched native English speakers respectively. The research yielded the following finding: native Korean children had a better understanding of identifying and separating English vowels than native Korean adults. The L1 Korean children resembled their age-matched native English speakers in both their perception and production of the eight English vowels; however, the L1 Korean adults failed to produce a native-like performance in either their perception or production of the vowels.

Kim (2007) investigated the production of English front vowels /i, ɪ, e, ɛ, æ/ by Korean L2 English learners. The Korean speakers were assigned to 10 distinct groups based on three main factors: their age of arrival in the US, their length of residence in the US, and their degree of motivation to learn English. Prior to the analysis, the participants filled in a questionnaire to determine their age, gender, time of arrival, length of residence in the US, and degree of motivation to learn English. The results showed that most of the Korean learners hardly separated the /i/-/ɪ/ and /ɛ/-/æ/ contrasts. Most importantly, only those who arrived in the US before the age of 11 could pronounce English vowels in much the same way as native English speakers. Even long-time residents of the United States were more likely to struggle to distinguish the English front contrasts if they had not arrived in the US during early childhood.

Apart from the production of English vowels, Escudero et al. (2012) probed the perception of the English front vowel contrast /ɛ/-/æ/ by speakers of two regional varieties of Dutch: North Holland Dutch spoken in the Netherlands and Flemish Dutch spoken in Belgium. The lack of the English /ɛ/-/æ/ contrast was evident irrespective of the regional variety of Dutch. However, the research also found that the two varieties differed in their non-native perception of the English vowels /ɛ/ and /æ/. Specifically, North Holland speakers identified English /ɛ/ more accurately than /æ/, whereas the Flemish group identified both vowels with similar accuracy.

There have been empirical studies on the effects of L2 proficiency on L2 English production. Ho (2010) investigated the influence of L2 proficiency levels on the production and perception of American English front vowels /i, ɪ, e, ɛ, æ/ by EFL learners in Taiwan. Forty EFL participants were assigned to either a higher-level proficiency EFL group (HEFL) or a lower-level proficiency EFL group (LEFL) (20 vs. 20, respectively) according to their scores on the GEPT (General English Proficiency Test), a standardized English proficiency test in Taiwan. The results displayed significant L2 proficiency effects in the production and perception of English front vowels by Taiwanese EFL learners. The HEFL group significantly outperformed the LEFL group in the perception of all the front vowels. As for production performance, the HEFL group produced the vowel /æ/ in a near-native fashion, while the LEFL group performed better with the vowels /i/ and /ɛ/ than with the other front vowels but failed to reach a near-native level for any of the vowels.

However, most studies on the influence of L2 English proficiency on L2 English production, especially by Korean EFL learners, have typically focused more on English consonants than on English vowels (Cho, 2017; Kong & Yoon, 2013; Lee, 2018; Park, 2017; Park et al., 2010). For instance, Park et al. (2010) examined whether Korean learners’ production of English n-l sequenced words (e.g., only, fan letter, boneless) and m-l sequenced words (e.g., homeland, home loan, harmless) correlates with the influence of Korean /n/-lateralization (e.g., /non-li/ [nol.li] ‘logic’, /nan-lo/ [nal.lo] ‘stove’) and /l/-nasalization (e.g., /kam-li/ [kam.ni] ‘supervision’, /kɨm-li/ [kɨm.ni] ‘interest rate’) when they acquire the L2 English sound system. The study found that, in general, the high proficiency group outperformed the low proficiency group in the production of both the n-l and m-l sequenced words. Kong & Yoon (2013) examined how Korean learners of English employ multiple acoustic cues (i.e., VOT and F0) in the perception and production of the English alveolar stop voicing contrast. The effects of L2 English proficiency were visible insofar as the high proficiency group had better control over inhibiting and enhancing the relevant acoustic parameters. Cho (2017) probed native Korean speakers’ production of English stops and fricatives using a rated L2 English read speech corpus spoken by Korean learners of English. The results showed a correlation between higher proficiency levels and appropriate aspiration, while lower levels displayed a high proportion of production errors in stops and fricatives.

Overall, several learner factors, including length of residence in the US, age (adults vs. children), regional varieties of L1, and L2 proficiency, have been investigated in relation to their influence on non-native speakers’ production of English vowels. These studies have demonstrated that the factors used to classify L2 learners correlate, to some extent, with EFL learners’ English production and perception. However, research on the role of L2 English proficiency in Korean learners of English has focused more on the production of English consonants than on that of English vowels. In this regard, the current study examines the correlation between Korean EFL learners’ L2 proficiency and English vowel production.

The purpose of the present study is to determine whether there is a relationship between accurate vowel production and L2 proficiency in L2 English produced by Korean EFL learners. Our working hypothesis is that the higher the speakers’ rated level in the speech corpus, the better they separate the adjacent vowel pairs.

1.4. Present Study

The present study, based on the results of previous studies, investigates the relationship between English vowel production and L2 proficiency in L2 production by Korean EFL learners. To this end, this study employs a rated L2 English read speech corpus, named ‘Genie SpeeCor’, spoken by Korean learners of English (Rhee, 2016). According to Mauranen (2004), speech corpora are helpful and necessary for teaching non-native speakers and for their learning of L2 production. In addition, the size of a corpus has been considered important for making it representative of the way the language is actually used (Campbell et al., 2007; Park, 2017). There have been various speech corpora of Korean EFL learners (Kim et al., 2004; Yoon et al., 2009), but they are not rated according to a detailed scoring rubric for evaluating Korean learners. In this regard, Genie SpeeCor comprises the speech of 200 Korean EFL learners rated with a detailed scoring rubric (see Section 2 for details).

Compared to earlier work on Korean EFL learners’ production of English vowels, which has been heavily biased toward the production of English tense and lax contrasts (Cho et al., 2013; Flege et al., 1997; Ingram & Park, 1997; Li & Lee, 2017; Kim, 2007; Tsukada et al., 2005; Yang, 2008), this study covers a more comprehensive set of English vowel phonemes /i, ɪ, ɛ, æ, ʌ, ɔ, ɑ, ʊ, u/. All nine vowels are paired with adjacent vowels (e.g., /i/-/ɪ/, /u/-/ʊ/, /ɛ/-/æ/, /ʌ/-/ɔ/, /ɔ/-/ɑ/), and each pair is compared by measuring a spectral feature (i.e., formant frequency), which is the primary cue in separating vowel qualities. However, among the adjacent pairs, the tense and lax contrasts /i/-/ɪ/ and /u/-/ʊ/ are acoustically measured with a temporal feature (i.e., vowel duration) as well as with a spectral feature on the grounds that both acoustic cues (i.e., formant frequency and vowel duration) play a key role in distinguishing between tense and lax vowels.

2. Method

2.1. Genie Speech Corpus

This study employs a rated L2 English read speech corpus, named ‘Genie SpeeCor’, spoken by Korean learners of English. A total of 200 native Korean speakers participated in this speech corpus. With the exception of 10 subjects who had lived in English-speaking countries for less than five years, none of the subjects had ever lived in another country. Specifically, they consisted of three age groups: sixty elementary school students (age range: 10–12 years old; 29 males vs. 31 females), eighty middle school students (age range: 13–14 years old; 40 males vs. 40 females), and sixty adults (age range: 19–33 years old; 30 males vs. 30 females). There were 100 English sentences in each group. The participants were asked to read the text materials aloud at a casual rate into a Shure WH20XLR headset microphone in a sound-controlled room. Their speech was recorded in PCM format at 16 kHz/16 bit (Park, 2017; Rhee, 2016). The learners’ fluency in L2 English was assessed by five human raters. Three of them were native Korean speakers who were either researchers or graduate students majoring in English language and literature in 2016. The other two raters were native English-speaking education experts. They had all been trained in the evaluation of L2 English proficiency before carrying out the pronunciation and fluency ratings. They rated the recorded audio files for L2 English proficiency through an on-screen scoring tool suggested by ETRI (Electronics and Telecommunications Research Institute) (Rhee, 2016). The evaluated text materials were assigned to five different levels of L2 English proficiency from level 1 (novice) to level 5 (mastery). The raters assessed the data according to specific phonetic features (i.e., analytic evaluation) and then evaluated them in general (i.e., holistic evaluation). Appendices A and B provide the analytic and holistic scoring rubrics used in the Genie speech corpus.

2.2. Participants

Of the three age groups, only the adult group (age range: 19–33 years old; 30 males vs. 30 females) was chosen for this study to eliminate the effects of age. Furthermore, within the adult group, only the data produced by male speakers were selected for this study. Because most materials are densely clustered at the intermediate level (i.e., level 3) (65 tokens at level 1; 471 at level 2; 1,064 at level 3; 176 at level 4; 65 at level 5), the five rated levels are regrouped into three categories: levels 1–2 as the lower level, level 3 as the middle level, and levels 4–5 as the higher level.
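The regrouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `regroup_level` and `tokens_per_category` are hypothetical, and the token counts are the ones quoted in the paragraph above.

```python
# Token counts per rated level, as quoted in the text above.
TOKENS_PER_LEVEL = {1: 65, 2: 471, 3: 1064, 4: 176, 5: 65}


def regroup_level(level):
    """Map a rated level (1-5) onto the three categories used in the analysis."""
    if level in (1, 2):
        return "lower"
    if level == 3:
        return "middle"
    if level in (4, 5):
        return "higher"
    raise ValueError(f"unexpected level: {level}")


def tokens_per_category(counts):
    """Sum the token counts within each regrouped category."""
    totals = {"lower": 0, "middle": 0, "higher": 0}
    for level, n_tokens in counts.items():
        totals[regroup_level(level)] += n_tokens
    return totals


if __name__ == "__main__":
    print(tokens_per_category(TOKENS_PER_LEVEL))
```

With the counts quoted above, the regrouped categories hold 536 (lower), 1,064 (middle), and 241 (higher) tokens, so the middle category remains the largest even after regrouping.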

2.3. Stimuli

2.3.1. Formant Frequency

In this analysis of formant frequency, the first two formant frequencies (i.e., F1 and F2) are measured since they play the most important role in determining vowel quality. From the text materials, the nine vowels /i, ɪ, ɛ, æ, ʌ, ɔ, ɑ, u, ʊ/ are selected as follows:

  1. CVC within a word

    e.g., /ʊ/ in ‘could’ [kʊd]

  2. CVC across word boundaries

    a. C// VC e.g., /i/ in ‘this evening’ [ðɪs] [iːvnɪŋ]

    b. CV //C e.g., /u/ in ‘to the’ [tu] [ðə]

  3. VC at the beginning of a sentence.

    e.g., /ɪ/ in ‘It was~’ [ɪt]

All the selected vowels have primary stress and are followed by a consonant. All the surrounding consonants are obstruents (i.e., stops /p, t, k, b, d, g/, fricatives /f, v, θ, ð, s, z, ʃ, ʒ, h/, and affricates /ʧ, ʤ/) to lessen co-articulation effects on the vowels. If a vowel is surrounded by sonorants (i.e., glides /w, j/, liquids /l, r/, and nasals /m, n, ŋ/), it is fairly difficult to identify the formant boundary between the vowel and the sonorants. This is largely because sonorants have relatively high resonance (Ladefoged & Disner, 2012:77). Thus, to reduce these measurement difficulties, vowels adjacent to sonorants are excluded from this study.

2.3.2. Vowel Duration

For the analysis of vowel duration, the tokens chosen are the tense and lax vowels /i/-/ɪ/ and /u/-/ʊ/ that 1) are followed by a voiced obstruent /b, d, g, v, ð, z, ʒ, ʤ/ and 2) do not occur in the last syllable of a sentence. This is because vowel duration may differ depending on the voicing of the following consonant (i.e., voiced vs. voiceless) (Fougeron & Keating, 1997; Klatt, 1975) and because a vowel in a syllable at the very end of a sentence tends to be lengthened (Cho et al., 2013; Klatt, 1975).
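The two selection criteria for the duration analysis can be expressed as a simple filter. The sketch below is a hypothetical illustration, not the procedure actually used to build the stimulus list: the function name, its parameters, and the representation of a token as three separate values are all assumptions.

```python
# Voiced obstruents that may follow the vowel, as listed in the text.
VOICED_OBSTRUENTS = {"b", "d", "g", "v", "ð", "z", "ʒ", "ʤ"}

# Only the tense/lax pairs enter the duration analysis.
DURATION_VOWELS = {"i", "ɪ", "u", "ʊ"}


def eligible_for_duration(vowel, following_consonant, in_sentence_final_syllable):
    """Return True if a vowel token meets both duration criteria.

    vowel: IPA symbol of the vowel.
    following_consonant: IPA symbol of the consonant after the vowel.
    in_sentence_final_syllable: True if the vowel is in the last syllable
        of its sentence (such tokens are excluded because of lengthening).
    """
    return (
        vowel in DURATION_VOWELS
        and following_consonant in VOICED_OBSTRUENTS
        and not in_sentence_final_syllable
    )
```

For example, /ʊ/ before /d/ in a sentence-medial ‘could’ passes the filter, whereas /ɪ/ before voiceless /t/ in ‘It’ does not.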

2.4. Procedure

Acoustic characteristics (F1, F2, and duration) of the vowels are measured with the WaveSurfer 1.8.8 software program.

To measure acoustic features (F1, F2, and duration), vowel onset is considered the point which shows the onset of periodicity in the waveform and the onset of voicing in the spectrogram as strong vertical striations of F1. Vowel offset is defined as the point representing the offset of periodicity in the waveform and a cessation of formant bands in the spectrogram. The temporal interval from the vowel onset to the vowel offset is regarded as vowel duration. F1 and F2 frequencies of a vowel are measured right in the middle of the temporal interval.
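The arithmetic behind these measurements is straightforward and can be sketched as follows. The sketch assumes that the onset and offset times have already been marked by hand and that a formant track is available as a list of `(time, F1, F2)` frames; this data layout and all function names are hypothetical and do not reflect WaveSurfer's own file format or interface.

```python
def vowel_duration_ms(onset_s, offset_s):
    """Duration of the vowel in milliseconds, from onset/offset in seconds."""
    if offset_s <= onset_s:
        raise ValueError("offset must be later than onset")
    return (offset_s - onset_s) * 1000.0


def midpoint_s(onset_s, offset_s):
    """Temporal midpoint of the vowel interval, in seconds."""
    return onset_s + (offset_s - onset_s) / 2.0


def formants_at_midpoint(track, onset_s, offset_s):
    """Return (F1, F2) from the frame closest to the vowel midpoint.

    track: list of (time_s, f1_hz, f2_hz) tuples (hypothetical layout).
    """
    mid = midpoint_s(onset_s, offset_s)
    _, f1, f2 = min(track, key=lambda frame: abs(frame[0] - mid))
    return f1, f2


if __name__ == "__main__":
    # Made-up frames for illustration only; not data from the corpus.
    toy_track = [(0.10, 400, 1900), (0.15, 350, 2050), (0.20, 380, 2000)]
    print(vowel_duration_ms(0.10, 0.20))
    print(formants_at_midpoint(toy_track, 0.10, 0.20))
```

Taking the frame nearest the midpoint is one simple choice; averaging over a short window around the midpoint would be an alternative that the text does not describe.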

3. Results

3.1. Formant Frequency

The F1 and F2 means and standard deviations (in parentheses) of the nine English vowels in the rated levels are presented in Table 1.

Table 1. Mean F1 and F2 frequencies (standard deviations in parentheses) of English vowels (F1 and F2 are in Hz)

Vowel  Formant  Level 1–2    Level 3      Level 4–5
/i/    F1       341 (55)     341 (57)     337 (89)
       F2       2,063 (302)  2,124 (174)  2,334 (133)
/ɪ/    F1       338 (47)     350 (56)     433 (95)
       F2       2,056 (150)  2,122 (252)  2,004 (77)
/ɛ/    F1       556 (91)     559 (80)     605 (141)
       F2       1,856 (129)  1,873 (238)  1,821 (203)
/æ/    F1       568 (94)     576 (125)    695 (146)
       F2       1,863 (200)  1,827 (225)  1,733 (157)
/ʌ/    F1       581 (120)    594 (89)     625 (123)
       F2       1,201 (207)  1,132 (145)  1,335 (224)
/ɔ/    F1       582 (136)    593 (97)     592 (68)
       F2       1,199 (160)  1,150 (56)   1,081 (49)
/ɑ/    F1       666 (117)    748 (168)    756 (111)
       F2       1,233 (187)  1,288 (211)  1,182 (128)
/u/    F1       382 (45)     386 (60)     395 (47)
       F2       1,365 (235)  1,366 (176)  1,349 (191)
/ʊ/    F1       384 (56)     402 (28)     411 (59)
       F2       1,421 (269)  1,423 (245)  1,439 (177)

To compare the rated levels’ vowel spaces, the mean F1–F2 plot of the vowels across the rated levels is displayed in Figure 1.

Figure 1. The mean F1–F2 plot of English vowels across the levels (F1 and F2 are in Hz).

Compared to the intermediate and lower levels (i.e., level 3 and level 1–2), the higher level (i.e., level 4–5) has a relatively larger vowel space. In particular, the disparity between the rated levels is greater in the front vowels /i, æ/ and the low-back vowel /ɑ/ (F2 of the vowel /i/: M=2,063 Hz in level 1–2, M=2,124 Hz in level 3, M=2,334 Hz in level 4–5; F1 of the vowel /æ/: M=568 Hz in level 1–2, M=576 Hz in level 3, M=695 Hz in level 4–5; F1 of the vowel /ɑ/: M=666 Hz in level 1–2, M=748 Hz in level 3, M=756 Hz in level 4–5). Moreover, the results showed that the central and back vowel pairs /ʌ/-/ɔ/ and /u/-/ʊ/ substantially overlap compared to other adjacent vowels. For a more accurate analysis of the spectral distinction between the adjacent pairs in the rated levels, the spectral acoustic cues (F1 and F2) are investigated separately.
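One rough way to put a number on the size of a vowel space is the area of the polygon spanned by the corner vowels in the F2–F1 plane. The sketch below is an illustrative addition, not an analysis reported in this study: the choice of /i, æ, ɑ, u/ as corners and the use of the shoelace formula on raw Hz values are assumptions, while the mean formant values are copied from Table 1.

```python
# Mean (F2, F1) in Hz for four corner vowels, copied from Table 1.
CORNERS = {
    "level 1-2": {"i": (2063, 341), "æ": (1863, 568), "ɑ": (1233, 666), "u": (1365, 382)},
    "level 3": {"i": (2124, 341), "æ": (1827, 576), "ɑ": (1288, 748), "u": (1366, 386)},
    "level 4-5": {"i": (2334, 337), "æ": (1733, 695), "ɑ": (1182, 756), "u": (1349, 395)},
}

# Order in which the corners are visited around the quadrilateral.
ORDER = ["i", "æ", "ɑ", "u"]


def polygon_area(points):
    """Area of a simple polygon from its vertices (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def vowel_space_area(level):
    """Area (in Hz squared) of the corner-vowel quadrilateral for one level."""
    return polygon_area([CORNERS[level][v] for v in ORDER])


if __name__ == "__main__":
    for level in CORNERS:
        print(level, vowel_space_area(level))
```

Under these assumptions, the level 4–5 quadrilateral comes out largest, which agrees with the visual impression described above. Because the computation uses group means only, it says nothing about within-group variability.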

3.1.1. The Comparison of F1 Values

Each adjacent pair consists of two vowels with different tongue heights. Specifically, in the adjacent vowel pairs /ɛ/-/æ/, /ʌ/-/ɔ/, and /ɔ/-/ɑ/, the vowels /ɛ, ɔ/ are produced at mid tongue height while the vowels /æ, ʌ, ɑ/ are produced with a low tongue position (Kenyon & Knott, 1953). Regarding the tense and lax contrasts /i/-/ɪ/ and /u/-/ʊ/, the tense vowels /i, u/ are articulated relatively higher than their lax counterparts /ɪ, ʊ/ (Alfonso & Baer, 1982).

Based on these spectral features relevant to the tongue height (F1), this section compares the F1 values for the five adjacent contrasts in the rated levels in Figure 2.

Figure 2. The mean F1 of adjacent pairs in the rated levels (F1 in Hz).

Given that F1 is inversely related to tongue height, the F1 scale is inverted (i.e., low values at the top and high values at the bottom).

Two-way (2×3) analyses of variance (Vowel: the two adjacent vowels in each pair × Level: level 1–2, level 3, level 4–5) are performed to determine whether L2 proficiency influences the production of each adjacent pair with acoustically separable vowel qualities, especially in terms of tongue height (F1). In all of the statistical analyses in this work, differences with p<.05 are considered significant. The results are given in Table 2.
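To make the structure of the 2×3 design concrete, the following sketch computes the sums of squares, degrees of freedom, and F ratios for a two-way ANOVA in pure Python. It is a simplified illustration, not the analysis actually run for this study: it assumes a balanced design (equal numbers of tokens per cell), which the corpus data need not satisfy, it does not compute p-values, and the values in `TOY_CELLS` are made up.

```python
from statistics import mean


def two_way_anova_balanced(cells):
    """Two-way ANOVA for a balanced design.

    cells maps (vowel, level) -> list of F1 values; every cell must hold
    the same number of observations (at least two).
    Returns (sums of squares, degrees of freedom, F ratios).
    """
    vowels = sorted({v for v, _ in cells})
    levels = sorted({l for _, l in cells})
    n = len(cells[(vowels[0], levels[0])])
    if n < 2 or any(len(obs) != n for obs in cells.values()):
        raise ValueError("balanced design with n >= 2 per cell required")

    grand = mean(x for obs in cells.values() for x in obs)
    cell_m = {key: mean(obs) for key, obs in cells.items()}
    # In a balanced design, marginal means equal the mean of the cell means.
    vowel_m = {v: mean(cell_m[(v, l)] for l in levels) for v in vowels}
    level_m = {l: mean(cell_m[(v, l)] for v in vowels) for l in levels}

    a, b = len(vowels), len(levels)
    ss = {
        "vowel": n * b * sum((vowel_m[v] - grand) ** 2 for v in vowels),
        "level": n * a * sum((level_m[l] - grand) ** 2 for l in levels),
        "vowel x level": n * sum(
            (cell_m[(v, l)] - vowel_m[v] - level_m[l] + grand) ** 2
            for v in vowels
            for l in levels
        ),
        "error": sum(
            (x - cell_m[key]) ** 2 for key, obs in cells.items() for x in obs
        ),
    }
    df = {
        "vowel": a - 1,
        "level": b - 1,
        "vowel x level": (a - 1) * (b - 1),
        "error": a * b * (n - 1),
    }
    ms_error = ss["error"] / df["error"]
    f_ratio = {
        k: (ss[k] / df[k]) / ms_error for k in ("vowel", "level", "vowel x level")
    }
    return ss, df, f_ratio


# Made-up F1 values (Hz), two tokens per cell; for illustration only.
TOY_CELLS = {
    ("i", "lower"): [340, 342],
    ("i", "middle"): [340, 342],
    ("i", "higher"): [336, 338],
    ("ɪ", "lower"): [337, 339],
    ("ɪ", "middle"): [349, 351],
    ("ɪ", "higher"): [432, 434],
}

if __name__ == "__main__":
    ss, df, f_ratio = two_way_anova_balanced(TOY_CELLS)
    print(df)
    print(f_ratio)
```

Each F ratio would then be compared against the F distribution with the corresponding degrees of freedom; for unbalanced data, a statistics package that handles unequal cell sizes would be needed instead of this sketch.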

Table 2. Results of two-way ANOVA for F1 of adjacent pairs

Pair      Variable      df  F-value  p-value
/i/-/ɪ/   Vowel         1   17.030   <.001***
          Level         2   10.260   <.001***
          Vowel×level   2   12.207   <.001***
/ɛ/-/æ/   Vowel         1   3.986    .048*
          Level         2   6.610    .002**
          Vowel×level   2   1.302    .275
/ʌ/-/ɔ/   Vowel         1   .122     .727
          Level         2   .238     .789
          Vowel×level   2   .101     .904
/ɔ/-/ɑ/   Vowel         1   12.472   .001**
          Level         2   .828     .442
          Vowel×level   2   .487     .617
/u/-/ʊ/   Vowel         1   3.141    .078
          Level         2   .546     .580
          Vowel×level   2   .853     .428

* p<.05, ** p<.01, *** p<.001.


In the high-front tense and lax contrast /i/-/ɪ/, a two-way analysis of variance yielded significant main effects of vowel, F(1,271)=17.03, p<.001, and level, F(2,271)=10.26, p<.001. There was a significant interaction effect of Vowel×level, F(2,271)=12.21, p<.001, indicating that the level effect was greater in the lax vowel /ɪ/ condition than in the tense vowel /i/ condition (see Figure 2). Post hoc analysis using the Scheffé criterion for significance revealed that the higher level (the contrast /i/-/ɪ/: M=337 vs. 433 Hz, SD=89 vs. 95 Hz) was significantly different from the middle (M=341 vs. 350 Hz, SD=57 vs. 56 Hz) and the lower (M=341 vs. 338 Hz, SD=55 vs. 47 Hz) levels, which did not differ from each other (see Figure 2). According to the result of the paired-samples t-test for F1 of the contrast /i/-/ɪ/ in the rated levels, only the higher level significantly separated the contrast /i/-/ɪ/, t(61)=–4.02, p<.001. Namely, in producing the front vowels /i/-/ɪ/, only the higher level, unlike the middle and lower levels, was able to distinguish the contrast spectrally in terms of tongue height (F1), by lowering the tongue further when producing the lax /ɪ/ than when producing its tense counterpart /i/.

With regard to the front adjacent pair /ɛ/-/æ/, the ANOVA revealed significant main effects of vowel, F(1,164)=3.99, p<.05, and level, F(2,164)=6.61, p<.01, but the Vowel×level interaction was not significant, F(2,164)=1.30, p>.05. Post hoc analysis of the main effect of level revealed that the higher level (the pair /ɛ/-/æ/: M=605 vs. 695 Hz, SD=141 vs. 146 Hz) was significantly different from the middle (M=559 vs. 576 Hz, SD=80 vs. 125 Hz) and the lower (M=556 vs. 568 Hz, SD=91 vs. 94 Hz) levels, which did not differ from each other (see Figure 2). This means that, in terms of tongue height (F1), the spectral distinction between /ɛ/ and /æ/ was greater in the higher level than in the middle and lower levels, although there was no Vowel×level interaction.
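To make the size of the F1 contrast at each level easier to see, the reported means for /i/-/ɪ/ and /ɛ/-/æ/ can be turned into within-pair differences. The sketch below only does arithmetic on the means quoted above; the names `MEAN_F1` and `f1_separation` are illustrative, and the difference score is a reading aid rather than a measure the study reports.

```python
# Mean F1 in Hz per rated level, copied from the text: (first vowel, second vowel).
MEAN_F1 = {
    "/i/-/ɪ/": {"higher": (337, 433), "middle": (341, 350), "lower": (341, 338)},
    "/ɛ/-/æ/": {"higher": (605, 695), "middle": (559, 576), "lower": (556, 568)},
}


def f1_separation(pair):
    """Second-vowel mean F1 minus first-vowel mean F1, for each level."""
    return {
        level: second - first
        for level, (first, second) in MEAN_F1[pair].items()
    }


for pair in MEAN_F1:
    print(pair, f1_separation(pair))
# /i/-/ɪ/ {'higher': 96, 'middle': 9, 'lower': -3}
# /ɛ/-/æ/ {'higher': 90, 'middle': 17, 'lower': 12}
```

In both pairs the higher level separates the vowels by roughly 90 Hz or more, while the middle and lower levels stay below 20 Hz.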

Concerning the central and back vowels /ʌ/-/ɔ/, none of the effects was significant, indicating that the pair /ʌ/-/ɔ/ was hardly distinguished by tongue height regardless of the fluency ratings.

For the adjacent back vowels /ɔ/-/ɑ/, the main effect of vowel was significant, F(1,64)=12.47, p=.001, but the main effect of level was not, F(2,64)=.83, p>.05. The Vowel×level interaction was also not significant, F(2,64)=.49, p>.05. This means that, in general, Korean EFL learners were able to separate the back vowels /ɔ/-/ɑ/ spectrally by tongue height (F1).

As for the high-back tense and lax contrast /u/-/ʊ/, the ANOVA revealed that none of the effects was significant, which indicates that Korean EFL learners generally find it difficult to make a spectral distinction between /u/ and /ʊ/ by tongue height.
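The F-ratios in Table 2 come from partitioning the variance in F1 into vowel, level, interaction, and error terms. As a minimal sketch of that decomposition, the function below computes the three F-ratios for a balanced 2×3 design. The balanced-cell assumption, the name `two_way_anova`, and the F1 values are illustrative and not taken from the study; the error degrees of freedom reported here (e.g., 271) imply a total that is not divisible by six, so the actual cells were presumably unequal, which statistical packages handle with a more general method.

```python
from statistics import mean


def two_way_anova(cells):
    """F-ratios for a balanced two-way ANOVA with replication.

    cells[a][b] holds the observations for vowel a at proficiency level b.
    Every cell must contain the same number (>= 2) of observations.
    Returns {effect: (df_effect, df_error, F)}.
    """
    n_a, n_b, n = len(cells), len(cells[0]), len(cells[0][0])
    balanced = all(len(row) == n_b and all(len(c) == n for c in row) for row in cells)
    if not balanced or n < 2:
        raise ValueError("this sketch only handles balanced designs with n >= 2")

    grand = mean(x for row in cells for cell in row for x in cell)
    a_means = [mean(x for cell in row for x in cell) for row in cells]
    b_means = [mean(x for row in cells for x in row[b]) for b in range(n_b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    # Sums of squares for each main effect, the interaction, and the error.
    ss_a = n_b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = n_a * n * sum((m - grand) ** 2 for m in b_means)
    ss_ab = n * sum(
        (cell_means[a][b] - a_means[a] - b_means[b] + grand) ** 2
        for a in range(n_a) for b in range(n_b)
    )
    ss_err = sum(
        (x - cell_means[a][b]) ** 2
        for a in range(n_a) for b in range(n_b) for x in cells[a][b]
    )

    df_a, df_b = n_a - 1, n_b - 1
    df_ab, df_err = df_a * df_b, n_a * n_b * (n - 1)
    ms_err = ss_err / df_err  # zero only if every cell has no internal spread
    return {
        "vowel": (df_a, df_err, (ss_a / df_a) / ms_err),
        "level": (df_b, df_err, (ss_b / df_b) / ms_err),
        "interaction": (df_ab, df_err, (ss_ab / df_ab) / ms_err),
    }


# Hypothetical F1 values (Hz): rows are /i/ and /ɪ/, columns are the
# lower, middle, and higher rated levels.
toy_f1 = [
    [[340, 345, 338], [342, 339, 344], [336, 341, 334]],
    [[339, 343, 336], [351, 347, 353], [430, 436, 428]],
]
for effect, (df1, df2, f_ratio) in two_way_anova(toy_f1).items():
    print(f"{effect}: F({df1},{df2}) = {f_ratio:.2f}")
```

In a 2×3 design the vowel effect always has 1 degree of freedom and the level and interaction effects have 2 each, which matches the df column of Table 2.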

3.1.2. The Comparison of F2 Values

In terms of F2, which is associated with tongue backness, according to Alfonso & Baer (1982), the high-front tense vowel /i/ is produced further forward than its lax counterpart /ɪ/, whereas the high-back tense vowel /u/ is articulated further back than the lax vowel /ʊ/. Meanwhile, the mid-front vowel /ɛ/ is placed further forward than the low-front vowel /æ/, and the mid-back vowel /ɔ/ is placed further back than both the low-back vowel /ɑ/ and the low-central vowel /ʌ/ (Jones, 1960).
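The expected F2 ordering within each pair can be written down as a small lookup. The sketch below encodes the articulatory descriptions above, relying on the standard acoustic relation that a more front tongue position corresponds to a higher F2. The names `EXPECTED_HIGHER_F2` and `f2_order_as_expected` and the F2 values in the usage lines are hypothetical.

```python
# For each adjacent pair, which member is described as further front and is
# therefore expected to show the higher F2.
EXPECTED_HIGHER_F2 = {
    "/i/-/ɪ/": "first",   # tense /i/ is further forward than lax /ɪ/
    "/u/-/ʊ/": "second",  # tense /u/ is further back than lax /ʊ/
    "/ɛ/-/æ/": "first",   # /ɛ/ is further forward than /æ/
    "/ʌ/-/ɔ/": "first",   # /ɔ/ is further back than central /ʌ/
    "/ɔ/-/ɑ/": "second",  # /ɔ/ is further back than /ɑ/
}


def f2_order_as_expected(pair, f2_first, f2_second):
    """True if the measured F2 values follow the expected front/back order."""
    if EXPECTED_HIGHER_F2[pair] == "first":
        return f2_first > f2_second
    return f2_second > f2_first


# Hypothetical mean F2 values in Hz.
print(f2_order_as_expected("/i/-/ɪ/", 2300, 2000))  # True
print(f2_order_as_expected("/u/-/ʊ/", 1200, 1000))  # False
```

A check of this kind only tests the direction of the difference; whether the difference is reliable is what the ANOVAs in this section address.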

On the basis of these spectral characteristics related to F2, the mean values of F2 for the five adjacent pairs in the rated levels are compared in Figure 3.

Figure 3. The mean F2 of adjacent pairs in the rated levels (F2 in Hz).

The scale of F2 goes downwards (i.e., low values at the top and high values at the bottom) as in the scale of F1 in Figure 2.

Two-way (2×3) analyses of variance (Vowel: the two adjacent vowels in each pair × Level: level 1–2, level 3, level 4–5) were conducted to ascertain whether L2 proficiency influences the production of each adjacent pair with acoustically separable vowel qualities, specifically in terms of tongue backness (F2). The results are given in Table 3.

Table 3. Results of two-way ANOVA for F2 of adjacent pairs
Pair Variable df F-value p-value
/i/-/ɪ/ Vowel 1 19.274 .000***
Level 2 5.825 .003**
Vowel×level 2 14.964 .000***
/ɛ/-/æ/ Vowel 1 1.430 .234
Level 2 1.707 .185
Vowel×level 2 .542 .583
/ʌ/-/ɔ/ Vowel 1 3.344 .070
Level 2 .766 .468
Vowel×level 2 2.158 .121
/ɔ/-/ɑ/ Vowel 1 4.495 .038*
Level 2 .953 .391
Vowel×level 2 .308 .736
/u/-/ʊ/ Vowel 1 2.708 .102
Level 2 .001 .999
Vowel×level 2 .061 .941

* p<.05, ** p<.01, *** p<.001.


In the high-front tense and lax contrast /i/-/ɪ/, a two-way ANOVA yielded significant main effects of vowel, F(1,271)=19.27, p<.001, and of level, F(2,271)=5.83, p<.01. There was a significant Vowel×level interaction, F(2,271)=14.96, p<.001, indicating that the level effect was greater for the lax vowel /ɪ/ than for the tense vowel /i/ (see Figure 3). Post hoc analysis (Scheffé) revealed that the higher rated level (the contrast /i/-/ɪ/: M=2,334 vs. 2,004 Hz, SD=133 vs. 77 Hz) was significantly different from the middle (M=2,124 vs. 2,122 Hz, SD=174 vs. 252 Hz) and the lower (M=2,063 vs. 2,056 Hz, SD=302 vs. 150 Hz) levels, which did not differ from each other (see Figure 3). Paired-samples t-tests on F2 of the contrast /i/-/ɪ/ within each rated level showed a significant difference between the two vowels in the higher rated level, t(61)=12.49, p<.001, but not in the middle and lower rated levels. Put simply, unlike the middle and lower levels, only the higher rated level distinguished /i/ from /ɪ/ spectrally by tongue backness (F2), moving the tongue further back when producing the lax vowel /ɪ/ than when producing the tense /i/.

With respect to the front adjacent pair /ɛ/-/æ/, the ANOVA revealed that none of the effects was significant, indicating that native Korean speakers were in general unable to differentiate the vowel /ɛ/ from the vowel /æ/ by tongue backness (F2).

Concerning the central and back vowels /ʌ/-/ɔ/, none of the effects was significant. That is to say, Korean EFL learners hardly separated the pair /ʌ/-/ɔ/ by tongue backness.

For the adjacent back vowels /ɔ/-/ɑ/, the main effect of vowel was significant, F(1,64)=4.50, p<.05, but no other effects were significant. This means that, in general, Korean EFL learners could distinguish the back vowels /ɔ/-/ɑ/ by tongue backness (F2).

As for the high-back tense and lax contrast /u/-/ʊ/, the ANOVA revealed that none of the effects was significant. In other words, none of the proficiency levels distinguished /u/ from /ʊ/ in terms of tongue backness.
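The asterisks in Tables 2 and 3 follow the thresholds given in the table legends. A minimal sketch of that convention is shown below; the function name `significance_marker` is illustrative, and the p-values in the usage lines are taken from Table 3.

```python
def significance_marker(p_value):
    """Map a p-value to the asterisk notation used in the table legends."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""  # not significant at the .05 level


# p-values for the F2 vowel effect in three pairs of Table 3.
for pair, p in [("/i/-/ɪ/", 0.000), ("/ɔ/-/ɑ/", 0.038), ("/u/-/ʊ/", 0.102)]:
    print(pair, p, significance_marker(p))
```

Because the tables round p to three decimals, a printed value of .000 stands for p<.001, and a printed .001 is marked ** in Table 2, which is consistent with the strict inequalities used here.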

3.2. Vowel Duration

The mean durations and standard deviations (in parentheses) for the tense and lax contrasts /i/-/ɪ/ and /u/-/ʊ/ in the rated levels are presented in Table 4.

Table 4. Mean durations (standard deviations) of English tense and lax contrasts (duration in ms)
Level High-front /i/ High-front /ɪ/ High-back /u/ High-back /ʊ/
Level 1–2 122 (40) 106 (49) 174 (50) 123 (30)
Level 3 107 (17) 78 (14) 161 (19) 81 (13)
Level 4–5 90 (15) 47 (8) 170 (14) 69 (19)

Figure 4 illustrates the mean durations of the tense and lax contrasts across the rated levels.

Figure 4. The mean duration of tense and lax contrasts across the levels (Duration in ms).

Regardless of L2 proficiency level, the mean durations of the tense vowels /i, u/ are longer than those of their lax counterparts /ɪ, ʊ/. In general, the high-back tense vowel /u/ has a longer duration than any other tense or lax vowel across all the rated levels. Most importantly, except for the high-back tense vowel /u/ in the intermediate level (i.e., level 3), every tense and lax vowel duration decreases as the rated proficiency level rises. This may suggest that the lower level (i.e., level 1–2) tends to produce the contrasts with unduly long durations. For a more accurate analysis of the temporal distinction between the contrasts in the rated levels, a statistical analysis was conducted.
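One way to read the Table 4 means is as a tense-to-lax duration ratio per level. The sketch below does only that arithmetic on the reported means; the ratio itself and the names `MEAN_DURATION_MS` and `tense_lax_ratio` are illustrative and are not a measure the study reports.

```python
# Mean durations in ms, copied from Table 4.
MEAN_DURATION_MS = {
    "Level 1–2": {"/i/": 122, "/ɪ/": 106, "/u/": 174, "/ʊ/": 123},
    "Level 3": {"/i/": 107, "/ɪ/": 78, "/u/": 161, "/ʊ/": 81},
    "Level 4–5": {"/i/": 90, "/ɪ/": 47, "/u/": 170, "/ʊ/": 69},
}


def tense_lax_ratio(level, tense, lax):
    """Mean tense duration divided by mean lax duration, rounded to 2 places."""
    durations = MEAN_DURATION_MS[level]
    return round(durations[tense] / durations[lax], 2)


for level in MEAN_DURATION_MS:
    print(level, tense_lax_ratio(level, "/i/", "/ɪ/"), tense_lax_ratio(level, "/u/", "/ʊ/"))
# Level 1–2 1.15 1.41
# Level 3 1.37 1.99
# Level 4–5 1.91 2.46
```

For both contrasts the ratio grows with the rated level, mainly because the lax vowels shorten more than the tense vowels do.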

Two-way (2×3) analyses of variance (Vowel: tense vs. lax vowel × Level: level 1–2, level 3, level 4–5) were performed to examine whether the temporal distinction between tense and lax contrasts is related to L2 fluency ratings in Korean EFL learners’ production of English vowels. The results are displayed in Table 5.

Table 5. Results of two-way ANOVA for duration of tense and lax contrasts
Pair Variable df F-value p-value
/i/-/ɪ/ Vowel 1 7.981 .006**
Level 2 5.987 .004**
Vowel×level 2 .556 .576
/u/-/ʊ/ Vowel 1 44.959 .000***
Level 2 3.701 .033*
Vowel×level 2 1.796 .179

* p<.05, ** p<.01, *** p<.001.


In the high-front tense and lax contrast /i/-/ɪ/, a two-way ANOVA yielded significant main effects of vowel, F(1,93)=7.98, p<.01, and of level, F(2,93)=5.99, p<.01, but the Vowel×level interaction was not significant, F(2,93)=.56, p>.05. A post hoc test (Scheffé) revealed that the lower level (the contrast /i/-/ɪ/: M=122 vs. 106 ms, SD=40 vs. 49 ms) was significantly different from the middle (M=107 vs. 78 ms, SD=17 vs. 14 ms) and higher (M=90 vs. 47 ms, SD=15 vs. 8 ms) levels, which did not differ from each other (see Figure 4). In other words, the durational distinction between /i/ and /ɪ/ was greater in the relatively proficient levels (i.e., the middle and higher levels) than in the lower level.

As for the high-back tense and lax contrast /u/-/ʊ/, the ANOVA revealed significant main effects of vowel, F(1,45)=44.96, p<.001, and level, F(2,45)=3.70, p<.05, but the Vowel×level interaction was not significant, F(2,45)=1.80, p>.05. Post hoc analysis of the main effect of level indicated that the lower level (the contrast /u/-/ʊ/: M=174 vs. 123 ms, SD=50 vs. 30 ms) was significantly different from the middle (M=161 vs. 81 ms, SD=19 vs. 13 ms) and the higher (M=170 vs. 69 ms, SD=14 vs. 19 ms) levels, which did not differ from each other (see Figure 4). This means that the temporal distinction between /u/ and /ʊ/ was greater in the middle and higher levels than in the lower level.
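The F-values and degrees of freedom in Table 5 are enough to recover an effect size for each term. The sketch below uses the standard identity for partial eta squared, F·df_effect / (F·df_effect + df_error). The study does not report effect sizes, so this is an illustrative reading of Table 5 rather than an author result, and the name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F-ratio and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (pair, effect, F, df_effect, df_error) for the duration ANOVAs in Table 5.
TABLE_5 = [
    ("/i/-/ɪ/", "Vowel", 7.981, 1, 93),
    ("/i/-/ɪ/", "Level", 5.987, 2, 93),
    ("/u/-/ʊ/", "Vowel", 44.959, 1, 45),
    ("/u/-/ʊ/", "Level", 3.701, 2, 45),
]
for pair, effect, f_value, df1, df2 in TABLE_5:
    print(pair, effect, round(partial_eta_squared(f_value, df1, df2), 3))
```

On this measure the vowel effect on duration is far larger for /u/-/ʊ/ than for /i/-/ɪ/, in line with the much larger F-value in Table 5.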

4. Discussion

The current study investigated whether there is a relationship between accurate vowel production and proficiency levels in L2 English produced by Korean EFL learners. The results suggest that a relationship between English vowel production and L2 proficiency was apparent only in the high-front tense and lax contrast /i/-/ɪ/: the more proficient the rated level, the better the learners produced the contrast /i/-/ɪ/ with acoustically separable vowel qualities. The other pairs, however, showed no such relationship between vowel production and L2 proficiency level in Korean EFL learners’ production of English vowels.

4.1. English Tense and Lax Contrasts: /i/-/ɪ/, /u/-/ʊ/

The results of this study revealed that the middle and lower levels showed little spectral distinction between /i/ and /ɪ/, while the higher level significantly separated the contrast spectrally by moving the tongue much lower and further back in producing the lax vowel /ɪ/ than in producing its tense counterpart /i/. However, in terms of vowel duration, Korean EFL learners were generally able to differentiate the tense /i/ from the lax /ɪ/ by producing the tense /i/ with a much longer duration than its lax counterpart. In particular, the middle and higher levels were better able to separate the contrast with the temporal feature (i.e., vowel duration) than the lower level. Many previous studies have demonstrated that native Korean speakers lack an understanding of the spectral distinction between the tense and lax contrast /i/-/ɪ/ insofar as there is no tense-lax distinction in their L1 inventory (Flege et al., 1997; Hong, 2012; Tsukada et al., 2005). However, the findings of this study may suggest that Korean learners of English at higher proficiency levels can separate the tense and lax contrast /i/-/ɪ/ by spectral as well as temporal cues. Moreover, numerous studies have shown that Korean EFL learners unduly rely on the temporal characteristic in distinguishing between tense and lax contrasts (Flege et al., 1997; Tsukada et al., 2005; Yun, 2009), but the present findings also indicate that Korean learners at lower proficiency levels have difficulty separating the contrast /i/-/ɪ/ even with the temporal feature.

With regard to the high-back tense and lax contrast /u/-/ʊ/, the results suggested that Korean learners of English seldom distinguish /u/ from /ʊ/ spectrally. However, as for vowel duration, the temporal distinction between /u/ and /ʊ/ was significant across all the rated levels. The results support previous findings that Korean L1 speakers classify the high-back tense and lax vowels /u/-/ʊ/ mainly by vowel duration rather than vowel quality (Hong, 2012; Tsukada et al., 2005; Yun, 2009). However, the results also revealed that the temporal distinction between the vowels was greater in the higher and middle levels than in the lower level; that is, the length effect was weaker in the lower fluency rating.

4.2. English Adjacent Vowel Pairs: /ɛ/-/æ/, /ʌ/-/ɔ/, /ɔ/-/ɑ/

The pair of adjacent front vowels /ɛ/-/æ/ was significantly distinguished by tongue height (F1). In particular, compared to the middle and lower levels, the higher level clearly distinguished the vowels by lowering the tongue further in producing the low vowel /æ/ than in producing the mid vowel /ɛ/. On the other hand, all the rated levels failed to separate the front vowels by tongue backness (F2). Existing research has established that Korean learners of English find it difficult to produce the /ɛ/-/æ/ contrast with acoustically separable vowel qualities (Flege et al., 1997; Hwang & Lee, 2012; Ingram & Park, 1997; Tsukada et al., 2005). Ingram & Park (1997) showed that this was largely because of the recent phonological merger /ɛ/-/e/ in the Korean vowel system. By contrast, the findings of this study suggest that Korean learners at higher L2 English proficiency levels can separate the vowels /ɛ/-/æ/ spectrally, especially in terms of tongue height (F1).

Concerning the central and back vowels /ʌ/-/ɔ/, the pair was not spectrally separated at any of the rated levels. This was partly because of negative L1 transfer. Specifically, the mid-back vowel /ɔ/ is not included in the Korean L1 inventory, whereas the central vowel /ʌ/ is shared by the Korean and English vowel systems. Thus, Korean learners of English generally treat the vowels /ʌ, ɔ/ as the single vowel /ʌ/ that is present in the Korean inventory. This finding lends support to the previous studies of Koo (2000), Hong (2012) and Tsukada et al. (2005), which reported that Korean learners experience significant confusion when producing central and back vowels.

As for the back vowels /ɔ/-/ɑ/, on the other hand, the results demonstrated that the vowels were distinguished by both F1 and F2 at all the rated levels, meaning that Korean EFL learners are able to distinguish these back vowels with spectral cues.

5. Conclusion

In conclusion, a relationship between accurate vowel production and L2 proficiency was apparent only in the high-front tense and lax vowels /i/-/ɪ/. The higher the fluency rating, the better the learners separated the contrast through spectral as well as temporal cues. In addition, although there was no Vowel×level interaction in the production of the adjacent front vowels /ɛ/-/æ/, the higher level showed a greater spectral distinction between the vowels in tongue height than the middle and lower levels. However, except for the back vowels /ɔ/-/ɑ/, Korean EFL learners generally experienced difficulty in separating the central and back vowels. This may result from native Korean speakers’ smaller vowel spaces in comparison to those of native English speakers. According to Koo (2000) and Franklin (2009), the Korean vowel system is less dense than the vowel system in English, making native Korean speakers articulate English vowels within relatively narrow vowel spaces in relation to native English speakers. Hence, Korean learners of English need to move their tongue more drastically when producing English vowels to avoid any confusion in separating adjacent vowels.

Meanwhile, this study lacks a control group of native English speakers. Thus, it cannot fully ascertain whether the performances showing significant differences in separating the pairs are native-like or not. To see whether these significant effects reach a native level, future research needs to compare the results with those of native English speakers. Moreover, apart from English monophthongs, English diphthongs should be examined for a more comprehensive analysis of L2 English vowel production by Korean L1 speakers.

Footnote

* This work is a revision of the first author’s MA thesis, which was supervised and commented on by the corresponding author.

References

1.

Alfonso, P. J., & Baer, T. (1982). Dynamics of vowel articulation. Language and Speech, 25(2), 151-173.

2.

Best, C. T. (1991, April). Phonetic influences on the perception of non-native speech contrasts by 6-8 and 10-12 month-olds. Paper presented at the Meeting of the Society for Research in Child Development, Seattle, WA.

3.

Campbell, D. F., McDonnell, C., Meinardi, M., & Richardson, B. (2007). The need for a speech corpus. ReCALL, 19(1), 3-20.

4.

Cho, C. (2017). Analysis of phonetic and phonological accent in rated L2 read speech corpus of Korean learners of English (Master’s thesis). Yonsei University, Seoul, Korea.

5.

Cho, T., Kim, S., & Hur, Y. (2013). Effects of prosodic strengthening on the production of English high front vowels /i, ɪ/ by native vs. non-native speakers. Phonetics and Speech Sciences, 5(4), 129-136.

6.

Cutler, A., Smits, R., & Cooper, N. (2005). Vowel perception: Effects of non-native language vs. non-native dialect. Speech Communication, 47(1-2), 32-42.

7.

Eckman, F. R. (1977). Markedness and the contrastive analysis hypothesis. Language Learning, 27(2), 315-330.

8.

Escudero, P., Simon, E., & Mitterer, H. (2012). The perception of English front vowels by North Holland and Flemish listeners: Acoustic similarity predicts and explains cross-linguistic and L2 perception. Journal of Phonetics, 40(2), 280-288.

9.

Flege, J. E. (1987). The production of “new” and “similar” phones in a foreign language: Evidence for the effect of equivalence classification. Journal of Phonetics, 15(1), 47-65.

10.

Flege, J. E. (1995). Second-language speech learning: Theory, findings and problems. In W. Strange (Ed.), Speech perception and linguistic experience: Theoretical and methodological issues in cross-language speech research (pp. 229-273). Timonium, MD: York Press.

11.

Flege, J. E., Bohn, O. S., & Jang, S. (1997). Effects of experience on non-native speakers' production and perception of English vowels. Journal of Phonetics, 25(4), 437-470.

12.

Fougeron, C., & Keating, P. A. (1997). Articulatory strengthening at edges of prosodic domains. The Journal of the Acoustical Society of America, 101(6), 3728-3740.

13.

Franklin, A. D. (2009). English vowel production in Japanese, Korean, and Spanish ESL speakers before and after vowel-focused pronunciation training (Doctoral dissertation). University of Washington, Seattle, WA.

14.

Ho, Y. K. (2010). The perception and production of American English front vowels by EFL learners in Taiwan: The influence of first language and proficiency levels (Doctoral dissertation). University of Kansas, Lawrence, KS.

15.

Hong, S. (2012). The relative perceptual easiness between perceptually assimilated vowels for university-level Korean learners of American English and measurement bias in an identification test. Studies in Phonetics, Phonology and Morphology, 18(3), 491-511.

16.

Hwang, I., & Lee, S. (2012). Perception of English vowel categories by Korean university students. Linguistics, 37(4), 1095-1117.

17.

Ingram, J. C., & Park, S. G. (1997). Cross-language vowel perception and production by Japanese and Korean learners of English. Journal of Phonetics, 25(3), 343-370.

18.

Jones, D. (1960). An outline of English phonetics. Cambridge: W. Heffer & Sons.

19.

Kenyon, J. S., & Knott, T. A. (1953). A pronouncing dictionary of American English. Springfield, MA: Merriam.

20.

Kim, J. E. (2007). A phonetic study of Koreans’ production of English vowels and their pronunciation pedagogy. Linguistics, 15(4), 41-54.

21.

Kim, J. M., Wang, C., Peabody, M., & Seneff, S. (2004, October). An interactive English pronunciation dictionary for Korean learners. Proceedings of the Interspeech 2004-ICSLP, Jeju, Korea.

22.

Klatt, D. H. (1975). Vowel lengthening is syntactically determined in a connected discourse. Journal of Phonetics, 3(3), 129-140.

23.

Kong, E. J., & Yoon, I. H. (2013). L2 proficiency effect on the acoustic cue-weighting pattern by Korean L2 learners of English: Production and perception of English stops. Phonetics and Speech Sciences, 5(4), 81-90.

24.

Koo, H. (2000). Characteristics of English vowels spoken by Koreans. Speech Sciences, 7(3), 99-108.

25.

Ladefoged, P., & Disner, S. F. (2012). Vowels and consonants (3rd ed.). Oxford: Wiley-Blackwell.

26.

Lado, R. (1957). Linguistics across cultures: Applied linguistics for language teachers. Ann Arbor, MI: University of Michigan Press.

27.

Lee, J. W. (2018). Degree of phonological rule transfer: Evidence from L2 English production by L1 Korean speakers (Master’s thesis). Yonsei University, Seoul, Korea.

28.

Li, J. H., & Lee, S. H. (2017). An acoustic study on English tense-lax vowel pairs produced by Chinese and Korean speakers. Wonkwang Journal of Humanities, 18(1), 283-306.

29.

Maddieson, I. (1997). Phonetic universals. In W. J. Hardcastle & J. Laver (Eds.), The handbook of phonetic sciences (pp. 619-639). Cambridge: Blackwell.

30.

Mauranen, A. (2004). Speech corpora in the classroom. In G. Aston, S. Bernardini, & D. Stewart (Eds.), Corpora and language learners (pp. 195-211). Amsterdam: John Benjamins.

31.

Park, D., Lee, S., & Cho, M. (2010). Interference of L1 phonological processes in English learning. English Language and Linguistics, 16(3), 187-215.

32.

Park, E. (2017). The study of speech rate and pause in rated L2 speech corpus of Korean learners of English (Master’s thesis). Yonsei University, Seoul, Korea.

33.

Reetz, H., & Jongman, A. (2009). Phonetics: Transcription, production, acoustics and perception. Malden, MA: Blackwell.

34.

Rhee, S. (2016). Construction of rated Korean L2 English speech corpus. Proceedings of the Korean Society of Speech Sciences 2016 Spring Conference (pp. 125-126). Gwangju Institute of Science and Technology (GIST), Gwangju, Korea.

35.

Tsujimura, N. (1996). An introduction to Japanese linguistics. Oxford: Blackwell.

36.

Tsukada, K., Birdsong, D., Bialystok, E., Mack, M., Sung, H., & Flege, J. (2005). A developmental study of English vowel production and perception by native Korean adults and children. Journal of Phonetics, 33(3), 263-290.

37.

Wang, C. (1988). The production and perception of English vowels by native speakers of Mandarin (Doctoral dissertation). University of Alabama, Birmingham, AL.

38.

Wang, H., & van Heuven, V. J. (2006). Acoustical analysis of English vowels produced by Chinese, Dutch and American speakers. Linguistics in the Netherlands, 23(1), 237-248.

39.

Yang, B. (1996). A comparative study of American English and Korean vowels produced by male and female speakers. Journal of Phonetics, 24(2), 245-261.

40.

Yang, B. (2008). An acoustical comparison of English tense and lax vowels produced by Korean and American males. Speech Sciences, 15(4), 19-27.

41.

Yoon, S. Y., Pierce, L., Huensch, A., Juul, E., Perkins, S., Sproat, R., & Hasegawa-Johnson, M. (2009). Construction of a rated speech corpus of L2 learners' spontaneous speech. CALICO Journal, 26(3), 662-673.

42.

Yun, Y. (2009). Identification of English vowels /u/-/ʊ/ by English and Korean speakers. The Journal of Modern British & American Language & Literature, 27(4), 241-253.

Appendices

Appendix 1. Checkpoints for Analytic Evaluation in Genie Speech Corpus
1. Speech Speed & Pause
  • Check naturalness and the speed of speech.
  • Check the numbers and the length of the pauses between words and between syllables.
  • Check if brief pauses between thought groups are natural or normal (in case pauses are proper).

2. (Lexical & Sentential) Stress and Rhythm
  • Check the appropriate, noticeable distinctions between stressed and unstressed syllables in terms of loudness, pitch and length.
  • Assess the correct placement of the lexical and sentential stress.
  • Examine the syllable-timed rhythm.
  • Check if stress distinction is well observed between content and function words, and/or between focused and non-focused words.

3. Intonation
  • Check if the proper and intended intonation pattern is clear.
  • Check if the tonic syllable is noticeably distinct in each tone unit.
  • Check if the pitch between the tonic syllable and the end of the tone unit is appropriate and adequate.

4. Segmental Features
  • Assess the phonemic differences of all vowels and consonants.
  • Check the observance of English phonological rules.
  • Refer to the segmental features given on pages 1–2 of this report.

Appendix 2. Rubric for Analytic and Holistic Evaluation in Genie Corpus (Rhee, 2016)
Levels Category Description Proficiency achievement
5 Mastery Speech speed & pause No/Minor awkwardness & errors: Native or near-native level 91%–100%
Stress & rhythm
Intonation
Segmental features
4 Advanced Speech speed & pause Some awkwardness & errors 76%–90%
Stress & rhythm
Intonation
Segmental features
3 Adequate Speech speed & pause Occasional awkwardness & errors 51%–75%
Stress & rhythm
Intonation
Segmental features
2 Developing Speech speed & pause Frequent incorrect pronunciations 31%–50%
Stress & rhythm
Intonation
Segmental features
1 Novice Speech speed & pause Many wrong pronunciations. Unintelligible, incomprehensible speech. 0%–30%
Stress & rhythm
Intonation
Segmental features