1. Introduction
Stop consonants are produced in three phases: complete closure of the airflow in the oral cavity, a release burst, and aspiration. The interval from the release burst through the aspiration is called voice onset time (VOT), since it corresponds to the time that elapses before the vocal folds begin to vibrate for the following vowel or voiced consonant.
Most previous studies on stop consonants have focused on VOT. It has been used as a phonetic cue that distinguishes voiced from voiceless stops (Lisker & Abramson, 1964). In English, voiceless stops (/p, t, k/) have longer VOTs than voiced stops (/b, d, ɡ/). It was also reported that the farther back in the oral cavity a stop is articulated, the longer its VOT is. Thus, by place of articulation, the VOTs were ordered bilabial stops < alveolar stops < velar stops (Lisker & Abramson, 1967).
But recent studies have shown that VOT may not be a reliable phonetic cue for the place of articulation of stops. Chodroff and Wilson’s (2017) study on read speech showed that the VOTs of English word-initial /t/ and /k/ were almost the same and that the VOT of /p/ was also very close to them, with differences of less than 10 ms. Yao’s (2009) study on English spontaneous speech also reported that the VOT differences among places of articulation of word-initial stops were not statistically significant. In both studies, the ordering varied by speaker: some produced /p/ < /t/ < /k/, others /p/ < /k/ < /t/, and others /t/ < /p/ = /k/.
It was also known that the VOTs of word-initial stops were different depending on contexts. For instance, they differed when they occurred at sentence-initial positions or inside unstressed words. They also differed depending on the height of the following vowels (Chodroff & Wilson, 2017; Nearey & Rochet, 1994; Yao, 2009).
Previous research on closure durations of stops was mostly concerned with word-final stops in English. Voiced word-final stops had short closure durations and long preceding vowels, whereas voiceless word-final stops had long closure durations and short preceding vowels (Luce & Charles-Luce, 1985; Raphael, 1972). Kang (2007) showed that the vowels before English voiceless word-final stops were 184 ms on average, and those before voiced word-final stops were 262 ms on average. The latter were 42% longer than the former.
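The 42% figure follows directly from the two means reported by Kang (2007). The short sketch below only reproduces that arithmetic; the variable names are illustrative and are not taken from the cited study.

```python
# Mean vowel durations reported by Kang (2007), in milliseconds.
before_voiceless_ms = 184
before_voiced_ms = 262

# Relative lengthening of vowels before voiced word-final stops.
ratio = before_voiced_ms / before_voiceless_ms
percent_longer = (ratio - 1) * 100

print(f"ratio = {ratio:.2f}, percent longer = {percent_longer:.0f}%")
```

Expressing the difference as a ratio rather than as a raw difference (78 ms) makes it easier to compare across studies with different overall speech rates.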
Compared with word-final stops, closure durations of intervocalic stops have been studied far less. Lisker (1957), one of the earliest studies on stops, measured closure durations of English /p/ and /b/ in minimal pairs such as ‘ruby-rupee’ and ‘rabid-rapid’ produced by 34 native speakers of English, and found that the average closure duration of /p/ was 120 ms and that of /b/ was 75 ms. Lisker (1957) also conducted a perception test using ‘rupee’ and ‘ruby’ continua. For one continuum, the /p/ closure of the ‘rupee’ token was shortened in 10 ms steps; for the other, the /b/ closure of the ‘ruby’ token was lengthened in 10 ms steps. In both continua, native speakers of English identified long closure durations as /p/ and short closure durations as /b/.
Stathopoulos & Weismer (1983) also did research on closure durations of English stops. They found that those of voiceless stops were longer than those of voiced stops in the word-final position as well as between vowels, and that closure durations of English stops were longer in word-initial positions than in word-final positions and between vowels. Those of bilabial stops were longer than those of alveolar and velar stops. They were longer in stops of stressed syllables than in stops of unstressed syllables.
It was also found in Japanese (Port et al., 1987) and Arabic (Alghamdi, 1990) that closure durations of voiceless stops were longer than those of voiced stops when they appeared between vowels.
Korean stops have recently undergone VOT changes. Several decades ago, fortis, lenis, and aspirated stops were distinguished by VOT, in the order of fortis < lenis < aspirated stops (Lisker & Abramson, 1964). However, younger generations now distinguish lenis and aspirated stops by the f0 of the following vowel (Kang, 2014; Silva, 2006). Kim (2021) used a spontaneous speech corpus to analyze VOTs of Korean word-initial stops produced by Koreans in their 10s through 60s. Kim reported that speakers in their 10s through 30s produced word-initial lenis and aspirated stops with almost the same VOTs but word-initial fortis stops with shorter VOTs. Speakers in their 40s through 60s, however, still produced the VOTs of word-initial stops in the order of fortis < lenis < aspirated stops. In all generations, the f0s of the following vowels were in the order of lenis < fortis < aspirated stops. Thus, Kim’s study showed that f0 is a phonetic cue for laryngeal contrasts and that young Koreans use it to distinguish lenis from aspirated stops.
As explained so far, most research on Korean stops was focused on VOTs of word-initial stops and f0s of the following vowels. Research on closure durations was relatively rare. However, a few studies on closure durations between vowels were reported. They showed that the closure durations between vowels were in the order of lenis < aspirated < fortis stops (Han, 1996; Pyo et al., 1999; Silva, 1992).
Pyo et al. (1999) measured closure durations of stops between /ɑ/ vowels (e.g., /ɑpɑ/) produced by 21 Korean adults. By place of articulation, closure durations were in the order of velars < alveolars < bilabials. By laryngeal contrast, they were in the order of lenis < aspirated < fortis stops. Kang & Kang (2006) found that as the closure durations of intervocalic fortis stops were shortened, Koreans increasingly identified them as lenis stops, whereas as the closure durations of intervocalic lenis stops were lengthened, Koreans identified them as fortis stops.
Kang & Dilley (2007) also compared lenis and fortis stops that occurred between vowels produced by Koreans, and found that closure durations of fortis stops were much longer than those of lenis stops. Yun (2009) also compared closure durations of Korean stops between vowels, and found that those of aspirated stops and fortis stops were 2.14 times and 2.55 times longer than those of lenis stops, respectively.
Korean is known to have so-called coda neutralization (Sohn, 1987). When Korean obstruents are produced in coda position, they are not released. Thus, bilabial stops /p, ph, p*/, alveolar stops /t, th, t*/, and velar stops /k, kh, k*/ are produced as [p┓], [t┓], [k┓], respectively, in coda positions. For instance, the word-final or syllable-final stops (namely, coda stops) of /kak/ ‘angle’, /puəkh/ ‘kitchen’, and /tak*ta/ ‘to polish’ are produced as [kak┓], [puək┓], and [tak┓t*a], respectively.
In summary, based on previous studies on stops found in foreign languages such as English, Japanese, and Arabic, voiceless stops in general had longer closure durations than voiced stops. In English, closure durations were longer in voiceless stops than in voiced ones in word-final positions. In addition, they were longer in word-initial positions than in word-final positions and between vowels.
Based on previous studies on Korean stops, closure durations between vowels were in the order of lenis stops < aspirated stops < fortis stops in terms of laryngeal contrasts, and in the order of velar stops < alveolar stops < bilabial stops in terms of places of articulation.
The current study investigates closure durations of Korean stops produced by Korean speakers. Since Korean stops have not been investigated in word-initial and word-final positions, these positions are included in this study along with the intervocalic position. The results are analyzed in terms of laryngeal contrasts, places of articulation, and gender.
The languages discussed above, namely English, Japanese, and Arabic, phonologically distinguish voiced and voiceless stops, but Korean does not. Thus, the results of the current study may differ from the findings for those languages. Nevertheless, I cautiously expect the following results. If Korean stops in word-initial position show the same pattern as those between vowels in the previous studies, closure durations will be in the order of lenis < aspirated < fortis stops in terms of laryngeal contrasts, and in the order of velar < alveolar < bilabial stops in terms of places of articulation. However, because of coda neutralization, closure durations of word-final stops are expected to be the same across laryngeal contrasts and places of articulation. I also expect that gender will not affect closure durations among the participants.
2. Production Experiment
This study analyzes closure durations of Korean stops found in word-initial positions, word-final positions, and between vowels. In everyday speech, frequently used words are typically produced at a faster rate than less frequently used words. In production experiments, it is customary to minimize word-frequency effects as much as possible because they may affect the durations of the recorded words. Although such effects may or may not arise in recordings of words of only one or two syllables, the present experiment was designed to minimize them. Thus, nonsense words containing only the vowel /a/ were constructed. Nevertheless, a few of these items unavoidably coincided with real Korean words.
Table 1 shows the words used in the experiment. Following customs of Korean consonant transcriptions, lenis, aspirated, and fortis stops were transcribed as, for instance, /pa/, /pha/, /p*a/, respectively.
Korean does not have words that contain the fortis stops /p*/ and /t*/ in word-final positions. However, since the fortis stop /k*/ is found in word-final positions (e.g., /pak*/ ‘outside’), I assume that fortis stops /p*/ and /t*/ may still be present in Korean speakers’ mental lexicon. Thus, all of the fortis stops were used in the word-final positions of the experimental words.
Eleven male and eleven female Korean speakers participated in the production experiment. They were all college students and speakers of the Seoul dialect. Their ages ranged from 20 to 26, with a mean of 21 and a standard deviation of 1.68. The participants entered a sound-attenuated recording booth and practiced until they were used to pronouncing the nonsense words. The carrier sentence for the experimental words was /iʧɛ ___ s*ɨsɛjo/ ‘Now write ___ ’. The participants produced each carrier sentence three times, and the nonsense word in the second repetition was selected for the analysis of closure durations, since participants tend to produce the first repetition a little fast and the last one a little slowly.
To record the experimental words, a SONY PCM-M10 digital recorder and a Sennheiser PC151 headset were used. The headset microphone was placed about five centimeters from the left corner of each participant’s mouth to keep the recording volume constant. The recordings were digitized at a sampling rate of 44,100 Hz with 16-bit resolution.
Closure durations of the Korean stops were measured with the sound analysis program Praat (Boersma & Weenink, 2019). To locate the start and end of each closure, both the waveforms of the surrounding vowels and the first and second formants in the spectrograms were checked. The measurements were made on the waveforms, and the results were analyzed with the statistics program SPSS.
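Once the closure boundaries have been marked, each closure duration is simply the end time minus the start time, converted to milliseconds. The sketch below illustrates that step under stated assumptions: the `ClosureInterval` structure and the boundary times are hypothetical and made up for illustration; they are not the study's data, and the measurements in this study were made by hand in Praat.

```python
from dataclasses import dataclass


@dataclass
class ClosureInterval:
    """One hand-labelled closure interval (hypothetical structure)."""
    token: str      # experimental word, e.g. "apa"
    start_s: float  # closure onset in seconds
    end_s: float    # release burst in seconds

    def duration_ms(self) -> float:
        # Convert the interval length from seconds to milliseconds.
        return (self.end_s - self.start_s) * 1000.0


# Made-up boundary times for illustration only; not data from this study.
intervals = [
    ClosureInterval("pa", 0.250, 0.340),
    ClosureInterval("apa", 0.410, 0.475),
]

# Round to 0.1 ms to hide floating-point noise in the subtraction.
durations = {iv.token: round(iv.duration_ms(), 1) for iv in intervals}
print(durations)
```

Keeping the boundaries in seconds and converting only at the end matches the convention stated in the Results section, where all durations are reported in milliseconds.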
3. Results
The highlighted portion in Figure 1 shows the closure duration of lenis /p/ of /pa/ produced by a Korean male speaker SSO. Its duration is 92 ms.
The averages of closure durations of individual stops are listed in Table 2. The closure durations were also averaged to compare the differences based on positions within words, laryngeal contrasts, and places of articulation. They are listed in Table 3 and also shown in Figure 2 through Figure 4. The durations in the current study are all represented in milliseconds (ms).
A one-way analysis of variance (ANOVA) was run with closure duration as the dependent variable and position within words as the independent variable. The result was not statistically significant, though it was marginal (F=2.510, p=.082). Tukey HSD showed that the difference between closure durations between vowels and those at the word-final position was marginal (p=.066).
A one-way ANOVA with closure duration as the dependent variable and laryngeal contrast as the independent variable yielded a significant result (F=148.386, p<.001). Tukey HSD showed that all pairwise differences in closure duration were significant: lenis vs. aspirated (p<.001), lenis vs. fortis (p<.001), and aspirated vs. fortis (p<.001).
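For readers who want to see what the F statistic of a one-way ANOVA measures, the following minimal sketch computes it from scratch as the ratio of between-group to within-group mean squares. It is not the SPSS procedure used in this study: the function name `one_way_anova_f` is hypothetical, the three groups hold made-up values rather than the study's measurements, and the p-value is omitted because it requires the F distribution, which the Python standard library does not provide.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up closure durations (ms) by position; not data from this study.
word_initial = [100, 110, 120]
between_vowels = [110, 120, 130]
word_final = [90, 100, 110]

f_stat, df_b, df_w = one_way_anova_f([word_initial, between_vowels, word_final])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A large F means that the group means differ by more than the spread within groups would predict; a post hoc test such as Tukey HSD is then needed to find which pairs of groups differ.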
One-way ANOVA with closure durations as the dependent variable and places of articulation as the independent variable showed that they were significantly different (F=3.706, p=.025). Tukey HSD showed that closure durations of bilabial and velar stops were significantly different from each other (p=.018).
Since Korean has coda neutralization, laryngeal contrasts are expected to be neutralized at the word-final position. Thus, a two-way ANOVA was run to check the interaction of position within words and laryngeal contrast. The interaction was significant (F=37.648, p<.001). Figure 5 shows that the closure durations of lenis, aspirated, and fortis stops at the word-final position are very close to one another (97 ms vs. 101 ms vs. 112 ms) compared with those at the word-initial position and between vowels.
The interaction was not significant between position within words and place of articulation (F=.538, p=.708), nor between place of articulation and laryngeal contrast (F=.588, p=.671).
As Table 2 shows, closure durations appear to differ between male and female speakers. An independent t-test showed that the difference was significant (t=4.130, p<.001). The averages were 115 ms for male speakers and 101 ms for female speakers; that is, male speakers’ closure durations were 14 ms longer on average. Figure 6 represents the difference in a graph.
However, Table 2 suggests that male speakers’ closure durations were longer than female speakers’ at the word-initial and word-final positions but shorter between vowels. Thus, a two-way ANOVA was run to test for an interaction between gender and position within words. The interaction was significant (F=11.975, p<.001). Figure 7 shows that male speakers’ closure durations were longer than female speakers’ at the word-initial (119 ms vs. 96 ms) and word-final (117 ms vs. 89 ms) positions, whereas between vowels the pattern was reversed, although the difference was small (108 ms vs. 117 ms).
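The interaction term of a two-way ANOVA asks whether the effect of one factor changes across the levels of the other, which is exactly the pattern coda neutralization predicts. The sketch below computes the interaction F for a balanced design by hand. It is an illustration under assumptions, not the SPSS analysis reported here: the function name `interaction_f` is hypothetical, the design is reduced to two positions by two laryngeal types, and all values are made up.

```python
from statistics import mean


def interaction_f(cells):
    """Interaction F for a balanced two-way design.

    `cells` maps (row_level, column_level) to a list of observations;
    every cell must hold the same number of observations.
    """
    rows = sorted({r for r, _ in cells})
    cols = sorted({c for _, c in cells})
    n = len(next(iter(cells.values())))
    assert all(len(v) == n for v in cells.values()), "design must be balanced"

    cell_mean = {k: mean(v) for k, v in cells.items()}
    row_mean = {r: mean([cell_mean[(r, c)] for c in cols]) for r in rows}
    col_mean = {c: mean([cell_mean[(r, c)] for r in rows]) for c in cols}
    grand = mean(list(cell_mean.values()))

    # Part of each cell mean not explained by the two main effects.
    ss_inter = n * sum(
        (cell_mean[(r, c)] - row_mean[r] - col_mean[c] + grand) ** 2
        for r in rows
        for c in cols
    )
    # Residual variation inside the cells.
    ss_error = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    df_inter = (len(rows) - 1) * (len(cols) - 1)
    df_error = len(rows) * len(cols) * (n - 1)
    return (ss_inter / df_inter) / (ss_error / df_error), df_inter, df_error


# Made-up closure durations (ms); not data from this study.
cells = {
    ("between_vowels", "lenis"): [60, 70],
    ("between_vowels", "fortis"): [150, 160],
    ("word_final", "lenis"): [95, 105],
    ("word_final", "fortis"): [105, 115],
}

f_inter, df_i, df_e = interaction_f(cells)
print(f"interaction F({df_i}, {df_e}) = {f_inter:.1f}")
```

In this toy data the lenis-fortis gap is large between vowels and nearly absent word-finally, so the interaction F is large even though both main effects are also present.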
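The t statistic of an independent-samples test is the difference between the two group means divided by its standard error. The sketch below uses the pooled-variance (Student) form, which is an assumption: the text does not state whether equal variances were assumed. The function name `pooled_t` is hypothetical, and the values are made up rather than taken from Table 2.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Return (t, df) for a pooled-variance independent-samples t-test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Weighted average of the two sample variances.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    # Standard error of the difference between the two means.
    se = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, df


# Made-up closure durations (ms); not data from this study.
male = [110, 120, 130]
female = [95, 100, 105]

t_stat, df = pooled_t(male, female)
print(f"t({df}) = {t_stat:.3f}")
```

A positive t here means that the first group (male) has the longer mean duration; the sign flips if the groups are passed in the opposite order.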
4. Discussion
Before the production experiment was conducted, I stated several expectations. All of them were confirmed except the one concerning gender. Closure durations were in the order of lenis < aspirated < fortis stops in terms of laryngeal contrasts, and this effect was statistically significant. Tukey HSD showed that all pairwise differences (lenis vs. aspirated, lenis vs. fortis, and aspirated vs. fortis) were significant.
The difference in closure durations in terms of place of articulation was also significant. The closure durations were in the order of velar < alveolar < bilabial stops. Tukey HSD showed that those of velar and bilabial stops were significantly different.
In addition, no interaction was found between position within words and place of articulation, or between place of articulation and laryngeal contrast. The closure durations at the word-final position were very close to one another regardless of laryngeal contrast. Thus, the coda neutralization effect was also confirmed.
However, there was an interaction effect between gender and positions within words. Figure 7 shows that male speakers’ closure durations were longer at the word-initial and word-final positions than female speakers’. However, they were the opposite between vowels although the difference was small.
Previous studies on closure durations of Korean stops focused mainly on intervocalic stops (Han, 1996; Kang & Dilley, 2007; Pyo et al., 1999; Silva, 1992; Yun, 2009). The current study replicated their findings: closure durations between vowels were in the order of lenis < aspirated < fortis stops in terms of laryngeal contrasts, and in the order of velar < alveolar < bilabial stops in terms of places of articulation. The current study found that this pattern also held at the word-initial position, but not at the word-final position, owing to coda neutralization. As Figure 5 shows, closure duration differences based on laryngeal contrasts were larger between vowels than at the word-initial position.
Figure 2 and Figure 5 deserve attention. As shown in Figure 2, closure durations were longest between vowels. However, Figure 5 shows that the picture is somewhat more complicated. In all three positions within words, the closure durations were in the order of lenis < aspirated < fortis stops. Of course, the difference was very small at the word-final position, where the contrast is neutralized by coda neutralization, as explained above.
It should also be noted that the closure durations of lenis stops were especially short between vowels compared with the other positions. This reflects the well-known process in Korean phonology by which lenis stops become voiced between vowels. All Korean stops are considered voiceless, and only lenis stops become voiced in intervocalic position (Sohn, 1987).
As discussed in the previous studies for English (Lisker, 1957; Stathopoulos & Weismer, 1983), Japanese (Port et al., 1987), and Arabic (Alghamdi, 1990), closure durations of voiceless stops are longer than those of voiced stops between vowels. Since the Korean lenis stops were voiced between vowels, their closure durations were much shorter than those of fortis and aspirated stops in that position. This may be why the effect of position within words was only marginal rather than significant. If the closure durations of lenis stops had also been long between vowels, the effect of position within words would probably have reached significance.
In Stathopoulos & Weismer’s (1983) study of English stops, closure durations were longer in the word-initial position than in the word-final position and between vowels. This differs from the findings for Korean stops in the current study, where closure durations were longest between vowels. I assume that this difference stems from the nature of stops in the two languages.
English has a two-way voicing distinction, whereas Korean has a three-way laryngeal contrast. English word-final stops are identified on the basis of the duration of the preceding vowel: the vowel before a voiced stop is longer than the vowel before a voiceless stop at the word-final position. Thus, closure duration seems less important and does not have to be long at the word-final position.
The voicing of English stops between vowels is easily identifiable, articulatorily from the vibration of the vocal folds and acoustically from the voice bar. Thus, closure duration seems to be less important between vowels and is therefore relatively short. But English word-initial stops, even those that are phonologically voiced, are typically produced without vibration of the vocal folds. Thus, the VOT difference affects the identification of their voicing. I also assume that the difference in closure duration between voiceless and voiced stops is larger at the word-initial position than in other positions in English. Since vocal fold vibration is not a dependable phonetic cue for the voicing of word-initial stops, both VOT and closure duration become more important.
The laryngeal contrasts in Korean stops were neutralized at the word-final position and hence the closure durations tended to be short. When syllables are formed among a series of consonants and vowels, the maximal onset principle is applied (Baertsch, 2010; Blevins, 1995) in which onset position is filled before coda position. Based on this principle, a series of sounds such as /apa/ should be syllabified as /a.pa/. Thus I assume that the closure durations of stops between vowels are longer than those in other positions to maximize the onset position.
However, for /pa/ which contains a word-initial stop, the syllabification was already done so the maximal onset principle does not have to be applied. Syllabification needs processing time in the brain (Santiago et al., 2000; Stenneken et al., 2007). Thus I assume that this is the reason why the closure durations of stops at the word-initial position are shorter than those between vowels in Korean.
Before the production experiment was conducted, I expected that closure durations of Korean stops would not differ between male and female speakers, since I saw no reason for them to differ by gender. However, the results showed that Korean male speakers produced closure durations 14 ms longer than female speakers did. This should not be interpreted to mean that male speakers’ productions are always longer than female speakers’. Between vowels, Korean female speakers’ closure durations were a little longer than male speakers’, although the difference was not significant (t=.017, p=.987).
In the studies of Bradlow et al. (2003) and Liu et al. (2004) on clear and fast speech, female speakers tended to speak faster than male speakers in ordinary conversation, but slower and more clearly than male speakers when speaking to old people and patients with hearing aids. In other words, female speakers adjusted their speech to the situation, speaking faster or slower and more clearly as required, more than male speakers did.
I assume that in the production experiment for the current study, the female participants may have considered the stimuli with stops at word-initial and word-final positions such as /pa/ and /kap/ as the list of words that did not have to be pronounced carefully like they did for patients and older people, and hence pronounced them faster than male speakers did.
However, female speakers’ closure durations were a little longer than male speakers’ between vowels although the difference was not significant. I assume that the female speakers produced them more clearly and slowly than male speakers did when they encountered the nonsense words such as /a.pa/ that needed to be syllabified. Thus, they spent more processing time in their brains.
Kim (2021) analyzed word-initial Korean stops and the following vowels, and found that VOT was almost the same in lenis and aspirated stops and was short in fortis stops, whereas the f0 of the following vowels was in the order of lenis < fortis < aspirated stops. As discussed above, closure durations can also distinguish the laryngeal contrasts. I would therefore like to point out that closure durations can also serve as a phonetic cue for stop distinctions. In Korean stops, they are more sensitive to laryngeal contrasts, places of articulation, and positions within words than VOT and f0 are.
The previous studies on English stops (Chodroff & Wilson, 2017; Yao, 2009) have shown that VOT may not be a dependable phonetic cue to distinguish places of articulation. Then, closure durations may become a phonetic cue for places of articulation of English stops. To prove this, future research is needed for closure durations of English stops using production experiments in which voicing, places of articulation, and positions within words should be investigated.
5. Conclusion
This study investigated closure durations of Korean stops. The statistical results showed that closure durations differed significantly by laryngeal contrast and by place of articulation. Closure durations also differed by position within words, although this difference was only marginal and did not reach statistical significance.
Before the production experiment, I did not expect the closure durations of male speakers to be longer than those of female speakers. I assume that this is because female speakers tend to adjust their speech, speaking fast or slowly and clearly, as the situation requires. Perhaps they thought that they did not have to produce the experimental materials carefully and slowly, and hence they spoke faster than male speakers did. But when they had to syllabify nonsense words such as /a.pa/, in which stops were between vowels, they produced them more clearly and more slowly than male speakers did and hence needed more processing time.
Closure durations should also be considered a valuable phonetic cue, like VOT and f0. Since VOT may not be a dependable phonetic cue for the place of articulation of English stops, closure durations should be investigated as an alternative cue.