External photoglottography, intra-oral air pressure, airflow and acoustic data on the Korean fricatives /s’, s/

Kim, Hyunsoon; Maeda, Shinji; Honda, Kiyoshi; Crevier-Buchman, Lise

doi:10.13064/KSSS.2022.14.3.011

Phonetics Speech Sci. 2022; 14(3):11-25

pISSN: 2005-8063, eISSN: 2586-5854


Phonetics


Hyunsoon Kim¹,*, Shinji Maeda², Kiyoshi Honda³, Lise Crevier-Buchman⁴


¹Department of English Language and Literature, Hongik University, Seoul, Korea

²CNRS LTCI, Télécom Paristech, Paris, France

³School of Computer Science and Technology, Tianjin University, Tianjin, China

⁴European Hospital of George Pompidou, Paris, France

*Corresponding author: hyunskim@hongik.ac.kr

© Copyright 2022 Korean Society of Speech Sciences. This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: Jul 28, 2022; Revised: Sep 13, 2022; Accepted: Sep 13, 2022

Published Online: Sep 30, 2022

Abstract

From simultaneous recordings of external photoglottography, intra-oral air pressure (P_io), airflow and acoustic data from four native Seoul Korean speakers (2 male and 2 female), we have found that the two fricatives are not significantly different in glottal opening peak and airflow peak height either word-initially or word-medially and that the duration of aspiration is significantly reduced in word-medial /s/, compared to that in word-initial /s/, but not in /s’/. We have also found that the duration of a high P_io plateau is significantly longer in /s/ than in /s’/ both word-initially and word-medially and that airflow resistance (R=P_io/U) at the onset and offset of a P_io plateau and at the time of airflow peak height is significantly higher in /s’/ than in /s/ across the contexts. However, the differences in P_io peak and F0 are not significant. In addition, the transition time to reach airflow peak height from the offset of a P_io plateau is found to be significantly longer in /s/ than in /s’/ in both word-initial and word-medial positions. No significant differences in glottal opening peak and airflow peak height confirm that /s/ is specified as [–spread glottis] like /s’/. As for the other significant differences, we propose that /s/ is [–tense], and /s’/ [+tense].

Keywords: aerodynamic and acoustic data; external photoglottography; Korean fricatives /s’, s/; [±s.g.] and [±tense]

1. Introduction

Many languages have a phonemic voicing contrast in strident fricatives (e.g. Tranel, 1987 for French; Shibatani, 1990 for Japanese; Ladefoged, 2006 for English; see Ladefoged & Maddieson 1996 for other languages). In contrast, Korean has a two-way phonation contrast in voiceless strident fricatives: the lenis /s/ and the fortis /s’/, which is produced on a pulmonic egressive airstream like fortis stops (e.g. /sata/ ‘to buy’; /s’ata/ ‘to be cheap’) (e.g. Kim et al., 2010b, 2011; Kim & Park 2011; H. Kim 2011). The typologically rare contrast of /s’/ vs. /s/ in Korean has received a great deal of attention in the literature (e.g. Kagaya, 1974; Hong et al., 1991; Jun et al., 1998; Cho et al., 2002; Kim et al., 2010b, 2011; Kim & Park 2011). So far, however, the Korean fricatives have been investigated either in articulation or in acoustics/aerodynamics, not in both at once. This leads us, in the present study, to make simultaneous recordings of external lighting and sensing photoglottography (ePGG) in one session, intra-oral air pressure (P_io) in the other, and airflow and acoustic data in both, in order to better understand the phonetic characterization of /s’/ and /s/. Beyond the scope of Korean fricatives, the present study would contribute to the simultaneous investigation of speech sounds articulatorily, aerodynamically, and acoustically in other languages as well, further deepening our understanding of speech sounds.¹,²

1.1. Background

In a recent MRI study, Kim et al. (2011) have investigated the Korean fricatives produced by two native speakers (one male and one female) of Seoul Korean, using coronal and sagittal images. What they have found is that there are two independent systematic controls during the oral constriction of the fricatives: (a) glottal opening and (b) linguopalatal constriction along the palate, concomitant with vertical laryngeal position. In both word-initial and word-medial positions, the maximum glottal opening was narrower in /s’/ than in /s/. The comparison of glottal opening in the fricatives to that of the Korean lenis (/p, t, ts, k/), aspirated (/p^h, t^h, ts^h, k^h/) and fortis (/p’, t’, ts’, k’/) stops in the context /_a_a/ (Kim et al., 2005, 2010a) has revealed that the fricatives have a narrower glottal opening than the aspirated stops /p^h, t^h, ts^h, k^h/ like the lenis and fortis stops both word-initially and word-medially. On the other hand, across the contexts, linguopalatal constriction is narrower in /s’/ than in /s/ at the point of maximal raising of the tongue apex while the maximal raising of the tongue blade is the same in the two fricatives; a narrow linguopalatal constriction with the tongue apex and blade is sustained longer in /s’/ than in /s/ during oral constriction; the distance between the maximally posterior point along the midsagittal tongue contour and the pharyngeal wall is longer in /s’/ than in /s/; and laryngeal height tends to be higher in /s’/ than in /s/ though vertical laryngeal positions of the two fricatives were observed to be sometimes the same in word-initial position in their female subject and also in word-medial position in their two subjects. When compared to the Korean stops, /s/ is similar to the lenis stops, and /s’/ to the fortis and aspirated stops in oral closure/constriction duration. As for the tongue apex/blade-laryngeal height coordination, Kim et al. (2011) have suggested that it is associated with the tensing of both the primary articulator (i.e. tongue apex/blade) and the vocal folds during the production of the fricatives and that this tensing is accounted for by the articulatory feature [±tense], as in the three-way phonation contrast in stops (Kim et al., 2005, 2010a). The feature [±tense] in Kim et al. (2010a, 2011) is newly modified from the traditional feature [±tense] in Jakobson et al. (1952) and C.-W. Kim (1965) according to whom the tension of the whole vocal tract is accounted for by the feature.³ Kim et al. (2011) have also proposed that the other independent parameter of glottal opening is accounted for by the articulatory feature [±spread glottis] (henceforth, [±s.g.]), as in the stops, in line with Halle and Stevens (1971) with [+s.g.] for a wide glottal opening and [–s.g.] for an opening that is not wide. Thus, /s/ is specified as [–s.g., –tense] like lenis stops, and /s’/ as [–s.g., +tense] like fortis stops, as in Table 1 (a).⁴

Table 1. The laryngeal specification of (a) the fricatives /s, s’/ in comparison to that of (b) the three-way phonation contrast in stops.

		(a)		(b)
		/s/	/s’/	/p t ts k/	/p^h t^h ts^h k^h/	/p’ t’ ts’ k’/
i.	[s.g.]	–	–	–	+	–
ii.	[tense]	–	+	–	+	+


In addition to the MRI studies (Kim et al., 2005, 2010a, 2011), Kim et al. (2018) have provided further experimental data in favor of the specification in Table 1 (b), using ePGG, P_io, airflow and acoustic data. From simultaneous recordings of the experimental data for the lenis /p, t, k/, aspirated /p^h, t^h, k^h/ and fortis /p’, t’, k’/ plosives, the following investigations were made: (a) the timing relations among glottal opening onset and peak, airflow onset and peak and aspiration onset in relation to acoustic events such as consonant release onset and a vowel onset; (b) how large a glottal opening peak, an airflow peak height and a P_io peak are; (c) how long it takes to reach a P_io plateau onset from a P_io onset; (d) how long a high P_io plateau is sustained, measured as the time interval between the onset and offset of the plateau; and (e) what acoustic conditions such as oral closure duration and F0 arise in accordance with the three-way phonation contrast. The investigations showed that the phasing of glottal opening and the three-way phonation contrast occurs in the order, from early to late, fortis (<) lenis < aspirated plosives, and that glottal opening peak ranges from low to high in the same order. It was also found that a P_io peak, the durations of a high P_io plateau and an oral closure, and F0 are independent of the glottal opening mechanism, varying in the order of lenis < aspirated, fortis plosives. The time to reach a P_io plateau onset also revealed that the aspirated and fortis plosives take a shorter time to reach the plateau onset than the lenis plosives. Based on the results, Kim et al. (2018) have proposed that the phonation-type-specific glottal opening is accounted for by the feature [±s.g.] and that the other phonation-type-specific pattern – a P_io peak, the durations of a high P_io plateau and an oral closure, F0 and the time to reach a P_io plateau onset – has to do with the tensing of the primary articulator and the vocal folds in the sense of Kim et al. (2010a, 2011), being accounted for by the feature [±tense], as in Table 1 (b).

The current paper seeks to do for the fricatives /s’, s/ what Kim et al. (2018) did for the plosives, namely to combine articulatory, aerodynamic, and acoustic data in a study of the phonetic characterization of the phonation contrast. Given that the Korean fricatives have been investigated either in articulation or in acoustics/aerodynamics in the literature, the simultaneous recordings should lead to a more accurate description of the fricatives. In addition, the comparison of simultaneous recordings of articulatory, aerodynamic and acoustic data for the fricatives to those for the lenis (/p, t, k/), aspirated (/p^h, t^h, k^h/) and fortis (/p’, t’, k’/) plosives in Kim et al. (2018) would strengthen our understanding of the speech mechanism of the Korean fricatives. Therefore, the simultaneous recordings in the present study should offer new insight into what acoustic and aerodynamic conditions co-occur when the two fricatives are articulatorily implemented and eventually into the phonetic characterization of the two-way phonation contrast in Korean fricatives.

1.2. The two-way phonation contrast in Korean fricatives: Hypotheses

Based on the findings in Kim et al. (2018), we make the following hypotheses for the fricatives /s’/ and /s/. Hypothesis 1 is that the glottis opens earlier and its peak also occurs earlier in /s’/ than in /s/ across the contexts, as in fortis vs. lenis plosives. Hypothesis 2 is that different from plosives, the two fricatives have no significant differences in glottal opening peak and airflow peak height both word-initially and word-medially, given that a large glottal opening and a high airflow peak are required in order to ensure sufficient rate of airflow for frication during the oral constriction of voiceless fricatives (e.g. Sawashima, 1969; Klatt et al., 1968; Hirose et al., 1978; Löfqvist, 1992; Lindqvist, 1972; Lisker et al., 1969; Löfqvist & Yoshioka 1984; Shadle, 2010). Hypothesis 3 is that the duration of aspiration is less context-dependent in /s’/ than in /s/, as in fortis/aspirated vs. lenis plosives. Hypothesis 4 is that the time to reach a P_io plateau onset from a P_io onset would be shorter in /s’/ than in /s/, as in fortis/aspirated vs. lenis plosives and that the time from the offset of a P_io plateau to airflow peak height would also be shorter in /s’/ both word-initially and word-medially. Hypothesis 5 is that the duration of a high P_io plateau, which is the aerodynamic counterpart of the acoustic frication duration during oral constriction of a fricative, is sustained significantly longer in /s’/ than in /s/, as in fortis/aspirated vs. lenis plosives. Hypothesis 6 is that there is no significant difference in P_io peaks between the fricatives, as among lenis, fortis and aspirated plosives, because P_io which is considered to be equal to the subglottal air pressure tends to be consistent, regardless of phonetic context (e.g. Netsell, 1969). 
Hypothesis 7 is that the two fricatives are distinguished by airflow resistance (R=P_io/U) at the onset and offset of a P_io plateau as well as at the time of airflow peak height, as in fortis/aspirated vs. lenis plosives.

The present study is structured as follows. Our experimental methods and results are presented in section 2 and section 3, respectively; and discussions of the results are in section 4 with a brief conclusion in section 5.

2. Methods

2.1. Stimuli, data acquisition and processing

As test words, the two fricatives /s’, s/ were put in the contexts /_a_a/, /_ama/ and /ma_a/, as in Table 2. Among the test words, /sama/ ‘(I will) buy’ and /s’ama/ ‘(I will) wrap’ are real words with the ending suffix /ma/ used usually when old people address young Koreans, and the rest are nonsense words.

Table 2. The test words in the contexts (a) /_a_a/ and (b) /_ama/ and /ma_a/.

(a)	/s’as’a/	/sasa/
(b)	/s’ama/ /mas’a/	/sama/ /masa/


Two male (M1, M2) and two female (F1, F2) native speakers of Seoul Korean participated in the present experiments, as in Kim et al. (2018). Three subjects (M1, M2, F1) were in their mid-twenties and had been living in Paris for less than six months, and the fourth (F2) was in her early fifties and visiting the city, when the present experiments were conducted.⁵ They read the test words in Table 2 and fillers embedded in the frame sentence /nɛka __ palɨmhapnita/ ‘I pronounce __’ five times at a normal speaking rate. The 120 tokens (6 test words × 5 repetitions × 4 subjects) were then analyzed.

The subjects were recorded in two sessions, with ePGG recorded in one, P_io in the other, and airflow and acoustic data in both.⁶ In one session where the first three authors participated, simultaneous recordings of ePGG, airflow and acoustic data were made in the soundproof recording room of the Laboratory of Phonetics and Phonology, CNRS/Sorbonne-Nouvelle (University of Paris 3). The adduction-abduction movement of the glottis during the production of the fricatives was monitored with the light source being infrared light emitting diodes (IR LEDs) placed on the exterior surface of the neck between the hyoid bone and the thyroid cartilage. Normally, two IR LEDs are placed on the sides of the larynx to illuminate the hypopharyngeal wall. This method was used for the subjects M1, M2 and F1, as shown in Figure 1 (a). However, this method could not be used for subject F2 because she had a thick layer of subcutaneous fat. Instead, the LED light was placed on the midline neck surface to illuminate the base of the epiglottis, which reduces the light transmission through the fatty tissue, as in Figure 1 (b).⁷ The light from the LED transmits through the neck tissues and illuminates the cavities above and inside the larynx. In turn, the light in the cavities propagates through the glottis into the lower cavity below the larynx. The illumination of the sub-laryngeal region varies in intensity as a function of the glottal opening/closing variations. The photodiode placed on the neck surface below the cricoid cartilage (see Figure 1) detects, across the lower neck tissues, the time-varying light intensity in the sub-laryngeal region, and outputs an electric current which varies with the glottal area.

Figure 1. External lighting and sensing photoglottography (ePGG) system with a high-power light emitting diode (LED) on (a) the sides and (b) the front of the neck of a subject between the hyoid bone and the thyroid cartilage. Glottal transmission is detected by a photodiode placed on the midline of the neck below the cricoid cartilage.


The ePGG system, which is composed of infrared LED(s) and a photodiode, forms a photo-interrupter circuit with ambient light rejection. The IR LED (810 nm, IF=100 mA) was driven by square current pulses at 16 kHz, and the incident light current on the photodiode was amplified and integrated to output ePGG signals. This ePGG technique is noninvasive and can be used with any speech material including the back vowel /a/ and voiced/voiceless consonants. Another advantage of the ePGG method is that it is possible to obtain multichannel (i.e. ePGG, airflow and acoustic) data simultaneously on natural speech from as many subjects as possible.⁸

Airflow rate was measured by the principle of pressure-difference anemometry using a dust-protection mask made of synthetic fibers and a differential pressure sensor. The mask’s airflow resistance was calibrated by airflow from a one-liter air syringe passing through the mask. Air pressure inside the mask was measured by a differential type of pressure sensor, PA-100-100D-W (Copal Electronics, Japan), having the range of ±10 hPa, to calculate the airflow rate in the unit of ml/sec. In addition, in order to prevent air leakage from the gap on the side of the nasal bridge when a subject’s jaw moves, adhesive tape was used. In the other recording session where all the authors took part, simultaneous recordings of P_io, airflow and acoustic data were made in the European Hospital of George Pompidou (HEGP), Paris. P_io was measured by inserting a pressure probe into the pharyngeal cavity via the nostril with the help of our fourth author who has been a medical doctor in the hospital. The distance between the end of the probe and the glottis was around 4-5 cm. As shown in Figure 2, we used a dust-protection mask having two silicon tubes (one for air pressure and the other for airflow) connected to solid-state pressure transducers (PA-100, Copal Electronics, Japan).

Figure 2. A method for recording P_io above the glottis and oral airflow with a dust-protection mask made of synthetic fibers.


One of the transducers is highly sensitive; it, together with the mask, constitutes a Lilly type pneumotachograph. In our system, the paper-like tissue of the mask acts as an airflow resistance. Thus, when expiratory airflow passes through the mask, a small air-pressure rise occurs inside the mask due to the resistance. Conversely, inspiratory airflow causes a small drop in the air pressure. The differential transducer measures these pressure variations inside the mask relative to the outside atmospheric pressure. Our experimental study showed that the airflow resistance of individual masks is constant regardless of the practical airflow levels. However, mask-to-mask variations of the resistance are rather large, even within the same lot. Therefore, we need to calibrate each mask for each subject in order to obtain a conversion scaling coefficient from voltage to, for example, liters/sec, in post-processing. We used a one-liter calibration syringe (Cardinal Health, Germany) to determine the value of the scaling coefficient of each mask before the experiment. Calibration of P_io is not necessary because the pressure sensor (Copal PA100-500D-W with the range ±50 hPa) is pre-calibrated, and the maker provides the scaling coefficient to convert the output voltage into hPa. Air pressure was recorded simultaneously with airflow and acoustic data for the fricatives in Table 2, using the same frame sentence, as in our previous data acquisition. The output signals in voltage from the transducer were converted into hPa in post-processing.
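The syringe calibration described above can be illustrated with a minimal sketch. It assumes, as the text states, that the mask resistance is constant, so that airflow is simply a coefficient k times the transducer voltage, and that the voltage trace covers exactly one full stroke of the one-liter syringe. The function names and the synthetic voltage trace are hypothetical and are not taken from the authors' processing scripts.

```python
def scaling_coefficient(voltage, fs, volume_l=1.0):
    """Return k (liters/sec per volt) such that flow = k * voltage.

    Assumes a linear mask: the flow integrated over one syringe
    stroke must equal the known syringe volume.
    """
    dt = 1.0 / fs
    # Rectangle-rule integral of the voltage trace (volt * sec).
    integral = sum(voltage) * dt
    if integral == 0:
        raise ValueError("voltage trace integrates to zero")
    return volume_l / integral


def to_flow(voltage, k):
    """Convert a voltage trace to airflow in liters/sec."""
    return [k * v for v in voltage]


# Hypothetical stroke: a constant 0.5 V for 2 s, sampled at 20 kHz.
fs = 20000
stroke = [0.5] * (2 * fs)
k = scaling_coefficient(stroke, fs)
flow = to_flow(stroke, k)
```

Integrating the converted flow over the stroke recovers the syringe volume, which is a convenient sanity check on the coefficient for each mask.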

All the signals are digitally recorded at the same sampling frequency of 20 kHz with 16-bit resolution using a multichannel data recorder (Dash-8x, Astro-Med). When different sampling frequencies are assigned for different channels, the data file transfer becomes impractically slow. Because of this, we used the same sampling frequency, even though the effective bandwidth of those signals greatly varies from the narrowest, DC to 60 Hz for airflow and P_io signals to the largest, 20 Hz to 7.5 kHz for the audio signal. Table 3 summarizes the low-pass filter types, cutoff frequencies (f_c) and order of analog and digital low-pass filters used in the present study.

Table 3. The low-pass filter types, cutoff frequencies (f_c) and order of analog and digital low-pass filters as a function of signal types.

	Analog filters (hardware)			Digital filters (software)
Signal type	Type	f_c (kHz)	Order	Type	f_c (Hz)	Order
Audio	Elliptic	7.4	8	–	–	–
ePGG	Butterworth	5.3	8	–	–	–
Airflow	Butterworth	2.2	8	Bessel	60	4
P_io	Butterworth	2.2	8	Bessel	60	4


The analog filters play the role of anti-aliasing for the digitization of signals and of bandwidth limitation to improve the signal-to-noise ratio. Here we must consider the time-delay of the analog low-pass filter, which would become significant, on the order of 10 ms, if the cutoff frequency were set as low as 60 Hz for processing airflow and P_io signals. Because the time-delay of the low-pass filter for the audio signal, with the cutoff frequency of 7.4 kHz, is negligibly small, the airflow and P_io signals would then be delayed by about 10 ms relative to the audio signal. We, therefore, set the analog cutoff frequency for these signals to 2.2 kHz, at which the time-delay becomes roughly 0.5 ms and can be safely neglected in our studies. Because temporal variations of these signals are related to transitions from phoneme to phoneme and are therefore slow, they can then be filtered digitally with the low cutoff frequency of 60 Hz, as indicated in Table 3. The digital filters are operated in zero-phase mode by processing the digitized signal in the forward and then in the backward direction in time. As a result, we have 60 Hz low-passed output signals without any additional time-delay. For the ePGG signals, we used the analog filter with DC to 5.3 kHz bandwidth to cover the syllable-related slow adduction/abduction of the glottis and fast glottal oscillation during voicing, in which case the filter time-delay is not an issue.
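The effect of zero-phase (forward then backward) filtering can be sketched as follows. To keep the example free of external libraries, a simple first-order low-pass is swapped in for the fourth-order Bessel filter of Table 3, so the sketch illustrates only the forward-backward principle, not the filter actually used. The function names and the triangular test pulse are hypothetical.

```python
import math


def lowpass_once(x, fc, fs):
    """Causal first-order low-pass (a stand-in for the Bessel filter)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * fc / fs)
    y, state = [], x[0]
    for sample in x:
        state += alpha * (sample - state)
        y.append(state)
    return y


def lowpass_zero_phase(x, fc, fs):
    """Filter forward, then backward, so that the phase delays cancel."""
    forward = lowpass_once(x, fc, fs)
    backward = lowpass_once(forward[::-1], fc, fs)
    return backward[::-1]


fs = 20000          # sampling frequency (Hz), as in the recordings
n = 4000            # 200 ms of signal
centre = n // 2
half_width = 500    # hypothetical pulse: 50 ms wide, symmetric

# Symmetric triangular pulse standing in for a slow airflow or P_io event.
pulse = [max(0.0, 1.0 - abs(i - centre) / half_width) for i in range(n)]

causal = lowpass_once(pulse, 60, fs)
zero_phase = lowpass_zero_phase(pulse, 60, fs)

peak_causal = causal.index(max(causal))        # shifted later in time
peak_zero = zero_phase.index(max(zero_phase))  # stays at the pulse centre
```

In the single-pass output the peak is displaced after the centre of the pulse, whereas the forward-backward output keeps it aligned with the input, which is the property that matters when airflow and P_io events are timed against the audio signal.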

2.2. Data measurements and calculations

Figure 3 shows data measured on one male subject (M1), displaying, as a typical example, the wide-band spectrogram (top panel) of the acoustic signal, and the signal itself (second from top); and the ePGG and airflow signals (third and fourth from top) for the second repetition of the word /sasa/. In the ePGG signal, a positive slope of the curve indicates an opening movement of the vocal folds during the oral constriction of /s/. The unit of mV in our ePGG data is an arbitrary one in that the data are not calibrated or normalized. Calibrating ePGG data is impossible, due to the fact that the morphology of the larynx and its vicinity is geometrically complex and varies greatly from one speaker to another. In addition, we have found that the properties of the tissue, which determine how light is transmitted, vary greatly depending on the subject. However, we think that the averaging of ePGG values across repetitions of the same token in the same recording session in each subject should be perfectly legitimate, though uncalibrated ePGG signals might impose a certain limitation on the quantitative data analysis.

Figure 3. The time measurements (a) at a glottal opening onset, (b) at a glottal opening peak, (c) at an airflow peak after a glottal opening peak and (d) at a following vowel onset and the measurement points of ePGG values at (a) and (b) (marked by the arrows in the third panel, respectively) and of an airflow peak at (c) (marked by the arrow in the bottom panel) during the articulation of /s/ and the vowel /a/ in the first syllable of the test word /sasa/ in the second repetition of subject M1. A dotted line is used in the bottom panel to make the airflow peak at (c) visible.


Note also that glottal opening starts in a vowel preceding the word-initial fricative /s/ in order to ensure sufficient rate of flow for frication, as reported in the literature (e.g. Klatt et al., 1968; Sawashima, 1969; Löfqvist, 1992; Löfqvist et al., 1995). To make our measurements consistent, however, we considered glottal opening occurring at the offset of a preceding vowel as its onset, as indicated by the first arrow in the third panel from top in Figure 3, with reference to a corresponding wide-band spectrogram in the top panel where the frication of the fricative starts with noise energy above 4 kHz as an alveolar fricative (e.g. Fant, 1960; Kent & Read, 2002) as well as to the signal itself (the second panel from top).

In order to investigate which fricative has the earlier glottal opening onset and peak and the longer duration of aspiration in both word-initial and word-medial positions, we referred to the time at which airflow reaches its peak as a main reference in that the frication offset of a fricative occurs at airflow peak. Given this, four temporal reference points were measured using Matlab, as marked by circles in the bottom panel in Figure 3: the time (a) at which a glottal opening onset occurs (t_GOO), (b) at which a glottal opening peak occurs (t_GOP), (c) at which airflow reaches its peak (t_FWP) and (d) at which a following vowel starts (t_VO).

Time intervals were then calculated between airflow peak and glottal opening onset, between airflow peak and glottal opening peak and between a following vowel onset and airflow peak for the three variables T_GOO, T_GOP and T_ASP for aspiration after the fricatives, respectively, as listed in Table 4.

Table 4. The calculations for time intervals (a) between airflow peak and glottal opening onset (T_GOO), (b) between airflow peak and glottal opening peak (T_GOP) and (c) between vowel onset and airflow peak (T_ASP).

(a) T_GOO = time at airflow peak (t_FWP) – time at glottal opening onset (t_GOO)

(b) T_GOP = time at airflow peak (t_FWP) – time at glottal opening peak (t_GOP)

(c) T_ASP = time at vowel onset (t_VO) – time at airflow peak (t_FWP)
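As an illustration of these interval calculations, the following minimal sketch derives T_GOO, T_GOP and T_ASP from the four time points. The class and function names are hypothetical, the conversion to milliseconds is our own choice, and the time values are invented for the example rather than taken from the recordings.

```python
from dataclasses import dataclass


@dataclass
class TimePoints:
    t_goo: float  # glottal opening onset (sec)
    t_gop: float  # glottal opening peak (sec)
    t_fwp: float  # airflow peak (sec)
    t_vo: float   # onset of the following vowel (sec)


def intervals_ms(tp):
    """Return T_GOO, T_GOP and T_ASP in milliseconds."""
    return {
        "T_GOO": (tp.t_fwp - tp.t_goo) * 1000.0,
        "T_GOP": (tp.t_fwp - tp.t_gop) * 1000.0,
        "T_ASP": (tp.t_vo - tp.t_fwp) * 1000.0,
    }


# Hypothetical token; these numbers are for illustration only.
token = TimePoints(t_goo=0.100, t_gop=0.170, t_fwp=0.230, t_vo=0.250)
result = intervals_ms(token)
```

Because airflow peak is the common reference, a larger T_GOO or T_GOP means that the glottal event occurs earlier relative to the frication offset.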


Across all four subjects, the duration of aspiration as a transition from the offset of the frication to the onset of a following vowel was identified by noise covering a broad range of frequencies with relatively weak energy (e.g. Fant, 1960; Kent & Read, 2002) from the time point (c) to (d), compared to the frication phase of the fricative /s/ with noise energy above 4 kHz from the time point (a) to (c).

ePGG values were also measured at the time of glottal opening onset (t_GOO) and at the time of glottal opening peak (t_GOP), as marked by the first and second arrows, respectively, in the third panel in Figure 3, and then the former was subtracted from the latter in order to examine how high the peak of glottal opening gets during the production of the fricatives relative to glottal opening at fricative onset (GOP), as in Table 5 (a). Airflow was also measured at the time of airflow peak (t_FWP) for airflow peak height (FWP), as in Table 5 (b).

Table 5. The calculations for (a) glottal opening peak (GOP) and (b) airflow peak height (FWP) after the peak.

(a) GOP = value of ePGG at t_GOP – value of ePGG at t_GOO

(b) FWP = value of airflow at t_FWP
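The two value measurements can be sketched in the same way. This is a minimal illustration that reads each signal at the sample nearest to a measured time point; the helper names are hypothetical, and the toy signals use an artificially low sampling rate so that the indices are easy to follow.

```python
def sample_at(signal, t, fs):
    """Return the sample nearest to time t (in seconds)."""
    return signal[round(t * fs)]


def glottal_opening_peak(epgg, t_goo, t_gop, fs):
    """GOP = ePGG value at t_GOP minus ePGG value at t_GOO."""
    return sample_at(epgg, t_gop, fs) - sample_at(epgg, t_goo, fs)


def airflow_peak_height(airflow, t_fwp, fs):
    """FWP = airflow value at t_FWP."""
    return sample_at(airflow, t_fwp, fs)


# Toy signals at a hypothetical 10 Hz rate; the recordings described
# in the text were sampled at 20 kHz.
fs = 10
epgg = [2.0, 2.0, 3.0, 5.0, 4.0, 2.5]               # arbitrary units
airflow = [0.0, 100.0, 300.0, 500.0, 800.0, 600.0]  # ml/sec

gop = glottal_opening_peak(epgg, t_goo=0.1, t_gop=0.3, fs=fs)
fwp = airflow_peak_height(airflow, t_fwp=0.4, fs=fs)
```

Taking GOP as a difference from the onset value removes any constant offset in the uncalibrated ePGG signal, but the result remains in arbitrary units and is comparable only within one subject and session.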


Figure 4, which is taken from the same male subject (M1), displays, as a typical example, the simultaneous recording of the acoustic, P_io and airflow data for his fourth repetition of /sasa/ in the second, third and fourth panel from top, respectively, as a function of time. Its accompanying wide-band spectrogram of /sasa/ is placed above the waveform in the top panel.

Figure 4. The wide-band spectrogram (top panel) of the acoustic signal, and the signal itself (second from top); and P_io and airflow signals (third and fourth from top) for the fourth repetition of the word /sasa/ by subject M1. The measurement points of P_io values at a P_io onset, at the onset of a P_io plateau, at the offset of a P_io plateau, and at the peak of airflow at time points (a), (b), (c) and (d), respectively, in the third panel from top, and those of airflow at time points (b), (c) and (d) in the bottom panel during the articulation of the word-initial /s/.


From the simultaneous recording, P_io values were measured at a P_io onset, at the onset and offset of a high P_io plateau and at the time of airflow peak height, as marked by the four arrows, respectively, in the third panel. Airflow was measured at the onset and offset of a high P_io plateau, as marked by the first two arrows in the bottom panel, respectively, and its peak height was measured after the offset of a high P_io plateau, as marked by the third arrow in the same panel. Time measurements (a) at a P_io onset (t_PON), (b) at the onset (t_PPON) and (c) the offset (t_PPOFF) of a P_io plateau and (d) at airflow peak (t_FWP) were also made, as marked by the four circles in the bottom panel, respectively.

Based on the measurements, then, we calculated the time interval between a P_io plateau onset and a P_io onset (T_PP) and the time interval between airflow peak and the offset of a P_io plateau (T_PPFW), as in Table 6 (a) and (b), respectively, in order to examine transition time before the onset and after the offset of a P_io plateau. We also measured the duration of a P_io plateau (T_PPD), deducting the time of the P_io plateau onset from that of the offset of a P_io plateau, as in Table 6 (c).

Table 6. The calculations for time intervals (a) between a P_io plateau onset and a P_io onset (T_PP) and (b) between airflow peak and the offset of a P_io plateau (T_PPFW), (c) duration of a P_io plateau (T_PPD), (d) P_io peak (P_P) and (e) airflow resistance (i) at the onset of a P_io plateau (R_PPON), (ii) at the offset of a P_io plateau (R_PPOFF) and (iii) at airflow peak (R_FWP).

(a) T_PP = time at a P_io plateau onset (t_PPON) – time at a P_io onset (t_PON)

(b) T_PPFW = time at airflow peak (t_FWP) – time at a P_io plateau offset (t_PPOFF)

(c) T_PPD = time at a P_io plateau offset (t_PPOFF) – time at a P_io plateau onset (t_PPON)

(d) P_P = the highest P_io (P_H) – P_io at a P_io onset (t_PON)

(e) i. airflow resistance at a P_io plateau onset (R_PPON)

= (P_io at t_PPON – P_io at t_PON) / airflow at t_PPON

ii. airflow resistance at a P_io plateau offset (R_PPOFF)

= (P_io at t_PPOFF – P_io at t_PON) / airflow at t_PPOFF

iii. airflow resistance at airflow peak (R_FWP)

= (P_io at t_FWP – P_io at t_PON) / airflow at t_FWP


In addition, a P_io peak (P_P) was measured, deducting a P_io at a P_io onset from the highest P_io during a P_io plateau, as in Table 6 (d). Airflow resistance (R=P_io/U) was calculated at the onset and offset of a P_io plateau and at the time of airflow peak height, as in Table 6 (e).⁹
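The plateau-related calculations can be illustrated with a minimal sketch. It assumes that airflow resistance is the P_io rise above its onset value divided by the airflow at the same time point, which is our reading of Table 6 (e). The function names, units and numerical values are hypothetical.

```python
def plateau_intervals(t_pon, t_ppon, t_ppoff, t_fwp):
    """Return T_PP, T_PPD and T_PPFW (same time unit as the inputs)."""
    return {
        "T_PP": t_ppon - t_pon,      # P_io onset to plateau onset
        "T_PPD": t_ppoff - t_ppon,   # plateau duration
        "T_PPFW": t_fwp - t_ppoff,   # plateau offset to airflow peak
    }


def pio_peak(p_highest, p_onset):
    """P_P = highest P_io during the plateau minus P_io at its onset."""
    return p_highest - p_onset


def airflow_resistance(p_io, p_onset, airflow):
    """R = (P_io - P_io at onset) / U, evaluated at one time point."""
    if airflow == 0:
        raise ValueError("airflow is zero; resistance is undefined")
    return (p_io - p_onset) / airflow


# Hypothetical measurements: times in sec, P_io in hPa, airflow in liters/sec.
times = plateau_intervals(t_pon=0.10, t_ppon=0.13, t_ppoff=0.22, t_fwp=0.25)
p_p = pio_peak(p_highest=9.0, p_onset=0.5)
r_ppon = airflow_resistance(p_io=8.5, p_onset=0.5, airflow=0.2)
```

Guarding against zero airflow matters in practice, since a value of U near zero would make the resistance estimate unbounded.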

Acoustic data recorded together with P_io and airflow data were separately stored digitally in WAV format and analyzed using Praat (Boersma, 2001; Boersma & Weenink 2019). From the acoustic data, F0 was measured at the onset of a vowel following the fricatives in both word-initial and word-medial positions.

The experimental data (i.e. ePGG, P_io, airflow and acoustic data) were measured manually by our first author after we sufficiently discussed how to measure the data over several years, and the measurement was cross-checked/verified by the others, especially by our second author who had processed all the recorded data in Matlab for measurements. For the statistical analysis of our measurements, we conducted repeated measures ANOVAs with two between subject factors (laryngeal category (i.e. /s’/ vs. /s/) and word context (i.e. /_a_a/ vs. /_ama/, /ma_a/)) and one within subject factor (word position (i.e. word-initial vs. word-medial position)). Tukey post hoc comparisons were made for the comparisons of word-initial and word-medial /s’/ vs. /s/ and word-initial vs. word-medial /s’/ and /s/.

3. Results

This section is divided into two main subsections: one for ePGG, airflow and acoustic data, and the other for the duration of a P_io plateau, P_io peak, airflow resistance and F0.

3.1. ePGG, airflow and acoustic data

Among the hypotheses made in the Introduction section, Hypothesis 1 was that the glottis opens earlier and that its peak also occurs earlier in /s’/ than in /s/ across the contexts, as in fortis vs. lenis plosives (Kim et al., 2018). The hypothesis is supported, as in Figure 5 where both glottal opening onset and glottal opening peak in (a) and (b), respectively, start earlier relative to the airflow peak (marked as time “0” on the vertical axis) in /s’/ than in /s/, in all four subjects, both word-initially and word-medially.¹⁰

Figure 5. The average time intervals (a) between airflow peak and glottal opening onset (T_GOO) and (b) between airflow peak and glottal opening peak (T_GOP) in /s’/ and /s/ for all four subjects word-initially (WI) and word-medially (WM) with error bars as one standard deviation.


Statistical results showed that word context has no significant main effect on the time of glottal opening onset (T_GOO) [F(1, 12)=0.0826, p=0.779; ɳ²_p=0.007] and that the interaction of word context and laryngeal category is not significant, either [F(1, 12)=0.115, p=0.741; ɳ²_p=0.009].¹¹ Yet, there is a significant main effect of laryngeal category (i.e. /s’/ vs. /s/) [F(1, 12)=21.726, p<0.001; ɳ²_p=0.644], and the interaction of word position and laryngeal category is also significant [F(1, 12)=124.9782, p<0.001; ɳ²_p=0.912]. In post hoc comparisons, the difference in the time of glottal opening onset (T_GOO) is not significant in word-initial /s’/ vs. /s/ [t= –1.98, p=0.243], but significant in word-medial /s’/ vs. /s/ [t= –7.09, p<0.001]. It is also significant in word-initial vs. word-medial /s’/ [t= –10.93, p<0.001] and in word-initial vs. word-medial /s/ [t=4.88, p=0.002]. The time of glottal opening peak (T_GOP) is not affected by word context [F(1, 12)=0.775, p=0.396; ɳ²_p=0.061], and the interaction of word context and laryngeal category is not significant, either [F(1, 12)=0.152, p=0.704; ɳ²_p=0.012]. However, it is significantly affected by laryngeal category [F(1, 12)=51.194, p<0.001; ɳ²_p=0.810], and the interaction of word position and laryngeal category is also significant [F(1, 12)=7.4216, p=0.018; ɳ²_p=0.382]. In post hoc comparisons, the difference between word-initial /s’/ vs. /s/ is significant [t= –5.12413, p<0.001] and also between word-medial /s’/ vs. /s/ [t= –7.60987, p<0.001]. The difference between word-initial vs. word-medial /s’/ is significant [t= –3.86156, p=0.011], and that between word-initial vs. word-medial /s/ is not [t= –0.00887, p=1.000].
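The partial ɳ² values reported in this section can be checked against the accompanying F statistics, because partial eta squared is determined by F and its two degrees of freedom: ɳ²_p = F·df_effect / (F·df_effect + df_error). The short Python sketch below applies this standard identity to two of the values given for T_GOO; the function name is hypothetical, and the sketch is a consistency check rather than the procedure used in the study.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom.

    Uses eta_p^2 = SS_effect / (SS_effect + SS_error), rewritten in terms of F.
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of laryngeal category on T_GOO: F(1, 12)=21.726
print(round(partial_eta_squared(21.726, 1, 12), 3))    # 0.644

# Word position x laryngeal category interaction on T_GOO: F(1, 12)=124.9782
print(round(partial_eta_squared(124.9782, 1, 12), 3))  # 0.912
```

Both results match the effect sizes reported in the text.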

Figure 6 illustrates the average (a) glottal opening peak (GOP) of /s’/ and /s/ for the subjects M1, M2 and F1(i) and for the subject F2 (ii); and (b) airflow peak height (FWP) and (c) duration of aspiration (T_ASP) for all four subjects both word-initially and word-medially. Note that the scales of glottal opening peak in Figure 6 (a i) and (a ii) are different, because ePGG data were obtained for the subjects M1, M2 and F1 with two light sources on the sides of the neck, as in Figure 1 (a), and for the subject F2 with one light source at the front of the neck, as in Figure 1 (b). Therefore, a vertical scale goes up to 0.08 mV vs. 13 mV. Given the ePGG data in the present study provide information on relative glottal openings of the examined consonants, what is important is the systematic pattern of glottal opening peak among the four subjects, not the differences in the scales.

Figure 6. The average glottal opening peak (GOP) of /s’/ and /s/ for the subjects M1, M2 and F1 (a i) and for the subject F2 (a ii); and (b) airflow peak height (FWP) after glottal opening peak and (c) duration of aspiration (T_ASP) for all four subjects word-initially (WI) and word-medially (WM) with error bars as one standard deviation.


Though the peak of glottal opening is lower in /s’/ than in /s/ both word-initially and word-medially in all the subjects (Figure 6a), there is no significant main effect of word context on glottal opening peak for the subjects M1, M2 and F1 [F(1, 8)=1.021, p=0.342; ɳ²_p=0.113], and the interaction of word context and laryngeal category is not significant either [F(1, 8)=0.182, p=0.681; ɳ²_p= 0.022].¹²

In addition, laryngeal category has no significant main effect [F(1, 8)=0.504, p=0.498; ɳ²_p=0.059], and the interaction of word position and laryngeal category is not significant, either [F(1, 8)=0.4826, p=0.507; ɳ²_p=0.057]. In post hoc comparisons, /s’/ and /s/ are not significantly different either word-initially [t=0.9922, p=0.756] or word-medially [t=0.0635, p=1.000]. The difference between word-initial vs. word-medial /s’/ is not significant, either, in the three subjects [t= –0.1838, p=0.998], and the same is true of that between word-initial vs. word-medial /s/ [t=0.7986, p=0.853]. Airflow peak height is also lower in /s’/ than in /s/ in both word-initial and word-medial positions for all four subjects, as shown in Figure 6 (b). However, word context has no significant main effect [F(1, 12)=6.61, p=0.980; ɳ²_p=0.000], and the interaction of word context and laryngeal category is not significant, either [F(1, 12)=0.00298, p=0.957; ɳ²_p=0.000]. Laryngeal category has no significant main effect [F(1, 12)=1.97465, p=0.185; ɳ²_p=0.141], and the interaction of word position and laryngeal category is not significant, either [F(1, 12)=2.457, p=0.143; ɳ²_p=0.170]. In post hoc comparisons, /s’/ and /s/ are not significantly different, either in word-initial position [t=1.912, p=0.261] or in word-medial position [t=0.668, p=0.908]. The difference between word-initial vs. word-medial /s’/ is not significant, either [t=0.190, p=0.997], and the same is true of that between word-initial vs. word-medial /s/ [t = 2.407, p=0.129]. The statistical results support Hypothesis 2 that different from plosives (Kim et al., 2018), the two fricatives have no significant differences in glottal opening peak and airflow peak height both word-initially and word-medially.

Figure 6 (c) presents the average duration of aspiration (T_ASP) which is longer after /s/ than after /s’/ for all four subjects both word-initially and word-medially. There is no significant main effect of word context on the duration of aspiration [F(1, 12)=0.00214, p=0.964; ɳ²_p=0.000], and the interaction of word context and laryngeal category is not significant either [F(1, 12)=0.11912, p=0.736; ɳ²_p=0.010]. Yet, laryngeal category has a significant main effect [F(1, 12)=21.12575, p<0.001; ɳ²_p=0.638], and the interaction of word position and laryngeal category is also significant [F(1, 12)=11.085, p=0.006; ɳ²_p=0.480]. In post hoc comparisons, the two fricatives are significantly different word-initially [t=5.672, p<0.001], but not word-medially [t=1.953, p=0.237]. In addition, the duration of aspiration is significantly reduced in word-medial /s/ [t=5.672, p<0.001], compared to that in word-initial /s/, whereas the difference between word-initial vs. word-medial /s’/ is not significant [t= –0.132, p=0.999]. This supports Hypothesis 3 that the duration of aspiration is less context-dependent in /s’/ than in /s/, as in fortis/aspirated vs. lenis plosives (Kim et al., 2018).

3.2. Duration of a P_io plateau, P_io peak, airflow resistance and F0

Figure 7 presents the alignment of the P_io and airflow data for the word-initial /s’/ and /s/ at airflow peak height after a P_io plateau in the context /_a_a/ in the fourth repetition of the subject M1. Accompanying wide-band spectrograms are aligned at the onset of a vowel following the word-initial fricatives. From the alignment of the P_io and airflow data, it is noteworthy that the slope from a P_io onset to the onset of a P_io plateau is almost overlapped in the two word-initial fricatives and that after the offset of a P_io plateau, P_io falls more sharply in /s’/ than in /s/. In addition, after the offset of a P_io plateau, airflow peak height is lower in /s’/ than in /s/ across the contexts. This gives rise to the earlier onset of a following vowel after a brief aspiration noise in /s’/ than in /s/.

Figure 7. The P_io and airflow data for /s’/ and /s/ in /_a_a/ are aligned at airflow peak after a P_io plateau in the word-initial fricatives in the fourth repetition of the subject M1.


The average time to reach the onset of a P_io plateau from a P_io onset (T_PP) and to reach airflow peak height from the offset of the P_io plateau (T_PPFW) in /s’/ and /s/ is presented in Figure 8 (a) and (b), respectively. The figure shows that it is not before the onset but after the offset of a P_io plateau that the transition time is always shorter in /s’/ than in /s/ both word-initially and word-medially in all four subjects.
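The two transition times are simple differences between the landmark times named in Table 6. The following Python sketch shows the arithmetic under the assumption that the four landmarks occur in the order t_PON, t_PPON, t_PPOFF, t_FWP and are expressed in milliseconds; the function name and the example times are hypothetical and are not measurements from the study.

```python
def transition_times(t_pon: int, t_ppon: int, t_ppoff: int, t_fwp: int) -> dict:
    """Compute T_PP and T_PPFW from landmark times (assumed to be in ms).

    T_PP   = time from the P_io onset to the onset of the P_io plateau
    T_PPFW = time from the offset of the P_io plateau to the airflow peak
    """
    # Assumed ordering of landmarks within one fricative token.
    if not (t_pon <= t_ppon <= t_ppoff <= t_fwp):
        raise ValueError("landmarks must satisfy t_PON <= t_PPON <= t_PPOFF <= t_FWP")
    return {
        "T_PP": t_ppon - t_pon,
        "T_PPFW": t_fwp - t_ppoff,
    }


# Illustrative landmark times only (ms from an arbitrary reference point).
print(transition_times(t_pon=0, t_ppon=40, t_ppoff=140, t_fwp=170))
# {'T_PP': 40, 'T_PPFW': 30}
```

Keeping the two intervals separate reflects the comparison made in the text, where the fricatives are contrasted before the plateau onset and after the plateau offset.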

Figure 8. The average transition time (a) to reach the onset of a P_io plateau from a P_io onset (T_PP) and (b) to reach airflow peak height from the offset of the P_io plateau (T_PPFW) in /s’/ and /s/ for all four subjects word-initially (WI) and word-medially (WM) with error bars as one standard deviation.


For the time from a P_io onset to the onset of a P_io plateau (T_PP), laryngeal category has no significant main effect [F(1, 12)=1.51, p=0.242; ɳ²_p=0.112], and the interaction of word position and laryngeal category is not significant, either [F(1, 12)=1.52, p=0.242; ɳ²_p=0.112]. In post hoc comparisons, the two fricatives are not significantly different either word-initially [t=1.74, p=0.326] or word-medially [t= –7.98e-4, p=1]. The difference in this time is significant between word-initial vs. word-medial /s’/ [t=11.55, p<0.001] and also between word-initial vs. word-medial /s/ [t=9.81, p<0.001].

As for the time from the offset of a P_io plateau to airflow peak height, it is not significantly affected by word context [F(1, 12)=0.566, p=0.466; ɳ²_p=0.045], and the interaction of word context and laryngeal category is not significant either [F(1, 12)=0.00773, p=0.931; ɳ²_p=0.001]. However, it is significantly affected by laryngeal category [F(1, 12)=19.81306, p<0.001; ɳ²_p=0.623], and the interaction of word position and laryngeal category is also significant [F(1, 12)=5.880, p=0.032; ɳ²_p=0.329]. In post hoc comparisons, the two fricatives are significantly different word-initially [t=5.064, p<0.001], but not word-medially [t=2.559, p=0.081]. The difference between word-initial vs. word-medial /s’/ is not significant [t= –0.793, p=0.856], and that between word-initial vs. word-medial /s/ is not either [t=2.636, p=0.088]. Given that it is after the offset of a P_io plateau that transition time is significantly different between /s’/ and /s/, Hypothesis 4 is partially supported.

Figure 9 (a) shows that the average duration of a P_io plateau (T_PPD) during oral constriction is much longer in /s’/ than in /s/ at the examined contexts in all four subjects.

Figure 9. The average (a) duration of a high P_io plateau (T_PPD) and (b) P_io peak (P_P) of /s’/ and /s/ for all four subjects word-initially (WI) and word-medially (WM) with error bars as one standard deviation.


The duration of a P_io plateau is significantly affected by word context [F(1, 12)=7.871, p=0.016; ɳ²_p=0.396], but the interaction of word context and laryngeal category is not significant [F(1, 12)=0.864, p=0.371; ɳ²_p=0.067]. Laryngeal category has a significant main effect [F(1, 12)=40.224, p<0.001; ɳ²_p=0.770], and the interaction of word position and laryngeal category is also significant [F(1, 12)=16.008, p=0.002; ɳ²_p=0.572]. In post hoc comparisons, the difference between /s’/ and /s/ is significant, both word-initially [t= –3.9066, p=0.005] and word-medially [t= –7.4593, p<0.001]. The comparison of word-initial vs. word-medial /s/ is significant [t= –5.7322, p<0.001], but that of word-initial vs. word-medial /s’/ is not [t= –0.0739, p=1.000]. The statistical results support Hypothesis 5 that the duration of a high P_io plateau, which is the aerodynamic counterpart of the acoustic frication duration during oral constriction of a fricative, is significantly longer in /s’/ than in /s/ both word-initially and word-medially, as in fortis/aspirated vs. lenis plosives (Kim et al., 2018).

Figure 9 (b) illustrates that the average P_io peak values (P_P) measured as in Table 6 (d) tend to be higher in /s’/ than in /s/ in all four subjects across the contexts.¹³ Word context has no significant main effect [F(1, 12)=0.17823, p=0.680; ɳ²_p=0.015], and the interaction of word context and laryngeal category is not significant either [F(1, 12)=0.00460, p=0.947; ɳ²_p=0.000]. In addition, there is no significant main effect of laryngeal category on P_io peak [F(1, 12)=0.05948, p=0.811; ɳ²_p=0.005], and the interaction of word position and laryngeal category is not significant, either [F(1, 12)=0.0231, p=0.882; ɳ²_p=0.002]. In post hoc comparisons, the two fricatives are not significantly different in P_io peak either word-initially [t= –0.224, p=0.996] or word-medially [t= –0.260, p= 0.994]. The same is true of P_io peak between word-initial vs. word-medial /s’/ [t= –0.423, p=0.973] and also between word-initial vs. word-medial /s/ [t= –0.208, p=0.997]. This supports Hypothesis 6 that there is no significant difference in P_io peaks between the fricatives, as among lenis, fortis and aspirated plosives (Kim et al., 2018), because P_io which is considered to be equal to the subglottal air pressure tends to be consistent, regardless of phonetic context (e.g. Netsell, 1969).

Figure 10 shows that the average airflow resistance at the onset and offset of a P_io plateau and at the time of airflow peak height is higher in /s’/ than in /s/ both word-initially and word-medially (see Table 6 (e) for our calculations). Airflow resistance is not affected by word context at each examined point [F(1, 12)=0.2955, p=0.597; ɳ²_p=0.024 at the onset of a P_io plateau; F(1, 12)=0.103, p=0.754; ɳ²_p=0.008 at the offset of a P_io plateau; F(1, 12)=0.596, p=0.455; ɳ²_p=0.047 at airflow peak height]. The interaction of word context and laryngeal category is not significant either [F(1, 12)=0.0655, p=0.802; ɳ²_p=0.005 at the onset of a P_io plateau; F(1, 12)=0.166, p=0.691; ɳ²_p=0.014 at the offset of a P_io plateau; F(1, 12)=0.509, p=0.489; ɳ²_p=0.041 at airflow peak height].

Figure 10. The average airflow resistance (a) at the onset of a P_io plateau (R_PPON), (b) at the offset of a P_io plateau (R_PPOFF) and (c) at airflow peak (R_FWP) in /s’/ and /s/ for all four subjects word-initially (WI) and word-medially (WM) with error bars as one standard deviation.


Yet, laryngeal category has a significant main effect on airflow resistance at each examined point [F(1, 12)=4.9486, p=0.046; ɳ²_p=0.292 at the onset of a P_io plateau; F(1, 12)=5.908, p=0.032; ɳ²_p=0.330 at the offset of a P_io plateau; F(1, 12)=17.014, p=0.001; ɳ²_p=0.586 at airflow peak height]. The interaction of word position and laryngeal category is not significant at the examined points [F(1, 12)=0.0379, p=0.849; ɳ²_p=0.003 at the onset of a P_io plateau; F(1, 12)=0.104, p=0.753; ɳ²_p=0.009 at the offset of a P_io plateau; F(1, 12)=0.532, p=0.480; ɳ²_p=0.042 at airflow peak height]. In post hoc comparisons, the difference in airflow resistance between /s’/ and /s/ is significant at airflow peak height both word-initially [t= –2.837, p=0.044] and word-medially [t= –3.722, p=0.006], but not at the onset of a P_io plateau [t= –2.016, p=0.225 word-initially; t= –2.152, p=0.181 word-medially] or at the offset of a P_io plateau [t= –2.44, p=0.17 word-initially; t= –2.30, p=0.1481 word-medially]. The comparison of word-initial vs. word-medial /s’/ and that of word-initial vs. word-medial /s/ are not significant, either, at the three points [t=0.113, p=0.999 for /s’/ and t=0.388, p=0.979 for /s/ at the onset of a P_io plateau; t= –1.12, p=0.684 for /s’/ and t= –1.58, p=0.427 for /s/ at the offset of a P_io plateau; t= –1.299, p=0.581 for /s’/ and t= –0.267, p=0.993 for /s/ at airflow peak height]. The significant main effect of laryngeal category on airflow resistance at the three points supports Hypothesis 7 that the two fricatives are distinguished by airflow resistance (R=P_io/U) at the onset and offset of a P_io plateau as well as at the time of airflow peak height, as in fortis/aspirated vs. lenis plosives (Kim et al., 2018).

Finally, Figure 11 shows the average F0 at the onset of a vowel following the fricatives /s’/ and /s/ across the contexts in all four subjects. The raw data for mean F0 values show gender-related differences, as in other languages (e.g. Pinho et al., 2012). That is, average F0 values are higher in the two female subjects than in the male subjects for both fricatives, regardless of whether they are in word-initial or word-medial position. The gender-related differences in F0 are statistically significant [F(1, 12)=21.10849, p<0.001; ɳ²_p=0.638], both word-initially [t= –4.412, p=0.004] and word-medially [t= –4.664, p=0.002]. However, there is no significant effect of word context [F(1, 12)=0.02221, p=0.884; ɳ²_p=0.002], and the interaction of word context and laryngeal category is not significant either [F(1, 12)=0.00582, p=0.940; ɳ²_p=0.000].

Figure 11. The average F0 at the onset of a vowel following /s’/ and /s/ for all four subjects word-initially (WI) and word-medially (WM) with error bars as one standard deviation.


Laryngeal category has no significant main effect on F0 [F(1, 12)=0.41116, p=0.533; ɳ²_p=0.033], and the interaction of word position and laryngeal category is not significant, either [F(1, 12)=0.3306, p=0.576; ɳ²_p=0.027]. The comparison of the fricatives is not significant, either word-initially [t= –0.723, p=0.886] or word-medially [t= –0.543, p=0.947]. The difference between word-initial vs. word-medial /s’/ is not significant [t= –1.056, p=0.721], and the same is true of the difference between word-initial vs. word-medial /s/ [t= –1.869, p=0.291]. No significance in F0 between the fricatives either word-initially or word-medially suggests that F0 does not characterize the two-way phonation contrast in fricatives, as in the literature (e.g. Kim et al., 2010b; Kim & Park 2011).

In the next section, the results will be discussed.

4. Discussion

First, as for the statistically significant differences in the duration of a high P_io plateau and airflow resistance (Figures 9 (a) and 10), we propose that they are correlated with the tensing of the primary articulator (i.e. the tongue apex/blade) and the vocal folds, following Kim et al. (2011). In particular, among the phonetic properties related to the tensing, it is the duration of a high P_io plateau and airflow resistance that characterize the difference between the voiceless fricatives more substantially than a P_io peak and F0. That is, as a narrower linguo-palatal constriction often concomitant with a higher laryngeal raising (Kim et al., 2011) leads to a smaller oral cavity and the narrower constriction is sustained longer, a significantly longer duration of a high P_io plateau is sustained (Figure 9 (a)) with a significantly higher airflow resistance at the onset and offset of a P_io plateau and at the time of airflow peak height in /s’/ than in /s/ (Figure 10). However, a P_io peak and F0 are not significantly different across the contexts. As for the nonsignificant difference in P_io peaks, we may recall that P_io tends to be consistent, regardless of segments (e.g. Netsell, 1969). Regarding the nonsignificant difference in F0 between the fricatives (Figure 11), as in Kim et al. (2010b) and Kim and Park (2011) among others, we assume that it may be due to continuous airflow with no vocal fold vibration during oral constriction. This is reminiscent of vertical laryngeal positions which were observed to be sometimes the same in the two fricatives in Kim et al. (2011).

In addition, tenseness affects the time to reach airflow peak height from the offset of a high P_io plateau, as shown in Figure 8 (b), such that the tense fricative /s’/ needs a significantly shorter time than /s/ to reach the point of airflow peak height in both word-initial and word-medial positions. In contrast, the significantly longer transition from the offset of a P_io plateau to the point of airflow peak height after /s/ across the contexts is attributed to the laxness of /s/. The laxness of /s/ is further supported by a phonetic voicing of the fricative in word-medial intervocalic position (e.g. Cho et al., 2002; Kim et al., 2010b; Kim & Park, 2011). In the acoustic data of the present study, we found no phonetic voicing of the word-medial fricative /s/ in any of the four subjects. Yet, for example, Kim and Park (2011) have found that among their 800 tokens with the two fricatives in both word-initial and word-medial positions, partial voicing occurred in word-medial /s/ in 113 tokens, and complete voicing throughout the fricative in 58 tokens.

Moreover, the duration of aspiration is significantly reduced in the word-medial /s/, compared to that in the word-initial /s/, and is relatively consistent in /s’/ across the contexts (Figure 6 (c)). As for the significant reduction in the word-medial /s/, we suggest that it has to do with the laxness of /s/, as in lenis plosives (Kim et al., 2018). That is, it is due to the laxness of /s/ that a narrower glottal opening of the word-medial /s/ than that of the word-initial /s/ gives rise to a significantly shorter duration of aspiration. In contrast, the tenseness of /s’/ results in a relatively consistent duration of aspiration in both word-initial and word-medial positions. As a result, the tenseness of /s’/ is accounted for by [+tense], and the laxness of /s/ by [–tense], as in Table 1 (a ii).

Second, as for the nonsignificant differences in glottal opening peak and airflow peak both word-initially and word-medially, we suggest that the fricatives are specified as [–s.g.]. Given that glottal opening results in airflow, not the reverse, the nonsignificant difference in glottal opening peak between the fricatives empirically substantiates the view that /s/ is specified as [–s.g.] for glottal opening like /s’/, as in Table 1 (a i). One might recall that the non-fortis fricative has been proposed as aspirated (/s^h/) just like aspirated plosives, especially by Kagaya (1974) among others in the literature. In his fiberscopic study, Kagaya (1974) recorded nonsense monosyllables with the fricatives /s^h, s’/ and stops /p, p^h, p’, t, t^h, t’, k, k^h, k’, ts, ts^h, ts’/ in the word-initial contexts /_e/ and /_i/ and in the word-medial contexts /e_e/ and /i_i/ twice in isolation from one male speaker of Seoul Korean.¹⁴ According to the fiberscopic data, the glottal opening of the non-fortis fricative at frication onset is intermediate between the lenis and aspirated stops at their oral release in the word-initial context /_e/ and similar to that of the aspirated plosives /p^h, t^h, k^h/ or wider than that of the aspirated affricate /ts^h/ in the other word-initial context /_i/. However, in the word-medial contexts, “the maximum width of the glottis is about half of the maximum opening” in the word-initial contexts in the case of the non-fortis fricative (Kagaya, 1974: 167), and the glottal opening is much narrower in the non-fortis fricative at frication onset than in the aspirated consonants at their oral release.

If the non-fortis fricative were aspirated, its glottal opening would be consistent both word-initially and word-medially just like that of the aspirated stops, given that the glottal opening of the aspirated stops remains relatively consistent across the contexts, compared with that of the non-aspirated ones, as shown in the study. Yet, this is not the case. The maximum of glottal opening in the non-fortis fricative is much more reduced in the word-medial contexts than in the word-initial contexts, such that it is similar to that of glottal opening in the fortis fricative in word-medial position. In addition, if the non-fortis fricative were aspirated, the glottal opening of the non-fortis fricative would be wider than that of the aspirated stops both word-initially and word-medially, just as the glottis opens wider in the fortis fricative than in the fortis stops due to its frication throughout oral constriction across the contexts. However, this is not the case, either.

In contrast, what we have found is that there are no significant differences in glottal opening peak and airflow peak height between the two fricatives either word-initially or word-medially (Figure 6 (a) and (b)) and that the duration of aspiration is significantly reduced in the word-medial /s/, compared to that in the word-initial /s/, and relatively consistent in /s’/ across the contexts (Figure 6 (c)), as in lenis vs. fortis/aspirated plosives (Kim et al., 2018). The context-dependent duration of aspiration in the non-fortis fricative as well as no significant differences in glottal opening peak and airflow peak height between the two fricatives confirms that the fricative is not aspirated (/s^h/) but lenis (/s/).

Finally, the context-dependent duration of aspiration in the fricative /s/ supports the view that it is prosodically accounted for as the effect of prosodic structure in an autosegmental-metrical model of intonational phonology (e.g. Pierrehumbert, 1980; Beckman & Pierrehumbert, 1986; Pierrehumbert & Beckman, 1988), as in lenis plosives (Jun, 1993, 1998, 2005a, b). For example, according to Jun (1993), the effect of prosodic position results in the difference in voicing and VOT between word-initial and word-medial lenis plosives, such that within an accentual phrase, the voicing and shorter VOT of lenis plosives do occur, whereas at the beginning of the accentual phrase the plosives are voiceless with a longer VOT. As in the lenis plosives (Kim et al., 2018; H. Kim, 2019b), the same account can be given for the difference between the word-initial and word-medial /s/ in the framework of the intonational phonology. That is, within an accentual phrase, the significantly shorter duration of aspiration occurs sometimes with a phonetic voicing in the word-medial /s/, and at the beginning of the accentual phrase, the word-initial /s/ has a longer duration of aspiration.

5. Conclusion

In order to better understand the phonetic characterization of the Korean fricatives /s’, s/ in comparison with the three-way phonation contrast in plosives (Kim et al., 2018), we have obtained simultaneous recordings of ePGG, airflow and acoustic data in one session and those of P_io, airflow and acoustic data in the other session from four (2 male and 2 female) native speakers of Seoul Korean. What we have found is that different from the plosives, the two fricatives are not significantly different in glottal opening peak and airflow peak height either word-initially or word-medially and that the duration of aspiration is significantly reduced in the word-medial /s/, compared to that in the word-initial /s/, whereas it is relatively consistent in /s’/ across the contexts, as in lenis vs. fortis/aspirated plosives. We have also found that the duration of a high P_io plateau is significantly longer in /s’/ than in /s/ both word-initially and word-medially, as in fortis/aspirated vs. lenis plosives and that airflow resistance (R=P_io/U) at the onset and offset of a P_io plateau is significantly higher in /s’/ than in /s/ as well as at the time of airflow peak height, as in fortis/aspirated vs. lenis plosives, across the contexts. In addition, we have found that the difference in P_io peak is not significant both word-initially and word-medially, as in the three-way phonation contrast in plosives, and that the two fricatives are not significantly different in F0 across the contexts, different from the plosives. It is also found that transition time to reach airflow peak height from the offset of a P_io plateau is significantly longer in /s/ than in /s’/ in both word-initial and word-medial positions.

Based on the results, we have proposed that the phonation-type-specific pattern of the duration of a high P_io plateau and airflow resistance has to do with the tensing of the primary articulator (i.e. the tongue apex/blade) and the vocal folds in line with Kim et al. (2011), being accounted for by the articulatory feature [±tense]. As a result, the laxness ([–tense]) of /s/ gives rise to the significant reduction in the duration of aspiration word-medially, whereas the tenseness ([+tense]) of /s’/ yields a relative consistency in it both word-initially and word-medially. The laxness vs. tenseness also gives rise to a statistically significant difference in the transition time from the offset of a P_io plateau to airflow peak height. On the other hand, no statistical differences in glottal opening peak and airflow peak height lead to the confirmation that both /s’/ and /s/ are specified as [–s.g.] for glottal opening. Consequently, the present experimental data substantiate the laryngeal representations in Table 1 (a), that is, the fricative /s/ is lenis, being specified as [–s.g.] like the fortis fricative /s’/ and as [–tense] for its laxness, different from /s’/ which is specified as [+tense] for its tenseness.

To conclude, the simultaneous investigation of the Korean fricatives /s, s’/ in the articulatory, aerodynamic and acoustic aspects has revealed more phonetic properties of the fricatives than earlier studies restricted to either articulation or acoustics/aerodynamics. This has eventually substantiated the laryngeal characterization of the fricatives. Given this, the present study would contribute to investigating speech sounds in articulation, aerodynamics and acoustics simultaneously in order to further deepen our understanding of speech sounds in other languages as well.

Notes

¹ Throughout the paper, the fortis fricative is transcribed with an apostrophe as a deviation from the IPA value of the apostrophe which is used for an ejective, as in Kim et al. (2005, 2010a, 2011, 2018) among others. This is because there is no satisfactory IPA notation for Korean fortis consonants. In addition, Korean affricates are transcribed as alveolar, that is, /ts, ts^h, ts’/ in line with H. Kim (1997, 1999, 2001a, b, 2004, 2012) as well as Skaličková (1960).

² See Kim et al. (2011) for an extensive literature review of the Korean fricatives.

³ See Kim and Clements (2015) for the literature review of the feature and H. Kim (2003, 2005, 2009, 2011, 2014, 2017) for phonological evidence for the newly modified feature. See Kim et al. (2010a, 2011) and H. Kim (2005, 2011) for discussions on other laryngeal feature specifications in comparison to those in Table 1.

⁴ For the phonological arguments for the laryngeal specification of the two fricatives in Table 1 (a), see H. Kim (2009, 2011, 2014).

⁵ Given that it is VOT as well as glottal opening peak, airflow peak height and the duration of aspiration that makes the lenis plosives different from the aspirated plosives (Kim et al., 2018) and that H and L tones are not lexically specified in the representation of aspirated and lenis plosives in current Seoul Korean (H. Kim 2013, 2019a), it would be no problem that one speaker (F2) among the four subjects is 30 years older than the rest.

⁶ In the two recording sessions, our first author paid close attention to each subject’s pronunciation. If a subject mispronounced a test word, the word was recorded again with the correct pronunciation.

⁷ The placement of the LED light(s) and photodiode in Figure 1 was decided by our third author who had been a medical doctor specialized in the larynx.

⁸ For the past several years, the new ePGG method was tested by obtaining the data from more than ten Seoul Korean subjects in Paris. As it was in better development, we newly obtained ePGG, acoustic and aerodynamic data from another seven Seoul subjects, and the data from four of them were presented in the present study as representative samples, because the other three subjects’ ePGG data were too weak or had poor signal quality due to their thick neck muscles and fatty tissues.

⁹ P_io at t_PON almost always had a positive value and was rarely zero, probably because the glottis opened before the onset of frication, as shown in the ePGG data in Figure 3. This is why, for example, P_io at the onset of a P_io plateau was calculated as P_io at t_PPON minus P_io at t_PON, as in Table 6 (e i). The same holds for our calculations of P_io values at the offset of a P_io plateau and at the time of airflow peak height in Table 6 (e ii) and Table 6 (e iii), respectively, as well as of the P_io peak in Table 6 (d).

¹⁰ In this section, interactive graphics (e.g. Matejka & Fitzmaurice, 2017; Weissgerber et al., 2015, 2017) are used to show the distribution of our measured data for all four subjects.

¹¹ The effect size (partial η², denoted by η²_p) is reported for this result and for the following results as well.

¹² The statistical result for the subject F2 is not available in a repeated measures ANOVA.

¹³ It is noteworthy that P_io peak values are just below or above 9 hPa in the subject M1, whereas they are a little above or below 5 hPa in the other three subjects. We may assume that the other three subjects produced speech with a lower subglottal air pressure (P_sub) than the subject M1. Given that there is no solid reference data on the possible range of P_sub during speech production, we leave this for future research.

¹⁴ The other male speaker in the fiberscopic study did not utter the two fricatives.
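The baseline correction described in footnote 9 can be illustrated with a minimal sketch. It assumes that each P_io reading is corrected by subtracting the P_io already present at t_PON, and that airflow resistance is then computed as R = P_io/U from the corrected value; whether the resistance in Table 6 uses the corrected pressure is our assumption here. The names `Sample`, `corrected_pressure` and `airflow_resistance` are hypothetical, and the numbers are made up for illustration, not measurements from the study.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One aerodynamic reading at a labelled time point (hypothetical container)."""
    p_io: float     # intra-oral air pressure, e.g. in hPa
    airflow: float  # oral airflow U, in whatever unit the recording uses


def corrected_pressure(p_at_event: float, p_at_onset: float) -> float:
    """Subtract the P_io already present at t_PON from a later P_io reading."""
    return p_at_event - p_at_onset


def airflow_resistance(p_io: float, airflow: float) -> float:
    """R = P_io / U; undefined when the airflow is zero."""
    if airflow == 0:
        raise ValueError("airflow is zero; resistance is undefined")
    return p_io / airflow


# Made-up illustrative values, not data from the paper.
p_at_t_pon = 0.5                                # P_io at t_PON
plateau_onset = Sample(p_io=5.0, airflow=0.25)  # reading at t_PPON

p_corrected = corrected_pressure(plateau_onset.p_io, p_at_t_pon)
resistance = airflow_resistance(p_corrected, plateau_onset.airflow)
print(p_corrected, resistance)
```

The same two functions would apply unchanged to the readings at the offset of the P_io plateau and at the time of airflow peak height, since only the event time differs.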

Acknowledgements

We would like to express our thanks to all the subjects who participated in our experiment and also to three anonymous reviewers. All errors remain our own.

References

Beckman, M. E., & Pierrehumbert, J. (1986). Intonational structure in Japanese and English. Phonology Yearbook, 3, 255-309.

Boersma, P. (2001). Praat, a system for doing phonetics by computer. Glot International, 5(9/10), 341-345.

Boersma, P., & Weenink, D. (2019). Praat: Doing phonetics by computer [Computer program]. Retrieved from http://www.praat.org

Cho, T., Jun, S. A., & Ladefoged, P. (2002). Acoustic and aerodynamic correlates of Korean stops and fricatives. Journal of Phonetics, 30(2), 193-228.

Fant, G. (1960). Acoustic theory of speech production. The Hague, Netherlands: Mouton.

Halle, M., & Stevens, K. N. (1971). A note on laryngeal features. Research Laboratory of Electronics Progress Report, 101, 198-212.

Hirose, H., Yoshioka, H., & Niimi, S. (1978). A cross language study of laryngeal adjustment in consonant production. Annual Bulletin, Research Institute of Logopedics and Phoniatrics, 12, 61-71.

Hong, K., Niimi, S., & Hirose, H. (1991). Laryngeal adjustments for Korean stops, affricates and fricatives: An electromyographic study. Annual Bulletin, Research Institute of Logopedics and Phoniatrics, 25, 17-31.

Jakobson, R., Fant, G., & Halle, M. (1952). Preliminaries to speech analysis: The distinctive features and their correlates. Cambridge, MA: MIT Press.

Jun, S. A. (1993). The phonetics and phonology of Korean prosody (Doctoral dissertation). Ohio State University, Columbus, OH.

Jun, S. A. (1998). The accentual phrase in the Korean prosodic hierarchy. Phonology, 15(2), 189-226.

Jun, S. A. (2005a). Korean intonational phonology and prosodic transcription. In S. A. Jun (Ed.), Prosodic typology: The phonology of intonation and phrasing (pp. 201-229). Oxford, UK: Oxford University Press.

Jun, S. A. (2005b). Prosodic typology. In S. A. Jun (Ed.), Prosodic typology: The phonology of intonation and phrasing (pp. 430-458). Oxford, UK: Oxford University Press.

Jun, S. A., Beckman, M. E., & Lee, H. J. (1998). Fiberscopic evidence for the influence on vowel devoicing of the glottal configurations for Korean obstruents. UCLA Working Papers in Phonetics, 96, 43-68.

Kagaya, R. (1974). A fiberscopic and acoustic study of the Korean stops, affricates and fricatives. Journal of Phonetics, 2(2), 161-180.

Kent, R. D., & Read, C. (2002). Acoustic analysis of speech (2nd ed.). Albany, NY: Singular Thomson Learning.

Kim, C. W. (1965). On the autonomy of the tensity feature in stop classification (with special reference to Korean stops). Word, 21(3), 339-359.

Kim, H. (1997). The phonological representation of affricates: Evidence from Korean and other languages (Doctoral dissertation). Cornell University, Ithaca, NY.

Kim, H. (1999). The place of articulation of Korean affricates revisited. Journal of East Asian Linguistics, 8(4), 313-347.

Kim, H. (2001a). A phonetically based account of phonological assibilation. Phonology, 18(1), 81-108.

Kim, H. (2001b). The place of articulation of the Korean plain affricate in intervocalic position: An articulatory and acoustic study. Journal of the International Phonetic Association, 31(2), 229-257.

Kim, H. (2003). The feature [tense] revisited: The case of Korean consonants. Proceedings of the thirty-fourth annual meeting of the North East Linguistic Society (pp. 319-332). Stony Brook, NY.

Kim, H. (2004). Stroboscopic-cine MRI data on Korean coronal plosives and affricates: Implications for their place of articulation as alveolar. Phonetica, 61(4), 234-251.

Kim, H. (2005). The representation of the three-way laryngeal contrast in Korean consonants. In M. van Oostendorp, & J. van de Weijer (Eds.), The internal organization of phonological segments (pp. 287-316). Berlin, Germany: Mouton de Gruyter.

Kim, H. (2009). Korean adaptation of English affricates and fricatives in a feature-driven model of loanword adaptation. In A. Calabrese, & W. L. Wetzels (Eds.), Loan phonology (pp. 155-180). Amsterdam, Netherlands: John Benjamins.

Kim, H. (2011). What features underlie the /s/ vs. /s’/ contrast in Korean? Phonetic and phonological evidence. In G. N. Clements, & R. Ridouane (Eds.), Where do phonological features come from?: Cognitive, physical and developmental bases of distinctive speech categories (pp. 99-130). Amsterdam, Netherlands: John Benjamins.

Kim, H. (2012). Gradual tongue movements in Korean palatalization as coarticulation: New evidence from stroboscopic cine-MRI and acoustic data. Journal of Phonetics, 40(1), 67-81.

Kim, H. (2013). Seoul Korean subjects’ perception of Japanese pitch-accent: Evidence for the absence of tonogenesis in Korean. In B. Frellesvig, & P. Sells (Eds.), Japanese/Korean linguistics (pp. 217-232). Stanford, CA: CSLI Publications.

Kim, H. (2014). An L1 grammar-driven model of loanword adaptation: Evidence from Korean. Korean Linguistics, 16(2), 144-186.

Kim, H. (2017). Korean speakers’ perception of Japanese geminates: Evidence for an L1 grammar-driven borrowing process. In H. Kubozono (Ed.), The phonetics and phonology of geminate consonants (pp. 340-369). Oxford, UK: Oxford University Press.

Kim, H. (2019a). The effects of L1 AP-initial boundary tones and laryngeal features in Korean adaptation of Japanese plosives followed by a H or L vowel. Glossa: A Journal of General Linguistics, 4, 49.

Kim, H. (2019b). The context-dependent VOT of Korean lenis plosives in young Seoul Koreans: The interaction of the laxness of lenis plosives with prosodic position. Proceedings of the Segmental Processes in Interaction with Prosodic Structure (SPIPS). Tromsø, Norway.

Kim, H., & Clements, G. N. (2015). The feature [tense]. In A. Rialland, R. Ridouane, & H. Hulst (Eds.), Features in phonology and phonetics: Posthumous writings by Nick Clements and coauthors (pp. 159-178). Berlin, Germany: De Gruyter Mouton.

Kim, H., Honda, K., & Maeda, S. (2005). Stroboscopic-cine MRI study of the phasing between the tongue and the larynx in the Korean three-way phonation contrast. Journal of Phonetics, 33(1), 1-26.

Kim, H., Maeda, S., & Honda, K. (2010a). Invariant articulatory bases of the features [tense] and [spread glottis] in Korean plosives: New stroboscopic cine-MRI data. Journal of Phonetics, 38(1), 90-108.

Kim, H., Maeda, S., & Honda, K. (2011). The laryngeal characterization of Korean fricatives: Stroboscopic cine-MRI data. Journal of Phonetics, 39(4), 626-641.

Kim, H., Maeda, S., Honda, K., & Crevier-Buchman, L. (2018). The mechanism and representation of Korean three-way phonation contrast: External photoglottography, intra-oral air pressure, airflow, and acoustic data. Phonetica, 75(1), 57-84.

Kim, H., Maeda, S., Honda, K., & Hans, S. (2010b). The laryngeal characterization of Korean fricatives: Acoustic and aerodynamic data. In S. Fuchs, M. Toda, & M. Zygis (Eds.), Turbulent sounds: An interdisciplinary guide (pp. 143-166). Berlin, Germany: De Gruyter Mouton.

Kim, H., & Park, C. L. (2011). An acoustic study of the Korean fricatives /s, s’/: Implications for the features [spread glottis] and [tense]. In J. A. Goldsmith, E. Hume, & L. Wetzels (Eds.), Tones and features (Vol. 107, pp. 176-194). Berlin, Germany: Mouton de Gruyter.

Klatt, D. H., Stevens, K. N., & Mead, J. (1968). Studies of articulatory activity and airflow during speech. Annals of the New York Academy of Sciences, 155(1), 42-55.

Ladefoged, P. (2006). A course in phonetics (5th ed.). Boston, MA: Thomson Wadsworth.

Ladefoged, P., & Maddieson, I. (1996). The sounds of the world’s languages. Oxford, UK: Blackwell.

Lindqvist, J. (1972). Laryngeal articulation studied on Swedish subjects. STL-QPSR, 13(2-3), 10-27.

Lisker, L., Abramson, A. S., Cooper, F. S., & Schvey, M. H. (1969). Transillumination of the larynx in running speech. The Journal of the Acoustical Society of America, 45(6), 1544-1546.

Löfqvist, A. (1992). Acoustic and aerodynamic effects of interarticulator timing in voiceless consonants. Language and Speech, 35(1-2), 15-28.

Löfqvist, A., Koenig, L. L., & McGowan, R. S. (1995). Vocal tract aerodynamics in /aCa/ utterances: Measurements. Speech Communication, 16(1), 49-66.

Löfqvist, A., & Yoshioka, H. (1984). Intrasegmental timing: Laryngeal-oral coordination in voiceless consonant production. Speech Communication, 3(4), 279-289.

Matejka, J., & Fitzmaurice, G. (2017, May). Same stats, different graphs: Generating datasets with varied appearance and identical statistics through simulated annealing. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 1290-1294). Denver, CO.

Netsell, R. (1969). Subglottal and intraoral air pressures during the intervocalic contrast of /t/ and /d/. Phonetica, 20(2-4), 68-73.

Pierrehumbert, J. (1980). The phonology and phonetics of English prosody (Doctoral dissertation). MIT, Cambridge, MA.

Pierrehumbert, J. B., & Beckman, M. E. (1988). Japanese tone structure. Cambridge, MA: MIT Press.

Pinho, C. M. R., Jesus, L. M. T., & Barney, A. (2012). Weak voicing in fricative production. Journal of Phonetics, 40(5), 625-638.

Sawashima, M. (1969). Devoiced syllables in Japanese: A preliminary study by photo-electric glottography. Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, 3, 35-41.

Shadle, C. H. (2010). The aerodynamics of speech. In W. J. Hardcastle, J. Laver, & F. E. Gibbon (Eds.), The handbook of phonetic sciences (pp. 39-80). Oxford, UK: Wiley-Blackwell.

Shibatani, M. (1990). The languages of Japan. Cambridge, UK: Cambridge University Press.

Skaličková, A. (1960). The Korean consonants. Prague, Czechoslovakia: Nakladatelství Československé akademie věd.

Tranel, B. (1987). The sounds of French: An introduction. Cambridge, UK: Cambridge University Press.

Weissgerber, T. L., Milic, N. M., Winham, S. J., & Garovic, V. D. (2015). Beyond bar and line graphs: Time for a new data presentation paradigm. PLOS Biology, 13(4), e1002128.

Weissgerber, T. L., Savic, M., Winham, S. J., Stanisavljevic, D., Garovic, V. D., & Milic, N. M. (2017). Data visualization, bar naked: A free tool for creating interactive graphics. Journal of Biological Chemistry, 292(50), 20592-20598.