1. Introduction
This study investigates the imitation patterns of adults and children speaking North Kyungsang Korean by measuring log-produced f0 (fundamental frequency) intervals. Previous studies have suggested that North Kyungsang Korean has a lexical pitch accent (Chung, 1991; Jun et al., 2006; Kenstowicz & Sohn, 1997; Kim, 1976; Kim, 1988; Kim, 1997). For example, [kaci] is produced as HL ‘kind’, LH ‘eggplant’, and HH ‘branch’. This study is focused on pitch range variations imitated by North Kyungsang adult and child speakers, and investigates whether pitch range variation in North Kyungsang Korean is contrastive when using log-produced f0 intervals.
Previous studies (Dilley, 2010; Dilley & Brown, 2007; Pierrehumbert & Steele, 1989) have examined whether pitch range variation shows a discrete pattern in English intonation. Pierrehumbert & Steele (1989) used an imitation task to investigate whether pitch range variation in the English rise-fall-rise pattern is categorical or continuous. Their data showed the distinction between two intonational categories (e.g., L*+H vs. L+H*) through the shapes of individual distributions. The histograms for some participants were bimodal, showing a categorical difference. For one participant, on the other hand, the second mode of the histogram did not form a peak and overlapped with the first. Dilley & Brown (2007) showed that the timing of f0 peaks and valleys was imitated categorically: there was a significant difference in mean f0 peak time and valley time for stimuli 1–5 versus stimuli 6–10. Dilley (2010) investigated whether three intonation continua (i.e., H* vs. L+H*, H* with ‘peak delay’ vs. L*+H, and %H L* vs. L*) with pitch range variation have the properties of categorization. The result was that the participants’ imitations exhibited gradient variation in pitch range, conveying meaningful distinctions.
Infants acquire prosodic features before they acquire segmental features (Kaplan & Kaplan, 1971; Spring & Dale, 1977). Previous studies have reported that the development of prosodic patterns reflects universal or language-specific effects (Chen & Kent, 2009; Fernald & Mazzie, 1991; Jusczyk et al., 1993; Levitt, 1993). Chen & Kent (2009) measured early f0 variation in Mandarin-learning infants and examined whether infants produced language-universal or language-specific prosodic properties. Their results showed that falling f0 contours were significantly more frequent in Mandarin-learning infants’ production than level and rising f0 contours. Additionally, high f0 patterns were produced more frequently than mid and low f0 patterns. Jusczyk et al. (1993) investigated whether a preference for strong and weak syllables within English words emerges during lexical development. They reported that 6-month-old infants did not show significant sensitivity to strong and weak stress patterns, but that American infants at 9 months of age showed a preference for the predominant stress pattern of English words. That is, 9-month-old infants listened longer to words with strong/weak stress patterns than to words with weak/strong stress patterns. This means that sensitivity to the strong/weak stress pattern in English words develops in the course of lexical development. Levitt (1993) investigated whether French and American infants differ in prosody with respect to f0 contour, rhythm, or amplitude. For f0 contour in particular, French infants distinguished equally between a falling and a rising contour. About 75% of American infants recognized a falling contour. Regarding rising f0 contours, French infants rated higher than American infants.
The present study is also concerned with the imitations of North Kyungsang child speakers, in order to determine whether the children’s reproductions reflect the prosodic characteristics of North Kyungsang adult speakers. Imitation production has been regarded as a critical method for observing the process of language acquisition (Alivuotila et al., 2007; Kent, 1979; Kent & Forner, 1979; Kuhl & Meltzoff, 1996; Leonard et al., 1978; Stine & Bohannon, 1983). Kent (1979) examined adults and 6-year-old children imitating synthesized English and non-English vowels. The results showed that the children’s vowel imitations overlapped more than those of the adults. Additionally, the children did not exhibit the distinct imitations of English and non-English vowels that the adults did. This suggests that children’s imitations are more variable than those of adults. Alivuotila et al. (2007) reported that South Western Finnish children were strongly affected by their native vowel system in the imitation task, and that their adult counterparts were clearly influenced by phonetic experience and sharpened with age. This study implies that children’s cognitive and articulatory skills are not fully developed when compared to those of adults. Kuhl & Meltzoff (1996) investigated infants’ vocal imitations in response to adults’ speech at three ages: 12, 16, and 20 weeks. Infants imitated three vowels, /a/, /i/, and /u/. The infants’ vowel categories became increasingly separated between 12 and 20 weeks of age. In other words, the three vowel categories were more closely and firmly clustered at 20 weeks of age than at 12 weeks of age.
This study focuses on data from North Kyungsang Korean, a variety with a lexical pitch accent spoken in the southeast area of Korea. Previous studies on North Kyungsang Korean have traditionally held that high and low tones are associated with syllables, reflecting a phonological approach (Chung, 1991; Jun et al., 2006; Kenstowicz & Sohn, 1997; Kim, 1976; Kim, 1988; Kim, 1997). Recently, Kim (2012, 2015, 2018) has researched the lexical pitch accent in North Kyungsang Korean using an imitation task with data from children and adults. Kim (2012) investigated whether adult and child speakers in the North Kyungsang and South Cholla regions show categorical production by measuring the differences in f0 values at the midpoints of the first and second syllables. North Kyungsang adult speakers showed categorical imitation productions on the pitch contours of lexical pitch accent contrasts. On the other hand, North Kyungsang and South Cholla child speakers tended to track the changes in f0 values, reflecting continuous imitation productions. South Cholla adult speakers exhibited irregular patterns of f0 change. Kim (2018) reported the interaction between categorization and production for lexical pitch accent contrasts using imitation and production performance. The HL-LH patterns among lexical pitch accent patterns showed strict categorical boundaries for both production and imitation performance. The other pitch accent patterns were variable for most of the speakers.
This study deals with pitch range variation on the f0 contour of North Kyungsang Korean, paying particular attention to imitations produced by adult and child North Kyungsang speakers. The imitation performance is assessed by log-produced f0 intervals. The goal of the study is to investigate whether the imitations have categorical functions using log-produced f0 intervals. The analysis of the results will show the mean log-produced f0 intervals and the individual imitation patterns of log-produced f0 intervals produced by adults and children in the North Kyungsang region.
2. Experimental Method
Ten adults and five children participated in the experiment. All participants were native speakers of North Kyungsang Korean, and were born and raised in Daegu, the central city of the North Kyungsang region. The adult speakers ranged from 20 to 27 years of age and the child speakers from 6 to 7 years of age. None of the speakers had hearing or speech impairments. The participants in the study were all paid volunteers.
For the study, three minimal pairs with the lexical pitch accent contrasts of North Kyungsang Korean were selected. The three words are presented as follows.
(1) | [mo.i] | HL: “feed,” LH: “conspiracy” |
(2) | [mo.ɾe] | HL: “sand,” HH: “the day after tomorrow” |
(3) | [yaŋ.mo] | LH: “wool,” HH: “adoptive mother” |
These three target words were recorded within a carrier sentence (e.g., [yəŋmi-ka moi hako malhes-nɨnteye], “Youngmi said feed/conspiracy.”) by a native North Kyungsang speaker. The recordings were made in a sound-attenuated room using Praat and a high-quality microphone. Each target word was produced several times in order to obtain the most natural-sounding token and f0 pattern. The target words were excerpted from the carrier sentences and resynthesized using the pitch-synchronous overlap-and-add (PSOLA) algorithm in Praat. The manipulation was created by shifting the f0 level across each target word in equal logarithmic steps. As shown in Figure 1, the HL contour was manipulated by dividing the pitch scale into steps until the endpoint of HL became the endpoint of LH. The manipulation comprised nine steps of pitch contour scaling, spanning both endpoints of each pitch contrast.
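The idea of shifting f0 "in equal logarithmic steps" between two endpoints can be illustrated with a minimal sketch. It assumes a continuum of nine stimuli that includes both endpoints (the text does not say whether "nine" counts stimuli or intervals), and the endpoint values in Hz are hypothetical, not taken from the study. The function name `log_spaced_f0` is likewise an illustrative choice; the actual resynthesis was done in Praat.

```python
import math


def log_spaced_f0(start_hz, end_hz, n_stimuli):
    """Return n_stimuli f0 values from start_hz to end_hz in equal log steps."""
    if n_stimuli < 2:
        raise ValueError("need at least two stimuli (the two endpoints)")
    if start_hz <= 0 or end_hz <= 0:
        raise ValueError("f0 values must be positive")
    # Equal steps on a log scale mean a constant ratio between neighbours.
    step = (math.log(end_hz) - math.log(start_hz)) / (n_stimuli - 1)
    return [math.exp(math.log(start_hz) + i * step) for i in range(n_stimuli)]


# Hypothetical endpoint values in Hz (assumption, not from the study).
continuum = log_spaced_f0(240.0, 180.0, 9)

# Ratio between each pair of adjacent stimuli; constant by construction.
ratios = [b / a for a, b in zip(continuum, continuum[1:])]
```

Because the steps are equal in log space, each stimulus differs from its neighbour by the same frequency ratio rather than by the same number of Hz.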
For the imitation task, the participants wore a headset and were instructed to pronounce what they heard as carefully as possible. The imitation task consisted of 216 trials (i.e., 3 blocks × 72 trials) for both adult and child speakers. For the child participants, picture cards were used to help them understand the target words in the experiment.
The study measured f0 values at the midpoints of the first and second vowels to examine the f0 intervals of the imitation productions. This measurement was adopted from Dilley (2010), who measured the f0 values of V1 and V2: V1 was the mean f0 value of the first syllable, and V2 was the peak f0 value or mean f0 value of the second syllable. The present study measured the f0 value of the first syllable’s vowel and the f0 value of the second syllable’s vowel. The measurement was conducted by displaying the spectrogram and waveforms. Figure 2 shows the f0 points on pitch contours, indicating V1 and V2. For example, in Figure 2, the f0 measurement point of V1 for the HL pitch contour is a little lower than the f0 point of V2. The f0 measurement point of V1 for the LH pitch contour is much lower than the f0 point of V2. The f0 values at these points were measured manually.
The present study calculates the log-produced f0 intervals (ratios) from the f0 values measured on the pitch contours, as in (4). This calculation uses the same logarithmic scale as was used in creating the stimuli. The equation proposed by Dilley (2010) and Kim (2015) is shown below.
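As a rough illustration of what a log f0 interval between the two measurement points looks like numerically, the sketch below computes the base-10 logarithm of the ratio V2/V1. Both the base and the direction of the ratio are assumptions made for this example, and the function name `log_f0_interval` and the Hz values are hypothetical; the formulation in Dilley (2010) and Kim (2015) should be consulted for the exact definition.

```python
import math


def log_f0_interval(v1_hz, v2_hz):
    """Log ratio of the second vowel's f0 to the first vowel's f0.

    Assumption: base-10 log of V2/V1, so a fall is negative and a rise
    is positive. The study's own sign convention may differ.
    """
    if v1_hz <= 0 or v2_hz <= 0:
        raise ValueError("f0 values must be positive")
    return math.log10(v2_hz / v1_hz)


# Hypothetical f0 values in Hz (not from the study).
falling = log_f0_interval(220.0, 200.0)  # second vowel lower: negative
rising = log_f0_interval(200.0, 220.0)   # second vowel higher: positive
level = log_f0_interval(210.0, 210.0)    # no change: zero
```

Under this assumed convention, swapping V1 and V2 flips the sign but keeps the magnitude, which is why a ratio on a log scale treats rises and falls of equal size symmetrically.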
For the statistical analysis, a linear mixed-effects regression model was fitted using the lmer function in the lme4 package (Bates et al., 2015) in R (version 3.2.2). The dependent variables were the log-produced f0 ratios for the pitch contours. The fixed-effects predictors were the three lexical pitch contours, HL-LH, HH-HL, and HH-LH. Individual speakers were used as the random-effects predictor; the random effect for ‘speakers’ captures the variation attributable to individual differences. To obtain p-values, the Markov Chain Monte Carlo (MCMC) package (Martin et al., 2011) was also used. The p-values were computed for the analysis of the mean log-produced f0 intervals across the different pitch contours and for the log-produced f0 intervals of the imitations produced by individual speakers.
3. Results
The imitation productions of adults and children were compared using mean log-produced f0 intervals. In Figure 3, the lexical pitch accent contours are divided into HL-LH, HH-HL, and HH-LH. The adult speakers showed a significant difference between HH-HL and HH-LH (β=–0.02, t=–8.818, p<.001). However, there was no significant difference between HH-HL and HL-LH. The boxplots in Figure 3 show that the log-produced f0 intervals were lower for HH-LH than for HH-HL and HL-LH. The log-produced f0 intervals for HH-HL and HL-LH seemed to be similar, even though HL-LH showed more variation than HH-HL.
For the children’s imitation productions, the log-produced f0 intervals differed significantly across all pitch contours (i.e., HH-HL vs. HH-LH: β=–0.05, t=–14.382, p<.001; HH-HL vs. HL-LH: β=–0.016, t=–4.645, p<.001). As shown in Figure 4, the log-produced f0 intervals for HH-LH were much lower than those for HH-HL. The f0 intervals for HH-HL were higher than those for HL-LH.
The range of variation in the log-produced f0 intervals for the children did not differ from that of the adult speakers. In addition, for both adult and child speakers, the log-produced f0 intervals of HH-LH were lower than those of the other tone patterns.
The imitation productions of adult speakers for HL-LH showed somewhat different shapes across individuals, as shown in Figure 5. Of the individual patterns in Figure 5, some speakers (i.e., S1, S2, S3, and S5) showed a significant change between two clusters of HL-LH, indicating that there were categorical boundaries. Statistically, the imitation of S1 was different to S4 (β=–0.016, t=–2.263, p<.05), S6 (β=–0.018, t=–2.552, p<.05), and S9 (β=–0.02, t=–2.727, p<.01). For the other speakers, there were no significant effects indicating either continuous or categorical contours.
Figure 6 shows the histogram of imitation responses based on log-produced f0 intervals for HL-LH. The histogram was bimodal. The region between –0.05 and 0.00 and the region between 0.00 and 0.05 each showed a peak, indicating that there were two categories in HL-LH.
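One simple way to make "two peaks" explicit is to bin the log f0 intervals and count the bins that are higher than both of their neighbours. The sketch below does this under stated assumptions: the bin width of 0.01, the range of –0.05 to 0.05, and the sample values are all hypothetical, and the helper names `histogram_counts` and `local_peaks` are illustrative. It is not the procedure used in the study, where the histograms were inspected visually.

```python
import math


def histogram_counts(values, lo, hi, bin_width):
    """Count values per bin over [lo, hi); values outside are ignored."""
    n_bins = round((hi - lo) / bin_width)
    counts = [0] * n_bins
    for v in values:
        idx = math.floor((v - lo) / bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def local_peaks(counts):
    """Indices of bins strictly higher than both neighbours.

    Bins at the edges are compared against 0 on the missing side.
    """
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i < len(counts) - 1 else 0
        if c > left and c > right:
            peaks.append(i)
    return peaks


# Hypothetical log f0 intervals clustered below and above zero.
intervals = [-0.041, -0.033, -0.032, -0.031, -0.027, -0.018, -0.002,
             0.004, 0.019, 0.026, 0.031, 0.033, 0.034, 0.042]

counts = histogram_counts(intervals, lo=-0.05, hi=0.05, bin_width=0.01)
peaks = local_peaks(counts)  # one peak on each side of zero for this sample
```

Raw peak counting is sensitive to the bin width and to sampling noise, so it is best treated as a quick descriptive check rather than a test of bimodality.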
The imitation patterns of HH-HL, as depicted in Figure 7, show differences between speakers’ productions. S1 differed significantly from S2 (β=–0.018, t=–2.873, p<.01), S3 (β=–0.012, t=–1.988, p<.05), S4 (β=–0.016, t=–2.568, p<.05), S5 (β=–0.027, t=–4.389, p<.001), S6 (β=–0.049, t=–7.774, p<.001), S7 (β=–0.022, t=–3.510, p<.001), S8 (β=–0.035, t=–5.613, p<.001), S9 (β=–0.024, t=–3.946, p<.001), and S10 (β=–0.025, t=–3.980, p<.001). In this sense, the contour shapes in Figure 7 differ statistically among the individual speakers’ imitation productions. However, S1, S2, S3, and S4 displayed shifts at the midpoint of the curves, indicating that there were categorical functions. The other contours showed broadly continuous shapes, though with some deviations.
The overall distribution of the histogram in Figure 8 appears to be unimodal. The highest peak was located between –0.05 and 0.00. This suggests that the peak of HH overlapped with the peak of HL in the imitations of the adult speakers.
Figure 9 shows the individual adult speakers’ curves for the log-produced f0 intervals in HH-LH. Of the individual patterns, S1 was significantly different to S2 (β=–0.042, t=–8.134, p<.001), S3 (β=–0.035, t=–6.889, p<.001), S4 (β=–0.039, t=–7.565, p<.001), S5 (β=–0.033, t=–6.444, p<.001), S6 (β=–0.02, t=–3.871, p<.001), S7 (β=–0.01, t=–1.987, p<.05), S8 (β=–0.029, t=–5.667, p<.001), and S10 (β=0.042, t=8.209, p<.001). S9 had no significant effect, and the slope of S1 was similar to that of S9. In Figure 9, the contours showed a gradient effect for most of the individual speakers. For S3, the plot displayed a shift, showing that there was a categorical distinction.
Figure 10 shows the histogram of the imitation responses in HH-LH grouped by adult speakers. The distribution of the histogram is skewed to the left. The highest peak is between –0.05 and 0.00 in the log-produced f0 intervals for HH-LH. There are no noticeable differences between the pitch contours of HH-LH.
In Figure 11, the plot displays the imitation responses of children for HL-LH using log-produced f0 intervals. Regarding the statistical results, S1 was significantly different to S5 (β=0.024, t=2.615, p<.01), whereas the other speakers (i.e., S2, S3, and S4) did not have a significant effect. As shown in Figure 11, the contour shapes of S1, S2, S3, and S4 are presented as continuous curves. However, S5 shows the shift in the middle of a contour curve, indicating a categorical distinction. Additionally, the curved patterns for S2 and S4 present as shifts in the middle of contours, but they are not statistically significant.
The histogram in Figure 12 exhibits an asymmetric distribution for the imitation of HL-LH produced by children. The frequency is highest around zero, towards the left. The highest peak is between –0.1 and 0.0, displaying a left-skewed distribution. This indicates that the children were more likely to produce imitations of HL than of LH.
The imitation responses of Figure 13 show the HH-HL patterns produced by children using log-produced f0 intervals. S1 was significantly different to S3 (β=0.015, t=2.446, p<.05), S4 (β=0.041, t=6.595, p<.001), and S5 (β=0.042, t=6.668, p<.001). The response of S2 did not show a significant difference. In Figure 13, the curves increase significantly from the left end to the right end, showing that the curves are continuous. S3 exhibits an abrupt shift on the curve, indicating a categorical function.
The histogram of HH-HL in Figure 14 is not normally distributed: the frequencies do not form a symmetric, bell-shaped histogram. The high-frequency region tends to appear between 0.00 and 0.05 in log-produced f0 intervals. The highest peak is at around 0.05 and the second peak is at around 0.00. Two peaks thus seem to occur for HH-HL in this histogram, indicating a categorical boundary.
Figure 15 displays the imitation responses of HH-LH produced by children. This figure presents the log-produced f0 intervals. The response of S1 shows a significant difference to that of S4 (β=–0.023, t=–4.007, p<.001), S5 (β=0.015, t=2.619, p<.01), and S2 (β=0.014, t=2.399, p<.05). S3 was not significant. In Figure 15, the f0 intervals of all speakers appear as flat patterns in the data. Nevertheless, for S4 and S5, the slopes of f0 intervals are higher when the curves are increasing from left to right than for S1. There is a very small difference between the f0 intervals of S1 and S2, considering the outliers.
The histogram in Figure 16 shows an asymmetric distribution, with the highest peak around zero, toward the left. This implies that the peaks of the pitch contours overlap for HH-LH.
4. Discussion and Conclusion
This study examines whether pitch range variation in North Kyungsang Korean shows a categorical function when using log-produced f0 intervals. The imitation patterns produced by adult and child speakers in the North Kyungsang region were statistically analyzed. The experiment showed two interesting findings. First, for adult and child speakers in the North Kyungsang region, the pitch range variation showed a categorical function in HL-LH for most speakers. That is, the log-produced f0 intervals played a significant role for the HL-LH patterns, showing the shifts for categorical boundaries. For the HH-HL and HH-LH patterns, the adult and child speakers produced a continuous or gradient function which did not include any shifts. Second, for the imitations of pitch range variation, the adult speakers showed more obvious categorical patterns than the child speakers. The children’s imitations were more variable in terms of f0 changes than adults’ imitations when using log-produced f0 intervals. In other words, there were more continuous curves than categorical distinctions in the children’s imitations.
In the imitation responses, the adult and child speakers in the North Kyungsang region tended to show more categorical boundaries than continuous curves, particularly for the HL-LH patterns. For adult speakers, pitch range variation measured by log-produced f0 intervals often showed a shift, reflecting the characteristics of categorization. The HL-LH pitch patterns had two peaks in the histogram and showed two pitch categories with different f0 qualities. Even for the children, there was a shift in the f0 continuum of HL and LH for some speakers, due to the function of categorization.
For the HH-HL patterns, there was a sign of categorization, but this was limited to a few adult speakers. These categorization patterns were not distinctive in the histogram. That is, there was only one peak, in which the HH and HL ends of the continuum overlapped. For the HH-LH patterns, most of the data from adult speakers showed continuous contours. In other words, the patterns measured by log-produced f0 intervals were flat. The histogram of the HH-LH patterns also displayed one peak, with a left-skewed distribution. This indicates that the properties of categorization were lacking for these pitch patterns.
Prosodic features such as pitch, loudness, and tempo are acquired during early language acquisition (Kaplan & Kaplan, 1971; Spring & Dale, 1977). With respect to language-specific properties, falling and rising contours tend to be acquired from the beginning of prosodic development (Chen & Kent, 2009; Jusczyk et al., 1993; Levitt, 1993). Kim (2018) reported that in North Kyungsang Korean the characteristics of categorization were well expressed in the HL-LH pitch patterns in an imitation and production task. The results of this study support the work of Kim (2018). The log-produced f0 intervals for HL-LH seemed to function appropriately for imitating the f0 timing of the two pitch categories. The properties of f0 movement for HL-LH in the children’s imitations of North Kyungsang Korean reflect the benefits of early prosodic acquisition. The signs of prosodic categorization appearing in children’s imitative behavior enable us to find semantic contrasts. In this sense, words with the HL-LH contrast may well be recognizable to adults and children in the North Kyungsang region.
In this study, the adults’ and children’s imitation data exhibited different patterns. Compared with the adults’ data, the children’s lexical pitch accent imitations were more varied. The children’s HL-LH patterns showed categorical functions at the midpoint of the curves, but the shifts were not as distinctive as those displayed in the adults’ data. Moreover, the histogram for the HL-LH pattern showed two peaks indicating two lexical pitch accent categories for the adults, whereas for the children the two categories in the histogram overlapped. That is, the HL-LH pattern was more distinctive for the adults than for the children. However, within the children’s imitations, the HL-LH pattern was more distinctive than the other lexical pitch accent categories. Most of the children showed continuous curves for the HH-HL patterns. Only one child exhibited a shift on the curve, but this was not as clearly distinctive as those in the adults’ data. For the HH-LH patterns, no children showed a shift on the curve indicating categorical properties. The curves imitated by children were flat. That is, for the children, the HH-LH patterns did not seem to be clearly distinct as two separate categories.
In the adults’ imitations, the HH-HL and HH-LH patterns were less variable than those of the children’s imitations. The children’s imitations are shown as continuous curves for the HH-HL patterns. These children’s imitations tended to be reflected in the f0 changes of the adults’ data. That is, the imitations of the adult speakers showed gradient or continuous curves for the HH-HL patterns using the log-produced f0 intervals. This indicates that both adults and children track the changes of f0 in the continuum of HH and HL, though there are not any categorical boundaries. Moreover, when the adult speakers imitated the continuum of HH-HL, some speakers showed the categorical boundaries on their curves. On the other hand, for the HH-LH patterns, both adults and children imitated poorly, not showing a categorical function. The curves tended to be flat rather than continuous or gradient for both adults and children. Only one adult speaker showed a categorical boundary with a shift in the continuum. In this experiment, the children’s imitations did not show the function of categorization for the HH-LH patterns.
The present study revealed that children imitated pitch range variation more poorly than adults. In speech behavior, the study of Kent (1979) suggests that, regarding the perceptual and articulatory features, children perceive and produce more inadequately than adults. The present experiment conducted an imitative task, and the results show that the factors of perception and production were different for children than for adults. The imitative response provides reliable and interpretable factors of the children’s speech behavior. This study found that the children’s imitative responses in the North Kyungsang region play a significant role in HL-LH pitch accent patterns. Moreover, the children’s imitations showed the process of acquiring the lexical pitch accent patterns. The imitative responses of children tended to follow the perceptual and productive trace of adults’ speech behavior.
The study employed an imitation task to examine how children and adults acquire prosodic characteristics in a given dialect, involving the acquisition of the f0 factor in North Kyungsang Korean. Children in the North Kyungsang region imitated the primary f0 patterns which are similar to adults’ speech behavior. Children showed that their imitative responses play a critical role in acquiring their native dialect’s prosodic properties. The present study analyzed five children’s imitative responses; in future studies, the number of child participants should be increased in order to collect more varied data, and different additional tasks should be conducted as part of the experiment.