Formant-Frequency Variation and Its Effects on Across-Formant Grouping in Speech Perception



Fig. 36.1
Influence of rate of formant-frequency variation on the effect of competitors (F2C+F3C) on intelligibility (synthetic-formant sentences). Mean scores and standard errors are shown separately for the constant- and reversed-amplitude conditions. The results for the control and dichotic-reference conditions are shown on the left and right sides, respectively. The bottom axis indicates the relative rate for the competitor formants (Reproduced from Summers et al. (2012) with kind permission from Springer Science + Business Media B.V.)






5 Experiment 3


Remez et al. (1994) reported that reducing the frequency and amplitude variation in a competitor created by time-reversing F2 reduced its impact on the intelligibility of sine-wave speech. We extended this approach to synthetic-formant speech and refined it by manipulating the depth of variation in the frequency contour of a time-varying F2C without changing the amplitude contour.


5.1 Method


In a preliminary study, the depth of variation in the frequency contour of each target formant was scaled about its geometric mean to a range of values, from 100 % down to 0 % (i.e. constant). The acoustical effect of this manipulation is similar to that of physically constraining the excursions made by the main articulators away from their average positions. The results indicated that scaling the frequency contours to 50 % depth had relatively little effect on diotic intelligibility. Hence, in the main experiment, we were able to use three-formant analogues of the target sentences whose formant-frequency contours were scaled to 50 % depth but which remained reasonably intelligible. This made it possible to explore the effect of scaling F2Cs to have greater, as well as smaller, variation in their formant-frequency contours than that of the target formants without exceeding the natural range. For each sentence, a set of F2Cs was created using a constant amplitude contour (matching the RMS power of F2). The frequency contour of F2C was derived from that of F2 by inversion about its geometric mean and scaling to one of five values (depth = 100 to 0 % in 25 % steps). Stimuli were presented dichotically (F1+F2C; F2+F3). There were seven conditions: five experimental (depth of F2C was varied), one control (no F2), and the dichotic reference (no F2C).
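The inversion and depth scaling described above can be illustrated with a minimal sketch. It assumes that scaling is applied to excursions on a log-frequency axis (suggested by the use of the geometric mean, but not stated explicitly in the text) and that the contour is a list of frame-by-frame frequencies in Hz. The function names `geometric_mean` and `scale_contour` are hypothetical and are not taken from the original study.

```python
import math


def geometric_mean(freqs_hz):
    """Geometric mean of a frequency contour (all values must be > 0)."""
    return math.exp(sum(math.log(f) for f in freqs_hz) / len(freqs_hz))


def scale_contour(freqs_hz, depth, invert=False):
    """Rescale excursions about the geometric mean (assumed log-frequency axis).

    depth=1.0 leaves the excursions unchanged; depth=0.0 gives a constant
    contour at the geometric mean. invert=True mirrors the contour about
    the geometric mean, as described for the inverted F2C.
    """
    gm = geometric_mean(freqs_hz)
    sign = -1.0 if invert else 1.0
    # (f / gm) ** k multiplies the log-frequency excursion by k.
    return [gm * (f / gm) ** (sign * depth) for f in freqs_hz]


# Illustrative F2 contour in Hz (made-up values, not data from the study).
f2 = [1200.0, 1500.0, 1800.0, 1500.0, 1000.0]

# One inverted competitor contour per depth value (100 to 0 % in 25 % steps).
competitors = {d: scale_contour(f2, d, invert=True)
               for d in (1.0, 0.75, 0.5, 0.25, 0.0)}
```

Under this assumption the geometric mean of the contour is unchanged at every depth, so only the extent of frequency variation differs between the competitors.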


5.2 Results and Discussion


The results (n = 21) were as follows. The control condition indicated that intelligibility was near floor when F2C was added full scale without the target F2. When the target F2 was present, adding F2C reduced intelligibility; this reduction was greatest for 100 %-depth F2Cs (19.6 percentage points) and least for 0 %-depth (constant) F2Cs (6.3 percentage points). The smooth and progressive decline in intelligibility as the scaling factor for the inverted F2C increased indicates that competitor efficacy depends on the overall depth of its frequency variation, not its depth relative to that of the other formants (all set to 50 % depth). This pattern is similar to that observed for the effect of differences in competitor rate. Though not conclusive, this outcome is consistent with the idea that the grouping heuristic involved is not speech specific. In contrast, Remez et al. (1994) interpreted their findings in terms of the plausibility of speech-like variation in the competitor. One way to evaluate this interpretation is to manipulate the acoustic properties of F2C in ways that change its articulatory plausibility.


6 Experiment 4



6.1 Method


The importance of speech-like variation for across-formant grouping was explored using an F2C with a regular and arbitrary formant-frequency contour. A triangle wave was used, which does not constitute a plausibly speech-like frequency contour. Specifically, it differs from natural formant contours in having precise periodicity and a wave shape with sharp peaks and troughs. Sharp peaks and troughs would not be expected from a dynamical system like the vocal tract, composed of articulators having mass and, when in motion, momentum. The same dichotic configuration and procedure were used as for Experiment 3. There were eight conditions: five experimental (depth of triangle-wave contour for F2C was varied from 100 to 0 % in 25 % steps), the dichotic reference (no F2C), and two controls. One control comprised F2+F3 alone to provide a measure of intelligibility when F1 does not contribute to the sentence. The other was the 100 %-depth inverted F2C condition from Experiment 3, as a comparator for the 100 %-depth triangle-wave case. For each sentence, the triangle-wave frequency contour was matched to the average rate and depth of modulation for its inverted F2C counterpart, derived from F2. Modulation rate was set in relation to zero crossings at the geometric mean frequency of the target F2. Peak-to-trough depth was matched to that of F2 on a log-frequency scale and centred on the geometric mean frequency. All F2Cs were synthesised using a constant amplitude contour.
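As a rough illustration of how a triangle-wave contour might be matched to F2, the sketch below makes several assumptions that the text does not confirm: the contour is sampled at a fixed frame interval, the average modulation rate is taken as half the number of crossings of the geometric mean per second (two crossings per cycle), and the peak-to-trough depth is the full log-frequency range of F2. The names `triangle`, `triangle_contour` and `frame_s` are hypothetical.

```python
import math


def triangle(phase):
    """Unit triangle wave: 0 at phase 0, +1 at 0.25, 0 at 0.5, -1 at 0.75."""
    return 1.0 - 4.0 * abs(((phase + 0.25) % 1.0) - 0.5)


def triangle_contour(f2_hz, frame_s, depth=1.0):
    """Return (contour_hz, rate_hz) for a triangle-wave F2C matched to F2.

    Assumptions (not from the source): rate = crossings of the geometric
    mean / (2 * duration); depth = log-frequency range of F2, scaled by
    `depth` and centred on the geometric mean.
    """
    logs = [math.log(f) for f in f2_hz]
    centre = sum(logs) / len(logs)  # log of the geometric mean frequency
    # Ignore frames lying exactly on the mean when counting sign changes.
    dev = [x - centre for x in logs if x != centre]
    crossings = sum(1 for a, b in zip(dev, dev[1:]) if a * b < 0)
    duration = frame_s * (len(f2_hz) - 1)
    rate_hz = crossings / (2.0 * duration) if duration > 0 else 0.0
    half_range = depth * (max(logs) - min(logs)) / 2.0
    contour = [math.exp(centre + half_range * triangle(rate_hz * i * frame_s))
               for i in range(len(f2_hz))]
    return contour, rate_hz


# Illustrative F2 contour (made-up values), one frame every 10 ms.
f2 = [1000.0, 1000.0, 2000.0, 2000.0, 1000.0, 1000.0, 2000.0, 2000.0, 1000.0]
f2c, rate = triangle_contour(f2, frame_s=0.01, depth=1.0)
```

Setting `depth` to 0.75, 0.5, 0.25 or 0.0 would give the remaining experimental steps under the same assumptions; at 0.0 the contour is constant at the geometric mean frequency.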
