Fig. 44.1
Mean PSDTs (black symbols) compared to the maximum phase shift calculated from the mistuning detection thresholds (MDTs) of Klinge and Klump (2010) (gray symbols). The phase shift in a complex with a frequency-shifted component calculated from MDTs gradually develops in the ongoing stimulus. PSDT data could not be obtained for the 400 Hz/50 ms condition. Error bars = 2 × SEM
4 Discussion
Many experiments investigating phase effects in the human auditory system have shown that humans can exploit phase cues if the frequency components of a complex stimulus are unresolved in the auditory periphery, that is, if more than one component falls within an auditory filter (e.g., Mathes and Miller 1947; Licklider 1957; Patterson 1987; Moore and Glasberg 1989). Studies in humans using two- and three-component signals with unresolved frequency components (e.g., sinusoidally amplitude-modulated [SAM] tones vs. quasi-frequency-modulated [QFM] tones with the carrier frequency phase shifted by 90°; Mathes and Miller 1947; Goldstein 1967; Nelson 1994) showed that phase differences result in timbre differences that may be used to distinguish between such signals based on envelope cues. Similar observations have been made for harmonic complex stimuli with a larger number of components (Licklider 1957; Plomp and Steeneken 1969; Patterson 1987; Moore and Glasberg 1989). Licklider (1957), Patterson (1987), and Moore and Glasberg (1989) demonstrated that phase shifts were more readily detected by the human auditory system for higher harmonic numbers and for complexes with lower fundamental frequencies (F0s).
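The envelope cue that separates SAM from QFM tones can be illustrated with a minimal sketch. It assumes a unit-amplitude carrier, two sidebands of amplitude m/2 each, and a modulation index m = 1; the function name and these values are chosen for illustration only and are not taken from the cited studies. The envelope is computed as the magnitude of the summed phasors relative to the carrier frequency.

```python
import cmath
import math


def envelope_three_tone(carrier_phase_deg, m=1.0, n_points=360):
    """Envelope over one modulation cycle of a carrier plus two sidebands.

    Hypothetical helper: the carrier has unit amplitude and the given
    phase; each sideband has amplitude m/2 and zero starting phase.
    """
    phi = math.radians(carrier_phase_deg)
    env = []
    for i in range(n_points):
        theta = 2.0 * math.pi * i / n_points  # modulator phase
        # Phasor sum expressed relative to the carrier frequency.
        sidebands = (m / 2.0) * (cmath.exp(1j * theta) + cmath.exp(-1j * theta))
        env.append(abs(cmath.exp(1j * phi) + sidebands))
    return env


sam = envelope_three_tone(0.0)    # carrier in phase with the sidebands
qfm = envelope_three_tone(90.0)   # carrier shifted by 90 degrees

sam_depth = max(sam) - min(sam)   # large envelope fluctuation
qfm_depth = max(qfm) - min(qfm)   # much flatter envelope
print(f"SAM envelope depth: {sam_depth:.3f}")
print(f"QFM envelope depth: {qfm_depth:.3f}")
```

Under these assumptions the two signals have identical amplitude spectra, yet the SAM envelope fluctuates far more strongly than the QFM envelope, which is the kind of envelope difference referred to above.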
Here we show that the gerbil auditory system is also highly phase sensitive. Similar to results from Moore and Glasberg (1989) obtained in humans, PSDTs increased with decreasing harmonic number. However, while gerbils were still able to detect a phase shift in the second component of a 200-Hz complex, PSDTs for such a low harmonic could not be obtained in humans. Gerbils had smaller PSDTs than humans for higher harmonics of the 200-Hz complex.
Further evidence of high phase sensitivity in gerbils is provided by the observation that mistuning detection thresholds (MDTs) for gerbils were significantly higher for random-phase than for sine-phase harmonic complexes. In contrast, humans showed no significant differences in MDTs between sine-phase and random-phase harmonic complexes, indicating that phase processing is less important for mistuning detection in humans (Klinge and Klump 2009). Similar to gerbils, several bird species showed high sensitivity to both mistuning and phase changes (Lohr and Dooling 1998; Dooling et al. 2002; Klump and Groß 2013, unpublished). The results suggest that gerbils (and possibly also these bird species) exploit changes in the phase relationship between components as a cue for mistuning detection, whereas humans probably do not.
A possible cue that has been proposed for detecting phase differences is the presence of cubic difference tones that can produce phase-dependent changes in the internal spectral representation of sounds which, for example, could be used to distinguish between SAM and QFM tones (Nelson 1994). The results of the control experiment with a pink noise background ruled out the use of such cues by the gerbils.
The gerbil’s MDTs (Klinge and Klump 2010) can be used to calculate the maximum phase shift that gradually develops in ongoing mistuned stimuli, and these values can be compared to the PSDTs (Fig. 44.1). If we assume that gerbils exploit cues related to the phase change in the output of auditory filters to detect mistuning in a harmonic complex, then PSDTs should be similar to the phase shifts calculated from the MDTs. There were no significant differences for the second component of the harmonic complex, but for the 32nd component, PSDTs were significantly lower than the MDT phase shifts (all p < 0.05, Student’s unpaired t-test, N_PSDT = 6, N_MDT = 4 [400 ms], N_MDT = 2 [100, 50 ms]). Thus, gerbils detected a smaller phase shift when it was presented as a constant phase shift than when it gradually developed due to the mistuning. A further difference between results from the two experiments was the effect of stimulus duration on thresholds. PSDTs increased with decreasing stimulus duration, but no such trend was observed for the phase shifts calculated from the MDTs.
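The conversion from a mistuning threshold to a maximum phase shift can be sketched as follows. The sketch assumes that the mistuned component starts in phase with its harmonic position and that the phase difference grows linearly over the stimulus, so that it reaches 360° × Δf × T at stimulus offset (Δf in Hz, T in s). The function name and the 0.5-Hz example value are hypothetical and are not measured thresholds.

```python
def max_phase_shift_deg(mistuning_hz, duration_s):
    """Phase difference (degrees) accumulated at stimulus offset.

    Assumes the component is shifted by `mistuning_hz` from its harmonic
    frequency and starts in phase with the unshifted harmonic.
    """
    return 360.0 * mistuning_hz * duration_s


# Hypothetical mistuning of 0.5 Hz at the three stimulus durations used.
example_mistuning_hz = 0.5
durations_s = [0.4, 0.1, 0.05]
shifts = [max_phase_shift_deg(example_mistuning_hz, d) for d in durations_s]

for d, s in zip(durations_s, shifts):
    print(f"{d * 1000:.0f} ms: {s:.1f} deg")
```

Under this assumption, a fixed frequency shift in Hz yields a maximum phase shift proportional to stimulus duration, which is why thresholds from the two experiments are compared separately for each duration.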
To assess possible mechanisms explaining the gerbil’s ability to detect a phase shift in a component of a harmonic complex, we simulated the signal processing by gerbil auditory filters when excited by a 200-Hz complex with stimulus durations of 400, 100, and 50 ms (Fig. 44.2), with and without a phase-shifted component. Generally, the temporal waveform of the output of the various filters differs between a harmonic complex with and without a phase-shifted component. For the 32nd-component condition (right panels in Fig. 44.2), the differences in the temporal pattern evident in the fast fluctuations of the envelope (FFEs) at the filter output are due to the interaction of harmonics within the filter (compare third and fifth columns). The interaction of the components starting in sine phase produces temporal waveforms having FFEs with portions of the amplitude being almost zero. Phase shifting a component results in a level increase for the low-amplitude portion. Such changes in the FFEs of the waveform may result in differences in the neural response that could be used by successive stages of the auditory system to detect the phase shift either by sequential comparisons within or simultaneous comparisons across filters. For example, temporal and rate responses of neurons from the ventral cochlear nucleus and the inferior colliculus of guinea pigs showed an asymmetry in response to temporally asymmetric FFEs (Pressnitzer et al. 2000). Moore (2002) describes the possible exploitation of the changes in the low-amplitude portion of the envelope as “listening in the dips” and proposes that this ability might be limited by the temporal resolution of the auditory system. For the second-component condition (left panels in Fig. 44.2), the changes concern phase shifts in the auditory filter containing the phase-shifted component relative to the neighboring filters.
Thus, for detection the auditory system has to make simultaneous comparisons across different auditory filters. A possible neural mechanism suitable for processing such cues may rely on coincidence detectors comparing the filter outputs. Phase locking to the temporal fine structure has been shown to still be possible in this frequency range (e.g., Palmer and Russell 1986 for guinea pigs). A comparison of the output patterns of the auditory filters for the constant phase shifts of the present experiment with those for gradual phase shifts in the mistuning detection experiment (Klinge and Klump 2010) reveals similar changes in the temporal waveform of the filter output suggesting that similar processing mechanisms may account for both MDTs and PSDTs.
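The envelope effect described for the 32nd-component condition can be illustrated with a deliberately crude sketch. It is not the filter simulation used for Fig. 44.2: the auditory filter is replaced by an assumed rectangular passband that passes harmonics 30 to 34 of the 200-Hz complex with equal weight, and the 90° shift and the 48-kHz sampling rate are illustrative values only. The envelope is taken as the magnitude of the summed complex phasors, which does not depend on whether the components are written as sines or cosines.

```python
import cmath
import math

F0 = 200.0                 # fundamental frequency (Hz)
FS = 48000.0               # assumed sampling rate (Hz)
HARMONICS = range(30, 35)  # assumed passband: harmonics 30 to 34


def band_envelope(shift_deg, shifted_harmonic=32, n_periods=2):
    """Envelope of the equal-amplitude harmonics inside the assumed passband.

    All components start with the same phase, except `shifted_harmonic`,
    which is advanced by `shift_deg` degrees.
    """
    shift = math.radians(shift_deg)
    n_samples = int(n_periods * FS / F0)
    env = []
    for n in range(n_samples):
        t = n / FS
        z = 0j
        for k in HARMONICS:
            phase = 2.0 * math.pi * k * F0 * t
            if k == shifted_harmonic:
                phase += shift
            z += cmath.exp(1j * phase)
        env.append(abs(z))
    return env


dip_sine = min(band_envelope(0.0))      # envelope minimum, no phase shift
dip_shifted = min(band_envelope(90.0))  # envelope minimum, 32nd shifted
print(f"envelope minimum without shift: {dip_sine:.3f}")
print(f"envelope minimum with shift:    {dip_shifted:.3f}")
```

In this simplified setting the envelope of the unshifted complex drops to nearly zero between its peaks, whereas shifting the phase of the 32nd component raises the envelope minimum, in line with the level increase in the low-amplitude portion noted above.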