Fig. 48.1
Sketch of stimulus configuration in the frequency domain
Twenty-one listeners (21–47 years, mean 25 years, 12 females) participated in the study. None of the listeners had a history of neurological illness, head injury, or hearing impairment in the explored frequency range. Written informed consent was obtained from all participants. The study was approved by the local ethics committee of the University of Oldenburg.
3 Detection Thresholds
All psychoacoustic measurements were carried out in a double-walled sound booth. Masked thresholds for the target in the presence of the UN and the CM masking noise were determined using an adaptive three-alternative forced-choice procedure (1-up 2-down rule for the target level, minimum step size 1 dB) for all four target-masker configurations: S0N0-UN, SπN0-UN, S0N0-CM, and SπN0-CM. The digital signals were played via RME ADI-8DS D/A converters, a TDT-HB7 headphone amplifier, and Sennheiser HD 650 headphones at a fixed overall masker level of 57 dB SPL. The mean results across all 21 participants for the four target-masker conditions are shown in Fig. 48.2 (filled symbols connected by dashed lines). In summary, the CMR for this signal-masker configuration is about 6 dB, irrespective of the interaural phase of the target signal, while the BMLD is about 10 dB, irrespective of whether the masker is uncorrelated or comodulated. Although the CMR and BMLD differed somewhat from listener to listener (the standard deviation across listeners is indicated by error bars), the overall pattern was largely consistent across listeners. This apparent lack of interaction between CMR and BMLD is consistent with the previous interpretation that the monaural and binaural cues involved in the release from masking are processed largely independently (Epp and Verhey 2009).
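The adaptive rule described above can be illustrated with a minimal Python sketch. It is not the software used in the study: the starting level, the larger initial step sizes, the number of reversals averaged, and the `respond` callback (which stands in for one three-alternative trial and returns whether the answer was correct) are all assumptions made for illustration. Only the 1-up 2-down rule and the 1 dB minimum step size come from the text. A 1-up 2-down rule is known to converge on the level yielding about 70.7 % correct responses.

```python
from typing import Callable, List, Tuple


def one_up_two_down(
    respond: Callable[[float], bool],
    start_level: float = 10.0,                         # assumed starting target level (dB)
    step_sizes: Tuple[float, ...] = (4.0, 2.0, 1.0),   # assumed; only the 1 dB minimum is given
    final_reversals: int = 6,                          # assumed number of reversals to average
    max_trials: int = 1000,                            # safety limit for this sketch
) -> Tuple[float, List[float]]:
    """Run a 1-up 2-down track and return (threshold estimate, reversal levels)."""
    level = start_level
    step_idx = 0
    correct_in_row = 0
    last_direction = 0          # -1: level went down, +1: level went up, 0: no move yet
    reversals: List[float] = []  # reversal levels collected at the minimum step size

    for _ in range(max_trials):
        if len(reversals) >= final_reversals:
            break
        if respond(level):
            correct_in_row += 1
            if correct_in_row < 2:
                continue        # a single correct answer does not change the level
            correct_in_row = 0
            direction = -1      # two correct in a row: make the task harder
        else:
            correct_in_row = 0
            direction = +1      # one incorrect answer: make the task easier

        if last_direction != 0 and direction != last_direction:
            if step_idx == len(step_sizes) - 1:
                reversals.append(level)  # count reversals only at the minimum step
            else:
                step_idx += 1            # reduce the step size at each early reversal
        last_direction = direction
        level += direction * step_sizes[step_idx]

    if not reversals:
        raise ValueError("no reversals at the minimum step size were reached")
    return sum(reversals) / len(reversals), reversals


# Deterministic stand-in listener (hypothetical): always correct at or above -6 dB.
estimate, reversal_levels = one_up_two_down(lambda level: level >= -6.0)
```

With this idealized listener the track settles into alternating between the two levels on either side of the boundary, so the estimate lies within one minimum step of it. A real listener would additionally guess correctly on one third of the trials below threshold.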
Fig. 48.2
Results from the detection threshold measurements. The filled symbols represent the mean detection thresholds for all four signal-masker combinations, i.e., unmodulated (triangles) and comodulated (filled circles) masker for a diotic configuration (S0N0) and for a target tone with an interaural phase difference of 180° (SπN0); error bars show the standard deviation across all 21 participants. The open symbols indicate the signal-to-noise conditions (SNR1, SNR2, SNR3) that were chosen for the subsequent fMRI experiment
4 Functional MRI Experiment
4.1 Stimulus Selection, Procedure, and Data Analysis
The psychoacoustic results were used to choose three signal-to-noise ratios (S/N) for each listener to explore the effect of S/N on an fMRI correlate of target audibility. The first S/N (SNR1) was selected so that the target signal was audible for both noise conditions and both interaural phase conditions. This S/N was set to 6 dB above the masked threshold for the S0N0-UN configuration, i.e., the most effective masker condition. The second S/N (SNR2) was selected so that the target signal in the S0N0 configuration was not audible for the uncorrelated noise but audible for the comodulated noise (“halfway” between the S0N0-UN and S0N0-CM thresholds). This S/N was picked to search for a correlate of the CMR effect alone, irrespective of binaural cues. The third S/N (SNR3) was selected so that the target was inaudible in both S0N0 configurations but audible in both SπN0 configurations (“halfway” between the S0N0-CM and SπN0-UN thresholds), to identify a correlate of the BMLD effect, largely independent of the effect of the modulation of the masker signal.
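The selection rule can be written out as a short Python sketch. It assumes that the masked thresholds are expressed as S/N in dB and that “halfway” means the arithmetic midpoint in dB. The class and function names are hypothetical, and the threshold values in the example are illustrative only, not data from the study.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class MaskedThresholds:
    """Individual masked thresholds in dB S/N (assumed unit)."""
    s0n0_un: float
    s0n0_cm: float
    spin0_un: float
    spin0_cm: float


def select_snrs(t: MaskedThresholds, offset_db: float = 6.0) -> Tuple[float, float, float]:
    """Return (SNR1, SNR2, SNR3) following the rule described in the text."""
    snr1 = t.s0n0_un + offset_db            # 6 dB above the highest threshold
    snr2 = 0.5 * (t.s0n0_un + t.s0n0_cm)    # midpoint: isolates the CMR effect
    snr3 = 0.5 * (t.s0n0_cm + t.spin0_un)   # midpoint: isolates the BMLD effect
    return snr1, snr2, snr3


def audible(snr: float, threshold: float) -> bool:
    """Idealized audibility: the target is heard when presented above threshold."""
    return snr > threshold


# Illustrative thresholds only (hypothetical listener with 6 dB CMR and 10 dB BMLD).
example = MaskedThresholds(s0n0_un=-6.0, s0n0_cm=-12.0, spin0_un=-16.0, spin0_cm=-22.0)
snr1, snr2, snr3 = select_snrs(example)
```

For this hypothetical listener, SNR2 lies above the S0N0-CM threshold but below the S0N0-UN threshold, and SNR3 lies below both S0N0 thresholds but above both SπN0 thresholds, which is the audibility pattern the design aims for. The midpoint rule only produces this pattern if the thresholds are ordered as in the example.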
In total, thirteen different sound conditions were used during the fMRI recording for each of the 21 listeners: S0N0-UN at SNR1, SNR2, and SNR3; S0N0-CM at SNR1, SNR2, and SNR3; SπN0-UN at SNR1 and SNR3; SπN0-CM at SNR1 and SNR3; N0-UN and N0-CM alone with no target signal; and a condition with no stimulus presentation as a baseline control. The selected signal-to-noise ratios are shown as open symbols in Fig. 48.2 to illustrate how the fMRI design relates to the results of the preceding detection experiments. As mentioned above, the signal-to-noise ratios were chosen individually based on the previously determined masked thresholds. A subgroup of 16 listeners showed largely homogeneous detection thresholds. The mean selected S/Ns for these listeners were SNR1 = 0.1 (1.1) dB, SNR2 = −9.2 (1.1) dB, and SNR3 = −14.7 (1.8) dB. The remaining five listeners deviated more strongly from this general trend, mainly with respect to the size of the CMR effect, which was very small for some of these participants. They were still included in the fMRI study, with correspondingly different signal-to-noise ratios for the experimental conditions (not listed in detail here).
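As a cross-check on the count, the condition set can be enumerated in a few lines of Python. The tuple layout and the labels (`"SpiN0"` for the target with 180° interaural phase difference, `"silence"` for the baseline) are naming choices made for this sketch, and the baseline is assumed to contain no sound at all.

```python
from typing import List, Optional, Tuple

Condition = Tuple[str, Optional[str], Optional[str]]  # (target config, masker, S/N label)


def build_conditions() -> List[Condition]:
    """Enumerate the sound conditions of the fMRI design as listed in the text."""
    conditions: List[Condition] = []
    # Diotic target: both maskers at all three S/N values.
    for masker in ("UN", "CM"):
        for snr in ("SNR1", "SNR2", "SNR3"):
            conditions.append(("S0N0", masker, snr))
    # Target with interaural phase difference: SNR2 is not used.
    for masker in ("UN", "CM"):
        for snr in ("SNR1", "SNR3"):
            conditions.append(("SpiN0", masker, snr))
    # Masker alone, without a target.
    for masker in ("UN", "CM"):
        conditions.append(("N0", masker, None))
    # Baseline (assumed here to be silence).
    conditions.append(("silence", None, None))
    return conditions


conditions = build_conditions()
```

The enumeration yields 6 + 4 + 2 + 1 = 13 conditions, matching the number given above.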
Each stimulus presentation was a sequence of twelve 400-ms noise bursts together with the respective target at 1 kHz, jittered by ±24 Hz from burst to burst. After each presentation of a stimulus block (i.e., during the respective scan acquisition), the participants indicated by button press whether they had heard the slightly varying tone in the masking noise. This task was introduced to maintain the participants’ attention. Stimuli were played via MR-compatible insert earphones (Sensimetrics S14) at a fixed masker level of 60 dB SPL. Over the full fMRI experiment, each condition was repeated 36 times, giving a total of 468 brain images. The order of conditions was fully randomized, and the complete session was divided into three runs of 156 scans each. Functional MRI data were acquired using a Siemens Sonata 1.5 T MRI system. For the functional data, 21 axial EPI slices (in-plane resolution 3 × 3 mm, thickness 5 mm, echo time TE = 63 ms to maximize BOLD contrast) were acquired, covering most of the cortex, including the whole of the temporal lobes and frontal regions. Sparse imaging with clustered volume acquisition was used (Hall et al. 1999). The total volume acquisition time TA was 2.7 s. On each trial, a 5-s stimulus interval was followed by the 2.7-s scanning interval, giving a total repetition time of TR = 7.7 s. A T1-weighted high-resolution anatomical image (176 sagittal slices, TR = 2.11 s, TE = 4.38 ms) was also collected for each subject.
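The trial counts and timing above can be checked with a small Python sketch of the session layout. How the randomization was actually implemented is not stated. The sketch assumes a single shuffle across the whole session that is then cut into three equal runs, and it assumes that each trial starts with the stimulus interval. The seed, the function names, and the use of integer condition indices are illustrative.

```python
import random
from collections import Counter
from typing import List

N_CONDITIONS = 13          # sound conditions, including the baseline
N_REPETITIONS = 36         # repetitions per condition
N_RUNS = 3
STIMULUS_INTERVAL_S = 5.0  # stimulus interval per trial
ACQUISITION_TIME_S = 2.7   # clustered volume acquisition time TA
N_BURSTS = 12
BURST_DURATION_S = 0.4


def build_session(seed: int = 0) -> List[List[int]]:
    """Shuffle all trials across the session and split them into equal runs."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    trials = [c for c in range(N_CONDITIONS) for _ in range(N_REPETITIONS)]
    rng.shuffle(trials)
    per_run = len(trials) // N_RUNS
    return [trials[i * per_run:(i + 1) * per_run] for i in range(N_RUNS)]


def scan_onsets(n_trials: int) -> List[float]:
    """Onset of each volume acquisition (s), assuming a trial starts with the stimulus."""
    tr = STIMULUS_INTERVAL_S + ACQUISITION_TIME_S
    return [i * tr + STIMULUS_INTERVAL_S for i in range(n_trials)]


runs = build_session()
tr_s = STIMULUS_INTERVAL_S + ACQUISITION_TIME_S   # repetition time
burst_total_s = N_BURSTS * BURST_DURATION_S       # duration of the burst sequence
run_duration_s = len(runs[0]) * tr_s              # duration of one run
```

The twelve bursts last 4.8 s in total and therefore fit within the 5-s stimulus interval, and one run of 156 trials at TR = 7.7 s lasts about 20 min.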