Fig. 18.1
The tritone paradox and contextual modulation of the perceived change in pitch. (a) The ambiguity in the tritone paradox stems from the Shepard tones’ repetitive structure (see Sect. 2). (b) Steps of up to almost 6 st are perceived as up/down steps, whereas the half-octave (tritone) step is ambiguous and influenced by various factors (acoustic context, pitch class, etc.). (c) Preceding an ambiguous Shepard pair by a sequence of Shepard tones within the half octave above (‘up bias’, dark grey) or below (‘down bias’, light grey), the first tone (d) influences the percept of directionality for human listeners. (e) As indicated in Sect. 1 and detailed in Sect. 4, the standard decoder assumes the absolute minimal angle between the Shepard pair to determine the perceived direction of pitch change (perceived = light grey, prediction = dark grey). We show that the perceived pitches move away (i.e. >6 st) from the bias (actual = black), thus necessitating a different decoder. (f) A relative decoder takes the acoustic context into account and makes relative judgements from the acoustic history, consistent with the present data set. The dark grey areas indicate the range of perceived pitches for which the tones would still be heard according to the bias under the relative decoder.
2.2 Acoustic Stimuli
All stimuli were composed of sequences of Shepard tones (Shepard 1964). A Shepard tone is a complex tone built as the sum of octave-spaced pure tones (a flat spectral envelope was used here). A Shepard tone can be characterized by its position in an octave, termed pitch class (in units of semitones), w.r.t. a base tone. Across the entire set of experiments the duration of the Shepard tones was 0.1 s and the amplitude 70 dB SPL.
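The construction of a Shepard tone can be illustrated with a minimal sketch. Only the octave spacing, the flat spectral envelope and the 0.1 s duration are taken from the description above. The base frequency (440 Hz), the sampling rate, the frequency limits and the function name `shepard_tone` are assumptions made for illustration, and the calibration to 70 dB SPL is not modelled.

```python
import math

def shepard_tone(pitch_class_st, base_hz=440.0, dur_s=0.1, fs=44100,
                 f_lo=20.0, f_hi=20000.0):
    """Sum of octave-spaced sinusoids with equal amplitude (flat envelope).

    base_hz, fs, f_lo and f_hi are illustrative assumptions, not values
    taken from the study.
    """
    # Frequency of the component that sits pitch_class_st above the base tone.
    f0 = base_hz * 2.0 ** (pitch_class_st / 12.0)
    # Fold down by octaves to the lowest component at or above f_lo.
    while f0 / 2.0 >= f_lo:
        f0 /= 2.0
    # Collect all octave-spaced components up to f_hi (and below Nyquist).
    freqs = []
    f = f0
    while f <= f_hi and f < fs / 2.0:
        freqs.append(f)
        f *= 2.0
    n_samples = int(round(dur_s * fs))
    # Equal weights for every component; divide by the count to stay in [-1, 1].
    samples = [
        sum(math.sin(2.0 * math.pi * fc * t / fs) for fc in freqs) / len(freqs)
        for t in range(n_samples)
    ]
    return freqs, samples

freqs, samples = shepard_tone(3.0)  # pitch class 3 st above the assumed base tone
```

Because the components are folded by octaves, pitch classes that differ by 12 st yield the same set of component frequencies in this sketch, which is the repetitive structure underlying the ambiguity of the tritone step.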
The biased Shepard pairs consisted of a bias sequence (‘bias’) followed by an ambiguous, i.e. 6 st separated, Shepard pair. The bias precedes the pair at various temporal separations ([0.05,0.2,0.5,1]s) and consists of a sequence of Shepard tones (lengths, 5 and 10 stimuli), which are within 5 semitones above or below the first Shepard tone in the pair. These biases are called ‘up’ and ‘down’ bias, respectively, as they bias the perception of the ambiguous pair to be ‘ascending’ or ‘descending’, respectively, in pitch (Chambers and Pressnitzer 2011). Altogether we presented 32 conditions (4 base pitch classes ([0,3,6,9]st), 2 randomizations, 2 bias lengths ([5,10] stimuli), ‘up’/‘down’ bias) with different bias sequences, which in total contained 240 distinct Shepard tones, finely covering one octave. In the present study, we use one of the simpler versions of this contextual influence; more detailed psychophysics will be described in a forthcoming study by CC, SAS and DP.
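The condition grid can be enumerated as in the following sketch. The factor levels come from the text. How the bias tones were drawn within the 5 st range is not specified here, so the uniform draw, the fixed seed and the helper name `make_trial` are assumptions for illustration only.

```python
import random
from itertools import product

BASE_PITCH_CLASSES_ST = [0, 3, 6, 9]
RANDOMIZATIONS = [0, 1]          # index of the randomization, used as a label
BIAS_LENGTHS = [5, 10]           # number of Shepard tones in the bias
BIAS_DIRECTIONS = ["up", "down"]

def make_trial(base_st, n_bias, direction, rng):
    """Build one biased Shepard pair as pitch classes in semitones."""
    sign = 1.0 if direction == "up" else -1.0
    # Assumption: offsets are drawn uniformly within 5 st of the first pair tone.
    offsets = [sign * rng.uniform(0.0, 5.0) for _ in range(n_bias)]
    bias = [(base_st + off) % 12.0 for off in offsets]
    # The ambiguous pair is separated by a half octave (6 st).
    pair = (base_st % 12.0, (base_st + 6.0) % 12.0)
    return {"offsets_st": offsets, "bias_st": bias, "pair_st": pair}

rng = random.Random(0)  # hypothetical seed, for reproducibility of the sketch
conditions = list(product(BASE_PITCH_CLASSES_ST, RANDOMIZATIONS,
                          BIAS_LENGTHS, BIAS_DIRECTIONS))
trials = [make_trial(base, n_bias, direction, rng)
          for base, _, n_bias, direction in conditions]

n_conditions = len(conditions)                        # 4 * 2 * 2 * 2
n_bias_tones = sum(len(t["bias_st"]) for t in trials)  # 16 * 5 + 16 * 10
print(n_conditions, n_bias_tones)
```

Under this enumeration the bias sequences contain 16 × 5 + 16 × 10 = 240 tones in total, which agrees with the number of distinct Shepard tones stated above.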
Further, we used pitch comparison sequences in the psychophysical studies. The pitch comparison sequences consisted of a bias, a reference Shepard tone, and a target Shepard tone. The bias preceded the reference, and the target (drawn from the set [−3,−2.9,…,2.9,3]st) followed 3 s later. Subjects were asked to report whether the target’s pitch was higher or lower than the reference’s.
2.3 Population Decoding
The perceived stimuli in the ambiguous pair were estimated from the neural responses by training a decoder on the biasing sequences and then applying the decoder to the neural response of the pair. We first built a matrix of responses which had the (240) different Shepard tones occurring in the bias running along one dimension and the neurons along the other dimension. The PCA (Principal Component Analysis) decoder performed a linear dimensionality reduction, interpreting the stimuli as examples and the neurons as dimensions of the representation. The data were projected onto the first three dimensions, which represented the pitch class as well as the position in the sequence of stimuli. To assign a pitch class to the decoded stimuli of the test pair, we projected them onto the ‘pitch circle’ formed by the decoded stimuli from the bias sequences. More precisely, we estimated a smoothed trajectory through the set of bias tones which was assigned a pitch class at every point, by averaging the pitch classes of the closest 10 bias stimuli weighted by their distance to the point. The pitch class of the test tone was set to the pitch class of the closest point on the trajectory.
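The decoding steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the analysis code of the study. The smoothed trajectory is skipped, and the pitch class is read directly from the 10 nearest bias stimuli. Inverse-distance weighting and circular averaging are assumptions, the neural responses are replaced by a synthetic population, and the names `fit_pca`, `decode_pitch_class` and `toy_population` are hypothetical.

```python
import numpy as np

def fit_pca(responses, n_components=3):
    """PCA via SVD; responses has shape (n_stimuli, n_neurons)."""
    mean = responses.mean(axis=0)
    centered = responses - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]          # (n_components, n_neurons)
    scores = centered @ components.T        # bias stimuli in PC space
    return mean, components, scores

def decode_pitch_class(test_response, mean, components,
                       bias_scores, bias_pitch_st, k=10):
    """Assign a pitch class (st) to one test response from its k nearest bias stimuli."""
    point = (test_response - mean) @ components.T
    dist = np.linalg.norm(bias_scores - point, axis=1)
    nearest = np.argsort(dist)[:k]
    # Assumption: closer bias stimuli get larger (inverse-distance) weights.
    weights = 1.0 / (dist[nearest] + 1e-12)
    # Assumption: average on the circle so that 11.9 st and 0.1 st are neighbours.
    angles = 2.0 * np.pi * bias_pitch_st[nearest] / 12.0
    resultant = np.sum(weights * np.exp(1j * angles))
    return float((np.angle(resultant) * 12.0 / (2.0 * np.pi)) % 12.0)

def toy_population(pitch_st, n_neurons=50, kappa=2.0):
    """Synthetic, noise-free tuning curves (illustration only, not recorded data)."""
    preferred = np.linspace(0.0, 12.0, n_neurons, endpoint=False)
    delta = 2.0 * np.pi * (np.atleast_1d(pitch_st)[:, None] - preferred[None, :]) / 12.0
    return np.exp(kappa * (np.cos(delta) - 1.0))

# 240 bias tones covering one octave, rows = stimuli, columns = neurons.
bias_pitch_st = np.arange(240) * 0.05
bias_responses = toy_population(bias_pitch_st)
mean, components, bias_scores = fit_pca(bias_responses)

# Decode a held-out tone from the same synthetic population.
test_response = toy_population(4.02)[0]
decoded = decode_pitch_class(test_response, mean, components,
                             bias_scores, bias_pitch_st)
print(round(decoded, 2))
```

In this sketch the decoder is trained only on the bias responses and then applied to a response it has not seen, which mirrors the train-on-bias, test-on-pair logic described above. A fuller version would first fit the smoothed trajectory and then look up the closest trajectory point.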
2.4 Statistical Analysis
Nonparametric tests were used throughout the study to avoid assumptions regarding distributional shape.
3 Results
We obtained single-unit recordings from 555 neurons in the primary auditory cortex of seven awake ferrets and conducted psychophysical experiments with ten subjects under various stimulus conditions.
3.1 Neurons in Auditory Cortex Exhibit Tuning to Shepard Tones
A considerable subset of neurons in auditory cortex responded to the presentation of Shepard tones with a significant change in response rate compared to spontaneous rate (55 %: 43 % increased, 12 % decreased; p < 0.05), while 41 % of the neurons also had a significantly tuned response. A well-tuned unit is shown in Fig. 18.2a, where the firing rate varies as a function of the pitch class of the Shepard tone. Neurons typically exhibited a single peak of varying width, although multipeaked tuning curves existed as well (∼30 % of the tuned cells). Overall, the median tuning width was 2.06 [25 %, 0.82; 75 %, 6.44]st (2 SD of a Gaussian fit to the tuning). Neurons exhibited strongest tuning in onset, sustained, and offset responses in similar proportions (onset, 38 %; sustained, 33 %; offset, 29 %).