Chirp variants in cotton-top tamarins: type A used in mobbing; type B used in investigation of novel objects; type C used during foraging, and type D used during eating. Type E chirps serve as alarm calls. Type F chirps are given in response to hearing calls of novel animals. Type G chirps are exchanged between calm animals within a group and type H chirps are used as mild alarms. (Modified from Snowdon 1982)
Syntax in animal signals refers to the orderly sequencing of multiple calls or notes. Much of bird song is highly organized in terms of the structure and sequencing of different notes or themes, and there is also evidence of this in family-living primates. The songs of gibbons are highly structured with a series of notes produced in the duetting song and coordination of singing between the male and female (see Sect. 6.2). While the same notes are also found in songs that are given in response to predators, the structural organization of the notes differs. In white-handed gibbons, out-of-sight animals responded differently to the two types of songs, indicating that they were using the sequencing of notes rather than the notes themselves to discriminate between the two types (Clarke et al. 2006). In red-bellied titi monkeys, several calls are repeated, and these calls are organized into sequences involving different call types. These sequences were quite regular, and when playbacks of calls in altered sequences were presented to titi monkeys, they showed some ability to discriminate between normal and abnormal sequences (Robinson 1979a).
Tamarins and marmosets also show examples of syntax. Cleveland and Snowdon (1982) described several sequences in calls of cotton-top tamarins with a few general rules. Chirp-like calls always preceded longer constant-frequency calls within a sequence and, within a series of constant-frequency calls, each successive note was higher pitched than the previous one. In most cases, the sequence could not be decomposed into separate parts. That is, the sequenced call did not have the same function as each of the component parts did individually. This is phonological syntax, akin to the use in speech of different phonemes to create different meanings, such as “dog” versus “god.” However, cotton-top tamarins showed a few examples of lexical syntax, wherein each component of the sequence has its own context and the sequence represents the combination of these contexts. For example, after an alarming event an animal will combine an alarm call with an affiliation call, and after this, other group members become active again. A second example is calling in response to the calls of novel animals: the male and female initially each use different calls but combine both types of calls at the peak of arousal (McConnell and Snowdon 1986). Miller et al. (2005) presented tamarins with manipulated long calls and found that recognition of call type and of caller occurred in separate stages of sensory processing.
6.4.3 Turn Taking
Duetting between mated pairs was discussed previously in Sect. 6.3.2 on pair bonding, but coordination of calling among group members is also seen outside of the calling between mates. In a group of three pygmy marmosets, Snowdon and Cleveland (1984) found that each animal within the group was more likely to call before another animal called a second time, and one possible order of turn taking (e.g., ABC, BCA, or CAB) was more common than the other order (CBA, BAC, or ACB). The development of turn taking is dependent upon the ability to recognize each individual based on voice alone.
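The distinction between the two turn-taking orders can be made concrete with a short sketch. It is only an illustration, not the analysis used by Snowdon and Cleveland (1984): the function name `count_turn_orders`, the three-call sliding window, and the example sequence are all assumptions introduced here.

```python
from collections import Counter

# The two cyclic orders for three callers, as listed in the text.
FORWARD = {"ABC", "BCA", "CAB"}
REVERSE = {"CBA", "BAC", "ACB"}


def count_turn_orders(callers):
    """Classify every window of three consecutive calls.

    `callers` is a sequence of caller labels ('A', 'B', 'C') in the
    order the calls were heard. A window counts as 'forward' or
    'reverse' when all three animals call once each, and as 'repeat'
    when some animal calls twice within the window.
    """
    counts = Counter(forward=0, reverse=0, repeat=0)
    for i in range(len(callers) - 2):
        window = "".join(callers[i:i + 3])
        if window in FORWARD:
            counts["forward"] += 1
        elif window in REVERSE:
            counts["reverse"] += 1
        else:
            counts["repeat"] += 1
    return dict(counts)


# Hypothetical call sequence, invented for illustration only.
example = "ABCABCACB"
print(count_turn_orders(example))
```

In a tally of this kind, a predominance of one cyclic order over the other, together with few "repeat" windows, would correspond to the pattern described above.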
Several studies have looked at antiphonal calling (the exchange of calls between two or more individuals or groups), which is common among marmosets and tamarins. The results included evidence of individual recognition within antiphonal calling (Miller and Thomas 2012), different structure in initial calls versus answering calls (Miller et al. 2010), and evidence of learning turn-taking behavior during development (Chow et al. 2015). Vocal turn taking by marmosets shows dynamics similar to those of vocal turn taking by humans, implying convergent evolution of cooperative vocal behavior in these two cooperatively breeding species (Takahashi et al. 2013).
6.4.4 Vocal Memory
Individual recognition by voice is critically important in any social group of primates, and recognition of voices of mates and of other family members is important in family-living species (also see Sect. 6.3). Little work has been done on long-term memory for vocalizations. However, in the natural environment where animals of both sexes disperse and form new family groups, recognition of the voices of relatives might be important in avoiding inbreeding. One study of cotton-top tamarins demonstrated that memories of calls of former family members last up to 5.5 years (Matthews and Snowdon 2011). To date, this is the longest duration of vocal memory in any nonhuman primate.
In human speech, phonemes are produced along a variety of continua, such as voice onset time or place of articulation, and human perceptual systems organize these vocal continua into discrete categories that allow the perception of distinct phonemes instead of multiple variations. Do similar processes exist in other species? Pygmy marmosets produce many variants of trills, which are sinusoidal, frequency-modulated calls varying in bandwidth and duration (see Fig. 6.2). Although several variants are used in similar contexts (see Sect. 6.8), two trill types are used in distinct contexts: the closed mouth trill is used as an affiliative contact call, whereas the open mouth trill is used in agonistic contexts. The main structural difference between these two calls in a captive population was duration with all closed mouth trills being shorter than 250 ms and all open mouth trills being longer. Snowdon and Pola (1978) synthesized trills and varied them along dimensions of bandwidth, rate of frequency modulation, and duration and played these synthesized trills to the marmosets. On the duration dimension, there was a clear category boundary at 250 ms with calls on either side of the boundary (varying by only 8 ms) eliciting different responses. Closed mouth trills elicited an immediate antiphonal response, whereas open mouth trills did not.
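The logic of a category boundary can be sketched as a step function over a continuous stimulus dimension. This is a minimal illustration under stated assumptions, not a model fitted to the data of Snowdon and Pola (1978): the function names, the treatment of a trill of exactly 250 ms, and the stimulus durations are all hypothetical.

```python
BOUNDARY_MS = 250  # category boundary on the duration dimension, from the text


def trill_category(duration_ms):
    """Label a trill by duration alone.

    Assumption: a trill of exactly 250 ms is treated as 'open_mouth';
    the text does not say how the boundary value itself was perceived.
    """
    return "closed_mouth" if duration_ms < BOUNDARY_MS else "open_mouth"


def expects_antiphonal_reply(duration_ms):
    """Closed mouth trills elicited an immediate antiphonal response."""
    return trill_category(duration_ms) == "closed_mouth"


# Hypothetical stimulus durations (ms). 246 and 254 differ by only 8 ms
# but fall on opposite sides of the boundary.
stimuli_ms = [150, 200, 246, 254, 300, 350]
for duration in stimuli_ms:
    print(duration, trill_category(duration), expects_antiphonal_reply(duration))
```

The point of the sketch is that a 96 ms difference within a category (150 versus 246 ms) changes nothing in the predicted response, whereas an 8 ms difference across the boundary changes it completely.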
Trill variants in pygmy marmosets: (A) closed mouth trill; (B) open mouth trill; (C) quiet trill; (D) juvenile trill; (E) J-call. (From Snowdon 1982, reprinted with permission of Cambridge University Press)
Masataka (1983) played synthesized alarm calls to Goeldi’s monkeys (Callimico goeldii) and found that an increase of 0.2 kHz in the frequency range of the modulating sweep was sufficient to induce different behavioral responses, from a response appropriate to a mobbing call (i.e., approaching the caller to attack a predator) at a low-frequency range to a response appropriate to an alarm call (i.e., freezing) at a higher frequency range. Thus, both pygmy marmosets and Goeldi’s monkeys show a human-like categorical perception of their own calls.
In a perceptual study of cotton-top tamarins, Ghazanfar et al. (2001) played back partial phrases or complete combination long calls and found that isolated tamarins responded significantly more to the entire call than to any component parts. They concluded that, from a tamarin’s perspective, the entire long call forms the unit of perception. Bauers and Snowdon (1990) selected the two most acoustically similar of the eight chirps produced by cotton-top tamarins (F and G chirps, see Fig. 6.1) and found a clear difference in behavioral responses between the two playbacks.
6.4.6 Summary: Cognitive Aspects
There is considerable evidence for cognitive complexity in vocal communication in family-living primates. Referential signals communicate about food quality and predator types, and there is evidence of subtle variation in call structure that is correlated with specific contexts. Several species have call sequences that are consistent and predictable, and different sequences are used in different contexts. Many species show turn-taking behavior that indicates rule-based structures governing who will call as well as individual recognition of group members. There is some evidence of long-term vocal memory that may be important in avoiding inbreeding, and the perception of vocalizations has several parallels to the perception of speech sounds by humans.
6.5 Vocalizations in Social Learning and Teaching
Studies of social learning and teaching rarely mention the role of communication, yet vocal communication may play an important role. This section examines two sets of findings: one on how vocal communication might influence social learning and the other on putative teaching behavior in tamarins.
6.5.1 Social Learning
Although there is good evidence that rodents and birds can learn from others to avoid noxious foods (Galef and Giraldeau 2001), there has been little such evidence among nonhuman primates. An illustrative example involves tufted capuchin monkeys (Cebus apella), which are not pair bonded or cooperatively breeding. When invisible white pepper was added to a familiar preferred food, mozzarella cheese, Visalberghi and Addessi (2000) found that capuchin monkeys learned to avoid the food individually. That is, there was no effect of watching other animals sample the adulterated food.
In a replication of the food avoidance study, in this case with cotton-top tamarins, Snowdon and Boe (2003) added white pepper to highly preferred tuna fish and found that only a third of the tamarins ever sampled the adulterated tuna, meaning that the other two-thirds of the animals avoided this previously preferred food. Furthermore, when tuna was later presented without any pepper, several animals continued to avoid eating tuna for more than a year after the initial experiment. What could account for the difference between these two studies? There was no evidence of any communication between the non-family-living capuchin monkeys that Visalberghi and Addessi studied, whereas cotton-top tamarins that sampled the adulterated tuna significantly reduced the number of food calls produced and increased the number of alarm calls (a novel use of alarm calls, see Sect. 6.7.4). The monkeys that first sampled the food also gave an increased frequency of visual disgust responses. Thus, the use of vocalizations (and visual signals) by the tamarins that first sampled the adulterated tuna may have facilitated the rapid and enduring social learning to avoid tuna.
6.5.2 Teaching
The existence of teaching in nonhuman animals has long been controversial. However, Caro and Hauser (1992) provided a simple operational definition. They have four criteria: (1) the teacher must alter its behavior only in the presence of a naïve animal; (2) the teacher must incur some cost or at least no immediate benefit; (3) the teacher’s behavior encourages, punishes, or sets an example for the naïve animals; and (4) as a result, the naïve animal acquires a skill faster than it might otherwise. An additional criterion might be that the teacher is sensitive to the changes in the learner’s behavior and alters its own behavior accordingly.
Tamarin and marmoset species are interesting because adults often share food with infants beginning at the time of weaning. This appears to modulate any weaning conflict and leads to young animals being able to feed on solid food at an earlier age than they might otherwise. Vocalizations play an important role in this process. Infants of many species beg for food, but adult tamarins who are prepared to share food with infants give distinct variations on normal food calls (see the earlier discussion of food calls). Adults not only produce more bouts of food calls but also produce many more calls within a bout, at a much faster rate, than they do with only adults present (Joyce and Snowdon 2007). The probability of an infant being able to obtain food from an adult is dependent on the adult producing the call (Roush and Snowdon 2001; Joyce and Snowdon 2007). Adults have thus modified their vocal behavior specifically for use in the food sharing context. Since these calls are energetically more costly than normal food calls and the adults are giving up some of their food, they are clearly incurring a cost. When twins are present (twinning is common among marmosets and tamarins), adults begin to give these rapid food calls and to share food almost a month earlier than when there is only a single infant present. Twins who receive food sharing at an earlier date also begin to forage on their own earlier than singletons, suggesting that the initially naïve animals are acquiring skills as a result of the adult behavior.
Food sharing begins at the end of the second month of life, peaks during the third month, and is rarely seen by five months of age. At this point all young tamarins are foraging successfully by themselves and giving food-associated calls similar to those of adults. Humle and Snowdon (2008) tested juvenile cotton-top tamarins seven months and older on a novel foraging task. Two opaque tubes with a food container suspended inside each tube were introduced first to the parents, and each parent was trained on a different method of solution. One solution was to walk along a branch and reach up into a tube to obtain food. The other solution was to hang suspended from the ceiling and pull up the food container hand over hand. Once the adults were well-trained, a juvenile was introduced. Even though food sharing and infant forms of food calling had not been observed for more than two months, the adults again began to give infant food calls and shared with the juveniles, but they only did this in the presence of the novel task and not on control days when food was present in a food dish. However, as soon as the juvenile was successful in obtaining food from the apparatus, the adult model stopped vocalizing and no longer engaged in food sharing. This is clear evidence that adult tamarins are sensitive to the changes in the learner’s behavior and are adjusting their own behavior.
Parallel results have been reported in both captive and field studies of golden lion tamarins. Captive golden lion tamarins are more likely to share novel or difficult-to-process foods with infants (Rapaport 1999), and in the wild, where young tamarins have difficulty catching insect prey, adults successively withhold assistance from juveniles as their insect-catching skills improve (Rapaport 2006; Rapaport and Ruiz-Miranda 2006). In both golden lion tamarins and cotton-top tamarins, adults have been observed calling near a prey source or assisting a young animal in obtaining food. This scaffolding behavior is a mark of human teaching, and its presence in tamarins contrasts sharply with the absence of any coaching or scaffolding behavior in chimpanzees, even when young individuals are feeding on potentially painful biting ants (Humle et al. 2009). However, despite the evidence that adults appear to be sensitive to the abilities of young animals in cotton-top tamarins and lion tamarins, research on common marmosets did not show evidence of such sensitivity (Brown et al. 2005).
6.5.3 Summary: Vocalizations in Social Learning and Teaching
Vocal signals play an important role in both social learning and in teaching behavior in tamarins, and one is tempted to argue that such communication may be responsible for facilitating the rapid social learning seen in these species and absent in capuchin monkeys and chimpanzees. However, this is a hypothesis that needs to be tested closely in other family-living species as well as nonhuman primates with other forms of social organization. Most researchers on social learning have not been interested in the role of communication, but this may prove to be important.
6.6 Vocal Development
As noted in Section 6.1, it is commonly thought that vocal structures are innate in primates with little or no developmental modification. However, family-living primates appear to demonstrate a greater influence of social and environmental factors on vocal structure than has been seen in other nonhuman primates. This section first reviews various models and methods of studying vocal development followed by information about babbling and consideration of some naturalistic and experimental studies that suggest that vocal development of family-living primates is sensitive to social and environmental factors. Section 6.7 then examines plasticity in adult vocal structure and usage.
6.6.1 Models and Methods of Vocal Development
Three aspects of the development of vocal communication can be distinguished: (1) signal structure; (2) appropriate usage; and (3) comprehension of signals. Each of these may be subjected to different developmental processes. Four models can be used to explain developmental processes in vocal communication. These include (1) innate or genetic determination, whereby signal structure, usage, or comprehension are fixed at birth; (2) maturation, whereby signal structure, usage, or comprehension changes as a function of physical or social maturation but without any explicit learning process; (3) limited learning, whereby only certain aspects of signal structure, usage, or comprehension can be developed and only during a limited period in development; and (4) open-ended learning where structure, usage, or comprehension can be modified throughout an animal’s life span.
It is generally accepted that nonhuman primates display developmental flexibility in the usage and comprehension of signals but that vocal structures are innate and not susceptible to modification by experience (Seyfarth and Cheney 1997). Janik and Slater (1997, 2000) have argued that evidence of vocal learning requires that an animal be able to acquire vocalizations from outside its natural species-specific repertoire. They further state that only songbirds and a few other genera of birds, cetaceans, bats, and humans show this ability, whereas no nonhuman primates do. This view has been reinforced by early studies of squirrel monkeys (Saimiri sciureus) and rhesus macaques (Macaca mulatta) that were reared in isolation. The isolate-reared squirrel monkeys had a normal adult vocal repertoire and responded with appropriate vocalizations in the proper contexts (i.e., giving alarm calls to predators never seen before) (Winter et al. 1973; Herzog and Hopf 1983, 1984). Similarly, isolate-reared rhesus macaques showed only minor perturbations in the structure of their coo vocalizations (Newman and Symmes 1974). Isolate-reared rhesus macaques were also tested in a situation in which one animal saw a stimulus that signaled a shock, while a second animal could see only the facial expression of the first animal and had to respond to save both animals from being shocked. The isolate-reared animals were effective communicators, but they could not “read” the signals of another monkey when they were the ones that had to respond (Miller 1967). This suggests that, whereas the production of the signal and its use in an appropriate context were not affected by isolate rearing, the comprehension of the signal was impaired.
Isolate rearing of nonhuman primates is not ethically acceptable today, but cross-fostering and hybridization are two less invasive methods. In a study that cross-fostered rhesus and Japanese macaques with mothers of the other species, there was no evidence that the cross-fostered infants acquired the vocalizations of their foster species, but the foster mothers rapidly learned to respond appropriately to the calls of the foster infant (Owren et al. 1993). Hybridization between two species of squirrel monkeys found that the hybrid offspring tended to acquire the call characteristics of their mothers (Newman and Symmes 1982). However, in the wild, male squirrel monkeys are typically excluded from the group after mating, so it is possible that infant squirrel monkeys normally learn call structure from their mothers. Two studies on hybrid gibbon infants found that the calls of infants did not resemble those of either parent and, in some cases, contained aspects of the vocal structure of unrelated species. The mechanisms of vocal development in gibbons are complex and not easily related either to direct inheritance from one or both parents or to vocal learning from parents (Geissmann 1984; Tenaza 1985).
However, with the exception of the gibbons, none of the species reviewed so far is family living. Would developmental processes be different in family-living species? There are two types of examples: the spontaneous babbling-like behavior of pygmy marmosets (Cebuella pygmaea) and the naturalistic study of vocal development combined with some experimental manipulations in pygmy marmosets, common marmosets, and cotton-top tamarins. Little is known about other family-living species, and this material is reviewed in the final section.
6.6.2 Babbling-Like Behavior
From the first two weeks of life, young pygmy marmosets engage in long vocal bouts that contain a variety of call types (Elowson et al. 1998). These bouts share many characteristics with the babbling behavior of human infants. The majority of the calls produced were similar to adult calls and, indeed, represented a subset of adult calls. The calls (e.g., alarm calls, food calls, contact calls, etc.) are given out of context, given in a haphazard order, and often repeated several times with no relationship to the normal adult context for calls. Finally, adults respond to calling infants by approaching them and making physical contact. The main difference in comparison to human babbling is that the pygmy marmosets do not have a phonetic structure; thus babbling consists of calls rather than phonemes. Often the subsong and plastic song of songbirds are treated as a parallel to human babbling behavior (Marler 1970), but there are some fundamental differences. Song is typically produced only by male birds, and subsong and plastic song appear only as birds undergo puberty. In contrast, pygmy marmoset babbling begins in infancy and is seen equally in both sexes.
What are the consequences of babbling? Snowdon and Elowson (2001) reported that greater babbling early in infancy led to improved vocal production and a greater number of adult-like vocalizations after weaning. However, vocal development was not completed at weaning. The most commonly used adult call is the trill, and marmosets continued to improve on the production of adult trills throughout puberty and adolescence, reaching adult-like trill structure only as breeding adults, much like the food-associated calls of cotton-top tamarins (see below). Interestingly, submissive adult marmosets regress to babbling behavior during aggressive encounters, implying a plasticity of usage of infant vocalizations.
6.6.3 Naturalistic and Noninvasive Experimental Approaches
Studies of cotton-top tamarins found some plasticity in vocal development. In a feeding context, in which adults gave specific food-associated calls when approaching and leaving food (Elowson et al. 1991), infant and juvenile tamarins produced calls that did not match adult structure and were considerably more variable. These young animals also produced other vocalizations (not heard from adults) in feeding contexts (Roush and Snowdon 1994). Curiously, there was no developmental progression toward the production of adult-like vocalizations in this context, even in animals that were past puberty. In an experimental study, Roush and Snowdon (1999) recorded feeding vocalizations in cotton-top tamarins while they were living in family groups and again after they were paired with a mate and separated from their natal families. There was a rapid (within 2–3 weeks) change in feeding vocalizations, including the elimination of the other calls and development of a clear adult structure for the food calls. This suggests that social context may serve as a constraint on adult vocal production. As tamarins are cooperative breeders, in which only the adult pair reproduce and other group members act as nonreproductive helpers, it may be that young animals inhibit their adult usage of calls until they become reproductively active themselves.
Cotton-top tamarins produce eight chirp-like vocalizations (short, high-pitched, frequency-modulated calls, see Fig. 6.1) with each chirp type being used in a discrete context (e.g. feeding, mobbing, alarming, responding to a stranger’s call, and responding to a group member) (Cleveland and Snowdon 1982). Castro and Snowdon (2000) carried out an experimental study of how infant tamarins used these calls. Adult tamarins used the appropriate chirp type in each of the different contexts. Infants, unlike adults, typically did not produce discrete chirps but instead produced a sequence of chirps with descending frequency. Over the period of infant dependence through weaning, each of the infants tested produced some of the chirp types in an appropriate context, but no one individual produced all of the chirp types and no experimental context elicited an appropriate chirp type from each infant. These results suggest a relatively slow process of development and show that young tamarins are not able to produce adult calls at birth, in marked contrast to non-family-living squirrel monkeys. Although cotton-top tamarins did not show the babbling-like behavior seen in pygmy marmosets, they did show great variation in chirp structure and only rarely produced adult-like calls. If there are innate templates for vocal structures, they need to be shaped and sharpened through experience.
Elowson et al. (1992) recorded pygmy marmoset trills throughout ontogeny and found that trills changed during the course of development, suggesting they are not produced in adult-like ways at birth. If maturational processes alone were responsible for these changes, all animals should show a similar pattern of vocal development. However, young marmosets, even twins within a litter, showed different patterns of trill development that were not consistent with a simple maturational model. Evidence of adult plasticity in vocal production and usage (presented in the next section) suggests that marmosets and tamarins can adjust vocal production throughout their lives.
A study of common marmosets shows quite elegantly that adult caregivers play an important role in shaping the vocal development of their offspring. Takahashi et al. (2015) studied the development of the phee call, a frequent call given when marmosets are separated from one another. They found that the calls became more stereotyped over the first two months with increased duration, decreased central frequency, and decreased entropy. Four discrete clusters of calls were seen in neonates, but these were reduced to one or two clusters by two months of age. At first glance this may seem to support a simple maturational model of vocal development. However, changes in phee quality were not correlated with age, body weight, or physiological development of the respiratory system. Takahashi et al. (2015) recorded infants both when alone and when in vocal contact with one of their parents. Parents generally respond to infant calls with well-formed adult phees. Rates of parental responsiveness to infants correlated directly with the age at which infants began producing well-formed phees of their own, suggesting that parental responsiveness to infant cries directly influences an infant’s trajectory toward an adult call. Although studies of babbling in pygmy marmosets showed a higher rate of adult social interaction with babbling versus nonbabbling infants (Elowson et al. 1998), this is the first experimental demonstration of parental influence on vocal development in any nonhuman primate. However, there are clear parallels to vocal development in other taxa, including birds and humans (West and King 1988; Goldstein et al. 2003).
6.6.4 Vocal Development in Other Family-living Species
In hybrid gibbons, the song structure was complicated with few direct structural features inherited or learned from parents (Geissmann 1984; Geissmann and Orgeldinger 2000). However, Merker and Cox (1999), studying a single female gibbon, reported that vocal development was a slow process with different components of female great call structure appearing at different ages, much like the relatively slow development reported for marmosets and tamarins. There was also increased coordination of the infant’s calling with that of its mother as the infant grew older, suggesting that the mother may serve as a model. Further support for mothers serving as models for gibbon vocal development comes from Koda et al. (2013) who found acoustic matching of songs between mothers and daughters. Mothers adjusted their songs to be more stereotyped when co-singing with daughters, especially with daughters who co-sang less. Thus, for female gibbons at least, there appears to be a form of coaching behavior that may serve like the contingent responding in marmosets to shape vocal development in the young.
6.6.5 Summary: Vocal Development
In contrast to the general view that primate vocal structures are innate and not modified through learning processes, the data from family-living primates clearly show that development of adult vocal structures is a gradual process that cannot be attributed solely to maturation. Social variables, such as contingent responding by adults to infant babbling in pygmy marmosets and to infant cries in common marmosets, as well as coaching songs by gibbons, can influence the rapidity of acquisition of adult-like calling. At the same time, the suppression of breeding in adult helpers, inherent in the structure of cooperative breeding, may also inhibit the expression of some adult-like vocalizations until animals achieve breeding status. There are several parallels between development in family-living primates and that of humans that have not yet been reported in species with other breeding systems. Does this plasticity seen in young animals carry over into adult vocal production?
6.7 Flexible Adult Vocal Structure and Usage
Another characteristic of family-living primates is that vocal communication can be used flexibly by adults, with evidence for change in structure and usage in different social and environmental contexts. This is especially evident in four areas: (1) adjustment and convergence of vocal structures with pair or group formation (Sect. 6.7.1); (2) population-specific dialects (Sect. 6.7.2); (3) structural change in response to environmental noise (Sect. 6.7.3); and (4) novel responses to captive environments (Sect. 6.7.4).
6.7.1 Modification and Convergence of Calls with Pair Formation
In a wide array of species, ranging from birds through dolphins to humans, there is evidence of vocal convergence with preferred social partners (Snowdon and Hausberger 1997), but there has been little evidence of vocal change in nonhuman primates. The primary examples again occur among family-living primates. Elowson and Snowdon (1994) recorded calls from two different colonies of pygmy marmosets and subsequently combined the colonies. Within 10 weeks of housing the colonies together, adult and subadult members of both colonies showed an increase in bandwidth of trill calls as well as an increase in pitch. There is no obvious reason for calls to change in this way, but the results demonstrate vocal flexibility in response to a changed social environment. In a parallel study on Wied’s black-tufted-eared marmosets, Rukstalis et al. (2003) recorded phee calls in marmosets under stable social conditions and reported strong individual differences in call structure. Subsequently, some of the animals remained in the same colony room, but others were moved to a different colony room with unfamiliar conspecifics. When phees were recorded eight weeks later, phee calls of the marmosets in the stable social condition could still be identified, whereas those with changed social environments also exhibited changes in their individual call structure.