Vowel sounds made by baboons show that the roots of human speech may go back 25 million years

Human speech is based on using vowels as the kernel of a sound and placing consonants around those vowels, and, for a long time, researchers assumed that nonhuman primates couldn’t make vowel-like sounds. (Jan. 12, 2017)


Listen closely to those baboon calls. They may tell you a thing or two about human speech.

Scientists who studied baboons’ wahoos, yaks, barks and other vocalizations have found evidence of five vowel-like sounds — a sign that the physical capacity for speech may have evolved over much longer timescales than previously thought.

The findings, described this week in the journal PLOS One, could have significant implications for our understanding of the development of human speech and the emergence of language.

Scientists studying the evolution of speech are in a tricky bind because, unlike bones or shells, spoken words leave no fossil imprints in the geological record. How do you study the development of something as insubstantial as a sound?


Luckily, there are physical structures we can study: the mouths that make those sounds. By comparing the vocal tracts of humans with those of their close primate relatives, researchers can get a sense of which particular traits were necessary for the emergence of speech. They might even identify physical characteristics that would have impeded it.

Speech “engages anatomical traits that might leave fossil clues, as well as overt anatomical, physiological, and behavioral aspects for which parallels can be sought in living primates,” the study authors wrote.

In large part, human speech uses vowels as the kernel of a sound and places consonants around those vowels. So the number of different vowels you can make is important, because it means you can make a greater variety of potentially meaningful chunks of sound.

Think about “cat,” “kit,” “cut,” “coat,” “coot,” “keet,” and “caught” — seven words with distinct meanings. Each has a “k” sound at the beginning and a “t” at the end; what separates them is their vowels. Without each of those subtly distinguishable vowels, English speakers wouldn’t be able to tell those words apart.

Languages have different inventories and patterns of vowel and consonant usage, but they all rely on roughly the same vocal tract shape. And for a long time, many researchers assumed that nonhuman primates couldn’t make vowel-like sounds because their larynxes (or voice boxes) sat much higher in the neck than human larynxes do. That assumption had major implications for theories on the emergence of language, which remains a uniquely human ability.

“This theory has often been used to buttress the theoretical claim of a recent date for language origin, e.g. 70,000-100,000 years ago,” the study authors explained. “It also diverted scientists’ interests away from articulated sound in nonhuman primates as a potential homolog of human speech, and thus lent support to less direct explanations of language evolution, involving communicative gestures, complex cognitive or neural functions, or genetics.”


But recent research has begun to challenge that assumption about the larynx, the study authors wrote.

Lowered larynxes have been found in other animals that have no ability to make vowel sounds. And human babies, who have very high larynxes, can still generate the same vowel range as adults. Scientists have begun to realize, thanks to computer modeling work, that the movement and control of the tongue are actually much more important in making vowel sounds than the height of the larynx.

To test this idea, a French-led team of scientists studied vocalizations from 15 Guinea baboons (12 females and three males) living in an outdoor enclosure at the National Center for Scientific Research’s primate center in Rousset-sur-Arc, France. They focused on the half-hour before feeding, when the baboons were especially vocal, and avoided recording during the dinner hour, when they were busy munching on their meals.

The scientists analyzed the recordings looking for “formants.” These are concentrations of acoustic energy around key frequencies in human speech, and their distribution is defined in part by the shape of our vocal tract.

The individual formants found in a vowel can tell you the configuration of the mouth that made it — for example, whether the lips are rounded, how high the tongue is, and whether the tongue is pushed forward toward the teeth or back in the mouth.

In human speech, each vowel has a particular blend of formants that makes it a unique, easily identifiable sound.
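For readers who want a feel for how the shape of a vocal tract sets its formant frequencies, here is a rough sketch. It is not the method used in the study. It assumes a common textbook simplification: the vocal tract treated as a straight, uniform tube, closed at the vocal folds and open at the lips, whose resonances fall at odd multiples of a quarter-wavelength frequency. The function name `tube_formants`, the 17.5-centimeter tract length and the speed of sound of 350 meters per second are all illustrative assumptions.

```python
def tube_formants(length_m, count=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other.

    Assumption: the vocal tract is idealized as a straight tube, which only
    approximates a neutral vowel with the tongue at rest.
    """
    if length_m <= 0:
        raise ValueError("length_m must be positive")
    # Quarter-wavelength resonator: F_k = (2k - 1) * c / (4 * L)
    return [(2 * k - 1) * speed_of_sound / (4.0 * length_m)
            for k in range(1, count + 1)]


if __name__ == "__main__":
    # Illustrative adult-sized tract of 17.5 cm (an assumed value).
    for k, freq in enumerate(tube_formants(0.175), start=1):
        print(f"F{k}: {freq:.0f} Hz")
```

With those assumed values, the sketch gives resonances near 500, 1,500 and 2,500 hertz. A shorter tube pushes all of them higher. Moving the tongue changes the tube's shape, which shifts the formants relative to one another, and that shift is what distinguishes one vowel from the next.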


The scientists focused on five types of baboon vocalizations that also appeared to feature formants — grunts, wahoos, barks, yaks and mating calls. After analyzing the 1,335 spontaneous vocalizations (and after splitting the wahoos into their wa- and -hoo subunits), the researchers concluded that the recordings held 1,404 “vowel-like segments.”

The scientists also verified that the baboons were physically capable of making these sounds by dissecting and analyzing the tongues of two baboons who died of natural causes unrelated to the study. For the ability to make specific vowel-like sounds, it seemed that tongue position really was more important than the larynx’s height.

Many scientists have thought that human speech may have evolved recently — within the last 100,000 years or so. In part, that guess was based on the assumption that humans’ primate ancestors didn’t have vocal tracts that were physically capable of generating speech.

But the new findings show that this isn’t true: The ability to articulate vowel-like sounds, necessary for the development of human speech, was probably shared by the last common ancestor of humans and baboons some 25 million years ago.

The scientists suggest that the ability to create distinct vowels using the vocal tract became more sophisticated over time in the ancestors of modern humans.

“Whatever the course of the emergence of language and speech, the evidence developed in this study does not support the hypothesis of the recent, sudden, and simultaneous appearance of language and speech in modern Homo sapiens,” they wrote.



