Op-Ed: We are our data, our data are us

Illustration of an evolution from a protozoan to an ape to a caveman to a human, with 1s and 0s written on them.
(Martin Gee / For The Times)

By one count, more than 2.5 quintillion bytes of digitized information are generated on Earth every day. That’s more data than all the words spoken by all the humans who ever existed, using generous estimates for both. Texts, emails, pet videos, influencer TikToks, financial transactions and data about data itself all inundate the world in a seemingly never-ending flood.

As much as this glistening informational ocean supports and amplifies our abilities as a species, it also burdens us. The energy and resource demands of data storage and computation are enormous, and despite ever-improving efficiencies and innovations, the sheer growth of data seems to continually outpace us: The largest individual data centers in the world, tucked into the semi-arid landscape outside the city of Hohhot in Inner Mongolia, each now extend across hundreds of acres to keep up with demand. Yet the microprocessors and storage devices that are state-of-the-art today will be ready for imperfect recycling tomorrow.

There are even technological “mutations,” such as blockchains and cryptocurrencies, that are deliberately demanding in both computation and energy (or else we could all beat the system, mining wealth on a whim). The demand is great enough that Elon Musk announced that Tesla won’t use bitcoin further until the cryptocurrency gets its environmental act together, or at least uses clean energy for 50% of its operations. Even with such precautions, some projections suggest that within 20 years our global electronic data infrastructure may demand as much electrical power as the total amount we generate as a species today. We should venerate the engineers and coders who optimize and improve our data to save energy as much as we praise any green new deal.

Really, though, none of this is new for humans; it’s just on a bigger scale than before. Our externally held information, not encoded in our genetic material yet heritable and evolvable, has always asked a lot of us. For centuries much of this “dataome” has existed in billions upon billions of printed books, each requiring energy and raw materials to be fabricated and accessed. Throughout human history, the dataome has taken all manner of resources and attention to build and sustain, from pigments on cave walls to clay in tablets to the bricks and mortar of libraries and schoolrooms.


The information of the dataome is also constantly reinvented and rewritten into new physical forms, leaping from words to film and video, from archived documents to digital databases. Less than 100 years ago it jumped onto computer punch cards, whose peak use may have sucked up as much as 10% of the United States’ annual coal-burning energy budget. But, despite the demands of the dataome, we’ve seldom stopped to ask why we comply. Why are we so compelled by data and its information?

In a Darwinian sense, we understand that our informational world confers amazing survival advantages onto us as a species. We can explicitly learn from the past. We can interact with people we’ll never physically meet, or, indeed, who lived 1,000 years ago. We can survey and model the world to evaluate risk and reward with astonishing fidelity. Yet these advantages come with mounting challenges, whether in energy and environmental change or in misleading, corrupted information and the sociopolitical instabilities it can cause.


There is another way to see all of this. It involves a realignment of how we think about ourselves as distinct from our data. But if we take that leap, pieces of the puzzle of our existence fall into place. The key is understanding the dataome as just another part of a process that has been unfurling here on Earth over most of the past 4 billion years. That process is life itself.

Biological genes are built out of organic molecules, but underneath they are all about information. Our dataome may be built differently than us, but it has all the hallmarks of a living system, including a deeply symbiotic relationship to its originators, Homo sapiens. Like all symbioses — think us and our microbiomes, or sea anemones and clownfish — that relationship is often, but not always, mutually beneficial. We rely on our dataome, and our dataome relies on us, but the precise needs of each may not always align.

The energy burden of the dataome and its impact on the planetary environment are an example. The dataome won’t stop growing, but that growth can negatively affect the ecosystems that humans rely on. At the same time, humans must have the dataome to function as a species. The outcome is rapid technological and behavioral evolution as we run faster and faster to try to stay stationary, trying to find sustainability with our information.

Critically, by seeing the dataome as a living thing, we might gain a better understanding of just how unstable a situation we’re in, and how to sway the symbiosis more toward humanity’s benefit. It’s hard to find a cure for dysfunction if you don’t even recognize the kind of organism you’re trying to nurse back to health.

We humans need to let go of any sense of supremacy. Put simply, we have never been alone. We have always been bound together with our dataome, a symbiotic entity of biology, language and tools that burst onto the scene some 200,000 years ago, and that has been reshaping Earth ever since.


Caleb Scharf is director of astrobiology at Columbia University. His new book is “The Ascent of Information: Books, Bits, Genes, Machines, and Life’s Unending Algorithm.”