All it takes is a single selfie.
From that static image, an algorithm can quickly create a moving, lifelike avatar: a video not recorded, but fabricated from whole cloth by software.
With more time, Pinscreen, the Los Angeles start-up behind the technology, believes its renderings will become so accurate they will defy reality.
“You won’t be able to tell,” said Hao Li, a leading researcher on computer-generated video at USC who founded Pinscreen in 2015. “With further deep-learning advancements, especially on mobile devices, we’ll be able to produce completely photoreal avatars in real time.”
The technology is a triumph of computer science that highlights the gains researchers have made in deep neural networks, complex algorithms that loosely mimic the thinking of the human brain.
What used to take a sophisticated Hollywood production company weeks could soon be accomplished in seconds by anyone with a smartphone.
Not available for a video chat? Use your lifelike avatar as a stand-in. Want to insert yourself into a virtual reality game? Upload your picture and have the game render your character.
Those are the benign applications.
Now imagine a phony video of North Korean dictator Kim Jong Un announcing a missile strike. The White House would have mere minutes to determine whether the clip was genuine and whether it warranted a retaliatory strike.
What about video of a presidential candidate admitting to taking foreign cash? Even if the footage proved fake, the damage could prove irreversible.
In some corners of the internet, people are using open-source software to swap celebrities’ faces into pornographic videos, a phenomenon called Deep Fakes.
It’s not hard to imagine a world in which social media is awash with doctored videos targeting ordinary people to exact revenge, to extort or simply to troll.
In that scenario, where Twitter and Facebook are algorithmically flooded with hoaxes, no one could fully believe what they see. Truth, already diminished by Russia’s misinformation campaign and President Trump’s proclivity to label uncomplimentary journalism “fake news,” would be more subjective than ever.
The danger there is not just believing hoaxes, but also dismissing what’s real.
“If anything can be real, nothing is real,” said a Reddit user in a manifesto defending the Deep Fakes forum, which has since been banned for producing porn without consent from the people whose faces were used.
The consequences could be devastating for the notion of evidentiary video, long considered the paradigm of proof given the sophistication required to manipulate it.
“This goes far beyond ‘fake news’ because you are dealing with a medium, video, that we traditionally put a tremendous amount of weight on and trust in,” said David Ryan Polgar, a writer and self-described tech ethicist. “If you look back at what can now be considered the first viral video, it was the witnessing of Rodney King being assaulted that dramatically impacted public opinion. A video is visceral. It is also a medium that seems objective.”
To stop the spread of fake videos, Facebook, Google and Twitter would need to show they can make good on recent promises to police their platforms.
Last week’s indictment of more than a dozen Russian operatives and three Russian companies by special counsel Robert S. Mueller III showed how easily bad actors can exploit the tech companies that dominate our access to information. Silicon Valley was blindsided by the spread of trolls, bots and propaganda — a problem that persists today.
Tech companies have a financial incentive to promote sensational content. And as platforms rather than media companies, they’ve fiercely defended their right to shirk editorial judgment.
Critics question whether Facebook, Google and Twitter are prepared to detect an onslaught of new technology like machine-generated video.
“Platforms are starting to take 2016-style misinformation seriously at some levels,” said Aviv Ovadya, chief technologist at the Center for Social Media Responsibility. “But doing things that scale is much harder.”
Fake video “will need to be addressed at a deeper technical infrastructure layer, which is a whole different type of ballgame,” Ovadya said.
(Facebook and Twitter did not respond to interview requests. Google declined to comment.)
The problem today is that few safeguards exist.
Hany Farid, a digital forensics expert at Dartmouth College who often consults for law enforcement, said watching for blood flow in the face can sometimes determine whether footage is real. Slight imperfections on a pixel level can also reveal whether a clip is genuine.
Over time, though, Farid thinks artificial intelligence will undermine these clues, perpetuating a cat-and-mouse game between algorithms and investigators.
“I’ve been working in this space for two decades and have known about the issue of manipulated video, but it’s never risen to the level where everyone panics,” Farid said. “But this machine-learning-generated video has come out of nowhere and has taken a lot of us by surprise.”
That includes researchers at the Defense Advanced Research Projects Agency. The U.S. military’s high-tech research lab, better known as DARPA, meets regularly with experts in media forensics like Farid and Li from Pinscreen. Discussion at a recent get-together in Menlo Park turned to Deep Fakes and ways to detect ultra-realistic fake video. The consensus was bleak.
“There’s basically not much anyone can do right now,” Li said about automated detection tools.
The same conundrum faced the software company Adobe years ago when it became clear that its photo-editing program, Photoshop, was also being used for trickery. The company looked into including tools that could detect if an image had been doctored. But Adobe ultimately abandoned the idea, determining fraudsters could exploit the tool just as easily, said Kevin Connor, a former Adobe executive who now works with Farid.
“I think Photoshop is an overwhelmingly good thing,” Connor said. “But that doesn’t mean a good thing can’t be used for bad.”
Proponents of artificial video say fake imagery is an old problem that’s regularly debunked. Consider the doctored photo that emerged in 2004 of then-presidential candidate John Kerry with Jane Fonda at an anti-Vietnam War rally. Even an 1860 portrait of Abraham Lincoln turned out to be manipulated. The president’s body was replaced with that of a more heroic-looking John Calhoun.
Stopping technology like computer-generated video from advancing is highly unlikely, experts say.
That means the onus is on those who read the news and those who report it to verify footage as best they can. Students also need to be taught from a young age how to wade through news sources critically, said Nonny de la Peña, an early practitioner of immersive journalism, which often leans on virtual reality.
“To shy away from technology because of fears it can be dangerous is a huge mistake,” she said. “Technology is scary. You’re going to have negative consequences. But the positive potential far outweighs the bad.”
Computer-generated avatars could bolster communication by bringing the subtleties of body language into digital conversation, said Pinscreen’s Li.
“It’s not our purpose to create a technology that people can use for evil,” said Li, who also teaches and conducts research at USC.
Pinscreen’s photo-realistic avatar technology isn’t publicly available yet. The company, which operates out of a Wilshire Boulevard high-rise, is primarily focused on an app that turns ordinary selfies into animated 3-D avatars.
Li, 37, had a hand in developing the technology Apple used to make animojis. The cartoon creature avatars use augmented reality sensors in the iPhone X’s camera to move in tandem with a user’s face.
Li said he’s received overtures from large tech companies about acquiring Pinscreen, but turned them down. He envisions building his own social media app where users can communicate with their playful avatars in computer-generated backdrops.
“The main difference between what we do and Instagram and Snapchat or Facebook is they basically track your face and add things to it,” Li said of the apps’ augmented reality filters. “Our aim is to build an entire CG world.”
To demonstrate its technology, Pinscreen turned a photo of me into a video. Armed with data about human movement, the company used its deep neural network to plant my head onto its estimate of what a living, breathing 3-D image of me would look like.
The process took minutes, at about one frame per second. But Li thinks that by the end of the year he’ll be able to render lifelike avatars in real time, complete with convincingly realistic hair.
Li, who is German and the son of Taiwanese parents, said the key to preventing a breakdown of trust in video is to build awareness of the capabilities of computer-generated video.
“If you watch ‘Jurassic Park,’ there are no dinosaurs, but they gave you that experience,” he said.
“Of course you can hire any digital effects team to make Donald Trump or Kim Jong Un look like they’re starting a new war if you wanted to,” Li continued. “What’s different now is it becomes very easy to do these things and it can get into the hands of anyone. The important thing is to educate people. People will get used to it just like Photoshop.”