Hundreds of millions of Facebook user records exposed on Amazon cloud servers

Facebook Chief Executive Mark Zuckerberg delivers a speech at a developers conference in 2018.
(Justin Sullivan / Getty Images)

Facebook Inc. user data is still showing up in places it shouldn’t.

Researchers at UpGuard, a cybersecurity firm, found troves of user information hiding in plain sight, inadvertently posted publicly on Amazon.com Inc.’s cloud computing servers. The discovery shows that a year after the Cambridge Analytica scandal exposed how insecure and widely disseminated Facebook users’ information is online, companies that control that information at every step still haven’t done enough to seal up private data.

In one instance, Mexico City-based digital platform Cultura Colectiva openly stored 540 million records on Facebook users, including identification numbers, comments, reactions and account names. The records were accessible and downloadable for anyone who could find them online. That database was closed Wednesday after Bloomberg alerted Facebook to the problem and Facebook contacted Amazon.

Facebook shares fell on the news. They closed down 0.4% at $173.54.

Another database, for a long-defunct app called At the Pool, listed names, passwords and email addresses for 22,000 people. UpGuard doesn’t know how long the records were exposed, as the database became inaccessible while the company was looking into it.


Facebook shared this kind of information freely with third-party developers for years before cracking down more recently. The problem of accidental public storage could be more extensive than those two instances. UpGuard found 100,000 open Amazon-hosted databases for various types of data, some of which it expects aren’t supposed to be public.

“The public doesn’t realize yet that these high-level systems administrators and developers, the people that are custodians of this data, they are being either risky or lazy or cutting corners,” said Chris Vickery, director of cyber risk research at UpGuard. “Not enough care is being put into the security side of big data.”

Cultura Colectiva is a digital platform that posts stories about celebrities and culture and largely targets a Latin American audience. The company’s website says it creates content through data and technology and has more than 45 million followers on Facebook, Instagram, Twitter, YouTube and Pinterest.

For many years, Facebook allowed anyone making an app on its site to obtain information on the people using the app, as well as on those users’ friends. Once the data is out of Facebook’s hands, the developers can do whatever they want with it.

About a year ago, Facebook Chief Executive Mark Zuckerberg was preparing to testify to Congress about a particularly egregious example: a developer who handed over data on tens of millions of people to Cambridge Analytica, a political consulting firm that helped Donald Trump on his presidential campaign. That one instance has led to government investigations around the world and threats of further regulation for the company.

Last year, Facebook started an audit of thousands of apps and suspended hundreds until it could make sure they weren’t mishandling user data. Facebook now offers rewards for researchers who find problems with its third-party apps.


A Facebook spokesperson said the company’s policies prohibit storing Facebook information in a public database. Once the company was alerted to the issue, it worked with Amazon to take down the databases, the spokesperson said, adding that Facebook is committed to working with the developers on its platform to protect people’s data.

In the Cultura Colectiva dataset, which totaled 146 gigabytes, researchers had difficulty determining how many unique Facebook users were affected. UpGuard also had trouble working to get the database closed. The firm sent emails to Cultura Colectiva and Amazon over many months to alert them to the problem. It wasn’t until Facebook contacted Amazon that the leak was addressed. Cultura Colectiva didn’t respond to a request for comment.

This latest example shows how the data security issues can be amplified by another trend: the transition many companies have made from running operations predominantly in their own data centers to cloud-computing services operated by Amazon, Microsoft Corp., Alphabet Inc.’s Google and others.

Those tech giants have built multibillion-dollar businesses by making it easy for companies to run applications and store troves of data — including corporate documents and employee information — on remote servers.

Programs such as Amazon Web Services’ Simple Storage Service, essentially an internet-accessed hard drive, offer clients the choice of making the data visible to just the person who did the upload, other members of their company, or anyone online.

Sometimes that information is designed to be public-facing, as in the case of a cache of photos or other images stored for use on a corporate website.


Other times it isn’t. In recent years, information stored on several cloud services — U.S. military data, personal information of newspaper subscribers and cellphone users — has been inadvertently shared publicly online and discovered by security researchers.

In the last two years, Amazon has beefed up protocols to keep customers from exposing sensitive materials: It has added prominent warning notices, made tools for administrators to more simply turn off all public-facing items, and offered for free what was formerly a paid add-on to check a customer’s account for exposed data.

“Originally I would have put a lot of this on AWS,” said Corey Quinn, referring to Amazon Web Services. Quinn advises businesses that use Amazon’s cloud at the consulting firm Duckbill Group. But since Amazon has taken steps to address the issue, companies such as Cultura Colectiva should be aware, he said. “With all of this in the news, and all of this continuing to come out, if you’re still opening AWS buckets [to the public], you’re not paying attention.”

Amazon isn’t the only company that periodically gets caught up in cases of private records mistakenly made public. But the Seattle company has a wide lead in the business of selling rented data storage and computing power, putting a spotlight on its practices. An Amazon Web Services spokesman declined to comment.