235 Million Instagram, TikTok And YouTube User Profiles Exposed In Massive Data Leak
Data – The security research team at Comparitech today disclosed how an unsecured database left almost 235 million Instagram, TikTok and YouTube user profiles exposed online in what can only be described as a massive data leak.
Recently there has been a spate of reports concerning account data appearing on dark web cybercrime forums. From the dark web audit suggesting there are currently 15 billion stolen logins from 100,000 breaches out there, to the hacker giving away 386 million stolen records for free. Not all of this data will have been hacked, at least not in the usual sense of the word: some, as was likely the case in the Utah Gun Exchange incident, will have been exposed by an unsecured database.
The unsecured database problem
Unsecured databases are fast becoming such a huge data protection problem that it’s thought a vigilante security researcher is behind the spate of “Meow” attacks that have overwritten the indexes of thousands of such databases. And it was such an unsecured database that the Comparitech researchers, led by Bob Diachenko, discovered on August 1, leaving the personal profile data of nearly 235 million Instagram, TikTok and YouTube users up for grabs.
The data was spread across several datasets; the most significant being two coming in at just under 100 million each and containing profile records apparently scraped from Instagram. The third-largest was a dataset of some 42 million TikTok users, followed by just under 4 million YouTube user profiles.
Comparitech says that, based on the samples it collected, one in five records contained either a telephone number or email address. Every record also included at least some, sometimes all, the following information:
- Profile name
- Full real name
- Profile photo
- Account description
Statistics about follower engagement, including:
- Number of followers
- Engagement rate
- Follower growth rate
- Audience gender
- Audience age
- Audience location
- Last post timestamp
“The information would probably be most valuable to spammers and cybercriminals running phishing campaigns,” Paul Bischoff, Comparitech editor, says. “Even though the data is publicly accessible, the fact that it was leaked in aggregate as a well-structured database makes it much more valuable than each profile would be in isolation,” Bischoff adds. Indeed, Bischoff told me that it would be easy for a bot to use the database to post targeted spam comments on any Instagram profile matching criteria such as gender, age or number of followers.
Tracing the source of the leaked data
So, where did all this data originate? The researchers suggest that the evidence, including dataset names, pointed to a company called Deep Social. However, Deep Social was banned by both Facebook and Instagram in 2018 after scraping user profile data. The company was wound down sometime after this.
A Facebook company spokesperson told me that “scraping people’s information from Instagram is a clear violation of our policies. We revoked Deep Social’s access to our platform in June 2018 and sent a legal notice prohibiting any further data collection.”
Once the researchers found the database and the clues to its origin, “we sent an alert to Deep Social, assuming the data belonged to them,” Bischoff says. The administrators of Deep Social then forwarded the disclosure to a Hong Kong-registered social media influencer data-marketing company called Social Data. “Social Data shut down the database about three hours after our initial email,” Bischoff says.
This article originally appeared on forbes.com To read the full article and see the images, click here.
Nastel Technologies helps companies achieve flawless delivery of digital services powered by middleware. Nastel delivers Middleware Management, Monitoring, Tracking and Analytics to detect anomalies, accelerate decisions, and enable customers to constantly innovate. To answer business-centric questions and provide actionable guidance for decision-makers, Nastel’s Navigator X fuses:
- Advanced predictive anomaly detection, Bayesian Classification and other machine learning algorithms
- Raw information handling and analytics speed
- End-to-end business transaction tracking that spans technologies, tiers, and organizations
- Intuitive, easy-to-use data visualizations and dashboards