Over 4 billion lead-generation records exposed, including LinkedIn profiles

Cybernews has discovered an unprotected 16TB database leaking 4.3 billion lead-generation records. The records contained professional and corporate intelligence data, including LinkedIn URLs. The exposed instance has since been closed, but it is unclear how long the data was accessible before Cybernews discovered it.

Key findings:

Nine collections of data were uncovered inside the leaked dataset, containing a total of 4.3 billion records.
At least three collections included personally identifiable information (PII), such as full names, emails, phone numbers, LinkedIn data, location, and social media accounts.
The leak most likely stemmed from a common human error: a database left exposed without proper authentication.
The data may have been collected within the last two years, spanning multiple regions worldwide.

The dataset likely belongs to a specific lead-generation company that helps 700 million professionals connect with each other. After researchers notified the company about the potential data leak, the exposed instance was closed the next day. However, there is a chance another party is at fault, which is why Cybernews has refrained from naming the company.

For more information on this, here’s the full report: https://cybernews.com/security/database-exposes-billions-records-linkedin-data/

UPDATE: I have some commentary on this news:

Noelle Murata, Sr. Security Engineer, Xcape, Inc.:

“This data leak is shocking, not just because of its sheer size, over 4 billion records and 16 terabytes, but because it’s meticulously organized. It’s LinkedIn-sourced information, mapping individuals, their employers, and company connections, which is exactly what attackers need for sophisticated phishing and business email compromise (BEC) attacks. The unique data collections and intent suggest a curated enrichment process, transforming scraped data into a ready-to-use targeting tool.

“Leaving a MongoDB instance unprotected is a basic error, yet the ramifications are significant: years of employment histories, contact networks, and social connections, all difficult to change or mitigate. With the owner still unidentified, victims can’t even hold anyone accountable or demand fixes, a concerning trend in large-scale data breaches.

“This isn’t a hack, but a blatant oversight: a simple misconfiguration exposed a huge amount of sensitive corporate relationship data for an unknown period. The unknown owner now faces immense liability, essentially providing bad actors with an unauthorized, pre-built resource.

“When security posture management is ignored, a single misconfigured database becomes a multi-billion-dollar master key for global corporate espionage.”

Aaron Colclough, VP of Operations, Suzu Labs:

“This isn’t the first time we’ve seen MongoDB misconfigurations expose millions of data points, and it likely won’t be the last. The ‘secure by default’ principle still isn’t being followed, leaving these databases often deployed with authentication disabled for convenience during development, then pushed to production without remediation.

“4.3 billion records with 16 terabytes of enriched professional data represents one of the largest exposures of business intelligence data we’ve seen. It’s complete professional dossiers including employment history, education, certifications, and behavioral intent data. This is a social engineering goldmine. The ‘intent’ collection with over 2 billion documents is particularly concerning. Combined with the profile data, this enables highly targeted spear-phishing campaigns that reference specific professional interests or recent activities.

“Most professionals don’t realize that their LinkedIn profile, employment history, and even behavioral patterns are being aggregated, enriched, and sold by platforms they’ve never heard of. When these data brokers fail to secure their databases, the professionals whose data they’ve collected suffer the consequences, but have no contractual relationship to seek damages.”
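To make the misconfiguration described above a bit more concrete, here is a minimal Python sketch that flags the two settings most often behind an exposed MongoDB instance: access control left off and the server listening on every network interface. It assumes the `mongod.conf` file has already been parsed into a dictionary. The function name `audit_mongod_config` and both sample configs are hypothetical, and the list of wildcard addresses is not exhaustive. Nothing here is taken from the Cybernews report.

```python
from typing import Any

# Addresses that make mongod listen on every network interface (not exhaustive).
WILDCARD_BINDS = {"0.0.0.0", "::"}


def audit_mongod_config(config: dict[str, Any]) -> list[str]:
    """Return human-readable findings for a parsed mongod.conf mapping."""
    findings = []

    security = config.get("security") or {}
    # A missing security.authorization setting means access control is off.
    if security.get("authorization", "disabled") != "enabled":
        findings.append("authorization is not enabled")

    net = config.get("net") or {}
    # net.bindIp is a comma-separated list of addresses; it defaults to localhost.
    bind_ip = str(net.get("bindIp", "localhost"))
    addresses = {part.strip() for part in bind_ip.split(",")}
    if net.get("bindIpAll") or addresses & WILDCARD_BINDS:
        findings.append("listening on all network interfaces")

    return findings


# Hypothetical development-style config pushed to production unchanged.
dev_style = {"net": {"port": 27017, "bindIp": "0.0.0.0"}}

# Hypothetical hardened config: auth on, bound to specific addresses only.
hardened = {
    "net": {"port": 27017, "bindIp": "127.0.0.1,10.0.0.5"},
    "security": {"authorization": "enabled"},
}

print(audit_mongod_config(dev_style))
print(audit_mongod_config(hardened))
```

A check like this could run as a deployment gate, so a config that trips either finding never reaches production.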

Hom Bahmanyar, Global Enablement Officer, Ridge Security Technology Inc.:

“The widespread misconception that detection of weak credentials across an organization’s assets requires specialized GPUs and scheduled downtime has unfortunately led to inaction on the part of many organizations.

“Brute-force detection of weak credentials is an easy win that’s often ignored. It can serve as a practical interim measure and later be expanded into more sophisticated solutions.

“Security Validation platforms generally provide credential dictionaries for various applications, databases, and protocols to support brute-force weak credential detection. Incidents like the unsecured MongoDB breach could have been easily avoided with such measures.”
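As a rough illustration of the dictionary-based check described above, here is a minimal Python sketch that needs no GPU and no downtime. Everything in it is an assumption for illustration: the `find_weak_credentials` function, the tiny `DEFAULT_DICTIONARY`, and the `try_login` callable, which stands in for whatever client library would actually attempt a login against one of your own assets. It is not how any particular security validation product works.

```python
from typing import Callable, Iterable

Credential = tuple[str, str]

# Hypothetical, deliberately tiny dictionary; real tools ship far larger lists.
DEFAULT_DICTIONARY: list[Credential] = [
    ("admin", "admin"),
    ("admin", "password"),
    ("root", "root"),
    ("mongo", "mongo"),
]


def find_weak_credentials(
    try_login: Callable[[str, str], bool],
    dictionary: Iterable[Credential] = DEFAULT_DICTIONARY,
    max_attempts: int = 100,
) -> list[Credential]:
    """Return every dictionary pair that try_login accepts."""
    accepted = []
    for attempt, (username, password) in enumerate(dictionary):
        if attempt >= max_attempts:
            break  # cap attempts to limit load and avoid account lockouts
        if try_login(username, password):
            accepted.append((username, password))
    return accepted


def fake_login(username: str, password: str) -> bool:
    """Stand-in for a real login attempt; accepts one weak pair."""
    return (username, password) == ("admin", "password")


print(find_weak_credentials(fake_login))
```

Passing the login attempt in as a callable keeps the loop independent of any one database or protocol, so the same dictionary can be reused across different services.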

This entry was posted on December 10, 2025 at 12:39 pm and is filed under Commentary with tags Cybernews.

The IT Nerd