New Research from Unit 42 Reveals DeepSeek is Vulnerable to Jailbreaking
Palo Alto Networks’ threat intelligence team, Unit 42, released research revealing that DeepSeek is concerningly vulnerable to jailbreaking and can produce nefarious content with little to no specialized knowledge or expertise.
The new research exposes the security risks of employees using unauthorized third-party LLMs and stresses the need to address these vulnerabilities when integrating open source LLMs into business processes.
The research reveals:
- High bypass/jailbreak rates, highlighting the potential risks of emerging attack vectors that can be used by malicious actors
- Jailbreak methods can elicit explicit guidance for malicious activities and could greatly accelerate their operations
- Malicious activities include creating keyloggers—software or hardware designed to record keystrokes on a computer or device—as well as stealing and exfiltrating data, demonstrating the security risks to businesses.
In addition to the research, the team shared commentary from Sam Rubin, SVP of Consulting and Threat Intelligence of Unit 42, discussing the findings.
Unit 42’s DeepSeek jailbreaking research shows that we can’t always trust that LLMs will work as they intend — they are able to be manipulated. It’s important that companies consider these vulnerabilities when building open source LLMs into business processes. We have to assume that LLM guardrails can be broken and safeguards need to be built in at the organizational level.
And, as organizations look to leverage these models, we have to assume threat actors are doing the same—with the goal of accelerating the speed, scale, and sophistication of cyberattacks. We’ve seen evidence that nation state threat actors are leveraging OpenAI and Gemini to launch attacks, improve phishing lures, and write malware. We expect attacker capabilities will get more advanced as they refine their use of AI and LLMs and even begin to build AI attack agents.
You can read the research here.
January 31, 2025 at 1:17 pm
[…] off the heels of this report about a jailbreak related to DeepSeek, Wallarm published a new analysis revealing that its security researchers have discovered a novel […]