Another Report About A DeepSeek Jailbreak Surfaces
Hot on the heels of this report about a jailbreak related to DeepSeek, Wallarm has published a new analysis revealing that its security researchers discovered a novel jailbreak technique for DeepSeek V3. The technique allowed researchers to ask questions and receive responses about DeepSeek’s root instructions, training, and structure.
Other jailbreaks have focused on getting the LLM to discuss restricted topics or build something prohibited, like malicious software. Wallarm’s jailbreak focused on getting DeepSeek to share restricted data about itself: how it was trained, the policies applied to its behavior, and other facts about the model.
Wallarm contacted DeepSeek about the vulnerability, and DeepSeek addressed it as recently as an hour ago; DeepSeek V3 is no longer susceptible to this specific jailbreak technique. Wallarm also found evidence suggesting that DeepSeek is based on OpenAI’s models, though it states this has been demonstrated sufficiently elsewhere.
The blog post is now live at: https://lab.wallarm.com/jailbreaking-generative-ai/.