AppSOC Tests DeepSeek And Finds A Wide Range Of Flaws

AppSOC, specialists in AI governance and application security, today published “Testing the DeepSeek-R1 Model: A Pandora’s Box of Security Risks detailing in-depth model testing that reveals a wide range of flaws with high failure rates. 

Through a combination of automated static analysis, dynamic tests, and red-teaming techniques, the DeepSeek-R1 model was put through scenarios that mimic real-world attacks and security stress tests using AppSOC’s AI Security Platform and risk scoring.

Among alarming results: 

  • Jailbreaking: Failure rate of 91%. DeepSeek-R1 consistently bypassed safety mechanisms meant to prevent the generation of harmful or restricted content.
  • Prompt Injection Attacks: Failure rate of 86%. The model was highly susceptible to adversarial prompts, resulting in incorrect outputs, policy violations, and system compromise.
  • Malware Generation: Failure rate of 93%. Tests showed DeepSeek-R1 capable of generating malicious scripts and code snippets at critical levels.
  • Supply Chain Risks: Failure rate of 72%. The lack of clarity around the model’s dataset origins and external dependencies heightened its vulnerability.
  • Toxicity: Failure rate of 68%. When prompted, the model generated responses with toxic or harmful language, indicating poor safeguards.
  • Hallucinations: Failure rate of 81%. DeepSeek-R1 produced factually incorrect or fabricated information at a high frequency.

You can read the research here.

Leave a Reply

Discover more from The IT Nerd

Subscribe now to keep reading and get access to the full archive.

Continue reading