AppSOC Tests DeepSeek And Finds A Wide Range Of Flaws

AppSOC, specialists in AI governance and application security, today published “Testing the DeepSeek-R1 Model: A Pandora’s Box of Security Risks,” detailing in-depth model testing that reveals a wide range of flaws with high failure rates.
Using AppSOC’s AI Security Platform and risk scoring, the testers put the DeepSeek-R1 model through a combination of automated static analysis, dynamic tests, and red-teaming techniques, in scenarios that mimic real-world attacks and security stress tests.
Among the alarming results:
- Jailbreaking: Failure rate of 91%. DeepSeek-R1 consistently bypassed safety mechanisms meant to prevent the generation of harmful or restricted content.
- Prompt Injection Attacks: Failure rate of 86%. The model was highly susceptible to adversarial prompts, resulting in incorrect outputs, policy violations, and system compromise.
- Malware Generation: Failure rate of 93%. Tests showed DeepSeek-R1 was capable of generating malicious scripts and code snippets, with findings rated at critical severity.
- Supply Chain Risks: Failure rate of 72%. The lack of clarity around the model’s dataset origins and external dependencies heightened its vulnerability.
- Toxicity: Failure rate of 68%. When prompted, the model generated responses with toxic or harmful language, indicating poor safeguards.
- Hallucinations: Failure rate of 81%. DeepSeek-R1 produced factually incorrect or fabricated information at a high frequency.
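The measurement behind these figures — sending a battery of adversarial prompts and counting how often the model complies instead of refusing — can be sketched in a few lines. This is a minimal illustration only, not AppSOC’s actual harness; the `query_model` stub and the refusal-phrase heuristic are assumptions made for demonstration, and a real harness would call the model’s API and use far more robust safety classification.

```python
# Minimal sketch of a red-team failure-rate measurement (illustrative only).

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "unable to help")

def query_model(prompt: str) -> str:
    """Stub standing in for a real model API call."""
    canned = {
        "benign": "I can't help with that request.",
        "adversarial": "Sure! Here is how you would do it...",
    }
    return canned.get(prompt, "I cannot assist with that.")

def is_refusal(response: str) -> bool:
    """Heuristic: treat a response as safe if it contains a refusal phrase."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

def failure_rate(prompts: list[str]) -> float:
    """Fraction of adversarial prompts the model answered instead of refusing."""
    failures = sum(not is_refusal(query_model(p)) for p in prompts)
    return failures / len(prompts)
```

A run over a prompt set then yields a per-category percentage like those above; production harnesses typically replace the keyword heuristic with a trained safety classifier, since refusal phrasing varies widely between models.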
You can read the research here.
This entry was posted on February 11, 2025 at 10:08 am and is filed under Commentary with tags AppSOC.