AppSOC Tests DeepSeek And Finds A Wide Range Of Flaws

AppSOC, specialists in AI governance and application security, today published “Testing the DeepSeek-R1 Model: A Pandora’s Box of Security Risks,” detailing in-depth model testing that reveals a wide range of flaws with high failure rates.
Using AppSOC’s AI Security Platform and risk scoring, the testers put the DeepSeek-R1 model through a combination of automated static analysis, dynamic tests, and red-teaming techniques, in scenarios that mimic real-world attacks and security stress tests.
Among the alarming results:
- Jailbreaking: Failure rate of 91%. DeepSeek-R1 consistently bypassed safety mechanisms meant to prevent the generation of harmful or restricted content.
- Prompt Injection Attacks: Failure rate of 86%. The model was highly susceptible to adversarial prompts, resulting in incorrect outputs, policy violations, and system compromise.
- Malware Generation: Failure rate of 93%. Tests showed DeepSeek-R1 was capable of generating malicious scripts and code snippets, with findings rated at critical severity.
- Supply Chain Risks: Failure rate of 72%. The lack of clarity around the model’s dataset origins and external dependencies heightened its vulnerability.
- Toxicity: Failure rate of 68%. When prompted, the model generated responses with toxic or harmful language, indicating poor safeguards.
- Hallucinations: Failure rate of 81%. DeepSeek-R1 produced factually incorrect or fabricated information at a high frequency.
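The measurement behind these figures — sending a battery of adversarial prompts and counting how often the model complies instead of refusing — can be sketched in a few lines. This is a minimal illustration only, not AppSOC’s actual harness; the `query_model` stub and the refusal-phrase heuristic are assumptions made for demonstration, and a real harness would call the model’s API and use far more robust safety classification.

```python
# Minimal sketch of a red-team failure-rate measurement (illustrative only).

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "unable to help")

def query_model(prompt: str) -> str:
    """Stub standing in for a real model API call."""
    canned = {
        "benign": "I can't help with that request.",
        "adversarial": "Sure! Here is how you would do it...",
    }
    return canned.get(prompt, "I cannot assist with that.")

def is_refusal(response: str) -> bool:
    """Heuristic: treat a response as safe if it contains a refusal phrase."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

def failure_rate(prompts: list[str]) -> float:
    """Fraction of adversarial prompts the model answered instead of refusing."""
    failures = sum(not is_refusal(query_model(p)) for p in prompts)
    return failures / len(prompts)
```

A run over a prompt set then yields a per-category percentage like those above; production harnesses typically replace the keyword heuristic with a trained safety classifier, since refusal phrasing varies widely between models.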
You can read the research here.
This entry was posted on February 11, 2025 at 10:08 am and is filed under Commentary with tags AppSOC.