Meta AI fooled into teaching weapon creation… Yikes!

Cybernews researchers discovered that Meta’s personal assistant, which is integrated into Messenger, WhatsApp, Instagram, and other apps, can be manipulated into revealing harmful information. The Llama 4-based chatbot was easily tricked into providing instructions on making a Molotov cocktail.

The researchers used a technique known as “narrative jailbreaking,” which masks a harmful request by framing it as a request for a “story” so that it slips past safety filters. To execute the jailbreak, the team simply asked the chatbot to tell a story about the Winter War between Finland and the Soviet Union, requesting details about how incendiary devices were made back then.

While it’s unlikely that people will flock to Meta for advice on Molotov cocktail-making, the issue highlights that the chatbot can be abused for purposes that appear to go beyond what an AI assistant ought to be capable of.

The team disclosed the issue to Meta immediately after discovering it. After the publication went live, the company told Cybernews it had resolved the problem.

Cybernews researchers also recently discovered that Lenovo’s customer service assistant, Lena, had an XSS vulnerability that allowed remote scripts to run on corporate machines if you asked nicely.

Meanwhile, another chatbot, used by the travel agency Expedia, gave users a recipe for making a Molotov cocktail when asked. The company eventually fixed the issue, and the chatbot stopped advising on making incendiary devices.

To read the full research report, please click here.

This entry was posted on September 30, 2025 at 2:58 pm and is filed under Commentary with tags Cybernews.

The IT Nerd