Meta is stepping up its support for the open-source AI community. This week, the company launched a set of Llama AI protection tools designed to keep applications safe from misuse and cyber threats.
Leading the rollout is Llama Guard 4, a tool that helps detect unsafe inputs across both text and image data. It’s built to work across different AI tasks and is now available through a preview version of Meta’s new Llama API.
Meta also introduced LlamaFirewall. This tool watches over how prompts interact with AI models. It blocks harmful injections, insecure code snippets, and unsafe plugin behaviors. It also connects with existing Meta tools to make sure all bases are covered.
Another upgraded tool is Llama Prompt Guard 2. This version spots jailbreaks and prompt hacks faster than before. Meta also released a lighter version, called Prompt Guard 2 22M, which is faster and easier to run on smaller systems.
Beyond developer tools, Meta is launching features for cybersecurity teams too. Two new tools, CyberSOC Eval and AutoPatchBench, are now part of the CyberSec Eval 4 benchmark suite. CyberSOC Eval checks how well AI tools perform in security operation centers. AutoPatchBench tests how quickly AI can fix bugs in software code.
To go even further, Meta unveiled the Llama Defenders Program. This offers early access to advanced security tools. One example is a document scanner that labels sensitive content automatically. Others include tools that detect AI-generated audio scams or identify fake voice clips with watermark tech.
Meta is also testing a privacy-focused feature for WhatsApp. It’s called Private Processing. It lets users clean up or summarize messages using AI—without giving Meta or WhatsApp access to the messages themselves. Meta says it’s working with researchers to make sure this feature is safe and private before a full launch.
With this latest release, Meta signals a serious focus on AI safety. The new Llama AI protection tools aim to help developers, researchers, and companies build with confidence—knowing their models have solid safeguards in place.