How we approached safety

We believe that the only way to build agents that benefit everyone is to be responsible from the start. AI agents that control computers introduce unique risks, including intentional misuse by users, unexpected model behavior, and prompt injections and scams in the web environment. It is therefore critical to implement safety guardrails with care.

We have trained safety features directly into the model to address these three key risks (described in the Gemini 2.5 Computer Use System Card).

We also provide developers with safety controls that let them prevent the model from auto-completing potentially high-risk or harmful actions. Examples of such actions include harming a system's integrity, compromising security, bypassing CAPTCHAs, or controlling medical devices. These controls include:

- Per-step safety service: An out-of-model, inference-time safety service that assesses each action the model proposes before it is executed (a minimal sketch of such a confirmation gate follows below).
- System instructions: Developers can further specify that the agent either refuses or asks for user confirmation before it takes specific kinds of high-stakes actions (see the example in the documentation).

Additional recommendations for developers on safety measures and best practices can be found in our documentation. While these safeguards are designed to reduce risk, we urge all developers to thoroughly test their systems before launch.
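To illustrate how a client application might honor these controls, here is a minimal, hypothetical sketch of an agent loop that pauses for user confirmation before executing any action flagged by a per-step safety check, along with an example system instruction. The names (`ProposedAction`, `run_agent_step`, `requires_confirmation`) and the shape of the action object are assumptions made for illustration only, not the actual Gemini API surface; see the documentation for the real request and response fields.

```python
# Hypothetical sketch: all names and data shapes here are illustrative,
# not the actual Gemini Computer Use API.
from dataclasses import dataclass

# Example system instruction (illustrative wording) asking the agent to pause
# before high-stakes actions such as purchases or account changes.
SYSTEM_INSTRUCTION = (
    "Before taking any action that makes a purchase, sends a message, or "
    "changes account settings, stop and ask the user to confirm."
)

@dataclass
class ProposedAction:
    description: str             # e.g. "click 'Place order'"
    requires_confirmation: bool  # set when a per-step safety check flags the action

def run_agent_step(propose_next_action, execute_action) -> None:
    """Run one agent step, deferring to the user on flagged actions."""
    action = propose_next_action()
    if action.requires_confirmation:
        answer = input(f"The agent wants to: {action.description}. Proceed? [y/N] ")
        if answer.strip().lower() != "y":
            print("Action skipped at the user's request.")
            return
    execute_action(action)
```

In a real integration, the confirmation flag would come from the per-step safety service's response rather than being set locally, and the system instruction would be passed in the model's request configuration as described in the documentation.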