
Co-Author: Marina Kaganovich

The rapid evolution from generative artificial intelligence, commonly known as genAI, to agentic AI is a stark reminder for organizations to prioritize “going back to basics” with security fundamentals, including data governance, cyber hygiene, IAM provisioning, and a risk-informed culture. While most of these concepts are not new, agentic AI introduces a hyper-autonomous twist!

 

The trend to at least experiment with AI agents is indisputable, even as adoption varies (example and one more). Successful organizations treat this shift as a forcing function to reassess their approach, taking the opportunity to revamp security and information governance processes to enable the transformation that agentic AI makes possible.

 

The core issue isn't a lack of AI-specific solutions, but the challenge of governing and deploying these tools with the same rigor applied to human accounts, cloud infrastructure, and enterprise software. This is why leaders in security, compliance, risk, and data governance are all trying to figure out how to empower their teams while staying secure and compliant. 

 

We previously wrote about genAI governance best practices, and as AI development has evolved from genAI chatbots to AI agents, we'd like to update our thoughts on this topic, highlighting some key areas of distinction along the way. The same thinking applies to IAM principles: least-privilege concepts from the 1980s still inform best practices for AI tool deployment in the 2020s. Agentic AI may amplify risks due to its autonomous nature and its interconnectedness with systems and other agents, potentially creating a greater attack surface. Therefore, you should consider the following steps for planning, implementing, and updating governance for AI agents in the enterprise, drawing on core security principles and programmatic best practices.

 

Define the AI agent’s sphere of influence: Before an agent does anything, rigorously define its operational scope using Agent Permissions controls, outlining WHAT it can do (e.g., which APIs it can call, which systems it can touch, and which data it can modify) and WHERE it can do it (e.g., dev, test, or prod environments). This requires extending the concept of least privilege by dynamically aligning the agent’s permissions with its specific purpose and current user intent. Alignment measures should help ensure agents behave consistently with the principal’s values, intentions, and interests.
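To make this concrete, here is a minimal sketch in Python of what declaring an agent's sphere of influence as data might look like. The names (AgentScope, is_action_allowed) and the deny-by-default check are illustrative assumptions, not a specific product API; in practice the same intent might be expressed through IAM policies or a policy engine.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, illustrative representation of an agent's sphere of influence.
@dataclass(frozen=True)
class AgentScope:
    agent_id: str
    allowed_apis: frozenset          # WHAT: APIs the agent may call
    allowed_environments: frozenset  # WHERE: dev / test / prod
    writable_datasets: frozenset     # data the agent may modify

def is_action_allowed(scope: AgentScope, api: str, environment: str,
                      dataset: Optional[str] = None) -> bool:
    """Deny by default: every action must fall inside the declared scope."""
    if api not in scope.allowed_apis:
        return False
    if environment not in scope.allowed_environments:
        return False
    if dataset is not None and dataset not in scope.writable_datasets:
        return False
    return True

# Example: a ticket-triage agent confined to the dev environment.
triage_scope = AgentScope(
    agent_id="ticket-triage-01",
    allowed_apis=frozenset({"tickets.read", "tickets.update"}),
    allowed_environments=frozenset({"dev"}),
    writable_datasets=frozenset({"support_tickets"}),
)

assert is_action_allowed(triage_scope, "tickets.update", "dev", "support_tickets")
assert not is_action_allowed(triage_scope, "tickets.update", "prod", "support_tickets")
```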

 

All agents must have defined, unambiguous goals and measurable success criteria that are tailored to their specific roles to help ensure they operate within your defined ethical and compliance boundaries. Complex, multi-agent orchestration may introduce compounded risks stemming from interactions between probabilistic components. Therefore, design multi-agent systems to help prevent cascading failures by identifying integration points and implementing runtime policy enforcement and mechanisms, such as rollback infrastructure, that automatically halt AI operations across systems when unexpected behavior is detected.
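As a simple illustration of runtime enforcement across a multi-agent workflow, the hedged sketch below shows a circuit breaker that halts further agent actions once anomalies cross a threshold. The CircuitBreaker class and its thresholds are hypothetical; real systems would typically pair this with rollback infrastructure and alerting.

```python
# Illustrative circuit breaker: halts agent operations once anomalies exceed a threshold.
class CircuitBreaker:
    def __init__(self, max_anomalies: int = 3):
        self.max_anomalies = max_anomalies
        self.anomaly_count = 0
        self.open = False  # "open" = halted, no further actions allowed

    def record_anomaly(self, reason: str) -> None:
        self.anomaly_count += 1
        print(f"anomaly recorded: {reason}")
        if self.anomaly_count >= self.max_anomalies:
            self.open = True  # trip the breaker: downstream steps stop

    def allow_action(self) -> bool:
        return not self.open

breaker = CircuitBreaker(max_anomalies=2)

def run_agent_step(step_name: str, looks_anomalous: bool) -> None:
    if not breaker.allow_action():
        print(f"{step_name}: halted, breaker is open; escalate to a human")
        return
    if looks_anomalous:
        breaker.record_anomaly(f"{step_name} produced an unexpected result")
    else:
        print(f"{step_name}: completed normally")

run_agent_step("summarize-ticket", looks_anomalous=False)
run_agent_step("update-crm", looks_anomalous=True)
run_agent_step("send-email", looks_anomalous=True)
run_agent_step("close-ticket", looks_anomalous=False)  # halted by the breaker
```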

 

Manage and regulate component interactions through policy as code. For Google Cloud users, leverage Policy Controller or Config Connector to help enforce these policies as code and govern component interactions deterministically. Agent autonomy may cause unforeseen harm, so agent powers must be limited, and misconfigurations in access provisioning may be exacerbated when both human and AI agent access rights are in play.

 

Establish "Agent ID & Badging" with Clear Attribution: Think of your AI agent as a new employee who learns really fast but has zero inherent loyalty or common sense. You wouldn't give a newbie keys to the entire kingdom, right? Agents need tightly scoped identities. Grant only the permissions absolutely necessary for their tasks and for the shortest required time as determined by the desired outcome. Every action performed by an agent must be unequivocally attributable to that agent instance. No anonymous agents running around and no orphan abandoned agents.

 

Note that this does not mean that an AI agent is treated like a human employee in IAM; an agent identity carries both human user and workload characteristics. In Google Cloud environments, leverage Workload Identity Federation and granular IAM roles to achieve this scoped, non-human identity for your agents.

 

When multiple agents (and humans) interact with systems, you should know who or what did what for accountability and incident response. Think granular IAM controls, but for agentic AI. Misconfigurations in access provisioning continue to be a major pain point, one that will only be amplified when organizations are faced with managing both human and AI agent access rights. Assign unique identifiers (Agent IDs) to agents to help trace all actions, decisions, and outcomes back to the responsible entity (human or agent) for auditability. Identity policies must clearly attribute agent actions via composite identities, linking the agent to the human user directing it, so that resource access is confirmed and attributable back to that human user.
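Here is a hedged sketch of what a composite identity record might look like, linking the agent instance to the human user directing it so every action can be attributed to both. The CompositeIdentity and attribute_action names are illustrative, not a standard schema.

```python
from dataclasses import dataclass, asdict

# Illustrative composite identity: every agent action is linked to the human who directed it.
@dataclass(frozen=True)
class CompositeIdentity:
    agent_id: str          # unique identifier of the agent instance
    on_behalf_of: str      # human principal directing the agent
    session_id: str        # ties the pair to a specific task or conversation

def attribute_action(identity: CompositeIdentity, action: str, resource: str) -> dict:
    """Produce an attribution record suitable for an audit log entry."""
    return {**asdict(identity), "action": action, "resource": resource}

identity = CompositeIdentity(
    agent_id="expense-agent-7f3a",
    on_behalf_of="alice@example.com",
    session_id="sess-2024-10-42",
)
record = attribute_action(identity, "read", "finance/expense-reports")
print(record)
```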

 

Control Resources and Rate Limits: Implementing effective controls requires enforcing strict limitations on agent powers, aligning them dynamically with the agent's purpose and acceptable risk tolerance. These constraints should impose definable and enforceable maximum permission levels that govern access to compute resources, tool usage, and the frequency of external interaction. This is crucial for helping to mitigate two key risks: financial loss and system disruption. Don't let your agent become a denial-of-service attack against your own cloud bill.

 

Preventing Excessive Consumption: Agentic AI systems, particularly those leveraging advanced reasoning models, often require substantially more processing power, which can unexpectedly drive up energy and server costs. To prevent losses from runaway execution (such as an agent entering an infinite loop of costly operations or API calls), comprehensive governance structures should track cost drivers like token consumption and resource utilization. Deployers should establish clear bounds on how often agents act and manage agent spend by tracking message consumption. Consider establishing rate-limiting alerts that require human authorization beyond certain spend thresholds.
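The following sketch illustrates one way to combine a per-minute action bound with a spend threshold that escalates to a human. SpendGuard and its thresholds are hypothetical examples; actual cost tracking would come from your billing and token-usage telemetry.

```python
import time

# Illustrative spend-and-rate guard: bounds how often an agent acts and how much it spends,
# and requires human authorization beyond a spend threshold.
class SpendGuard:
    def __init__(self, max_actions_per_minute: int, auto_approve_usd: float):
        self.max_actions_per_minute = max_actions_per_minute
        self.auto_approve_usd = auto_approve_usd
        self.actions_this_window = 0
        self.window_start = time.monotonic()
        self.total_spend_usd = 0.0

    def check(self, estimated_cost_usd: float) -> str:
        # Reset the rate-limit window once a minute has elapsed.
        if time.monotonic() - self.window_start >= 60:
            self.window_start = time.monotonic()
            self.actions_this_window = 0
        if self.actions_this_window >= self.max_actions_per_minute:
            return "deny: rate limit reached, possible runaway loop"
        if self.total_spend_usd + estimated_cost_usd > self.auto_approve_usd:
            return "hold: spend threshold exceeded, human authorization required"
        self.actions_this_window += 1
        self.total_spend_usd += estimated_cost_usd
        return "allow"

guard = SpendGuard(max_actions_per_minute=30, auto_approve_usd=50.0)
print(guard.check(estimated_cost_usd=1.20))   # allow
print(guard.check(estimated_cost_usd=200.0))  # hold: needs human authorization
```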

 

Helping to Mitigate Security and System Risks: To help prevent compromised or misbehaving agents from consuming more than their fair share of cloud resources, use deterministic runtime policy enforcement. This strategy acts as a security chokepoint by monitoring and controlling agent actions before they execute. By limiting action capabilities and providing predictable hard limits, this control restricts the worst-case impact of agent malfunction and helps contain exploitation designed to overwhelm compute and memory resources, minimizing the potential for downstream denial-of-service (DoS) attacks.

 

Secure the Agent Supply Chain, Including Tool and Data Dependencies: Agentic AI systems may present a heightened security risk due to their expanded potential attack surfaces and reliance on integrating external tools and interfaces. Because the orchestration phase can translate rogue plans into real-world impact, all tools, especially dynamically incorporated third-party ones, should undergo compliance vetting and due diligence. System deployers should enforce clear constraints on tool usage and set granular permissions to limit the agent’s actions and manage data sharing.
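As an example of constraining tool usage, the sketch below checks every tool call against a registry of vetted tools with per-tool permissions. The VETTED_TOOLS registry and its fields are illustrative assumptions rather than a particular framework's schema.

```python
# Illustrative tool registry: only vetted tools, each with granular permissions.
VETTED_TOOLS = {
    "calendar.create_event": {"max_risk": "low",    "requires_human_approval": False},
    "crm.update_record":     {"max_risk": "medium", "requires_human_approval": False},
    "payments.issue_refund": {"max_risk": "high",   "requires_human_approval": True},
}

def authorize_tool_call(tool_name: str) -> str:
    entry = VETTED_TOOLS.get(tool_name)
    if entry is None:
        return "deny: tool has not passed compliance vetting"
    if entry["requires_human_approval"]:
        return "hold: route to human approver before execution"
    return "allow"

print(authorize_tool_call("crm.update_record"))      # allow
print(authorize_tool_call("payments.issue_refund"))  # hold
print(authorize_tool_call("shell.run_command"))      # deny (unvetted)
```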

 

These best practices address traditional supply chain vulnerabilities, which are compounded by AI’s heavy reliance on data. Agents process significant volumes of data for memory and reasoning, requiring strong data governance. Governance should require auditable documentation of data lineage and classification, strict access controls, protection against the collection of sensitive PII, and the tracking of dataset version history within an AI registry.
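To show what such registry documentation could look like in practice, here is a minimal, hypothetical DatasetRecord capturing classification, lineage, PII status, and version history; a real AI registry would hold richer metadata and enforce access controls around it.

```python
from dataclasses import dataclass, field

# Illustrative AI registry entry capturing lineage, classification, and version history.
@dataclass
class DatasetRecord:
    name: str
    classification: str      # e.g. "public", "internal", "confidential"
    lineage: list            # upstream sources this dataset was derived from
    contains_pii: bool
    version_history: list = field(default_factory=list)

    def register_version(self, version: str) -> None:
        self.version_history.append(version)

record = DatasetRecord(
    name="support-transcripts-curated",
    classification="confidential",
    lineage=["raw-support-transcripts", "pii-redaction-pipeline"],
    contains_pii=False,
)
record.register_version("v1.2.0")
print(record)
```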

 

Build the Agent Sandbox: Test agents in a robust, isolated environment that accurately mimics production but prevents real-world impact. An AI agent sandbox is a technical control: a secure, isolated environment where AI agents operate with restricted permissions and monitored boundaries. This intervention is key for pre-deployment testing and confines the agent, restricting the worst-case impact of malfunction or exploitation.

 

Sandboxes enable the safe simulation necessary for end-to-end testing in environments mimicking the real world. Crucially, they enforce the principle of limited powers by validating all inputs and outputs before they cross system boundaries. However, the reliability of a sandbox in effectively bounding highly-capable AI agents remains a difficult challenge. Simulating "doing" is much harder than simulating "talking." The sandbox needs to reflect the complexities of real-world interactions. Experiment, but do so more securely.
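Here is a minimal sketch of boundary validation at the sandbox edge, checking inbound inputs and outbound requests before they cross system boundaries. The allowed host and the specific checks are placeholder assumptions; real sandboxes combine network isolation, egress controls, and content inspection.

```python
# Illustrative sandbox boundary: validate everything that crosses into or out of the sandbox.
ALLOWED_OUTBOUND_HOSTS = {"sandbox-api.internal.example"}  # hypothetical test endpoint

def validate_outbound_request(host: str, payload: str) -> bool:
    """Block calls that would leave the isolated environment or leak secrets."""
    if host not in ALLOWED_OUTBOUND_HOSTS:
        return False
    if "BEGIN PRIVATE KEY" in payload:   # simple example of an output check
        return False
    return True

def validate_inbound_input(prompt: str) -> bool:
    """Reject inputs that exceed basic bounds before the agent sees them."""
    return len(prompt) < 10_000

print(validate_outbound_request("sandbox-api.internal.example", "hello"))  # True
print(validate_outbound_request("prod-payments.example.com", "hello"))     # False
print(validate_inbound_input("Summarize this incident report."))           # True
```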

 

Implement the Big Red Button: Robust shutdown and interruption mechanisms serve as a critical backstop against accidental or intentional harm. Implement a clear "off switch" or point of human intervention capable of a graceful shutdown to help ensure operations cease in a controlled, orderly manner rather than with an abrupt termination.

 

To help prevent agents from getting into loops or causing cascading failures (where one error triggers a potentially catastrophic chain reaction), consider employing automated interventions that act as an automatic braking system. Design a fail-safe switch that turns an agent off based on predefined thresholds, leveraging policy engines to provide reliable, deterministic hard limits on actions. A timeout mechanism can also stop agent operation after a specified amount of time or number of actions.
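The sketch below illustrates a simple fail-safe that halts an agent once a time budget or action budget is exhausted. The FailSafe class and its thresholds are illustrative; in production these limits would typically be enforced by the orchestration layer or a policy engine rather than inside the agent loop itself.

```python
import time

# Illustrative fail-safe: stop the agent after a time budget or action budget is exhausted.
class FailSafe:
    def __init__(self, max_seconds: float, max_actions: int):
        self.deadline = time.monotonic() + max_seconds
        self.actions_remaining = max_actions
        self.tripped = False

    def should_halt(self) -> bool:
        if time.monotonic() > self.deadline or self.actions_remaining <= 0:
            self.tripped = True
        return self.tripped

    def record_action(self) -> None:
        self.actions_remaining -= 1

failsafe = FailSafe(max_seconds=300, max_actions=5)

for step in ["plan", "call-tool", "call-tool", "call-tool", "call-tool", "call-tool"]:
    if failsafe.should_halt():
        print(f"fail-safe tripped before '{step}': shutting down gracefully")
        break
    failsafe.record_action()
    print(f"executed: {step}")
```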

 

It is vital that the system deployer designs solutions so agents cannot halt or tamper with the user's attempt to shut them down, explicitly designing the system to prioritize a "shut down gracefully when requested by the user" goal above all others. Define fallback protocols, helping to ensure agents can pre-construct a fallback procedure in case they are interrupted mid-action sequence. Furthermore, rollback infrastructure should be integrated, allowing agent actions to be voided or undone in the event of a malfunction, much as banks void fraudulent transactions.
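Here is a hedged sketch of rollback infrastructure: each action records a compensating "undo" step so the sequence can be voided if the agent is interrupted or malfunctions. ActionLedger and the sample undo callbacks are hypothetical; many real-world actions need domain-specific compensation logic and cannot be undone mechanically.

```python
from typing import Callable

# Illustrative rollback ledger: record a compensating action for every step
# so the sequence can be undone if the agent is interrupted or malfunctions.
class ActionLedger:
    def __init__(self):
        self._compensations = []

    def record(self, description: str, undo: Callable[[], None]) -> None:
        """Register how to undo an action at the moment it is performed."""
        self._compensations.append((description, undo))

    def roll_back(self) -> None:
        """Undo recorded actions in reverse order (most recent first)."""
        while self._compensations:
            description, undo = self._compensations.pop()
            print(f"rolling back: {description}")
            undo()

ledger = ActionLedger()
ledger.record("created draft invoice #123", undo=lambda: print("  deleted draft invoice #123"))
ledger.record("reserved inventory", undo=lambda: print("  released inventory hold"))

# If a malfunction is detected mid-sequence, void the actions taken so far.
ledger.roll_back()
```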

 

Because agents act as proxies inheriting privileges, they require mechanisms for human oversight and intervention. Human-in-the-Loop (HITL) processes should be mandated for high-stakes decisions, or for critical or irreversible actions, requiring explicit human confirmation before proceeding.

 

Clarify Accountability: Accountability is essential since AI agents act as proxies that inherit privileges, necessitating clear human oversight. It is vital that organizations explicitly declare enterprise accountability for agentic behavior to elevate collective and individual responsibility.

 

To ensure accountability is established before deployment, oversight roles must be clearly defined for the model developer, the system deployer, and the user. The goal is that at least one human entity (an individual, corporation, or other legal entity, not the AI system itself) is accountable for any direct harm caused by an agentic AI system that would otherwise go uncompensated.

 

Effective governance relies on establishing ownership for agent decisions and actions. This means adopting a cross-functional governance framework and appointing an AI governance leader with the authority to pause noncompliant projects. To enable this form of oversight, systems should enable transparency and accountability through tools that track responsible individuals and legal signoffs. Furthermore, agent actions must be observable and auditable, supported by assigning unique identifiers for agents and humans to ensure traceability. This systematic approach supports holding actors accountable and is necessary for establishing future liability regimes.

 

Institute Human-in-the-Loop: Instituting effective Human-in-the-Loop (HITL) controls is paramount, as AI agents' ability to take direct actions may increase the potential stakes of failure or exploitation.

 

Organizations should adopt a risk-based approach to determine the necessary human oversight requirements. This means mandating HITL processes, especially when agents operate with greater autonomy or handle critical decisions. For actions deemed critical or irreversible, the system should require explicit human confirmation before the agent proceeds. These high-risk operations include, but are not limited to, deleting large amounts of data, authorizing significant financial transactions, or changing security settings.

 

System deployers should clearly define the stages in the workflow where human review and/or approval is required before execution. This acts as a safety net, helping to ensure human judgment can intervene if an agent's reasoning is unclear, high-stakes, or potentially problematic. Policy engines can be leveraged to deterministically enforce these rules, for example, by requiring user confirmation for purchases over a predefined threshold. Furthermore, agents should be designed to flag ambiguous situations and escalate these instances to human review when needed. Transparency into the agent’s reasoning helps the user properly contextualize the action they are approving. This governance requires continuous human evaluation and feedback to refine agent performance.
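As a small illustration of policy-enforced HITL, the sketch below routes critical actions and purchases above a threshold to a human approver. The action names and threshold are assumptions for the example; a policy engine would typically hold these rules as configuration rather than code.

```python
# Illustrative HITL gate: rules requiring explicit human confirmation
# for actions deemed critical or for spend above a threshold.
CRITICAL_ACTIONS = {"delete_dataset", "change_security_settings"}
PURCHASE_APPROVAL_THRESHOLD_USD = 500.0

def requires_human_confirmation(action: str, amount_usd: float = 0.0) -> bool:
    if action in CRITICAL_ACTIONS:
        return True
    if action == "authorize_payment" and amount_usd > PURCHASE_APPROVAL_THRESHOLD_USD:
        return True
    return False

def execute(action: str, amount_usd: float = 0.0) -> None:
    if requires_human_confirmation(action, amount_usd):
        print(f"'{action}' escalated for human review (agent reasoning attached)")
        return
    print(f"'{action}' executed autonomously")

execute("summarize_report")
execute("authorize_payment", amount_usd=120.0)
execute("authorize_payment", amount_usd=2500.0)  # escalated to a human
execute("change_security_settings")              # escalated to a human
```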

 

Plan for Lifecycle Management: AI Agent Lifecycle Management is crucial, as agents are dynamic and adaptive, constantly learning and autonomously adjusting to changing conditions. Because agent properties are not static, ongoing model development—including changes to internal structure or new affordances—can significantly impact their function. This requires governance to be a modular, adaptive, and continuous process that avoids becoming obsolete.

 

Adopt proven frameworks, such as Google's Secure AI Framework (SAIF), to ensure a consistent approach that addresses security, risk management, privacy, and compliance. Organizations should consider articulating and ranking AI use cases based on business priority and the degree of risk they pose, tailoring controls accordingly.

 

Lifecycle governance mandates continuous evaluation across the agent's entire lifecycle, providing ongoing training and updates as policies, regulations, or organizational objectives evolve. It is necessary to monitor for specific lifecycle triggers (such as model retraining or dataset updates). The lifecycle ends with a decommission stage, where agents no longer fit for purpose are safely retired to minimize residual risk.

 

Mandate Comprehensive Auditability: Mandating comprehensive auditability and visibility into agent actions is essential for building trust, enabling security auditing, and supporting incident response. This visibility is crucial because agents, operating with complex and autonomous reasoning, can otherwise be opaque.

 

Robust logging must capture critical information, including inputs received (user prompts), actions taken or attempted, tools invoked, and outputs generated. Ideally, logs should also capture intermediate reasoning steps—the agent's "thought process"—to allow humans to understand how conclusions were reached. Activity logging provides the necessary immutable audit trails of all actions and decisions for accountability. Specialized formats like Agent Actions Cards can track API calls and execution traces.
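To illustrate the kind of structured record such logging might emit, here is a minimal Python sketch of an audit log entry covering input, reasoning steps, tool calls, and output. The field names are illustrative, not the schema of any particular logging product; the entry would be appended to a centralized, immutable log sink.

```python
import json
from datetime import datetime, timezone

# Illustrative structured audit log entry: inputs, reasoning, tool calls, and outputs.
def build_audit_entry(agent_id: str, user_id: str, prompt: str,
                      reasoning_steps: list, tool_calls: list, output: str) -> str:
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,                # unique identifier for attribution
        "on_behalf_of": user_id,             # human principal directing the agent
        "input": prompt,
        "reasoning_steps": reasoning_steps,  # the agent's "thought process", if available
        "tool_calls": tool_calls,            # API calls and execution traces
        "output": output,
    }
    return json.dumps(entry)  # append to a centralized, immutable log sink

line = build_audit_entry(
    agent_id="billing-agent-01",
    user_id="bob@example.com",
    prompt="Refund order 4471",
    reasoning_steps=["order found", "refund within policy window"],
    tool_calls=[{"tool": "payments.issue_refund", "order_id": "4471"}],
    output="Refund of $42.00 issued for order 4471.",
)
print(line)
```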

 

Effective Agent Observability requires investment in secure, centralized logging systems. The granularity of detail recorded should use a risk-based approach, balancing the activity's inherent risks with privacy considerations. These logs feed into automatic monitoring systems, which can use AI models to continuously review the primary agent’s behavior for reliability, performance, and security risks in dynamic environments. Finally, assigning unique identifiers to agents and humans ensures traceability and attribution for every action taken.

 

In conclusion, as AI agents become increasingly integral to enterprise operations, the need for robust governance is paramount. By embracing these foundational security principles, organizations can leverage Google’s Secure AI Framework (SAIF) and advanced security controls to navigate the evolving landscape.

 

By rigorously defining their sphere of influence, establishing clear attribution, controlling resources, securing the supply chain, building sandboxes, implementing kill switches, clarifying accountability, instituting human oversight, planning for lifecycle management, and mandating comprehensive auditability, organizations can harness the transformative power of AI agents while mitigating potential risks.

 

Embracing these foundational security principles and programmatic best practices will not only help enable secure AI adoption but also drive a risk-informed culture essential for navigating the evolving landscape of artificial intelligence. For further reading, refer to our recently published papers on Google’s Approach for Secure AI Agents, our guidance on agent orchestration, and the shadow AI threats to look out for.

 

These best practices are provided for informational purposes only. Please adapt these best practices to your organization’s specific needs and requirements.
