Modular Security Expertise: Leveling Up AI Runbooks with Agent Skills

March 4, 2026

Authors:
Ming Lu, AI Security Engineer
Dan Dye, Adoption Engineer, Security, Google Cloud

Introduction: The Context Conundrum

In the previous AI runbooks post, we introduced AI Runbooks for Google Security Operations (SecOps) and showed how to engineer context for Large Language Models (LLMs) so they can perform complex security tasks using Model Context Protocol (MCP) servers. By codifying our operational procedures as Markdown files in a `rules_bank` directory that Claude, Gemini, and Cline reference, we gave the AI a "brain" tailored for the SOC. Since the original release, we’ve added YAML frontmatter (for metadata) and rubrics (to facilitate agent-as-judge evaluation) to those runbooks. Today, we are exploring another major improvement: SecOps Agent Skills.

As we added more runbooks—covering everything from phishing response to APT hunting—we hit a common bottleneck: Context Window Clutter. Loading every single procedure, template, and reference script into the model's immediate memory is inefficient. It consumes expensive tokens and, more importantly, can lead to "attention drift," where the model loses focus on the specific task at hand (see https://arxiv.org/abs/2307.03172).

Agent Skills are the next logical step in this evolution. We are moving from static, persistent instructions to modular, on-demand expertise that the AI activates only when it’s needed.

What are Agent Skills?

Agent Skills are an open standard designed to package instructions, metadata, and optional resources (like scripts or templates) into a discoverable capability. Think of it as a "just-in-time" training manual for your AI analyst.

The power of Skills lies in Progressive Disclosure. Rather than reading every file upfront, the agent operates in three levels of loading:

Level 1: Metadata: The agent only knows the Skill exists and what it’s for (e.g., "Use this for PDF form filling").
Level 2: Instructions: When a task matches the description, the agent reads the SKILL.md file to understand the workflow.
Level 3: Resources & Code: If the instructions reference a specific Python script or a complex database schema, the agent accesses those assets only as required.
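
The three loading levels above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Skill` class, the `match_skill` helper, and the keyword-overlap matching are hypothetical stand-ins (in a real agent the model itself decides which description matches the task), and the `loads` list exists purely to show what was read and when.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Skill:
    """One skill directory. Only name and description are held up front (Level 1)."""
    name: str
    description: str
    root: Path
    loads: list = field(default_factory=list)  # records file reads, for illustration only

    def instructions(self) -> str:
        # Level 2: read SKILL.md only once the task matches the description.
        self.loads.append("SKILL.md")
        return (self.root / "SKILL.md").read_text(encoding="utf-8")

    def resource(self, relative_path: str) -> str:
        # Level 3: read a referenced script or template only when it is needed.
        self.loads.append(relative_path)
        return (self.root / relative_path).read_text(encoding="utf-8")


def match_skill(skills, task: str):
    """Pick a skill from metadata alone, using naive keyword overlap (an assumption)."""
    words = set(task.lower().split())

    def overlap(skill):
        return len(words & set(skill.description.lower().split()))

    best = max(skills, key=overlap)
    return best if overlap(best) > 0 else None
```

Nothing is read from disk until `instructions()` or `resource()` is called on the matched skill, so the cost of an unused skill is just its one-line description.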

At the time of writing, Agent Skills are supported by Claude Code and Gemini CLI (preview).

The Evolution of the ai-runbooks Repository

We added the skills to the ai-runbooks repository, organized by function:

Triage (4 Skills): Includes triage-alert, triage-suspicious-login, triage-malware, and deep-dive-ioc.
Threat Hunting (5 Skills): Provides specialized capabilities for hunt-threat, hunt-apt, hunt-ioc, hunt-lateral-movement, and hunt-credential-access.
Incident Response (4 Skills): Delivers end-to-end IR workflows, including respond-ransomware, respond-malware, respond-phishing, and respond-compromised-account.
Enrichment & Case Management: Facilitates essential data enrichment with enrich-ioc, pivot-on-ioc, and correlate-ioc, alongside case operations like check-duplicates and document-in-case.

Explicit Skill Outputs: Improving Reproducibility in Workflow Automation

To improve automation and predictability of results, the specification mandates the inclusion of Explicit Skill Outputs. Each skill now explicitly defines the data it must return in a ## Required Outputs section within its Markdown body.

This explicit definition improves Workflow Orchestration and Skill Chaining in several ways:

Standardized Hand-offs: In a multi-step workflow, such as a Tier 1 Analyst moving from triage-alert to enrich-ioc, the output of one skill serves as the structured input for the next. Output names use UPPER_SNAKE_CASE so that variables like MALICIOUS_CONFIDENCE are easy for subsequent agents in the chain to identify and consume.
State Management: Complex investigations (e.g., full-investigation) involve multiple atomic skills. Explicit outputs act as a "contract," forcing the LLM to extract and preserve specific evidence, which prevents critical information from being lost in the "noise" of a long conversation history.
Logical Branching: Workflows often require decision-making based on previous findings. For instance, the result of MALICIOUS_CONFIDENCE (High, Medium, or Low) determines whether the workflow proceeds to respond-malware or results in a close-case-artifact action.
Instruction Adherence: Placing these requirements in the Markdown body rather than the YAML frontmatter ensures that the LLM actively parses and follows these instructions as part of its procedural guidance.
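
To make the "contract" idea concrete, here is a minimal sketch of how an orchestrator could read the ## Required Outputs section and check a skill's result against it. The bullet-list layout of the section, the sample body, and the IOC_LIST name are assumptions made for illustration; only MALICIOUS_CONFIDENCE and the section heading come from the skills described here.

```python
import re

# Hypothetical skill body. The bullet format under the heading is an assumption.
SAMPLE_SKILL_BODY = """\
# Triage Alert

## Workflow
1. Review the alert.

## Required Outputs
- MALICIOUS_CONFIDENCE: High, Medium, or Low
- IOC_LIST: indicators extracted from the alert

## Notes
Free text.
"""

# A bullet whose first token is an UPPER_SNAKE_CASE name, optionally in backticks.
OUTPUT_LINE = re.compile(r"\s*[-*]\s*`?([A-Z][A-Z0-9_]*)\b")


def required_outputs(markdown_body: str) -> list[str]:
    """Return the UPPER_SNAKE_CASE names listed under '## Required Outputs'."""
    names, in_section = [], False
    for line in markdown_body.splitlines():
        if line.startswith("## "):
            # Entering or leaving the section of interest.
            in_section = line.strip() == "## Required Outputs"
            continue
        if in_section:
            match = OUTPUT_LINE.match(line)
            if match:
                names.append(match.group(1))
    return names


def missing_outputs(markdown_body: str, result: dict) -> list[str]:
    """Names the skill promised but the result did not provide."""
    return [name for name in required_outputs(markdown_body) if name not in result]
```

An orchestrator could call `missing_outputs` after each skill finishes and ask the model to retry when the returned list is not empty.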

By defining these outputs, we transform independent skills into composable building blocks for a cohesive, agentic SOC that can maintain focus and accuracy throughout the entire incident response lifecycle.
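
As a rough illustration of skills acting as composable building blocks, the sketch below chains two stubbed skills through a shared state dictionary and then branches on MALICIOUS_CONFIDENCE. Everything here is hypothetical: the stub functions stand in for real skills that would call MCP tools, IOC_LIST and ENRICHED_IOCS are made-up output names, and the rule that only High leads to respond-malware is an assumption, since the exact threshold is a policy choice.

```python
from typing import Callable, Dict, Iterable, Optional

State = Dict[str, object]
SkillFn = Callable[[State], State]


def triage_alert(state: State) -> State:
    # Stub: a real skill would inspect the alert through MCP tools.
    return {"MALICIOUS_CONFIDENCE": "High", "IOC_LIST": ["198.51.100.7"]}


def enrich_ioc(state: State) -> State:
    # Consumes the previous skill's explicit output as its structured input.
    return {"ENRICHED_IOCS": {ioc: "known-bad" for ioc in state["IOC_LIST"]}}


def run_chain(steps: Iterable[SkillFn], state: Optional[State] = None) -> State:
    """Run skills in order, merging each skill's outputs into the shared state."""
    merged: State = dict(state or {})
    for step in steps:
        merged.update(step(merged))
    return merged


def next_step(state: State) -> str:
    # Assumed policy: only High confidence triggers the response workflow.
    if state["MALICIOUS_CONFIDENCE"] == "High":
        return "respond-malware"
    return "close-case-artifact"
```

Calling `run_chain([triage_alert, enrich_ioc])` and passing the result to `next_step` selects the follow-up workflow without re-reading the conversation history, because the evidence lives in named outputs.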

Personas and Orchestration

Another significant addition to this new architecture is the Persona-Based Orchestration system. While Agent Skills provide the "how-to" for specific tasks, Personas define the "who"—giving the AI a codified identity, operational perspective, and clear boundaries within your security team.

What is a Persona?

A persona is a YAML-based manifest that defines a specific role within the SOC (e.g., Tier 1 Analyst, Incident Responder, or Threat Hunter). It serves as a blueprint for the AI, outlining its typical responsibilities, the set of skills it is authorized to use, and the specific workflow chains it should follow.

Why Do We Need Personas?

In a real-world SOC, efficiency and security rely on Divide and Conquer and Separation of Duties. Personas bring these disciplines to AI agents in several critical ways:

IAM & Security Boundaries: Different security tasks require different levels of access. By defining personas, we can map specific IAM roles to the AI's identity. For instance, a Tier 1 Analyst agent may only require viewer permissions in Google Security Operations, whereas an Incident Responder requires admin and soarAdmin roles to perform containment actions.
Tool & Skill Scoping: To prevent "tool sprawl" and reduce context window clutter, personas restrict the AI's toolset to only what is necessary for the job. A Threat Hunter persona will prioritize APT hunting and lateral movement skills, while a Tier 1 persona focuses on alert triage and duplicate checking.
Structured Intent: Personas help the LLM understand the user's intent based on their role, allowing it to tailor its responses and suggest the most appropriate next steps in an investigation.
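
Tool and skill scoping can be pictured as a simple allow-list check. The sketch below is an assumption-laden illustration, not the repository's implementation: the `PERSONAS` mapping, the threat-hunter persona name, and the `visible_skills` and `authorize` helpers are hypothetical, while the skill names are taken from the lists earlier in this post.

```python
# A small catalog of skills from the repository's categories.
ALL_SKILLS = {
    "check-duplicates",
    "triage-alert",
    "enrich-ioc",
    "hunt-apt",
    "hunt-lateral-movement",
    "respond-ransomware",
}

# Hypothetical persona-to-skill mapping, mirroring the roles described above.
PERSONAS = {
    "tier1-analyst": {"check-duplicates", "triage-alert", "enrich-ioc"},
    "threat-hunter": {"hunt-apt", "hunt-lateral-movement"},
}


def visible_skills(persona_name: str) -> list[str]:
    """Skills whose metadata would be exposed to the agent for this persona."""
    return sorted(ALL_SKILLS & PERSONAS[persona_name])


def authorize(persona_name: str, skill: str) -> str:
    """Return the skill name if the persona may use it, otherwise raise."""
    if skill not in PERSONAS[persona_name]:
        raise PermissionError(f"{persona_name} is not scoped to use {skill}")
    return skill
```

Filtering before the agent ever sees the catalog keeps out-of-scope skills from consuming context, and the explicit check guards against a skill being invoked by name anyway.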

Maximum Flexibility: Tailoring the Persona to Your SOC

One of the core strengths of this system is its flexibility. No two security teams are structured identically, so we’ve designed personas to be fully customizable:

Customize Existing Roles: You can easily update a persona's YAML file to add or remove skills as your team's responsibilities evolve.
Create New Roles: The existing personas provide a template, so that organizations can define entirely new personas—such as a "Compliance Manager" or "Cloud Security Architect"—to handle specialized workflows.
Regional or Team Variations: You can even maintain different versions of a persona for different regions or shifts (e.g., tier1-analyst-emea.yaml) or for different LLM models (e.g., tier1-analyst-pro.yaml, tier1-analyst-flash.yaml).

Sample Persona Document Structure

Below is a sample of the structure used in the _personas/ directory, illustrating how roles, skills, and workflows are codified:

name: tier1-analyst
description: Focuses on initial alert assessment, duplicate checking, and basic enrichment.
iam_requirements:
  chronicle: roles/chronicle.viewer
  soar: roles/chronicle.editor
  gti: GTI Standard
primary_skills:
  - check-duplicates
  - triage-alert
  - enrich-ioc
workflow_chains:
  alert_triage:
    steps:
      - check-duplicates
      - triage-alert
      - enrich-ioc
    escalation_criteria: "If MALICIOUS_CONFIDENCE is High, escalate to Tier 2."
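
Because personas are meant to be edited by hand, a small consistency check can catch mistakes early. The sketch below works on a plain dictionary that mirrors part of the sample above (in practice it could come from a YAML parser such as PyYAML's `yaml.safe_load`). The `validate_persona` helper and the rule that every workflow step must also appear in primary_skills are our assumptions for illustration, not requirements stated by the repository.

```python
# Dictionary mirroring part of the sample persona manifest.
PERSONA = {
    "name": "tier1-analyst",
    "primary_skills": ["check-duplicates", "triage-alert", "enrich-ioc"],
    "workflow_chains": {
        "alert_triage": {
            "steps": ["check-duplicates", "triage-alert", "enrich-ioc"],
        },
    },
}

# Keys this sketch treats as mandatory (an assumption).
REQUIRED_KEYS = ("name", "primary_skills", "workflow_chains")


def validate_persona(persona: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in persona]
    allowed = set(persona.get("primary_skills", []))
    for chain_name, chain in persona.get("workflow_chains", {}).items():
        for step in chain.get("steps", []):
            # Assumed rule: a chain may only use skills the persona is scoped to.
            if step not in allowed:
                problems.append(f"{chain_name}: step '{step}' is not in primary_skills")
    return problems
```

Running a check like this whenever a persona file changes would flag, for example, a regional variant whose workflow references a skill that was removed from its skill list.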

How to Use: A Practical Example

Using these skills in your daily workflow is seamless, whether you are in a terminal or an IDE.

Example: Triaging a Suspicious Login

Suppose you are alerted to a suspicious login from an unfamiliar IP address. You can invoke the specialized skill directly:

In Claude Code:

/triage-suspicious-login USER_ID=jsmith CASE_ID=1234

The agent will automatically:

Check for duplicate cases to see if this user has been flagged recently.
Enrich the IP address using gti-mcp to check for known malicious activity.
Cross-reference with other alerts using correlate-ioc.
Document findings directly in your case.

Example: Triaging a Case

In Gemini CLI, you can trigger an entire persona-based workflow with a single command:

gemini -p "@skills/_personas/tier1-analyst.yaml Triage CASE-1234 following this persona workflow"

The “Triage” keyword in the prompt is used to find the default_triage workflow, which chains together multiple skills and uses a decision tree to capture logic.
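
The keyword-to-workflow lookup is performed by the model, not by code, but a deterministic approximation helps show the idea. In the sketch below, `find_workflow` is a hypothetical helper and the second workflow name in the usage note is invented for the example; only default_triage comes from the description above.

```python
from typing import Iterable, Optional


def find_workflow(prompt: str, workflow_names: Iterable[str]) -> Optional[str]:
    """Return the first workflow whose name shares a word with the prompt."""
    # Normalize prompt words: strip trailing punctuation and lowercase.
    prompt_words = {word.strip(".,:;!?").lower() for word in prompt.split()}
    for name in workflow_names:
        # Workflow names are assumed to be snake_case, e.g. default_triage.
        if prompt_words & set(name.lower().split("_")):
            return name
    return None
```

With the prompt from the command above and the candidate names default_triage and weekly_report, this returns default_triage, because "Triage" matches the second word of that workflow's name.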

The Synergy: Runbooks + Skills + MCP

To summarize the new architecture:

Personas define who is asking and their likely intent.
Agent Skills provide the how—the modular, executable expertise for specific tasks.
MCP Servers (like secops-mcp, gti-mcp and remote MCP for SecOps) provide the tools to interact with the real-world security environment.

This synergy allows your AI agent to operate with the precision of an experienced analyst, using the right tools at the right time without being overwhelmed by a massive wall of text.

Furthermore, these skills are not limited to our own framework: the LLM can orchestrate them alongside third-party skills in the same workflow, which widens the range of tasks the agent can take on in real-world SOC operations.

Conclusion

Moving to an Agent Skills-based model represents a significant step forward in making LLMs effective allies in the SOC. It makes our automation more consistent, efficient, and scalable.

Explore the new skills directory in the ai-runbooks repo and start leveling up your AI-powered security operations today!