Missed the live session? Ready to take your parsing skills to the next level? Building on the foundation laid in our previous session, this technical deep dive moves beyond the basics of normalization.
In this webinar, Vishwanath Mantha, Principal Security Advisor at Google Cloud Security, guides us through the advanced mechanics of the Google SecOps pipeline. While the previous session, Parse Anything in Google SecOps: Parser Development Best Practices, covered the basics, Parsing in SecOps: Part Deux focuses on the critical "Schema on Write" architecture, managing the parser lifecycle, and leveraging new features like AI-assisted parser extensions and automatic extraction.
Whether you are onboarding new log sources, handling overlapping IP addresses in complex environments, or debugging a stubborn Grok pattern, this session provides the technical best practices to ensure your data is detection-ready.
Key Topics & Discussion Points
Jump straight to the sections that interest you most:
- [06:44] The Strategic Value of Parsing (Schema on Write)
  - Understanding why SecOps uses "Schema on Write" versus "Schema on Read."
  - The importance of normalizing raw logs (from various vendors) into the Unified Data Model (UDM) to drive fast search, detection, and reporting.
  - Breakdown of UDM data types: Nouns (Entities), Network, Security Results, and Enumerations.
- [16:01] The Data Ingestion Pipeline
  - Overview of ingestion methods: Forwarders, BindPlane (OpenTelemetry), cloud buckets, and direct APIs.
  - Crucial Concept: Distinguishing between a "Log Type" (the source format) and "Event Types" (what happened).
  - How raw logs travel through ingestion, normalization, indexing, and enrichment.
- [20:44] Parser Extensions (The "New" Pipeline)
  - How extensions are now decoupled from the default normalization phase to prevent breaking changes.
  - Code vs. No-Code: Options for extending parsers using the UI mapping or writing custom logic.
  - [39:28] Live Demo: Using Gemini to automatically generate parser extension code (e.g., extracting a hostname from a request URL) via natural language prompts.
- [28:50] Automatic Extraction
  - Introduction of "Schema on Read" capabilities for structured logs (JSON/XML).
  - How to extract up to 100 fields automatically into the `extractedUDM` field as a fallback when full parsers aren't available.
- [31:44] Entity & IOC Parsing
  - Writing to the Entity Graph vs. the Event Timeline.
  - Understanding the three contexts: Entity Context (customer-specific), Derived Context (prevalence/first-seen), and Global Context (Threat Intel).
- [35:14] Handling Complex Environments (Labels & Namespaces)
  - Best practices for tagging data.
  - Using Namespaces to handle RFC 1918 (private IP) collisions across different data centers.
- [45:13] Parser Lifecycle Management
  - New Feature: Parser Versioning.
  - How to opt out of automatic updates for specific log types.
  - Managing generic "Premium" parsers vs. custom customer parsers.
- [48:01] Monitoring & Troubleshooting
  - Using the "Health Hub" and "Data Ingestion and Health" dashboards.
  - Distinguishing between ingestion drops and parser validation errors.
  - Setting up Cloud Monitoring alerts for `failed_parsing` states.
- [52:14] The Logstash/Grok "Cookbook"
  - Deep dive into syntax specific to SecOps (Grok patterns, Regex).
  - Technical Tips:
    - Handling JSON arrays and nested loops.
    - The difference between `rename` (changing token names) and `replace` (changing values).
    - Using `merge` for repeated fields.
    - Correctly parsing timestamps and handling timezones.
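For a flavor of the cookbook material, here is a minimal, Logstash-style sketch of the operations discussed in the [52:14] section. The field names, Grok pattern, and UDM target path are illustrative assumptions, not taken from the webinar:

```
filter {
  # Tokenize the raw line with a Grok pattern; on_error keeps a
  # non-matching event alive instead of failing the parser.
  grok {
    match => {
      "message" => ["%{IP:src_ip} %{WORD:action} at %{TIMESTAMP_ISO8601:ts}"]
    }
    on_error => "grok_failed"
  }

  # rename: changes the token's NAME; the value is untouched.
  mutate {
    rename => { "src_ip" => "principal_ip" }
  }

  # replace: changes the token's VALUE under the existing name.
  mutate {
    replace => { "action" => "allow" }
  }

  # merge: appends into a repeated (array) field, e.g. a UDM IP list.
  mutate {
    merge => { "event.idm.read_only_udm.principal.ip" => "principal_ip" }
  }

  # Parse the timestamp with an explicit timezone rather than
  # relying on ingestion-time defaults.
  date {
    match => ["ts", "ISO8601"]
    timezone => "UTC"
  }
}
```

The `rename` vs. `replace` distinction trips people up constantly: `rename` moves a value to a new key, while `replace` overwrites the value stored under the existing key.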

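To make the "Schema on Write" idea from the [06:44] section concrete before you watch: normalization means a raw vendor log is mapped into UDM fields at ingest time, not at query time. A hypothetical, abridged mapping might look like this (the raw line and values are invented for illustration):

```
Raw log:
  2024-05-01T12:30:00Z sshd[411]: Accepted password for alice from 10.1.2.3

Normalized UDM event (abridged):
{
  "metadata": {
    "event_timestamp": "2024-05-01T12:30:00.000Z",
    "event_type": "USER_LOGIN",
    "product_name": "OpenSSH"
  },
  "principal": { "ip": ["10.1.2.3"] },
  "target": { "user": { "userid": "alice" } },
  "security_result": [ { "action": ["ALLOW"] } ]
}
```

Because every source lands in the same fields, a single UDM search such as `principal.ip = "10.1.2.3"` works across vendors; that consistency is the payoff of Schema on Write for search, detection, and reporting.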