Week 1: UDM: One Schema to Hunt Them All
Why Normalization Matters
Every security tool speaks its own language. To describe the same kind of activity, Windows Event Logs use "LogonType," Okta uses "eventType," and GCP uses "methodName." Hunting across all three means memorizing three schemas, writing three queries, and maintaining three sets of detection logic. Google SecOps solves this with the Unified Data Model (UDM), a single normalized schema that every log source maps into on ingestion.
When a Windows logon event (Event ID 4624), a GCP IAM authentication, and an Okta user.session.start all arrive in SecOps, they are each normalized to metadata.event_type = "USER_LOGIN". One query catches all three.
The UDM Noun Groups
Every UDM event is structured around noun groups. Each group describes a different participant or aspect of the event.
| Noun Group | What It Describes | Example Fields |
|---|---|---|
| metadata | Event classification and timing | metadata.event_type, metadata.log_type, metadata.product_event_type, metadata.event_timestamp |
| principal | The actor initiating the event | principal.user.email_addresses, principal.user.userid, principal.ip, principal.hostname |
| target | The entity being acted upon | target.resource.name, target.resource.resource_type, target.user.userid, target.ip |
| src | The original source (when relayed) | src.ip, src.hostname |
| observer | The device or sensor that reported the event | observer.hostname, observer.ip, observer.asset_id |
| intermediary | Proxies or load balancers between principal and target | intermediary.ip, intermediary.hostname |
| about | Additional referenced entities | about.file.sha256, about.url, about.ip |
| network | Network-level details | network.direction, network.ip_protocol, network.application_protocol |
| security_result | Verdicts and actions taken | security_result.action, security_result.severity, security_result.category |
| extensions | Product-specific fields | extensions.auth.type, extensions.vulns.vulnerabilities |
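To make the noun-group layout concrete, here is a minimal Python sketch that models one event as a nested dictionary and reads fields by dotted path. The event values and the `get_field` helper are hypothetical, made up for illustration. This is not the SecOps API or its storage format, only a picture of how the groups in the table nest.

```python
from typing import Any

# Hypothetical event shaped after the noun groups in the table above.
# All values are invented for illustration.
event = {
    "metadata": {
        "event_type": "USER_LOGIN",
        "log_type": "OKTA",
        "product_event_type": "user.session.start",
    },
    "principal": {
        "user": {"email_addresses": ["jsmith@company.com"]},
        "ip": ["203.0.113.10"],
    },
    "target": {"resource": {"name": "example-app"}},
}


def get_field(event: dict, path: str, default: Any = None) -> Any:
    """Walk a dotted path such as 'principal.user.email_addresses'."""
    current: Any = event
    for key in path.split("."):
        # Stop as soon as the path leaves the nested dictionaries.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


print(get_field(event, "metadata.event_type"))
print(get_field(event, "principal.user.email_addresses"))
print(get_field(event, "observer.hostname", default="<not set>"))
```

Noun groups that a source does not populate (here, observer) are simply absent, which is why the helper takes a default.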
Key Fields You Will Use Constantly
Metadata fields classify the event:
- metadata.event_type: The normalized action. Common values include USER_LOGIN, NETWORK_CONNECTION, PROCESS_LAUNCH, FILE_CREATION, USER_RESOURCE_ACCESS, STATUS_UPDATE, RESOURCE_CREATION, and USER_RESOURCE_UPDATE_PERMISSIONS.
- metadata.log_type: The source product. Examples: GCP_CLOUDAUDIT, WINDOWS_AD, OKTA, AZURE_AD, CS_EDR, WORKSPACE_ALERTS.
- metadata.product_event_type: The raw event name from the source (e.g., "google.iam.admin.v1.CreateServiceAccountKey").
- metadata.event_timestamp: When the event occurred.
Principal and target fields describe who did what to whom:
- principal.user.email_addresses / target.user.email_addresses: Email of the acting or target user.
- principal.user.userid / target.user.userid: User identifier.
- principal.ip / target.ip: IP addresses involved.
- principal.hostname / target.hostname: Hostnames involved.
- target.resource.name: The specific resource acted on (e.g., a GCP project, a secret, a VM).
- target.resource.resource_type: The type of resource (e.g., "STORAGE_BUCKET").
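One way to read these fields together is as a sentence: principal, action, target. The sketch below builds that sentence from a dictionary-shaped event. The `summarize` helper and the event values are hypothetical, and the sketch assumes email_addresses holds a list, falling back to userid when no email is present.

```python
def summarize(event: dict) -> str:
    """Render 'who did what to whom' from a dictionary-shaped event."""
    meta = event.get("metadata", {})
    principal_user = event.get("principal", {}).get("user", {})
    resource = event.get("target", {}).get("resource", {})

    # Prefer the first email address, then fall back to userid.
    emails = principal_user.get("email_addresses", [])
    actor = emails[0] if emails else principal_user.get("userid", "unknown actor")

    action = meta.get("event_type", "UNKNOWN_EVENT")
    name = resource.get("name", "unknown resource")
    rtype = resource.get("resource_type", "UNKNOWN_TYPE")
    return f"{actor} -> {action} -> {name} ({rtype})"


# Invented example event; names and values are illustrative only.
event = {
    "metadata": {"event_type": "RESOURCE_CREATION", "log_type": "GCP_CLOUDAUDIT"},
    "principal": {"user": {"email_addresses": ["jsmith@company.com"]}},
    "target": {
        "resource": {"name": "example-bucket", "resource_type": "STORAGE_BUCKET"}
    },
}

print(summarize(event))
```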
How Normalization Works in Practice
Consider these three raw events from different sources:
- Windows AD: Event ID 4624, LogonType=10, TargetUserName="jsmith"
- GCP IAM: methodName="google.cloud.identity.v1.SignIn", principalEmail="jsmith@company.com"
- Okta: eventType="user.session.start", actor.alternateId="jsmith@company.com"
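After ingestion, all three become UDM events with metadata.event_type = "USER_LOGIN". Where the user identity lands depends on the noun role it plays in the event: an account being logged into maps to target.user.userid, while an acting user's email maps to principal.user.email_addresses. Because the event type is the same in every case, a single query or detection rule covers all three authentication sources.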
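The three events above can be run through a toy normalizer to show the idea. This is a sketch under stated assumptions, not how SecOps parsers are implemented: the `normalize` function, the choice of log_type label for each source, and the principal/target placement of each identity are all illustrative guesses based on the examples in this section.

```python
def normalize(source: str, raw: dict) -> dict:
    """Map a raw event to a simplified UDM-like dict (toy mapping only)."""
    if source == "WINDOWS_AD" and raw.get("EventID") == 4624:
        return {
            "metadata": {"event_type": "USER_LOGIN", "log_type": source},
            "target": {"user": {"userid": raw["TargetUserName"]}},
        }
    if (
        source == "GCP_CLOUDAUDIT"
        and raw.get("methodName") == "google.cloud.identity.v1.SignIn"
    ):
        return {
            "metadata": {"event_type": "USER_LOGIN", "log_type": source},
            "principal": {"user": {"email_addresses": [raw["principalEmail"]]}},
        }
    if source == "OKTA" and raw.get("eventType") == "user.session.start":
        return {
            "metadata": {"event_type": "USER_LOGIN", "log_type": source},
            "principal": {
                "user": {"email_addresses": [raw["actor"]["alternateId"]]}
            },
        }
    # Anything this toy mapping does not recognize.
    return {"metadata": {"event_type": "GENERIC_EVENT", "log_type": source}}


# The three raw events from the list above, as (source, payload) pairs.
raw_events = [
    ("WINDOWS_AD", {"EventID": 4624, "LogonType": 10, "TargetUserName": "jsmith"}),
    ("GCP_CLOUDAUDIT", {
        "methodName": "google.cloud.identity.v1.SignIn",
        "principalEmail": "jsmith@company.com",
    }),
    ("OKTA", {
        "eventType": "user.session.start",
        "actor": {"alternateId": "jsmith@company.com"},
    }),
]

events = [normalize(source, raw) for source, raw in raw_events]

# One filter on the normalized field matches all three sources.
logins = [e for e in events if e["metadata"]["event_type"] == "USER_LOGIN"]
for e in logins:
    print(e["metadata"]["log_type"], e["metadata"]["event_type"])
```

The filter never mentions EventID, methodName, or eventType. Source-specific knowledge lives only in the mapping step, which is the point of normalizing on ingestion.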
