I'm currently working on some detection rules and tuning false positives for Netskope DLP/Governance alerts (NETSKOPE_ALERT_V2).
I noticed a specific abstraction behavior with the default parser regarding the User-Agent field and wanted to clarify how it works under the hood.
In the raw JSON log, we receive the exact user agent string from the Netskope forwarder. For example: "useragent": "Microsoft SkyDriveSync 26.022.0203.0006 ship; Windows NT 10.0 (26200)"
However, in the parsed UDM, the original string is completely dropped, and the fields are populated as follows:
Since losing the raw string limits our ability to write granular exceptions for specific sync clients without heavily relying on URLs or app names, my questions are:
Why does the parser map the specific raw user agent string to "Native" instead of preserving the original value in the UDM (e.g., mapping it directly to network.http.user_agent or storing it as a fallback in additional.fields)?
What exactly does "Native" mean in the context of this specific UDM abstraction? Does it act as a hardcoded catch-all category for any non-browser desktop/sync client identified by Netskope?
Any insights into the parser logic or plans to retain the raw string in future updates would be greatly appreciated!
Thanks.
Best answer by dnehoda
1. Why is the raw string dropped?
In the Chronicle UDM (Unified Data Model) philosophy, parsers often prioritize normalization over preservation.
Mapping Logic: The default parser for NETSKOPE_ALERT_V2 appears to use a lookup table or regex-based classification. If the User-Agent doesn't match a known browser (like Chrome, Firefox, or Safari), the parser categorizes the type of traffic rather than passing the literal string through.
The network.http.user_agent Field: In many Chronicle parsers, this field is intended to hold a "cleaned" or categorized version of the agent. By mapping it to "Native," the parser is essentially saying, "I recognize this isn't a standard web browser, but I don't have a specific UDM sub-field for this exact version of SkyDriveSync."
Missing Fallback: Ideally, the raw string should be mapped to about.custom_details or network.http.parsed_user_agent.full_tag, but many default parsers omit this to save on indexing costs or because the mapping was written with a "broad strokes" approach.
2. What does "Native" actually mean?
In the context of UDM abstraction, "Native" is a categorical label. It typically indicates Non-Browser App Traffic.
The "Catch-all": You hit the nail on the head: it acts as a hardcoded category. It identifies traffic originating from a local operating system process or a dedicated application (like the OneDrive/SkyDrive sync engine, the Dropbox desktop client, or even a PowerShell script) rather than a standard web browser.
USER_DEFINED: When you see family = "USER_DEFINED", it usually means the underlying library used by the parser didn't find a match in its standard "Browser/OS" dictionary, so it fell back to a generic bucket defined by the parser author.
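To make the catch-all behavior concrete, here is a minimal Python sketch of that kind of classification. It is not the actual NETSKOPE_ALERT_V2 parser logic, which is not shown in this thread. The browser token list, the `classify_user_agent` name, the `KNOWN_BROWSER` placeholder, and the shape of the returned dict are all assumptions made purely for illustration.

```python
import re

# Hypothetical browser dictionary; the real parser's lookup table is not known.
# Order matters: Edge and Chrome user agents also contain "Safari/".
KNOWN_BROWSERS = [
    ("Edge", re.compile(r"Edg/")),
    ("Chrome", re.compile(r"Chrome/")),
    ("Firefox", re.compile(r"Firefox/")),
    ("Safari", re.compile(r"Safari/")),
]


def classify_user_agent(raw: str) -> dict:
    """Return a normalized label for a raw User-Agent string.

    Anything that does not match a known browser token falls into the
    "Native" / USER_DEFINED bucket. The raw string is not kept, which
    mirrors the loss of detail described in the question.
    """
    for name, pattern in KNOWN_BROWSERS:
        if pattern.search(raw):
            # "KNOWN_BROWSER" is a placeholder, not a real UDM enum value.
            return {"user_agent": name, "family": "KNOWN_BROWSER"}
    return {"user_agent": "Native", "family": "USER_DEFINED"}


if __name__ == "__main__":
    sync_client = "Microsoft SkyDriveSync 26.022.0203.0006 ship; Windows NT 10.0 (26200)"
    print(classify_user_agent(sync_client))
```

Under these assumptions, every sync client, CLI tool, or script collapses into the same "Native" label, which is why exceptions for one specific client cannot be written against the normalized value alone.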
You can use the following parser extension to potentially mitigate this.
filter {
  # 1. Extract the raw JSON from the 'message' field
  json {
    source => "message"
    target => "netskope"
  }

  # 2. Check if the useragent field exists and isn't empty
  if [netskope][useragent] != "" {
    # 3. Use mutate to map the raw string into the UDM structure
    # We use 'event1' as the default event handle for extensions
    mutate {
      replace => {
        "event1.idm.read_only_udm.network.http.parsed_user_agent.full_tag" => "%{[netskope][useragent]}"
      }
    }
  }

  # 4. Merge the extension's findings into the final output
  mutate {
    merge => {
      "@output" => "event1"
    }
  }
}
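To sanity-check what the extension is meant to produce before deploying it, the same mapping can be mimicked offline. The following is a Python sketch, not parser code. The `apply_extension` name and the nested-dict stand-in for a UDM event are assumptions for illustration. Only the `useragent` key and the `network.http.parsed_user_agent.full_tag` target come from the thread.

```python
import json


def apply_extension(raw_log: str) -> dict:
    """Mimic the extension: copy a non-empty `useragent` value into
    network.http.parsed_user_agent.full_tag of a UDM-like dict."""
    record = json.loads(raw_log)
    udm: dict = {}
    useragent = record.get("useragent", "")
    if useragent != "":
        udm = {
            "network": {
                "http": {"parsed_user_agent": {"full_tag": useragent}}
            }
        }
    return udm


if __name__ == "__main__":
    sample = json.dumps(
        {"useragent": "Microsoft SkyDriveSync 26.022.0203.0006 ship; Windows NT 10.0 (26200)"}
    )
    print(apply_extension(sample))
```

If the extension behaves as intended, the raw string should then be available in `full_tag` for exceptions that target a specific sync client, alongside the normalized "Native" value.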