
Hi - I'm writing SIEM documentation for my SOC and am struggling to quickly produce summaries of all the field names relevant to us, along with lists of their corresponding values. In Splunk I'd use | fieldsummary, | table *, or some other option.

In SecOps SIEM UDM search I run searches on a log type, review a set of results, create columns for my UDM search results, view them, and download a CSV of those results with my columns. Then I iterate on my log type search with exclusions for the common field values I found, and repeat.
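
To approximate Splunk's | fieldsummary on those CSV exports, I run something like this offline (just a rough sketch in Python - the file name and columns are examples from my own exports, not anything SecOps provides):

# Rough sketch: per-column value summary of a UDM search CSV export,
# roughly equivalent to Splunk's | fieldsummary.
# "udm_export_firewall.csv" is an example name for my own downloaded file.
import pandas as pd

df = pd.read_csv("udm_export_firewall.csv", dtype=str)

for column in df.columns:
    values = df[column].dropna()
    print(f"\n=== {column} ===")
    print(f"distinct values: {values.nunique()}")
    # Ten most common values and their counts for this column.
    print(values.value_counts().head(10).to_string())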

Pitfalls of my approach are the restrictions on the number of result rows allowed and the difficulty of seeing the more rarely occurring field values.

I have to iterate on the searches because SecOps SIEM searches limit the number of rows available in the results: 1 million rows for UDM searches and 30k rows for statistics and aggregation searches. With these limits it's difficult to see the full distribution of values for high-volume logs like firewall.

So I have to iteratively filter out the most common values for the fields of interest in my SecOps SIEM searches, to keep reducing the result rows and stay under the UDM results row limits.

But filtering on the most common values for a field can cause me to miss uncommon values in other fields, if those values only occur in logs that also carry the field values I filtered out. A sketch of how I build the next round of exclusions follows.
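
In practice I generate the next exclusion search from the value counts, roughly like this (again just a sketch - the log type, field name, and exact UDM search operators here are illustrative):

# Rough sketch: build the next UDM search from the most common values seen
# so far, so the following pass surfaces the rarer values.
# The log type, field name, and UDM search syntax are illustrative only.
import pandas as pd

df = pd.read_csv("udm_export_firewall.csv", dtype=str)

field = "target.ip"                              # example column of interest
top_values = df[field].value_counts().head(5).index

exclusions = " AND ".join(f'{field} != "{value}"' for value in top_values)
next_query = f'metadata.log_type = "GCP_FIREWALL" AND {exclusions}'
print(next_query)   # paste into UDM search for the next iteration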

There clearly is a solution within SecOps SIEM as evidenced by the existence of the "UDM Lookup" tab (image)...

...but UDM search cannot search on wildcards (though I only tried * and .*).

Also, UDM Lookup cannot find or display some important fields: e.g. in Gmail logs, the subfields of the "additional" field or "labels" types, such as about.labels["post_delivery_action"].

 

This seems highly suboptimal. What am I missing?

 

A further data visibility issue: when UDM search results hit the output limits, not all of the data is shown - it's sampled.

 


Hi @Chris_B,

A really good question to be honest. Besides using BigQuery, the only other way I could imagine doing this is to look at the parser and use some regex to capture all occurrences of a UDM field within it.

For example, take the parser logic below (from https://medium.com/@fabrizio.rendina/how-to-realize-a-parser-for-google-chronicle-siem-dda820754b55):

# Product: QNAP Systems Nas
# Category: Storage
# Last Updated: 2023-07-25
# Author: Fabrizio Rendina


filter {
# initialize variables
mutate {
replace => {
"WHEN" => ""
"SRCLOG" => ""
"TIPOLOG" => ""
"USERNAME" => ""
"SRCLOGIP" => ""
"COMPUTERNAME" => ""
"APPLICATION" => ""
"CATEGORY" => ""
"CONTENT" => ""
"CONNTYPE" => ""
"RESOURCE" => ""
"ACTION" => ""
}
}

# Extract fields from the raw log.
grok {
match => {
"message" => ["(<\\d+>)?%{SYSLOGTIMESTAMP:WHEN} %{SYSLOGHOST:SRCLOG} %{DATA}: %{GREEDYDATA:TIPOLOG} log: Users: %{GREEDYDATA:USERNAME}, Source IP: %{GREEDYDATA:SRCLOGIP}, Computer name: %{GREEDYDATA:COMPUTERNAME}, Application: %{GREEDYdata&colon;APPLICATION}, Category: %{DATA:CATEGORY}, Content: %{GREEDYDATA:CONTENT}"
"(<\\d+>)?%{SYSLOGTIMESTAMP:WHEN} %{SYSLOGHOST:SRCLOG} %{DATA}: %{GREEDYDATA:TIPOLOG} log: Users: %{USERNAME:USERNAME}, Source IP: %{IP:SRCLOGIP}, Computer name: %{GREEDYDATA:COMPUTERNAME}, Connection type: %{DATA:CONNTYPE}, Accessed resources: %{DATA:RESOURCE}, Action: %{GREEDYDATA:ACTION}"]
}
overwrite => ["WHEN","SRCLOG","TIPOLOG","USERNAME","SRCLOGIP","COMPUTERNAME","APPLICATION","CATEGORY","CONTENT","CONNTYPE","RESOURCE","ACTION"]
on_error => "not_valid_log"
}

# Parse event timestamp
if [WHEN] != "" {
date {
match => [ "WHEN", "MMM dd HH🇲🇲ss", "MMM d HH🇲🇲ss"]
rebase => true
}
}

# Save the value in "when" to the event timestamp
mutate {
rename => {
"WHEN" => "timestamp"
}
on_error => "timestamp_error"
}

#Convert the SRCLOGIP from GREEDYDATA to IP:
mutate {
replace => {
"src_ip_temp" => "%{SRCLOGIP}"
}
convert => {
"src_ip_temp" => "ipaddress"
}
on_error => "not_a_src_ip"
}

if ![not_a_src_ip] {
mutate {
merge => {
"event.idm.read_only_udm.principal.ip" => "SRCLOGIP"
}
}
}

# Transform and save username
if [USERNAME] not in [ "-" ,"" ] {
mutate {
lowercase => ["USERNAME"]
}
mutate {
replace => {
"event.idm.read_only_udm.principal.user.userid" => "%{USERNAME}"
}
on_error => "Username_error"
}
}

if [ACTION] != "" {
mutate {
replace => {
"security_result.description" => "%{ACTION}"
}
merge => {
"event.idm.read_only_udm.security_result" => "security_result"
}
}
}

mutate {
replace => {
"event.idm.read_only_udm.metadata.product_name" => "QNAP Nas"
"event.idm.read_only_udm.metadata.vendor_name" => "QNAP"
"event.idm.read_only_udm.target.hostname" => "%{SRCLOG}"
"event.idm.read_only_udm.target.application" => "%{APPLICATION}"
"event.idm.read_only_udm.target.file.full_path" => "%{RESOURCE}"
"event.idm.read_only_udm.principal.hostname" => "%{COMPUTERNAME}"
"event.idm.read_only_udm.metadata.description" => "%{CONTENT}"
"event.idm.read_only_udm.network.application_protocol_version" => "%{CONNTYPE}"
"event.idm.read_only_udm.metadata.event_type" => "GENERIC_EVENT"
}
on_error => "multiple_replace_error"
}

# save event to @output
mutate {
merge => {
"@output" => "event"
}
}
} #end of filter

Using a simple regex pattern:

".*\\..*_.*["]\\s=

returns the following:

"event.idm.read_only_udm.principal.ip" =
"event.idm.read_only_udm.principal.user.userid" =
"event.idm.read_only_udm.security_result" =
"event.idm.read_only_udm.metadata.product_name" =
"event.idm.read_only_udm.metadata.vendor_name" =
"event.idm.read_only_udm.target.hostname" =
"event.idm.read_only_udm.target.application" =
"event.idm.read_only_udm.target.file.full_path" =
"event.idm.read_only_udm.principal.hostname" =
"event.idm.read_only_udm.metadata.description" =
"event.idm.read_only_udm.network.application_protocol_version" =
"event.idm.read_only_udm.metadata.event_type" =

I'm unable to view a full parser, so I'm unsure how viable this solution is, but hopefully it helps?

Kind Regards,

Ayman

