Hi @angel288 ,
You should be able to use the statistical functions within YARA-L, e.g. stddev: https://cloud.google.com/chronicle/docs/investigation/statistics-aggregations-in-udm-search#stddev
Alternatively, you can use time windows and stddev, variance, etc.: https://cloud.google.com/chronicle/docs/detection/yara-l-2-0-syntax#windowvariance
Good afternoon @_K_O
Thank you very much for your answer.
The problem is that I can't use statistics functions over already grouped variables.
Take a look at this search. This is what I want to potentially accomplish in a detection rule. I don't really know if it makes sense.
metadata.log_type = "FIREWALL"
and principal.ip = "10.4.14.20"
$date = timestamp.get_date(metadata.event_timestamp.seconds, "UTC")
match:
$date
outcome:
$logs = count(metadata.id)
$avg = window.avg($logs) // I can't do this.
If I can't work with the $logs value, it is not possible for me to know what deviates from normal, right?
Thanks!
Hi @angel288, YARA-L has some surprising limitations and there is no sub-select functionality within UDM Search. I think you would be able to use BigQuery or the Search API to do this, but I haven't tested it out.
For your use case, if I understand it correctly, you're looking for the average over a specific timeframe. You should be able to use the math functions to get the average number of events per unit of time. For example, dividing a daily count by 86400 (the number of seconds in 24 hours) gives events per second:
$detection.metadata.log_type = "FIREWALL"
$detection.principal.ip = "10.4.14.20"
$date = timestamp.get_date($detection.metadata.event_timestamp.seconds, "UTC")
match:
$date
outcome:
$avg = math.round(count($detection.metadata.id) / 86400)
This creates a stats table showing the average number of events per second for each date. Divide by 1440 or 24 instead to get the average per minute or per hour.
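To make the arithmetic behind that division concrete, here is a small Python sketch, separate from Chronicle. The function name and the sample counts are made up for illustration. It also shows one thing to watch for: for a low-volume source the per-second average is well below 1, so rounding it to an integer gives 0 (this assumes math.round rounds to the nearest integer, as Python's round does for these values).

```python
SECONDS_PER_DAY = 86_400   # 24 hours
MINUTES_PER_DAY = 1_440
HOURS_PER_DAY = 24


def average_rates(events_per_day):
    """Convert one day's event count into average rates per unit of time."""
    return {
        "per_second": events_per_day / SECONDS_PER_DAY,
        "per_minute": events_per_day / MINUTES_PER_DAY,
        "per_hour": events_per_day / HOURS_PER_DAY,
    }


# Hypothetical daily count for a busy firewall.
busy = average_rates(172_800)
print(busy)  # {'per_second': 2.0, 'per_minute': 120.0, 'per_hour': 7200.0}

# Hypothetical daily count for a quiet source: the per-second rate rounds to 0.
quiet = average_rates(5_000)
print(round(quiet["per_second"]))  # 0
```

If the rounded per-second value comes out as 0, dividing by 24 (events per hour) is probably the more useful unit.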

Not sure if this is exactly what you're going for, but this blog post contains some useful methods to try out if you're interested: https://medium.com/@thatsiemguy/aggregate-queries-in-udm-search-1b885c8c27d5
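If you end up doing this outside of UDM Search, the two-stage aggregation from the original question (count events per date, then compute statistics over those counts) is straightforward in a script. The following is only a sketch in plain Python: it assumes you have already exported one date string per event by some means, and the function names, the sample data, and the 2-standard-deviation threshold are all made up for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def daily_counts(event_dates):
    """Stage 1: count events per date (the equivalent of `match: $date` with count())."""
    return Counter(event_dates)


def deviating_days(counts, threshold=2.0):
    """Stage 2: return the dates whose count is more than `threshold`
    population standard deviations away from the mean daily count."""
    values = list(counts.values())
    avg = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return []  # every day has the same count, so nothing deviates
    return [day for day, n in counts.items() if abs(n - avg) / sd > threshold]


# Hypothetical export: six normal days of 100 events and one day of 400.
events = [f"2024-01-0{d}" for d in range(1, 7) for _ in range(100)]
events += ["2024-01-07"] * 400

counts = daily_counts(events)
print(deviating_days(counts))  # ['2024-01-07']
```

The second stage is the part UDM Search cannot express, because it needs the already-grouped counts as its input.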
@angel288, another way to address this could be a combination of Native Dashboards, Data Tables, and your YARA-L detection rule. At a high level, create a Native Dashboard that shows average stats across log sources, and output the average to a data table using the write_row function in the export section of the YARA-L query.
In the detection rule, reference the data table updated by the Native Dashboard to look up the average for the log type, and compare that value with the count the rule observes. You may also need to use the "over" operator in the match section and ensure the sample times for the dashboard and the detection rule are the same.
Dashboard Mock-Up
// parsing health dashboard - log ingest 24 hour average
ingestion.log_type != ""
$log_type = ingestion.log_type
//$date = timestamp.get_date(ingestion.start_time)
//$hours = timestamp.get_hour(ingestion.start_time)
match:
$log_type //, $date, $hours
outcome:
// $ingested_log_count = sum(if(ingestion.component="Ingestion API" and ingestion.state = "", ingestion.log_count,0))
$avg_logs_24hr = math.round(sum(if(ingestion.component="Ingestion API" and ingestion.state = "", ingestion.log_count,0) / 24))
order:
$avg_logs_24hr desc
export:
%avg_logs_24hr.write_row(
logType:$log_type,
avgLogs:$avg_logs_24hr
)
Detection Rule Mock-Up
rule logsource_deviation_poc {
meta:
rule_name = "Log Source Deviation"
author = "xxxxxx"
description = "Detect a spike of 20% from a Firewall"
severity = "Medium"
events:
metadata.log_type = "FIREWALL"
principal.ip = "10.4.14.20"
$date = timestamp.get_date(metadata.event_timestamp.seconds, "UTC")
match:
$date over 1h
outcome:
$logs = count(metadata.id)
$avg = %avg_logs_24hr.<column_name> // reference the log source data table
$trigger = ($avg * 1.2)
condition:
$logs > $trigger
}
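The comparison the rule mock-up makes can be sketched in plain Python to check the logic before wiring it into a rule. This is only an illustration: the dictionary stands in for the data table, and its contents, the function name, and the sample counts are hypothetical.

```python
# Hypothetical stand-in for the data table written by the dashboard:
# log type -> average number of logs per hour.
avg_logs_per_hour = {"FIREWALL": 1000.0}


def exceeds_baseline(logs_in_window, avg_per_window, spike_ratio=1.2):
    """Return True when the observed count is above the baseline by more
    than the allowed ratio (1.2 means a spike of more than 20%)."""
    trigger = avg_per_window * spike_ratio
    return logs_in_window > trigger


baseline = avg_logs_per_hour["FIREWALL"]
print(exceeds_baseline(1250, baseline))  # True: above the 1200 trigger
print(exceeds_baseline(1100, baseline))  # False: within 20% of the baseline
```

Both numbers need to cover the same window length. Here the baseline is an hourly average, so the observed count should also be for one hour, which is why the rule matches over 1h.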