Thanks for the question!
I've always thought of the predicates in the documentation you're referring to as lines of criteria. That may not be the best way to describe them, but it's the best I've come up with when I talk about building rules.
If you take the sample rule below, the documentation states:
In the events section, list the predicates to specify the following:
What each match or placeholder variable represents
Simple binary expressions as conditions
Function expressions as conditions
Reference list expressions as conditions
Logical operators
Each line of criteria in the events section is what the docs refer to as a predicate. Each one uses an operator (here = and >) to relate a field to a value. The last line of the events section assigns a field to a placeholder variable, which is also used as the match variable in the match section. This example doesn't use regular expressions, functions, or reference lists, but the same concept applies: each of these lines of criteria is a predicate.
The documentation also says:
In the events section, all predicates are regarded as anded together by default.
Notice that we don't have AND or OR separating these lines in the events section; that follows from the statement above. We can group predicates in parentheses, and inside the parentheses AND/OR are required.
The final example I will call out is in the condition section:
List condition predicates for outcome variables here, joined with the keyword and or or, or preceded by the keyword not.
In the condition section we have the event variable $conn, because the rule needs to take into account the event predicates specified in the events section. We also use the outcome variables for total and largest bytes received. They are calculated in the outcome section, but here they act as condition predicates. Again, I think of these as criteria for the rule to trigger: both outcome variables must meet their thresholds before the rule fires.
rule zeek_network_connections_bytes_received {
  meta:
    author = "Google Cloud Security"
  events:
    $conn.metadata.event_type = "NETWORK_CONNECTION"
    $conn.metadata.product_name = "Bro"
    $conn.metadata.vendor_name = "Zeek"
    $conn.metadata.product_event_type = "conn"
    $conn.metadata.description = "SF - Normal establish & termination"
    $conn.network.received_bytes > 0
    $conn.principal.hostname = $hostname
  match:
    $hostname over 30m
  outcome:
    $largest_bytes_received = max($conn.network.received_bytes)
    $smallest_bytes_received = min($conn.network.received_bytes)
    $total_bytes_received = sum($conn.network.received_bytes)
  condition:
    $conn and $total_bytes_received > 1000000 and $largest_bytes_received < 8000
}
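If it helps to see the outcome and condition logic outside of YARA-L, here is a rough TypeScript model of it. This is only a sketch and not how the rules engine works internally: it assumes the events section and the match window have already been applied, so the input is just the received_bytes values for one hostname in one 30-minute window. The function names evaluateCondition and repeat are made up for illustration.

```typescript
// Hypothetical helper: build a window of `count` connections that each
// received `value` bytes.
function repeat(value: number, count: number): number[] {
  const out: number[] = [];
  for (let i = 0; i < count; i++) {
    out.push(value);
  }
  return out;
}

// Models: $conn and $total_bytes_received > 1000000 and $largest_bytes_received < 8000
// Input: received_bytes of every event that passed the events section for
// one $hostname within one 30-minute match window (an assumption of this sketch).
function evaluateCondition(receivedBytes: number[]): boolean {
  // $conn: at least one event matched the event predicates.
  const connMatched = receivedBytes.length > 0;
  // $largest_bytes_received = max(...)
  const largest = receivedBytes.reduce((m, b) => Math.max(m, b), -Infinity);
  // $total_bytes_received = sum(...)
  const total = receivedBytes.reduce((s, b) => s + b, 0);
  // Every condition predicate must hold for the rule to fire.
  return connMatched && total > 1000000 && largest < 8000;
}

const manySmallTransfers = repeat(6000, 200);              // total 1,200,000, largest 6,000
const oneLargeTransfer = manySmallTransfers.concat([9000]); // largest 9,000
const quietHost = repeat(6000, 3);                          // total 18,000

console.log(evaluateCondition(manySmallTransfers)); // true
console.log(evaluateCondition(oneLargeTransfer));   // false
console.log(evaluateCondition(quietHost));          // false
```

The point of the model is that each piece of the condition line is its own true/false check, and the rule only fires when all of them are true for the same window.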
Hope this helps!
Thanks, John, for breaking it down.
I was wondering why the docs don't just use the word "criteria" instead of "predicate" to describe this concept.
One way to look at the events section in YARA-L is as one big Boolean expression that selects a base set (or sometimes sets) of events for the rule. Seen through that lens, each element in the events section is indeed a "predicate": it can be interpreted as a Boolean expression that returns true or false. When the rule is processed, all these predicates are combined using Boolean logic (even when you see no Boolean operator between predicates, there is an implicit AND). So characterizing the elements as predicates is technically correct, but I certainly agree the wording could be clarified and the terminology simplified to serve a wider audience.
Ah yes! Thanks so much, @herrald.
That makes a lot of sense. Even in JS, there is a concept of predicate functions that return either true or false.
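To spell out that analogy, here is a minimal TypeScript sketch of predicate functions combined with an implicit AND, loosely mirroring a few lines from the events section of the sample rule. The ConnEvent shape, its field names, and the matchesAll helper are simplified stand-ins invented for this example; they are not part of YARA-L or any Google API.

```typescript
// Hypothetical, simplified event shape; real events carry many more fields.
interface ConnEvent {
  eventType: string;
  vendorName: string;
  receivedBytes: number;
  hostname: string;
}

// A predicate is any function that answers true or false for a value.
type Predicate<T> = (value: T) => boolean;

// Each entry plays the role of one line of criteria in the events section.
const eventPredicates: Predicate<ConnEvent>[] = [
  (e) => e.eventType === "NETWORK_CONNECTION",
  (e) => e.vendorName === "Zeek",
  (e) => e.receivedBytes > 0,
];

// Implicit AND: a value is selected only if every predicate returns true.
function matchesAll<T>(predicates: Predicate<T>[], value: T): boolean {
  return predicates.every((p) => p(value));
}

const selected: ConnEvent = {
  eventType: "NETWORK_CONNECTION",
  vendorName: "Zeek",
  receivedBytes: 512,
  hostname: "host-a",
};

// Same event, but it fails the receivedBytes > 0 predicate.
const rejected: ConnEvent = { ...selected, receivedBytes: 0 };

console.log(matchesAll(eventPredicates, selected)); // true
console.log(matchesAll(eventPredicates, rejected)); // false
```

No operator appears between the entries of the array, yet all of them must pass, which is the same idea as the implicit AND between lines in the events section.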