I'm working on a search string that will eventually (hopefully) be used in API that will query our Tenable log source type and export the results directly to JIRA to be remediated by our infrastructure team.
The problem I'm having is that the query I've built below has some columns returning multiple values. So for instance, the current search is saying vulnerability X is showing up on asset A,B,C,D but it is placing A,B,C,D in the same cell. What I need is for vulnerability X to be broken out into separate rows for each asset:
row 1. Vuln X | asset A
row 2. Vuln X | asset B etc.
"Plugin ID" (which is Tenable's version of the CVE) is the field I'd like to match on.
Search String:
events:
metadata.log_type = "TENABLE_IO" AND security_result.detection_fields.value != "INFO" security_result.rule_id = $pluginID
The root of the issue lies in how YaraL-2 handles repeated fields and thematchsection. When a repeated field, likeprincipal.asset.ip, is used in theoutcomesection of a rule with amatchclause, it undergoes an implicit unnesting process. This means that if an event has multiple IP addresses associated with it, YaraL-2 creates separate internal rows for each IP address, duplicating the other fields from that event.
In this case, thematchclause is$pluginID, which groups events by plugin ID. Becauseprincipal.asset.ipis a repeated field, and likely contains multiple assets for the same vulnerability, the aggregation in theoutcomeusingarray_distinct($IP_Address)will still combine all unique IPs for that plugin ID into a single array element.
To achieve the desired output, where each vulnerability is listed on a separate row for each associated asset, you need to modify thematchsection to include the asset identifier. Here's how the rule can be adjusted:
events:
metadata.log_type = "TENABLE_IO"
security_result.detection_fields.value != "INFO"
$pluginID = security_result.rule_id
$assetID = principal.asset.product_object_id
match:
$pluginID, $assetID // Group by Plugin ID AND Asset ID
outcome:
$Description = extensions.vulns.vulnerabilities.description
$FQDN = principal.hostname
$IP_Address = principal.asset.ip // No need for array_distinct as it's now a 1:1 relationship
$Name = extensions.vulns.vulnerabilities.name
$NetBios = security_result.detection_fields.value
$OS = principal.asset.platform_software.platform_version
$Plugin_ID = security_result.rule_id
$Plugin_Output = security_result.detection_fields.value
$Risk = extensions.vulns.vulnerabilities.severity
//$See_Also = array_distinct()
$Solution = security_result.description
$Synopsis = security_result.summary
$Vulnerability_State = security_result.detection_fields.value
$Asset_UUID = principal.asset.product_object_id
By including$assetIDin thematchclause, the grouping will now occur for each unique combination of plugin IDandasset ID. As a result, the$IP_Addressvariable will only contain the single IP address associated with that specific plugin ID and asset ID combination, effectively creating the desired one-to-one relationship and separate rows in the output. Note thatarray_distinctis no longer needed for most fields since there should now be a 1:1 relationship.
Remember that in a rule with amatchsection, if an outcome variable depends on a repeated field, or a field of an event variable that is part of a match, an aggregation function must be used.
The root of the issue lies in how YaraL-2 handles repeated fields and thematchsection. When a repeated field, likeprincipal.asset.ip, is used in theoutcomesection of a rule with amatchclause, it undergoes an implicit unnesting process. This means that if an event has multiple IP addresses associated with it, YaraL-2 creates separate internal rows for each IP address, duplicating the other fields from that event.
In this case, thematchclause is$pluginID, which groups events by plugin ID. Becauseprincipal.asset.ipis a repeated field, and likely contains multiple assets for the same vulnerability, the aggregation in theoutcomeusingarray_distinct($IP_Address)will still combine all unique IPs for that plugin ID into a single array element.
To achieve the desired output, where each vulnerability is listed on a separate row for each associated asset, you need to modify thematchsection to include the asset identifier. Here's how the rule can be adjusted:
events:
metadata.log_type = "TENABLE_IO"
security_result.detection_fields.value != "INFO"
$pluginID = security_result.rule_id
$assetID = principal.asset.product_object_id
match:
$pluginID, $assetID // Group by Plugin ID AND Asset ID
outcome:
$Description = extensions.vulns.vulnerabilities.description
$FQDN = principal.hostname
$IP_Address = principal.asset.ip // No need for array_distinct as it's now a 1:1 relationship
$Name = extensions.vulns.vulnerabilities.name
$NetBios = security_result.detection_fields.value
$OS = principal.asset.platform_software.platform_version
$Plugin_ID = security_result.rule_id
$Plugin_Output = security_result.detection_fields.value
$Risk = extensions.vulns.vulnerabilities.severity
//$See_Also = array_distinct()
$Solution = security_result.description
$Synopsis = security_result.summary
$Vulnerability_State = security_result.detection_fields.value
$Asset_UUID = principal.asset.product_object_id
By including$assetIDin thematchclause, the grouping will now occur for each unique combination of plugin IDandasset ID. As a result, the$IP_Addressvariable will only contain the single IP address associated with that specific plugin ID and asset ID combination, effectively creating the desired one-to-one relationship and separate rows in the output. Note thatarray_distinctis no longer needed for most fields since there should now be a 1:1 relationship.
Remember that in a rule with amatchsection, if an outcome variable depends on a repeated field, or a field of an event variable that is part of a match, an aggregation function must be used.
The root of the issue lies in how YaraL-2 handles repeated fields and thematchsection. When a repeated field, likeprincipal.asset.ip, is used in theoutcomesection of a rule with amatchclause, it undergoes an implicit unnesting process. This means that if an event has multiple IP addresses associated with it, YaraL-2 creates separate internal rows for each IP address, duplicating the other fields from that event.
In this case, thematchclause is$pluginID, which groups events by plugin ID. Becauseprincipal.asset.ipis a repeated field, and likely contains multiple assets for the same vulnerability, the aggregation in theoutcomeusingarray_distinct($IP_Address)will still combine all unique IPs for that plugin ID into a single array element.
To achieve the desired output, where each vulnerability is listed on a separate row for each associated asset, you need to modify thematchsection to include the asset identifier. Here's how the rule can be adjusted:
events:
metadata.log_type = "TENABLE_IO"
security_result.detection_fields.value != "INFO"
$pluginID = security_result.rule_id
$assetID = principal.asset.product_object_id
match:
$pluginID, $assetID // Group by Plugin ID AND Asset ID
outcome:
$Description = extensions.vulns.vulnerabilities.description
$FQDN = principal.hostname
$IP_Address = principal.asset.ip // No need for array_distinct as it's now a 1:1 relationship
$Name = extensions.vulns.vulnerabilities.name
$NetBios = security_result.detection_fields.value
$OS = principal.asset.platform_software.platform_version
$Plugin_ID = security_result.rule_id
$Plugin_Output = security_result.detection_fields.value
$Risk = extensions.vulns.vulnerabilities.severity
//$See_Also = array_distinct()
$Solution = security_result.description
$Synopsis = security_result.summary
$Vulnerability_State = security_result.detection_fields.value
$Asset_UUID = principal.asset.product_object_id
By including$assetIDin thematchclause, the grouping will now occur for each unique combination of plugin IDandasset ID. As a result, the$IP_Addressvariable will only contain the single IP address associated with that specific plugin ID and asset ID combination, effectively creating the desired one-to-one relationship and separate rows in the output. Note thatarray_distinctis no longer needed for most fields since there should now be a 1:1 relationship.
Remember that in a rule with amatchsection, if an outcome variable depends on a repeated field, or a field of an event variable that is part of a match, an aggregation function must be used.
I very much appreciate your help here -- I was able to run the search successfully after aggregating some of the outcome variables per your recommendation. Now I need to start digging into how the API portion is going to work 🙂
I very much appreciate your help here -- I was able to run the search successfully after aggregating some of the outcome variables per your recommendation. Now I need to start digging into how the API portion is going to work 🙂
Good luck. If you have API questions, just post here and I'll try to help. It can be a steep learning curve.