Question

Custom Parser nested JSON

  • March 4, 2026
  • 2 replies
  • 76 views

thaishfmarques

I’m creating a parser based on a Webhook feed that returns a nested JSON. Due to the way the JSON was designed and how it is received in SecOps, when I use array_function => "split_columns", I end up losing an important field, which appears overwritten in the statedump.

My problem is that I’m unable to map it correctly without using split_columns.

This is how the fields look in the sample:

 

And this is how it looks when I use split_columns

 

filter {
  mutate {
    gsub => ["message", "[\\r\\n\\t]*", ""]
  }
  mutate {
    replace => {
      "token_metadata.vendor_name" => "X"
      "token_metadata.product_name" => "One"
    }
  }

  json {
    source => "message"
    on_error => "not_json"
    # array_function => "split_columns"
  }
  if [not_json] {
    drop { tag => "TAG_MALFORMED_ENCODING" }
  }

  mutate {
    replace => {
      "token_metadata.event_type" => "GENERIC_EVENT"
    }
  }

  mutate {
    rename => {
      "token_metadata" => "udm_event.idm.read_only_udm.metadata"
    }
    on_error => "rename_failure"
  }

  mutate {
    merge => { "@output" => "udm_event" }
  }
  statedump {}
}

That’s my initial template. After that, I started to map this specific email field:

######## EMAIL ########

grok {
  match => { "message" => [ "\\\"user\\\":\\s*\\\"(?P<detection_email>%{EMAILADDRESS})\\\"" ] }
}

if [detection_email] == "" {
  mutate {
    replace => {
      "udm_event.idm.read_only_udm.principal.email" => "NOT_FOUND"
    }
  }
}
else {
  mutate {
    replace => {
      "udm_event.idm.read_only_udm.principal.email" => "%{detection_email}"
    }
  }
}
######## EMAIL ########

This actually works, but when I put it inside a for loop, the fields are no longer mapped.

PS: I tried grok because I haven’t been successful in using the JSON fields without split_columns.

The for loop I tried was:
for detection in collectionData.detections {}

 

That’s my statedump


Internal State (label=):

{
  "@collectionTimestamp": {
    "nanos": 0,
    "seconds": 1772642927
  },
  "@createTimestamp": {
    "nanos": 0,
    "seconds": 1772642927
  },
  "@enableCbnForLoop": true,
  "@onErrorCount": 0,
  "@output": [
    {
      "idm": {
        "read_only_udm": {
          "metadata": {
            "event_type": "GENERIC_EVENT",
            "product_name": "One",
            "vendor_name": "X"
          }
        }
      }
    }
  ],
  "@timestamp": {
    "nanos": 0,
    "seconds": 1772642927
  },
  "@timezone": "",
  "collectionData": {
    "detections": [
      {
        "access.url": "xxxx",
        "user": "zzzzz"
      }
    ]
  }
}

 

Since I am relatively new to Google SecOps parsing, I am having a very hard time making this for loop work.

 

2 replies

hzmndt
Staff
  • March 5, 2026

@thaishfmarques Gemini_output:
 

Okay, I've analyzed your parser configuration and the issue you're facing with the nested JSON array. Here's a breakdown of the problem and the suggested solution:

The core issue is how you're trying to access elements within the collectionData.detections array. Using array_function => "split_columns" flattens the array structure in a way that's not suitable for iterating with a for loop, and it seems to be causing the fields within the array objects to be lost or overwritten, as you observed.

To properly iterate over the array and access the nested fields like user and access.url for each detection, you should use the for ... in loop construct without array_function => "split_columns".

Here's the corrected approach:

  1. Remove array_function from the json filter: This allows the for loop to work directly with the JSON array structure.
  2. Iterate with for detection in collectionData.detections: This loop will execute for each object within the detections array, with the current object being assigned to the detection variable in each iteration.
  3. Access fields within the loop: Inside the loop, you can access the fields of the current object using dot notation on the loop variable (e.g., detection.user).
  4. Handle keys with special characters: For keys containing dots, like access.url, you need to use bracket notation (e.g., detection["access.url"]).
  5. Generate multiple events: To create a separate UDM event for each entry in the detections array, you should initialize your event variable (e.g., udm_event) at the beginning of each loop iteration and use the merge => { "@output" => "udm_event" } operation inside the loop.
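If bracket notation does not resolve the dotted key in your environment, one alternative worth trying is to rename the key in the raw message before the json filter runs. The sketch below assumes the overwrite is caused by the dot in "access.url" being read as a path separator. The name access_url is a hypothetical replacement, and the pattern also rewrites the same text if it appears inside a value.

```
filter {
  mutate {
    # Assumption: renaming the dotted key before JSON parsing turns it into a plain field.
    # "access_url" is a hypothetical name; the pattern matches this text anywhere in the message.
    gsub => ["message", "access\\.url", "access_url"]
  }
  json {
    source => "message"
    on_error => "not_json"
  }
  for detection in collectionData.detections {
    mutate {
      # The renamed key can now be read with plain dot notation
      replace => { "detection_url" => "%{detection.access_url}" }
      on_error => "no_url_field"
    }
  }
}
```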

Here's the revised filter block:

 

filter {
  mutate {
    gsub => ["message", "[\\r\\n\\t]*", ""]
  }

  # Parse the JSON message WITHOUT split_columns
  json {
    source => "message"
    on_error => "not_json"
  }
  if [not_json] {
    drop { tag => "TAG_MALFORMED_ENCODING" }
  }

  # Prepare base metadata that is common to all events
  mutate {
    replace => {
      "base_metadata.vendor_name" => "X"
      "base_metadata.product_name" => "One"
      "base_metadata.event_type" => "GENERIC_EVENT"
    }
  }

  # Loop through each item in the collectionData.detections array
  for detection in collectionData.detections {
    # Initialize an empty udm_event for this specific detection
    mutate {
      replace => { "udm_event" => "" }
    }

    # Copy the base metadata into the current event
    mutate {
      copy => { "udm_event.idm.read_only_udm.metadata" => "base_metadata" }
    }

    ######## EMAIL / USER ########
    mutate {
      replace => { "detection_email" => "" } # Initialize
    }
    mutate {
      replace => { "detection_email" => "%{detection.user}" }
      on_error => "no_user_field"
    }
    if [no_user_field] or [detection_email] == "" {
      mutate {
        replace => { "udm_event.idm.read_only_udm.principal.user.userid" => "NOT_FOUND" }
      }
    } else {
      mutate {
        replace => { "udm_event.idm.read_only_udm.principal.user.userid" => "%{detection_email}" }
      }
    }
    ######## EMAIL / USER ########

    ######## URL ########
    mutate {
      replace => { "detection_url" => "" } # Initialize
    }
    mutate {
      # Use bracket notation for keys with special characters like '.'
      replace => { "detection_url" => "%{detection[access.url]}" }
      on_error => "no_url_field"
    }
    if ![no_url_field] and [detection_url] != "" {
      mutate {
        replace => { "udm_event.idm.read_only_udm.target.url" => "%{detection_url}" }
      }
    }
    ######## URL ########

    # Add other field mappings from 'detection' object here if needed

    # Output this complete event
    mutate {
      merge => { "@output" => "udm_event" }
      on_error => "merge_failed"
    }
  }
}

This revised configuration will:

  • Parse the incoming JSON message.
  • Iterate through each element in the collectionData.detections array.
  • For each element, extract the user and access.url fields.
  • Populate the relevant UDM fields.
  • Generate a distinct event in the @output for each detection, ensuring the fields from one detection don't carry over to the next.
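To check what each iteration actually sees, a labeled statedump inside the loop can help. This is a minimal debugging sketch, assuming statedump accepts a label option (the "Internal State (label=)" header in the dump suggests it does). The label text is arbitrary.

```
for detection in collectionData.detections {
  # Debugging aid: dump the state once per iteration to inspect the `detection` variable
  statedump { label => "inside_detection_loop" }
}
```

If no dump with this label appears, the loop body is probably not being entered, which points at the array path rather than at the field mappings.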

thaishfmarques

Thank you very much for the reply. It helped me better organize the items inside the loop. However, I’m still facing the other issue I mentioned: the output doesn’t recognize the fields and doesn’t generate a result.

Also, I don't see any on_error flags in the statedump, except for not_json.