I added statedump {} inside the for loop:
filter {
  json {
    source => "message"
    array_function => "split_columns"
  }
  for index, data in records {
    if index == 0 {
      mutate {
        replace => { "target.hostname" => "%{data.hostname}" }
      }
    }
    else {
      mutate {
        replace => { "intermediary.hostname" => "%{data.hostname}" }
      }
      mutate {
        merge => { "event.idm.readonly_udm.intermediary" => "intermediary" }
      }
    }
    statedump {}
  }
}
which yielded the following output:
Internal State (label=):
{
"@createTimestamp": {
"nanos": 0,
"seconds": 1708075176
},
"@enableCbnForLoop": true,
"@onErrorCount": 0,
"@output": [],
"@timezone": "",
"data": {
"hostname": "host"
},
"index": 0,
"iter": {
"records-7": 0
},
"message": "{\n \"records\": [\n {\n \"hostname\": \"host\"\n },\n {\n \"hostname\": \"host1\"\n },\n {\n \"hostname\": \"host2\"\n }\n ]\n}\n",
"records": {
"0": {
"hostname": "host"
},
"1": {
"hostname": "host1"
},
"2": {
"hostname": "host2"
}
},
"target": {
"hostname": "host"
}
}
Internal State (label=):
{
"@createTimestamp": {
"nanos": 0,
"seconds": 1708075176
},
"@enableCbnForLoop": true,
"@onErrorCount": 0,
"@output": [],
"@timezone": "",
"data": {
"hostname": "host1"
},
"event": {
"idm": {
"readonly_udm": {
"intermediary": [
{
"hostname": "host1"
}
]
}
}
},
"index": 1,
"intermediary": {
"hostname": "host1"
},
"iter": {
"records-7": 1
},
"message": "{\n \"records\": [\n {\n \"hostname\": \"host\"\n },\n {\n \"hostname\": \"host1\"\n },\n {\n \"hostname\": \"host2\"\n }\n ]\n}\n",
"records": {
"0": {
"hostname": "host"
},
"1": {
"hostname": "host1"
},
"2": {
"hostname": "host2"
}
},
"target": {
"hostname": "host"
}
}
Internal State (label=):
{
"@createTimestamp": {
"nanos": 0,
"seconds": 1708075176
},
"@enableCbnForLoop": true,
"@onErrorCount": 0,
"@output": [],
"@timezone": "",
"data": {
"hostname": "host2"
},
"event": {
"idm": {
"readonly_udm": {
"intermediary": [
{
"hostname": "host2"
},
{
"hostname": "host2"
}
]
}
}
},
"index": 2,
"intermediary": {
"hostname": "host2"
},
"iter": {
"records-7": 2
},
"message": "{\n \"records\": [\n {\n \"hostname\": \"host\"\n },\n {\n \"hostname\": \"host1\"\n },\n {\n \"hostname\": \"host2\"\n }\n ]\n}\n",
"records": {
"0": {
"hostname": "host"
},
"1": {
"hostname": "host1"
},
"2": {
"hostname": "host2"
}
},
"target": {
"hostname": "host"
}
}
But why is target.hostname showing in the dumps for both index 1 and index 2, when per the code it should be set only for the element at index 0?
Also, the following is not showing in index 2:
"event": {
"idm": {
"readonly_udm": {
"intermediary": [
{
"hostname": "host2"
},
{
"hostname": "host2"
}
]
}
}
},
I'm not sure I understand the question, but I will walk through what I understand your code to be doing and why the statedumps you shared are exactly what I would expect.
- Parse the JSON string in message into a JSON-like object structure.
- Start a loop that iterates over the records key/placeholder.
- On the first iteration (index == 0), take the value currently in data.hostname (the first array member of records) and place it in the newly created placeholder target.hostname.
- On the second iteration, go into the else branch: take the value now in data.hostname (the second array member of records) and place it in the intermediary.hostname placeholder.
- Then merge the intermediary placeholder into event.idm.readonly_udm.intermediary.
- Keep in mind that the original target.hostname value is still there. The internal state is shared across all iterations of the loop rather than scoped to each element, and nothing touched target after the first iteration placed the value there. That is why target.hostname appears in the dumps for index 1 and index 2.
- On the third iteration, go into the else branch again: take the value in data.hostname (now the third array member of records) and place it in the intermediary.hostname placeholder.
- Just as before, merge the current value of intermediary into event.idm.readonly_udm.intermediary.
This leaves us with an internal state of:
- The original JSON structure contained in the message that was parsed out via the JSON command
- A target placeholder with a subkey of hostname that contains the hostname value in the first array member of the input data
- An event placeholder with a subkey of idm, followed by readonly_udm, followed by intermediary, which is an array of two objects, each with a hostname key and a value (from the second and third array members of the original data)
- Some stray placeholders like index, intermediary, message, and others that were left over from processing the data.
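When comparing several dumps like the ones above, it can help to label them so that it is obvious which point in the parser produced which state. The sketch below assumes statedump accepts a static label option, whose value is printed in the "Internal State (label=...)" header; the label names are only illustrative.

```
filter {
  json {
    source => "message"
    array_function => "split_columns"
  }
  for index, data in records {
    # ... the mutate blocks from the original code go here ...

    # Printed once per iteration, so index and data differ in each dump.
    statedump { label => "inside_loop" }
  }
  # Printed once, after the loop has finished.
  statedump { label => "after_loop" }
}
```

Because the state is shared, the "after_loop" dump should contain everything the loop left behind, including placeholders that were only meant to be temporary.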
There are a few things that I believe you're missing.
- You should probably merge target.hostname into the event placeholder, the same way you are merging intermediary.hostname.
- readonly_udm should actually be read_only_udm.
- You will need to add the other mandatory fields before the event output will actually work.
- Once you are done with everything else, you need to add a merge command that merges the event placeholder into @output, as shown here: https://cloud.google.com/chronicle/docs/reference/parser-syntax#output_data_to_a_udm_record
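As a rough illustration of the last two points, the end of the filter block might look something like the sketch below. It assumes metadata.event_type is among the mandatory fields you still need, and GENERIC_EVENT is only a placeholder value; choose the event type that matches your logs and check which other fields that type requires.

```
filter {
  # ... json parsing and the for loop go here ...

  # Assumption: GENERIC_EVENT is a placeholder, not a recommendation.
  mutate {
    replace => {
      "event.idm.read_only_udm.metadata.event_type" => "GENERIC_EVENT"
    }
  }

  # Emit the assembled event placeholder as a UDM record.
  mutate {
    merge => {
      "@output" => "event"
    }
  }
}
```

Until something is merged into @output, the statedump will keep showing "@output": [] and no event is produced.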
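I noticed one more thing as I reviewed my response. It appears that the merge command merges by reference, so each array entry still points at the live intermediary placeholder rather than at a copy of it. That is why, in your third statedump, both intermediary hostnames show as host2 instead of one showing host1 and the other showing host2.
I'm including the code with a remove_field command after each merge to fix this issue (as well as a merge of the target.hostname field into the event). Removing the placeholder means the next iteration creates a fresh one instead of overwriting the one that was already merged. Note that the statedump now sits after the for loop, so it prints only once.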
filter {
  json {
    source => "message"
    array_function => "split_columns"
  }
  for index, data in records {
    if index == 0 {
      mutate {
        replace => {
          "target.hostname" => "%{data.hostname}"
        }
      }
      mutate {
        merge => {
          "event.idm.read_only_udm.target" => "target"
        }
      }
      mutate {
        remove_field => ["target"]
      }
    }
    else {
      mutate {
        replace => {
          "intermediary.hostname" => "%{data.hostname}"
        }
      }
      mutate {
        merge => {
          "event.idm.read_only_udm.intermediary" => "intermediary"
        }
      }
      mutate {
        remove_field => ["intermediary"]
      }
    }
  }
  statedump {}
}
And here is the output of that statedump:
Internal State (label=):
{
"@createTimestamp": {
"nanos": 0,
"seconds": 1708093490
},
"@enableCbnForLoop": true,
"@onErrorCount": 0,
"@output": [],
"@timezone": "",
"data": {
"hostname": "host2"
},
"event": {
"idm": {
"read_only_udm": {
"intermediary": [
{
"hostname": "host1"
},
{
"hostname": "host2"
}
],
"target": [
{
"hostname": "host"
}
]
}
}
},
"index": 2,
"iter": {
"records-7": -1
},
"message": "{\n \"records\": [\n {\n \"hostname\": \"host\"\n },\n {\n \"hostname\": \"host1\"\n },\n {\n \"hostname\": \"host2\"\n }\n ]\n}",
"records": {
"0": {
"hostname": "host"
},
"1": {
"hostname": "host1"
},
"2": {
"hostname": "host2"
}
}
}
Hopefully this makes sense.