Hello @Sarthakd25 ,
A good practice with CBN parsers is to initialize every variable you will use as an empty string at the very beginning of the parser, before any parsing or mapping operations. You can then test later whether a field was extracted from the raw log without triggering the "not found in state data" error.
For example:
filter {
  # Initialize tokens
  mutate {
    replace => {
      "token1" => ""
      "token2" => ""
      "cef_fields.status" => ""
    }
  }
  # Then you can extract from raw log, it will assign values to the tokens you initialized,
  # and they will be left as empty strings if not extracted.
  grok {
    [...]
  }
  # Then you can test if the value is present without errors
  if [cef_fields][status] != "" {
    [...]
  }
}
Regards
Thanks for the response @chrisd2 ,
My logic looks like this:
filter {
  grok {
    match => {
      "message" => [
        "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:host} CEF: (?P<header_version>[^|]+)\\\\|%{GREEDYDATA:cef_event_attributes}"
      ]
    }
    overwrite => ["message"]
  }
  kv {
    source => "cef_event_attributes"
    field_split => "|"
    value_split => "="
    target => "cef_fields"
  }
  mutate {
    add_field => { "security_result" => "{}" }
  }
  mutate {
    add_field => { "event1" => "{}" }
  }
  mutate {
    replace => {
      "cef_fields.status" => ""
      "cef_fields.createdOn" => ""
    }
  }
  if [cef_fields][status] != "" {
    if [cef_fields][status] in ["CREATED", "IN_PROGRESS"] {
      mutate {
        replace => { "security_result.threat_status" => "ACTIVE" }
      }
    }
    else if [cef_fields][status] == "RESOLVED" {
      mutate {
        replace => { "security_result.threat_status" => "CLEARED" }
      }
    }
    else {
      mutate {
        replace => { "security_result.threat_status" => "THREAT_STATUS_UNSPECIFIED" }
      }
    }
  }
  mutate {
    merge => {
      "event1.idm.read_only_udm.security_result" => "security_result"
    }
  }
  if [cef_fields][createdOn] != "" {
    date {
      match => ["cef_fields.createdOn", "yyyy-mm-dd HH:mm:ss"]
      target => "event1.idm.read_only_udm.metadata.collected_timestamp"
      on_error => "time_stamp_wrong_format"
    }
  }
}
With this parser, the short log format, which has no status or createdOn field, works fine. For the long log format, which does have these values, they are not parsed: security_result.threat_status, for example, is never replaced. Where am I going wrong?
I see that you added the initialization snippet, but it needs to go at the very beginning of the parser, just below the "filter {" line.
Where you put it in your pasted code, it runs after grok and kv and overwrites the values they extracted, which is why you get the unexpected behavior.
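As a rough sketch of the reordering (not tested against a live parser), the layout would look something like the following. It reuses the field names from the parser posted above, and the `[...]` placeholders stand for the grok pattern and the conditional bodies, which stay as they are.

```
filter {
  # Initialize first, so that extraction can overwrite these defaults.
  mutate {
    replace => {
      "cef_fields.status" => ""
      "cef_fields.createdOn" => ""
    }
  }

  # Extraction: same grok and kv blocks as in the posted parser.
  grok {
    [...]
  }
  kv {
    source => "cef_event_attributes"
    field_split => "|"
    value_split => "="
    target => "cef_fields"
  }

  # Each field now holds either the extracted value or "".
  if [cef_fields][status] != "" {
    [...]
  }
  if [cef_fields][createdOn] != "" {
    [...]
  }
}
```

With this order, short-format logs keep the empty defaults and skip the conditionals, while long-format logs reach them with the values kv extracted.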
Hi @chrisd2
Thank you for your help! Your solution worked perfectly.