Dear all, in an attempt to be helpful, I am posting a first draft of the code I wrote to handle truncated JSON:
if [JSON_field] != "" {
  # First try a regular JSON parse; the error flag is used below to trigger the fallback
  json {
    source => "JSON_field"
    target => "response"
    array_function => "split_columns"
    on_error => "_error_responseBody"
  }
  if [_error_responseBody] {
    # Nesting is cleared up: drop keys that only open a nested object or array
    mutate {
      gsub => ["JSON_field", ",.{2,20}:\\[?\\{", ","]
    }
    # Some more cleaning: strip the remaining braces and brackets
    mutate {
      gsub => ["JSON_field", "\\{", ""]
    }
    mutate {
      gsub => ["JSON_field", "\\}", ""]
    }
    mutate {
      gsub => ["JSON_field", "\\]", ""]
    }
    mutate {
      gsub => ["JSON_field", "\\[", ""]
    }
    # Change the separator from comma to #, since some string values can contain commas
    mutate {
      gsub => ["JSON_field", "\"(\\w+)\":\"(.*?)\",", "\"$1\":\"$2\"#"]
    }
    # Change the separator from comma to # for integer and boolean values as well
    mutate {
      gsub => ["JSON_field", "\"(\\w+)\":(\\d+|true|false),", "\"$1\":$2#"]
    }
    # Some more cleanup: collapse runs of leftover commas, then drop the double quotes
    mutate {
      gsub => ["JSON_field", ",{2,}", "#"]
    }
    mutate {
      gsub => ["JSON_field", "\"", ""]
    }
    # Split what is left into key:value pairs separated by #
    kv {
      source => "JSON_field"
      target => "kv_field"
      field_split => "#"
      value_split => ":"
      whitespace => strict
      allow_empty_values => "false"
      on_error => "_kv_error"
    }
  }
}
This is definitely not optimized and has the following limitations:
- it flattens nested JSON to a single layer, using the innermost key as the final field name
- if several key-value pairs share the same key (coming from different layers), each value is overwritten as soon as a new one is encountered
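
To make the two limitations concrete, here is a rough Python sketch that mimics the gsub and kv steps on a plain string. It is not parser code: the function name `salvage_pairs`, the sample payload, and the "last value wins" handling of duplicate keys are my own assumptions for illustration, and Python's `re` is only an approximation of the parser's regex engine.

```python
import re

def salvage_pairs(raw: str) -> dict:
    """Hypothetical helper: replay the cleanup steps on a truncated JSON string."""
    s = raw
    # Drop keys that only open a nested object or array, e.g. ,"user":{ or ,"tags":[{
    s = re.sub(r',.{2,20}:\[?\{', ',', s)
    # Strip the remaining braces and brackets
    s = re.sub(r'[{}\[\]]', '', s)
    # Comma -> # after string values (string values may contain commas themselves)
    s = re.sub(r'"(\w+)":"(.*?)",', r'"\1":"\2"#', s)
    # Comma -> # after integer and boolean values
    s = re.sub(r'"(\w+)":(\d+|true|false),', r'"\1":\2#', s)
    # Collapse runs of leftover commas, then drop the double quotes
    s = re.sub(r',{2,}', '#', s)
    s = s.replace('"', '')
    pairs = {}
    for chunk in s.split('#'):
        key, sep, value = chunk.partition(':')
        if sep and key and value:
            # Assumption: a repeated key simply replaces the earlier value
            pairs[key] = value
    return pairs

# Sample payload (made up): cut off in the middle of the last string value
truncated = '{"id":7,"user":{"name":"Ann, B.","id":42},"status":"ok","msg":"trunc'
result = salvage_pairs(truncated)
print(result)
# {'id': '42', 'name': 'Ann, B.', 'status': 'ok', 'msg': 'trunc'}
```

The nested `user.name` ends up as plain `name`, and the top-level `id` (7) is lost because the nested `id` (42) arrives later under the same key.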
I would love to dig deeper and come up with a better solution, but this is enough for my current use case, and with my deadline approaching I need to move on with the project.
I hope someone else finds this helpful, or takes it over and improves it.
Cheers