Parsing truncated JSON logs

Question

Hello everyone!

We are receiving logs that contain, among other fields, a responseBody from API calls. The log generator limits each field to 2000 characters, which is fine in most cases, but one specific API call always generates a longer responseBody, so the JSON for calls to that API is always truncated. Unless I am missing something, the JSON extraction filter does not work with truncated JSON, which leaves me with no information at all from those logs.

Is there any workaround to make Chronicle extract at least the information that is available in the truncated JSON?

Thanks

Tonio · Accepted Answer

Dear all, in an attempt to be helpful, I am posting a first code draft I wrote to handle truncated JSON:

if [JSON_field] != "" {
    json {
        source => "JSON_field"
        target => "response"
        array_function => "split_columns"
        on_error => "_error_responseBody"
    }
    if [_error_responseBody] {
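        # Flatten nesting: drop each nested key together with its opening brace/bracket.
        # [^,] keeps the match from spanning (and deleting) a preceding short field.
        mutate {
            gsub => [
            "JSON_field", ",[^,]{2,20}:\\[?\\{", ","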
            ]
        }
        # Remove the remaining braces and brackets
        mutate {
            gsub => [
            "JSON_field", "\\{", ""
            ]
        }
        mutate {
            gsub => [
            "JSON_field", "\\}", ""
            ]
        }
        mutate {
            gsub => [
            "JSON_field", "\\]", ""
            ]
        }
        mutate {
            gsub => [
            "JSON_field", "\\[", ""
            ]
        }
        # Change the field separator from comma to #, since string values can contain commas
        mutate {
            gsub => ["JSON_field", "\"(\\w+)\":\"(.*?)\",", "\"$1\":\"$2\"#"]
        }
        # Change the separator from comma to # for integer and boolean fields as well
        mutate {
            gsub => ["JSON_field", "\"(\\w+)\":(\\d+|true|false),", "\"$1\":$2#"]
        }
        # Some more clean up
        mutate {
            gsub => ["JSON_field", ",{2,}", "#"]
        }
        mutate {
            gsub => [
            "JSON_field", "\"", "" ]
        }
        kv {
            source => "JSON_field"
            target => "kv_field"
            field_split => "#"
            value_split => ":"
            whitespace => strict
            allow_empty_values => "false"
            on_error => "_kv_error"
        }
    }
}
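
To make it easier to see what the gsub chain does to a payload, here is a minimal Python sketch that follows the same sequence of steps on a plain string. It is not Chronicle parser syntax; the function name `flatten_truncated_json` and the sample payload are made up for illustration. The nested-key pattern here uses `[^,]` so it cannot span a preceding field, and the number/boolean pattern uses the alternation `(\d+|true|false)`.

```python
import re

def flatten_truncated_json(raw: str) -> dict:
    """Apply the same clean-up steps as the parser draft, then split into key/value pairs."""
    s = raw
    # Drop nested keys with their opening brace/bracket, e.g. ,"user":{  ->  ,
    s = re.sub(r',[^,]{2,20}:\[?\{', ',', s)
    # Remove every remaining brace and bracket
    s = re.sub(r'[\[\]{}]', '', s)
    # Replace the comma after a string value with '#'
    s = re.sub(r'"(\w+)":"(.*?)",', r'"\1":"\2"#', s)
    # Replace the comma after an integer or boolean value with '#'
    s = re.sub(r'"(\w+)":(\d+|true|false),', r'"\1":\2#', s)
    # Collapse runs of leftover commas into a single separator
    s = re.sub(r',{2,}', '#', s)
    # Strip the double quotes
    s = s.replace('"', '')
    # Equivalent of the kv step: split fields on '#', key and value on the first ':'
    result = {}
    for pair in s.split('#'):
        key, sep, value = pair.partition(':')
        if sep and key and value:
            result[key] = value
    return result

# Hypothetical responseBody, cut off in the middle of the last string value
sample = '{"id":42,"status":"ok, done","user":{"name":"ann","active":true},"note":"trunca'
print(flatten_truncated_json(sample))
# {'id': '42', 'status': 'ok, done', 'name': 'ann', 'active': 'true', 'note': 'trunca'}
```

Note that every value comes out as a string and that the partially received `note` value is kept as-is.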

This is definitely not optimized and has the following issues:

- it flattens nested JSON to a single layer, using the bottom-layer key as the final name

- if multiple kv pairs have the same key (coming from different layers), earlier values are overwritten as each new one is encountered
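
The second issue can be illustrated outside Chronicle with a short Python sketch. The helper names `kv_overwrite` and `kv_collect` and the input string are hypothetical; the first helper mimics the overwrite behavior described above, while the second shows one possible alternative, collecting repeated keys into lists. Whether something similar can be expressed in the parser itself is left open here.

```python
from collections import defaultdict

def kv_overwrite(flat: str, field_split: str = "#", value_split: str = ":") -> dict:
    """Keep one value per key; a later pair silently replaces an earlier one."""
    out = {}
    for pair in flat.split(field_split):
        key, sep, value = pair.partition(value_split)
        if sep and key and value:
            out[key] = value
    return out

def kv_collect(flat: str, field_split: str = "#", value_split: str = ":") -> dict:
    """Keep every value, grouping repeated keys into a list in order of appearance."""
    out = defaultdict(list)
    for pair in flat.split(field_split):
        key, sep, value = pair.partition(value_split)
        if sep and key and value:
            out[key].append(value)
    return dict(out)

# Flattened string where 'id' and 'name' came from two different nesting levels
flat = "id:42#name:order#id:7#name:ann"
print(kv_overwrite(flat))  # {'id': '7', 'name': 'ann'}
print(kv_collect(flat))    # {'id': ['42', '7'], 'name': ['order', 'ann']}
```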

I would love to dive deeper and present a better solution, but this is enough for my current use case, and with my deadline approaching I need to move on with the project.

I hope someone else finds this helpful or can take over and improve it.

Cheers

manthavish · Answer

Hi Tonio, You are correct that the json filter would fail. One option I can think of is to write your own parser that starts off with the default parser and then adds some logic in the error-handling section for where the json filter fails. Basically, see if you can use gsub to make the message a complete JSON document and then parse what is needed.
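
The "complete the JSON, then parse it" idea can be sketched in Python as follows. This is not parser syntax and not a Chronicle feature, only an illustration of the logic a set of gsub rules (or a pre-processing step) would have to reproduce; the function names `closers_for` and `parse_truncated_json` are hypothetical. The sketch walks back from the end of the string until it finds a prefix that parses once the open brackets are closed.

```python
import json

def closers_for(prefix: str):
    """Return the brackets needed to close `prefix`, or None if it ends inside a string."""
    stack, in_string, escaped = [], False, False
    for ch in prefix:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            stack.append("}")
        elif ch == "[":
            stack.append("]")
        elif ch in "}]" and stack:
            stack.pop()
    if in_string:
        return None
    return "".join(reversed(stack))

def parse_truncated_json(text: str):
    """Parse the longest prefix of `text` that becomes valid JSON once it is closed."""
    for cut in range(len(text), 0, -1):
        prefix = text[:cut].rstrip()
        closers = closers_for(prefix)
        if closers is None:
            continue  # the cut falls inside a string, try a shorter prefix
        candidate = prefix.rstrip(",") + closers
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    return None

# Hypothetical responseBody, cut off in the middle of the last string value
sample = '{"id":42,"status":"ok, done","user":{"name":"ann","active":true},"note":"trunca'
print(parse_truncated_json(sample))
# {'id': 42, 'status': 'ok, done', 'user': {'name': 'ann', 'active': True}}
```

Unlike the flattening approach, this keeps the nesting and the value types, but it drops the field that was cut off. A number truncated mid-way (for example `4` out of `42`) would still parse, so the last numeric value in a repaired payload should be treated with caution.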
