
I'm trying to test a parser for a log that should arrive as JSON but contains some special characters and uses single quotes instead of double quotes. This is the current parser:

```
filter {
  # initialization
  mutate {
    replace => {
      "message_handler" => ""
    }
  }

  # format validation
  grok {
    match => {
      "message" => "<\\d+>\\{%{GREEDYDATA:message_handler}\\}"
    }
    on_error => "on_uniocode_remover"
    overwrite => ["message_handler"]
  }

  # replace single quotes with double quotes
  mutate {
    gsub => ["message_handler", "'", '"']
  }

  # restore the braces stripped by grok
  mutate {
    replace => {
      "message_handler" => "{%{message_handler}}"
    }
  }

  json {
    source => "message_handler"
    array_function => "split_columns"
    on_error => "not_json"
  }

  statedump {
    label => "verification"
  }
}
```

And I'm trying to parse a log that has this original format:

```
<14>{'UTC Offset': -3, 'Instance Id': 888, 'Session Id': 888, 'Successful Sqls': 3, 'Failed Sqls': 0, 'Objects and Verbs': 'dual SELECT;SEQUENCE_HISTORY SELECT', 'Construct Id': '888', 'Period Start': '2024-10-16T20:00:00Z', 'DB User Name': 'USER', 'OS User': 'ROOT', 'Source Program': 'CLIENT', 'Server IP': '10.10.10.10', 'Analyzed Client IP': '10.10.10.10', 'Service Name': 'SERVICE', 'Client Host Name': 'SERVICE', 'Server Type': 'SERVER', 'App User Name': nan, 'Database Name': 'DB_NAME', 'Application Event ID': 0, 'Event User Name': 'N', 'Event Type': 'N', 'Event Value Str': 'N', 'Event Value Num': 'N', 'Event Date': 'N', 'Server Port': 1521, 'Network Protocol': 'TCP', 'Total Records Affected': -1, 'Server Host Name': 'SERVER', 'Timestamp': '2024-10-16T21:33:39Z', 'Original SQL': 'SELECT SEQUENCE_HISTORY.currval FROM dual', 'Average Execution Time': 1, 'Uid Chain': 'N', 'Uid Chain Compressed': 'N', 'Session Start': '2024-10-16T20:41:59Z'}□
```

I've tried removing the Unicode characters and replacing the single quotes with double quotes using gsub, but this returns double quotes escaped with "\" in the message:

Is it possible to format an object to JSON from the SIEM parser editor, or is this not a recommended operation?


@chicoqueiroga wrote:

'App User Name': nan, 




Hi,


It seems that the value nan for 'App User Name' is not enclosed in quotes (even though it represents "not a number"). A bare nan is not valid JSON, so the json filter fails on it.


Testing the same log with that value in quotes works:


```
"message_handler": "{\"UTC Offset\": -3, \"Instance Id\": 888, \"Session Id\": 888, \"Successful Sqls\": 3, \"Failed Sqls\": 0, \"Objects and Verbs\": \"dual SELECT;SEQUENCE_HISTORY SELECT\", \"Construct Id\": \"888\", \"Period Start\": \"2024-10-16T20:00:00Z\", \"DB User Name\": \"USER\", \"OS User\": \"ROOT\", \"Source Program\": \"CLIENT\", \"Server IP\": \"10.10.10.10\", \"Analyzed Client IP\": \"10.10.10.10\", \"Service Name\": \"SERVICE\", \"Client Host Name\": \"SERVICE\", \"Server Type\": \"SERVER\", \"App User Name\": \"nan\", \"Database Name\": \"DB_NAME\", \"Application Event ID\": 0, \"Event User Name\": \"N\", \"Event Type\": \"N\", \"Event Value Str\": \"N\", \"Event Value Num\": \"N\", \"Event Date\": \"N\", \"Server Port\": 1521, \"Network Protocol\": \"TCP\", \"Total Records Affected\": -1, \"Server Host Name\": \"SERVER\", \"Timestamp\": \"2024-10-16T21:33:39Z\", \"Original SQL\": \"SELECT SEQUENCE_HISTORY.currval FROM dual\", \"Average Execution Time\": 1, \"Uid Chain\": \"N\", \"Uid Chain Compressed\": \"N\", \"Session Start\": \"2024-10-16T20:41:59Z\"}",
"not_json": false,
```
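
If the value can't be quoted at the source, one option might be to quote the bare nan inside the parser before the json filter runs. This is only a minimal sketch. It assumes the bare token only ever appears as `: nan,` (a value followed by a comma, never the last key in the object) and that no legitimate string value contains that exact sequence. Both are assumptions about the feed, not something confirmed in this thread:

```
mutate {
  # Assumption: a bare nan always shows up as ": nan," in message_handler.
  # Wrap it in double quotes so the json filter sees a string value.
  gsub => ["message_handler", ": nan,", ': "nan",']
}
```

This block would sit between the grok block and the json block. Its order relative to the single-quote gsub shouldn't matter, since the replacement already uses double quotes.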

 



Thanks! With this I can handle the events. Appreciate your time @Digal.


The payload inside the {} appears to be key-value pairs, so the kv filter should work.


Try using grok to extract the payload, then run kv on the payload itself, for example:


```
grok {
  match => {
    "message" => [
      " {%{GREEDYDATA:kvdata}}"
    ]
  }
}
```


Then,


```
kv {
  source => "kvdata"
  value_split => ":"
  field_split => ","
}
```

