Hello everyone,
I am having a hard time parsing Malwarebytes logs in CEF + KV format: the key-value pairs are separated by a single space, and several values contain spaces as well. Here is a (reconstructed) example:
<13>Apr 8 14:59:06 cercer CEF: 0|Malwarebytes|Malwarebytes Endpoint Protection|Endpoint Protection 1.2.0.1193|Detection|PUP found|2|deviceExternalId=239dw57h9861fe48342534f dvchost=cercer deviceDnsDomain=fake.local dvcmac=458234F23E33 dvc=10.10.10.10 rt=Apr 08 2024 14:59:06 Z fileType=file cat=PUP act=found msg=PUP found\\nFile: C:\\\\Users\\\\Gengis\\\\AppData\\\\Local\\\\Google\\\\Chrome\\\\User Data\\\\Default\\\\Sync Data\\\\Rott\\\\in.ldb\\nMD5: HEHEDEE3DE24DE4343FHT3TT3HTW\\nSHA256:R3OU4HTIF39U4TND3487H387S64HDE9CU309JV4UT9F0V4KUY5GTJ894YHV3JTY9 filePath=C:\\\\Users\\\\Gengis\\\\AppData\\\\Local\\\\Google\\\\Chrome\\\\User Data\\\\Default\\\\Sync Data\\\\Rott\\\\1NCu.ldb cs1Label=Detection name cs1=PUP.Optional.PushNotifications.Generic cs3Label=Detection ID cs3=ijbf4398-7ryn-3944-38fy-n3g48ygr3uyj
I tried several approaches to solve this, but could not make any of them work. The main problem is that regex capture groups do not seem to work, so regex patterns like
gsub => ["inner_message", "(\\\\w=)", ",\\\\1"]
that are meant to change the separator character are useless.
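To make the goal concrete, here is a minimal sketch in plain Python (not parser syntax) of the kind of split I am after. It assumes that a new pair starts only where whitespace is followed by a word and an equals sign, and that values never contain an unescaped ` key=` sequence themselves. The helper name `split_cef_extension` and the shortened sample string are made up for illustration.

```python
import re

# Assumption: a new key/value pair begins only where whitespace is
# immediately followed by "<word characters>=". Spaces inside values
# (dates, labels, paths) are not followed by "key=" and are kept.
PAIR_BOUNDARY = re.compile(r"\s+(?=\w+=)")


def split_cef_extension(extension: str) -> dict:
    """Hypothetical helper: turn a CEF extension string into a dict."""
    fields = {}
    for pair in PAIR_BOUNDARY.split(extension.strip()):
        # Split on the first "=" only, so later "=" stay in the value.
        key, sep, value = pair.partition("=")
        if sep:
            fields[key] = value
    return fields


# Shortened version of the extension part of the log line above.
sample = (
    "dvchost=cercer rt=Apr 08 2024 14:59:06 Z fileType=file "
    "cs1Label=Detection name cs1=PUP.Optional.PushNotifications.Generic"
)
fields = split_cef_extension(sample)
print(fields["rt"])
print(fields["cs1Label"])
```

The lookahead does the work here: it finds the boundary without consuming the key, so no capture group or back-reference is needed. Whether the parser's regex engine supports lookaheads is something I have not been able to confirm.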
Is there any other function or trick that I am missing? I see there are several prebuilt parsers that handle CEF formats, so there must be a way around this...
Many thanks
A