Capstone Example
Capstone Example: Constructing a List of Complex Composite JSON Objects
In this example we pick a more complex scenario: construct a JSON object consisting of the "action_type" and "target_ip" of the first session only (plus the optional "query" where present), together with the user's integer ID.

JSON paths of the source fields:
- $.message.user{}.sessions[0].actions
- $.message.user{}.id

Required target schema: listFirstSession (Repeated, Composite), where each element is an action (Composite) holding type (string), dst (string), and userId (integer).
In short: listFirstSession[] – action – type, dst, userId

The expected constructed field should look like:

  {"listFirstSession": [
    {"action": {"type": "login", "dst": "10.0.0.10", "userId": 12345}},
    {"action": {"type": "search", "dst": "10.0.0.10", "userId": 12345, "query": "weather"}},
    {"action": {"type": "logout", "dst": "10.0.0.11", "userId": 12345}}
  ]}
Figures: Input Schema before Flattening; Input Schema after Flattening; Required Target Schema
Snippet from statedump output:

  {…
    "listFirstSession": [
      {"action": {"dst": "10.0.0.10", "type": "login", "userId": "12345"}},
      {"action": {"dst": "10.0.0.10", "query": "weather", "type": "search", "userId": "12345"}},
      {"action": {"dst": "10.0.0.11", "type": "logout", "userId": "12345"}}
    ],
  ..}
Figure: Schematic for the transform
Summary: The core logic maps the input schema to a target schema with the following structure: listFirstSession (Repeated) . action (Composite) . type, dst, userId, query (atomic).

We use a temporary variable, temp, to hold the structured data for each action within the first session. Its hierarchy mirrors the target: temp.action.type, temp.action.dst, temp.action.userId, and temp.action.query.

The input data has two levels of repeated fields: user.sessions and user.sessions.actions. To process all the actions, we implement two nested loops: the outer loop iterates through the sessions, and the inner loop iterates through the actions within each session.

Finally, because the query field in the input is optional and can appear multiple times, we include error handling to gracefully manage the cases where it is absent.
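The logic summarized above can be modelled outside the parser as a short Python sketch. This is not parser syntax; it only mirrors the steps (nested loops, a temp variable, an optional query). The function name build_list_first_session, the variable sample_message, and the exact input shape are assumptions inferred from the JSON paths, and the second session in the sample is made-up data that shows it being skipped.

```python
from typing import Any


def build_list_first_session(message: dict[str, Any]) -> dict[str, Any]:
    """Model of the transform: one 'action' element per action of the first session."""
    user = message["user"]
    user_id = int(user["id"])  # the target schema asks for an integer userId
    list_first_session = []

    # Outer loop: sessions. Only the first session (index 0) is kept.
    for session_index, session in enumerate(user.get("sessions", [])):
        if session_index != 0:
            continue
        # Inner loop: actions within the session.
        for action in session.get("actions", []):
            # temp mirrors the target hierarchy: temp.action.type / dst / userId
            temp = {
                "action": {
                    "type": action["action_type"],
                    "dst": action["target_ip"],
                    "userId": user_id,
                }
            }
            # query is optional: copy it only when it is present.
            query = action.get("query")
            if query is not None:
                temp["action"]["query"] = query
            list_first_session.append(temp)

    return {"listFirstSession": list_first_session}


# Hypothetical input, shaped after the source JSON paths.
sample_message = {
    "user": {
        "id": 12345,
        "sessions": [
            {"actions": [
                {"action_type": "login", "target_ip": "10.0.0.10"},
                {"action_type": "search", "target_ip": "10.0.0.10", "query": "weather"},
                {"action_type": "logout", "target_ip": "10.0.0.11"},
            ]},
            {"actions": [
                {"action_type": "login", "target_ip": "10.0.0.12"},
            ]},
        ],
    }
}

result = build_list_first_session(sample_message)
```

In the sketch, a missing query is handled with a presence check rather than an exception; in a parser the equivalent would be whatever error-handling construct the parser language offers for absent fields.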
UDM Schema Mapping
In this final example, we use what has been discussed so far to tokenize and map target fields to the UDM event format, in addition to a few more.
Interpreting UDM Schema
The UDM usage guide (https://cloud.google.com/chronicle/docs/unified-data-model/udm-usage) and the UDM field list (https://cloud.google.com/chronicle/docs/reference/udm-field-list) detail the data model for UDM entities and events. This guide focuses on the UDM event schema.
We highlight the following properties:
- Schema Adherence: All mapped fields must strictly conform to the UDM data model structure and to the data type defined for each field (e.g., Integer fields require integer tokens, Repeated fields require List tokens).
- Some UDM events have mandatory fields, for example:
  - Mandatory 'Metadata.event_type': Every UDM event requires a value for the 'Metadata.event_type' field. The UDM usage guide lists the possible values at https://cloud.google.com/chronicle/docs/unified-data-model/udm-usage#metadataevent_type. 'GENERIC_EVENT' serves as a versatile, catch-all type.
  - Conditional Field Requirements: Whether other fields are optional or mandatory depends on the value of 'Metadata.event_type', as listed in https://cloud.google.com/chronicle/docs/unified-data-model/udm-usage#required_and_optional_fields. For example, when 'Metadata.event_type' = 'NETWORK_HTTP', the mandatory fields are listed in https://cloud.google.com/chronicle/docs/unified-data-model/udm-usage#network_http.
  - The 'GENERIC_EVENT' event type has the least restrictive set of mandatory fields, so it is a good choice for starting a custom parser before experimenting with other event types.
- Avoid Deprecated Fields: Do not use any field marked as deprecated in the UDM field list, to ensure long-term compatibility. Deprecations are listed in https://cloud.google.com/chronicle/docs/deprecations
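The schema-adherence and mandatory event type rules above can be pictured with a small Python sketch. It is only an illustration of the idea, not part of any parser or UDM tooling: the function check_event, the flattened dotted keys, and the three-entry EXPECTED_TYPES table are all hypothetical, and the real types must be taken from the UDM field list.

```python
# Hand-written illustrative excerpt (an assumption, not the UDM field list):
# an event type label, an Integer field, and a Repeated field.
EXPECTED_TYPES = {
    "metadata.event_type": str,
    "principal.port": int,
    "principal.ip": list,
}


def check_event(event: dict) -> list[str]:
    """Return a list of human-readable problems found in a flattened event."""
    problems = []

    # Every event needs an event type; GENERIC_EVENT is the catch-all value.
    if not event.get("metadata.event_type"):
        problems.append("metadata.event_type is missing")

    # Each mapped field must carry a value of the expected kind.
    for field, expected in EXPECTED_TYPES.items():
        if field in event and not isinstance(event[field], expected):
            actual = type(event[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")

    return problems


good_event = {
    "metadata.event_type": "GENERIC_EVENT",
    "principal.port": 443,
    "principal.ip": ["10.0.0.10"],
}
bad_event = {
    "principal.port": "443",      # string token mapped to an Integer field
    "principal.ip": "10.0.0.10",  # single string mapped to a Repeated field
}

good_problems = check_event(good_event)
bad_problems = check_event(bad_event)
```

With these inputs, good_problems is empty, while bad_problems reports the missing event type and the two type mismatches.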
Part 2.4 will cover miniature examples of how to map source data to the UDM (Unified Data Model) format, focusing on the mapping algorithm and avoiding complex loops.