Loops

 

This section covers loop operations for iterating over and manipulating tokenized fields.

 

Loop through Composite Fields

Task: Loop through the subfields of the composite field $.system{}. 

filter {

json { source => "message" array_function => "split_columns"}

for index_, field_ in system map {

statedump {}

}

}

 

Snippet from statedump output:

 

First Loop Iteration:

{…
  "field_": "server-001",
  "index_": "hostname",
  "iter": {
    "system-3": 0
  },
  "iter-3-33": {
    "keys": [
      "hostname",
      "ip_address"
    ]
  },
…}

 

Second Loop Iteration:

{…
  "field_": "192.168.1.100",
  "index_": "ip_address",
  "iter": {
    "system-3": 1
  },
  "iter-3-33": {
    "keys": [
      "hostname",
      "ip_address"
    ]
  },
…}

  1. The loop above is equivalent to the following pseudocode:

$ ← flatten($.message{})

 

For index_, field_ in $.system{} : 

Print All Fields

 

# Output:

# index_ = hostname in first run, ip_address in second run. 

# field_ = server-001 in first run, then 192.168.1.100 in second run

 
  1. This loop provides a way to iterate over the fields within a composite field, such as 'system' in this case.
    1. index_: This variable holds the name of the subfield currently being processed. In the first iteration, index_ will be 'hostname', and in the second iteration, it will be 'ip_address'.
    2. field_: This variable stores the actual value of the field corresponding to the current index_. So, in the first run, field_ will hold the value 'server-001', and in the second run, it will hold '192.168.1.100'.
 
  1. The map keyword in GoStash provides a way to iterate over the fields within a composite field. Here's how it works:
  • map keyword: Indicates that you want to loop through the subfields of a composite field.
  • index_ variable: Instead of holding a numerical index (as it does for repeated fields, discussed later), index_ stores the actual name of each field within the composite field. For example, if your composite field is 'system' with subfields 'hostname' and 'ip_address', index_ will be 'hostname' in the first iteration and 'ip_address' in the second.
  • Values: Within the loop, you can access the value of each field using the field_ variable, as usual.

The values in the loop will be:

index_          field_
"hostname"      "server-001"
"ip_address"    "192.168.1.100"

 
  1. If map is omitted, as in:

for index_, field_ in system {

you will get an error.
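The iteration semantics described above can be modelled in plain Python. This is only an illustrative sketch, not GoStash syntax: the `system` dictionary mirrors the sample values from the statedump output, and `loop_map` is a hypothetical helper name.

```python
# Illustrative Python model of `for index_, field_ in system map { ... }`.
# `loop_map` is a hypothetical helper, not part of GoStash.

def loop_map(composite):
    """Yield (index_, field_) pairs: subfield name, then subfield value."""
    for name, value in composite.items():
        yield name, value

# Sample composite field taken from the statedump output above.
system = {"hostname": "server-001", "ip_address": "192.168.1.100"}

pairs = list(loop_map(system))
for index_, field_ in pairs:
    print(f"index_ = {index_!r}, field_ = {field_!r}")
```

In this model, as in the map loop, index_ carries the subfield name rather than a position.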


 

Tips:

  1. Double-check the name of the field you loop over: looping through a non-existent field does not raise a syntax or compilation error; it fails silently as a logical error.
  2. The loop body must not be empty; otherwise you will get a "Failed to initialize filter" error, as in the following example:

filter {

json { source => "message" array_function => "split_columns"}

for index_, field_ in Tags {

    }

statedump {}

}

  1. Watch out for logical errors when using loops! If you're looping through a field that doesn't exist, the loop will silently fail. This can be tricky to debug, so double-check that your fields exist.

filter {

json { source => "message" array_function => "split_columns"}

for index_, field_ in nonExistentField {

statedump {}

}

}
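The silent failure can be pictured with a small Python model. This sketch assumes that "silently fail" means the loop body simply runs zero times; `run_loop` and the `state` dictionary are hypothetical and are not part of GoStash.

```python
# Illustrative Python model: looping over a missing field does nothing.
# `run_loop` is a hypothetical helper, not part of GoStash.

def run_loop(state, field_name):
    """Return the values visited; an absent field yields no iterations."""
    visited = []
    for value in state.get(field_name, []):
        visited.append(value)
    return visited

state = {"Tags": ["login_logs", "dev"]}

ok = run_loop(state, "Tags")                # two iterations
typo = run_loop(state, "nonExistentField")  # zero iterations, no error

# An explicit existence check makes the mistake visible while debugging.
missing = "nonExistentField" not in state
print(ok, typo, missing)
```

Nothing in the second call signals the mistake, which is why checking the field name against a statedump before writing the loop is worthwhile.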

 

 

Loop through Repeated Flattened Fields

Task: Loop through the repeated field $.Tags[].

# Loop through the Tags repeated field

filter {

json { source => "message" array_function => "split_columns"}

for index_, field_ in Tags {

statedump {}

}

}

Snippet from statedump output:

 

First Loop Iteration:

{….

 "field_": "login_logs",

  "index_": 0,

  "iter": {

    "Tags-4": 0

  },

…}

 

Second Loop Iteration

{...

 "field_": "dev",

  "index_": 1,

  "iter": {

    "Tags-4": 1

  },

…}

  1. The loop above is equivalent to the following pseudocode:

$← flatten($.message{})

For index_, field_ in $.Tags{}

  • $.message{}.Tags

    Print All Fields 

    # index_  = 0 in the first run then 1 in the second run (both are integers)

    # field_  = "login_logs" in the first run then "dev" in the second run.

    1. This loop provides a way to iterate over the elements within a repeated field, such as 'Tags' in this case.
    • index_: This variable acts as a counter or index, holding the numerical position of the current element being processed within the loop. In the first iteration, index_ will be 0, and in the second iteration, it will be 1.
    • field_: This variable stores the actual value of the element corresponding to the current index_. So, in the first run, field_ will hold the value 'login_logs', and in the second run, it will hold 'dev'.
    1. The ‘Tags’ field is accessed in JSONPath as  $.Tags

    1. Looping through flattened repeated fields is slightly different from looping through composite fields:
    • No map keyword: You don't need to use the map keyword when looping through flattened repeated fields. The array_function handles the flattening, making the repeated field accessible like a composite field with numerical indices.
    • index_ as numerical index: The index_ variable will hold integer values (0, 1, 2, etc.) representing the position of each element in the flattened repeated field.
    • Example: If your flattened repeated field is 'Tags' with values 'login_logs' and 'dev', index_ will be 0 in the first iteration (for 'login_logs') and 1 in the second iteration (for 'dev')
     

    The loop variable values will be:

    index_    field_
    0         "login_logs"
    1         "dev"

     
    1. If map is used, i.e.

    for index_, field_ in Tags map {

    the values of field_ stay the same, but index_ becomes a string instead of an integer:

    index_    field_
    "0"       "login_logs"
    "1"       "dev"

    1. When using numeric operators like <, <=, >, or >= with repeated fields in GoStash, keep in mind that the map keyword is not applicable in this scenario.
    • map for field names: The map keyword is specifically designed for iterating over the names of fields within a composite field.
    • Numeric operators for numbers: Numeric operators are used for comparing numerical values, such as the indices of elements within a repeated field.
    • Avoid map with numeric comparisons: If you need to perform numeric comparisons within a loop that iterates over a repeated field, do not use the map keyword. Instead, rely on the numerical indices of the elements.
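The difference between the two index types can be modelled in plain Python. This is an illustrative sketch only: `loop_repeated` is a hypothetical helper, and the TypeError shown is Python's behavior for comparing a string with an integer, not a statement about how GoStash reports the problem.

```python
# Illustrative Python model of looping over the repeated field Tags,
# with and without `map`. `loop_repeated` is a hypothetical helper.

def loop_repeated(values, use_map=False):
    """Return (index_, field_) pairs; with map the index is a string."""
    pairs = []
    for position, value in enumerate(values):
        index_ = str(position) if use_map else position
        pairs.append((index_, value))
    return pairs

tags = ["login_logs", "dev"]

plain = loop_repeated(tags)                 # integer indices
mapped = loop_repeated(tags, use_map=True)  # string indices

# Numeric comparison works on the integer index only.
first_only = [field_ for index_, field_ in plain if index_ < 1]

try:
    [field_ for index_, field_ in mapped if index_ < 1]
    comparison_failed = False
except TypeError:
    comparison_failed = True  # str < int is not defined in Python
print(plain, mapped, first_only, comparison_failed)
```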

    Tips
    1. When using loops in GoStash, the map keyword is generally recommended for both composite and repeated fields. This is because:
    • Versatility: map works with both field names (in composite fields) and numerical indices (in flattened repeated fields).
    • Consistency: Using map consistently can make your code easier to read and understand.
    • Exception for Numeric Comparisons: The only scenario where you should avoid using map is when you need to perform numeric comparisons (using operators like <, <=, >, or >=) with the index of a repeated field. In these cases, you'll need to rely on the numerical index directly.

     

     

    Writing A Simple UDM Event

     

    Write Some Simple Fields into a UDM Event

    Task: Capture some string fields like  $.user{}.username into Non-Repeated UDM fields

    filter {

    json {  source => "message"  array_function => "split_columns"}

    mutate {replace => {"event1.idm.read_only_udm.metadata.event_type" => "GENERIC_EVENT"}}

    mutate {replace => {"event1.idm.read_only_udm.metadata.vendor_name" => "myVendor"}}

    mutate {replace => {"event1.idm.read_only_udm.metadata.product_name" => "myProduct"}}

     

    mutate {replace => {"event1.idm.read_only_udm.principal.application" => "%{event_type}"}}

    mutate {replace => {"event1.idm.read_only_udm.principal.administrative_domain" => "%{user.profile.location}"}}

    mutate {replace => {"event1.idm.read_only_udm.principal.email" => "%{user.profile.email}"}}

    mutate {replace => {"event1.idm.read_only_udm.principal.platform_version" => "V1"}}

     

     

    mutate {convert => {"user.id" => "string"}}

    mutate {replace => {"event1.idm.read_only_udm.principal.user.employee_id" => "%{user.id}"}}

     

    mutate {merge => { "@output" => "event1" }}

    #statedump {label => "end"}

    }

     

    Snippet from statedump output:

     "@output": [

        {

          "idm": {

            "read_only_udm": {

              "metadata": {

                "event_type": "GENERIC_EVENT",

                "product_name": "myProduct",

                "vendor_name": "myVendor"

              },

              "principal": {

                "administrative_domain": "New York",

                "application": "user_activity",

                "email": "john.doe@example.com",

                "platform_version": "V1",

                "user": {

                  "employee_id": "12345"

                }

              }

            }

          }

        }

      ],

    1. When defining your UDM target schema in GoStash, you have the freedom to choose the name of the top-level field that will contain your event data.
    • Default Value: By default, GoStash uses the field name 'event'.
    • Customization: However, you can customize this to any value that suits your needs. In this example, we're using 'event1' to demonstrate this flexibility.
    • Non-Reserved Keyword: Note that 'event' is not a reserved keyword, so you can still use it as an ordinary field name even if you choose a different name for your top-level field.
     
    1. When defining your UDM target schema in GoStash, it's essential to adhere to the UDM field name format to ensure compatibility and consistency.
    • Top-Level Field: The name you choose for your top-level event field (e.g., 'event1') must be followed by '.idm.read_only_udm'. This creates the complete root node for your UDM schema (e.g., 'event1.idm.read_only_udm').
    • UDM Field List: Refer to the official UDM field list documentation https://cloud.google.com/chronicle/docs/reference/udm-field-list#field_name_format_for_parsers  for detailed information on valid field names and their formats.
    • UDM Event Data Model: The overall hierarchy of your UDM schema should follow the structure defined in the UDM event data model documentation [insert hyperlink here]. This ensures that your data is organized according to the UDM standard.
     
    1. When building your UDM schema in GoStash, it's crucial to follow the hierarchical structure defined in the UDM event data model. This ensures that your data is organized correctly and can be interpreted by Chronicle.

    Example: The documentation shows that fields like principal and target should be placed under the event.idm.read_only_udm node.

     
    1. To select suitable fields for your UDM target schema, you'll need to navigate the UDM Event data model hierarchy, which is documented here: [link to UDM event data model].
    • Starting Point: Begin with the 'Event' data model, not the 'Entity' data model. These two models have different structures and purposes.
    • Hierarchy: The UDM Event data model defines a hierarchical structure for your fields. Start at the root node (e.g., 'event1' in our example) and follow the defined hierarchy to choose appropriate fields.
    • Field Selection Criteria: When selecting fields, consider the following:
      • Non-repeated fields: Choose fields that are single-valued, not repeated (i.e., not lists or arrays).
      • String field type: Select fields that have a string data type.
    • Field Type and Multiplicity: The documentation provides information about the field type (whether it's a primitive type like string or integer, or a composite object) and the multiplicity (whether it's repeated or single-valued). Use this information to guide your field selection.
     

    Field Name         Type                      Label
    about              Noun                      repeated
    additional         google.protobuf.Struct
    extensions         Extensions
    intermediary       Noun                      repeated
    metadata           Metadata
    network            Network
    observer           Noun
    principal          Noun
    security_result    SecurityResult            repeated
    src                Noun
    target             Noun


     
    1. In the UDM schema, the event_type field plays a crucial role. It's a mandatory field that categorizes the event and determines the requirements for other fields within the schema.
     

    Here's how the event_type influences mandatory fields:

    • Example: HTTP Events: When parsing logs related to HTTP activity, you would set the event_type field to NETWORK_HTTP. This signals to Chronicle that the event is related to web traffic.
    • Mandatory Fields: By setting event_type to NETWORK_HTTP, certain other fields become mandatory to provide a complete picture of the HTTP event. For example, the network.http.method field, which indicates the HTTP method used (GET, POST, PUT, etc.), is now required.
    • Reference: You can find the specific mandatory fields for NETWORK_HTTP events in the UDM documentation: https://cloud.google.com/chronicle/docs/unified-data-model/udm-usage#network_http 

    This example demonstrates how the event_type acts as a key that unlocks specific requirements for other fields within the UDM schema. By understanding these requirements, you can ensure that your GoStash configurations produce valid and informative UDM events.
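The idea that event_type determines which other fields are required can be sketched in Python. This is not how Chronicle validates events: `REQUIRED_FIELDS`, `get_path`, and `missing_fields` are hypothetical names, and the table holds only the NETWORK_HTTP example mentioned above, so it is deliberately incomplete.

```python
# Illustrative Python sketch of event_type-driven field requirements.
# REQUIRED_FIELDS is a hypothetical, deliberately incomplete table: it holds
# only the NETWORK_HTTP example named in the text above.

REQUIRED_FIELDS = {
    "NETWORK_HTTP": ["network.http.method"],
}

def get_path(event, dotted_path):
    """Walk a nested dict along a dotted path; return None if absent."""
    node = event
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node

def missing_fields(event):
    """List the required fields that the event does not populate."""
    event_type = get_path(event, "metadata.event_type")
    required = REQUIRED_FIELDS.get(event_type, [])
    return [path for path in required if get_path(event, path) is None]

incomplete = {"metadata": {"event_type": "NETWORK_HTTP"}}
complete = {
    "metadata": {"event_type": "NETWORK_HTTP"},
    "network": {"http": {"method": "GET"}},
}
print(missing_fields(incomplete))
print(missing_fields(complete))
```

For the authoritative list of required fields per event type, rely on the UDM usage documentation linked above rather than on a table like this one.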

     
    1. Using 'GENERIC_EVENT' as your event_type is a good way to learn the basics of UDM without getting bogged down in complex field requirements. You can add more specific fields later as you become more familiar with the schema.
     
    1. To assign the value 'GENERIC_EVENT' to the event_type field in your UDM schema, you'll need to follow the hierarchical structure defined by UDM and utilize the concept of "Constructing Nested String Elements."
    • Hierarchy: The event_type field is nested within the metadata object. This means you need to create the metadata object first before you can assign a value to event_type.
    • Constructing Nested String Elements: This refers to the process of building the nested structure by creating the parent objects first. In this case, you would first create the metadata object and then assign the value 'GENERIC_EVENT' to the metadata.event_type field

    Root ($.event1.idm.read_only_udm) ⇒ (metadata : Composite object of type “Metadata” ) ⇒ (event_type : Enumerated String)

    mutate {replace => {"event1.idm.read_only_udm.metadata.event_type" => "GENERIC_EVENT"}}
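How a dotted target path implies its parent objects can be modelled in a few lines of Python. This is a sketch of the concept only: `set_path` is a hypothetical helper, and the nested dictionary stands in for the parser state.

```python
# Illustrative Python model of how a dotted target path in `replace`
# implies its parent objects. `set_path` is a hypothetical helper.

def set_path(state, dotted_path, value):
    """Create any missing parent dicts, then assign the leaf value."""
    node = state
    keys = dotted_path.split(".")
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return state

state = {}
set_path(state, "event1.idm.read_only_udm.metadata.event_type", "GENERIC_EVENT")
set_path(state, "event1.idm.read_only_udm.metadata.vendor_name", "myVendor")
print(state)
```

The second call reuses the metadata object created by the first, which mirrors how successive replace statements fill in the same UDM hierarchy.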

     
    1. To map simple, non-repeated string fields to your UDM schema in GoStash, follow this step-by-step approach:
    1. Identify a Suitable Parent: Start by identifying a relevant parent field from the UDM field list (e.g., 'about', 'additional', 'target').
    2. Check the Object Type: Refer to the 'Type' column in the UDM documentation to understand the structure and composition of the parent field.
    3. Select a Non-Repeated Field: Choose a suitable field within the parent that is not repeated (i.e., not a list or array).
    4. Continue Drilling Down: Repeat steps 2 and 3 until you reach a field that is either an enumerated string field or a single-valued field.
    5. Use replace to Build the Hierarchy: Use the replace operator to construct the full path from the root of your UDM schema ('event1.idm.read_only_udm') to the selected UDM field. This involves creating any necessary intermediate parent levels along the way.

    This approach ensures that your data is mapped correctly to the UDM schema, following its hierarchical structure and field type requirements.

    The final steps in your GoStash configuration involve writing the transformed data to the output and preparing the parser for production use:

    • Writing to Output: The statement @output ← $.event1 instructs GoStash to take the data that has been structured under the event1 field and write it to the output destination. This makes the transformed data available for consumption by Chronicle or other systems.
    • Disabling statedump: The statedump operation, which is used for debugging and inspecting the internal state of the parser, should be commented out or removed before deploying the parser to production. This is because statedump can introduce performance overhead and is not necessary for normal operation.

    mutate {merge => { "@output" => "event1" }}

    #statedump {label => "end"}
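The effect of the merge into @output can be pictured with a short Python model. It assumes that @output behaves as a list of events, as the square brackets in the statedump snippet suggest; `merge_into_output` is a hypothetical helper, not part of GoStash.

```python
# Illustrative Python model of `mutate { merge => { "@output" => "event1" } }`.
# `merge_into_output` is a hypothetical helper, not part of GoStash.

def merge_into_output(state, event_name):
    """Append the named event to the @output list, creating it if needed."""
    state.setdefault("@output", []).append(state[event_name])
    return state

state = {
    "event1": {
        "idm": {"read_only_udm": {"metadata": {"event_type": "GENERIC_EVENT"}}}
    }
}
merge_into_output(state, "event1")
print(len(state["@output"]))
```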

     

    Snippet from UDM Event Output: [image not included]

    Conclusion

    This concludes Part 1 of our GoStash guide, where we've covered the foundational concepts of parsing and transforming log data into the Unified Data Model (UDM) format. We explored the key components of GoStash syntax, including field types, operators, and looping statements. Additionally, we delved into the intricacies of JSONPath and its relationship to GoStash, highlighting the differences and similarities between the two.

    By now, you should have a good understanding of how to structure basic GoStash configurations, manipulate data fields, and create a valid UDM schema. Remember that practice is key to mastering GoStash. Experiment with different configurations, explore the GoStash documentation, and don't hesitate to seek help from the community if you encounter challenges.

    In Part 2, we'll delve deeper into more advanced GoStash techniques, such as complex conditional logic and repeated fields. We'll also explore how to handle various log formats and troubleshoot common parsing issues. Stay tuned for more practical examples that will help you become a GoStash expert!
