
Hi Team,

What are the best practices for optimizing log data when multiple sources such as firewalls, EDR, and other telemetry are involved? Specifically:

  • How should log source prioritization be determined when there is overlap?
  • What factors should guide the decision on which event types to retain or ignore to minimize redundancy?

Looking forward to your insights and recommendations!

Thank you!


Adding to Kent’s advice, the main point is to maximize detection value with the lowest possible log volume/EPS.

  1. Prioritization in Case of Overlap:
    1. Compliance: which sources must be monitored as mandated by compliance or regulatory requirements.
    2. Coverage: does one of the sources cover an extra or additional zone? (e.g. if EDR already monitors all host-to-host/server communication with high fidelity, you can log only the any-to-external traffic on the firewalls and suppress the internal-to-internal flows)
    3. Audit: configuration, logging policy, access and authorization, and state changes (reboot, restart, etc.) must be monitored on all devices regardless of redundancy.
    4. Data Quality:
      1. Unique Discriminator Fields: some data sources have additional fields that are high value for detections (e.g. UTM/NGFW firewalls provide web categories that standard firewalls do not).
      2. Suppression: a trade-off between sacrificing some fields and gaining a lower EPS (e.g. suppress internal-to-internal firewall connection-initiation logs and keep only the connection-closure logs, since those carry more information such as session duration and flags; rely on the firewall’s DDoS protection module to alert on an internal-to-internal SYN flood, and on EDR for the standard connection logs). This is a calculated risk, however, and must be documented.
      3. Aggregation: sacrifice some visibility in certain fields as a trade-off for log volume (e.g. removing the process ID from noisy, high-frequency process restarts on Unix-based OSs). This is also a calculated risk; ideally the aggregation should be time-based and applied at the source, either on the data source itself (firewall/EDR) or through the transport layer (Bindplane, Cribl, etc.). A minimal aggregation sketch follows after this list.
  2. Data Source EPS
    1. Audit Policy: apply the vendor-recommended standard audit policies, where provided, in accordance with any compliance requirements.
    2. Noisy Categories per Log Type: look at the top 10 event categories per data source and check whether they indicate a need for policy tuning or enhancement, to avoid log congestion and alert fatigue (e.g. ML detections in EDRs can be very noisy if not tuned).
    3. Detection Value: focusing on the detection rules for your threat profile, identify which log types are required to monitor a given TTP (e.g. Kerberos logs for Golden Ticket attacks, modifications to critical registry hives, etc.).
    4. Parsing: based on “2”, check whether there are too many GENERIC_EVENT entries per log_type; that could indicate parsing issues (see the second sketch after this list).
    5. Batching: for data sources that do not need immediate or real-time log visibility, batching can save some bandwidth at the cost of data caching.
    6. Log Priority: look for the noisy hosts within the lowest severity tier of each data source, and prioritize events with event_type=GENERIC_EVENT for review, as these will be either low-quality or mis-parsed events, or may simply indicate misconfigurations (a crashing process, invalid saved credentials, etc.).
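
To make the aggregation point in 1.4.3 more concrete, here is a minimal sketch of time-window aggregation you could run as a custom step before forwarding. The JSON-lines input and the field names (host, event_type, process_name, timestamp as epoch seconds) are assumptions for illustration; adapt them to whatever schema your pipeline (Bindplane, Cribl, etc.) actually emits.

```python
import json
import sys
from collections import defaultdict

WINDOW_SECONDS = 60  # collapse identical events within a 1-minute window

def aggregate(lines):
    """Collapse repeated events per (host, event_type, process_name) per time
    window, dropping the high-cardinality process ID and keeping a count."""
    buckets = defaultdict(lambda: {"count": 0, "first_seen": None, "last_seen": None})
    for line in lines:
        event = json.loads(line)
        ts = int(event["timestamp"])
        window = ts // WINDOW_SECONDS
        key = (event["host"], event["event_type"], event["process_name"], window)
        bucket = buckets[key]
        bucket["count"] += 1
        bucket["first_seen"] = ts if bucket["first_seen"] is None else min(bucket["first_seen"], ts)
        bucket["last_seen"] = ts if bucket["last_seen"] is None else max(bucket["last_seen"], ts)
    for (host, event_type, process_name, window), agg in sorted(buckets.items()):
        yield {
            "host": host,
            "event_type": event_type,
            "process_name": process_name,
            "window_start": window * WINDOW_SECONDS,
            "count": agg["count"],
            "first_seen": agg["first_seen"],
            "last_seen": agg["last_seen"],
        }

if __name__ == "__main__":
    # Read raw events from stdin, emit one aggregated summary per bucket.
    for summary in aggregate(sys.stdin):
        print(json.dumps(summary))
```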
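
Similarly, for points 2.2 and 2.4, here is a rough sketch of the "top 10 noisy categories" and GENERIC_EVENT-ratio checks over an exported event sample. The CSV column names (log_type, event_type) and the 20% threshold are assumptions; use whatever export format and thresholds fit your environment.

```python
import csv
import sys
from collections import Counter, defaultdict

def analyse(path):
    per_source_categories = defaultdict(Counter)  # log_type -> category counts
    generic_per_log_type = Counter()              # log_type -> GENERIC_EVENT count
    total_per_log_type = Counter()                # log_type -> total events

    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            log_type = row["log_type"]
            category = row["event_type"]
            per_source_categories[log_type][category] += 1
            total_per_log_type[log_type] += 1
            if category == "GENERIC_EVENT":
                generic_per_log_type[log_type] += 1

    for log_type, categories in per_source_categories.items():
        print(f"\n{log_type} - top 10 categories:")
        for category, count in categories.most_common(10):
            print(f"  {category}: {count}")
        ratio = generic_per_log_type[log_type] / total_per_log_type[log_type]
        if ratio > 0.2:  # arbitrary threshold, tune for your environment
            print(f"  note: {ratio:.0%} GENERIC_EVENT - possible parsing or audit-policy issue")

if __name__ == "__main__":
    analyse(sys.argv[1])
```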

Hope these tips help.


Thank you @kentphelps @AbdElHafez!

