Hi Everyone, 

TL;DR: Has anyone dealt with debugging why a SOAR remote agent would start failing? How did you manage to get the Agent to connect again?

---

I realized today (after several playbooks started failing) that the remote agent we set up hasn't connected since the 11th of May. I currently have a case open with support, but getting a fix may take a while.

The Remote Agent page shows the Status as "FAILED", and when I download the logs for the last week, the file is empty (I'm assuming this is because the connection has been in the FAILED state for over a week).

When I dive into the host, the docker connection is still live and healthy, so I'm not sure where to begin since it's likely on the Google side. (Initial deployment was done through the "Send Installer/docker Command" action within the SOAR itself.)

I've done some debugging on my end, including:

  • Reviewing Docker logs on the host: nothing new in the logs over the past 3 weeks that could be related to the current FAILED status.
  • Disabling / re-enabling the agent: the status still shows as FAILED, and I am unable to find any logs that point to the reason.
  • Deploying a new agent: it fails immediately.

Questions:

  1. Is anyone having the same issue with remote agents?
  2. How are you monitoring for agents changing status to Failed?
  3. Have you been able to resolve an issue like this? How did you debug and fix the issue?

TIA

 

 

Managed to fix this with the help of Google support - but the question remains:

  • How are you monitoring for agents changing status to Failed?

Also wondering this



@yasterday After speaking with Google support, there is currently no way to monitor for this through logs. You need to create a scheduled job to monitor for changes in Remote Agent status.

Idea:

  • Create a list of Active agents
  • Compare status of active agents against requirement
  • Create a case if Agents do not match expected status

Here's an example function to get you started using the "GetAgents" endpoint. 

import requests

def get_remote_agents_status(auth_header, soar_url, live_agents, siemplify):
    # live_agents: collection of identifiers (as strings) of agents expected to be Live
    siemplify.LOGGER.info("[info] Retrieving agents...")
    resp = requests.get(
        url=soar_url + "/api/external/v1/agents/GetAgents?format=camel",
        headers=auth_header,
    )
    resp.raise_for_status()  # stop here if the API call itself failed

    siemplify.LOGGER.info("[info] List of agents has been retrieved.")
    remote_agents = resp.json()

    for agent in remote_agents:
        # We're only interested in Live remote agents whose status changes from 0.
        # List of status codes:
        #   Live 0
        #   Error 1
        #   Disabled 2
        #   WaitingForAgent 3
        #   Deleted 4
        if str(agent['identifier']) in live_agents and str(agent['status']) != "0":
            siemplify.LOGGER.info("[info] Issue with remote agent identified, creating SecOps case for agent: " + str(agent['identifier']))
            # Do something here, e.g. create a case
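
If you want to track a different expected status per agent (the "compare against requirement" step in the idea above), the comparison can be kept separate from the API call. This is only a rough sketch: `find_unexpected_agents`, `describe` and the `expected_status` mapping are names I made up for illustration, and it assumes each agent in the response has `identifier` and `status` keys as in the function above. The status codes are the ones listed above.

```python
# Status codes as listed in the post above.
STATUS_NAMES = {0: "Live", 1: "Error", 2: "Disabled", 3: "WaitingForAgent", 4: "Deleted"}


def find_unexpected_agents(agents, expected_status):
    """Return (identifier, actual, expected) for each agent not in its expected state.

    agents: list of dicts with 'identifier' and 'status' keys (assumed response shape).
    expected_status: hypothetical mapping of agent identifier -> expected status code.
    """
    actual_by_id = {str(a["identifier"]): int(a["status"]) for a in agents}
    mismatches = []
    for identifier, expected in expected_status.items():
        actual = actual_by_id.get(str(identifier))  # None if the agent is not in the response
        if actual != expected:
            mismatches.append((str(identifier), actual, expected))
    return mismatches


def describe(mismatch):
    """Build a human-readable line for a case description or log message."""
    identifier, actual, expected = mismatch
    if actual is None:
        actual_name = "Missing"
    else:
        actual_name = STATUS_NAMES.get(actual, f"Unknown({actual})")
    return f"Agent {identifier}: expected {STATUS_NAMES.get(expected, expected)}, got {actual_name}"
```

In the scheduled job you would pass `resp.json()` as `agents` and create one case per returned mismatch. An agent that is expected but missing from the response is also reported, which covers the case where an agent disappears entirely.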

 Hope this helps!

