Connectivity issues happen. In this article we'll show you a few things we use to troubleshoot agent-to-server communication issues.
When we suspect a problem, we always start by checking agent status with the agent_control utility.
# /var/ossec/bin/agent_control -l
The -l option lists all agents, active or not, while -lc lists only active agents. Here is an example of what that might look like:
OSSEC HIDS agent_control. List of available agents:
ID: 000, Name: OSSECM (server), IP: 127.0.0.1, Active/Local
ID: 001, Name: AGENT01, IP: any, Active
ID: 002, Name: AGENT02, IP: any, Not Active
ID: 003, Name: AGENT03, IP: any, Active
ID: 004, Name: AGENT04, IP: any, Not Active
ID: 005, Name: AGENT05, IP: any, Active
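On a manager with many agents, the listing above can be filtered down to just the agents that need attention. The following is a minimal sketch, not part of OSSEC itself: the function name inactive_agents is hypothetical, and it assumes the listing format shown in the example above (comma-separated fields ending in the status).

```shell
#!/bin/sh
# Sketch: print "ID Name" for every agent reported as "Not Active".
# Reads an `agent_control -l` style listing on stdin.
# Assumes the comma-separated format shown in the example listing.
inactive_agents() {
    awk -F', ' '
        /^ *ID: / && $NF ~ /^Not Active/ {
            sub(/^ *ID: /, "", $1)   # "ID: 002"       -> "002"
            sub(/^Name: /, "", $2)   # "Name: AGENT02" -> "AGENT02"
            print $1, $2
        }'
}

# Usage (as root on the manager):
#   /var/ossec/bin/agent_control -l | inactive_agents
```

Matching on the last field keeps lines such as "Active" and "Active/Local" out of the result, so only agents in the "Not Active" state are printed.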
If you see an agent that is not active, log into that agent and open the ossec.log file to see what is going on:
2021/10/09 03:39:33 ossec-agentd(4101): WARN: Waiting for server reply (not started). Tried: '[mothership IP]'.
2021/10/09 03:39:35 ossec-agentd: INFO: Trying to connect to server ([mothership IP]:1514).
2021/10/09 03:39:35 ossec-agentd: INFO: Using IPv4 for: [mothership IP].
2021/10/09 03:39:56 ossec-agentd(4101): WARN: Waiting for server reply (not started). Tried: '[mothership IP]'.
2021/10/09 03:40:16 ossec-agentd: INFO: Trying to connect to server ([mothership IP]:1514).
2021/10/09 03:40:16 ossec-agentd: INFO: Using IPv4 for: [mothership IP].
These logs confirm the connection failure.
Check the firewall on the agent and verify that outgoing rules are not blocking the connection. If you're not sure, save your firewall rules, flush them, and then check the connection again. If the agent starts connecting, you know the firewall is where to start.
Verify traffic is reaching your OSSEC manager by running tcpdump on the manager. Something like this should do the trick:
# tcpdump -i eth0 port 1514
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
OSSEC uses UDP port 1514 by default, which is why the capture is filtered on port 1514.
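On a busy manager, a capture on port 1514 alone can be noisy. As a rough sketch, the filter can be narrowed to UDP and to a single agent. The helper name ossec_filter and the OSSEC_PORT variable are hypothetical conveniences, and 192.0.2.10 is a placeholder agent address; the filter expressions themselves (udp port, host) are standard tcpdump syntax.

```shell
#!/bin/sh
# Sketch: build a tcpdump filter for OSSEC agent traffic.
# Optional first argument: the IP of the agent you are troubleshooting.
# OSSEC_PORT is a hypothetical override; 1514 is the default named in the article.
ossec_filter() {
    port="${OSSEC_PORT:-1514}"
    if [ -n "${1:-}" ]; then
        printf 'udp port %s and host %s\n' "$port" "$1"
    else
        printf 'udp port %s\n' "$port"
    fi
}

# Usage (as root on the manager; -n skips DNS lookups):
#   tcpdump -n -i eth0 $(ossec_filter 192.0.2.10)
```

If packets from the agent show up in the capture but the agent still shows as not active, the network path is fine and the problem is more likely on the OSSEC side.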
Be sure to restart OSSEC on the agent to initiate a new connection request, so you can track it cleanly:
# /var/ossec/bin/ossec-control restart
It doesn't hurt to actively tail the logs at ossec.log to see what it tells you as it initializes. If you see something like this:
# tail -F /var/ossec/logs/ossec.log
2012/10/09 03:47:17 ossec-remoted: WARN: Duplicate error: global: 0, local: 51, saved global: 5, saved local:7563
2012/10/09 03:47:17 ossec-remoted(1407): ERROR: Duplicated counter for 'Agent001'.
You're in luck. This usually indicates a conflict with the RIDS queue. This queue records a counter for each message sent and received, which is how OSSEC prevents replay attacks. The duplication usually appears after restoring OSSEC from a backup or reinstalling it, but not after an upgrade. The fix is simple: clear the RIDS queue.
# /var/ossec/bin/ossec-control stop
# rm -rf /var/ossec/queue/rids/*
# /var/ossec/bin/ossec-control start
More often than not, that should do the trick. Monitor the ossec.log file, and you should start seeing tcpdump on the manager populate with data for the specific port. If so, go back to the first step and run agent_control -l again; the agent should now show as Active.
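If several agents were restored or reinstalled at once, it can help to see which of them are hitting the duplicated counter error before and after clearing the queue. This is a minimal sketch under the assumption that the error lines look like the ossec-remoted example above; the function name duplicated_counter_agents is hypothetical.

```shell
#!/bin/sh
# Sketch: list the unique agent names that appear in
# "Duplicated counter" errors. Reads log lines on stdin.
# Assumes the message format shown in the ossec-remoted example above.
duplicated_counter_agents() {
    sed -n "s/.*ERROR: Duplicated counter for '\([^']*\)'.*/\1/p" | sort -u
}

# Usage (on the manager):
#   duplicated_counter_agents < /var/ossec/logs/ossec.log
```

An empty result after the restart suggests the counters are back in sync.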