Any thoughts on HPSA agent troubleshooting using HPOO?
I want to develop a flow that does some troubleshooting for an unreachable HPSA agent.
Any input will be highly appreciated!
It seems pretty simple. BTW, I speak from a Unix background.
First, you'll need an account that's already on the box (user, root, etc.). Then write your flow to access that box via ssh from the appropriate Central or RAS.
Once logged into the box, run a quick service opsware-agent status and grab the result code: 0 = successful, non-zero (usually 1) = failed.
Then start to take action: run service opsware-agent stop, look at ps -ef | grep opsware-agent and kill any PID that's still hanging around, then restart opsware-agent. If that fails, do the full stop and kill again, then check whether something else is already listening on 1002:
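If it helps, here's a rough sketch of the "grab the result code" idea as a small shell helper the flow could run over ssh. The name check_rc is just something I made up for illustration; it runs whatever command you hand it, throws away the text output, and reports only the exit code.

```shell
#!/bin/sh
# check_rc is a hypothetical helper name, not part of HPSA or HPOO.
# Usage: check_rc <label> <command> [args...]
# Runs the command quietly and reports success or failure by exit code only.
check_rc() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "$label: OK"
        return 0
    else
        rc=$?    # exit status of the command tested by the if
        echo "$label: FAILED (rc=$rc)"
        return "$rc"
    fi
}
```

On the managed server you'd call it as check_rc agent service opsware-agent status and branch the flow on the return code.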
netstat -an | grep -i listen | grep 1002
Sometimes NFS services will take port 1002.
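One thing to watch with a plain grep 1002 is that it also matches ports like 10020. A slightly stricter version might look like the sketch below. The function name port_in_listen is my own invention, and it assumes netstat prints the local address as addr:port (Linux) or addr.port (some other Unixes).

```shell
#!/bin/sh
# port_in_listen is a hypothetical helper name.
# Reads `netstat -an` output on stdin and succeeds (exit 0) only if a
# LISTEN line has a local address ending in exactly the given port.
port_in_listen() {
    port=$1
    grep -i listen | grep -Eq "[.:]${port}[[:space:]]"
}
```

Use it as netstat -an | port_in_listen 1002 and, again, just look at the exit code.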
Next, you'll want to determine the Gateway of that server. You can use the ServerVO HP Integration with HPSA and HPOO to get the serverID, and its gateway.
Then do a lookup on that gateway to find its primary IP.
Then, back on the managed server that's having the problem, check that the following TCP ports are open to the gateway:
nc -z OPSWAREGWIPADDR 1002
nc -z OPSWAREGWIPADDR 2001
nc -z OPSWAREGWIPADDR 3001
nc -z OPSWAREGWIPADDR 4040
Use the result codes to determine whether each check succeeded (0 vs. 1); this way you don't have to parse text all the time.
At that point, you know whether the agent died, its port was stolen, or the FW rules weren't in place for agent -> GW communication.
Then for good measure, throw in a check to see if the filesystem is read-only, since it's so common nowadays for people to way overstack their VMs on the ESX LUNs.
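The four nc checks can be rolled into one loop so the flow gets a single pass/fail plus a line per port. This is only a sketch: check_gw_ports is a made-up name, and the -w 5 timeout is my addition so a silently dropped connection doesn't hang the step (check that your nc build supports it).

```shell
#!/bin/sh
# check_gw_ports is a hypothetical helper name.
# Usage: check_gw_ports <gateway-ip>
# Probes each gateway port with nc and returns non-zero if any is blocked.
check_gw_ports() {
    gw=$1
    failed=""
    for port in 1002 2001 3001 4040; do
        # -w 5 is an assumed timeout, not part of the original commands
        if nc -z -w 5 "$gw" "$port" >/dev/null 2>&1; then
            echo "port $port: open"
        else
            echo "port $port: blocked"
            failed="$failed $port"
        fi
    done
    # Succeed only when no port failed
    [ -z "$failed" ]
}
```

Call it with the gateway's primary IP, e.g. check_gw_ports OPSWAREGWIPADDR, and treat a non-zero return as "FW rules or gateway problem".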
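For the read-only check, one way to do it on Linux is to look at the mount options rather than trying to write a file. A minimal sketch, assuming the Linux /proc/mounts layout (device, mount point, type, options, ...); is_readonly is just a name I picked.

```shell
#!/bin/sh
# is_readonly is a hypothetical helper name.
# Usage: is_readonly <mountpoint> < /proc/mounts
# Succeeds (exit 0) if the mount point is listed with the "ro" option.
is_readonly() {
    awk -v mp="$1" '
        $2 == mp {
            # Split the comma-separated options and look for an exact "ro"
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") found = 1
        }
        END { exit !found }
    '
}
```

Something like is_readonly / < /proc/mounts returns 0 when root has gone read-only. Matching the option exactly matters, because a healthy ext filesystem often carries errors=remount-ro in its options.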
Well that's what I'd start with at least! Share the flow when you're done 🙂