GroupWise 8 SP1 Linux - Script Checking Services w/auto restart/logging

This is a script I wrote to automate restarting GroupWise 8 SP1 services on our SLES 11 server. The grpwise and gwmta services failed frequently on the server, and restarting them manually in the middle of the night got old fast. The script checks whether the grpwise and gwmta services are running. If one is not, the script logs the failure, restarts the service, checks that the restart completed, and logs the result. A failed restart is logged as well.



This script could easily be modified to cover other services such as WebAccess, Apache, Tomcat, and the GWIA. It uses two methods to detect a running service or process: grpwise does not run with a PID of its own, so checking it works differently from checking the gwmta process. Take note.
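The two detection methods can be sketched as small helpers. This is only an illustration: the function names process_listed and status_says_running are made up here and are not part of the script itself, and both read their input from stdin so they can be tried against saved output.

```shell
#!/bin/bash
# Hypothetical helper names; neither function appears in the original script.

# Method 1: look for an agent's command line in a process listing on stdin.
# Lines belonging to grep itself are dropped first, as in the script.
process_listed() {
    grep -v grep | grep -qF -- "$1"
}

# Method 2: look for the word "running" in init-script status output on stdin.
status_says_running() {
    grep -q running
}

# Typical use against the live system (run as root):
#   ps ax | process_listed 'gwmta --home /opt/novell/groupwise/<insert domain here>'
#   /sbin/service grpwise status | status_says_running
```

Both helpers return exit status 0 when a match is found, so they can be used directly as an if condition.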



You can download the script as well. Rename the file to have a .sh extension, upload it to the server as root, then run chmod u+x <script_name>.sh to make the script executable. Edit the crontab and insert 10,20,30,40,50,00 * * * * /opt/novell/groupwise/<script_name>.sh to run the script every 10 minutes.
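If you would rather add the cron entry from the command line than in an editor, a filter like the following is one way to do it. Treat it as a sketch: add_cron_entry is a hypothetical helper, the script path in the usage comment is only an example name, and */10 is the usual cron shorthand for "every 10 minutes".

```shell
#!/bin/bash
# Hypothetical helper: reads an existing crontab on stdin and prints a new one
# that contains exactly one entry for the given script.
add_cron_entry() {
    local script="$1"
    # Drop any earlier entry for the same script so reruns do not duplicate it.
    grep -vF -- "$script"
    # Append the entry that runs the script every 10 minutes.
    echo "*/10 * * * * $script"
}

# Example use as root (the path is an assumed example, not the real file name):
#   crontab -l 2>/dev/null | add_cron_entry /opt/novell/groupwise/gwcheck.sh | crontab -
```

Because the function only filters text, you can pipe crontab -l through it and inspect the result before installing anything.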



Code:

#!/bin/bash
#
#------------------Change LOG------------------
#v9.5-Added hostname to email notifications
#v9.5.1-Removed $(date) from mail string. Mail was processing $(date) as an email address.
#
#----------------------------------------------
#
#Edit crontab as root to modify the schedule of this script.
#10,20,30,40,50,00 * * * * /opt/novell/groupwise/<script_name>.sh
#Defined variables
LOGDATE=$(date "+%m%d%Y")
LOG="/var/log/novell/groupwise/gwscript_logs/gw_services_$LOGDATE.log"
SERVICE1='gwmta'
SERVICE2='grpwise'
HOST=gwmail
EMAIL1=root
EMAIL2=email@domain.com
VERSION=v9.5.1

#Describe what the script does (printed after the variables are set so they expand).
echo "This script will check to see if $SERVICE1 and $SERVICE2 are running. If one or both are not running this script will start the process(es). This script will also log the event and email $EMAIL1,$EMAIL2."

#Check to see if log file exists already, if not create log.
if [ -e "$LOG" ]
then echo "Log exists"
else touch "$LOG" ; echo "Log file $LOG created"
fi

#Write header to log.
echo -e "-------------------------$SERVICE1/$SERVICE2 Service Check Log $VERSION-------------------------">>$LOG

#Check to see that gwmta process is running.
if ps ax | grep -v grep | grep 'gwmta --home /opt/novell/groupwise/<insert domain here>' > /dev/null 2>&1
then
  echo -e $(date) "$SERVICE1 service is running.">>$LOG
else
  #If service is not running try to restart and send email notification.
  echo -e $(date) "Sending Email Notice that $SERVICE1 is not running to $EMAIL1,$EMAIL2">>$LOG
  echo -e $(date) "$SERVICE1 is not running, attempting to restart $SERVICE1 now....">>$LOG
  echo -e $(date) "$HOST $SERVICE1 is not running! attempting to restart" | mail -s "$HOST $SERVICE1 is down." $EMAIL1,$EMAIL2

  /opt/novell/groupwise/agents/bin/gwmta --home /opt/novell/groupwise/<insert domain here> &
  #for script testing only: /etc/init.d/apache2 start
  #Import last 15 lines from /var/log/messages to help with troubleshooting.
  echo -e $(date) "Importing last 15 lines from /var/log/messages to $LOG now.... \r \r From /var/log/messages:">>$LOG
  tail -15 /var/log/messages>>$LOG
  echo -e "End /var/log/messages">>$LOG
  echo -e "\r">>$LOG
  #Give the agent a moment to come up before checking again (5 seconds is a guess, adjust as needed).
  sleep 5
  #Check to see that service started correctly and send email notification.
  if ps ax | grep -v grep | grep 'gwmta --home /opt/novell/groupwise/<insert domain here>' > /dev/null 2>&1
  then
    echo -e $(date) "$SERVICE1 started successfully">>$LOG
    echo -e $(date) "Sending Email Notice that $SERVICE1 restarted successfully $EMAIL1,$EMAIL2">>$LOG
    echo -e $(date) "$HOST $SERVICE1 has been restarted successfully" | mail -s "$HOST $SERVICE1 restarted" $EMAIL1,$EMAIL2
  else
    #If service failed to start send email notification.
    echo -e $(date) "$SERVICE1 failed to restart">>$LOG
    echo -e $(date) "Sending Email Notice for $SERVICE1 restart failed $EMAIL1,$EMAIL2">>$LOG
    echo -e $(date) "$HOST $SERVICE1 restart failed!" | mail -s "$HOST $SERVICE1 restart failed" $EMAIL1,$EMAIL2
    #Import last 15 lines from /var/log/messages to help with troubleshooting.
    echo -e $(date) "Importing last 15 lines from /var/log/messages to $LOG now.... \r \r From /var/log/messages:">>$LOG
    tail -15 /var/log/messages>>$LOG
    echo -e "End /var/log/messages">>$LOG
    echo -e "\r">>$LOG
  fi
fi

#Check to see that grpwise service is running.
#Note: grep matches if any agent reports running, so a single dead agent can go unnoticed.
if /sbin/service $SERVICE2 status | grep running > /dev/null 2>&1
then
  echo -e $(date) "$SERVICE2 service is running.">>$LOG
  echo -e "\r">>$LOG
else
  #If service is not running try to restart and send email notification.
  echo -e $(date) "Sending Email Notice for $SERVICE2 is not running to $EMAIL1,$EMAIL2">>$LOG
  echo -e $(date) "$SERVICE2 is not running, attempting to restart $SERVICE2 now....">>$LOG
  echo -e $(date) "$HOST $SERVICE2 is not running! attempting to restart" | mail -s "$HOST $SERVICE2 is down." $EMAIL1,$EMAIL2
  /etc/init.d/$SERVICE2 start
  #Import last 15 lines from /var/log/messages to help with troubleshooting.
  echo -e $(date) "Importing last 15 lines from /var/log/messages to $LOG now.... \r \r From /var/log/messages:">>$LOG
  tail -15 /var/log/messages>>$LOG
  echo -e "End /var/log/messages">>$LOG
  echo -e "\r">>$LOG
  #Check to see that service started correctly and send email notification.
  if /sbin/service $SERVICE2 status | grep running > /dev/null 2>&1
  then
    echo -e $(date) "$SERVICE2 started successfully">>$LOG
    echo -e $(date) "Sending Email Notice for $SERVICE2 restarted successfully $EMAIL1,$EMAIL2">>$LOG
    echo -e $(date) "$HOST $SERVICE2 has been restarted successfully." | mail -s "$HOST $SERVICE2 restarted" $EMAIL1,$EMAIL2
    echo -e "\r">>$LOG
  else
    #If service failed to start send email notification.
    echo -e $(date) "$SERVICE2 Failed to restart">>$LOG
    echo -e $(date) "Sending Email Notice for $SERVICE2 Failed to restart to $EMAIL1,$EMAIL2">>$LOG
    echo -e $(date) "$HOST $SERVICE2 failed to restart." | mail -s "$HOST $SERVICE2 failed to restart." $EMAIL1,$EMAIL2
    #Import last 15 lines from /var/log/messages to help with troubleshooting.
    echo -e $(date) "Importing last 15 lines from /var/log/messages to $LOG now.... \r \r From /var/log/messages:">>$LOG
    tail -15 /var/log/messages>>$LOG
    echo -e "End /var/log/messages">>$LOG
    echo -e "\r">>$LOG
  fi
fi

#end script
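
Since the gwmta and grpwise sections repeat the same check, restart, recheck and log pattern, one possible refactoring is a single function that takes the check and start steps as arguments. This is a rough sketch, not the script as I run it: check_and_restart, log_line, notify and RESTART_WAIT are hypothetical names, the default log path is only for trying it out, and notify writes to the log here instead of calling mail.

```shell
#!/bin/bash
# Sketch only: all function names below are hypothetical.

LOG="${LOG:-/tmp/gw_services_example.log}"   # assumed demo location
HOST="${HOST:-gwmail}"

# Append a timestamped line to the log.
log_line() {
    echo "$(date) $*" >> "$LOG"
}

# Record a notification. To send mail as the script above does, replace the
# body with: echo "$1" | mail -s "$1" "$EMAIL1,$EMAIL2"
notify() {
    log_line "NOTIFY: $1"
}

# check_and_restart NAME CHECK_FN START_FN
# CHECK_FN must return 0 when the service is running; START_FN starts it.
check_and_restart() {
    local name="$1" check="$2" start="$3"
    if "$check"; then
        log_line "$name service is running."
        return 0
    fi
    log_line "$name is not running, attempting to restart $name now...."
    notify "$HOST $name is down."
    "$start"
    # Wait before rechecking; 5 seconds is an assumed default.
    sleep "${RESTART_WAIT:-5}"
    if "$check"; then
        log_line "$name started successfully"
        notify "$HOST $name restarted"
        return 0
    fi
    log_line "$name failed to restart"
    notify "$HOST $name restart failed"
    # Keep the last 15 syslog lines for troubleshooting, if readable.
    tail -15 /var/log/messages >> "$LOG" 2>/dev/null
    return 1
}

# Example wiring for grpwise (commented out so the sketch has no side effects):
#   grpwise_up()    { /sbin/service grpwise status | grep -q running; }
#   grpwise_start() { /etc/init.d/grpwise start; }
#   check_and_restart grpwise grpwise_up grpwise_start
```

Adding another service then only takes two small functions and one call, instead of another copy of the whole block.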


If you have ideas on how to improve the script please share them.

Comment List
  • Thank you very much for sharing it.

    Note to Novell: Why the hell do we need such scripts? I'm tired of babysitting your products.
  • I think it's a pretty nice solution. It should have been built into GroupWise really.

    Simple is beautiful.
  • Thanks for the work on your script, but I also recommend that you run the GWHA service. I have mine set at a 30 second poll so when an agent drops it is quickly restarted and most users never notice when their POA goes down (we run in caching mode.) If you can't run the GWHA service, take a look at the script in TID 7002916. It checks through the processes to see if your WebAccess agent is running, and if it isn't it logs the event, sends an e-mail and restarts the agent. It should be easy to modify for your needs to restart any agent.

    Again, great work and thanks for the tip.
  • Let me just say I'm no GroupWise expert by any means; writing this script was a huge learning experience in how the GW server works. I don't usually work with the GW server, so I was not aware of GWMonitor or GWHA. That being said, from what I've just read about them, because we are running a Linux environment, GWMonitor is suggested to be run behind a firewall, since the monitor utility is web-only on Linux. This server is in the DMZ, as it is a WebAccess-only box. We could run the monitor utility, but there is a possible security risk involved in that. Since GWHA relies on GWMonitor agents to provide it information, that's not going to work either (please correct me if I'm wrong). I suppose if we had it running on another Linux box in the private network, then we could poll the server and agents, but we only have this one GW Linux box; the rest are NetWare (in the process of migrating). In short, this was the quickest solution.

    I'm glad you mentioned /var/run/novell/groupwise/; it's not something I had thought about before. In testing what you say, you are right: my method of checking the grpwise status is flawed. The PID is only written to the [agent].PID file when the process is started, and it is not updated again until the process is restarted after a failure. I should be searching for the gwia and webacc processes more specifically. The status is returned incorrectly when only one process, gwia or webacc, is 'dead.'

    It looks to me like the grpwise script checks which of the processes it can start are running and only starts the one(s) that are not. If I kill gwia and run /etc/init.d/grpwise start, it will only start the gwia and won't bother webacc. The possibility of duplicate processes is rather small, in my opinion.

    I'll have to revise the script to search for gwia and webacc. Thanks for bringing this to my attention!
  • Why not just use GWMonitor with GWHA (High Availability agent)?

    'grpwise' is a script, it is not a process, hence it will never have a PID. All GroupWise agents certainly do have PIDs, and these need to also exist in /var/run/novell/groupwise/[agent].PID . If the real PID is not in this file for any given agent, then 'rcgrpwise [agent] [command]' will fail to properly report the status or stop an agent, and 'start' might start a second instance of the agent.
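
For anyone who wants to check an agent through its PID file as discussed above, here is a minimal sketch. It assumes the file holds a single numeric PID on its first line; pidfile_alive is a hypothetical helper name, and kill -0 only tests whether the process exists, so the check should run as root (or as the agent's owner).

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the PID recorded in the given file
# belongs to a process that currently exists.
pidfile_alive() {
    local pidfile="$1" pid=""
    # A missing or unreadable PID file counts as "not running".
    [ -r "$pidfile" ] || return 1
    # Read the first line; an empty file leaves pid empty.
    read -r pid < "$pidfile"
    # Reject empty or non-numeric content.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 sends nothing; it only reports whether the process exists.
    kill -0 "$pid" 2>/dev/null
}

# Example use, with the file name pattern from the comment above:
#   pidfile_alive "/var/run/novell/groupwise/[agent].PID" || echo "agent is down"
```

A stale PID left behind by a crashed agent fails this check unless the number has since been reused by another process, so matching the command line as well is safer.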