AppManager Tips and Tricks - Automated actions

You already know that AppManager can monitor just about anything, which lets you get alerts when things go south or analyze performance information so that you can make better decisions about your application and server environment, right? Of course you did. I just needed to check.

One thing that you may not be aware of, however, is that AppManager can take corrective actions when alerts are generated. This can get you out of the business of responding to phantom alerts and doing simple things like service restarts. Think about it. How much time does it take to remote into a server, open Control Panel, wait for the services to enumerate, scroll down to find the service and then just start it? Probably a good 5-10 minutes. Now think about how many times a week you (or someone else) may have to do that. It can add up to a lot of time and, as we all know, time is money. Additionally, as an IT Operations professional, your job is to keep things running, and in many cases that means reducing downtime as much as possible. Having your monitoring tool automatically restart applications or servers can really help reduce MTTR (mean time to repair), making your metrics look great.

AppManager has a category of Knowledge Scripts called actions. These actions do things (hence the name). You can do simple things like send an email, send a page or automatically close an event. Most customers use these actions regularly. There are also many action scripts that can take corrective actions.

  • Restart services - The majority of the scripts that monitor services have a built-in capability to automatically restart services. These include NT_ServiceDown, IIS_HealthCheck, SQL_ServerDown and a few others. Basically, if the service is detected as down, the script can attempt to restart it. If it doesn't come back in a specified amount of time or after a specified number of tries, then an event is generated. It will still generate an event if the service restarts OK, but that event can be set to a lesser severity that doesn't generate an email or pager message.

  • Reboot server - Let's face it, there are some applications out there where the only fix is to reboot the server. If you have a few of those, and you know what to monitor on the app, you can have AppManager automatically reboot that server. You have to add a registry key that allows AppManager to do it, but that is easily done if you need it.

  • Do you run a NetApp Filer environment? If so, there are actions specific to NetApp filers that enable you to execute a snapmirror or any other filer command.

  • Do you have some things that happen that you need to keep track of yet don't want to get notified about regularly? AppManager can write out information to the Windows Event Log or to a text file. This allows you to keep track of things without being burdened with events in the console. Many people write out to a text file and then have another job that monitors the size of that text file. When the size hits a threshold, they're notified and can take a look at it.

  • How many PowerShell scripts do you have lying around? Did you know that you can execute PowerShell scripts (as long as there is a path to the script) in response to events? This opens up an entire world of very complex corrective actions that you can take without having to touch a single system.

  • Collect additional information - Maybe you get a high memory or high CPU alert and want to know a little bit more. If you're not already running them, you can have those events start up a script to collect the top 10 memory or CPU processes and then send that list to you so that you can make a better-informed troubleshooting decision.

  • If you are monitoring VoIP or Unified Communications environments, you can automatically trigger a VoIP Quality diagnostic trace to run. This way, you can save time hunting for data and get right to troubleshooting.

  • In some cases, running a simple operating system command is useful. On both Windows and Unix/Linux, you can run console commands as an action (including calling batch files and scripts).
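
To make the restart-services idea from the list above a bit more concrete, here is a minimal sketch in Python of the retry logic it describes: try the restart a set number of times, wait between tries, and raise either a low-priority "recovered" event or a high-priority "still down" event. This is not AppManager code and not how the Knowledge Scripts are implemented. The function name, the Event class, the parameters and the severity numbers are all hypothetical placeholders chosen for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Event:
    # Hypothetical event record; the severity numbers are placeholders
    # where a lower number means "more urgent".
    severity: int
    message: str


def restart_with_retries(
    is_running: Callable[[], bool],
    restart: Callable[[], None],
    max_attempts: int = 3,
    wait_seconds: float = 0.0,
    failure_severity: int = 5,
    recovered_severity: int = 25,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Event]:
    """Try to bring a service back and report what happened.

    Returns None if the service was already up, a low-priority event if a
    restart worked, or a high-priority event if every attempt failed.
    """
    if is_running():
        return None  # nothing to do, no event raised

    for attempt in range(1, max_attempts + 1):
        restart()
        sleep(wait_seconds)  # give the service time to come back
        if is_running():
            return Event(
                recovered_severity,
                f"service restarted after {attempt} attempt(s)",
            )

    return Event(
        failure_severity,
        f"service still down after {max_attempts} attempt(s)",
    )
```

The check and restart steps are passed in as plain callables, so the same policy could wrap whatever mechanism you actually use to query and start a service. The point of the two severities is the same as in the bullet: a successful automatic restart is still recorded, but only the failure needs to wake anyone up.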

There are numerous other actions available. I've just touched on a few of the more common ones. If you want to really have some fun, take a look at our Aegis IT Process Automation solution. Talk about massive capability. Aegis workflows can be triggered by AppManager events and can take multi-step corrective actions like taking a snapshot of a VM or issuing a vMotion command. It also has interfaces into anything with a backend database or web services, including things like Azure or Amazon cloud services. Well worth a look.

As is usually the case, if you've got a favorite action that I didn't list or if you disagree with anything I've said, please feel free to sound off in the comments section below.

Happy monitoring!

