The Server Down Knowledge Script (KS) uses a 'data heartbeat' to identify down servers and non-functioning AppManager agents. The main benefit of this approach is that it does not ping every server to determine up/down status. It pings only those servers that have not returned data to the database recently, which may make it more suitable for large AppManager environments. It also provides a mechanism to identify non-functioning agents, for example those that are stopped, hung or no longer installed.
The monitoring process consists of two parts: a Stored Procedure and a KS. The Stored Procedure, which must be installed into each AppManager database, determines which agents have not returned a data point from a designated KS within a specified interval (in minutes). Separate KSs may be defined for Windows and UNIX agents. Two versions of the code are included: one for AppManager version 7 and another for version 8 onwards. Note that in many cases it may not be necessary to run this script in version 8, because AppManager 8 performs out-of-the-box server down monitoring as part of its new self-monitoring capabilities.
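The 'data heartbeat' test performed by the Stored Procedure can be sketched as follows. This is only an illustration in Python, not the Stored Procedure shipped with the KS: the function name find_stale_agents, the dictionary of last data point times and the sample host names are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def find_stale_agents(last_data_point, interval_minutes, now):
    """Return the agents whose latest data point is older than the interval.

    last_data_point maps an agent name to the time of its most recent data
    point from the designated KS, or to None if it has never returned one.
    All names here are hypothetical and do not reflect the AppManager schema.
    """
    cutoff = now - timedelta(minutes=interval_minutes)
    stale = []
    for agent, last_seen in last_data_point.items():
        # An agent with no data point at all is treated as stale.
        if last_seen is None or last_seen < cutoff:
            stale.append(agent)
    return sorted(stale)


# Example: with a 30 minute interval, only agents silent since 11:30 are listed.
now = datetime(2014, 3, 6, 12, 0)
sample = {
    "web01": datetime(2014, 3, 6, 11, 55),  # recent data, not a candidate
    "db01": datetime(2014, 3, 6, 11, 20),   # silent for 40 minutes
    "unix01": None,                         # never returned data
}
candidates = find_stale_agents(sample, 30, now)
print(candidates)
```

Only the agents returned by a check of this kind need to be pinged, which is what keeps the number of pings low in a large environment.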
The agents identified by the Stored Procedure are pinged (i.e. ICMP echo) to determine up/down status. Windows servers that respond to the ping may be further checked to determine whether the NetIQ AppManager Client Resource Monitor (netiqmc) service is running, based on the availability of its TCP port. This type of check is not possible for UNIX agents or for the Client Communication Manager service.
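The two-stage check described above (ping first, then a TCP port probe for Windows agents only) might look like the sketch below. It is written in Python for illustration and is not the KS code itself. The function names, the status strings and the agent_port parameter are assumptions; the actual netiqmc port is not stated here, so it must be supplied by the caller. The ping is passed in as a callable because a real ICMP echo depends on the platform.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out or unresolvable: treat as closed.
        return False


def classify_agent(host, is_windows, ping, port_open, agent_port):
    """Classify one agent that failed the data heartbeat check.

    ping and port_open are callables supplied by the caller, so the logic
    can be exercised without touching the network. agent_port is a
    hypothetical parameter standing in for the netiqmc TCP port.
    """
    if not ping(host):
        return "server down"
    if not is_windows:
        # The port-based service check is not available for UNIX agents.
        return "server up, agent state unknown"
    if port_open(host, agent_port):
        return "server up, agent port reachable"
    return "server up, agent port not reachable"


# Example with stand-in probes: the server answers ping but the port is closed.
status = classify_agent(
    "db01",
    is_windows=True,
    ping=lambda host: True,
    port_open=lambda host, port: False,
    agent_port=12345,  # placeholder value, not the real netiqmc port
)
print(status)
```

In real use, tcp_port_open would be passed as the port_open argument, and ping would wrap whatever ICMP facility is available on the machine running the check.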
The latest version of this KS is 3.63, released 06-Mar-2014. See code body for detailed change history.