Business Service Management (BSM) is an approach to managing the enterprise, and although each implementation tends to have its own unique characteristics, we have encountered some commonality among them. One common piece is the way servers are represented in the overall topology. The most basic way to represent the state of a server is to link all of the management tools directly under the server. In reality, not all critical alarms have an equal impact on the health of the server. One approach is to categorize the types of alarms into buckets that represent high-level facets of health, such as Network or Applications/Processes, or even a metric from Help Desk or Change Management. The way to accomplish this is to use a template for how servers are depicted.
Below is an example of the end view of just the server. The idea is that by categorizing the types of information coming in, we can provide general rules for how state is propagated (e.g., critical/red flows up to the server). In the view below, it appears that the backup server is down, or that last night's backup failed. Overall, this should not have an impact on server availability, but it is useful information. There also appear to be some P1 help desk tickets opened against the server. We are not sure whether there is an impact, but when we look at the other silos, the performance of the server appears fine and network access appears fine. There appears to be a problem with one of the processes, but it is still operating; it is probably just slow or writing errors to the log file. Based on all of that, the server is represented as yellow, which is more of a caution condition: not so critical that we put everyone on it to fix it, especially if there are outages elsewhere in the environment. With the right view and the right state propagation rules, NOC will help prioritize your teams to focus on the most important things.
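The propagation rules described above can be illustrated with a small sketch. This is not NOC's rule engine or its API; the `State` levels, the `MAX_IMPACT` table, and the `server_state` function are hypothetical names chosen for illustration. The assumption is that each category folder is given a cap on how severe a condition it may push up to the server.

```python
from enum import IntEnum


class State(IntEnum):
    OK = 0        # green
    MINOR = 1     # yellow
    CRITICAL = 2  # red


# Hypothetical rule table: the worst state each category may propagate
# up to the server. Backup is informational only; Help Desk can raise
# a caution but cannot turn the server red on its own.
MAX_IMPACT = {
    "Backup": State.OK,
    "Help Desk": State.MINOR,
    "Applications": State.CRITICAL,
    "Performance": State.CRITICAL,
    "Network": State.CRITICAL,
}


def server_state(category_states):
    """Roll category states up to a single server state.

    Each category's state is capped by its rule, then the worst
    capped state wins. Unknown categories are not capped.
    """
    capped = [
        min(state, MAX_IMPACT.get(name, State.CRITICAL))
        for name, state in category_states.items()
    ]
    return max(capped, default=State.OK)


# The scenario from the text: backup failed, P1 tickets open,
# one process degraded, performance and network fine.
example = {
    "Backup": State.CRITICAL,
    "Help Desk": State.CRITICAL,
    "Applications": State.MINOR,
    "Performance": State.OK,
    "Network": State.OK,
}
result = server_state(example)
```

With these assumed caps, the scenario rolls up to `State.MINOR` (yellow), while a critical Network condition would still turn the server red.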
Option 1: One of the ways customers have templatized the topology of a server is to use Business Data Integrator (BDI). In cases where you are able to mine the list of servers from a database, lay out the topology in the BDI definition itself. Sometimes we use adapters for driving topology rather than for direct management. For this example, you set up a folder in the BDI definition that runs a DBElement query and generates the server elements. Underneath the DBElement, add folders as children, such as Network, Help Desk, Change Management, Performance, etc. You are not going to put additional queries under these folders; they are placeholders for future steps. This topology is then used as a subset of the end-to-end view; specifically, it is used for the servers within the overall service.
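As a rough illustration of the shape of the result, the sketch below builds the same structure in plain Python: one element per server row, each with the same set of empty placeholder folders. It does not use BDI or its definition format. The `servers` table, the `CATEGORY_FOLDERS` list, and `build_server_topology` are assumptions made for the example, and an in-memory SQLite database stands in for the real server inventory.

```python
import sqlite3

# Placeholder folders created under every server (names taken from the text).
CATEGORY_FOLDERS = ["Network", "Help Desk", "Change Management", "Performance"]


def build_server_topology(conn):
    """Return {server_name: {folder_name: []}} for each row in `servers`.

    The folders start out empty; they are placeholders to be populated
    by later steps, not queries of their own.
    """
    rows = conn.execute("SELECT name FROM servers ORDER BY name").fetchall()
    return {
        name: {folder: [] for folder in CATEGORY_FOLDERS}
        for (name,) in rows
    }


# Hypothetical inventory standing in for the real server database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE servers (name TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO servers VALUES (?)", [("web01",), ("db01",)])

topology = build_server_topology(conn)
```

Because every server gets its folders from the same list, adding a new category later is a one-line change that applies to all servers.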
The figure below is an example of what the folder structure might look like for BDI. At this point it is the structure of a view rather than management metric feedback (that is, it does not carry alarms or state, although it could).
Option 2: Another approach customers have taken centers on the event management adapters, such as Netcool, T/EC and Event Integrator. The idea is to update the hierarchy file to add the additional folders under the server breakout. It is the same idea again: empty placeholder folders underneath the server for the groupings of Network, Performance, etc. BSCM then uses this to build the structure of how a server is represented.
Option 3: This option is a bit more advanced. The idea is to generate the child folders on the fly within a BSCM script. As elements are created in the end view, you can dynamically create the folders (elements) that correspond to the class (server, application, etc.) of the element that BSCM just created. I previously blogged on how to create elements via script, which is the core concept needed; it just needs to be wrapped in "if" statements that check the class of the element. Please leave a comment if you would like me to blog about this option in more detail.
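The logic of Option 3 can be sketched outside the product as follows. This is plain Python, not a BSCM script, and it does not use any BSCM scripting calls; `Element`, `TEMPLATES`, and `on_element_created` are hypothetical names, and the folder list for the application class is an assumption. A lookup table keyed on class plays the role of the "if" statements.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Minimal stand-in for a topology element."""
    name: str
    element_class: str
    children: list = field(default_factory=list)


# Hypothetical templates: which placeholder folders each class receives.
TEMPLATES = {
    "server": ["Network", "Performance", "Help Desk", "Change Management"],
    "application": ["Processes", "Help Desk"],
}


def on_element_created(element):
    """Add the template folders for the element's class, if any.

    Safe to call more than once: folders that already exist are skipped.
    Classes without a template are left untouched.
    """
    existing = {child.name for child in element.children}
    for folder_name in TEMPLATES.get(element.element_class, []):
        if folder_name not in existing:
            element.children.append(Element(folder_name, "folder"))
    return element


server = on_element_created(Element("web01", "server"))
router = on_element_created(Element("core-rtr", "router"))
```

Keeping the templates in a table rather than in nested conditionals means a new class can be supported by adding one entry.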
The next figure is an example of an update to the hierarchy file for an event adapter (Netcool, T/EC, etc.). The difference here is that filters have been applied to some of the folders to start moving specific types of management data to their corresponding folders. The filter method is NOT required, since BSCM could place the event management data under the appropriate folder; it is more that some of our customers have used this point to get a jump start on it. Notice how only a few of the sub-folders are showing a condition.
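To make the filtering idea concrete, here is a minimal sketch of routing events to category folders by pattern. It does not reproduce the hierarchy file syntax or any adapter's filter language; the `FOLDER_FILTERS` patterns and the `route_event` function are assumptions made for illustration only.

```python
import re

# Hypothetical filters: the first pattern that matches an event summary
# decides which sub-folder the event is shown under.
FOLDER_FILTERS = [
    ("Network", re.compile(r"link down|interface|ping", re.IGNORECASE)),
    ("Performance", re.compile(r"cpu|memory|disk", re.IGNORECASE)),
]


def route_event(summary):
    """Return the folder name for an event summary, or None if unfiltered."""
    for folder, pattern in FOLDER_FILTERS:
        if pattern.search(summary):
            return folder
    return None  # left for a later step to place


network_folder = route_event("Interface eth0 link down")
performance_folder = route_event("CPU utilization above 95%")
unrouted = route_event("Nightly backup job failed")
```

Events that match no filter are simply left unplaced, which mirrors the point above that only some of the sub-folders show a condition at this stage.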
Part 1 of this series is about getting the initial building blocks (the sub-folders) in place in order to drive toward a unified way of managing the infrastructure, applying rules for how these different silos impact server availability, and maintaining the view in an automated manner.
In future posts on this topic, I will cover how to use BSCM to tie this starting-point topology to your management tools, as well as how to elevate some metrics from this collection for use as Key Performance Indicators (KPIs) in your end views.