Service Manager and Disaster Recovery/Backup Contingency Plan/Scheduled Maintenance

Good morning,

I'm curious as to what other companies are doing in the area of Disaster Recovery or Backup Contingency Planning for outages for Service Manager.  

In our company, when we make changes to Service Manager that require changes to the dbdicts, we schedule an outage, kick everybody off the application, perform our changes and checkout, and then bring the application back up.  While the main Service Manager application is down, we actually have our users create tickets in another system, and then we use ConnectIt to move those tickets over to HPSM.  The other system is limited, does not have all the data our Production system has, and it's kind of a pain trying to keep the core data in sync.

We use that same system for unexpected outages - like if HPSM goes down for some reason.

For 'Hard Down' disaster recovery, the database team and server teams bring up whole copies of Service Manager in another data center, but that's only for a company-wide disaster, and not something that's available for scheduled outages or unexpected HPSM outages.

So, I'm curious how other companies have solved it.  What do you guys do in your companies for unexpected outages or scheduled maintenance?

  • Hi Jacob,

    We have exactly the same requirement, but HP said it is not supported on the application side. For example, for a planned outage we thought of moving users to the DR instance while we work on the primary instance. During this window there would be updates to the database from both instances, but Service Manager doesn't support active sync between database instances. Since Service Manager is an enterprise-level application, clients expect this functionality to be available. HP is not even planning to add this feature in the SM 9.5x versions. Not sure when this will be available in Service Manager.

    Regards,

    Madhava

  • One of my customers, which no longer uses HP, had a two-datacenter setup: one active and one standby.

    In case of a major outage in the active datacenter they would switch over to the standby datacenter. The database was kept in sync between the two datacenters, so all data was available regardless of which datacenter was active.  Only one Service Manager instance would be active at any time.

    Once a year they did a disaster recovery test where this setup was tested and it worked perfectly.

    Service Manager was running on AIX.

  • We swing over to our secondary data center for DR exercises.  If I had a planned outage of significant length I could probably plan to swing over to DR as well, but we mirror the database to our secondary data center and I think it would be problematic to run both instances.  I've never had an unplanned outage last more than 15-20 minutes.

    And most of my scheduled outages are less than 30 minutes so people just have to do without for that short time.  I do plan all my outages during my maintenance window which is 9pm on Sundays because that was determined to be the least busy time for Service Manager usage.  So far this has worked for us.

    When we upgraded from 9.40 to 9.41 we knew the outage would be 3-4 hours.  The powers that be were not happy about that and didn't like the idea of tickets being tracked on paper or via a spreadsheet, Access database, etc.  I had to allow them to open tickets in the QA environment (after resetting the number table to match production), and then after the outage I did an unload in QA of the following tables for the outage time period: SYSATTACHMENTS, activity, activityservicemgmt, incidents, probsummary, screlation, and moved the records to production.

    It was a pain on my side, but the only impact on the users was that they had to sign in through a different link during the outage window.
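
For anyone who wants to picture the "unload only the outage time period" step in the last reply, here is a rough Python sketch. It is not Service Manager code and does not call any HP API: it only shows the idea of filtering already-exported records per table by a time window. The table names come from the post; the function names, the in-memory record layout, and the `sysmodtime` timestamp field are assumptions for illustration, so check each table's dbdict for the field that really applies.

```python
from datetime import datetime

# Tables listed in the reply above.
TABLES = [
    "SYSATTACHMENTS", "activity", "activityservicemgmt",
    "incidents", "probsummary", "screlation",
]


def in_outage_window(record, start, end, field="sysmodtime"):
    """Return True if the record's timestamp lies inside [start, end].

    'sysmodtime' is an assumed field name; records without it are skipped.
    """
    ts = record.get(field)
    return ts is not None and start <= ts <= end


def select_for_unload(records_by_table, start, end):
    """Return {table: [records]} limited to the outage window (hypothetical helper)."""
    return {
        table: [
            rec for rec in records_by_table.get(table, [])
            if in_outage_window(rec, start, end)
        ]
        for table in TABLES
    }
```

With a window of, say, Sunday 9pm to 1am, only records created or modified in QA during that window would be selected for the move; everything older stays behind so production records are not overwritten.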
