OMi upgrade advice
Dear OMi / OpsB experts,
I am working with a customer who has a project to enhance their monitoring capabilities in the APM and Ops Bridge spaces.
I have included their services roadmap for this calendar year in the attached file.
They are currently focused on migrating from OML to OMi 10.60. During the assessment for this specific upgrade, some questions came up, and I would like to share them with you for your advice.
1. OMi or OBR Capabilities to Store and Report on “Raw Data” from OMi Events
- Their instance monitors about 4,500 nodes and generates around 15,000 events per month (on average).
- Their intention is to keep all those events available for ad-hoc analysis, for a period of at least 1 year.
- Therefore, they need to keep a database populated with the “raw data” of those events for long periods and be able to search that data. Currently, a daily process extracts everything older than 7 days and stores it in an external database, so that searches for past events do not affect OML production and the production database stays clean.
- They run reports for any of their 4,500 monitored nodes, looking for events and their attributes (description, open date, close date, other details per event) that occurred on the specified nodes during a given past period.
- This customer has a use case for querying events stored in the OM database over long time periods, such as 1 or 2 years before the current date.
- Is there a better way to archive and search these events with OMi 10.60?
- If we work with archived events (XML files in “/var/opt/OV/shared/server/datafiles/archive”), is there a user-friendly interface for analysing them later, or a way to import these events back into the OMi Event Perspective?
- Would OBR be the answer for this use case, being able to store large amounts of Event Details (raw data) for a long time (long retention period)?
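To make the use case in point 1 concrete, here is a minimal sketch of the kind of external archive and ad-hoc query the customer runs today. It uses SQLite only for illustration; the table name, column names, and helper functions are assumptions of mine and are not taken from OML, OMi, or OBR.

```python
import sqlite3
from datetime import datetime

# Hypothetical schema for an external event archive. The column names are
# illustrative only and do not mirror any OML/OMi/OBR table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS archived_events (
    event_id    TEXT PRIMARY KEY,
    node        TEXT NOT NULL,
    description TEXT,
    opened_at   TEXT NOT NULL,   -- ISO 8601 timestamp
    closed_at   TEXT             -- NULL while the event is still open
);
CREATE INDEX IF NOT EXISTS idx_node_opened
    ON archived_events (node, opened_at);
"""

def open_archive(path=":memory:"):
    """Open (or create) the archive database and ensure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def store_events(conn, events):
    """Insert (event_id, node, description, opened_at, closed_at) tuples."""
    conn.executemany(
        "INSERT OR REPLACE INTO archived_events VALUES (?, ?, ?, ?, ?)",
        events,
    )
    conn.commit()

def events_for_nodes(conn, nodes, start, end):
    """Return events opened on the given nodes within [start, end)."""
    if not nodes:
        return []
    placeholders = ",".join("?" for _ in nodes)
    sql = (
        "SELECT event_id, node, description, opened_at, closed_at "
        "FROM archived_events "
        f"WHERE node IN ({placeholders}) "
        "AND opened_at >= ? AND opened_at < ? "
        "ORDER BY opened_at"
    )
    params = [
        *nodes,
        start.isoformat(timespec="seconds"),
        end.isoformat(timespec="seconds"),
    ]
    return conn.execute(sql, params).fetchall()
```

A report for one node over the past year would then be something like `events_for_nodes(conn, ["nodeA"], datetime(2022, 7, 1), datetime(2023, 7, 1))`. The index on `(node, opened_at)` is what keeps this kind of per-node, per-period lookup cheap as the archive grows.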
2. Impacts after Changing OMi retention parameter to larger value
- What would be the impact on the OMi platform if we change the “Archiving of Closed Events” setting to 1 year?
- Could we experience performance degradation, since the database would grow quickly? Or is this volume not too much for the OMi platform to handle (4,500 nodes, 15,000 events per month on average)?
- There are also events that are not worth keeping and will be archived using the opr-archive-events command: about 1,300,000 per month!
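For a rough sense of scale, the retained volume can be worked out from the figures above. This is only back-of-the-envelope arithmetic; the average event size is an input you would have to measure yourself, and `avg_event_kib` is a hypothetical parameter name, not an OMi setting.

```python
def retained_events(events_per_month, retention_months):
    """Number of events held in the database at steady state."""
    return events_per_month * retention_months

def estimated_size_gib(event_count, avg_event_kib):
    """Approximate storage in GiB, given a measured average event size in KiB."""
    return event_count * avg_event_kib / (1024 * 1024)

# Events worth keeping, retained for one year.
kept = retained_events(15_000, 12)          # 180,000 events

# Events not worth keeping, if they were NOT archived out for a year.
noise = retained_events(1_300_000, 12)      # 15,600,000 events

print(kept, noise)
```

The gap between the two numbers is the reason for asking: a 1-year retention looks modest for the events we want to keep, but only if the high-volume events are archived out regularly.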
3. Increase “Maximum Event Count” parameter
- This parameter controls the maximum number of active events (excluding events with parents) that is shown in a UI. If the actual event count exceeds this number, the system switches to Purge Mode and only displays the latest x events.
- The default seems to be 20,000, but we would like to double it to 40,000.
- Will this have an impact on performance, or any other side effects?
- Should we also consider modifying other parameters to allow OMi to handle this increase, such as Java Heap Size? Is there any specific recommendation to follow?
4. OMi Health Check, or Self-Monitoring
- Is there a recommendation on how to set up self-monitoring on the OMi server and all its agents to ensure that they are constantly able to communicate with the server (not just monitoring whether the OM process is running, but a more robust “keep-alive” type of self-monitoring)?
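To illustrate what I mean by “keep-alive” monitoring, here is a minimal sketch of the logic: each agent periodically reports in, and the server flags any node that has been silent for too long. This is not an OMi feature or API; the function name and the data structure are assumptions made up for the example.

```python
from datetime import datetime, timedelta

def stale_nodes(last_seen, now, max_silence):
    """Return the nodes whose last heartbeat is older than max_silence.

    last_seen   -- dict mapping node name to the datetime of its last heartbeat
    now         -- current time as a datetime
    max_silence -- timedelta after which a silent node counts as unreachable
    """
    return sorted(
        node for node, seen in last_seen.items()
        if now - seen > max_silence
    )

# Example: nodeB has been silent for 30 minutes, the limit is 10 minutes.
now = datetime(2024, 1, 1, 12, 0, 0)
heartbeats = {
    "nodeA": now - timedelta(minutes=2),
    "nodeB": now - timedelta(minutes=30),
}
print(stale_nodes(heartbeats, now, timedelta(minutes=10)))
```

The point is that the check is driven by the absence of a message from the agent, rather than by the presence of a running process on the node.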
About your question 3,
I think you also need to increase the "User Interface Update Interval" parameter to avoid performance issues.
About your question 4,
There is a way to do this through configuration in the Monitored Node portal, so that the node is monitored constantly.