Aegis Workflow Test and Benchmark Documentation - Why and How?

Aegis workflows can be designed to do just about anything, from simple tasks like running a script on a schedule to larger, more complex tasks like provisioning virtual environments. Just because these are possible, however, doesn't mean you can run an unlimited number of these workflows simultaneously. How many workflows can run simultaneously depends on a number of factors, but they all really boil down to one thing - the design of the workflows.

The purpose of the Aegis Test and Benchmarking documentation is to record what each workflow is designed to do and how much it can cope with.

It defines:

  1. The Scope of the workflow - The purpose of the workflow, including its features and limitations, and which workflow errors are handled and how (passively or actively).

  2. The Footprint of the workflow - How many resources (file system / database / message queues / memory / CPU etc.) the workflow uses while running, and what is left behind after the workflow completes.

  3. The Load Limitations of the workflow - How many events per second / bulk processing items / managed objects / simultaneous runs the workflow is designed to handle.

  4. Plan B - What happens when an incident occurs which is outside the parameters of 1, 2 or 3 above.
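
As a rough illustration, the four areas above could be captured in a small record so that gaps are easy to spot. This is only a sketch: the class name `WorkflowTestDoc`, its fields, and the sample values are hypothetical and are not part of Aegis.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowTestDoc:
    """Hypothetical record of the four documented areas for one workflow."""
    name: str
    scope: str = ""                                  # purpose, features, limits, error handling
    footprint: dict = field(default_factory=dict)    # resources used and left behind
    load_limits: dict = field(default_factory=dict)  # e.g. events per second
    plan_b: str = ""                                 # what happens outside the limits

    def missing_sections(self):
        """Return the names of sections that have not been filled in yet."""
        sections = {
            "scope": self.scope,
            "footprint": self.footprint,
            "load_limits": self.load_limits,
            "plan_b": self.plan_b,
        }
        return [name for name, value in sections.items() if not value]


# Illustrative values only
doc = WorkflowTestDoc(
    name="AppManager event to IT ticket",
    scope="Create one ticket per new event.",
    load_limits={"events_per_minute": 10},
)
print(doc.missing_sections())
```

Here the footprint and Plan B sections are reported as missing, which is exactly the kind of gap that is easier to find before production than after.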



This all sounds pretty simple. Unfortunately, as somebody who gets asked why a workflow or the system stopped working, I find it's not very often that the above information is known. This makes for an awkward hunt as to who is to blame for a particular problem. If you don't write your own workflows, this documentation is particularly important for all involved parties.

Usually the problems which occur are related to point 4: the absence of a Plan B. This is normally directly related to the scoping of the workflow being unrealistic or incomplete. More specifically, the absence of a Plan B usually indicates that the limitations of the workflow are not known.

As an example: The plan is to design a workflow to create an IT Ticket for each new event generated in NetIQ AppManager.

The scope should include a flowchart corresponding to the workflow. This way everyone can follow a non-technical description of what the workflow does and what it can and cannot do. It also serves as a great way of showing that a particular feature was never meant to be in the workflow. Flowcharts are really important in designing even the simplest workflow: they make clear what the workflow is meant to do, and the workflow designer just needs to follow the logic of the flowchart to create the workflow.

The scope also needs to include the limitations of the workflow. At the scoping phase these may often be a wish list; during the workflow development and testing phase you will try to hit these targets.

In this example, if the requirement is to be able to handle 10 events per minute, you may be able to build a complex workflow which does many things as the load is small.

If the requirement is to handle 10 events per second, you will probably need to build a highly efficient workflow with limited features to deal with higher volume.
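
The difference between the two requirements is easier to see as a time budget per event. The following is a minimal sketch, assuming each run handles one event and that runs can execute in parallel; the function name and the `parallel_runs` parameter are illustrative, not an Aegis setting.

```python
def per_event_budget_seconds(events, per_seconds, parallel_runs=1):
    """Average time each run may take if the workflow is to keep up.

    events / per_seconds is the required rate; with several runs in
    parallel, each individual run may take proportionally longer.
    """
    return per_seconds * parallel_runs / events


# 10 events per minute: each run may take about 6 seconds
print(per_event_budget_seconds(10, 60))

# 10 events per second: each run has about 0.1 seconds
print(per_event_budget_seconds(10, 1))
```

A workflow with a budget of several seconds per event can afford lookups and extra steps; one with a tenth of a second cannot, unless it is designed to run many instances in parallel.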

Therefore, knowing the scope and requirements leads directly to how the workflow should be designed. The actual limitations will only truly be found by testing. It's obviously better to find a limitation during testing, when it may still be possible to modify the scope, than in production.

However, something out of the ordinary can occur, and it will. Say some network maintenance prevents events being synced for some time: a workflow designed to handle 10 events per minute may suddenly have to handle thousands of events per minute for a short duration. How will the workflow handle that? What is the maximum number of events per second (EPS) before performance degrades? How can this overload affect other workflows on the system? What is the Plan B for this scenario?
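
A quick back-of-the-envelope calculation shows why this scenario matters. The sketch below assumes events keep arriving at the normal rate while the backlog is being worked off, and that the tested maximum throughput is known; all figures are made up for illustration.

```python
def backlog_drain_minutes(outage_minutes, arrival_per_min, capacity_per_min):
    """Estimate how long it takes to clear events queued during an outage.

    Returns None when the workflow has no spare capacity, meaning the
    backlog would never clear without a Plan B.
    """
    backlog = outage_minutes * arrival_per_min
    spare = capacity_per_min - arrival_per_min  # capacity left for the backlog
    if spare <= 0:
        return None
    return backlog / spare


# A 60 minute outage at 10 events per minute queues 600 events.
# With a tested maximum of 30 events per minute, 20 per minute are spare.
print(backlog_drain_minutes(60, 10, 30))

# A workflow that only just meets its requirement never catches up.
print(backlog_drain_minutes(60, 10, 10))
```

In the first case the backlog clears in about half an hour; in the second it never does, which is the situation a Plan B has to cover.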

It is also impossible to predict the quality of a workflow. Two people can design the same workflow and end up with completely different operating characteristics, so it's not possible to say Aegis has a limitation of running X workflows simultaneously.

Good questions to answer in the documentation relating to Load Limitation:

Is a maximum number of simultaneous runs set for the workflow?
Is there a possibility of loops with a large number of iterations? Is this mitigated?
How many workflows can run simultaneously before performance is affected?
Will this be affected if other workflows are in production?
What will happen if the workflow runs outside the limits of its design?
What is the decided data grooming strategy?
What are the limiting factors of the workitem (if any)? - e.g. connections to an external database etc.
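
As one example of mitigating the loop question above, a bulk-processing loop can be given a hard iteration cap so that an unexpectedly large input cannot keep a single run busy indefinitely. This is a generic sketch, not Aegis functionality; `process_bounded`, `handle`, and the cap value are hypothetical.

```python
def process_bounded(items, handle, max_iterations=1000):
    """Handle at most max_iterations items and defer the rest.

    Deferred items are returned so they can be picked up by a later run
    instead of being processed in one very long loop.
    """
    processed = 0
    deferred = []
    for item in items:
        if processed >= max_iterations:
            deferred.append(item)  # over the cap: leave for a later run
            continue
        handle(item)
        processed += 1
    return processed, deferred


handled = []
count, leftover = process_bounded(range(5), handled.append, max_iterations=3)
print(count, leftover)
```

The cap itself then becomes a documented load limitation: the documentation can state both the value and what happens to the deferred items.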

Plan B workflows are what prevent the system from being overloaded. With the best will in the world, a workflow developer probably isn't going to be able to design a workflow which completely handles thousands of events per second in the same way as if there are 5 events per second, but they can probably have a Plan B workflow to take over which at least records those events so they can be handled fully later by the main workflow.
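
The idea can be sketched in a few lines: while the event rate stays within the tested limit, events go to the main handling; once it is exceeded, events are only recorded for later. This is a simplified illustration under stated assumptions, not how Aegis implements anything. `PlanBDispatcher`, its parameters, and the in-memory backlog are all hypothetical; a real Plan B would record events somewhere durable.

```python
from collections import deque


class PlanBDispatcher:
    """Route events to full handling, or just record them when overloaded."""

    def __init__(self, max_events_per_window, window_seconds=1.0):
        self.max_events = max_events_per_window
        self.window = window_seconds
        self.recent = deque()  # timestamps of events handled in full
        self.backlog = []      # events recorded for later handling

    def dispatch(self, event, now, handle_fully):
        # Forget timestamps that have fallen out of the time window
        while self.recent and now - self.recent[0] >= self.window:
            self.recent.popleft()
        if len(self.recent) < self.max_events:
            self.recent.append(now)
            handle_fully(event)
            return "main"
        # Over the limit: Plan B only records the event
        self.backlog.append(event)
        return "plan_b"


dispatcher = PlanBDispatcher(max_events_per_window=2, window_seconds=1.0)
handled = []
# Four events arrive at the same instant; only two are handled in full
routes = [dispatcher.dispatch(n, 0.0, handled.append) for n in range(4)]
# Later, the load is back to normal
later = dispatcher.dispatch(4, 1.5, handled.append)
print(routes, later, dispatcher.backlog)
```

The timestamp is passed in rather than read from the clock so the behaviour is easy to test with a simulated burst.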

 

 
Last update: 2015-11-20 01:06