Mark_RB Frequent Contributor.

OO 10.51 - Managing load & Connectivity - Central & RASes

Hi,

I would like some clarification on how and when RAS servers are used to balance workload. We have the following configuration:

 2016-08-03_15h59_53.png

The Windows RAS servers are in a subnet without connectivity (except to the other OO servers).

In my testing, it seems that because the two Windows RAS servers are the only members of RAS_Operator_Path, they are doing all of the processing. I confirmed this by creating a simple operator (scriptletResult = java.net.InetAddress.getLocalHost();). Running it shows that it always seems to run on one of these RASes.

I have done a little experimentation by adding the other servers into RAS_Operator_Path, and it appears that sometimes operations will run on these RASs, sometimes on the other servers.

I understand that I could override an operator to use a different group, but could somebody explain:

  1. If all servers are in the RAS_Operator_Path, what would make Central determine which worker does the work, and when does it decide on placement? i.e. is it per flow run, per flow, per operation, etc.
  2. I believe that the configuration illustrated above is sub-optimal: the information in the documentation makes me believe that RAS_Operator_Path should be enabled on all servers EXCEPT the Windows RASes, due to the connectivity issues with these. Is my analysis of this correct?
  3. Apart from that, does the configuration above look correct? Is there anything else that needs to be taken into account?
  4. If I have to stick with the current configuration (I did not implement this design), how do I go about overriding RASes? I have created a Group Alias in my content packs for the Win_RAS & Linux_RAS groups and have set them up as per the documentation, but what is the best way of overriding the configuration, and should I do it for everything, or just where connectivity is needed?

 

Thank you in advance,

 

Mark.

 


Accepted Solutions
michalhugim Absent Member.

Hi Mark,

Let me try to answer your important questions:

  1. If all servers are in the RAS_Operator_Path, what would make Central determine which worker does the work, and when does it decide on placement? i.e. is it per flow run, per flow, per operation, etc.
    A: This is the preferred setting (but see #2 as well). Many OO operations are assigned to RAS_Operator_Path, so in general it is good for all workers to belong to this group; OO then balances the work according to an internal algorithm. Placement can be decided at an even finer granularity than the operation, but where the group assignments allow it, a run stays on the same worker for as long as possible, to optimize throughput.
  2. I believe that the configuration illustrated above is sub-optimal: the information in the documentation makes me believe that RAS_Operator_Path should be enabled on all servers EXCEPT the Windows RASes, due to the connectivity issues with these. Is my analysis of this correct?
    A: I believe it depends on whether these Windows RASes belong to other groups (like "Win_RAS") that are used in many operations. If they do, you had better keep them in "RAS_Operator_Path" as well; otherwise, the flow will ping-pong more between workers.

  3. Apart from that, does the configuration above look correct? Is there anything else that needs to be taken into account?
    A:
    1) Actually, I think you have redundant RASes. In OO 10.x, each Central node contains an out-of-the-box internal worker (like a RAS, except that a RAS is remote). I can't see the IPs, but if you have a Central and a RAS on the same host, you most likely don't need that RAS, and it might actually reduce your throughput. In that case, remove the RAS.
    2) After removing the RASes noted in 1), do you still have other RASes? If so, check that you really need them. RASes are now needed only for special use cases: crossing a firewall, special prerequisites, and remote datacenters with network latency (you don't want to put a Central there, because it would have trouble reaching the DB).

  4. If I have to stick with the current configuration (I did not implement this design), how do I go about overriding RASes? I have created a Group Alias in my content packs for the Win_RAS & Linux_RAS groups and have set them up as per the documentation, but what is the best way of overriding the configuration, and should I do it for everything, or just where connectivity is needed?
    A: I didn't understand this one. It sounds as though you defined it well. Can you explain the need?

Regards,

Michal.

lrevnic Absent Member.

Hi,

4. If I understood correctly, you need one small extra step.

Make sure the overrideJRAS variable is set to Linux_RAS. That way, at run time the Java operations will be executed on the workers in the Linux_RAS group. By default, as you saw, the content executes in the RAS_Operator_Path group.

You can set the overrideJRAS variable dynamically at run time using a Set Flow Variable step, or statically (e.g. by creating a system property with the name overrideJRAS and the value Linux_RAS).

This is also described in the Studio Guide, in the RAS Group Overrides section.
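
As a rough illustration of the lookup described above, here is a toy Python model (not OO code). The function name and the two dictionaries are hypothetical, and the precedence shown (flow variable before system property) is an assumption: the thread does not say which one wins when both are set.

```python
from typing import Mapping

# Default group that content executes in, as described in the thread.
DEFAULT_GROUP = "RAS_Operator_Path"


def resolve_worker_group(flow_vars: Mapping[str, str],
                         system_props: Mapping[str, str]) -> str:
    """Toy model: pick the worker group for Java operations.

    ASSUMPTION: a flow variable takes precedence over a system property.
    The real precedence is not stated in this thread.
    """
    for source in (flow_vars, system_props):
        value = source.get("overrideJRAS")
        if value:
            return value
    return DEFAULT_GROUP


# No override anywhere: fall back to the default group.
print(resolve_worker_group({}, {}))
# Static override through a system property.
print(resolve_worker_group({}, {"overrideJRAS": "Linux_RAS"}))
```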

HTH,

Lucian

Mark_RB Frequent Contributor.

Hi Michal & Lucian,

Thank you both very much for the replies; your answers agree with what I am observing.

Lucian, I overrode the worker group as you suggested and this greatly improved the performance, as previously every operation was running on the RAS unnecessarily.

Michal, your guess that the Central server also has a RAS is correct. I will look to get this addressed with the owners of the Central servers.

I now have a more sensible configuration on a test cluster, and I can see it offloading to its own RAS, which takes a comparatively long time when the server is heavily loaded (out-of-the-box configuration, 60 concurrent flows across 2 Central nodes).
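
A back-of-the-envelope check on those numbers, as a sketch only: the tuning guide excerpt quoted later in this thread gives a default of 20 worker threads per node. The code below assumes, as a simplification the thread does not confirm, that each running flow occupies one worker thread at a time. The function names are illustrative.

```python
import math


def queued_flows(concurrent_flows: int, nodes: int,
                 threads_per_node: int = 20) -> int:
    """Flows that cannot start immediately because every thread is busy."""
    return max(0, concurrent_flows - nodes * threads_per_node)


def waves_needed(concurrent_flows: int, nodes: int,
                 threads_per_node: int = 20) -> int:
    """Rounds of execution needed if all flows take equally long."""
    return math.ceil(concurrent_flows / (nodes * threads_per_node))


# 60 flows on 2 nodes with the default 20 threads each.
print(queued_flows(60, 2), waves_needed(60, 2))
# Same load with 200 threads per node, the example figure from the guide.
print(queued_flows(60, 2, 200), waves_needed(60, 2, 200))
```

Under that assumption, 2 nodes offer 40 threads, so 20 of the 60 flows would wait for a second round.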

I feel that further worker group changes and a bit of performance tweaking are required, but after that it will be fit for purpose.

I do have one further question: when Central offloads work to a RAS, does the RAS take a 'chunk' of work, such as an entire flow or a set of steps, process it all, and then return the result to the Central server? Or does Central have to provide it with the step and the environment (i.e. the flow variables), and then wait for it to return the result before passing it another step? Both seem like heavy set-up operations, particularly if it is performed operator by operator.

Kind Regards,

 

Mark.

michalhugim Absent Member.

Mark,

Regarding your last question: in general, a given worker proceeds with the flow, step after step, for as long as it can. If it reaches a step that is assigned to a group the worker doesn't belong to, it "returns" the run to the Central queue.
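
To make that behaviour concrete, here is a small Python simulation. It is a toy model, not OO internals: the worker names and the step sequence are made up, and picking the alphabetically first eligible worker is only a deterministic stand-in for OO's internal balancing algorithm.

```python
from typing import Dict, List, Optional, Set


def count_handoffs(step_groups: List[str],
                   membership: Dict[str, Set[str]]) -> int:
    """Count how often a run goes back to the queue to change worker."""
    current: Optional[str] = None
    handoffs = 0
    for group in step_groups:
        eligible = membership[group]
        if current in eligible:
            continue  # the same worker keeps going with the next step
        if current is not None:
            handoffs += 1  # step needs a group this worker is not in
        current = min(eligible)  # stand-in for OO's internal balancing
    return handoffs


# Hypothetical flow that alternates between default and Windows-only steps.
steps = ["RAS_Operator_Path", "Win_RAS", "RAS_Operator_Path",
         "Win_RAS", "RAS_Operator_Path"]

# Windows RASes kept out of RAS_Operator_Path.
separate = {"RAS_Operator_Path": {"central1", "central2"},
            "Win_RAS": {"winras1", "winras2"}}
# Windows RASes also members of RAS_Operator_Path.
shared = {"RAS_Operator_Path": {"central1", "central2", "winras1", "winras2"},
          "Win_RAS": {"winras1", "winras2"}}

print(count_handoffs(steps, separate))  # 4
print(count_handoffs(steps, shared))    # 1
```

In this model the run changes worker four times when the groups are disjoint and only once when the Windows RASes are also in RAS_Operator_Path, which is the "ping-pong" effect mentioned earlier in the thread. It ignores connectivity, so it says nothing about whether the Windows RASes can actually complete the default-group steps.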

Can you explain what you wrote to Lucian, "I overrode the worker group as you suggested and this greatly improved the performance"? What exactly did you set? (Just for my own knowledge :-))

Thanks,

Michal.

Mark_RB Frequent Contributor.

Hi Michal,

Thanks for the feedback, it seems to be in line with what I have observed.

So, I did two things:

  1. I set the overrideJRAS variable to point to a Group Alias, which ensured that it was using the Central servers. That greatly improved the performance I was seeing.
  2. I have (with the aid of this post) convinced the team that provides my OO infrastructure that it was incorrectly configured, so now only Central is in the RAS_Operator_Path.

 

I am now investigating the best performance settings for our OO environment, and I have built some PowerShell code to measure its performance. Our use case has changed between OO 9 and OO 10: previously we would have one flow that split into a number of parallel steps, and we would often have multiple flows running with various numbers of parallel steps. Because of the way that parallel execution has changed in OO 10, we now run these as separate flows, using an external job queue, which gives us far more flexibility, scalability, reliability, etc. The only issue I have is identifying how many flows are too many for our environment, and the OO Tuning guide is pretty dire. As an example:

By default, each HP OO node has 20 worker threads. If your flows have a large number of parallel or
multi-instance lanes, or if you trigger a large number of flows simultaneously, we recommend
increasing this number. For example, you might increase this number to 200 threads per worker or
Central.

Note: The number of threads that can be configured is dependent on the amount of memory
available to the Central or worker.

It only tells you what you can change; it does not give details on maximums, or recommendations on how to judge them. As a result, I have constructed a test flow that causes CPU load, defines variables, etc. I can increase the number of iterations it runs, and we can control how many run in parallel on the cluster. By examining how long the flows run, I am tweaking the variables to see how much parallel performance I can get from our environment. It will be interesting to see, but it appears that our new environment (10.51 Central servers on Linux with an Oracle backend) will scale far better than the 9.07 Central servers with a SQL Server backend.
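
The measurement loop described above could be sketched roughly as follows. This is Python rather than the PowerShell actually used, and it is only an outline: fake_flow is a placeholder for "trigger the test flow and wait for it to finish", and no real OO API call is shown or implied.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def measure(run_flow: Callable[[], None],
            concurrency_levels: Iterable[int]) -> Dict[int, float]:
    """Start `level` flows at once; record seconds until all have finished."""
    results: Dict[int, float] = {}
    for level in concurrency_levels:
        started = time.perf_counter()
        with ThreadPoolExecutor(max_workers=level) as pool:
            futures = [pool.submit(run_flow) for _ in range(level)]
            for future in futures:
                future.result()  # re-raises any error from a flow run
        results[level] = time.perf_counter() - started
    return results


def fake_flow() -> None:
    # Placeholder: a real harness would trigger the test flow and wait for it.
    time.sleep(0.01)


for level, seconds in measure(fake_flow, [1, 5, 10]).items():
    print(f"{level:>3} concurrent flows: {seconds:.3f}s total")
```

Plotting total time against the concurrency level should show the point where adding more parallel flows stops helping, which is the figure the tuning guide leaves out.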

 

Regards,

 

Mark.
