For the past couple of days, the customer has been unable to launch flows from CSA because OO is overloaded. The following error is seen:
Could not access HTTP invoker remote service at [https://itcs-oo10-pro.xxx.xxx.xx.com:8443/oo/central-remoting/engineExecutionFacade]; nested exception is org.apache.http.NoHttpResponseException: The target server failed to respond
After looking into the OO logs trying to determine the problem, the customer saw that the OO Central nodes handled the requests as shown below:
Node – HTTP hits from CSA:
3889 – 8466
3892 – 38
3893 – 0
Node 3889 receives nearly all of the requests.
This environment is set up with a static load balancing method, while the customer has another environment that is set up as round robin. This environment also uses an F5 load balancer. How can this be resolved?
Research from F5 online resources points to several options that may be considered:
1 - Disable persistence, then test and monitor.
2 - If the current distribution is acceptable, leave the configuration as is.
3 - Check the F5 version to see if an update is available.
For reference, F5 documentation describes persistence as follows:
“A persistence profile allows a returning client to connect directly to the server to which it last connected. In some cases, assigning a persistence profile to a virtual server can create the appearance that the BIG-IP system is incorrectly distributing more requests to a particular server. However, when you enable a persistence profile for a virtual server, a returning client is allowed to bypass the load balancing method and connect directly to the pool member. As a result, the traffic load across pool members may be uneven, especially if the persistence profile is configured with a high time-out value.”
“This brings us back to the first question: How does the load balancer decide which host to send a connection request to? Each virtual server has a specific dedicated cluster of services (listing the hosts that offer that service) which makes up the list of possibilities. Additionally, the health monitoring modifies that list to make a list of "currently available" hosts that provide the indicated service. It is this modified list from which the load balancer chooses the host that will receive a new connection. Deciding the exact host depends on the load balancing algorithm associated with that particular cluster. The most common is simple round-robin where the load balancer simply goes down the list starting at the top and allocates each new connection to the next host; when it reaches the bottom of the list, it simply starts again at the top. While this is simple and very predictable, it assumes that all connections will have a similar load and duration on the back-end host, which is not always true. More advanced algorithms use things like current-connection counts, host utilization, and even real-world response times for existing traffic to the host in order to pick the most appropriate host from the available cluster services.”
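The selection process described above — health monitoring first narrows the configured list down to the currently available members, then round-robin walks that list top to bottom and wraps around — can be sketched as follows (an illustrative Python toy only; the class and method names are invented for this sketch and are not an F5 API):

```python
class RoundRobinPool:
    """Toy model of the round-robin selection described above."""

    def __init__(self, members):
        self.members = members  # the configured pool, e.g. the OO Central nodes
        self.i = 0              # position of the next member to use

    def next_member(self, health=lambda m: True):
        # Health monitoring narrows the configured list to the members
        # that are currently available before a choice is made.
        available = [m for m in self.members if health(m)]
        member = available[self.i % len(available)]
        self.i += 1             # walk down the list; wrap at the bottom
        return member

pool = RoundRobinPool(["3889", "3892", "3893"])
print([pool.next_member() for _ in range(6)])
# → ['3889', '3892', '3893', '3889', '3892', '3893']
```

As the quoted text notes, this is simple and predictable but assumes every connection puts a similar load on the back-end host.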
Certain static load balancing methods are designed to distribute traffic evenly across pool members. For example, the Round Robin load balancing method causes the BIG-IP system to send each incoming request to the next available member of the pool, thereby distributing requests evenly across the servers in the pool. However, when a static load balancing method such as Round Robin is used along with a BIG-IP configuration object that affects load distribution, such as a OneConnect profile or a persistence profile, traffic may not be evenly distributed across BIG-IP pool members as expected.
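The skew described above can be demonstrated with a small simulation (a toy sketch, not F5 behavior: one "sticky" client stands in for CSA traffic arriving from a single source address, and the persistence entries here never expire, i.e. a very high time-out value):

```python
import random
from collections import Counter

def simulate(requests, members, persist):
    table = {}          # client -> pool member (the persistence table)
    counts = Counter()  # requests handled per member
    i = 0               # round-robin position
    for client in requests:
        if persist and client in table:
            member = table[client]              # persistence hit: bypass balancing
        else:
            member = members[i % len(members)]  # round-robin pick
            i += 1
            table[client] = member
        counts[member] += 1
    return counts

random.seed(0)
# One chatty client plus many one-off clients.
requests = ["csa"] * 900 + [f"c{n}" for n in range(100)]
random.shuffle(requests)

print(simulate(requests, ["3889", "3892", "3893"], persist=True))
print(simulate(requests, ["3889", "3892", "3893"], persist=False))
```

With persistence enabled, whichever member the chatty client first lands on receives over 900 of the 1000 requests; with persistence disabled, the counts come out essentially even — the same pattern of skew seen on node 3889.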
Please see the knowledge document at https://softwaresupport.hpe.com/km/KM02675008