'Muted Mode' reported as '200 OK' to load balancer
We have a large QC installation, which uses a load balancer to distribute users between servers. The load balancer periodically polls each server (http://server:port/qcbin/tdservlet) to make sure it is still 'healthy'. When it detects that a server is unhealthy, it diverts users to the remaining healthy servers.
Unfortunately, we have found that when ALM/QC encounters a serious error and enters 'muted' mode, it continues to return the status line "HTTP/1.1 200 OK". This is the same code that it returns when running normally. The load balancer only looks at the status line, so it thinks the server is still healthy.
Is there another way that we can detect that the server is down (without using another component between the load balancer and QC)? Is there a different URL which is more appropriate for us to use?
Should QC use an HTTP 5xx status code for muted mode?
Wish you a nice day.
As I understand it, when the ALM server enters muted mode, the web server (Jetty) is still available.
All response codes are sent by the web server, and this is the designed behavior.
From my perspective, it will return ‘500 Internal Server Error’ only when Jetty is not available,
for example when the services are stopped or Jetty is not configured properly.
Anyway, I think it would be better if you opened a support ticket via SSO so our engineers can investigate your issue more closely.
I hope my answer is helpful to you.
Just to let you know that we already have an ER for this. Please check it here:
https://softwaresupport.hp.com/group/softwaresupport/search-result/-/facetsearch/document/LID/QCCR1J78593 - Incorrect http response code in muted mode
Until this is implemented (most probably in a new version of the product, since it would involve a major change and may affect many current users), we suggest the following workaround:
- patch 01 for ALM12.01
When I last tested this, the server could still enter muted mode (and return HTTP 200 OK for all requests) with MUTE_SERVER_FOR_OOME set to N.
Our current workaround for this issue is for the load balancer to make a request which normally returns an HTTP error. If it returns 200, then the load balancer treats the server as being down.
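For anyone wanting to try the same idea outside the load balancer, here is a rough sketch of the inverted check in Python. It is only an illustration under assumptions: the function names are made up for this example, the probe URL is a placeholder for whatever request returns an HTTP error on your healthy servers, and treating an unreachable server as down is my own addition.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status code is on the exception.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def is_healthy(probe_status):
    """Inverted check: the probe request is expected to fail on a healthy server.

    A muted server answers 200 OK to everything, so 200 means "down".
    Assumption: no response at all (None) is also treated as "down".
    """
    if probe_status is None:
        return False
    return probe_status != 200
```

A monitoring script would call something like is_healthy(fetch_status(probe_url)), where probe_url is a hypothetical URL on the QC server that you have confirmed returns an error status in normal operation.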
As I understand it, setting MUTE_SERVER_FOR_OOME to N disables muted mode for Out Of Memory situations only.
For other causes, such as DB connection or repository issues, ALM may still enter muted mode.
Your current workaround is a good approach while R&D is working on this.
Thanks and Regards,