Trusted Contributor

Users unable to log into ALM, faced error: "ALM failed to retrieve authentication data"

Hi,

Wondering if anyone came across this before and can shed some light on what may have caused it:

Yesterday, all users who attempted to log into ALM were met with the error below:
“ALM failed to retrieve authentication data”

We have two Windows ALM servers (12.21 patch 7) in high availability with LDAP authentication in place. Users met the same error when they hit either node.

Thankfully, after we restarted the ALM service on both ALM servers, users were able to access ALM successfully. We would still like to understand what happened here.

QC logs mention:
Invalid request: Remote host: xx.xx.xx.xx, Meta Data: [Function Name: Logout, Login Session ID: 21510287, Project Session ID: 10038317, Call ID: 45]. Error: Failed to obtain a connection to schema 'qcsiteadmin_xxxxxxx' - timeout expired.
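
For anyone trying to see how often this happens and against which schema, here is a minimal sketch that counts these timeout messages in a set of log lines. The pattern is copied from the log excerpts quoted in this thread; the function name `count_schema_timeouts` is made up for illustration, and the location and layout of the QC/SA log files on your servers is not assumed here, so you would need to feed it lines from your own files.

```python
import re
from collections import Counter

# Message text as it appears in the log excerpts quoted in this thread.
TIMEOUT_RE = re.compile(
    r"Failed to obtain a connection to schema '([^']+)' - timeout expired"
)


def count_schema_timeouts(lines):
    """Return a Counter mapping schema name -> number of timeout messages.

    `lines` is any iterable of strings, e.g. an open log file.
    (Hypothetical helper, not part of ALM.)
    """
    counts = Counter()
    for line in lines:
        for schema in TIMEOUT_RE.findall(line):
            counts[schema] += 1
    return counts
```

Passing an open log file (for example `count_schema_timeouts(open(path, errors="replace"))`, where `path` is wherever your logs live) gives a per-schema count that can be compared between the two nodes.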

Any insight is appreciated.

Outstanding Contributor

Were you able to connect to the database when the issue happened?

I would think it's a database issue.

Outstanding Contributor

The issue reoccurred this morning. Users were unable to log in, and both application server nodes returned the same error message:

"ALM failed to retrieve authentication data"

The SA logs show the following error:

"Exception thrown when executing the job I am alive

com.hp.alm.platform.exception.CTdException

Messages:
Failed to set application server's last touch time.; Failed to obtain a connection to schema 'qcsiteadmin_db_tcoe_ALM1220' - timeout expired;"

We were able to resolve it by restarting the ALM service on both application server nodes, but it is very concerning that this is happening randomly in our Production environment.

Our DBAs checked the database server for the time of the issue and did not find anything of concern.

 

 

Knowledge Partner

@cmcn2016,

It looks like the database server, repository server and application servers were not rebooted in the correct order (one after another) after weekend maintenance, which is why you are facing this issue. The issue occurs because the DB server might not be fully up at the time the application service needs it to be available.

Ideally, your database server and application server should not be restarted at the same time.

The application server should restart only after both the database server and the repository server have been restarted and are available (up and online).

What you can do is schedule the database server patching maintenance a week before the application server patching. That way the servers are on different weekly schedules and the outage can be avoided.
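
The ordering rule above (start the application server only once the database is reachable) could be enforced with a small gate in a startup script. This is only a sketch: `wait_for_tcp` is a hypothetical helper, the host and port are whatever your DB listener uses, and a successful TCP connect only shows that the listener is up, not that the site admin schema is fully available. Starting the ALM service itself is left out because the service name depends on your installation.

```python
import socket
import time


def wait_for_tcp(host, port, timeout_s=300.0, interval_s=5.0):
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval_s` seconds and returns False if no
    connection succeeds within `timeout_s` seconds overall.
    (Hypothetical helper for illustration.)
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Open and immediately close a connection to the listener.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            time.sleep(min(interval_s, remaining))
```

A startup script would call this with the DB host and listener port and only go on to start the application service when it returns True.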

BR, Srihari
Outstanding Contributor

I checked the server boot times as you suggested.

DB Server: 06/21/2020 7PM

App Server 1: 06/20/2020 10PM

App Server 2: 06/20/2020 10:30PM

Repo is on a NAS share. Server was not rebooted.

However, this issue occurred on Friday June 19th, before any servers were patched and rebooted.
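
To make the timeline explicit, here is a small sketch that compares the boot times listed above with the date of the first incident. The values are copied from this post; the year 2020 for the incident date is an assumption based on the other dates, and no time zone is known.

```python
from datetime import date, datetime

# Boot times as listed in this post (local time, time zone unknown).
boot_times = {
    "DB Server": datetime(2020, 6, 21, 19, 0),
    "App Server 1": datetime(2020, 6, 20, 22, 0),
    "App Server 2": datetime(2020, 6, 20, 22, 30),
}

# Only the day of the first incident is given; the year is assumed.
first_incident_day = date(2020, 6, 19)

# Servers whose last boot falls on or before the first incident day.
rebooted_by_incident = [
    name for name, booted in boot_times.items()
    if booted.date() <= first_incident_day
]

# Application servers that booted before the DB server's last boot.
apps_before_db = [
    name for name, booted in boot_times.items()
    if name != "DB Server" and booted < boot_times["DB Server"]
]

print(rebooted_by_incident)  # [] : no reboot preceded the first incident
print(apps_before_db)        # ['App Server 1', 'App Server 2']
```

The first list is empty, which matches the point above that the first incident predates all of the reboots. The second list only records the order of the later reboots.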

Outstanding Contributor

So the issue seems to be that, on random occasions, the application loses its connection to the qcsiteadmin_db.

Our systems support team and DBAs would definitely notify us if the DB server had lost network connectivity or been unavailable.

We use Dynatrace monitoring on all of our servers, and I cannot see anything in there for this morning, when the latest instance of this issue occurred.

Micro Focus Expert

Yes, true. From the error log it seems to be a random issue with the connection to the SA schema on your DB server.

Please check the stability of the DB server's network connection. Another thing to check is whether your DB server is undergoing maintenance work that affects the availability of that specific schema.
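
One rough way to check network stability from the application server side is to time repeated TCP connects to the DB listener and look for failures or slow connects around the time of an incident. The sketch below is an illustration only: `probe_once` and `summarize` are hypothetical helpers, the host and port are placeholders for your own DB listener, and a clean result does not rule out problems inside the database or the ALM connection pool.

```python
import socket
import time


def probe_once(host, port, timeout_s=3.0):
    """Attempt one TCP connect; return (ok, seconds_elapsed)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            ok = True
    except OSError:
        ok = False
    return ok, time.monotonic() - start


def summarize(samples):
    """Summarize a list of (ok, seconds_elapsed) samples."""
    ok_times = [secs for ok, secs in samples if ok]
    return {
        "attempts": len(samples),
        "failures": sum(1 for ok, _ in samples if not ok),
        # Slowest successful connect, or None if every attempt failed.
        "max_connect_s": max(ok_times) if ok_times else None,
    }
```

Running `probe_once` on a schedule from each application node and passing the collected samples to `summarize` gives a simple failure count and worst-case connect time per node.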
