Idea ID: 2874143

AA's OSP aka OAuth webservices lack client session resiliency among cluster members

Status : New Idea

Client sessions with Advanced Authentication’s (AA) OSP feature, which supports NAM’s “Generic” (OAuth) method of integration, are not resilient across cluster members.

In our case, and likely that of many other customers, we have two AA Webservers behind a load balancer. We also have two NAM IDP nodes behind a load balancer. When a user visits the NAM IDP, the NAM Contract requires the client session to first satisfy a NAM-local “Secure Form” Method (simple username + password over HTTPS). Once that succeeds, the next NAM Method in the Contract is the AA Generic (OAuth) integration, so the NAM IDP redirects to the AA OSP via the F5 load-balancer address in front of the two AA Webservers, where the user is challenged for their second factor (e.g. Email, SMS, Voice OTP). All of this is fully configured and working perfectly.

For the first part of the client session, on the NAM IDP itself, let’s say the username+password screen is served from NAM IDP “Node 1” through the IDP’s own load balancer. If NAM IDP “Node 1” goes down before the user finishes typing and submitting their username+password, the load balancer transparently connects that client session to NAM IDP “Node 2.” Because NAM’s "Session Cookie Brokering" component synchronizes client sessions and cookies between cluster members in real time, the client session continues transparently: “Node 2” validates the username+password despite the mid-session node failure and, if successful, redirects the client to the AA Webserver’s OSP/OAuth integration for the next required factor of authentication.

The AA Webserver’s OSP/OAuth integrated authentication lacks the enterprise-class resilience to a mid-session node failure that NAM’s IDP provides, as described above. Today, suppose a client session is redirected by the NAM IDP to the AA Webservers’ load-balanced URL, and AA Webserver “Node 1” serves the Chains selection screen and then prompts for authentication (Smartphone/Email/SMS/Voice OTP/etc). While the client session is waiting for the user to type in the OTP code or accept the push notification, AA Webserver “Node 1” goes down. When the user clicks submit on the OTP screen or taps accept on the Smartphone, the F5 load balancer in front of the AA Webservers transparently redirects the client session to the surviving AA Webserver “Node 2.” Unlike NAM, AA has not been architected to synchronize session cookies (or equivalent session state) between peer AA Webservers in the cluster, so the client session is interrupted.
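
To illustrate the gap, here is a minimal Python sketch of the failure scenario described above. It is a toy model, not AA's or NAM's actual implementation: the `Node` class, the `route` load balancer, the session ID, and the OTP value are all hypothetical names invented for illustration. With node-local state the surviving node has no record of the pending challenge; with a shared store (standing in for whatever replication or brokering mechanism AA might adopt) verification still succeeds after failover.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical web server node holding pending OTP challenges."""
    name: str
    store: dict = field(default_factory=dict)  # session_id -> expected OTP
    up: bool = True

    def start_challenge(self, session_id: str, otp: str) -> None:
        # Record the pending challenge in whatever store this node uses.
        self.store[session_id] = otp

    def verify(self, session_id: str, otp: str) -> bool:
        # A node that never saw the challenge cannot verify it.
        return self.store.get(session_id) == otp


def route(nodes):
    """Toy load balancer: return the first node that is still up."""
    return next(n for n in nodes if n.up)


def simulate(shared: bool) -> bool:
    """Start a challenge on node 1, fail node 1, verify on the survivor."""
    common = {}  # stands in for replicated or externalized session state
    nodes = [
        Node("node1", common if shared else {}),
        Node("node2", common if shared else {}),
    ]
    route(nodes).start_challenge("sess-1", "123456")
    nodes[0].up = False  # node 1 goes down while the user is typing the OTP
    return route(nodes).verify("sess-1", "123456")


if __name__ == "__main__":
    print("node-local state survives failover:", simulate(shared=False))
    print("shared state survives failover:", simulate(shared=True))
```

In this sketch the only difference between the two runs is where the pending challenge lives, which is the essence of the request: the session state needs to outlive the node that created it.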

The AA team surely has access to the NAM source code and NAM developers, so the AA product leadership should exploit this internal relationship within Micro Focus to strategically improve the cyber resilience of the AA product. No need to completely reinvent the wheel, and even if the Session Brokering code from NAM isn't 100% transportable, there'd be plenty of lessons that could be learned about how to more quickly implement a from-scratch version that suits AA properly.

Labels:

AAF
Cluster Services
Configuration
Idea
Other
Status
timeout