satellite failover

Can managed nodes be configured with multiple gateways to provide load balancing and/or failover?

  • Hi,


    Do you mean load balancing between SA managed servers, or between SA cores/slices?





  • Between SA cores, you could use an egress filter to enable a redundant connection to another core.

    That way, for example, if one of the cores went down, managed servers would switch to another core.

    This filter should be set on the satellite.





  • I mean load balancing or failover for communication between the managed nodes and the core: if one satellite were unavailable, a second satellite would maintain communication with the core. And how is this configured? Through the agent's opswgw.args file?


  • The gateway network and agents have a built-in feature that addresses load balancing and failover. To be meaningful, it requires at least two satellite gateways in the same realm/facility. Let's say a managed server is in a facility that is connected to the core via two satellite gateways. Soon after a managed server has registered with the mesh for the very first time, the satellite gateway network will provide the agent with a list of all the satellite gateways in the facility. The agent will automatically update its opswgw.args file with that list. So, shortly after registration, each agent knows about all the gateways it can talk to.


    Now, should one of those satellite gateways drop off the gateway network, either the core or the agent will notice the next time it tries to communicate with the other side. That is, if the core tries to send an update to the agent, it will notice that the usual path is blocked. Similarly, if the agent tries to update the core, it will notice that the usual path through the gateway network is blocked. The side that notices first will then cycle through the list of gateways it already knows about until it finds one that can reach the other side. It will make a note of the change and continue with the conversation.


    This is grossly simplified, and actual conditions and logic around this are quite complex. But the general idea is that in a properly configured gateway network, both the core side and the managed server side have access to the list of all the gateways that could be used to communicate with the other side. They pick one to start communicating, and use it until something stops working. They then round-robin through the rest of the list until they can communicate again, and continue using the new gateway until it stops working for some reason. They will continue cycling through the list until they succeed.
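
    The "use one gateway until it fails, then round-robin through the rest" behavior described above can be sketched roughly as follows. This is only an illustration of the general idea, not SA's actual implementation: the GatewaySelector class, the try_send callback, and the gateway addresses are all hypothetical names invented for this example. The sketch makes a single pass over the list per call and leaves repeated retries to the caller.

```python
from typing import Callable, List, Optional


class GatewaySelector:
    """Sticky gateway choice with round-robin failover (illustrative only)."""

    def __init__(self, gateways: List[str]) -> None:
        if not gateways:
            raise ValueError("at least one gateway is required")
        self.gateways = list(gateways)
        self.current = 0  # index of the gateway currently in use

    def send(self, message: str,
             try_send: Callable[[str, str], bool]) -> Optional[str]:
        """Try the current gateway first, then each other gateway in order.

        try_send(gateway, message) is a hypothetical transport callback
        that returns True when the message got through.
        Returns the gateway that worked, or None if all of them failed.
        """
        for offset in range(len(self.gateways)):
            index = (self.current + offset) % len(self.gateways)
            gateway = self.gateways[index]
            if try_send(gateway, message):
                self.current = index  # keep using this gateway from now on
                return gateway
        return None


# Usage: simulate one satellite gateway being unreachable.
unreachable = {"sat-gw-1.example:3001"}  # made-up addresses


def fake_send(gateway: str, message: str) -> bool:
    return gateway not in unreachable


selector = GatewaySelector(["sat-gw-1.example:3001", "sat-gw-2.example:3001"])
print(selector.send("status update", fake_send))  # sat-gw-2.example:3001
```

    After the failover, the selector keeps using the second gateway until it too stops responding, which mirrors the "make a note of the change" step described earlier.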


    Configuring a fault-resistant gateway network between the cores and managed servers can be tricky. It requires some planning and careful configuration of the satellite gateways themselves. We do have customers with complex, fault-tolerant systems that allow for failover between satellites, and also for failover should an entire facility or data center go offline. That level of detail is beyond the scope of what we can share here, but suffice it to say that it is done using the principles already mentioned. Typically, the gateway network is set up once and "just works" with no further configuration required; the core, satellites and agents handle the routing of information amongst themselves. With appropriate redundancy, the network can reduce the likelihood of single points of failure.


    Please let us know if this answers your question.


    Thanks for a great question, and best wishes,


    Bryce Ryan

  • I would like to have two satellite servers report to the same realm, but when I install a satellite, a unique facility and realm are automatically created. Did I make a mistake in the installation? Would a custom install of the satellite give me the option to select an existing realm? That's all the redundancy I'm looking for. It's my understanding that just having a multi-core mesh automatically provides core/slice redundancy if one core were unavailable. (My last post, I promise.)

  • Verified Answer

    To set this up, you'd need to be able to set the values for the satellite.gateway_name and satellite.realm_name fields in the interview questions. Based on my reading of the Standard/Advanced Installation Guide for Server Automation, that would require the "Expert Mode" interview when using the satellite installer.


    If you need further assistance, please open a support case, as setting up your configuration is not documented in the user guides.


    Thanks, and best wishes,


    Bryce Ryan