7.0 SP1 ESM HA installation failure with existing ESM
We were reconfiguring ESM HA with an existing ESM after a hardware failure on one of the servers. We have now replaced the server with a new one and followed all the steps in the ESM HA config guide successfully. While running the HA Module Installation Script on the primary system, we see an error message like "we have detected an installation of HA 18.104.22.168. upgrade not allowed on your current installation and press ok to exit".
ESM version used :
Any recommendation to fix the issue?
If you read the "Replacing a Server" section of ESM_HA_UsersGuide_7.0P1.pdf carefully, you will notice that no step tells you to run the HA Module Installation Script.
Once again, according to the HA document, the procedure is:
This topic describes how to use the First Boot Wizard to replace a server (for example, if it has a hardware problem).
Note that you need to bring down ESM during the installation on the new secondary – That means you should stop ESM services.
The procedure is given below:
1. Power down the server to be replaced. The other server will then become the primary.
2. Prepare the new server as described in "Installing HA with an Existing ESM" on page 30 – Prepared means that you should follow the recommendations to be able to sync it with the primary node.
Note: The new server may have different IP addresses and hostnames than the one it replaces, and there are manual steps to perform on this machine as the secondary. Make sure that the primary node knows about these changes, and modify the hosts file accordingly.
3. Stop ESM services on the primary (the server that will not be replaced) by running the following command as user root:
4. Run the First Boot Wizard as user arcsight on the primary (the server that will not be replaced) and specify the hostname or IP address for the new secondary system if it's different from the original.
Note: make sure that the secondary server (the new server) has the CentOS or RHEL repository configured, since the script will install extra rpm packages on the new server that are necessary for the HA module.
5. Restart ESM services as user root on the primary:
At this point, ESM should come up again on the primary system. The new server will become the secondary system. The synchronization process between the primary system and this new secondary system may take some time. See the "Planning for the Initial Disk Synchronization" on page 32 section for more information.
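As a quick sanity check for the note in step 2, a small Python sketch along these lines could confirm that the hosts file on the primary already lists the new secondary. This is only an illustration: the function names, hostnames and addresses are placeholders of mine, not values from the guide or from this thread, and the script only reads text and changes nothing.

```python
def parse_hosts(text):
    """Map each hostname/alias found in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry seen for a name
    return mapping


def check_entries(text, expected):
    """Return {hostname: (expected_ip, found_ip)} for each missing or different entry."""
    found = parse_hosts(text)
    return {
        name: (ip, found.get(name))
        for name, ip in expected.items()
        if found.get(name) != ip
    }


# Placeholder data: substitute the contents of /etc/hosts and your own nodes.
sample_hosts = """\
127.0.0.1   localhost
192.0.2.10  esm-primary
192.0.2.11  esm-secondary-old   # server that was replaced
"""
expected = {"esm-primary": "192.0.2.10", "esm-secondary-new": "192.0.2.12"}

for name, (want, got) in check_entries(sample_hosts, expected).items():
    print(f"{name}: expected {want}, hosts file has {got}")
```

With the placeholder data, only the new secondary is reported, because the sample hosts text still carries the entry for the replaced server.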
This should be all.
Once again, please pay attention to the steps needed to prepare the new server, starting from partitioning the disks through to creating the folder and user for the application. In the end, the new server should be similar both to the one you are replacing and to the one that is still working.
All the best,
I have done the exact steps you mentioned. The only difference is that in my case I have two ESMs that were previously in HA and now need to be brought back into HA. No data was sent to either ESM after HA was broken (it was broken because something had gone wrong).
Now when I run the firstbootwizard after installing HA on the primary, I get the message below on the console, and it is stuck in a loop after I give the root password in the wizard.
"Please verify the following parameters
root password: ********
Are the values correct [yes/no/back/cancel]?yes
An Unexpected error has occurred: 1"
Also, these messages get appended to my install.log file:
<hostname>: 2019-10-25 13:51:58 Called installation as arcsight - will switch to root.
<hostname>: 2019-10-25 13:51:58 Starting HA Installation Status Check
<hostname>: 2019-10-25 13:51:58 checkCommunications: from = <hostname>, to = <hostname>
<hostname>: 2019-10-25 13:51:58 checkCommunications: from = <ip1>, to = <ip2>
<hostname>: 2019-10-25 13:51:59 SSH access already set up.
<hostname>: 2019-10-25 13:51:59 installState=fresh
<hostname>: 2019-10-25 13:51:59 HA Installation Status Check complete
[2019-10-24 21:11:32,995][ERROR][wizard.FirstBootWizard$5] 1
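To make excerpts like the one above easier to compare, here is a minimal Python sketch that pulls the timestamp, the reported installState and any ERROR entries out of such lines. The two regular expressions are inferred only from the lines pasted in this post, not from any documented log format, and the function names are my own.

```python
import re
from datetime import datetime

# Patterns inferred from the pasted excerpt only (an assumption, not a spec).
INSTALL_RE = re.compile(
    r"^(?P<host>\S+): (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<msg>.*)$"
)
WIZARD_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+\]"
    r"\[(?P<level>\w+)\]\[(?P<logger>[^\]]+)\] (?P<msg>.*)$"
)
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_line(line):
    """Return (timestamp, level, message), or None if neither pattern matches."""
    line = line.strip()
    m = WIZARD_RE.match(line)
    if m:
        return datetime.strptime(m["ts"], TS_FORMAT), m["level"], m["msg"]
    m = INSTALL_RE.match(line)
    if m:
        # install.log lines in the excerpt carry no level, so report None.
        return datetime.strptime(m["ts"], TS_FORMAT), None, m["msg"]
    return None


def summarize(lines):
    """Return (last installState value seen, list of (timestamp, message) errors)."""
    state, errors = None, []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, level, msg = parsed
        if msg.startswith("installState="):
            state = msg.split("=", 1)[1]
        if level == "ERROR":
            errors.append((ts, msg))
    return state, errors


excerpt = [
    "<hostname>: 2019-10-25 13:51:59 SSH access already set up.",
    "<hostname>: 2019-10-25 13:51:59 installState=fresh",
    "[2019-10-24 21:11:32,995][ERROR][wizard.FirstBootWizard$5] 1",
]
state, errors = summarize(excerpt)
print("installState:", state)
for ts, msg in errors:
    print("ERROR at", ts, "->", msg)
```

Run on the excerpt, this reports installState as fresh and a single ERROR whose message is just "1". Note that the ERROR line is stamped 2019-10-24 while the install.log lines are stamped 2019-10-25, so it may be worth confirming that both excerpts come from the same run.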
If you follow the steps, HA should be easy to install, but you need to make sure that you are running the Linux kernel specified in the documentation and that the disk/partition size is the same as on the existing server.
You also mention "(It was broken because something had gone wrong)". What was wrong?
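For the kernel and partition-size point, a rough way to compare the two nodes is to collect the same facts on each and diff them. The Python sketch below is only an illustration: the default path "/" is a placeholder for whichever partition the HA guide asks to be identical, and the function names are mine.

```python
import platform
import shutil


def collect_facts(path="/"):
    """Gather the kernel release and total size in bytes of the filesystem holding `path`."""
    return {
        "kernel": platform.release(),
        "total_bytes": shutil.disk_usage(path).total,
    }


def compare_facts(primary, secondary):
    """Return a list of readable differences between two fact dictionaries."""
    diffs = []
    for key in sorted(set(primary) | set(secondary)):
        if primary.get(key) != secondary.get(key):
            diffs.append(
                f"{key}: primary={primary.get(key)!r} secondary={secondary.get(key)!r}"
            )
    return diffs


# Run collect_facts() on each node, then compare the two results on one of them.
print(collect_facts())
```

An empty list from compare_facts means the collected values match; anything else names the field that differs between the two servers.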