"Boy Meets ArcSight" or "the fairy tales of an ill-treated ArcSight admin" --- Episode 3.0 "2 fail or not 2 fail an upgrade - or - why size does matter"
Recently we got to know a new kind of ill-treatment, courtesy of a monster called "the Upgrade".
Running down the street with an aging "Game-Boy" (ArcSight ESM version 6.9.1P4), we came across a shop and thought: how about upgrading our old toy to the latest "Switch" (ESM 7.2.1)?
We searched at home for all the pocket money of the last years, and ... did not have enough. So we had to do it the DIY way... and buy a manual and some parts from the nearest radio-shack.
The shop master told us that if we wanted to keep all our Game-Cartridges (ArcSight Rules, Trends, Active-Lists) we would need to do a step-by-step upgrade of our gear... and if we wanted to be sure that everything works... we should do all the upgrades in a test environment, so as not to break our Golden Game-Boy (aka Production environment).
So we built our testing rig, exported all the Games from the Golden Game-Boy (aka "export system tables") and imported them into the cheap and dirty version we bought at a used-gadgets store next door (aka test system).
Starting our first try of the upgrade... magic smoke escaped from the toy... and we had to put it into the round grave for electronic waste.
So the upgrade failed. The display showed only one line, "Error 100", as the upgrade failure.
Next day, next try... other capacitors, new resistors, same result: "Error 100" and no running Game-Boy (aka ESM 6.11 failed to start).
Next try - this time with ammeter and voltmeter attached (aka looking into the log files whilst the upgrade was running).
There it was - the moment when the magic smoke started: one of the capacitors blew up, because... yes, why? ... We did not know (aka a lot of error messages in the logs, but none that really seemed to cause the capacitor to die, ahem, the ESM upgrade to fail).
Next try... however, it got harder to get cheap and dirty used gadgets... (aka the planned upgrade date came closer).
So we bought an oscilloscope, to see everything... (aka dug deeper into the logs).
And there it was, the reason for the issue:
One of the resistors for the upgrade was calculated too small, and the capacitor behind it got too big a current.
We changed the resistor... bought new used gear (aka reverted our virtual environment to the last known good snapshot) and started again.
This time - without any magic smoke. Our journey could be continued, only 4 more upgrades to go...
Okay, that analogy now gets complicated... what really happened:
We started to upgrade from ESM 6.9.1P4 to ESM 6.11GA.
That failed with an uninformative "Error 100" message.
Not too much in the logs... some red herrings, some weird stuff as well... and one line that went unnoticed during the first tries:
"Data truncation: Data too long for column 'VALUE' at row 1" during the step "Upgrade dynamic tables"
This was also one of the few pieces of information marked in red in the summary.html in /opt/arcsight/manager/upgrade/out/<date of upgrade>
Looking into the server.upgrade.log in the same folder revealed that something caused this error:
[INFO ][default.com.arcsight.upgrade.tasks.UpgradeDynamicTablesTask] Data truncation: Data too long for column 'VALUE' at row 1
[INFO ][default.com.arcsight.upgrade.tasks.UpgradeDynamicTablesTask] ===== end UpdateDynamicTablesTask end =====
[ERROR][default.com.arcsight.upgrade.tasks.UpgradeDynamicTablesTask] com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'VALUE' at row 1
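For anyone who would rather not eat all the red herrings by hand: a hunt like this through server.upgrade.log can be sketched in a few lines of Python. This is only a rough sketch and not an ArcSight tool. The function name find_suspect_lines and the keyword list are made up for illustration, and the "[LEVEL]" prefix format is assumed from the log excerpt quoted in this post.

```python
import re

# Assumed line format, taken from the excerpt in this post: "[LEVEL ][logger] message"
LEVEL_RE = re.compile(r"^\[(\w+)\s*\]")

# Hypothetical keyword list; extend it with whatever looks suspicious to you.
KEYWORDS = ("Data truncation",)


def find_suspect_lines(lines):
    """Return (line_number, text) for ERROR lines and for lines matching a keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        match = LEVEL_RE.match(text)
        level = match.group(1) if match else ""
        if level == "ERROR" or any(keyword in text for keyword in KEYWORDS):
            hits.append((number, text))
    return hits


# The three log lines quoted in this post, used as sample input.
sample = [
    "[INFO ][default.com.arcsight.upgrade.tasks.UpgradeDynamicTablesTask] "
    "Data truncation: Data too long for column 'VALUE' at row 1",
    "[INFO ][default.com.arcsight.upgrade.tasks.UpgradeDynamicTablesTask] "
    "===== end UpdateDynamicTablesTask end =====",
    "[ERROR][default.com.arcsight.upgrade.tasks.UpgradeDynamicTablesTask] "
    "com.mysql.jdbc.MysqlDataTruncation: Data truncation: "
    "Data too long for column 'VALUE' at row 1",
]

for number, text in find_suspect_lines(sample):
    print(number, text)
```

Against a real file you would pass the open file object instead of the sample list. Note that the first hit is logged at INFO level, which is probably why it went unnoticed for so long, so filtering on ERROR alone would not have been enough.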
And that something was this SQL statement:
[INFO ][default.com.arcsight.upgrade.tasks.UpgradeDynamicTablesTask] Exec: com.mysql.jdbc.ServerPreparedStatement - insert into arc_upgrade_log (propertyName, value) values('IndexUpgrade0', 'CREATE INDEX ARC_ALD_BASDYY_main ON arc_ald_basdyy (device_event_class_id(50), end_time, name(50), device_address, device_dns_domain(50), device_process_name(50), destination_address, destination_dns_domain(50), destination_host_name(50), destination_process_id, destination_process_name(50), destination_service_name(50), destination_user_name(50), file_path(50), device_custom_string1(50), device_custom_string2(50), device_custom_string3(50), device_custom_string4(50), device_custom_string5(50), device_custom_string6(50))')
mysql> describe arc_upgrade_log;
+--------------+--------------+------+-----+---------+-------+
| Field        | Type         | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+-------+
| propertyName | varchar(100) | NO   |     | NULL    |       |
| VALUE        | varchar(500) | NO   |     | NULL    |       |
+--------------+--------------+------+-----+---------+-------+
The statement "CREATE INDEX ARC_ALD_BASDYY..." is more than 500 chars (yes, I counted - one by one),
hence the upgrade fails and is not able to finish.
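Counting one by one builds character, but a few lines of Python do it faster. The sketch below rebuilds the CREATE INDEX statement from the log line above and compares its length with the varchar(500) limit of the VALUE column. The helper name fits_upgrade_log and the constant name are made up for this illustration and are not part of ArcSight.

```python
# varchar(500), taken from the `describe arc_upgrade_log` output above
VALUE_COLUMN_LIMIT = 500

# Index columns exactly as they appear in the logged statement
COLUMNS = [
    "device_event_class_id(50)", "end_time", "name(50)", "device_address",
    "device_dns_domain(50)", "device_process_name(50)", "destination_address",
    "destination_dns_domain(50)", "destination_host_name(50)",
    "destination_process_id", "destination_process_name(50)",
    "destination_service_name(50)", "destination_user_name(50)",
    "file_path(50)", "device_custom_string1(50)", "device_custom_string2(50)",
    "device_custom_string3(50)", "device_custom_string4(50)",
    "device_custom_string5(50)", "device_custom_string6(50)",
]

statement = (
    "CREATE INDEX ARC_ALD_BASDYY_main ON arc_ald_basdyy ("
    + ", ".join(COLUMNS)
    + ")"
)


def fits_upgrade_log(stmt, limit=VALUE_COLUMN_LIMIT):
    """Hypothetical helper: True if stmt would fit into the VALUE column."""
    return len(stmt) <= limit


print(len(statement), "characters, fits:", fits_upgrade_log(statement))
```

The same check could be run before an upgrade against any other wide active list, assuming you know which columns end up in its index.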
What is ARC_ALD_BASDYY ???
Actually it is the table for the data of an active list... one that has a lot of columns... maybe it's big... but really, should an upgrade fail because of a "wide" active-list? NO.
Our solution was "quick and dirty", as we found out that this ActiveList is... how should we say... "not used anymore".
So we deleted the AL, and everybody lived happily ever after ... ah, the upgrade went smoothly.
So thanks to my fellow "M.S.", who did most of the soldering (aka building the test environment) and stayed awake with me several nights to find "this easy thing" in the end.
Last but not least - thanks to all the people who ate all the red herrings created by the log files.
Trust your first idea, and trust your log files.