Thanks for the reply.
I was only able to find some HA availability docs, which talk about using EMC, etc. Though that would be the ideal way to go, as a start we are looking only at replicating content - users, active channels, assets, rules, reports, queries, etc. Since we have two dedicated Loggers in each environment, we assume that the data is already replicated and will be available on the DR ESM.
Please do send me anything you find.
Aaron Wilson from SAIC did a presentation on a mechanism they developed at the Users Conference. Not sure if those presentations can be reposted since ArcSight only made them available to those who attended so far.
Would anyone be kind enough to provide the presentation? The link above does not work for me and I cannot find it anywhere!
The link works for me, but I think they only released the materials to those who attended Protect10 at the moment. I'm not sure of the timing of the general release.
This is correct, the Protect '10 Session Materials are currently accessible by conference attendees only. Access is opened to the rest of the community about six months after the conference.
We are investigating different methods of performing the replication:
a) SAN Replication - Everything is in a SAN, even the executables (very costly solution) - Events are replicated too
b) Replication using software like DRBD - Not tested, who is going to do ESM-Logger production with this one? - Events are replicated too
c) Content replication using packages (Protect '10 slides) - Propagate events from Main to DR using Connector.
d) System table export-import - Propagate events from Main to DR using Connector.
Regarding the System Table Export solution, Vsankar pointed out that you lose the connectors. Would this idea be of any use?:
a) Using a script, export a connector package from the DR site
b) Export sys tables from the Main site
c) Import sys tables to the DR site
d) Import the previously exported Connector Package
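The four steps above might be scripted roughly as follows. This is only a sketch: the `arcsight archive` / `export_system_tables` / `import_system_tables` subcommand names, the install path, the credentials, and the `-uri` value are assumptions from memory of the ESM CLI, so verify every invocation against your ESM version's documentation before running it for real. With DRY_RUN=1 (the default) it only prints the plan.

```shell
#!/bin/sh
# Sketch of the four-step content sync (assumed command names; verify
# against your ESM version before use). DRY_RUN=1 only prints the plan.
set -eu

ARCSIGHT_BIN="${ARCSIGHT_BIN:-/opt/arcsight/manager/bin/arcsight}"  # placeholder path
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "WOULD RUN: $*"
    else
        "$@"
    fi
}

sync_content() {
    # a) DR site: export a package holding the DR connectors so they
    #    survive the system-table overwrite (URI is a placeholder).
    run "$ARCSIGHT_BIN" archive -action export -f /tmp/dr-connectors.arb -uri "/All Connectors/DR"
    # b) Main site: export the system tables (user/password/db are placeholders).
    run "$ARCSIGHT_BIN" export_system_tables arcsight changeme arcsight
    # c) DR site: import the Main system tables (this overwrites DR content).
    run "$ARCSIGHT_BIN" import_system_tables arcsight changeme arcsight
    # d) DR site: re-import the connector package saved in step (a).
    run "$ARCSIGHT_BIN" archive -action import -f /tmp/dr-connectors.arb
}

sync_content
```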
If that works, back to my question: does the system table export provide the following?:
a) Archived reports (low importance - can be exported manually)
b) Active list data (high importance)
c) Session list data (high importance)
d) Trend data (high importance)
e) Events contained in cases (medium importance)
I am looking forward to hearing additional thoughts.
If you virtualise your environment you could use VMware Fault Tolerance to keep an online replica of your Manager; you could also do this with the DB, but it might impact performance.
One seamless option would be to replicate the DB using Oracle itself, but that won't be cheap.
Here are my observations when I tested the replication using system table export/import between the two ESM servers.
a) Archived reports - not done by the system table export, but you can just copy over the archived reports folder from the source server to the destination server.
b) Active list data - yes, system table export/import gets you the active list data from the source.
c) Session list data - yes, system table export/import gets you the session list data from the source.
d) Trend Data - no, trend data is not imported.
e) Events contained in cases - not sure
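For (a), the folder copy can be a one-liner wrapped in a small helper. The archive location varies by install (check ARCSIGHT_HOME), so the path below is a placeholder; across hosts you would use rsync -a or scp -rp instead of a local cp.

```shell
#!/bin/sh
# Copy the archived-reports folder that export_system_tables skips.
# The source path is an assumption; check ARCSIGHT_HOME on your install.
set -eu

copy_archived_reports() {
    src="$1"; dst="$2"
    mkdir -p "$dst"
    # -p preserves timestamps so report ordering survives the copy;
    # between servers, substitute rsync -a or scp -rp for cp.
    cp -Rp "$src/." "$dst/"
}
```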
In our approach, we have the Connector Appliance feeding events to both servers thereby ensuring event replication. Content replication is done by system table export/import.
Nice, we are also considering the dual feed as the best option so far. SAN replication may propagate a malicious deletion of a file to the remote SAN too, whereas the dual feed also acts as a backup.
My observation here is that with dual feeds one should not use content import/export for dynamic data. For example, an active list will be populated on the fly by the correlation done on the dual-fed events; thus, theoretically, trends, active lists and session lists should not be replicated on a schedule in dual-fed ESMs.
The case events in ESM 5 have their own event table, so they are not subject to the retention period. I still have to test whether this table can be exported and imported to the DR site without problems.
Finally, as soon as I complete my content replication scenario I will try to share some thoughts/results/code here. If anyone has already performed a working content replication, please share some information here, or at least a do/don't list!
Yeah, we have a PS guy who suggested the package utility method, but despite my best efforts there are too many workarounds, gotchas, and manual interventions required for my level of comfort.
And it only partly works and can't really be automated. Basically you have to break up all your content into small packages under 30MB in size and this is sometimes hard to do and needs rework whenever your content grows beyond the limit in a single package.
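A pre-flight check can at least catch oversized packages before the sync starts. The ~30MB figure comes from the experience above; the .arb extension and flat directory layout are assumptions, so adjust for however your export job lays files out.

```shell
#!/bin/sh
# Pre-flight check: list exported package files over the size limit the
# package utility reportedly chokes on (~30MB). The .arb extension and
# directory layout are assumptions; adjust for your export job.
set -eu

check_package_sizes() {
    dir="$1"
    limit_kb="${2:-30720}"   # 30MB expressed in KB
    # Prints each offending file; non-empty output means content must be
    # split into smaller packages before the sync.
    find "$dir" -name '*.arb' -size "+${limit_kb}k"
}
```

Usage would be something like `oversized=$(check_package_sizes /tmp/packages); [ -z "$oversized" ] || echo "split these first: $oversized"`.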
The package utility is really buggy, prone to getting stuck indefinitely, emits random DB errors that no one can explain, and sometimes causes cache/DB discrepancies in the ESM, so we opted to run it with the ESM offline during the sync process. And it's slow. If you let it time itself out (instead of killing it before its timeout), bad things happen that sometimes require manual database repair because it doesn't roll back a transaction properly.
Also the package utility prompts you for a bunch of stuff so you need to anticipate all the possible prompts and script reasonable choices using an expect-like tool or the whole thing will freeze forever.
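If the utility happens to read its prompts from stdin, pre-scripting the answers with a here-document may be enough; if it insists on a real TTY you need expect(1) proper. The prompts and answers below are invented placeholders, not the real package-utility dialogue.

```shell
#!/bin/sh
# Feed pre-scripted answers to an interactive tool on stdin. This only
# works if the tool reads stdin; TTY-only tools need expect(1). The
# answers below are placeholders, not the real package-utility dialogue.
set -eu

answer_prompts() {
    tool="$1"
    # One answer per line, in the order the prompts appear.
    "$tool" <<'ANSWERS'
yes
overwrite
yes
ANSWERS
}
```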
You have to make sure to use an ArcSight account with a non-expiring password because there's a bug where if the password is expired, the package utility goes into a tight loop emitting "expired password" messages eventually filling a disk partition if you're logging the package utility output.
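Besides fixing the account, a cheap belt-and-braces guard is to cap how much of the utility's output can ever reach the log, so a runaway "expired password" loop cannot fill the partition. This is plain Unix plumbing, not an ArcSight feature; the command you wrap and the limit are yours to choose.

```shell
#!/bin/sh
# Cap the bytes of a command's output that reach the log. head -c cuts
# the stream at the limit; the wrapped command then dies on SIGPIPE the
# next time it writes, so a message loop cannot fill the partition.
set -eu

run_capped() {
    limit_bytes="$1"; logfile="$2"; shift 2
    "$@" 2>&1 | head -c "$limit_bytes" >> "$logfile"
}
```

Usage would be along the lines of `run_capped 10485760 /var/log/pkg-sync.log /opt/arcsight/manager/bin/arcsight ...` (10MB cap; path is a placeholder).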
There are many more gotchas and workarounds and still there's some manual fixing we have to do virtually every time it runs.