Newbie Question - Catalog or Report of Unique Event Types

Hello, I have a newbie question. It may have been asked and answered before without my seeing it; if so, I apologize for asking again.

Background: I'm new to ArcSight and just came on board at a company that began implementing it last year. I have been assigned the task of helping to create effective reporting, trending, and analytics using data collected in ArcSight.

Problem: From reviewing the implementation documentation and speaking with the person assigned to support ArcSight, it appears they opened the audit floodgates and started dumping data without first identifying the unique event codes and event types specific to each system, app, etc. that sends data to ArcSight. As you can imagine, we are getting millions of records.

Question: Is there a way to generate a report of every unique event type/code, grouped by point of origin? Or do you have a more effective solution or recommendation?

  • I think you need to narrow it down a bit more to specific fields for a good answer to that question. In other words, what do you consider to be a "unique event type?" You also have to consider that not all parsers are consistent in the way they map fields, so for instance you can't just collect all the unique values of "device event class ID" when some feeds don't actually use that field. Some use a unique event name for each event type, but some will use the same name for several types of events. This is the tricky part about trying to report across all feeds.

    Here's a suggestion that might help you get started: create a trend that collects all of the following fields: agent name, agent type, device address, device hostname, device vendor, device product, device event category, device event class ID, and sum(aggregated event count). Group by all of those fields except aggregated event count. Whether you run the trend hourly or daily is up to you. Just run it at least 24 hours behind if possible, so that it doesn't hit the current day's data and slow down queries and active channels that are working on that same data.
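
    To make the grouping concrete, here is a minimal Python sketch of the aggregation that trend performs. It is not ArcSight query syntax: the field names are hypothetical stand-ins for the fields listed above, the sample events are made up, and treating a missing aggregated event count as 1 is an assumption.

```python
from collections import Counter

# Hypothetical names standing in for the ArcSight fields in the trend.
GROUP_FIELDS = (
    "agent_name", "agent_type", "device_address", "device_hostname",
    "device_vendor", "device_product", "device_event_category",
    "device_event_class_id",
)

def build_trend(events):
    """Sum aggregated_event_count per unique combination of GROUP_FIELDS."""
    totals = Counter()
    for event in events:
        # A field the parser never mapped becomes None, so the event still groups.
        key = tuple(event.get(field) for field in GROUP_FIELDS)
        # Assumption: an event without a count represents a single event.
        totals[key] += event.get("aggregated_event_count", 1)
    return totals

# Made-up sample events, for illustration only.
firewall_deny = {
    "agent_name": "conn1", "agent_type": "syslog",
    "device_address": "10.0.0.5", "device_hostname": "fw01",
    "device_vendor": "VendorA", "device_product": "Firewall",
    "device_event_category": "/Access", "device_event_class_id": "deny",
}
sample_events = [
    {**firewall_deny, "aggregated_event_count": 3},
    {**firewall_deny, "aggregated_event_count": 2},
    # This feed does not populate device_event_class_id at all.
    {"agent_name": "conn2", "agent_type": "winlog",
     "device_address": "10.0.0.9", "device_hostname": "dc01",
     "device_vendor": "VendorB", "device_product": "Server"},
]

trend = build_trend(sample_events)
for key, count in trend.most_common():
    print(count, dict(zip(GROUP_FIELDS, key)))
```

    The third sample event shows the parser inconsistency mentioned earlier: it has no device event class ID, so it lands in a group whose class ID is empty rather than being dropped.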

    Once you have that trend, you can create queries that use only some or all of the fields in that trend, and they'll be a great deal faster than querying the event table directly. You might start off by reporting on just device address, device hostname, device vendor, device product, and sum(sum(aggregated event count)). This will give you an idea of what devices (by address and hostname, grouped by vendor and product) are generating the most events. Then you can manually dive into them as needed. You could also try reporting on ALL of those fields, ordered according to the highest sum of aggregated event count, and possibly find the most common events by volume.
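
    As a rough illustration of that first report, the Python sketch below rolls trend rows up to the device level and ranks devices by total volume, which is the sum(sum(aggregated event count)) idea. The row layout, the field names, and the numbers are all hypothetical; a real report would be built as a query on the trend inside ArcSight.

```python
from collections import defaultdict

# Hypothetical names for the device-level fields kept in the report.
DEVICE_FIELDS = ("device_address", "device_hostname",
                 "device_vendor", "device_product")

def top_devices(trend_rows):
    """Re-sum the per-row counts by device and rank by total volume."""
    totals = defaultdict(int)
    for row in trend_rows:
        key = tuple(row.get(field) for field in DEVICE_FIELDS)
        totals[key] += row["event_count"]  # already a sum within the trend
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Made-up trend rows: one row per device and event class per trend run.
trend_rows = [
    {"device_address": "10.0.0.5", "device_hostname": "fw01",
     "device_vendor": "VendorA", "device_product": "Firewall",
     "device_event_class_id": "deny", "event_count": 120},
    {"device_address": "10.0.0.5", "device_hostname": "fw01",
     "device_vendor": "VendorA", "device_product": "Firewall",
     "device_event_class_id": "allow", "event_count": 80},
    {"device_address": "10.0.0.9", "device_hostname": "dc01",
     "device_vendor": "VendorB", "device_product": "Server",
     "device_event_class_id": "logon", "event_count": 500},
]

ranked = top_devices(trend_rows)
for device, total in ranked:
    print(total, dict(zip(DEVICE_FIELDS, device)))
```

    Dropping the event class ID from the grouping key is what collapses the two firewall rows into one total, which is why the same trend can serve both the device-level and the event-level report.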

    Go from there and you might find other fields that are important to consider for your environment in this type of report. Typically when you launch into this effort you have to make a few passes at it until you figure out what's right for your environment.