gportnoy1
New Member.
2817 views

SOC Workflow

I was wondering if people would be willing to share a little bit of their ArcSight-based workflow when it comes to actually getting events in front of analysts for analysis. At a certain point it doesn't make sense to have analysts just randomly click around the various dashboards looking for things that look "interesting", so how do people accomplish the task of "pushing" the events to analysts? Do you have one Active Channel with all the correlated rules and use Event Annotation for tracking and accountability? Do you have correlation rules automatically create cases and assign them to a queue? Some combination of the two? How do you handle escalations? I am sure there are many different approaches to this, and I would love to start a discussion on what works, what doesn't, and what the logic was behind your decision to go one way or another.

Appreciate any feedback!

11 Replies
whoamib
Absent Member.

Re: SOC Workflow

Good question, Gary. I think it is critical in a SOC to have a solid workflow so that no critical events are missed. I have implemented a workflow that is rather unique, but it has worked very well for the last 3+ years.

I tried to keep the workflow as simple and bulletproof as possible. First, I set up a rule naming structure in which all production rules start with "Prod" and all test rules start with "Stage". Then we use a channel, by Manager Receipt Time, that shows only events from rules that start with "Prod" and have not been annotated yet.

The analyst then takes an event by annotating it, and it disappears from everyone else's channel except the channel of the person who did the annotation. If the incident requires a case, the case number is put into the annotation and an automated script is used to create a case in an external case database. It is critical to create stages in ArcSight that make sense for your company.
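The channel logic described above can be sketched outside ArcSight in a few lines of Python. This is only an illustration of the filtering idea, not ArcSight filter syntax: the `RuleFire` class, its field names, and the `channel_view` function are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RuleFire:
    """Hypothetical stand-in for a correlated event produced by a rule."""
    rule_name: str
    manager_receipt_time: int           # epoch seconds
    annotated_by: Optional[str] = None  # analyst who took the event, if any


def channel_view(events: List[RuleFire], viewer: str) -> List[RuleFire]:
    """Return what one analyst would see in the shared channel.

    Only fires of rules named "Prod..." are shown, and an event stays
    visible only while it is unannotated or was annotated by the viewer.
    """
    visible = [
        e for e in events
        if e.rule_name.startswith("Prod")
        and e.annotated_by in (None, viewer)
    ]
    # Order by Manager Receipt Time, oldest first.
    return sorted(visible, key=lambda e: e.manager_receipt_time)
```

With this shape, an event taken by one analyst drops out of every other analyst's view automatically, while rules still named "Stage" never reach the channel at all.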

Then we run a trend on the rule fires for the last 24hrs and review the rule fires as a team every morning.

I have a lot of other details, but my tablet is running out of battery. Sorry for any grammar errors; I'm typing on the tablet.

Please contact me if you have any questions,  I'm also curious about other techniques that other people may have.

-Ben Spader

gportnoy1
New Member.

Re: SOC Workflow

Thanks Ben,

Some great information there, and I'd love to hear more when you have time. I bet I could pick your brain about this for hours, but I'll limit myself to just a few follow-up questions. With all the correlated events you want investigated on one channel, do people have time to dig into and resolve them all, or do some end up being missed and scroll off the screen? I can see that being a problem for some busier rules, like IDS-driven stuff or firewall port-scan types of alerts. I was thinking of having a few tiers of rules: the really important stuff, and the stuff to dig into if there is some downtime. How do you handle escalations or re-assignments within the team if, for example, the first analyst can't see the incident they took through to completion?

I'd love to hear from others as well on any of these topics. We have some great information on this site around content creation (technology if you will), but very little on people and process.

faruknagori1
Absent Member.

Re: SOC Workflow

Hi Ben,

I am interested too. Let's have a call, if possible, so I can understand more about this.

Thanks,

FN

Volker Michels
Acclaimed Contributor.

Re: SOC Workflow

Good morning,

We run a 24/7 SOC and work with a main channel based on certain criteria (filter: rule file path and stage Queued). The analysts monitor the events in the channel and annotate them according to the outcome, such as case opened, added to list, or no action.

When a case is opened, a special filter keeps the related events out of the main channel (analysts write an action item for engineering to implement this filter). Cases are forwarded to incident management.

For false positives, we filter certain events directly at the connector level or with a filter in front of the rules.

In summary, the analysts monitor one channel 24/7, but sometimes we also use focused channels.

Volker

hendersonc
Absent Member.

Re: SOC Workflow

Hello!  I work for ArcSight (an HP Company) in the SOC Solutions Group. I thought I would pass on some details on how we structure our SOC engagements with customers.

With my most recent clients, we use a few active channels:

  • Channel 1 - SOC MAIN CHANNEL (all new events which need analyst review are placed here)
  • Channel 2 - LEVEL 1 CHANNEL (events which require additional review are placed here)
  • Channel 3 - LEVEL 2 CHANNEL (events which require a more senior level analyst to review are placed here)

These channels are governed using ArcSight ESM stages. We define a stage for each of those channels. All events go to the Main Channel first, then move through the workflow defined for that particular customer, which is designed around their needs.

Generic Stages:

  • Queued (default stage)
  • SOC Triage (SOC MAIN CHANNEL)
  • L1 Investigation
  • L2 Investigation
  • Engineering Review
  • Added to Existing Case
  • Closed - No Action Required
  • Closed - Ticket Created for External Team
  • Closed - Case Created for Internal Tracking

No base events are put on the SOC MAIN CHANNEL.  Only events which match a Use Case defined for active monitoring will be presented on the main channel.  This is accomplished with a correlation rule which sets the Event Annotation Stage to "SOC Triage" as a rule action.   All rules we create are based on Use Cases, so each rule will set the stage to "SOC Triage".  At that point the normal workflow attaches to the event and we move the event through the event lifecycle.
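As a rough illustration of how stage-driven routing like this behaves, here is a small Python sketch of an event lifecycle. The stage names are taken from the generic list above; everything else is an assumption made for the example. In particular, the `ALLOWED` transition map, the `Event` class, and the `on_rule_fire` and `move` helpers are hypothetical, and each customer's real workflow would define its own permitted moves.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

TRIAGE = "SOC Triage"

# Terminal stages, taken from the generic stage list.
CLOSED: Set[str] = {
    "Added to Existing Case",
    "Closed - No Action Required",
    "Closed - Ticket Created for External Team",
    "Closed - Case Created for Internal Tracking",
}

# ASSUMPTION: which moves are permitted is invented for illustration.
ALLOWED: Dict[str, Set[str]] = {
    "Queued": {TRIAGE},
    TRIAGE: {"L1 Investigation"} | CLOSED,
    "L1 Investigation": {"L2 Investigation", "Engineering Review"} | CLOSED,
    "L2 Investigation": {"Engineering Review"} | CLOSED,
    "Engineering Review": set(CLOSED),
}


@dataclass
class Event:
    name: str
    stage: str = "Queued"  # default stage
    history: List[str] = field(default_factory=list)


def move(ev: Event, target: str) -> Event:
    """Move an event to a new stage, rejecting moves the map does not allow."""
    if target not in ALLOWED.get(ev.stage, set()):
        raise ValueError(f"{ev.stage!r} -> {target!r} is not allowed")
    ev.history.append(ev.stage)
    ev.stage = target
    return ev


def on_rule_fire(name: str) -> Event:
    """Mimic the rule action: a correlated event is placed in SOC Triage."""
    return move(Event(name), TRIAGE)
```

Each channel then amounts to a filter on `stage`: the main channel shows events in "SOC Triage", the Level 1 channel shows "L1 Investigation", and so on.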

We set an SLA of 15 minutes for events on the Main Channel. This is tracked with trends using timestamp math to ensure we are meeting our SLAs. Events that cannot be resolved in 15 minutes are moved to another stage for further review. If that additional analysis results in any sort of escalation, or in an investigation that will take several days or weeks, we generate an ArcSight ESM case to track that work. If the customer uses an external ticketing system such as Remedy, we use that tool to escalate to other teams.

Escalations and incident response will fully depend on the customer environment, politics, existing processes, maturity, etc.

Hope that helps, please let me know if there are any other follow up questions I can answer.

Colin

whoamib
Absent Member.

Re: SOC Workflow

Colin,

Thank you very much for the detailed information. I would love to know more about how you track the SLA in the SOC main channel. Do you just compare the Manager Receipt Time and the Event Annotation Time?

This is very similar to how I have set up the workflow here in the SOC at my company. We have those 3 channels for Level 1 (Triage), Level 2 (escalation), and Level 3 (SOC Leaders).

Unfortunately, even in a large SOC with many employees, we found that it is not realistic for Level 2 or Level 3 to operate out of an ArcSight channel. It was much easier to have only Level 1 operate out of the channel, take actions, and make notes within our case tool. The upper levels then pick the incident up from the case tool and open ArcSight only if required; because the correlated event has already been put into the case tool, that usually isn't needed. Having the upper levels operate out of the case tool proved much more beneficial for tracking the cases that are escalated.

Thanks,

-Ben

hendersonc
Absent Member.

Re: SOC Workflow

There is no one way to do it.  Whatever works in your environment is how you should structure it.  It is true that active channels can be hard to work out of if you do not have someone actively looking at the events on it.  Since channels are based on "$Now - <sometime>" the events will roll off of the channel.  So using cases to "save" events for review is a great use of the case functionality.

I have also seen setups with 2 channels, one for triage and one for all other investigation stages. Since we can force the stages to save the username of the person who annotated the event, we can have one channel where all events under investigation are viewable. That "investigation" channel will then show all events, the stages they are in, and the person who is working them.

As for tracking metrics and SLA adherence, you can do it like this:

  1. Use a trend that runs every 5 minutes. For each event in a stage OTHER THAN the SOC Triage stage, pull the correlated event ID and calculate the difference between Manager Receipt Time and event annotation time. This gives the time to annotate. You can save this, as a trend action, in an active list and use it to calculate the average time to work an event. It can also be used to determine whether you missed the SLA.
  2. Use a rule to track all events that come into the main channel by adding the event ID and Manager Receipt Time to an active list. Set the list expiration to 30 minutes (twice your SLA time). When an entry expires from this list (which all will), have a second rule check whether that event ID is in the list populated by your trend action above. If the event ID is in that list, you have already calculated the SLA adherence, so you do nothing. If it is not, the event is already at double your SLA time, and you can add it to another list, alert on it, or do whatever you want. This catches the events that are left in SOC Triage and never annotated.
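The two steps above can be modelled in plain Python to make the timestamp math concrete. This is a sketch of the logic only, not ArcSight trend or rule syntax. The `CorrelatedEvent` class, the function names, and the use of a dictionary in place of an active list are all assumptions made for the example; the 15-minute SLA and 30-minute expiration come from the thread.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

SLA_SECONDS = 15 * 60               # 15-minute SLA on the main channel
LIST_TTL_SECONDS = 2 * SLA_SECONDS  # list expiration: twice the SLA


@dataclass
class CorrelatedEvent:
    event_id: str
    manager_receipt_time: int            # epoch seconds
    stage: str = "SOC Triage"
    annotation_time: Optional[int] = None


def trend_pass(events: List[CorrelatedEvent], annotated: Dict[str, int]) -> None:
    """Step 1: record time-to-annotate for events that left SOC Triage.

    `annotated` plays the role of the active list filled by the trend action.
    """
    for e in events:
        if e.stage != "SOC Triage" and e.annotation_time is not None:
            annotated[e.event_id] = e.annotation_time - e.manager_receipt_time


def expiry_pass(events: List[CorrelatedEvent], annotated: Dict[str, int],
                now: int) -> List[str]:
    """Step 2: IDs whose tracking entry expired without any annotation."""
    return [
        e.event_id for e in events
        if now - e.manager_receipt_time >= LIST_TTL_SECONDS
        and e.event_id not in annotated
    ]


def sla_report(annotated: Dict[str, int]) -> Tuple[float, List[str]]:
    """Average time to annotate (seconds) and IDs that exceeded the SLA."""
    times = list(annotated.values())
    average = sum(times) / len(times) if times else 0.0
    missed = sorted(eid for eid, t in annotated.items() if t > SLA_SECONDS)
    return average, missed
```

Running `trend_pass` on a schedule and `expiry_pass` against the current time covers both cases: events that were annotated late show up in `sla_report`, and events that nobody ever touched are caught by the expiration check.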
Vini
Acclaimed Contributor.

Re: SOC Workflow

I have been thinking about everything discussed here, and perhaps a layered approach would be ideal for this workflow. The layers can be modular, with each module adding more complexity and power to the workflow. More complex workflows would suit larger SOCs.

What I have in mind is not very well structured yet, but for this to work the events of interest (EOI) need to be very well documented; otherwise analysts won't be able to handle the avalanche of events.

Have you guys gone through the process of carefully documenting all events of interest from all the different device types that you have?

lbennett1
Absent Member.

Re: SOC Workflow

I am very interested in the concept of having high-level response documentation based on the typical EOIs that come into the Triage Channel. We haven't quite worked it out yet, but currently we keep the response procedures as KB articles in ESM that can be referenced and associated with the EOI to show that we followed the set procedure. This allows for auditing and a structured response. Is anyone else doing this?

dzuperku1
Absent Member.

Re: SOC Workflow

I’ve been reading this thread and find it very thought-provoking.

In my environment, when an events-of-interest rule (tested and verified) fires, it opens a case and sends the alert to the analyst group. Most of the rules are around malware-infected machines. The analysts work off of open cases for their workload.

I’m not sure if this is the most effective way of handling alerts / EOIs. Does anyone have any insight into how to handle smaller SOCs that aren’t 24/7?

hendersonc
Absent Member.

Re: SOC Workflow

That is also a good way to do it, especially with limited staff. I would recommend trying to expand your scope past only malware though.
