Moving Sentinel Correlation Rules En Masse

NetIQ's Sentinel product is a SIEM solution that captures events of any type from any number of event sources into a system that normalizes the data, finds meaning in a barrage of nonsense (the needle in a haystack), and includes incident management based on custom workflows defined by each customer. To pull off this feat, Sentinel is often installed as an application distributed across multiple servers, which makes complete sense considering the CPU power required to handle tens of thousands of events per second.

Presumably to bring value to customers right away, hundreds of correlation rules ship out of the box, and something like 118 of those rules are even deployed from the start. These rules automatically look for certain types of activity that are reliably considered suspicious, such as modification of system accounts in common directories. This means that from the first time Sentinel is turned on it is already providing value for any events that it audits from supported event sources (such as all of those Linux servers running locally or "in the cloud"). With that being the upside, there is a downside: you only get one "Reporting Server" (main server) in Sentinel, and this server already does a lot: event storage, event indexing, searches, reports, management of plugins, connections from clients of all types, the web-based user interface (UI), the REST interface, etc. Since Correlation Rules often make sense on their own dedicated Correlation Engine (CE), just like having collectors/parsers on their own dedicated Collector Manager (CM), moving all of those 118 or so rules from the main server to the dedicated CE is a good idea and, if done manually, very frustrating.

Thankfully one of the many pieces handled by the Sentinel Reporting Server is a REST interface that lets you do all kinds of nice things quickly. The API documentation is built into the system, so once logged into Sentinel you can access it, see tutorials for automating things via scripts, and generally have a bit too much fun, but today we're doing work, so let's start there (fun with work). To access the REST API documentation, log in to Sentinel, click on 'Help' (upper-right corner), and then on 'APIs'. This will open a new tab that essentially takes you here:

Once there, click on 'API Reference' and, for our case, 'Correlation'. We have now reached a URL that looks like this and which is just part of a huge page of links to various documents with the specifics we seek:

Note: The anchor at the end of this URL has 'supMeths' in it, meaning these are supported methods. Not all of the REST API methods may be supported, which means they likely work but if you call into NetIQ Support they may not officially support use of those.

Our particular section of interest has the following bulleted items:
Correlation objects are used to manage the real-time analysis of incoming event data.

CorrelatedEvents Count Method
CorrelatedEvents List Method
CorrelatedEvents Retrieve Method
CorrelationEngine Count Method
CorrelationEngine Create Method
CorrelationEngine Delete Method
CorrelationEngine List Method
CorrelationEngine Retrieve Method
CorrelationEngine Update Method
CorrelationRule Count Method
CorrelationRule Create Method
CorrelationRule Delete Method
CorrelationRule List Method
CorrelationRule Retrieve Method
CorrelationRule Update Method
DynamicList Count Method
DynamicList Create Method
DynamicList Delete Method
DynamicList List Method
DynamicList Retrieve Method
DynamicList Update Method
DynamicListItem Count Method
DynamicListItem Create Method
DynamicListItem Delete Method
DynamicListItem List Method
DynamicListItem Retrieve Method

With each of those being a link, you can see there are options to manipulate engines, rules, dynamic lists, and the items within dynamic lists. If you have worked with Sentinel for more than a few hours, this may start to get really exciting. Have you ever wanted to see, programmatically, what is in a running system's dynamic list, either one that is there by default or a custom one created by you? That's pretty neat. Better yet, have you wanted to update the silly thing to have a new value on the fly, maybe based on something else happening in another (e.g. Identity Management) system?

For example, pretend you hire Bob, and he's the new administrator for SystemXYZ. You want to monitor Bob's account more closely, either because he's new and you aren't sure about his shifty eyes, or because he's privileged and you aren't sure that his password is strong enough to be hard to guess considering the number of sticky notes he's hiding under his keyboard. Whatever the reason, there is a need to monitor him more closely, which you have already implemented in Sentinel, as long as you can get his account into that list of monitored folks. You can do this manually, as you have done for years, or the API above makes this appear not just possible, but easy.

More on that scenario later, but for now let's look at the CorrelationEngine Update method, as this is the one we'll use for our current task of moving all deployed Correlation Rules from one CE on the (main) Reporting Server to a dedicated CE in the environment.

The resulting page has some details like this which, if you have ever done anything with REST or maybe even SOAP, should start to make a little bit of sense:

Basically the URI shows that the REST interface needs the ID (UUID) of the Correlation Engine to access the appropriate CE, and from there the lower section of documentation makes a lot of sense:

Authentication Types

Sentinel Permissions Needed

Supported Formats

URL Parameters

Success Codes
204 No Content

Fault Codes
400 Bad Request
403 Forbidden
404 Not Found
500 Internal Server Error
503 Service Unavailable

This section of the page gives us great information, and basically everything we need for a high-level idea of how to call this API. Which rights do we need? Either manageCorrelation or to be an admin. Which formats are used? For requests, JSON (which is common for REST APIs, along with XML). What are the possible return codes? Standard HTTP ones so we can easily watch for those and our REST/HTTP clients will happily let us see those as they should.
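Since these are plain HTTP status codes, a script can branch on them directly. Here is a minimal sketch that maps the codes listed above to a short verdict; the function name classify_status and the wording of the messages are my own for illustration and not part of the Sentinel documentation:

```shell
# Classify an HTTP status code using the Success and Fault Codes listed for
# the CorrelationEngine Update method. Function name and messages are
# hypothetical; only the code numbers come from the documentation page.
classify_status() {
    case "$1" in
        204) echo "success: no content returned" ;;
        400) echo "fault: bad request" ;;
        403) echo "fault: forbidden" ;;
        404) echo "fault: not found" ;;
        500) echo "fault: internal server error" ;;
        503) echo "fault: service unavailable" ;;
        *)   echo "unexpected: not listed for this method" ;;
    esac
}

classify_status 204
classify_status 404
```

Anything that falls through to the last branch is worth a closer look with curl's '-v' switch.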

Continuing down the page we find more about that 'Request' section which is critical to our success:
Request Data

Object type: correlation-engine
Correlation Engine.
Field          Required  Description
Active         false     Needs description.
DeployedRules  false     Needs description.
Name           true      Needs description.

Sample Request

We'll see if I can get the table to render properly, but basically that top section is a table showing which fields are present in the HTTP (REST) Request, whether or not they are required, and then some kind of description. The lower section shows a sample request; of note is the fact that we use an HTTP PUT (vs. GET and POST with which most web developers are familiar) to send the data to Sentinel. The data following 'PUT https://...' is the JSON (JavaScript Object Notation) content which is being sent to instruct Sentinel on what to do with the object referenced in the URI. The big long string at the end of the curl line is the UUID of the Correlation Engine (CE) which we want to modify. If you use an invalid UUID, of course, you'll get a nastygram back from Sentinel indicating it could not find that object. Testing this to prove it, the following XML came back along with an HTTP 404 header which is visible with curl using the '-v' (verbose) switch:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><WrapperNcacFault><Fault><Code><Value>Server</Value><Subcode><Value>NotFound</Value></Subcode></Code><Reason><Text>Object of type 'correlation-engine' with key values '[696080E0-9A20-1029-ADDD-0003BAC97ADFD]' not found.</Text></Reason></Fault></WrapperNcacFault>

One thing that probably comes to mind, then, is how you get access to that big UUID for the URI, and also whether 'DeployedRules' in the JSON should be set to null like that. One step that answers both questions is to use another API, the CorrelationEngine List method, to find out about the currently-existing CEs for this environment. One really nice thing about the built-in help is that the URIs generally work, meaning they are already modified for your system. For example, the CorrelationEngine List method includes a URI for my system, and if I drop it into my browser (open a new tab in the existing browser, paste) I immediately see the returned results. This works because the API uses the same type of authentication as the main web interface, so your browser is (as one would expect) a natural REST API browser, better suited for some tasks than others.

At this point I am going to switch over to playing from the command line instead, as the ultimate goal is to make changes and the CLI is better for "real work", in my opinion, as we move forward. Borrowing from the Sentinel-supplied tutorial, which uses 'curl' for everything, I will run the following command from the server's shell to set up a SAML token for everything else (curl will prompt me for the password of my user, specified as 'admin' below):
curl -k -X POST --basic -u admin https://localhost:8443/SentinelAuthServices/auth/tokens | sed -rn 's/\{"Token":"([^"]*)".*/\1/p' >saml-token
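As an aside, the sed step can be tried offline to see what it keeps. This is a minimal sketch, assuming the response body starts with {"Token":"..."} as the sed pattern implies; the sample response, its extra field, and the check_token_file helper are made up for illustration:

```shell
# Hypothetical response body. Only the leading {"Token":"..." shape is implied
# by the sed pattern; the token value and the second field are invented.
response='{"Token":"EXAMPLE-TOKEN-VALUE","Other":"ignored"}'

# Keep only what sits between the quotes after "Token": and drop the rest.
token=$(printf '%s\n' "$response" | sed -rn 's/\{"Token":"([^"]*)".*/\1/p')
echo "$token"

# Hypothetical guard for the real file: an empty or missing saml-token usually
# means the login (or the extraction) did not work.
check_token_file() {
    [ -s "$1" ] || { echo "token file $1 is empty or missing" >&2; return 1; }
}
```

Running check_token_file saml-token right after the login is a cheap way to catch a mistyped password before any other call fails in a less obvious way.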

The following command then uses that generated saml-token file to make the REST API call to get all CEs from the system:
curl -k -X GET -H "Authorization: X-SAML $(cat saml-token)" https://localhost:8443/SentinelRESTServices/objects/correlation-engine

Resulting in the following output:

Some things which stand out include the @href property in the meta section; this includes the UUID of the CE, and is in fact a link which can be used with the curl command above to get the details of just one CE at a time. This is useful, since this is also the same URI that we'll use to perform updates shortly, telling each CE exactly how to be configured.
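Pulling the UUID out of such a link is a one-liner in the shell. A minimal sketch, assuming the @href value follows the same URI pattern as the commands in this article (the href below is assembled by hand for illustration, not copied from real output):

```shell
# Hypothetical @href value, built from the URI pattern used in this article.
href='https://localhost:8443/SentinelRESTServices/objects/correlation-engine/696080E0-9A20-1029-ADDD-0003BAC9707D'

# Strip everything up to and including the last slash, leaving the UUID.
ce_uuid="${href##*/}"
echo "$ce_uuid"
```

The ${variable##pattern} expansion is plain POSIX shell, so no extra tools are needed for this step.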

Also you likely see a huge section within DeployedRules on one CE, and that is the main server with its 118 deployed rules. It is these rules, represented by their respective UUIDs, that we want to move to the other server. For some of you who have played with REST in the past, this is enough information to get going, but there are a few application-specific caveats to note which cannot be understood by simply seeing the method definition for the API, or the JSON output. First, you cannot deploy any single rule to multiple CEs, so attempting to set any of these on the new CE before removing the same from the current CE is going to result in an error from Sentinel that will not be as clear as you may prefer. Using curl with -v you'll see the following:
< HTTP/1.1 100 Continue
< HTTP/1.1 400 Bad Request
< Date: Tue, 23 Dec 2014 15:09:59 GMT
< Content-Type: application/xml
< Content-Length: 234
< Server: Jetty(8.1.7.v20120910)
* Connection #0 to host left intact
* Closing connection #0
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><WrapperNcacFault><Fault><Code><Value>Sender</Value><Subcode><Value>BadData</Value></Subcode></Code><Reason><Text>Object already exists.</Text></Reason></Fault></WrapperNcacFault>
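When scripting this, it helps to pull just the human-readable parts out of that fault document. Here is a rough sketch using sed on the exact XML shown above; it assumes the fault arrives on a single line with one Subcode and one Reason element, and a real XML parser would be the sturdier choice:

```shell
# The fault body returned with the HTTP 400 above, stored as one line.
fault='<?xml version="1.0" encoding="UTF-8" standalone="yes"?><WrapperNcacFault><Fault><Code><Value>Sender</Value><Subcode><Value>BadData</Value></Subcode></Code><Reason><Text>Object already exists.</Text></Reason></Fault></WrapperNcacFault>'

# Extract the subcode and the reason text from the single-line document.
subcode=$(printf '%s\n' "$fault" | sed -n 's/.*<Subcode><Value>\([^<]*\)<\/Value>.*/\1/p')
reason=$(printf '%s\n' "$fault" | sed -n 's/.*<Reason><Text>\(.*\)<\/Text><\/Reason>.*/\1/p')

echo "$subcode: $reason"
```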

If you try to deploy a rule to a second CE using the web UI, you'll find pretty quickly that the options to make this possible are grayed out until the rule is first undeployed from its current CE. The Sentinel product documentation may also define this constraint explicitly, but experience is probably the best teacher here. Despite having used Sentinel for a long time, I wanted to try this myself to be sure that my import worked before I broke my export, and was sadly reminded that the API is enforcing the limit no matter what I tried.

We know where we are headed, so let's just go there. If inclined, run the command above and redirect it to a backup file just so we have a record of everything before we started hacking with REST (backups are good, as are dev and test environments, snapshots of VMs, etc.). Once finished, run the command one more time appending the main server's UUID to the URI so that you get just the big bulky output with all deployed rules in one file. This file will then be modified (with a text editor for now) for use importing to the new CE.
curl -k -X GET -H "Authorization: X-SAML $(cat saml-token)" https://localhost:8443/SentinelRESTServices/objects/correlation-engine/696080E0-9A20-1029-ADDD-0003BAC9707D > reporting-server-ce.json
cp reporting-server-ce.json ce-only.json

At this point we have two identical files, both holding the CE configuration of the main/reporting Sentinel server. Now we will modify both files to meet our needs, and here is where syntax errors will likely cause most of us problems. JSON is a simple structure, but matching brackets, braces, and double quotes (when they're used) for containment of data, plus understanding colons and commas to break up what are basically fancy name/value pairs, can get the best of us.

Note: There are free tools online to validate JSON so if you feel comfortable doing so, validate the JSON before trying to import it using a tool like this one (found via Google): The name should look familiar to anybody who has done serious XML hacking before as a reference to the 'xmllint' command that can also be used for XML validation. For those with programming interests, validating JSON input can also be done with just about any programming language out there. I was recently shown that Python has some great JSON libraries, so perhaps try those out, or use the Rhino classes that Sentinel uses on its backend for a Java implementation of JS.
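For an offline check from the same shell, Python's standard library includes a json.tool module that fails on malformed input. A minimal sketch, assuming python3 is installed on the box; the validate_json name and the two throwaway sample files are only for illustration:

```shell
# validate_json FILE: 0 if FILE parses as JSON, 2 if python3 is not
# available, any other non-zero status if the JSON is malformed.
validate_json() {
    command -v python3 >/dev/null 2>&1 || { echo "python3 not found" >&2; return 2; }
    python3 -m json.tool "$1" >/dev/null 2>&1
}

# Throwaway samples: one well-formed file, one with a mismatched bracket.
tmpdir=$(mktemp -d)
printf '{"DeployedRules": []}\n' > "$tmpdir/good.json"
printf '{"DeployedRules": [}\n' > "$tmpdir/bad.json"

for f in good bad; do
    rc=0
    validate_json "$tmpdir/$f.json" || rc=$?
    case "$rc" in
        0) echo "$f.json: valid JSON" ;;
        2) echo "$f.json: could not check" ;;
        *) echo "$f.json: INVALID JSON" ;;
    esac
done
rm -rf "$tmpdir"
```

Running validate_json against each edited file before the PUT catches the bracket and comma slips without sending anything to Sentinel.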

First let's start by un-deploying rules, and we'll do this for two reasons.

  1. This is the easier transformation by far, and in fact we can type it out.

  2. It is required to do this before we can deploy the rules somewhere else.

Go ahead and practice modifying the file directly if interested, but at this time we ONLY want the DeployedRules block to be present. The caveat here is that you CANNOT move/remove it from its place in the JSON file, meaning you cannot move DeployedRules to the very start of the file, taking it outside of its enclosing braces, or the input will then be invalid per the method's expectations. The final result could look something like this:
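Since this one is short enough to type, it can also be generated from the shell. This is a minimal sketch that assumes, per the description here, that a top-level object holding only an empty DeployedRules list is what the method accepts; the function name and the temporary file are hypothetical:

```shell
# write_undeploy_payload FILE: write a JSON object whose only member is an
# empty DeployedRules list, still enclosed in the top-level braces.
write_undeploy_payload() {
    printf '{\n  "DeployedRules": []\n}\n' > "$1"
}

# Demonstrate on a temporary file rather than touching the real export.
payload=$(mktemp)
write_undeploy_payload "$payload"
cat "$payload"
rm -f "$payload"
```

To follow the article's file names, the call would be write_undeploy_payload reporting-server-ce.json, which overwrites that file, so keep the backup taken earlier.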


DeployedRules stays in its place (as far as JSON depth is concerned) within the braces that enclose everything, and the list of values within the brackets is zero-length (meaning no values). If reporting-server-ce.json contained this alone, the following command (which must have the correct server's UUID at the end of the URI) would undeploy all rules and do nothing else:

curl -k -X PUT -H "Authorization: X-SAML $(cat saml-token)" -H 'Content-Type: application/json' -d @./reporting-server-ce.json https://localhost:8443/SentinelRESTServices/objects/correlation-engine/696080E0-9A20-1029-ADDD-0003BAC9707D

Note that this command has an added header specifying the Content-Type of the PUT data, so that the REST application knows what it is receiving is JSON data and not something else. Without this header I had some silent returns from 'curl' which did not include errors, but also did not work. Adding the -v (verbose) parameter to 'curl' let me see that the PUT data resulted in an HTTP 415 Unsupported Media Type error which makes sense as the default Content-Type for the -d option in curl is apparently 'application/x-www-form-urlencoded', i.e. the wrong format even though the actual data structure was correct. There are HTTP specifics that we could get into about why this happened, but for now just include the Content-Type header that appropriately matches the type of data being sent in.
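To avoid that kind of silent failure in a script, curl can print just the status code so the script can compare it against the documented 204. A minimal sketch wrapping the same PUT; the put_json name and the SAML_TOKEN_FILE variable are my own, while -s, -o, and -w '%{http_code}' are standard curl options:

```shell
# put_json FILE URL: PUT FILE as JSON and succeed only on HTTP 204.
# SAML_TOKEN_FILE is a hypothetical override; it defaults to ./saml-token.
put_json() {
    status=$(curl -k -s -o /dev/null -w '%{http_code}' -X PUT \
        -H "Authorization: X-SAML $(cat "${SAML_TOKEN_FILE:-saml-token}")" \
        -H 'Content-Type: application/json' \
        -d @"$1" "$2")
    if [ "$status" = "204" ]; then
        echo "OK: HTTP $status"
    else
        echo "FAILED: HTTP $status" >&2
        return 1
    fi
}

# Example call (not run here), using the names from this article:
# put_json reporting-server-ce.json https://localhost:8443/SentinelRESTServices/objects/correlation-engine/696080E0-9A20-1029-ADDD-0003BAC9707D
```

Because -o /dev/null discards the response body, rerun the failing call with '-v' to see the fault XML when the function reports a failure.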

On my system this command took a couple of seconds to return, maybe three or four. Looking in Sentinel's web UI I immediately see that my rules are no longer deployed to the CE on the Reporting Server, which is exactly what I wanted. The next step is to redeploy these rules to the other CE. Since we understand the necessary JSON structure, let's try modifying the ce-only.json file to ONLY have a DeployedRules section, but this time with all of the values from the other system (the Reporting Server's CE). Once done, try importing and let's see what happens. The resulting file should look something like this:


Applying our curl command, remembering to modify the URI to refer to the other CE by UUID at the end, we should run something like this:

curl -k -X PUT -H "Authorization: X-SAML $(cat saml-token)" -H 'Content-Type: application/json' -d @ce-only.json https://localhost:8443/SentinelRESTServices/objects/correlation-engine/682A7D0-9BB0-1FF9-A64D-000523C9716E

Again we are sending in the ce-only.json file, which should have a list of all of our previously-deployed rules from the main Reporting Server; we're specifying the Content-Type header properly, and we have modified the URI at the end of the curl command to have the UUID of the CE-only system as retrieved much earlier. This command, too, should take a few seconds to run, and the Sentinel web UI should show all of our rules now deployed on the other CE. If you look at the properties/statistics of the rules, you should see that their counters for the number of events processed are relatively low based on your system's event rate.

What we have shown is that the REST API, as it should be, is very powerful and lets us do tasks that should take seconds in seconds and do so very reliably, rather than taking hours, and being very error-prone. Imagine the number of clicks, keystrokes, and pulls-of-your-hair it would take to undeploy 118 rules, find them all again and redeploy them on the other CE, and the power becomes pretty apparent. Also consider what could be done should the hardware fail for one CE; automation could be put in place to detect that type of occurrence using tools like the SUSE Linux Enterprise Server (SLES) High Availability Extension (HAE) to then immediately migrate the resource of all correlation rules to another CE that has been passively waiting for work or running another subset of rules. Even without automation in place, using the steps above manually to move deployed rules from one system to another means the failure of one piece of hardware is a problem that only takes a few minutes of time away from production, leaving you with tons of time to work out how to replace the physical machine.

Have you found any REST API tricks with Sentinel? Share in the comments below, or write up a solution. Have ideas for tricks you'd like to see written-up? Share in the comments below and I'll see what I can do.

Happy computing!