
Botnet heartbeat tracking theory

This can be moved to the use cases section if we need to, but that part of the site doesn't seem to get a whole lot of traffic. My co-worker and I had a botnet heartbeat tracking theory we wanted to test, and while I haven't had time to really dig into the results, it shows promise (we found 2 P2P applications after a quick review of the data one day last week). Figured I would share it here and see if the community can improve on it.

The theory is as follows: many botnet heartbeats are under 1 KB in size, go over the same port, and to the same address. If you were to track unique sets of sourceIP, destIP, and destPort, would patterns "bubble up" at this traffic size level? This methodology wouldn't capture a Conficker-type bot at first glance.
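The core idea can be sketched outside ESM in a few lines. This is a hypothetical illustration (field names and sample flows invented), not the actual Active List implementation: count how often each unique (sourceIP, destIP, destPort) triple appears among small flows.

```python
from collections import Counter

# Hypothetical sketch of the tracking idea: count how often each unique
# (sourceIP, destIP, destPort) triple appears for flows under 1 KB.
def track_heartbeats(flows, max_bytes=1024):
    """flows: iterable of (source_ip, dest_ip, dest_port, bytes_out) tuples."""
    counts = Counter()
    for src, dst, dport, size in flows:
        if size < max_bytes:  # only small, heartbeat-sized flows
            counts[(src, dst, dport)] += 1
    return counts

flows = [
    ("10.0.0.5", "203.0.113.9", 8080, 300),   # repeated small flow
    ("10.0.0.5", "203.0.113.9", 8080, 310),
    ("10.0.0.5", "203.0.113.9", 8080, 295),
    ("10.0.0.7", "198.51.100.2", 443, 5000),  # too large, ignored
]
counts = track_heartbeats(flows)
# The repeated triple "bubbles up" with a count of 3.
```

A triple that keeps recurring at small sizes is exactly what the Active List "count" field surfaces.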

How we have implemented this thus far:

Active List – 2 day TTL; key fields: sourceIP, destIP, destPort

Rule – populates the Active List with the fields above so that the "count" field increments by 1; 30 min suppression on aggregated fields

Current Query Viewers on Active List

Original QVs (not including drilldowns)

- Most communicative: sourceIP | destPort | destIP | sum(count) – good for seeing some initial patterns

- Most common destination: destIP | count(unique sourceIP) – good so far for whitelisting destinations (Google, Yahoo, Disney, etc.). Eyeball threshold = 500

- Most communicative outbound: sourceIP | count(unique destIP) – Eyeball threshold = 1k

Subsequent QVs added to the dashboard

- Most common destination ports: destPort | count(unique sourceIP) | count(unique destIP) | sum(count). Ports 80 and 443 lead the way, but it is interesting to see the correlation between the number of unique source IPs and the number of unique destination IPs relative to the ports being used

  - Akamai P2P – port 3478, lowish source IP count relative to a higher number of destinations (6:21)

  - Abacast P2P – high source IP count relative to 2 destination ports (11:2)

- Most entries on list regardless of ports: sourceIP | source zone name | sum(count)

- Most unique ports relative to destIP: destIP | count(unique destPort) | count(unique sourceIP) | sum(count)
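The query-viewer aggregations above can be approximated in plain Python to see what they compute. A rough sketch of the "Most common destination" QV (field names and the sample entries are hypothetical):

```python
from collections import defaultdict

# Approximate the "Most common destination" QV:
# destIP | count(unique sourceIP), with an eyeball threshold of 500.
def most_common_destinations(entries, threshold=500):
    """entries: iterable of dicts with sourceIP, destIP, destPort, count."""
    sources_per_dest = defaultdict(set)
    for e in entries:
        sources_per_dest[e["destIP"]].add(e["sourceIP"])
    # Destinations contacted by many distinct sources are whitelist candidates.
    return {dst: len(srcs) for dst, srcs in sources_per_dest.items()
            if len(srcs) >= threshold}

entries = [
    {"sourceIP": f"10.0.0.{i}", "destIP": "8.8.8.8", "destPort": 443, "count": 1}
    for i in range(600)
] + [{"sourceIP": "10.0.0.1", "destIP": "203.0.113.9", "destPort": 8080, "count": 9}]
popular = most_common_destinations(entries)
# Only the widely contacted destination crosses the threshold.
```

The other QVs are the same pattern with different group-by keys (sourceIP, destPort, etc.).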

Would be interested in feedback and/or collaboration either in this thread or through PMs.

Message was edited by: Trisha Liu - added tags use_case, botnet


Hey Mark,

Yeah, I have always said there should be a way to track botnets through ArcSight.

Below is my theory that I played with for a while, but haven't looked at in a couple of weeks.

I was looking for Botnets that use common ports like 80, 8080, 443 and 8443.

If Bytes out > 0 and Bytes out < 1500

Traffic is Outbound only

Aggregate on Dest Add, Dest Zone, Source Add, Source Zone, Cust name, Bytes Out

Aggregating 2 in 1 hour

     First seen add to active list 1

          seen a 2nd time, remove from active list 1, and move to ignore list

Rule 2, same as rule 1, except it has to already be in active list 1,

           adds to active list 2

Rule 3, same as rule 2, except it has to be already be in active list 2.

          adds to active list 3

Rule 4,   Fires to analyst channel


The idea is that you shouldn't have an event firing every hour for 3 hours consecutively with the same byte count.
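One simplified reading of the tiered active-list cascade above can be sketched as follows. This is a hedged model, not the actual ESM rules: all names are invented, and the ignore-list branch is collapsed into a simple promotion chain (first sighting goes to list 1, each recurrence promotes to the next list, and a fourth sighting fires the alert).

```python
# Sketch of the tiered escalation: an aggregated key that recurs hour after
# hour with the same byte count is promoted through three lists, then alerts.
def process(key, lists, alerts, ignore):
    """key: the aggregated tuple (dest, zones, source, customer, bytes_out)."""
    if key in ignore:
        return
    for tier in (2, 1, 0):               # check the highest tier first
        if key in lists[tier]:
            lists[tier].discard(key)
            if tier == 2:                # seen in list 3 -> fire to analyst channel
                alerts.append(key)
            else:
                lists[tier + 1].add(key) # promote to the next list
            return
    lists[0].add(key)                    # first sighting -> active list 1

lists, alerts, ignore = [set(), set(), set()], [], set()
key = ("203.0.113.9", "ext", "10.0.0.5", "acme", 512)
for _ in range(4):                       # four consecutive hourly sightings
    process(key, lists, alerts, ignore)
# After the fourth sighting the alert fires.
```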

Let's keep the thread going.



Mark,

You are right on track with ways ArcSight can help track BotNets.  I have done similar work at various agencies with BotNet Use Cases.

I'm attaching a Use Case White Paper I wrote up based on some content I created.  It was successfully deployed during a Proof of Concept.  This plus other methods/theories could easily be incorporated into a series of capabilities all centered around BotNet tracking.

John


Greetings to All,

I deployed the following for a customer, detecting interesting TCP/UDP beaconing patterns with the following logic.

The method we are utilizing leverages variables and uses 2 real-time rules: one for UDP traffic and one for TCP traffic.

* 2 Real Time Rules -> NOTE: you have to use a Filter within the 3 event evaluations to leverage variables – as of this posting there is a bug in ArcSight ESM 4.5 in which variables cannot be evaluated within the conditions section of real-time rules. So create Filter #1 for TCP and Filter #2 for UDP.

And 2 real-time rules that leverage the appropriate filter. Each filter has a Beacon variable that uses the Time Difference in Seconds variable, calculating the difference between the End Time and Start Time of a TCP or UDP Teardown event.

=========

TCP Beacon Filter As Follows With Beacon Variable being Difference in Seconds between End Time and Start Time of the event being evaluated

Type = Base

Data Type =  FWSM

Name = TCP Teardown

Category Outcome = /Success

Bytes Out >= 32

Bytes Out <= 1500

Beacon <= 120

Traffic Direction = Outbound

Real Time Rule # 1 to Detect TCP Beaconing

Join Conditions

Event 2 Target Address = Event 1 Target Address

Event 3 Target Address = Event 1 Target Address

Event 2 Bytes Out = Event 1 Bytes Out

Event 3 Bytes Out = Event 1 Bytes Out

Event 1

TCP Beacon Filter

Event 2

TCP Beacon Filter

Event 3

TCP Beacon Filter

========

UDP Beacon Filter As Follows With Beacon Variable being Difference in Seconds between End Time and Start Time of the event being evaluated

Type = Base

Data Type =  FWSM

Name = UDP Teardown

Category Outcome = /Success

Bytes Out <= 64000

Beacon <= 120

Traffic Direction = Outbound

Real Time Rule # 2 to Detect UDP Beaconing

Join Conditions

Event 2 Target Address = Event 1 Target Address

Event 3 Target Address = Event 1 Target Address

Event 2 Bytes Out = Event 1 Bytes Out

Event 3 Bytes Out = Event 1 Bytes Out

Event 1

UDP Beacon Filter 

Event 2

UDP Beacon Filter

Event 3

UDP Beacon Filter


=========

* Traffic Direction Outbound

* Leveraging TCP Teardown/UDP event from the FWSM data type

* 1 Real Time Rule evaluating 3 events within 30 minute time window

* Beacon variable = Difference in seconds between the End Time and Start Time of the TCP Teardown Event

* Packet size bounds: >= 32 bytes and <= 1500 bytes for TCP (roughly the minimum TCP segment and the Ethernet MTU); for UDP packets, <= 64000 bytes in size

* Beacon Variable  <= 120 seconds

* Join conditions -> Event 1 Target Address = Event 2 Target Address, Event 3 Target Address = Event 1 Target Address --- Event 1 Bytes Out = Event 2 Bytes Out, Event 3 Bytes Out = Event 1 Bytes Out

* Join Conditions Matching within a 30 minute time window

* Category Outcome for Events 1/2/3 = /Success

* Event Type for Events 1/2/3 = Base events
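The rule logic summarized in the bullets above can be modeled in a short script. This is a rough illustration under the stated constraints (all names and timestamps are invented; it stands in for the ESM join conditions, not the actual rule): three qualifying teardown events to the same target with identical Bytes Out, each with duration <= 120 seconds, arriving within a 30-minute window.

```python
# Rough model of the TCP-beacon rule: timestamps in seconds, events sorted by end time.
def detect_beacon(events, max_duration=120, window=1800):
    """events: list of dicts with target, bytes_out, start, end keys."""
    candidates = {}
    for ev in events:
        if not (32 <= ev["bytes_out"] <= 1500):      # TCP filter byte bounds
            continue
        if ev["end"] - ev["start"] > max_duration:   # Beacon variable check
            continue
        key = (ev["target"], ev["bytes_out"])        # the join conditions
        times = [t for t in candidates.get(key, []) if ev["end"] - t <= window]
        times.append(ev["end"])
        candidates[key] = times
        if len(times) >= 3:                          # Events 1, 2, 3 matched
            return key
    return None

events = [
    {"target": "203.0.113.9", "bytes_out": 300, "start": 0,    "end": 10},
    {"target": "203.0.113.9", "bytes_out": 300, "start": 600,  "end": 610},
    {"target": "203.0.113.9", "bytes_out": 300, "start": 1200, "end": 1210},
]
hit = detect_beacon(events)
# Three matching teardowns within 30 minutes -> the rule fires on this pair.
```

The UDP variant is the same logic with the byte bound relaxed to <= 64000.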

Justin Jessup

Contractor

ArcSight Professional Services SOC Consultancy


Addendum

Matching Join Conditions Within 30 Minutes -


Good stuff so far! My plan, as I am able, is to implement each system and then compare results.

Interesting and very valuable.

One technique used by botters to avoid detection of their MC machines (or C&C, as they're sometimes called, for Command and Control) is the use of compromised proxies, so botted machines may be reporting back to Latvia via LA – which negates the Geo detection rule.

Another is the use of short-term domains: a botter will register hundreds if not thousands of domains, only some of which will be contacted by the compromised machines. The domain selection can even be controlled by hashing a string found on a perfectly innocent web site. (The real-world analogy here is a spy figuring out where to dead-drop an information packet based on a classified ad in a newspaper.)

So, one might enhance the rule by looking for connections to a) domains that no one in your enterprise has ever gone to before and/or b) going to a domain that has been registered for say, less than a month.
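Both enhancements reduce to simple checks once you have the data. A hypothetical sketch (the seen-before set and registration dates would come from your own traffic history and an external registration feed; all sample values here are invented):

```python
from datetime import date, timedelta

# Flag a domain if (a) no one in the enterprise has ever contacted it, or
# (b) it was registered less than min_age_days ago.
def suspicious_domain(domain, seen_before, registered_on,
                      today=date(2011, 1, 1), min_age_days=30):
    never_seen = domain not in seen_before                    # enhancement (a)
    reg = registered_on.get(domain)
    newly_registered = (reg is not None
                        and (today - reg) < timedelta(days=min_age_days))  # (b)
    return never_seen or newly_registered

seen_before = {"google.com", "yahoo.com"}
registered_on = {"xk3f9q.info": date(2010, 12, 20),
                 "google.com": date(1997, 9, 15)}
flag = suspicious_domain("xk3f9q.info", seen_before, registered_on)
# A never-seen, freshly registered domain is flagged; google.com is not.
```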


Theo - do you have a source of data flowing into ArcSight that shows domains that have been active for less than a period of time?

I implemented a few things relative to command and control lists. The results are basically either 1) the lists are stale or 2) I don't have anything on the network trying to hit those IPs. While I would like to believe number two, I suspect number one is more likely the reason.

On a whim, and not knowing how big it would be, I created a Trend that writes to an Active List. The Trend fires once a day, looks at all outbound traffic, and captures destination ports that were used by at least 10 unique IPs over that 24hr period – by zone (a heavy network segmentation project over the last year is reflected in ArcSight). Some zones are whitelisted and there are lots that don't talk outbound at all. The AL has about 230 entries, which is a lot lower than I expected.

The driver is an attempt to monitor the network for "normal" without monitoring every machine. I might have to adjust the threshold down from 10 unique machines. I added a QV that pulls up all of the AL entries with a count of 1, which would indicate they were just added to the list the previous night. I honestly haven't done much since. The longer term goal is to come up with a couple of use cases that will allow me to inject triggers into a tiered triage system I developed (we aren't 24x7 /shrug).
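The daily trend described above is a per-zone unique-source count with a threshold. A minimal sketch, assuming flows arrive as (zone, source IP, destination port) tuples (names and sample data invented):

```python
from collections import defaultdict

# Per zone, keep destination ports used by at least `threshold` unique
# source IPs over the day's outbound traffic.
def ports_by_zone(day_flows, threshold=10):
    """day_flows: iterable of (zone, source_ip, dest_port) tuples."""
    sources = defaultdict(set)
    for zone, src, dport in day_flows:
        sources[(zone, dport)].add(src)
    return {key for key, srcs in sources.items() if len(srcs) >= threshold}

day_flows = [("dmz", f"10.1.0.{i}", 443) for i in range(12)] + \
            [("lab", "10.2.0.1", 6667)]
entries = ports_by_zone(day_flows)
# Only ("dmz", 443) meets the 10-unique-source threshold; the lone
# lab connection does not make the list.
```

Entries that appear on the list with a count of 1 the next morning are the "new last night" cases the QV surfaces.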


Not sure if anyone else ever did anything along the lines mentioned above. I wanted to add transport protocol to the mix, but somehow the subsequent trends that were designed to replace what I had in place are broken/breaking. An aspect of that is being looked at by ArcSight support. The problem we have locally is we really don't have a SOC/SOC analysts who look at an active list scrolling by. The trick then is deriving meaningful data to alert people on, or just having it as a QV-type supplemental thing.


Hello John,

On page 4 of your whitepaper you wrote that it would be helpful to import the US-CERT Watchlist.

How or where can I download it? I haven't found it on the US-CERT site.

Karsten


That is not a public source, but other similar security services publish malicious IP lists that you can subscribe to.

john

John W. Bradshaw

Principal Sales Engineer, Federal

ArcSight, An HP Company

443-827-3700


I've done something similar in the past. As far as theory goes...

The basic idea is that you use a system of rules to interact with a very large active list that is acting as a matrix of data that both holds values and controls the order in which certain rules fire on certain conversations.

Basically you use firewall logs, an enormous active list, and rules to record every conversation with every IP registered outside the US. Then a second layer of rules waits for that pair to talk again. It compares the timestamps between the two conversations to get the interval between them. Then it gets passed into a set of looping rules that continue to monitor every instance of those two hosts talking.

Every time they talk, the rules measure the interval between them; if the interval is within 75% of the average interval for these two hosts (yes, it also constantly recalculates the average interval of conversations between the two hosts), then the rule will increment a 'PASS' value, meaning that the interval is within acceptable boundaries. If the new interval is not within 75% similarity to the average for those two hosts, then it will increment a 'FAIL' value, meaning that it was outside of acceptable bounds. Each subsequent conversation continues to loop through the same rules to measure the interval and increment the PASS or FAIL count.

There are two rules in the final tier that wait for the PASS or FAIL counts to reach certain values (I used 5 passes or 3 fails). If any conversation passes 5 times, then a final rule will fire letting you know that it has identified a beacon, and it will report the foreign IP and the average beaconing interval between the two. If it fails 3 times, then it will remove the pair from the active list and they will start the process over again the next time they talk. Doing it this way reduces the number of false positives the use case produces.

Generally speaking this is very, very accurate. A big strength is that it doesn't matter what port they talk on; if two hosts talk, it will find them even if they use identical or random ports for the conversations. You use the TTL of the active list to set the largest beacon interval that you want to watch for. Meaning if you set the TTL to 7 days, it will find any beacon even if the two hosts only exchange a single packet once a week.

This will only identify semi-regular beaconing with foreign hosts, so if the timings are random, or something constantly changes the ip that it talks to, this won't help you, but for what it's made to find, it's perfect.
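The looping PASS/FAIL logic above can be modeled as a small state machine per host pair. This is a toy sketch, not the rule implementation: it uses the post's parameters (running average interval, 5 passes alert, 3 fails reset), and interprets "within 75% similarity" as a deviation of at most 25% from the running average, which is an assumption.

```python
# Toy model of the looping interval check; all names invented.
class BeaconTracker:
    def __init__(self, tolerance=0.75, pass_limit=5, fail_limit=3):
        self.tolerance, self.pass_limit, self.fail_limit = tolerance, pass_limit, fail_limit
        self.last = self.avg = None
        self.passes = self.fails = self.n = 0

    def observe(self, ts):
        """Feed one conversation timestamp; return 'beacon', 'reset', or None."""
        if self.last is None:
            self.last = ts
            return None
        interval = ts - self.last
        self.last = ts
        if self.avg is None:                      # first interval seeds the average
            self.avg, self.n = interval, 1
            return None
        # PASS if the new interval deviates from the running average by no more
        # than (1 - tolerance), i.e. 25% here; otherwise FAIL.
        if abs(interval - self.avg) <= (1 - self.tolerance) * self.avg:
            self.passes += 1
        else:
            self.fails += 1
        self.avg = (self.avg * self.n + interval) / (self.n + 1)  # recalc average
        self.n += 1
        if self.passes >= self.pass_limit:
            return "beacon"                       # identified a beacon
        if self.fails >= self.fail_limit:
            return "reset"                        # drop the pair, start over
        return None

t = BeaconTracker()
results = [t.observe(ts) for ts in range(0, 700, 100)]  # regular 100s beacon
# A perfectly regular interval accumulates 5 passes and fires on the 7th event.
```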
