HOW TO - Enriching Logs with Entropy by External Mapper for Threat Hunting
This article was written and submitted by mr_ergene, Knowledge Partner.
Threat hunting often requires analysing logs manually, and entropy analysis is one of those manual methods. You can read more about it at https://www.redcanary.com/blog/threat-hunting-entropy/.
After finding freq_server, written by Mark Baggett (https://github.com/MarkBaggett/freq), I started searching for a way to implement this amazing function in ArcSight. My first attempts failed because of performance issues, but I finally got it to work!
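As a quick illustration of the idea (my own minimal sketch, not the exact method from the Red Canary post or from freq_server): a machine-generated domain tends to have higher character entropy than a human-chosen one.

```python
import math
from collections import Counter

def shannon_entropy(s):
    """Shannon entropy in bits per character of the string s."""
    counts = Counter(s)
    total = len(s)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A human-chosen name scores lower than a random-looking one:
print(round(shannon_entropy("google"), 2))        # -> 1.92
print(round(shannon_entropy("xk3j9qzv2m8w"), 2))  # -> 3.58
```

freq_server goes further than raw entropy (it scores character-pair frequency against English text), but the hunting idea is the same: rare, random-looking strings stand out numerically.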
This post is similar to the one on https://community.softwaregrp.com/t5/ArcSight-User-Discussions/External-Mapping-with-Python-script/td-p/1505287. I recommend reading it.
I've done the implementation for Microsoft DNS debug logs. To keep the post short, I'm not going to give details about installing/configuring the database part. You can check the post linked above or do a Google search.
So, let's start!
Prerequisites (everything is done on an Ubuntu server):
-Installing and configuring the freq_server:
1. Follow the steps on https://github.com/MarkBaggett/freq (my freq_server runs on http://127.0.0.1:9000)
-Postgresql Part (I recommend using PgAdmin tool):
1. Install the PostgreSQL server, plpython and pycurl
2. Enable plpython on PostgreSQL
2.a. Restart the PostgreSQL service
3. create plpython extension:
CREATE EXTENSION plpythonu;
4. create a type
CREATE TYPE dnsquery AS (dnsquery text,freq_score text);
5. create the function (note: the argument must be text[], since the connector passes an array; result must be initialized as an empty list):
CREATE OR REPLACE FUNCTION arcsight_freq_calc(qnames text[]) RETURNS SETOF dnsquery AS $$
import pycurl
import cStringIO
result = []
for qname in qnames:
    buffer = cStringIO.StringIO()
    uri = "http://127.0.0.1:9000/measure1/" + qname
    c = pycurl.Curl()
    c.setopt(c.URL, uri)
    c.setopt(c.WRITEFUNCTION, buffer.write)
    c.perform()
    data = buffer.getvalue()
    result.append([qname, data])
    buffer.close()
return result
$$ LANGUAGE plpythonu VOLATILE;
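The function body above is Python 2 (cStringIO, plpythonu). Outside the database, the same loop can be sketched in Python 3; fetch is injectable here so the logic can be demonstrated without a live freq_server, and the /measure1/ path and port 9000 are simply taken from the setup above.

```python
from urllib.request import urlopen

def freq_lookup(qnames, fetch=None, base="http://127.0.0.1:9000/measure1/"):
    """Return (qname, score) pairs, mirroring what arcsight_freq_calc does."""
    if fetch is None:
        # Default: really call the freq_server over HTTP.
        fetch = lambda uri: urlopen(uri).read().decode()
    result = []
    for qname in qnames:
        result.append((qname, fetch(base + qname)))
    return result

# With a stubbed fetch (no server needed), scores are made up for the demo:
fake = lambda uri: "7.5" if "zz" in uri else "2.1"
print(freq_lookup(["google.com", "zzqlkj.com"], fetch=fake))
# -> [('google.com', '2.1'), ('zzqlkj.com', '7.5')]
```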
-ArcSight Connector Part (external mapping):
1. Create a folder named "extmap" under the /current/user/agent/ folder.
2. Find the agent ID of the destination you want to perform external mapping on, and create a folder with the same name under the extmap folder you created. (example folder name: 3f+1w9h4BABCWK+nnWbNQnw==)
3. Create an extmap.0.properties file under the folder you created at step 2 and edit it:
type=sql
field.getter=destinationHostName
field.setter.count=1
field.setter=flexString1
jdbc.class=org.postgresql.Driver
jdbc.url=jdbc:postgresql://<postgresdb_server_IP>:5432/arcsight
jdbc.username=arcsight
jdbc.password=OBFUSCATE.4.9.0:Hy2sNwfbOcvdMg3ygNJRjykxA6CcOfeIzwOAjrYHcouLDA0X
jdbc.query=select dnsquery, freq_score from arcsight_freq_calc(ARRAY[?\u0000?])
Notes:
1. You can create a custom user and database on PostgreSQL or use the default ones.
2. If the PostgreSQL server is installed on a separate machine, don't forget to enable remote access.
3. The password in the extmap file must be generated on the connector that performs the external mapping; don't copy it from another connector's extmap file. The command is:
./arcsight agent runjava com.arcsight.agent.loadable._ExternalMapperComponent -p 'passwordForDbUser' (without quotes)
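Conceptually, the mapper reads the getter field from each event, runs the JDBC query with that value, and writes the result into the setter field. A hypothetical sketch of that behaviour (the event dict and score_of stand in for the connector and the database; this is not ArcSight's actual code):

```python
def apply_external_map(event, score_of,
                       getter="destinationHostName", setter="flexString1"):
    """Enrich one event the way extmap.0.properties describes:
    look up the getter value, store the score in the setter field."""
    qname = event.get(getter)
    if qname:
        event[setter] = score_of(qname)
    return event

# Made-up scores standing in for the arcsight_freq_calc query:
scores = {"payroll.corp.local": "1.8", "qx7zj2kw.info": "8.9"}
event = {"destinationHostName": "qx7zj2kw.info"}
print(apply_external_map(event, scores.get))
# -> {'destinationHostName': 'qx7zj2kw.info', 'flexString1': '8.9'}
```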
I used the Python requests module first, but performance was very slow. After some searching, I found that pycurl was the fastest method (https://stackoverflow.com/questions/15461995/python-requests-vs-pycurl-performance), which is why I used it.
I ran a simple benchmark on PostgreSQL and found that the function can handle about 1300 TPS with 1 thread and 1 client connection.
For connector performance: I installed a TCP syslog connector, applied the external mapping on it, and then forwarded DNS logs from the DNS trace connector as CEF. The syslog connector performed well at around 1300 EPS and even more, but I didn't test higher rates. If you apply the mapping on the same connector, check whether that connector can handle all the processing.
You can apply this mapping to proxy logs, file names, process names, service names, etc. All you need to do is set up a dedicated freq_server for each one and create frequency tables for them. Do not use the same frequency table for everything, and tune your frequency tables.
HOW TO HUNT
Analyse the logs and check what the score is for normal records. Usually, if the score is higher than 5, you can consider it suspicious. You can do this on Logger, investigate by searching, or create rules on ESM.
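Whatever tool you search with, the hunt itself boils down to filtering on the score field. A minimal sketch of that filter (the > 5 threshold is the one mentioned above; the events and the flexString1 field name match the mapping in this post, and the sample records are made up):

```python
def suspicious(events, threshold=5.0, field="flexString1"):
    """Yield events whose frequency score exceeds the threshold."""
    for e in events:
        try:
            if float(e.get(field, 0)) > threshold:
                yield e
        except ValueError:
            continue  # skip events with a non-numeric score

events = [
    {"destinationHostName": "intranet.corp", "flexString1": "2.3"},
    {"destinationHostName": "kj2x9qpl.biz", "flexString1": "7.8"},
]
print([e["destinationHostName"] for e in suspicious(events)])
# -> ['kj2x9qpl.biz']
```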