Outstanding Contributor.. Outstanding Contributor..

Anyone want to help me build a new primitive base parser?

Does anyone want to tackle a fun project in Quick Flex: building a parser for CylancePROTECT?

 

It's a syslog parser; the agent sends syslog via TCP. Here is a sample event:

 

08-23-2016 18:27:55 Local7.Debug 52.63.15.218 1 2016-08-24T01:27:01.4434447Z sysloghost CylancePROTECT - - - Event Type: Threat, Event Name: threat_quarantined, Device Name: TEST-DEV, IP Address: (192.168.10.138), File Name: Setup.exe, Path: C:\Users\test\AppData\Local\Temp\a2aarPfHPg\ikvfHG8B\, Drive Type: Internal Hard Drive, SHA256: FC4B40A33084FB965473D6B5A69B87B1930B4BBB7F5387B7D6C66E4069168931, MD5: 125F05165117D7C5A17B83B8347A9A9C, Status: Quarantined, Cylance Score: 89, Found Date: 8/24/2016 1:09:45 AM, File Type: Executable, Is Running: False, Auto Run: True, Detected By: BackgroundThreatDetection, Zone Names: (1,A Zone With A Very Long Name 123,Mac Zone,MM_Zone,Zone A,Zone B)

Outstanding Contributor.. Outstanding Contributor..

I will add a ZIP with the two sets of clean (i.e. test) syslog feeds.

 

I am building the token maps now, working out field by field where each value should go, using the Activate Framework and the malware/antivirus products as a template.

Outstanding Contributor.. Outstanding Contributor..

Okay, so the primitive (base) regex I came up with is here:

 

(?:\[?([\w\.]*\[\w.]*)\]?)?\s*(?:([^\[\(]+))

 

The downside: on the event line above, it matches every character except the opening ( of the IP Address and Zone Names values.

 

Any takers for how to adjust that?

New Member.

I will take a guess (though I may be wrong)...

\\\(

Have to escape the "(" along with declaring it a string (again, not a regex expert, so I may be way off).
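
To illustrate the escaping idea outside the connector, here is a minimal Python sketch. It only shows that an escaped parenthesis matches a literal one while an unescaped pair forms a capture group; the pattern and the helper name extract_ip are made up for this example and are not taken from any parser file.

```python
import re

# Shortened fragment of the sample event from the first post.
LINE = "Device Name: TEST-DEV, IP Address: (192.168.10.138), File Name: Setup.exe"

# \( and \) match literal parentheses; the unescaped pair between them
# is a capture group that holds whatever sits inside the parentheses.
IP_IN_PARENS = re.compile(r"IP Address: \(([^)]*)\)")


def extract_ip(line):
    """Return the text inside the parentheses after 'IP Address:', or None."""
    match = IP_IN_PARENS.search(line)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_ip(LINE))
```

In a SmartConnector parser file each backslash would additionally need doubling, as discussed further down the thread.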

 

Micro Focus Frequent Contributor

Hello!

I just started taking a look at this and there are some points:

It looks like this log is being sent through an extra syslog receiver. Normally we should not receive the "08-23-2016 18:27:55 Local7.Debug 52.63.15.218 1", only the subsequent section.

The next section should also be parsed, automatically, by the syslog. In our case, the first section should be parsed. Anyway, I would ignore this until the syslog configuration can be adjusted to remove the double header.

This is not a job for a single parser; it should be treated as a multi-parser. Because it is syslog, the header has to be handled by a regex, but everything else is key-value. For example:

 

Event Type = Threat

Event Name = threat_quarantined

Device Name = TEST-DEV

IP Address = 192.168.10.138 -- This one requires some extra processing to remove the parentheses

It keeps going.
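
As a rough illustration of the key-value idea (not the SmartConnector keyvalue parser itself), here is a minimal Python sketch. It assumes pairs are separated by a comma plus space and that every key starts with a capital letter and ends with a colon plus space; the names parse_key_values and strip_parens are hypothetical, and the message is a shortened copy of the sample event.

```python
import re

# Shortened copy of the key-value part of the sample event.
MESSAGE = (
    "Event Type: Threat, Event Name: threat_quarantined, "
    "Device Name: TEST-DEV, IP Address: (192.168.10.138), "
    "File Name: Setup.exe, Status: Quarantined, Cylance Score: 89, "
    "Found Date: 8/24/2016 1:09:45 AM, "
    "Zone Names: (1,A Zone With A Very Long Name 123,Mac Zone,MM_Zone,Zone A,Zone B)"
)

# Split on ", " only when it is followed by something that looks like "Key: ".
# The commas inside Zone Names have no space after them, so they are left alone.
PAIR_BOUNDARY = re.compile(r", (?=[A-Z][A-Za-z0-9 ]*: )")


def parse_key_values(message):
    """Turn 'Key: value, Key: value' text into a dict of raw string values."""
    fields = {}
    for pair in PAIR_BOUNDARY.split(message):
        key, _, value = pair.partition(": ")
        fields[key] = value
    return fields


def strip_parens(value):
    """Remove one pair of surrounding parentheses, if present."""
    if value.startswith("(") and value.endswith(")"):
        return value[1:-1]
    return value


if __name__ == "__main__":
    fields = parse_key_values(MESSAGE)
    print(strip_parens(fields["IP Address"]))
    print(strip_parens(fields["Zone Names"]).split(","))
```

The lookahead is what keeps the time in Found Date and the comma-separated zone list from being split into bogus pairs.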

All that said, if we want to match the exact string we received, this would work as a regex (there are no literal spaces in this regex):

(?:\d{2}-\d{2}-\d{4}\s\d\d:\d\d:\d\d\s\S+\s\S+\s\w)\s(\d{4}-\d{2}-\d{2}\w\d{2}:\d{2}:\d{2}\.\d{7}\w)\s(\S+)\s(\S+)\s\S+\s\S+\s\S+\s(.*)

As for the SmartConnector version of the regex, it requires double backslashes. Use this version:

(?:\\d{2}-\\d{2}-\\d{4} \\d\\d:\\d\\d:\\d\\d \\S+ \\S+ \\w)\\s(\\d{4}-\\d{2}-\\d{2}\\w\\d{2}:\\d{2}:\\d{2}\\.\\d{7}\\w)\\s(\\S+)\\s(\\S+)\\s\\S+\\s\\S+\\s\\S+\\s(.*)
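
The doubling can be done mechanically. The Python sketch below assumes the doubling is needed only because the parser file treats a single backslash as an escape character (as Java-style properties files do); the helper name to_properties_form is hypothetical.

```python
def to_properties_form(regex):
    """Double every backslash so the regex survives properties-style unescaping."""
    return regex.replace("\\", "\\\\")


if __name__ == "__main__":
    # Tail of the plain regex above; the output should match the tail of the
    # double-backslash version.
    plain_tail = r"\s(\S+)\s(\S+)\s\S+\s\S+\s\S+\s(.*)"
    print(to_properties_form(plain_tail))
```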

I am ignoring the mapping of the first portion of the syslog header (this will probably need some modification, but it matches the current line).

The last capture group (the final pair of parentheses) should be mapped to some field, let's say flexString1. We will use this field to chain-load an extra processor for keyvalue parsing.

Using the keyvalue has the added benefit of making it really easy to parse any extra fields that can show up on different events.
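
For anyone who wants to check the header regex before loading it into a connector, here is a small Python sketch that runs the single-backslash version against a shortened copy of the sample line. The function name split_header and the dictionary keys are made up for this example; the last group is the part that would be handed to the key-value step.

```python
import re

# Single-backslash version of the regex, split across lines for readability.
HEADER = re.compile(
    r"(?:\d{2}-\d{2}-\d{4}\s\d\d:\d\d:\d\d\s\S+\s\S+\s\w)\s"
    r"(\d{4}-\d{2}-\d{2}\w\d{2}:\d{2}:\d{2}\.\d{7}\w)\s"
    r"(\S+)\s(\S+)\s\S+\s\S+\s\S+\s(.*)"
)

# Sample line from the first post, with the key-value part shortened.
LINE = (
    "08-23-2016 18:27:55 Local7.Debug 52.63.15.218 1 "
    "2016-08-24T01:27:01.4434447Z sysloghost CylancePROTECT - - - "
    "Event Type: Threat, Event Name: threat_quarantined, Device Name: TEST-DEV"
)


def split_header(line):
    """Return the four captured groups as a dict, or None if the line does not match."""
    match = HEADER.match(line)
    if match is None:
        return None
    timestamp, host, product, message = match.groups()
    return {
        "timestamp": timestamp,
        "host": host,
        "product": product,
        "message": message,  # candidate for flexString1 and key-value parsing
    }


if __name__ == "__main__":
    print(split_header(LINE))
```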

I would like to help finish building this and explain my process, but I think a better source would be required. Can the OP confirm how this file was generated? I feel like if the stream is sent directly to a SmartConnector we won't see the first header. The device sending the event may need some renaming so that it sends its own hostname instead of sysloghost.

Outstanding Contributor.. Outstanding Contributor..

I have uploaded the files I am currently working with. These were provided by the vendor and are not from my production environment.

The attached ZIP contains a CSV and what the vendor called a CLEAN version of the same events.

I can grab replay events out of my own environment if someone has directions on how to pull replay events off a running SmartConnector on an ArcMC appliance.

Micro Focus Frequent Contributor

The files you are working with don't look right to me. They could be fine, but they are definitely not in a standard format. Shall we try to extract your raw logs?

There are some different ways to do it:

If you have ESM, you could enable the "Preserve Raw Log" option on the connector for the ESM destination (double-click the connector in ESM, go to the Default tab and look for "preserve raw"). After that, open the active channel, add the Raw Event column and see if there's data there. Select the events and export all columns (please make sure there's no private/secret data in there before sharing).

If you don't have ESM, you can do all this from ArcMC, but I would add a CEF File destination and enable the preserve raw log option there. After extracting the file and confirming you have the raw log in it, I would delete the destination.

I don't have any quick guides to share on that front, feel free to send me a private message on the forum if you need.

Outstanding Contributor.. Outstanding Contributor..

Okay, so the 6.11 Patch 1 ESM did something that the old Express 3.0 and 4.0 systems would have choked on.

I got a CSV out of it, and it has the raw event field.

So I take it I strip the file down to just that field and use that for the replay events?
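
If that is the approach, stripping the export down to one column is a few lines of Python. This is only a sketch under assumptions: the column header is taken to be "Raw Event" (the real export header may differ), the replay file is assumed to be plain text with one event per line, and the function names are made up for the example.

```python
import csv
import io


def extract_raw_events(csv_text, column="Raw Event"):
    """Return the non-empty values of one column from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[column] for row in reader if row.get(column)]


def to_replay_text(raw_events):
    """Join raw events into a plain text body, one event per line."""
    return "".join(event + "\n" for event in raw_events)


if __name__ == "__main__":
    # Tiny stand-in for the ESM export; the second data row has no raw event.
    sample = (
        "Name,Raw Event\r\n"
        'Alert,"<41>1 2017-09-28T20:15:21.7526558Z sysloghost CylancePROTECT - - - '
        'Event Type: ScriptControl, Event Name: Alert"\r\n'
        "Other,\r\n"
    )
    print(to_replay_text(extract_raw_events(sample)), end="")
```

The csv module handles the quoting, which matters here because the raw events themselves contain commas.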

Outstanding Contributor.. Outstanding Contributor..

<41>1 2017-09-28T20:15:21.7526558Z sysloghost CylancePROTECT - - - Event Type: ScriptControl, Event Name: Alert, Device Name: H-OSTNA-ME, File Path: c:\users\XXX\appdata\local\temp\kilab74.vbs, Interpreter: ActiveScript, Interpreter Version: 5.8.7600.16385, Zone Names: (5_LH_Endpoints_L5)

 

<41>1 2017-09-28T20:15:21.7526558Z  <-- date/time

sysloghost  <-- this is in every event

CylancePROTECT  <-- device product

- - -  <-- in every event

Event Type: ScriptControl,  <-- event found by

Event Name: Alert,  <-- alert

Device Name: H-OST-NAME,  <-- device it happened on

File Path: c:\users\USERNAME\appdata\local\temp\kilab74.vbs,  <-- file path to the script trying to exploit the system

Interpreter: ActiveScript,  <-- verified by anti-virus module

Interpreter Version: 5.8.7600.16385,  <-- AV engine version

Zone Names: (5_LH_Endpoints_L5)  <-- zone name for where in the application the device can be found

Micro Focus Frequent Contributor

All right,

This actually looks like something I would expect:

<41> -> Syslog header, means syslog.alert (facility.level)

1 -> this is the version field of the newer syslog format (RFC 5424); I think our Framework parser has seen this before and will properly ignore it

date portion -> ok

sysloghost -> hostname portion... this is wrong and should have the actual hostname of your device sending the event. I can work around it and grab the ip address of the sending device, not a problem

CylancePROTECT -> will use this to identify the regex

- - - -> will be the last thing matched by the regex parser; everything else will go to keyvalue.
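
The breakdown above can be reproduced with a short Python sketch. The priority value is facility * 8 + severity, so 41 decodes to facility 5 (syslog) and severity 1 (alert). The regex assumes the simple case seen in this event, where each header field is a single space-free token; the function names and the shortened sample line are made up for the example.

```python
import re

# Facility and severity keywords in numeric order (0-23 and 0-7).
FACILITIES = [
    "kern", "user", "mail", "daemon", "auth", "syslog", "lpr", "news",
    "uucp", "cron", "authpriv", "ftp", "ntp", "audit", "alert", "clock",
] + ["local%d" % i for i in range(8)]
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
HEADER = re.compile(r"<(\d{1,3})>(\d+) (\S+) (\S+) (\S+) (\S+) (\S+) (\S+) (.*)")


def decode_pri(pri):
    """Split a syslog priority value into (facility, severity) keywords."""
    facility, severity = divmod(pri, 8)
    return FACILITIES[facility], SEVERITIES[severity]


def parse_line(line):
    """Return the header pieces of one event, or None if the line does not match."""
    match = HEADER.match(line)
    if match is None:
        return None
    pri, version, timestamp, host, app, _procid, _msgid, _sdata, message = match.groups()
    facility, severity = decode_pri(int(pri))
    return {
        "facility": facility,
        "severity": severity,
        "version": version,
        "timestamp": timestamp,
        "host": host,
        "app": app,
        "message": message,  # the part that would go to the keyvalue parser
    }


if __name__ == "__main__":
    line = (
        "<41>1 2017-09-28T20:15:21.7526558Z sysloghost CylancePROTECT - - - "
        "Event Type: ScriptControl, Event Name: Alert"
    )
    print(parse_line(line))
```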

I will update the thread tomorrow with a basic parser and a line by line explanation. When I do, feel free to ask any questions!

 

For some reason I am not able to attach files or do new posts. I will check back on Monday to see if I can post

Micro Focus Frequent Contributor

I am attaching the zip file with the flexagent (it is working on my connector and I have the output on my ESM).

Based on the raw syslog you provided, I have created a file and treated it as a syslog file. In theory, your syslog should parse fine.

On my connector, I always modify two properties in agent.properties to look like this:

agents[0].customsubagentlist=flexagent_syslog

agents[0].usecustomsubagentlist=true

This will make sure the connector isn't doing any funky match of whatever log we receive against a different parser.

If we say the connector does parsing AND normalization, my files are taking care of the parsing portion. Assigning the right values to the right fields usually requires more events and a better understanding of the event source. Also, during normalization, some "tokens" may require extra parsing using token operators such as __regexToken, or even submessages.

Some of our connectors use extraprocessors for map files, so simple messages like "Alert" can have a better Event Name description.

Please read and review the parser, as it explains the reason for each line in it. I hope this helps everyone build better parsers in the future.

I am replacing my server's CPU this week and my access to work on it will be limited, but feel free to ask any questions.

Outstanding Contributor.. Outstanding Contributor..

Okay, so the base regex picks up everything, with one caveat:

the trailing comma that separates the event lines is the only non-essential character in the tokens.

In the CSV I added there are 21 fields to be tokenized.

So I am working on those now. Making content, building parsers, and doing upgrades all at the same time slows me down.

The opinions expressed above are the personal opinions of the authors, not of Micro Focus. By using this site, you accept the Terms of Use and Rules of Participation. Certain versions of content ("Material") accessible here may contain branding from Hewlett-Packard Company (now HP Inc.) and Hewlett Packard Enterprise Company. As of September 1, 2017, the Material is now offered by Micro Focus, a separately owned and operated company. Any reference to the HP and Hewlett Packard Enterprise/HPE marks is historical in nature, and the HP and Hewlett Packard Enterprise/HPE marks are the property of their respective owners.