Respected Contributor.

Flex Connector - special characters added to events


Hi Community,

 

I am trying to develop a Flex Connector for a system.

My regex is done and parses the events correctly when I test it on Regex101 (screenshot: flex.PNG).

But when I apply it in the connector, the events in the log file don't match, and I get these warnings in agent.log:

[2019-04-17 11:20:04,628][WARN ][default.com.arcsight.agent.sdk.a.t][parseTokensNow] Message [ < P R O C E S S _ C R E A T E D f i l e _ n a m e = " C : \ W i n d o w s \ S y s t e m 3 2 \ t a s k h o s t . e x e " p i d = " 6 1 3 6 " p r o c e s s _ n a m e = " C : \ W i n d o w s \ S y s t e m 3 2 \ s e r v i c e s . e x e " p p i d = " 5 8 4 " p a r e n t _ p r o c e s s _ n a m e = " C : \ W i n d o w s \ S y s t e m 3 2 \ w i n i n i t . e x e " c k s u m = " 0 0 2 d 0 7 9 a b f 2 1 7 b 3 f d c 3 f 0 c 8 6 a f 9 3 a 2 c 5 1 a b 4 5 2 2 7 " e v e n t _ t i m e = " 1 5 5 3 3 1 8 0 2 3 7 1 6 " e v e n t _ t i m e _ u t c = " M a r 2 3 2 0 1 9 : 0 5 : 1 3 : 4 3 " u s e r _ n a m e = " N T A u t h o r i t y \ S y s t e m " / > ] did not match the common regular expression [\<([A-Z_]+)\s+file_name="([a-zA-Z0-9\:\\.\-\_\s]+)"\spid="([0-9]{3,4})"\sprocess_name="([a-zA-Z0-9\:\\.\-\_\s]+)"\sppid="([0-9]{3,4})"\sparent_process_name="([a-zA-Z0-9\:\\.\-\_\s]+)"\s(?:cksum="([a-zA-Z0-9]+)"\s)?event_time="([0-9]+)"\sevent_time_utc="([A-Za-z]+\s+[0-9]{1,2}\s[0-9]{4}\:[0-9]{2}\:[0-9]{2}\:[0-9]{2})"\s(?:file_type="([a-z0-9\-]+)"\s)?(?:is_system_file="(true|false)"\s)?user_name="(.+)"\s\/\>], ignoring...

[2019-04-17 11:20:04,628][WARN ][default.com.arcsight.agent.sdk.a.t][parseTokensNow] Message [ < P R O C E S S _ C R E A T E D f i l e _ n a m e = " C : \ W i n d o w s \ S y s t e m 3 2 \ r a s e r v e r . e x e " p i d = " 7 4 7 2 " p r o c e s s _ n a m e = " C : \ W i n d o w s \ S y s t e m 3 2 \ s e r v i c e s . e x e " p p i d = " 5 8 4 " p a r e n t _ p r o c e s s _ n a m e = " C : \ W i n d o w s \ S y s t e m 3 2 \ w i n i n i t . e x e " c k s u m = " 2 c 5 5 0 a 3 2 b c 7 6 d d 6 c d f 9 7 7 7 d 5 8 6 3 3 1 2 f 6 9 f 3 e f e 0 5 " e v e n t _ t i m e = " 1 5 5 3 3 1 8 0 3 9 7 1 7 " e v e n t _ t i m e _ u t c = " M a r 2 3 2 0 1 9 : 0 5 : 1 3 : 5 9 " u s e r _ n a m e = " N T A u t h o r i t y \ S y s t e m " / > ] did not match the common regular expression [\<([A-Z_]+)\s+file_name="([a-zA-Z0-9\:\\.\-\_\s]+)"\spid="([0-9]{3,4})"\sprocess_name="([a-zA-Z0-9\:\\.\-\_\s]+)"\sppid="([0-9]{3,4})"\sparent_process_name="([a-zA-Z0-9\:\\.\-\_\s]+)"\s(?:cksum="([a-zA-Z0-9]+)"\s)?event_time="([0-9]+)"\sevent_time_utc="([A-Za-z]+\s+[0-9]{1,2}\s[0-9]{4}\:[0-9]{2}\:[0-9]{2}\:[0-9]{2})"\s(?:file_type="([a-z0-9\-]+)"\s)?(?:is_system_file="(true|false)"\s)?user_name="(.+)"\s\/\>], ignoring...

[2019-04-17 11:20:04,628][WARN ][default.com.arcsight.agent.sdk.a.t][parseTokensNow] Message [ < S A M P L E S _ C R E A T E D f i l e _ n a m e = " c : \ s a m p l e . f i l e " p i d = " 1 2 3 " p r o c e s s _ n a m e = " c : \ s a m p l e . e x e " p p i d = " 4 5 6 " p a r e n t _ p r o c e s s _ n a m e = " c : \ p a r e n t . e x e " c k s u m = " 1 2 3 a b c 4 5 6 d e f " e v e n t _ t i m e = " 1 2 3 4 5 6 " e v e n t _ t i m e _ u t c = " J a n 0 7 2 0 0 6 : 1 2 : 3 4 : 5 6 " u s e r _ n a m e = " s a m p l e \ l a n g " / > ] did not match the common regular expression [\<([A-Z_]+)\s+file_name="([a-zA-Z0-9\:\\.\-\_\s]+)"\spid="([0-9]{3,4})"\sprocess_name="([a-zA-Z0-9\:\\.\-\_\s]+)"\sppid="([0-9]{3,4})"\sparent_process_name="([a-zA-Z0-9\:\\.\-\_\s]+)"\s(?:cksum="([a-zA-Z0-9]+)"\s)?event_time="([0-9]+)"\sevent_time_utc="([A-Za-z]+\s+[0-9]{1,2}\s[0-9]{4}\:[0-9]{2}\:[0-9]{2}\:[0-9]{2})"\s(?:file_type="([a-z0-9\-]+)"\s)?(?:is_system_file="(true|false)"\s)?user_name="(.+)"\s\/\>], ignoring...

_____________________________________________________________________

It looks like the events have special characters or white space between every character.

I tried capturing the whole event in a single field using (.*) in the regex to see the full raw event in Logger.

This is what I got. In the graph view there seem to be special characters between the letters, but in the table view the event looks normal (screenshot: flex problem2.PNG).

Has anyone experienced this? Is there a problem with the log files themselves?

Thank you, Community!

Accepted Solutions
Micro Focus Expert

The default character set is UTF-8. You can specify a different encoding with the agents[x].encoding property, which specifies the encoding or character set used in the log file. Only encoding names recognized by Java are accepted; if an informal name is given, UTF-8 is assumed as the log encoding.

On Linux, running file sample.log can provide more information about the type of the log file.

--
Norbert
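
For hosts where the file utility is not available, a rough equivalent check can be sketched in Python. This is only an illustration: the function name guess_encoding is hypothetical, and the heuristic covers just byte-order marks and the NUL-byte pattern of mostly-ASCII UTF-16 text. The names it returns (UTF-8, UTF-16, UTF-16LE, UTF-16BE) also exist as Java charset names, but the value a given connector needs should be confirmed against the actual file.

```python
import codecs


def guess_encoding(head: bytes) -> str:
    """Guess a text encoding from the first bytes of a log file (heuristic)."""
    # A byte-order mark (BOM) is the most reliable signal.
    if head.startswith(codecs.BOM_UTF8):
        return "UTF-8"
    if head.startswith(codecs.BOM_UTF16_LE) or head.startswith(codecs.BOM_UTF16_BE):
        return "UTF-16"  # the BOM itself tells a decoder the byte order

    # No BOM: mostly-ASCII UTF-16 text has a NUL in every other byte.
    sample = head[:64]
    if len(sample) >= 4:
        even_nuls = sample[0::2].count(0)  # NULs at byte offsets 0, 2, 4, ...
        odd_nuls = sample[1::2].count(0)   # NULs at byte offsets 1, 3, 5, ...
        half = len(sample) // 2
        if odd_nuls >= 0.8 * half and even_nuls == 0:
            return "UTF-16LE"
        if even_nuls >= 0.8 * half and odd_nuls == 0:
            return "UTF-16BE"

    # Nothing suspicious found: fall back to the connector default.
    return "UTF-8"


# Example use on a real file (sample.log is the file name used in the post):
#     with open("sample.log", "rb") as handle:
#         print(guess_encoding(handle.read(64)))
```

The sketch deliberately ignores other encodings such as UTF-32, so an unexpected result should be cross-checked with file on a Linux host.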

8 Replies
Honored Contributor.

I would run this through the ArcSight regex utility and see whether it reports any issues.

Micro Focus Expert

In which character set does your application write its log files? This looks like UTF-16.

--
Norbert
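
The spaced-out text in the agent.log warnings is consistent with that diagnosis. A minimal Python sketch shows the likely mechanism, under two assumptions: the file is UTF-16 little-endian (the thread only establishes UTF-16, not the byte order), and the event is a shortened, made-up sample with a correspondingly shortened regex. Reading UTF-16 bytes as UTF-8 leaves a NUL character (often rendered as a space) after every ASCII character, which is enough to defeat a regex written for plain text.

```python
import re

# Hypothetical, shortened event; the real events carry more attributes.
event = '<PROCESS_CREATED pid="6136" user_name="NT Authority\\System" />'

# Assumption: the application writes UTF-16 little-endian.
raw_bytes = event.encode("utf-16-le")

# Decoding as UTF-8 (the connector default) keeps every byte, so each ASCII
# character is followed by the NUL (\x00) that was its high byte.
misread = raw_bytes.decode("utf-8")

# Decoding with the matching charset restores the original text.
decoded = raw_bytes.decode("utf-16-le")

# Shortened stand-in for the connector regex.
pattern = re.compile(r'<([A-Z_]+)\s+pid="([0-9]{3,4})"\s+user_name="(.+)"\s+/>')

print(repr(misread[:12]))             # shows the interleaved NUL characters
print(pattern.search(misread))        # no match on the misread text
print(pattern.search(decoded).groups())
```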
Respected Contributor.

Hi Klasen,

The log file was just sent to me, and I haven't had a chance to ask. Does this affect how I should develop the FlexConnector?

Thanks

Respected Contributor.

Hi Klasen,

What character set is recommended for the log file?

Thanks

Respected Contributor.

Hi Klasen,

I ran the file command on the log file and verified that it is UTF-16.

I added the agents[x].encoding line to agent.properties and it worked.

I will now try to apply it in production.

 

Thanks for the help.

Knowledge Partner

Can you share the regex in the properties file directly?

Respected Contributor.

regex=<([A-Z_]+)\\s+file_name="([a-zA-Z0-9\\:\\\\\.\\-\\_]+)"\\spid="([0-9]{3,4})"\\sprocess_name="([a-zA-Z0-9\\:\\\\\.\\-\\_]+)"\\sppid="([0-9]{3,4})"\\sparent_process_name="([a-zA-Z0-9\\:\\\\\.\\-\\_]+)"\\s(?:cksum="([a-zA-Z0-9]+)"\\s)?event_time="([0-9]+)"\\sevent_time_utc="([A-Za-z]{3}\\s[0-9]{2}\\s[0-9]{4}\\:[0-9]{2}\\:[0-9]{2}\\:[0-9]{2})"\\s(?:file_type="([a-z0-9\\-]+)"\\s)?(?:is_system_file="(true|false)"\\s)?user_name="(.+)"\\s\\/>

token.count=12

token[0].name=tok1
token[0].type=String
token[1].name=tok2
token[1].type=String
token[2].name=tok3
token[2].type=Integer
token[3].name=tok4
token[3].type=String
token[4].name=tok5
token[4].type=Integer
token[5].name=tok6
token[5].type=String
token[6].name=tok7
token[6].type=String
# event_time is epoch milliseconds, which does not fit in a 32-bit Integer
token[7].name=tok8
token[7].type=Long
token[8].name=tok9
token[8].type=TimeStamp
token[8].format=MMM dd yyyy:HH:mm:ss
token[9].name=tok10
token[9].type=String
token[10].name=tok11
token[10].type=String
token[11].name=tok12
token[11].type=String

event.name=tok1
event.fileName=tok2
event.deviceCustomNumber1=tok3
event.deviceCustomString1=tok4
event.deviceCustomNumber2=tok5
event.deviceCustomString2=tok6
event.deviceCustomString3=tok7
event.deviceCustomNumber3=tok8
event.endTime=tok9
event.fileType=tok10
event.deviceCustomString4=tok11
event.sourceUserName=tok12

event.deviceCustomString1Label=__stringConstant("Process Name")
event.deviceCustomString2Label=__stringConstant("Parent Process Name")
event.deviceCustomString3Label=__stringConstant("cksum")
event.deviceCustomString4Label=__stringConstant("Is System File")
event.deviceCustomNumber1Label=__stringConstant("PID")
event.deviceCustomNumber2Label=__stringConstant("PPID")
event.deviceCustomNumber3Label=__stringConstant("event time")

event.deviceVendor=__stringConstant("Automated Teller Machine")
event.deviceProduct=__stringConstant("ATM Applications")
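
To sanity-check a parser file like this outside the connector, the regex can be exercised against a correctly decoded event. The sketch below is a transcription into Python's re module, not the connector's Java regex engine, with the properties-file escaping removed. The event is the first one from the agent.log warnings, reconstructed here without the padding between characters, so its exact spacing is an assumption. It confirms that the pattern yields 12 capture groups, in line with token.count=12, and that the optional file_type and is_system_file groups are empty for this event.

```python
import re

# Character class for Windows paths, as in the posted regex (no spaces allowed).
PATH = r'([a-zA-Z0-9:\\.\-_]+)'

EVENT_RE = re.compile("".join([
    r'<([A-Z_]+)\s+file_name="', PATH, r'"',
    r'\spid="([0-9]{3,4})"',
    r'\sprocess_name="', PATH, r'"',
    r'\sppid="([0-9]{3,4})"',
    r'\sparent_process_name="', PATH, r'"\s',
    r'(?:cksum="([a-zA-Z0-9]+)"\s)?',
    r'event_time="([0-9]+)"\s',
    r'event_time_utc="([A-Za-z]{3}\s[0-9]{2}\s[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2})"\s',
    r'(?:file_type="([a-z0-9\-]+)"\s)?',
    r'(?:is_system_file="(true|false)"\s)?',
    r'user_name="(.+)"\s/>',
]))

# Reconstructed event (assumption: single spaces between attributes).
EVENT = (r'<PROCESS_CREATED file_name="C:\Windows\System32\taskhost.exe" pid="6136" '
         r'process_name="C:\Windows\System32\services.exe" ppid="584" '
         r'parent_process_name="C:\Windows\System32\wininit.exe" '
         r'cksum="002d079abf217b3fdc3f0c86af93a2c51ab45227" '
         r'event_time="1553318023716" event_time_utc="Mar 23 2019:05:13:43" '
         r'user_name="NT Authority\System" />')

match = EVENT_RE.search(EVENT)
tokens = match.groups()

# Print each capture group under the token name used in the parser file.
for index, value in enumerate(tokens, start=1):
    print(f"tok{index} = {value!r}")

# event_time is epoch milliseconds and is larger than a signed 32-bit integer.
print(int(tokens[7]) > 2**31 - 1)
```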