How to use Artificial Intelligence to Prevent Insider Threats

Thank you to everyone who attended the webinar, “Insider Attacks – How to Use Artificial Intelligence to Prevent Insider Threats.”

We had a great conversation covering some of the findings from the latest Insider Threat Report on the impact of insider threats, how different types of AI (such as machine learning, neural networks, and deep learning) are better at detecting different types of cybersecurity threats, and how companies of all kinds, from defense, energy, large telecom, video game companies, and advertising agencies to critical infrastructure, all benefit.

AI and Insider Threats

We discussed how AI impacts existing security tools such as SIEM, DLP, and IAM, and how pertinent data from these and many other security tools is often obscured by the volume of data emitted by a plethora of fragmented systems.

We also discussed how different types of cybersecurity threats eventually end up being insider threats. Classical insider threat detection (covering account compromise, account misuse, infected hosts, internal reconnaissance, insider fraud, lateral movement, and data staging or exfiltration) therefore also mitigates the effects of phishing attacks, malware, ransomware, and unauthorized users.

From there, Stephan Jou, CTO of Interset, explained the history of AI and how the technology has evolved since the 1980s, when the Pentagon tried using AI to detect tanks, with sub-optimal results: the “machine” did not learn the shape of the tank itself, but the differences in lighting between pictures of tanks and pictures of nature.

Today, with a significant increase in computing power, data, algorithms, and increased investment, we have the ability to leverage AI for insider threat detection.

Through the scalable measurement of “unique normal” behavior across thousands of entities (users, machines, printers, servers, file shares, etc.) across an enterprise, AI helps detect anomalies that are, at best, difficult to find and, at worst, never found at all. Stephan explained how unsupervised machine learning leverages advanced mathematical models to create an aggregate risk score for each individual entity within an organization, and how the Interset threat detection platform then generates an aggregate enterprise-wide risk score.
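
As a rough illustration of that idea, the sketch below builds a per-entity baseline and scores each entity against its own history using simple z-scores. This is a minimal, generic example, not Interset’s actual models:

```python
from collections import defaultdict
from statistics import mean, pstdev

def entity_risk(events):
    """Score each entity against its own baseline ("unique normal").

    events: (entity, value) observations, e.g. bytes moved per hour.
    Returns, per entity, the largest z-score of its observations
    against that entity's own mean and standard deviation.
    """
    history = defaultdict(list)
    for entity, value in events:
        history[entity].append(value)

    risk = {}
    for entity, values in history.items():
        mu = mean(values)
        sigma = pstdev(values) or 1.0  # avoid division by zero
        risk[entity] = max(abs(v - mu) / sigma for v in values)
    return risk

# An enterprise-wide score could then be an aggregate of the
# per-entity scores, e.g. max(risk.values()).
```

The key property is that each entity is compared only with itself, so the same absolute activity level can be normal for one entity and anomalous for another.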

We walked through some examples of anomaly detection, such as email anomaly detection, expense reporting anomaly detection, and bot anomaly detection, and concluded with a few questions from the audience. We received more questions than could be answered during the live webinar, so please refer to the 17 questions and answers below.

If you missed it, the webinar recording is available here, and the slides are available here.

Thank you again to everyone for joining us! For more information, please do not hesitate to reach out to us.

AI & Insider Threats Webinar Q&A

Q1: How can you prevent false negatives?

A1: False negatives in threat detection are best prevented by removing reliance on rules and thresholds. Rules and thresholds are binary, creating rigid boundary conditions for when a threat does or does not exist, applied in the same way to every entity (users, machines, printers, servers, websites, etc.). This “one-size-fits-all” approach assumes all users are the same, all machines are the same, all printers are the same, and so on.

“Unique normal” is the measurement of individually measured norms for a specific entity and is created through observation and continuous measurement. This results in a tailored baseline of normal to which “abnormal” can be compared in an accurate way. To scalably measure “unique normal” and relevant anomalies across the many entities in an enterprise, a big data-based architecture with unsupervised machine learning is necessary. Read more about false positives here.
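
To illustrate why a fixed threshold produces false negatives, compare a global rule with a per-entity baseline. This is a toy sketch with made-up numbers, not Interset’s implementation:

```python
def global_rule(value, threshold=1000):
    """One-size-fits-all: flag only values above a fixed threshold."""
    return value > threshold

def per_entity_rule(value, baseline_mean, baseline_std, z=3.0):
    """Flag values more than z standard deviations above this
    entity's individually measured baseline."""
    return value > baseline_mean + z * baseline_std
```

A 500 MB transfer slips past a fixed 1000 MB rule (a false negative), yet it is far outside the baseline of an entity that normally moves about 10 MB, so the per-entity rule catches it.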

Q2: Does supervised learning have to take place prior to unsupervised learning, or doesn’t it matter?

A2: There is no requirement to apply supervised learning prior to unsupervised machine learning.

The output our models provide is a measure of how anomalous an event is: essentially a score. We can then aggregate all scores for each entity (e.g., a user) during a time window to produce an assessment of how abnormal a set of anomalies is. Such scores can come from any type of model. We chose unsupervised machine learning models because they are better suited to the sparser data sets that exist for the types of threats we detect: account compromise, account misuse, infected hosts, internal reconnaissance, insider fraud, lateral movement, and data staging/exfiltration.

It’s worth noting that supervised learning models tend to produce more accurate scores when good training data (i.e., known anomalous data) is available. However, such training data is very hard to find, and supervised models tend to detect only events similar to those they were trained on.

Q3: How often do you retrain? What triggers the process to retrain?

A3: Training is done continuously as new data comes in, by updating our baseline models, typically at an hourly resolution, which is the granularity at which we aggregate behavioral data.
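
One way continuous updating can work is an exponentially weighted moving mean and variance, updated as each hourly aggregate arrives, with no batch retraining step. The specific update rule below is illustrative, not Interset’s actual model:

```python
class RollingBaseline:
    """Continuously updated baseline: exponentially weighted moving
    average and variance, updated one observation at a time."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha  # weight given to the newest observation
        self.mean = None
        self.var = 0.0

    def update(self, x):
        """Fold one new hourly aggregate into the baseline."""
        if self.mean is None:
            self.mean = x
            return
        delta = x - self.mean
        self.mean += self.alpha * delta
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)

    def zscore(self, x):
        """How many (estimated) standard deviations x is from normal."""
        sigma = self.var ** 0.5 or 1.0
        return abs(x - self.mean) / sigma
```

Because every update folds the newest hour into the baseline, there is no separate “retrain trigger”: training and scoring happen on the same stream.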

Q4: How can you make AI work better in a small-data environment?

A4: The techniques described in the webinar have worked well in environments with just a single data set and a small population. What matters more to the models is the amount of time covered by the dataset, rather than the population size. In our experience, after about 30 days’ worth of data in even a single data set, we catch bad guys. That’s why some of our customers are SMEs with fewer than 100 employees.

Q5: Does Atos provide your solution as a service? If yes, hosting only, or also consulting?  

A5: Yes, Atos offers the Interset platform via their MSSP business as a service, hosted in the cloud or as part of their consulting practice. You can find more info here.

Q6: How easy is your solution to integrate with ServiceNow?  

A6: The Interset platform includes a built-in capability called “Workflow” that allows analysts to quickly and easily send data from Interset to downstream systems for further action to be taken. In most cases, companies will use the Interset REST API to send data to downstream systems like ServiceNow, but there are also a variety of other ways to push data downstream.
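
As a sketch of that REST-based hand-off, the snippet below maps a hypothetical alert to a ServiceNow incident and posts it via ServiceNow’s standard Table API. The alert field names (`entity`, `risk`) and the credentials are illustrative assumptions, not Interset’s documented schema:

```python
import base64
import json
import urllib.request

def build_incident(alert):
    """Map an alert to a ServiceNow incident payload.
    short_description/description are standard incident fields;
    the alert keys are hypothetical."""
    return {
        "short_description": f"Anomalous behavior: {alert['entity']}",
        "description": f"Entity risk score: {alert['risk']}",
    }

def send_to_servicenow(instance, user, password, alert):
    """POST the incident to ServiceNow's Table API using basic auth."""
    auth = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(
        f"https://{instance}/api/now/table/incident",
        data=json.dumps(build_incident(alert)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {auth}",
        },
        method="POST",
    )
    return urllib.request.urlopen(req)  # network call; returns the HTTP response
```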

Q7: How do you accomplish highly accurate anomaly detection in the beginning? For instance, when there are very low examples of unique individual behaviors as the system is brand new?

A7: Models should have a notion of goodness of fit, or support, and using that, a model should “stay quiet” until it has enough evidence to “speak up.”
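
That gating idea can be sketched in a few lines; the minimum-support threshold here is made up for illustration:

```python
from statistics import mean, pstdev

def gated_score(baseline_values, new_value, min_support=20):
    """Return None ("stay quiet") until the baseline has at least
    min_support observations; then return a z-score ("speak up")."""
    if len(baseline_values) < min_support:
        return None
    mu = mean(baseline_values)
    sigma = pstdev(baseline_values) or 1.0  # avoid division by zero
    return abs(new_value - mu) / sigma
```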

Q8: In the interim, do you simply leave the system unprotected? Or do you use the whole-system approach temporarily, despite the potential for high false negatives/positives, until the unique models can speak?

A8: Interset has a “historic mode” in which a customer reads in 30 days of historical data. That way, you have a great set of converged models on day one of deployment.

Q9: How does machine learning and A.I. reduce false positives?

A9: A human would reduce false positives by not relying on just a single clue. Instead, she would rely on multiple clues and correlate them in her head. Our mathematical framework functions similarly: we use AI, machine learning, and algorithms to avoid relying on a single clue, instead using multiple clues and connecting them together automatically. In our UI, we call this the entity risk score, which we find is a very straightforward, simple concept that allows you to filter out the noise. What you end up with is a very high-quality set of leads with few false positives.
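
One simple way to connect multiple clues into a single entity risk score is a noisy-OR combination. This is a generic sketch of the idea, not Interset’s specific math:

```python
def combined_risk(clue_scores):
    """Noisy-OR combination of independent clue scores in [0, 1]:
    risk = 1 - product(1 - s). A single weak clue keeps the risk
    low, while several corroborating clues push it up, so no alert
    fires on one noisy signal alone."""
    remaining = 1.0
    for s in clue_scores:
        remaining *= 1.0 - s
    return 1.0 - remaining
```

For example, one 0.3-strength clue yields a risk of 0.3, but three such corroborating clues yield roughly 0.66.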

Q10: If employees know that a solution is detecting anomalous behavior, wouldn’t it be easy to mask their activities and make them seem normal? For example, transferring small amounts of data over a period of time, etc.

A10: Yes, this is possible. It is called a low-and-slow attack. This is why AI scalably detects abnormalities that deviate from individual “unique normal” baselines or from dynamically measured “similar-in-behavior” peer groups, not just administratively pre-designed peer groups. Please see slides 25 and 26 for examples of individual and peer-group measurement of anomalies.

Q11: If an account has already been compromised before you receive the data set, how will your solution identify the compromise?

A11: There’s a bad guy hidden somewhere in the dataset. He has been bad from day one, so how do you find him based on his behavior? The key is not to rely only on behaviors compared with the past: if a bad guy has always been bad, you won’t see historical differences. However, you can create models that compare the bad guy to other users in the environment. This peer comparison is something we do well at Interset.
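
A minimal sketch of peer comparison, using a generic z-score against the peer group rather than Interset’s actual models:

```python
from statistics import mean, pstdev

def peer_outliers(activity, z=2.0):
    """Flag users whose activity deviates from the peer-group mean
    by more than z standard deviations, regardless of their own
    history. activity maps each user to a behavioral metric."""
    values = list(activity.values())
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # avoid division by zero
    return [u for u, v in activity.items() if abs(v - mu) / sigma > z]
```

A user who has "always been bad" shows no historical change, but still stands out against peers doing the same job.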

Q12: Is AI/ML suitable for detecting attacks?

A12: Yes, AI and ML are a strong technology option for detecting cyber attacks.

Q13: Please define and clarify “normal” in this case of Artificial Intelligence.

A13: AI models can be designed to look for and measure specific behavioral patterns that are typical or “normal” for every user, machine, or IP address in a dataset. “Normal” in that context is the most observed behavioral pattern in the data set. This allows our AI models to learn normal behaviors like “typical working hours,” “normal programs used,” “typical ports and protocols combinations,” “typical email recipients,” etc., for each and every individual in a population, using statistical machine learning.

Q14: AI by itself does not reduce false positives; rather, the false positive rate is a measure used to assess the quality of different ML/AI algorithms. So, other than data aggregation and assembling different ML models, what is unique about your product that reduces the FAR (i.e., false positive rate)?

A14: You are correct. It is not the technology itself that determines the outcome. The unique capability of Interset’s ML/AI algorithms is that they are used to measure anomalies against a vast number of individually measured “unique normal” baselines.

“Unique Normal” is the measurement of individually measured norms for a specific entity and is created through observation and continuous measurement. This results in a tailored baseline of normal to which “abnormal” can be compared in an accurate way.

Then, using statistical modeling, we score the information for each entity (e.g., a user) during a time window to produce an assessment of how “abnormal” a set of anomalies is. Read more about false positives here.

Q15: What have you done to actually understand human behavior and build it into your algorithms? What psychological or other models go into this? Put another way: what behavioral-science input informs your approach?

A15: We derive many of our insider threat behavioral models from the work of Carnegie Mellon’s CERT Insider Threat research, as well as Upton and Creese’s UK-based work out of Oxford, the University of Leicester, and Cardiff University.

Q16: I have never heard of AI-based APT detection being associated with signature-based APT detection; can you please elaborate?

A16: Advanced Persistent Threat (APT) campaigns have historically been hunted using a static set of rules. As described in the webinar, rules have limitations, and unsupervised online machine learning overcomes these issues.

An APT can last months and move slowly across multiple entities while trying to compromise a key account; it is by nature hard to detect. Because the analytics looks for changes in behavior in the context of the entities interacting with the account, there is a greater chance the APT will be detected. Even if the attacker changes their tools, tactics, or techniques during the attack, security analytics will still detect it.

Signature-based rules, by contrast, would need to be updated for each attack. This is a backward-facing problem, because you need to have found the APT already. APTs are also focused, generally unique to the account being compromised, and rarely used the same way again, which is what makes them hard to detect with signatures. Since the underlying behaviors required to carry out the attack do not change, security analytics has an advantage over signature-based techniques.

Q17: How does your solution assist with the identification of zero-day exploits?

A17: Zero-day exploits are famously difficult to detect. This has a lot to do with the historical method of hardcoding signatures: a signature may be created the first time someone notices a particular piece of malware, and is then hardcoded into a blacklist and looked for in the future. Zero-day exploits are by definition brand new, meaning we have never seen that exact binary before. What is consistent across zero-day exploits, however, is their behavior.

Security experts can identify zero-day threats because they behave differently in measurable ways. There is a predictability to the behavior of binaries: a given binary should never reach out to the internet this way, connect to a particular database this way, or access this part of a database this way. Such new behavior is an anomaly. In this way, Interset can capture zero-day threats, because our solution detects and quantifies any change in behavior.
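
The idea of flagging behavioral deviations instead of binary signatures can be sketched as follows; the binary name and action labels here are hypothetical:

```python
# Observed-behavior baseline per binary: which resources it has
# legitimately been seen using. A zero-day binary is new, but its
# behavior (an unusual connection or database access) still
# deviates from what is normal on that host.
NORMAL_BEHAVIOR = {
    "reportgen.exe": {"db:sales_readonly", "net:intranet"},
}

def behavioral_anomalies(binary, observed_actions):
    """Return the actions never seen before for this binary."""
    baseline = NORMAL_BEHAVIOR.get(binary, set())
    return sorted(set(observed_actions) - baseline)
```

No signature for the binary is needed: the alert fires on the never-before-seen action, not on the file hash.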