Sally is Going to Do What!? Predictive Analytics Explored
By: Brian Contos, VP & Chief Security Strategist; Igor Baikalov, Chief Scientist; Tanuj Gulati, CTO; and Nitin Agale, VP Sales Engineering, Securonix
Losing the Battle
If organizations aren’t losing the security war, it’s at least fair to say that they’ve lost plenty of battles. In 2015, cyber attacks resulted in an estimated 480 million leaked records and, by some accounts, over $400 billion in financial losses. These statistics are derived from attacks that were discovered. We, obviously, lack statistics on attacks that remain undetected or unreported. Across threat actors like nation-states, cyber criminals, hacktivists and insiders, perpetrating sophisticated or simple attacks, a critical weapon in the cyber security fight has been missing: predictive analytics. We’ll get to more on that in a moment.
We often divide our security posture into three categories: incident prevention, detection and response. The problem is that even when we’re doing well in all three of these categories, we’re still being outmaneuvered by the bad guys. While prevention, detection and response are essential components of a security program, they don’t completely address today’s threat landscape.
Traditional information security prevention measures were built for an outdated scenario in which valuable assets were held on the inside, bad guys were on the outside and we stuck a wall in the middle. The wall was constructed with cyber preventative controls such as anti-malware, firewalls, IPS and the like. That’s castle building 101. But the scenario has changed. In today’s cyber war, walls can’t stop everything.
Consider the Maginot Line, a mega-structure built by the French in the 1930s to defend against the combat tactics of World War I. It was a concrete fortification that was virtually impervious to aerial bombings and tank fire. But by World War II, Germany was using mobile armor and nimble offensive tactics. When the German Army invaded France in the Battle of France, Germany simply went around the Maginot Line and invaded through Holland and Belgium. Was the Maginot Line an awesome piece of engineering with massive preventative capabilities? Yes. Did it work? No. Like so many preventive controls, it failed to keep up with changing attack vectors.
Detection and Response
Where organizations fail to prevent, they detect and respond. This has led to solutions like log management, SIEM and a new generation of incident response tools and services. Incident detection and response solutions are cornerstones of a solid security posture. They are absolutely critical to security because they successfully fill in the gaps left by preventative controls in the physical and cyber world.
Think about a bank. It would be much safer if the bank never unlocked its front doors, thus preventing thieves (and customers) from entering. Then again, that’s probably not a very successful business model. Banks understand that prevention tools are limited, which is why they implement detection and response capabilities such as security cameras, security guards, buttons that call the police and radio-controlled incendiary devices called dye packs. But what if part of the bank’s security posture wasn’t solely dependent on prevention, detection and response? What if the bank could predict who would rob their bank, when and how, before it happened?
Forecasting malicious behavior before it happens can sound a bit like the Minority Report, a movie based on a future where the police arrest criminals before crimes are committed. However, with the right combination of data feeds and analytical techniques, that is exactly what a security analytics platform can provide.
A security analytics platform needs to be able to collect and mine data from multiple sources:
- Log aggregators and flows such as syslog servers, log managers, SIEMs and NetFlow
- Security products like DLP, endpoint monitoring tools, web and email gateways
- On-premise and cloud applications, databases and perimeter devices
- Non-traditional sources including HR systems, identity management solutions, employee performance review applications, physical security devices, remote access systems, phone records, travel records and audio/video feeds
- Big data lakes for large scale and longer term analytics
Exploiting log aggregation and traditional data sources has become fairly common practice in most mature organizations. With the right math, a few solutions can expand on these datasets to support the multitude of linkages needed for predictive analytics. The math that drives predictive security analytics is machine learning.
While the science of machine learning is beyond the scope of this paper, we’d be remiss if we didn’t include a few essential components.
Analytical methods from disciplines outside of computer science, such as the life sciences and military science, are very useful for finding patterns of life that can be borrowed and applied to cyber security. These are patterns of behavior that are mapped by the habits of any entity such as a person, an application or a physical device. Once a baseline of normal behavior is established, an entity’s behavior can be compared to that of its peers, making behavior outliers easy to identify. Predictive analytics for cyber security are focused on detection of anomalies – deviations from normal behavior – that indicate a threat.
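To make the peer comparison concrete, here is a minimal sketch in Python. It assumes a single numeric behavior (megabytes downloaded per day), a hand-picked peer group and a z-score threshold of 3.0. All names and numbers are hypothetical, and a real platform would use far richer baselines than a mean and standard deviation.

```python
import math
from statistics import mean, stdev

def peer_zscore(value, peer_values):
    """Return how many standard deviations `value` sits from the peer mean."""
    mu = mean(peer_values)
    sigma = stdev(peer_values)  # sample standard deviation; needs 2+ peers
    if sigma == 0:
        # Peers are identical: any difference at all is infinitely unusual.
        return 0.0 if value == mu else math.copysign(math.inf, value - mu)
    return (value - mu) / sigma

# Hypothetical data: megabytes downloaded today by members of one peer group.
peer_downloads_mb = {"alice": 120, "bob": 95, "carol": 110, "dave": 130, "erin": 105}
sally_download_mb = 4200

z = peer_zscore(sally_download_mb, list(peer_downloads_mb.values()))
is_outlier = abs(z) > 3.0  # the threshold is an assumption, not a standard

print(f"z-score: {z:.1f}, outlier: {is_outlier}")
```

The peer group is kept separate from the entity being scored so that an extreme value cannot inflate its own baseline.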
Here are just a few examples of this type of analysis:
- K-Means clustering: An algorithm that partitions observations into clusters where each observation belongs to the cluster with the nearest mean
- Hierarchical clustering analysis (HCA): A method of building cluster hierarchies
- Markov chain: The modeling of randomly changing systems where it is assumed that future states depend only on the present state and not on the sequence of events that preceded it
- Generalized linear model (GLM): A flexible generalization of ordinary linear regression that allows for response variables that have error distribution models other than a normal distribution
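As an illustration of just one of these methods, the sketch below fits a first-order Markov chain to sequences of user actions and then finds the least likely step in a new sequence. The event names, the sample history and the helper functions are hypothetical. It is a simplified illustration in Python, not a description of any product's implementation.

```python
from collections import Counter, defaultdict

def fit_transitions(sequences):
    """Estimate first-order transition probabilities P(next | current)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }

def rarest_transition(model, seq):
    """Return (probability, current, next) for the least likely step in `seq`.

    Steps never seen in training get probability 0.0.
    `seq` must contain at least two events.
    """
    steps = [
        (model.get(cur, {}).get(nxt, 0.0), cur, nxt)
        for cur, nxt in zip(seq, seq[1:])
    ]
    return min(steps)

# Hypothetical history of one user's sessions.
history = [
    ["login", "email", "crm", "logout"],
    ["login", "crm", "email", "logout"],
    ["login", "email", "crm", "email", "logout"],
]
model = fit_transitions(history)

observed = ["login", "crm", "usb_copy", "logout"]
prob, cur, nxt = rarest_transition(model, observed)
print(f"Least likely step: {cur} -> {nxt} (p = {prob:.2f})")
```

Because each step is scored only against the state before it, the model stays small, but it will miss patterns that span several steps.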
Simplified, predictive analytics extract patterns from existing data and match them to a model in order to forecast the probability of unknown events that may or may not have happened yet. Behavioral analytics can be used to detect unknown threats based on abnormal behavior alone, or they can be combined with predictive models to estimate the probability of a known attack scenario. So how do we operationalize all this science?
Predictions are only as good as the models that drive them. Predictive models that depend on signatures, set patterns and rules aren’t particularly good at accurate threat prediction because they generate excessive “noise” in the form of false positives. Advanced security analytics use a combination of entity and peer analytics to all but eliminate false positives.
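One way to see why combining entity and peer analytics cuts down on noise is the Python sketch below. It raises a flag only when a value is unusual both for the entity's own history and for its peer group. The data, the threshold and the function names are assumptions made for illustration.

```python
from statistics import mean, stdev

def zscore(value, baseline):
    """Standard deviations above the baseline mean.

    A constant baseline has no spread to measure against, so this sketch
    returns 0.0 for it; a real system would handle that case more carefully.
    """
    sigma = stdev(baseline)
    return 0.0 if sigma == 0 else (value - mean(baseline)) / sigma

def is_anomalous(today, own_history, peers_today, threshold=3.0):
    """Flag only when the value is high for the entity AND for its peers."""
    return (zscore(today, own_history) > threshold
            and zscore(today, peers_today) > threshold)

# Hypothetical after-hours logins per week for one employee.
own_history = [0, 1, 0, 1, 0, 1]
this_week = 6

# Release week: the whole team works late, so the spike is not flagged.
release_week_peers = [5, 6, 7, 6, 5]
flag_release = is_anomalous(this_week, own_history, release_week_peers)

# Ordinary week: only this employee works late, so the spike is flagged.
ordinary_week_peers = [0, 1, 0, 1, 1]
flag_ordinary = is_anomalous(this_week, own_history, ordinary_week_peers)

print(flag_release, flag_ordinary)
```

A rule that looked only at the employee's own history would have fired in both weeks, and the first alert would have been a false positive.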
Predictive models work best when they are based on behavioral algorithms featuring adaptive profiles across disparate domains. Because there is only so much context that can be gleaned from SIEMs, firewall logs, IPS alerts and anti-malware solutions, effective analytical security models go far deeper into nontraditional data sources that paint a context-rich picture of the threats to an organization. They ingest HR information, identity and access, instant messaging, and physical security information that can be used to predict (and detect) insider threats, fraud, data theft and cyber attacks.
Meet Sally: Sally is Going to Do What!?
Predictive security analytics detect signs of nefarious behavior before the perpetrator completes an attack. Let’s use “Sally” as an example. Sally is an employee who conducts a series of anomalous behaviors, correlated with other indications of risk that add up to a high likelihood of sensitive data theft.
- Sally demonstrates many behaviors outside the norm for people with a similar job title, within the same department and within the same facility.
- Sally has received two consecutive, negative performance reviews.
- Sally’s vacation time and sick leave have increased over the last three months.
- Sally has recently been browsing job sites.
- Sally has been physically entering the building on weekends and late at night – something she never did until recently and her peers never do.
- Sally has made a number of multi-gigabyte downloads to a USB removable drive.
- Sally accessed an application she rarely accesses from a remote location she has never used.
- Sally has multiple, cloud-based email accounts to which she is sending emails from her corporate account with large, encrypted attachments.
This is where the Minority Report comes in. By capturing independent behavioral threat indicators from multiple sources with a time series, we are able to peg those indicators to a pattern of life. Simply put, we can automatically and accurately predict what Sally will do next.
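A rough sketch of how indicators like Sally's might be accumulated over time follows. It uses a noisy-OR combination, which assumes the indicators are independent, as one simple way to turn individual weights into a running risk score. The dates, sources and weights are invented for illustration and are not calibrated values, and nothing here should be read as the scoring method of any particular product.

```python
from datetime import date

# Hypothetical indicators: (date observed, source, description, weight in [0, 1]).
# Each weight is an illustrative guess at how strongly the indicator suggests
# data theft when taken on its own.
indicators = [
    (date(2016, 1, 11), "hr", "second negative performance review", 0.10),
    (date(2016, 2, 2), "web_proxy", "job-site browsing", 0.10),
    (date(2016, 2, 20), "badge", "weekend building entry", 0.20),
    (date(2016, 2, 21), "endpoint", "multi-gigabyte USB copy", 0.50),
    (date(2016, 2, 23), "email", "encrypted attachments to personal account", 0.60),
]

def risk_over_time(events):
    """Yield (date, cumulative risk) using a noisy-OR combination of weights."""
    remaining = 1.0  # probability that no indicator so far reflects a real threat
    for when, _source, _description, weight in sorted(events, key=lambda e: e[0]):
        remaining *= 1.0 - weight
        yield when, 1.0 - remaining

timeline = list(risk_over_time(indicators))
for when, risk in timeline:
    print(when.isoformat(), f"{risk:.2f}")
```

No single weak indicator pushes the score very high, but the score climbs quickly once several independent sources agree.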
Some of Sally’s early behaviors were warnings that would prompt further investigation. Others more clearly indicated malicious activity. The more indicators that can be collected the better, but strong analytics will also work with sparse data sets. Note that Sally could just as easily be a device or application rather than an actual person; it’s essential to consider all entity types when performing analytics.
Insider threats, cyber attacks and privileged account abuses are three top-of-mind concerns of security leaders. Most address these concerns with solutions like DLP, SIEM and IAM. These tools are necessary and offer great value, but they cannot accurately detect or predict most attacks. Because they lack analytics, they are limited to signature-based rules that often miss attacks and generate false positives. Predictive security analytics allow organizations to mine more value out of their existing security tools by transforming the data they generate into predictive intelligence. The potential applications of predictive analytics are almost limitless. Use cases range from data exfiltration, snooping and malware to fraud, money laundering and trade surveillance.
Predictive analytics can help organizations achieve:
- Greater ROI on existing controls
- More rapid threat detection and thus remediation
- Less time and fewer resources spent chasing down false positives
- Greater focus on aligning security strategically with the business instead of tactical fire fighting
Not that long ago, we thought large cement walls were the answer, but as General George S. Patton said, “Fixed fortifications are a monument to the stupidity of man.”
An even shorter time ago, we thought firewalls, anti-malware and SIEM were the answer. Ultimately we must adapt as the threat actors have adapted and embrace predictive analytics as a critical new component of cyber security that integrates with and augments prevention, detection and response.
About the Authors
Brian Contos, VP & Chief Security Strategist, Securonix
Brian is a seasoned executive, security company entrepreneur, author and blogger. At Securonix he is responsible for security strategy worldwide.
Igor Baikalov, Chief Scientist, Securonix
Igor has over 25 years of experience in data analysis and enterprise application development in areas ranging from structural biology and bioinformatics to insider threat and risk monitoring. As Chief Scientist at Securonix, Igor leads cyber security research and threat analysis to develop a comprehensive portfolio of adaptive, risk-based behavior models of cyber-attacks, and to further advance Securonix’ signature-less threat detection capabilities through the innovative application of machine learning and anomaly detection techniques.
Tanuj Gulati, CTO, Securonix
Tanuj drives engineering, product management, marketing and delivery services for Securonix. He brings over 12 years of experience in the information security industry with expertise in providing innovative enterprise security solutions. Tanuj participates in governance boards for information security programs at several large organizations where he provides his expertise in implementing identity governance, risk and compliance initiatives.
Nitin Agale, VP Sales Engineering, Securonix
Nitin has over 12 years’ experience serving organizations in information security, risk management, and compliance. He specializes in the domains of Data Protection, Insider Threat, Identity Management, Cyber Threat Management, PCI DSS Compliance and Third Party Risk Management and frequently speaks on these topics.
Edited by Peter Bernstein