The Desperate Need for Accuracy and Efficiency in Security for Detecting Network Intruders and Other Threat Actors Quickly

According to 2015 research reports published by Ponemon, Mandiant, and others, median intruder dwell time in a target network prior to detection ranges from just under to just over 200 days. That is a little over six months and, as everyone agrees, totally unacceptable.

How is it that an intruder can get into a network and remain active for over six months before being discovered? As a quick point of reference, these statistics include both external and internal attackers. Internal attackers are much more difficult to detect by traditional means because they are supposed to be in the environment and they have valid credentials. The same is true of external attackers who compromise an internal user account. Because they can authenticate, applications and systems perceive them, for all intents and purposes, as the actual insider. In a third case, an attacker uses a vulnerability to get into a system. If the vulnerability is already known and patched, the attack is often identified up front and fails because the vulnerability no longer exists on the target. If the vulnerability is known but not patched, the attack may succeed but will most likely be detected at some stage. So the detection problem really centers on zero-day or otherwise unknown attacks and on misuse of identity. In either situation, the threat actor has a high probability of going unnoticed.

The other problem security teams face is the volume of alerts they receive. Many organizations that suffered publicized breaches noted that they had received alerts indicating an attack but never saw them among hundreds or thousands of others. The more entities within a network, the more alerts are generated. Each day, normal users just doing their jobs can generate dozens of security alerts. These alerts can come from web surfing and downloads, operating an out-of-compliance system, mistyped passwords, accessing sensitive applications for business, installing an application, or changing system configurations. Multiply these activities by 25, 100, or 5,000 people in the organization and you begin to see the scale of the problem faced by organizations of every size.

Business applications, servers, and endpoints, as well as tools like firewalls, intrusion detection, URL filtering, and data loss prevention, all generate logs and alerts. Many of these alerts arrive as high-priority alerts. Recent EMA research titled “Achieving High-Fidelity Security” identified the amount of time organizations spend triaging security alerts. Seventy-six percent (76%) of small to medium businesses (SMBs) have a full-time employee equivalent (FTE) triaging alerts, whereas 58% of midmarket organizations have the same and an additional 32% have three FTEs performing this task. The numbers for enterprises are virtually the same as for the midmarket. However, 42% of large enterprises have three FTEs triaging events. This may not seem terrible, but 59% of the organizations that are applying three FTEs receive between 100 and 499 critical/severe alerts per day, and 60% of the organizations applying between three and five FTEs receive between 500 and 999 critical/severe alerts per day. To put this into perspective, 64% of SMBs investigate ten or fewer of their critical/severe alerts per day, and the same is true for 77% of midmarket organizations, 71% of enterprises, and 54% of large enterprises. This means that these organizations face a continually growing backlog of critical/severe alerts from which they will never recover without a significant change in their capabilities.
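To make the backlog arithmetic concrete, here is a minimal sketch in Python. It is an illustration only, not part of the EMA research: the function name is hypothetical, and the inputs assume the low end of the 100 to 499 alert band (100 critical/severe alerts per day) and the upper limit of the "ten or fewer" investigated per day.

```python
def backlog_after(days, alerts_per_day, investigated_per_day):
    """Return the number of uninvestigated alerts after `days` days.

    Hypothetical helper for illustration: assumes a constant daily alert
    volume and a constant daily investigation capacity.
    """
    backlog = 0
    for _ in range(days):
        backlog += alerts_per_day                       # new alerts arrive
        backlog -= min(backlog, investigated_per_day)   # team works what it can
    return backlog


# Assumed inputs: 100 alerts received and 10 investigated per day.
for days in (1, 30, 365):
    print(days, backlog_after(days, alerts_per_day=100, investigated_per_day=10))
# prints:
# 1 90
# 30 2700
# 365 32850
```

Even under these conservative assumptions, the uninvestigated pile grows by 90 alerts every day, which is why adding effort alone does not close the gap.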

To overcome this nearly insurmountable challenge, organizations must improve their alerting efficiency to reduce alert volumes and hone their accuracy through improved context so that the alerts that remain are valuable. The combination not only reduces the deluge of alerts by orders of magnitude, but also improves prioritization so that security has fewer critical alerts and the most impactful are investigated first.

In EMA’s Data Driven Security Reloaded research, participants were asked what factors impacted their confidence in their ability to detect a security event before it caused a significant impact. Fifty percent of respondents indicated that “Too many false positive alerts” was the problem, making it the top answer. An additional 38% identified “Excessive uncorroborated data/lack of context” as their largest issue. Accuracy is gained through context. Most organizations have systems in place that can provide alerts, but they lack the ability to properly analyze, not merely correlate, those alerts. Accuracy through context and analysis drives the ability to identify the most significant issues in the environment so the proper attention and resources are applied quickly. Tools that can identify abnormal behaviors and “connect the dots” in those activities to show why the activity is abnormal and why it is a threat are sorely needed (the former without the latter often drives poor alert prioritization).

Efficiency is the ability to isolate and remediate an issue in a timely manner. Finding a needle in a haystack is difficult, but security is looking for the one steel needle in a stack of aluminum needles, which is impossible to find without the proper tool. Security personnel can work longer and harder to inspect each needle, giving the attacker more dwell time, or someone can give them a magnet so they can find the steel needle in a very short period of time. The proper tooling drives efficiency. Enabling proper workflows, hand-offs, and status updates while collecting investigative data and providing a flexible work interface for forensics are all key aspects that must be addressed. Every tool change, context switch, and hand-off an investigator has to make to complete the investigation gives a threat actor more time to gather intelligence and cover his or her tracks.

Security organizations need a means to understand more about the source of alerts and what initially generates them. Root cause can have a significant effect on prioritization (though a reasonable threat, a malware attack is usually less of an issue than an active attacker). To understand how effective their alerting tools are, security teams need to be able to measure the accuracy and effectiveness of those tools, determine where their gaps are, and drive changes to close those gaps, whether that means removing or replacing ineffective tools to increase accuracy, or adding to the analysis and remediation capabilities to increase efficiency.

LightCyber has introduced a new service to help security programs gauge their alerting accuracy and effectiveness. Take a look at it at http://lightcyber.com/lower-security-alerts-metrics/.

EMA Research Blog: Navigating IT and Security Horizons