SecurityHQ Update • 10 MIN READ
MITRE Engenuity ATT&CK® Evaluations & The Question of How to Measure Quality in a Managed Security Service
by Chris Cheyne • Jun 2024
Chris Cheyne is the SOC Director & Partner Owner at SecurityHQ
The results are out for this year’s MITRE Engenuity evaluations for Managed Security Service Providers. Only eleven teams took part, with SecurityHQ participating for the first time.
As the SOC Director at SecurityHQ, I awaited the results of the evaluation with the same excited anticipation as a parent awaiting their child’s exam results. Our objectives were simple: SecurityHQ wanted to provide a high-quality service to the MITRE Engenuity team, who acted as our simulated customer. This meant:
- 100% Detection
- Low Noise
- High Quality
- On Time
In our maiden voyage into the evaluation, we are happy to report that we achieved 100% Step Detection, 100% Tactic Coverage, and 77% Technique Coverage. I am thrilled with these scores.
It has been interesting watching the marketing arms of each participant organization work to spin their results in the best possible light. This has led to some lively discussions, all of which present a valuable snapshot of the state of the industry and the growing disconnect between what is communicated as important and what Security organizations really need.
A little background on the evaluation before we get into that.
Emulation Synopsis
The attack emulation modeled a multi-subsidiary compromise with overlapping operations, focusing on defense evasion, exploitation of trusted relationships, data encryption, and inhibiting system recovery. The real-world adversaries emulated were menuPass and BlackCat/ALPHV, both taking the form of Advanced Persistent Threats (APTs) designed to dwell in the network post-breach and execute harmful activity over time.
SecurityHQ Results
This was SecurityHQ’s introductory foray into the MITRE evaluations, and we learned a lot. Some of our key results included:
- Detection Events
- Incidents & Updates
Unlike most of our competitors, we achieved our detection success with one of the lowest noise and false-positive ratios in the field.
Low Noise to Signal
Signal-to-noise ratio is a measure that compares the level of a desired Signal to the level of Background Noise. SecurityHQ achieved 33 Detection Events from a total of 200 Notifications, a Noise to Signal Ratio of 1:5, less than half the competitor average.
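As a rough illustration of how that ratio falls out of the raw figures, here is a minimal sketch (assuming the ratio is simply false-positive notifications per true detection; the numbers are those quoted above):

```python
# Back-of-the-envelope noise-to-signal calculation using the figures above.
# The ratio is expressed as 1:N, i.e. one true detection to N noisy notifications.

notifications = 200   # total notifications raised during the evaluation
detections = 33       # true detection events (the "signal")

noise = notifications - detections        # 167 false-positive notifications
noise_per_detection = noise / detections  # ~5.06

print(f"Noise to Signal Ratio ≈ 1:{round(noise_per_detection)}")  # -> 1:5
```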
Takeaways from the Results & Subsequent Discussions
Coming back to the genesis of this article: what are security leaders being told they need, versus what do they actually need? The exercise highlighted two questions:
- How do you measure quality in a Managed Security Service?
- Has the industry become dependent on Endpoint Detection & Response (EDR) tooling?
How to Measure Quality in a Managed Security Service?
When we crossed the finish line, I cast a sideways look at the competition to see how we fared and was pleased to see that we were in the race, competing well with some of the big guns in the industry. Immediately and inevitably the competitor marketing machines went into overdrive, claiming winning positions.
Most frightening were the assertions from many that they were the “winners.” “We are the fastest.” “We have the best detection.” But little attention seems to have been paid to the cost of those results and the impact the process would have on a real-world Security Team.
All projects have goals, and it is no different when you employ a Managed Security Service Provider. We can take a cue from the decades-old Iron Triangle of planning, which balances the variables of Time + Cost + Scope to measure Quality. The Iron Triangle is “Iron” because of the interdependencies, or tradeoffs, between those variables.
Constraints of the Iron Triangle
A high-quality service is one that achieves the Mission Statement with a balance of Time-Scope-Cost.
Mission Statement: Detect Threat Events and curate Incidents with clear Threat qualification and prioritization of response actions. Optimize the customer’s efforts and help them respond to Incidents quickly and efficiently.
The Real Cost of Ownership
Cost can be measured in two fundamental ways:
1) The amount of money an organization spends on a given service.
2) The cost in terms of people-hours required by the customer to consume and manage the service.
While we know we typically compete very well on #1 and would encourage any interested parties to do their due diligence there, that is outside the scope of this evaluation, so we will focus the conversation here on #2.
Despite Alert Fatigue consistently being cited as a top issue for CISOs and security leaders, we have observed many competitors, specifically the Endpoint Vendors, spamming the customer with alerts in the hope that some of them might stick. This generates a huge amount of noise and forces the customer to spend significant effort surfacing the real threat.
| Metric | Competitor Average | Competitor Max | SecurityHQ (vs Average) |
| --- | --- | --- | --- |
| Alerts | 483 | 1119 | 200 (59% Less) |
| Critical Alerts | 78 | 559 | 7 (91% Less) |
| Noise (False Positives) | 448 | 1081 | 167 (63% Less) |
| Detections | 35 | 42 | 33 (6% Less) |
| Noise to Signal Ratio | 1:13 | 1:36 | 1:5 (62% Better) |
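For clarity on how the final column is derived, here is a minimal sketch of the relative-difference arithmetic (the raw values come from the table above; the rounding convention is an assumption):

```python
# Relative difference versus the competitor average, as shown in the table's last column.

def percent_less(value: float, average: float) -> float:
    """How much lower `value` is than `average`, expressed as a percentage."""
    return (1 - value / average) * 100

print(f"Alerts:          {percent_less(200, 483):.0f}% less")  # ~59% less
print(f"Critical Alerts: {percent_less(7, 78):.0f}% less")     # ~91% less
print(f"False Positives: {percent_less(167, 448):.0f}% less")  # ~63% less
```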
Hold on… someone needs to attend to those alerts!
The Dangers of Severity Fatigue
When we raise the flag for a Critical Incident, we mean the house is on fire and we start waking customers up to man the pumps. SecurityHQ raised only 7 Critical Incidents during the MITRE Evaluation, which aggregated all the notifiable events into a handful of actionable situations.
I was flabbergasted to observe two Vendors raising over 100 incidents… the maximum was 599 Critical Alerts. The danger here is “Severity Fatigue”, where the Customer becomes completely desensitized to Critical or Major problems, so that when the worst event occurs the Customer is unprepared, late to respond, and the impact becomes fatal.
Equally worrying are the vendors that failed to identify even a single Critical incident. Given that the emulation included complete system compromise, including ransomware encryption, it is devastating from a customer perspective that some providers did not categorize this as a Critical Incident.
A Customer should be able to trust that their service provider understands what is important to them, and a shared definition of what constitutes a Critical or Major problem is the basis of that trust. If we spam a customer with 599 Critical Incidents, or worse, fail to recognize a single Critical Incident, then all trust is lost.
The True Cost of Ownership
The partnership between the Customer and Service Provider is important, and a provider who creates unnecessary work is a liability. Incidents need attention, and if we simply assume that each alert demands an average of 10 minutes for a Customer’s Security Engineer or Analyst to interpret, working an 8-hour day, the real cost of ownership is extreme for many MSSPs.
Using the above assumption, the cost of “owning” the service varies extraordinarily, with three vendors costing over 10 days of your valuable Security Engineers’ or Analysts’ time. And that is over just a 5-day simulation period, equivalent to between 2 and 5 full-time resources!
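As a rough illustration, here is a minimal sketch of that back-of-the-envelope calculation (the 10-minutes-per-alert and 8-hour-day figures are the assumptions stated above; the alert counts are taken from the comparison table):

```python
# Rough cost-of-ownership estimate: person-days needed just to triage alerts.
# Assumptions from the article: ~10 minutes per alert, an 8-hour working day,
# and alert volumes generated over a 5-day simulation period.

MINUTES_PER_ALERT = 10
HOURS_PER_DAY = 8
SIMULATION_DAYS = 5

def triage_cost(alerts: int) -> tuple[float, float]:
    """Return (person-days of triage effort, full-time resources needed)."""
    hours = alerts * MINUTES_PER_ALERT / 60
    person_days = hours / HOURS_PER_DAY
    full_time_resources = person_days / SIMULATION_DAYS
    return person_days, full_time_resources

for label, alerts in [("SecurityHQ", 200), ("Competitor average", 483), ("Competitor max", 1119)]:
    days, ftes = triage_cost(alerts)
    print(f"{label:>18}: {alerts:>4} alerts -> {days:4.1f} person-days (~{ftes:.1f} FTEs)")
```

On those assumptions, the competitor average works out to roughly 10 person-days of triage effort and the noisiest vendor to more than 23, compared with roughly 4 for SecurityHQ.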
So how should one view the results in context? To explore where the balance lies between good detection and high-fidelity alerting (low noise), the results have been plotted and classified using a largely qualitative assessment of what constitutes low to high noise.
Congratulations to Bitdefender, who did a great job of both excellent detection and low noise, closely followed by SentinelOne and SecurityHQ.
To see the full set of SecurityHQ results, follow this link.
Has The Industry Become Dependent on EDR Tooling?
The emergence of Endpoint Vendors purporting to offer a Managed Security Service powered only by the Endpoint has become a risky proposition. The Endpoint is your last line of defense, not your first, and a service that only detects host compromise is a Breach Detection service, rather than a true Managed Detection and Response service that consumes all of the security layers.
I read with interest a recent article published on CSO, entitled ‘CISOs may be too reliant on EDR/XDR defenses’. It highlighted research showing that attackers are increasingly evading EDR systems to deliver their attacks, and placed the emphasis on CISOs’ overdependence on EDR tooling.
The basic principles of Information Security remain that a layered approach is best. Take, for example, NIST 800-53 (Security and Privacy Controls for Information Systems and Organizations), which has over 1,000 controls across 20 distinct control ‘families.’ Only a small handful of those relate to endpoint-related threat management.
The same is true for Threat Detection. Security events exist across ALL your security controls, applications, and services. Rather than waiting to detect a host compromise, it is far better to detect and stop intrusions at the boundary points and gateways, even in a zero-trust world.
In Summary
We love the data and learning within the MITRE Engenuity Managed Services evaluation, but we are also cautious about the real-world application of this data. It appears that many providers have tunnel vision, focused on the Endpoint rather than taking the wider view of all security event sources. Coupled with the horrendous noise that many service providers generate, we question how useful any of those services would be within a Corporate SecOps environment. A service that spams 10 false alerts for every 1 true detection is a hindrance, not a help.
We continue to make the case for 100% Visibility, coupled with Advanced Correlation, Machine Learning (ML) Behavioral Analysis/Threat Modelling, and low-volume, high-fidelity incident generation.