Hey folks, below is the transcript for my talk. I'm so excited to finally publish this material.
You might find this interesting if you want to:
- Take a step back and learn a mental model for decision making in day-to-day operations.
- Defend operational decisions with more than just your intuitions.
- Learn how EDR sensors work, and recognize similarities in decision making.
- Take a structured and deliberate approach to circumventing EDR sensors.
- Preview a declarative-based capability for endpoint sensor evasion.
- Learn about the potential of applying optimization techniques to attack paths.
- Part 0: Welcome and Agenda
- Part 1: Introduction to OODA Loops
- Part 2: Inside the Operator Loop
- Part 3: Inside the EDR Sensor Loop
- Part 4: Conclusions and Future Directions
Welcome and Agenda
Hey everyone, welcome to my talk on Operators, EDR Sensors and OODA Loops, understanding and exploiting defensive loops to win operations.
It’s an honor to be able to present this, and I wanted to thank you for taking the time to watch this, and a big thank you to the organizers of x33fcon for the work they do behind the scenes.
I started a blog post series last year on how to evade EDRs (or EPPs as some might call them), but didn’t get around to completing it. I’d like to get back to it later when I have time, but if any of you have been following that, then you can consider this talk as another paradigm to think about the same topic without necessarily supplanting the paradigm in that series.
I’m also aware that I certainly have gaps in my understanding on this topic, so I absolutely welcome any critical feedback to challenge my assumptions. I think it’s a great way to learn.
Just so I’m not wasting anyone’s time, I wanted to say that this talk isn’t about some special secret sauce or universal silver bullet that can bypass EDR products. Instead, this talk is about how we make decisions as operators, and how we can use that to our advantage when up against EDR sensors.
On the slide you can see the agenda, I’ll go over OODA loops for those that aren’t familiar with them, as well as my interpretation on how that applies to operators. Then we’ll go over how to leverage the same decision-making process we use as operators to understand how EDR sensors work and how to circumvent them. And finally, I’ll sum it up with some takeaways and actionable things you can consider to evade EDRs.
My name is Jackson, also known on Twitter as @Jackson_T. A bit about myself: I did my bachelor’s degree at the University of Toronto in Canada (which is where I’m from). I majored in Public Policy, and did a double minor in Psychology and Computer Science. My hobbies are in Red Teaming and Vulnerability Research. I used to work for Disney’s Internal Red Team, and prior to that was a consultant at Security Compass focusing on application security and vulnerability research, which is how I got my start in this field.
I bring up the social science background because, although I’m not an expert, I like to understand how people think and how they make decisions. Some might recognize the avatar there as the logo for Kumon. It’s what they call the “thinking face.”
When I was a Red Team Operator, our operations were very scenario based where we had a specific starting point and a specific business goal to reach at the end of the op. One of my favourite parts about each op was this process where, as a team, we reasoned about our current situation and then we decided on what our next course of action should be. And we would do this again and again and again as we inched closer to our goal.
What it came down to was this repeatable or “looped” process where as operators we observed the environment we were up against, then we identified our options, we weighed them, and then selected one that we acted on.
Of course, this situational awareness and tactical decision making isn’t unique to Red Teaming; it can be applied to many areas in life, and is one of the focuses of the cognitive sciences, operations research, military psychology, and more. This eventually led me to John Boyd’s work on what he called the OODA loop.
Introduction to OODA Loops
I’ll give you a moment to look at this, especially for those that aren’t already familiar with it.
OODA stands for Observe, Orient, Decide, and Act and this loop provides a structured way of transforming the information we sense and perceive into action. It’s a way of showing how we make complex decisions based on limited context and evidence of unfolding situations.
To speak a bit about each process: Observe is where you're scanning the environment and gathering information from as many sources as possible. Orient is where you analyze the information you get and synthesize it into meaningful knowledge, so you can understand what options are available to you and what you can do with them. With the Decide process, you consider the available options and then select a subsequent course of action. Finally, Act involves carrying out the selected decision.
As you’re looking at this, and are internalizing it (for those that aren’t already familiar with it), it’s worth noting that this isn’t a single-threaded flow where you focus on these one at a time, in isolation. Instead, these are parallel/interrelated processes that you continuously cycle through.
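If it helps to see the loop as code, here’s a minimal sketch in Python. The function bodies are invented placeholders just to show how the four processes feed one another; a real loop runs these continuously and in parallel, not strictly in sequence.

```python
# Minimal sketch of one pass through an OODA loop.
# The observe/orient/decide/act bodies are hypothetical stand-ins;
# in practice these processes run continuously and in parallel.

def observe(environment):
    # Gather raw information from the environment.
    return {"events": environment.get("events", [])}

def orient(observations, knowledge):
    # Analyze and synthesize observations into candidate options,
    # filtered by what our current knowledge lets us act on.
    return [e for e in observations["events"] if e in knowledge]

def decide(options):
    # Select a course of action from the available options.
    return options[0] if options else None

def act(choice, environment):
    # Carry out the selected decision, changing the environment.
    if choice:
        environment.setdefault("actions", []).append(choice)
    return environment

env = {"events": ["open_share", "edr_alert"]}
choice = decide(orient(observe(env), knowledge={"open_share"}))
env = act(choice, env)
```

The point of the sketch is only the shape of the cycle: each Act changes the environment, which feeds the next Observe.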
In the next few slides, I’ll go over what this loop looks like for Operators (at least the way I’m interpreting it based on my experience), and provide an example after that.
Inside the Operator Loop
Before I begin explaining the Operator’s OODA loop, it’s worth saying that (1) this isn’t the main part of the talk, and (2) this is most likely stuff that you already do and there’s nothing particularly new to learn from. So why am I talking about this? Well, the idea is to take a step back and explain our day-to-day in the context of this model, so that when we go over EDR sensors later with the same model, we can see the parallels more clearly.
Starting with the Observe process, this is where we seek and gather information from our environment, which gets fed into the Orient process. From an operator’s standpoint, I look at it as a combination of what we gather from the target environment, and what we gather outside of it.
We can make observations in the target environment, such as enumerating a target company’s subdomains, or open shares in its network, and it also includes observing the consequences of our actions—whether what we did brought us closer to our objective, or further from it, and whether it raised suspicions from defenders.
Self-growth also plays into this, where we’re gathering information about the latest TTPs from our Twitter feeds (as an example), and—generally speaking—continuously consuming other material that help us learn and grow.
Next we have the Orient process which is where we analyze and synthesize our observations and then produce options as a result. You can call them solutions, or attack paths—I use them somewhat interchangeably in this context. But this is where we ask ourselves, what does the information mean to us, and what can we do about it?
I think this is why tools like BloodHound, awspx, Grouper, etc. provide so much value because they’re good at automating some of the Observe and Orient parts for us.
For example with AD, the SharpHound collector serves as an active observer to gather information about the environment. And when the results of that get fed into BloodHound, we can get a sense of what options or paths are available to us to reach a certain resource.
When I look at the paths available to us from an evade EDR standpoint, there are four categories of options that come to mind:
The first one is Avoidance, where you just avoid directly operating on systems that have the products installed. Many companies have sensor saturation issues, so it’s something to take advantage of where you can. The second one is Abusing Blind Spots, where sometimes the sensor doesn’t collect on or report certain types of events. This could just be a configuration, or it could be a missing implementation in the software.
Third is Blending In, where the malicious activity we’re performing fits in so well that it’s hard for defenders to distinguish between what’s benign and what’s not. And lastly we have Sensor Tampering which I see as reducing the sensor’s capabilities across any of these four processes.
Decide is the next process, and this is where we deliberate on the options we’ve identified and then select a course of action. This is where we consider what is important to us, and figure out what to do that best represents those values.
I usually look at this as a constraint satisfaction problem, and for Red Team Operations (RTOs), I try to pick the one that moves us closer to the operation goal, is unlikely to be detected, and is a path of least resistance.
But we should take a step back and ask: why are those our constraints?
Well, in my mind, one way we can decide on a course of action is to tie it to the engagement type. You might have Internal Penetration Tests with the goal of achieving a certain scenario, but without needing to be covert, whereas it would matter for RTOs.
In some operations, there could be additional constraints that make it a top priority to reduce liabilities that lead to attribution and other consequences in the event the attackers are detected. This is a concept known as “Program Security” (not to be confused with Operational Security)—we'll touch on this later. That said, these constraints aren’t likely to be the highest priority for many RTOs because if you get caught, the stakes aren’t as high, and you can eventually put your hands up and say it’s the Red Team.
I know there are varying definitions when it comes to these engagement types, but in my mind, what matters most for this decision-making process is identifying the constraints that have been imposed and then evaluating our options against them.
At this point, we’re going to take a small detour… And I’ll show another way to look at constraints by using six principles from Matthew Monte’s CNE (Computer Network Exploitation) framework. He wrote about these in his book called “Network Attacks and Exploitation” which is definitely a gem as far as offensive security books go.
Although I’ll try to spend the next few minutes describing these principles, I can’t quite do it justice here, so I’d recommend getting his book instead. It’s not too long of a read and you can probably spend a weekend ingesting the material. After I go through them, I’ll then map it back to how we can use these principles to make decisions during ops.
The first principle is Knowledge, which is the broad and deep understanding of technology, people, and organizations. If we think of this principle as a variable, we might consider what type of skillset we want to emulate. Could it be a script kiddie that can get some IP addresses and run tools against them? Or a more advanced adversary that not only understands the technology, but also the psychological characteristics of the people they’re targeting, and can find gaps to exploit in the target enterprise’s policy, norms, and organizational structure?
Say an attacker wanted to slow down or distract a target company’s defensive teams. Instead of leveraging a ransomware or DDoS campaign, they can approach it from a people and process perspective: for example, sending exaggerated alerts to high-level authority figures within the company who may not know any better, with the messaging conveyed in language crafted to worry them.
This is especially effective against companies with rigid organizational structures that deal more in nuts-and-bolts than bits-and-bytes. When employees are structurally entrenched in process, politics, and posturing, something like this can corner people into a box where the self-preserving choice, given the situation, is to push the pressure down to operational defensive teams and squeeze their bandwidth.
All of this achieves the same effect without the need for technical attacks.
The next one here is Innovation, which is creating or adapting technology and methodology to new circumstances. An example could be getting stuck on something, like a script not working, then making adjustments to adapt to the environment, and you’re good to go and no longer blocked. When I look at this as a variable, I see it as the degree to which we intend to demonstrate novel capabilities.
Awareness involves mapping out the operational domain, and being able to detect and monitor relevant events. So you can think of using BloodHound as a good example where you’re actually mapping out AD domains. And digging through internal wikis to identify targets can also help build awareness.
But mapping out the operational domain is only half of it; the other half is being able to get a sense of what’s relevant to you in near real-time. What you get when you combine those two is something like a Marauder’s Map: you can orient yourself in the environment and get a sense of where you should be going, but also understand the likelihood and consequences of getting caught.
Precaution is the fourth one, and this is about minimizing the impact of unwitting actions that are outside of your control. When a codebase is refactored, you might lose the backdoor you’ve implanted. Or when ACLs change, you might lose your access. The fact of the matter is that these things occasionally happen as part of day-to-day efforts in the target environment, usually without them knowing that you’re there, so this is where adding points of access that are redundant and diverse comes into play.
Operational Security is probably one almost all of us actively think about during Red Team Ops. And it involves minimizing three things: the exposure, detection, and reaction to an operation.
Lastly we have Program Security, which is containing the damage caused by the compromise of an operation. Like I mentioned earlier, when real attackers get caught, there’s more at stake, and they can have a lot more to lose. There are liabilities around attribution, attacker infrastructure, and the exposure of private capabilities.
I think this is something that can be emulated to a degree in most Red Teams, but it takes a lot of investment to emulate at a high level of sophistication. I'm not sure how common that is, or how often it would be appropriate to emulate to that degree, whether the service is provided in-house or externally.
Matthew’s book goes into detail with these principles and a lot more, so it's definitely worth a read.
These are the principles I try to keep in mind when doing operations. I'm definitely not perfect at it, but I can see how they fit in the context of this OODA loop model.
Whenever I have some attack path options available to me, I can decide which to choose by looking at these principles as variables in a data type we can call a “tradecraft policy”. Each variable maps back to the degree to which we intend to demonstrate the principle (perhaps as a Likert scale), and these variables become a mix of hard and soft constraints when a policy is defined. What this means is that any engagement type can be represented with a “tradecraft policy”.
As we review the options available to us, we can then evaluate them against the constraints and more objectively get a sense of how each of the available options score against the policy. Then we can pick the one that’s optimal (if there is one), otherwise we pick one within the feasible region of options.
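As a rough illustration of what a “tradecraft policy” could look like as a data type, here’s a sketch in Python. The field names, scales, and scores are all invented for the example, not taken from any real tooling.

```python
# Sketch of a "tradecraft policy": each principle becomes a variable on a
# 1-5 Likert-style scale, acting as a hard or soft constraint.
# All names and numbers here are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class TradecraftPolicy:
    knowledge: int         # soft cap: max skill level we intend to demonstrate
    innovation: int        # soft cap: max novelty of capabilities
    opsec: int             # hard floor: minimum OpSec required
    program_security: int  # hard floor: minimum Program Security required

def feasible(option, policy):
    # Hard constraints: the option must meet the minimum floors.
    return (option["opsec"] >= policy.opsec
            and option["program_security"] >= policy.program_security)

def score(option, policy):
    # Soft constraints: penalize options that exceed the Knowledge /
    # Innovation caps (path of least resistance scores highest).
    return -(max(0, option["knowledge"] - policy.knowledge)
             + max(0, option["innovation"] - policy.innovation))

internal_pentest = TradecraftPolicy(knowledge=2, innovation=1,
                                    opsec=1, program_security=1)
options = [
    {"name": "api_lsass_dump", "knowledge": 4, "innovation": 4,
     "opsec": 4, "program_security": 2},
    {"name": "taskmgr_dump_over_rdp", "knowledge": 1, "innovation": 1,
     "opsec": 2, "program_security": 2},
]
viable = [o for o in options if feasible(o, internal_pentest)]
best = max(viable, key=lambda o: score(o, internal_pentest))
```

Swapping in a different policy (say, one with a high OpSec floor) would shrink the feasible region and can change which option wins, which is the whole point of making the constraints explicit.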
Again, I’ll say that the substance of this shouldn’t be new to most of us, this is just part of our intuition. But that said, I find it useful to have a mental framework to cross-check that intuition and be able to defend decisions that way.
Now, when you bring these variables back to labels like “Red Team”, “Adversary Emulation”, “Internal Pentest”, and so on… I’ll say with what we’re seeing here, this isn’t a perfect depiction, and there’s certainly room for disagreement. Though in my mind, I try to put the labels aside and think about what these engagement types mean based on the constraints imposed on them.
For example, an Internal Pentest might take OpSec less seriously than a Red Team Operation, and an RTO might take Program Security less seriously than a CNE operation. It’s relative.
Moving on to the Act process: this is where you execute the decision you’ve deliberated on by using a TTP. For those of you familiar with the MITRE ATT&CK framework, you can think of an Action as executing any of the TTPs there. And every OODA loop process that occurs before this one can explain how and why you chose your TTPs.
In my mind (I’m not sure if this is a fair interpretation), but I would expand the Act process to include the things you do both in the operational environment and the non-operational environment.
In the operational environment, there are also administrative tasks like seeking guidance on targeting and tradecraft, and producing periodic status reports. In the non-operational environment, there’s continual self-growth and continual development with capabilities and infrastructure.
At this point, I’m going to move away from the abstract concepts, and put some grounding on them by way of a simplified operational example. This is a sanitized and modified version of a scenario I had encountered in the past: where there was an assume breach starting point, and the goal was to get access to a server with the crown jewels.
Starting with Observe, we gather a lot of information during operations, usually at a faster pace than we can articulate, so for the sake of this scenario I’ll just go over a few.
First, we have the data from our SharpHound collection, which includes computers, group memberships, active sessions, and so on from the domain we’re targeting. We also have credentials for a utility service account that were found in a file at some point during the op. As we’re traversing through the environment, we can see that this “Fancy EDR 2000” product is installed on most endpoints we come across, and that LSASS is lacking protections. And in addition to the observations we make from the operational environment, from a self-growth standpoint, we have this growing knowledge of various LSASS dumping techniques.
In the Orient process, you can see a depiction of a situation model which is the mental representation of the situation described in our problem.
Based on our BloodHound data, there’s a path available to the “crownjewels” server. But we can’t directly authenticate to the crownjewels server using the credentials we currently have. So we can instead pivot to a jump box which has an active session for an account that can authenticate to the target server.
Another path involves this “berthascott” user. It turns out that the service account we have can change the password of this user, and we can use that to directly log into the crownjewels server.
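To ground the idea, the two paths in this scenario can be modeled as a tiny edge list and enumerated with a plain breadth-first search. This is only a toy sketch of what BloodHound does over a full graph; the node names follow the scenario, and the edge labels are illustrative stand-ins.

```python
# Toy attack-path search over the scenario: a BFS over an edge list,
# in the spirit of what BloodHound does over a full AD graph.
# Node and edge names are illustrative, not real BloodHound output.
from collections import deque

edges = {
    "svc_utility":     [("CanRDP", "jumpbox"),
                        ("ForceChangePassword", "berthascott")],
    "jumpbox":         [("HasSession", "svc_crownjewels")],
    "svc_crownjewels": [("AdminTo", "crownjewels")],
    "berthascott":     [("CanRDP", "crownjewels")],
}

def find_paths(start, goal):
    # Enumerate all simple paths from start to goal.
    paths, queue = [], deque([[("start", start)]])
    while queue:
        path = queue.popleft()
        node = path[-1][1]
        if node == goal:
            paths.append(path)
            continue
        for label, nxt in edges.get(node, []):
            if nxt not in [n for _, n in path]:
                queue.append(path + [(label, nxt)])
    return paths

paths = find_paths("svc_utility", "crownjewels")
```

Both paths from the scenario fall out of the search: one through the jump box session, and one through changing berthascott’s password.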
We can tell from some previous experience (cross-checked with online searches), that this Fancy EDR 2000 product actually isn’t that fancy. It’s good at detecting malicious command lines, but not much else. So we might be good if we can manage to not use LOLBins that have easily signatured command lines.
Lastly, if we go with the credential dumping route, we can try to map our understanding of LSASS dumping techniques we know about to the operational environment. In the slide, they’re divided into techniques invoked by certain command lines or processes, versus techniques where you invoke certain sequences of Windows API functions.
It appears as though we have a couple options from the Orient phase to get to the crown jewels server, but the Decide process is where we can apply constraints to make an optimal choice.
In this scenario, we’re doing an “Internal Penetration Test” with the Blue Team looped in, so OpSec isn’t a very high priority. If we’re caught, it’s not the end of the op and we can keep going. But at the same time, we want to avoid Denial-of-Service where possible, so practically speaking, that rules out changing berthascott’s password, and we can’t get access to the crown jewels more directly.
Program Security is considered important, in the sense that if we have secret sauce for dumping credentials, we’d want to use them only when it fits the situation and the timing is right.
We’re also given some guidance to try to achieve the path-of-least-resistance so we can put a cap on capabilities that require higher levels of Knowledge and Innovation, unless required. This means we should put API-based LSASS dumping techniques on hold unless we need to use them.
We also know the EDR is great at blocking processes with suspicious command lines, so to work around that, we should use Task Manager to dump LSASS over RDP which doesn’t emit a suspicious command line.
We’ve now effectively filtered down our options to one that we can act on. We can use the utility service account to RDP into the jumpbox server, then launch Task Manager and dump LSASS, which can let us get credentials or hashes for that crown jewels service account. And with those, we can finally laterally move to the crown jewels server for collection.
Hopefully, I didn’t bore you too much with this section. The point here isn’t to teach how to dump LSASS or show any new operational technique. It’s just to spell out how the decision-making process in our day-to-day operations fits into this OODA loop model, so that when we get to the next section on EDR evasion and apply the same model, we can see the parallels more clearly.
Inside the EDR Sensor Loop
And that brings us to the main question: How do we win against EDR sensors?
Most of the historic public research I’ve read on EDR bypasses looked at the products as a black box. Many operators still take this approach: you try throwing a bunch of payloads at them and see what works based on what alerts show up in a customer dashboard. That collective experience shapes how we decide to bypass products the next time we’re up against them. The downside, though, is that we usually don’t see the technical reasons why something works or fails; it’s just anecdotal.
And that’s been changing: there’s a growing amount of public research on understanding per-product internals and taking advantage of them. From what I’m seeing, a larger subset of it focuses on the Observe part (like unhooking, disabling AMSI, and using direct syscalls to deliberately limit the EDR’s visibility).
It’s reasonable to approach it from this angle first, since it focuses on the sensor’s perimeter and we can more easily leverage existing knowledge of OS internals to understand how sensors interface with the OS. So it serves as a path-of-least-resistance in some way. There is also a smaller subset of offensive research that focuses on how the sensor makes sense of the raw events it receives and how it reacts to them.
I’ll go into these in the next few slides, but what I’m trying to convey here in this slide is that when we look at EDR sensors in this cognitive model, we should recognize that EDR sensors also have OODA loops, and that to win against them, using Boyd’s terms, you have to operate “inside” their OODA loops.
What that means is understanding EDR sensors in the context of each process of the loop, and then figuring out how to circumvent each process where appropriate. The way I’ve approached this is to loosely look at each process in the context of the CIA triad.
From there, you’ll also want to operate your offensive loop at a faster pace than the defensive loop. Although defenders have been catching up quickly—which is great—attackers still fundamentally have more agility by operating in smaller, nimbler groups.
Essentially, what you do is improve each process in your operator loop, in a way that allows you to make better and faster choices than defenders. And as a result this can create ambiguity, confusion, and occasionally paralysis for defenders which will slow down their ability to respond.
So that’s a very high-level overview of “how to win” according to the OODA model. I’m now going to dive a little deeper and take you through how EDRs work in the context of each process of the OODA loop. I’ll also go over how we can circumvent each process in the context of the CIA triad, and provide some examples.
I mentioned earlier that I was loosely applying the CIA triad to this. I’m not sure if I’m using the best terms here or if we even need to be this granular, but I added this slide here to provide more context.
With Confidentiality, you’re not really tampering with the sensor; it’s more about deep-dive research and analysis. You’re just learning more about how it works and how it’s configured so you can use that knowledge to win. But with Integrity and Availability, that’s where it can involve some level of tampering, where you’re shaping or disabling specific behaviours in a way that works toward your interests but against defender interests.
Starting with the Observe process, this involves the intake of information, which includes both observing OS-level events and accepting communications directed toward the sensor’s different components. When the sensor decides to perform an action on the system, it’s also in a position to observe the events that unfold from that interaction.
A few examples listed here include OS interactions (like being able to monitor process creations, file writes, API usage, etc.). There are also updates to signatures and models, as well as updates to the sensor’s configuration and policy. These, as we’ll go into in the next sections, help respectively with the Orient and Decide processes.
If we go back to the “Task Manager LSASS Dumping” example, we can ask ourselves what raw events are available for a sensor to observe?
We don’t have to get into the details as much, because that could be a talk on its own, but with what we have listed here, we have:
- A couple of kernel-mode callbacks that can let the sensor know about the creation of the Task Manager process, in addition to whether it created a process handle on LSASS.
- An ETW provider and a redundant inline user-mode hook for when Task Manager attempts to read process memory for LSASS.
- Another filter driver callback which lets it know about that dump file being created in the temp folder.
Of course, there could be more relevant and redundant forms of telemetry, but for the sake of the example, I’ll limit it to these.
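One way to picture that layered telemetry is as a simple mapping from each step of the activity to the channels that could observe it. This is only an illustrative sketch (the channel names are my assumptions, loosely following the slide), but it shows why evading a single channel, like unhooking, can still leave redundant coverage.

```python
# Illustrative mapping from each step of the "Task Manager LSASS dump" to the
# telemetry channels a sensor could observe it through. Channel names are
# assumptions for the example; redundancy is the point -- evading one channel
# can still leave the others intact.
telemetry = {
    "taskmgr_process_created": ["PsSetCreateProcessNotifyRoutine callback"],
    "handle_opened_on_lsass":  ["ObRegisterCallbacks callback"],
    "lsass_memory_read":       ["kernel ETW provider",
                                "inline user-mode hook"],
    "dump_file_written":       ["minifilter driver callback"],
}

def surviving_channels(evaded):
    # Which telemetry remains if we only evade the given channels?
    return {step: [c for c in chans if c not in evaded]
            for step, chans in telemetry.items()}

# Unhooking alone removes just one of the two redundant memory-read channels.
remaining = surviving_channels({"inline user-mode hook"})
```

This is the same reasoning we’ll come back to in the Orient section: bypassing a user-mode hook is only meaningful if there isn’t a redundant kernel-side channel reporting the same activity.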
When it comes to attacking the Observe process of the sensor’s loop, we have a few options available to us. First, we can get visibility into exactly what kind of telemetry is being collected from the OS. That’s what the Telemetry Sourcerer tool on the top-right does for enumerating kernel-mode callbacks, inline user-mode hooks, and ETW providers. It’s worth spending the time to understand what each of these can capture, so if you are tampering with sensors, it’s done in a meaningful way.
A popular option these days for affecting sensor integrity tends to involve bypassing user-mode hooks through unhooking, direct syscalls, or using alternate implementations of Windows API functions. Perhaps what’s less common is patching the injected sensor DLL to prevent it from reporting the telemetry to its service component.
Many sensors rely on kernel-mode callbacks for telemetry around processes, threads, objects, files, the registry, and so on. Although patching those out would require higher privileges since you’d usually load a driver, anecdotally speaking, it looks like several products don’t detect this right away.
With ETW and AMSI, you can patch user-mode API functions to interrupt processing. With appropriate privileges, it's usually possible to use the Windows API to remove providers from ETW trace sessions associated with sensors.
The last approach that comes to mind is getting familiar with the product’s interprocess-communication (IPC) channels, and using them as a way to spoof or tamper with communications from other sensor components.
Most of this isn’t new, but for me when I fit it into this OODA loop model, it forces me to think about what I haven’t considered from these other subsequent processes.
Continuing on the Observe process, I wanted to briefly introduce a small project I was working on with help from The Wover, which is something we’d like to open-source in the future.
It’s called Flashbang, and basically it’s a declarative-based capability for evasion. It borrows the “phishlets” concept from evilginx, in that you can have YAML “blindlet” files at a per-product or per-environment level. Each blindlet contains a sequence of actions (e.g. suppressing telemetry), which abstracts away the need to write code. Once a blindlet is enabled, you can start doing suspicious stuff with telemetry collection suppressed, and then you can disable it, which will roll back the changes.
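Since Flashbang isn’t public yet, here’s a purely hypothetical mock-up of what a blindlet could look like. The schema and field names below are invented for illustration and are not the real format.

```yaml
# Hypothetical blindlet mock-up -- field names are invented for illustration,
# not the actual Flashbang schema.
name: fancy-edr-2000
description: Suppress kernel telemetry before credential access
actions:
  - type: suppress-kernel-callback
    callback: process-creation
  - type: suppress-kernel-callback
    callback: object-handle
  - type: patch-memory
    target: sensor-service
rollback: true
```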
As I mentioned, a payload in Flashbang consists of a sequence of actions. You can see a few of them here on the slide, e.g. a few ways of suppressing telemetry, or doing memory patching, some utility functions too.
This project embraces the notion that there isn’t a universal or silver-bullet solution, but instead that there’s a toolbox, and the idea of creating templates to adapt to different products or situations.
A small example here, where you can see a payload from a blindlet on the left, and on the right the output from Cobalt Strike when it’s enabled (edited a little to make it more readable). The payload contains a driver settings object, which gives you some flexibility in loading it, and below that is the sequence of actions.
Here’s another example using a test harness against OpenEDR with some more verbose output. On the top right you can see the blindlet, and at the bottom right you can see the kernel-mode callbacks in Telemetry Sourcerer, showing that they haven’t been suppressed yet. When it’s enabled, you can refresh in Telemetry Sourcerer and see that they’re now suppressed, and it’s at this point you can do some malicious stuff with reduced risk of getting caught. Then you can disable it, which reverts the changes.
Moving onto the Orient phase of the sensor loop, this is where it processes its observations to internalize them and determine what its options are. If you recall from the operator loop, this is somewhat similar at a high level, but from the sensor side this involves a few things:
There’s the continual process of building an internal situation model to represent the system (which, at a high level, is similar to the crown jewels exercise we did in the earlier example). Then there’s normalizing the raw events received from the Observe process so they can be reasoned about by the sensor. The event data gets synthesized into the situation model, which allows the sensor to look for malicious activity through signatures and models.
In the next few slides, I’ll continue the LSASS dumping example to provide a more concrete but simplified flow of what you’re seeing on this slide.
Here’s a simplified example of what a situation model could look like. You can have a variety of objects of different types, like sessions, users, processes, and files.
The Task Manager process is created first, which emits an event from a kernel-mode callback registered via PsSetCreateProcessNotifyRoutine. This can be normalized as a PROCESS_CREATE event, which gets synthesized into the model, represented as the Bertha Scott user creating that process.
Next, when we right-click the LSASS process and dump it, that results in Task Manager creating a process handle on LSASS, which can be normalized as a PROCESS_ACCESS event. I’ve seen a few examples online where folks use direct syscalls to evade this, but that’s usually not meaningful, because the sensor may well be receiving the relevant telemetry from the kernel-mode callback rather than from a user-mode hook, so direct syscalls or unhooking would be useless.
Third, when the memory of that process is read using the handle, that can get normalized into a
PROCESS_READ event. And lastly, when the lsass.DMP file is created, that can be normalized as a file-creation event (something like FILE_CREATE).
This activity can be detected with a variety of behavioral signatures, each with their own pros and cons. Just for the sake of example, if there was a signature for Task Manager accessing LSASS, then what if you could just dump from another process? Or if there was a rule to catch processes that are not running as SYSTEM accessing LSASS, can we just elevate our agent, or run our dumping command line as SYSTEM instead?
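To illustrate that brittleness, here are two made-up behavioral signatures expressed as Python predicates over a normalized PROCESS_ACCESS event. Neither is taken from a real product; the point is just that a narrowly scoped rule invites an equally narrow bypass.

```python
# Two illustrative (invented) behavioral signatures for LSASS access, each
# expressed as a predicate over a normalized PROCESS_ACCESS event.

def sig_taskmgr_lsass(ev):
    # Fires only when Task Manager opens LSASS, so dumping from any
    # other process sidesteps it.
    return ev["source_image"].lower().endswith("taskmgr.exe") and \
           ev["target_image"].lower().endswith("lsass.exe")

def sig_non_system_lsass(ev):
    # Fires when a non-SYSTEM process opens LSASS, so running the
    # dump as SYSTEM sidesteps it.
    return ev["target_image"].lower().endswith("lsass.exe") and \
           ev["source_user"] != "SYSTEM"

event = {
    "source_image": r"C:\Temp\dumper.exe",  # not Task Manager
    "source_user": "SYSTEM",                # elevated first
    "target_image": r"C:\Windows\System32\lsass.exe",
}
hits = [s.__name__ for s in (sig_taskmgr_lsass, sig_non_system_lsass) if s(event)]
# Neither signature fires for this event.
```

Real products layer many overlapping signatures, so a single tweak like this rarely suffices, but the exercise is the same: each recovered rule tells you exactly which attribute to change.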
I think as operators, we are sometimes in the dark on exactly how an EDR is detecting a certain technique, and we’re in this brainstorming process to get around that. But the more we know about the specific malicious behaviours the product is looking for, the less it feels like we’re taking a shot in the dark. And it makes us faster at developing tailored bypasses that are specific to the product and environment we’re up against.
When it comes to attacks, the brainstorming we just did about bypasses is something we can enhance, if we can manage to recover behavioural signatures from the sensors. We can also recover any on-sensor ML models or static signatures to do the same exercise, but the caveat to keep in mind is that not all of this detection logic is stored on sensors. It depends on the product, but if a product supports prevention while the endpoint is offline, I think it’s reasonable to assume that whatever detection logic is relevant for that should be available on-sensor. That said, the caveat again is that there could be additional logic on the server-side for which we won’t have visibility.
The other capability we can gain from deeper sensor analysis is finding software bugs, and design or architectural issues. There are certain types of attack indicators (like suspicious command lines) that EDR products can model and develop detections for very quickly. But there are other attack indicators which can be harder to model, because perhaps the telemetry isn’t there or it’s unaccounted for in the way the events are being internalized in the sensor.
Other times it isn’t a modeling issue, but a consequence of the natural frictions or constraints imposed on a product, like endpoint resource utilization or privacy compliance. This depends on the product, but some common examples include being unable to scan large binaries, or not being able to auto-upload suspected maldocs, respectively. So these modeling issues and frictions tend to take longer to remediate, assuming they are a high enough priority, and the capabilities that generally take advantage of them could have a longer shelf-life.
On the Integrity and Availability side, you can try to modify detection logic or somehow try to degrade the functioning of that component, but in most products that might trigger anti-tampering mechanisms. From a path-of-least-resistance standpoint, I think the confidentiality attacks are more practical and less risky, especially if you can get that level of introspection from a lab environment. But if it makes sense to go down the avenue of tampering, then it’s worth considering some of the lower-cost efforts you can do, like limiting its visibility so you don’t have to dig as deep into individual products.
I’ve got a couple of non-exhaustive case studies I can go over for this section, the first one being by @commial. They looked into the Attack Surface Reduction or ASR signatures for Windows Defender. What they were able to do was extract and decompile relevant Lua scripts from VDM files, which are resource files for the engine. Within those, you can see generic ASR parameters for each rule, like path exclusions and monitored locations.
This is powerful stuff because from an operator standpoint it shines some light on whether a bypass attempt would have worked, by seeing details like this. For example, if the rule on the slide to prevent LSASS dumping is enabled, one way we can try to bypass it is by registering a service host DLL which should trigger the path exclusion. So by recovering behavioural signatures like this, it gives us something to work with—even if it's just as a starting point—compared to shooting in the dark.
The second case study is about Cylance’s ML model for file classification. The authors at Skylight Cyber published it a couple years ago, and I thought it was really interesting because there wasn’t much out like this at the time.
These models contain a treasure trove of intellectual property, with a lot of training usually going into getting them into a workable state. But when they’re deployed on endpoints, developers can lose control over how that model is used.
There were a couple of capabilities they developed from this. The first was to decouple the model from the product, and build a test harness to validate potentially malicious files against it. That way you can scan files without having the telemetry go up to the cloud. And the second was to find biases in the model and come up with “universal” strings you can pad any malicious binary with, confusing the model into producing false negatives.
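Here’s a toy illustration of that bias-abuse idea. This is not the actual Cylance model; it’s a contrived linear classifier that scores a file by the strings it contains, where a few strings with strongly “benign” weights act as universal padding that drags a malicious score below the detection threshold.

```python
# Toy illustration of model-bias abuse (NOT the real Cylance model): a
# linear classifier scores a file by the strings it contains. Strings with
# strongly negative (benign) weights act as "universal" padding; appending
# them to a malicious binary drags the score under the threshold.

WEIGHTS = {  # invented feature weights
    "VirtualAllocEx": 2.0,
    "WriteProcessMemory": 2.0,
    "CreateRemoteThread": 2.5,
    "Copyright (C) Contoso": -3.0,    # strongly "benign" feature
    "Microsoft.VisualStudio": -3.5,   # strongly "benign" feature
}
THRESHOLD = 1.0  # score above this => classified malicious

def score(strings):
    return sum(WEIGHTS.get(s, 0.0) for s in strings)

malicious = ["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"]
padded = malicious + ["Copyright (C) Contoso", "Microsoft.VisualStudio"]

print(score(malicious) > THRESHOLD)  # True: detected
print(score(padded) > THRESHOLD)     # False: padded into a false negative
```

The real attack worked the same way in spirit: once the model was decoupled from the product, its strongest benign features could be identified offline and abused at will.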
Moving to Decide as the next process, the sensor has to make decisions based on how it’s configured. As operators, this at a high level isn’t too different from how we select a course of action based on formal rules of engagement and tradecraft policy. But there is a difference, in that we can also make decisions from our intuition. I’m not sure sensors are quite there yet; most today make decisions primarily (though not exclusively) based on explicit rules and configurations.
A sensor could take different directions based on its configuration. An example could be how much data should be collected and/or phoned home. Clearly, it could be too noisy to receive the full firehose of raw data that the sensor is observing, so that has to be filtered further, and it’s also configurable so that the customer can decide on the tradeoff involved. There could also be more implicit constraints around reducing network bandwidth or endpoint utilization, and avoiding potential legal and privacy issues with what’s sent.
Some sensors can also take action on malicious behaviour based on how confident its detection is, which can include notifying the customer, and automated prevention if configured.
As far as attacks go, the main ones I consider are just being able to recover sensor configurations into something that’s human-readable, and doing the analysis to understand how that data is used in conditional logic throughout the sensor. Some decision logic could be hardcoded into the sensor binary, whereas other logic may be stored in a configuration file parsed by the sensor.
With tampering, I’ve had some success with manipulating those values at rest and with altering the logic by patching conditional jumps, but most sensors have some anti-tampering mechanisms like detecting when a handle is opened to its service process or monitoring changes to its settings in the registry. So you would likely have to find some sort of workaround to that first. And how easy or how difficult it is really depends on the product.
A couple of examples above show how configuration values could be stored.
On the left, this product has different flags for the collection types it supports, so if you can pull those values on a target system, you can get a better sense of what you’re up against. Similarly on the right is a screenshot of Matt Graeber’s script for parsing out Sysmon configurations which I thought was really cool—especially because Sysmon configurations contain the granular logic of what the sensor would and wouldn't report.
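As a sketch of what “pulling those values” can look like, here’s a minimal Python parser for a hypothetical sensor config blob. The binary layout, flag bits, and names are all invented; for a real product you’d recover the actual format through reverse engineering.

```python
# Sketch of recovering collection settings from a sensor's local config.
# The binary layout here is entirely hypothetical; real products each have
# their own formats, recovered through reverse engineering.
import struct

COLLECTION_FLAGS = {  # invented flag bits
    0x01: "PROCESS_EVENTS",
    0x02: "FILE_EVENTS",
    0x04: "REGISTRY_EVENTS",
    0x08: "NETWORK_EVENTS",
    0x10: "IMAGE_LOAD_EVENTS",
}

def parse_config(blob):
    # hypothetical layout: u16 version, then a u32 collection bitmask
    version, flags = struct.unpack_from("<HI", blob, 0)
    enabled = [name for bit, name in COLLECTION_FLAGS.items() if flags & bit]
    return {"version": version, "enabled": enabled}

# Simulate a config blob pulled off a target system.
blob = struct.pack("<HI", 3, 0x01 | 0x02 | 0x08)
cfg = parse_config(blob)
# cfg["enabled"] -> ["PROCESS_EVENTS", "FILE_EVENTS", "NETWORK_EVENTS"]
```

Even a crude parser like this tells you which event classes the sensor is actually collecting, which directly feeds the Decide process on our side of the loop.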
Moving to the Act process, this is where the sensor executes on the decisions it selects. And in understanding this, I find it useful to look at the sensor not as a monolithic agent, but rather as a patchwork of different components each with their own capabilities for action.
For example, a communications module could be responsible for sending telemetry to the cloud. An IR module could be responsible for responding to any server-side taskings. And a prevention module could handle terminating and deleting malware (if it’s configured to do so).
When it comes to attacks, I think what interests me most (at least in terms of insight gained) is getting visibility into the events generated by the sensor. Because once you have that, it can unlock the opportunity to build some other capabilities with program security in mind, like building offline lab environments to observe how sensors react to different TTPs.
You can also attempt to selectively filter events, where the idea is that you intercept the sensor events before they’re phoned home, and filter out the ones you don’t want the server to see. It’s something I’ve successfully tried before and while I think it’s an interesting concept, I personally haven’t found any need to actually use a capability like this. Also this might be easier on some products than others, so this and other tampering techniques are things you might want to look at opportunistically, especially if it’s something you can avoid having to do in the first place.
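To show the shape of the idea, here’s a minimal Python sketch of selective event filtering, dropping events that reference our tooling from an outbound batch before it’s phoned home. The event schema, field names, and paths are all invented; the hard part in practice is getting into the sensor’s event path in the first place.

```python
# Sketch of selective event filtering: intercept the sensor's outbound
# event batch and drop anything referencing our tooling before it leaves
# the endpoint. Event schema and paths are invented for illustration.

SUPPRESS_IMAGES = {r"c:\temp\dumper.exe"}  # our implant/tooling paths

def should_suppress(event):
    return event.get("image", "").lower() in SUPPRESS_IMAGES

def filter_batch(events):
    """Return only the events we're willing to let reach the server."""
    return [e for e in events if not should_suppress(e)]

batch = [
    {"type": "PROCESS_CREATE", "image": r"C:\Windows\System32\notepad.exe"},
    {"type": "PROCESS_CREATE", "image": r"C:\Temp\dumper.exe"},
]
outbound = filter_batch(batch)  # only the notepad event survives
```

A subtle risk with this approach is that dropping events can itself look anomalous server-side (gaps in sequence numbers, missing parent/child links), which is part of why I'd treat it as opportunistic rather than a default.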
I’ll briefly dig into this notion of having an offline telemetry backend. I wrote about it in depth last year, so feel free to check it out if that interests you. Basically, the idea is to treat an EDR sensor almost as if it’s malware in a contained environment, where you try to dissect its C2 protocol and communications.
At a high-level there are three steps to it—two required, one optional. The first one is capturing the raw event buffers from the sensor in real-time. The second is recovering any schemas so the buffers can be parsed to be more human readable, and optionally, third is pointing the sensor to a mock server that can collect the events and ship them off to your own SIEM.
In the example here, we’re using Elastic Stack. With that, we can also build dashboard views to make better sense of the data we’re getting.
Conclusions and Future Directions
Alright, so that pretty much covers the EDR sensor’s loop, so now let’s get into conclusions and recap. We’ll start with trying to directly answer the question of how to win against EDR sensors by getting inside their OODA loops.
Looking at the Observe process, my first recommendation is to identify the telemetry sources used by the sensors you’re up against. You can use that information to guide tradecraft and develop capabilities to blind or filter sensor events (when needed). I first approached this by building off the works of others and creating Telemetry Sourcerer which helped me understand this aspect of sensors better.
The problem with it is that it was meant for lab environments and couldn’t easily be used from an operational standpoint. So I started writing Flashbang after that to operationalize the blinding capability in a flexible way where you can write configuration files for different products, similar to how we have phishlets in evilginx for different websites.
The next recommendation is to get familiar with how telemetry flows between different components of the sensor, and see where you can find opportunities to tamper or interrupt that flow. Most sensors have a user-mode and a kernel-mode component, so you can leverage existing OS internals knowledge to dig into how IPC functions between those components. And like I mentioned earlier, an alternative to removing inline user-mode hooks is to keep them there but patch out functionality to prevent the injected sensor DLL from passing the telemetry to its core component. This is similar to how user-mode ETW reporting can be patched out.
With Orient, there’s certainly an opportunity to understand more deeply how sensors are able to identify malicious behavior. And when you’re able to see how things work at a more granular level, it naturally becomes easier to develop capabilities that retain some level of stealth.
So with that, the third recommendation is to get familiar with any on-sensor signatures or models at an ongoing technical level. Doing this keeps you inside the sensor’s loop by knowing how techniques get detected, and it builds a pipeline for bypass generation ideas to come to you for triage.
The fourth recommendation is to get familiar with any design or architecture limitations or bugs in sensors, because although some bypasses can be rendered ineffective with a quick signature update, others might need a software update which can take longer because of complexity and limited resources. Bypasses that take advantage of that are more likely to retain some longevity.
And in general, if you’re in a position where you’re continuously building a library of new capabilities and program security matters, you can take somewhat of a mechanical ratchet approach where the new capes slowly graduate out of lab environments only when it’s appropriate to do so. Especially if things are still working with less sophisticated techniques, it could be risky (depending on your tradecraft policy) to unnecessarily introduce more sophisticated capes in a live environment. Any opportunity to use a capability is also an opportunity to potentially expose it.
In the Decide process, we can abuse sensor configuration details to get a better idea of how it’s tuned for detection and prevention. There are already several useful tools and scripts available for detecting the presence of a sensor on an endpoint, but we can take that a step further to parse local configurations and use that to guide tradecraft.
In this script, we can see configuration data that’s locally stored on the endpoint, which can include information about settings you can abuse or be cautious about, as well as path exclusions and blacklisted hashes. Of course, how much or how little you can get really depends on the product you’re writing the script for.
The sixth recommendation involves the notion of formalizing some of our operator intuition into reasoning, where we could build toward tradecraft suggestion tools that internalize operational awareness against tradecraft policies. I think when we use Clippy to represent this concept, it shows both the promise and the potential for poor execution that can come with this.
And lastly with Act, the most utility I got out of this was getting direct visibility into the telemetry that sensors phone home. Having that offline telemetry backend was game changing for my workflow because on one screen you can have a VM with a sensor where you’re interactively exposing it to malicious stimuli, and on another screen you can see the telemetry it would have sent to the cloud without the risk of actually exposing information about the malware.
And of course you can build a detonation lab on top of this and validate payloads headlessly. So when you combine this with whatever payload generator you use for initial access, this effectively serves as that fitness function.
As you’re automatically generating payloads with various configurations, you can throw them against the validator programmatically, and filter out the ones that generate on-sensor detections or other suspicious telemetry.
Zooming out a bit, we can ask ourselves what this partially implies for the present (or future) direction in the space of operations R&D.
The first point that comes to mind is introducing more data-driven EDR bypasses in addition to anecdotal ones, because instead of solely relying on our intuition, we can now have technical reasons to justify a bypass. You can get a higher-level of confidence when you have direct knowledge of the signatures that are being bypassed or the telemetry that is or isn’t being sent by the sensor when it’s exposed to certain stimuli.
The second is to enhance operations with optimization techniques (such as those from operations research, evolutionary computation, or automated planning). In my mind, the decision-making process in OODA loops naturally lends itself to what are called “combinatorial optimization” problems, where you are trying to find the best solution to a problem out of a very large set of possible solutions.
And as operators, we are constantly trying to find the best attack path out of many, one that is optimized to reach the operational goal while staying in accordance with the tradecraft policy we use for an engagement. So as we put more effort into bridging that gap between the data we get and the decisions we make, we can make our OODA loop go faster, which is how you win according to this model. And it’s interesting to see how automation efforts in this space can further tighten that timeline between observations and decisions. I hesitate to call this stuff AI, only because it’s an area I’m not deeply familiar with, but I can see the value proposition from approaching things this way.
An example to consider is modeling initial access payload generation as an optimization problem.
Take a look at the graph in the slide above. On the left you can see that there are groups of nodes that are linked with edges representing legal transitions from one node to another. We start at the top with a delivery vector and end all the way down to an action like executing shellcode. When we use this model to build out initial access capabilities, it boils down to adding more nodes to each group, and creatively thinking of ways to add new edges to connect them.
When we think of how to generate the best solution out of the many combinations here, we can add weights to the edges by scoring them against the tradecraft policy that’s being used. Doing this can give you a ranked list of potential solutions to choose from. From there you can further filter down solutions by running them against a “fitness function”, namely a detonation lab with a private telemetry backend. This is where you can get a better approximation of whether a sensor with a representative configuration would have detected the payload, without the risk of sending all that telemetry to the cloud.
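Here’s a toy Python sketch of that pipeline: a small graph of payload-chain stages with edges weighted by a made-up tradecraft-policy cost, candidate chains ranked by total cost, and a stubbed fitness function standing in for the detonation lab. Every stage name, weight, and the detonation stub are invented for illustration.

```python
# Toy sketch of payload generation as combinatorial optimization: nodes are
# payload-chain stages, edges are legal transitions weighted by a (made-up)
# tradecraft-policy cost, and candidate chains are ranked by total cost.
# A stub stands in for the real fitness function (the detonation lab).

# stage -> list of (next_stage, policy_cost); lower cost = more policy-friendly
GRAPH = {
    "email_link":   [("iso", 2), ("zip", 1)],
    "iso":          [("lnk", 3)],
    "zip":          [("lnk", 2), ("js", 5)],
    "lnk":          [("dll_sideload", 4), ("msbuild", 6)],
    "js":           [("msbuild", 6)],
    "dll_sideload": [("shellcode", 1)],
    "msbuild":      [("shellcode", 2)],
}

def enumerate_paths(node, goal, cost=0, path=None):
    """Yield (total_cost, path) for every chain from node to goal."""
    path = (path or []) + [node]
    if node == goal:
        yield cost, path
        return
    for nxt, weight in GRAPH.get(node, []):
        yield from enumerate_paths(nxt, goal, cost + weight, path)

def detonation_ok(path):
    # Stub for the detonation lab: pretend chains going through "js"
    # generate on-sensor detections and get filtered out.
    return "js" not in path

ranked = sorted(enumerate_paths("email_link", "shellcode"))
viable = [(c, p) for c, p in ranked if detonation_ok(p)]
best_cost, best_path = viable[0]
# best_path -> ['email_link', 'zip', 'lnk', 'dll_sideload', 'shellcode']
```

In a real system the costs would come from scoring each transition against your tradecraft policy, and `detonation_ok` would be the offline telemetry backend validating the generated artifact, but the ranking-then-filtering structure stays the same.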
Putting this all together, it helps reduce uncertainty and serves as a way of tightening that gap between what you see, what you know, and what you choose. Anyway, that was a bit of a mouthful. But I wanted to highlight the potential of taking this cognitive model a step further.
Zooming out further, I think the quote that captures the moral of the story is “to cut down a tree in five minutes, spend three minutes sharpening your axe.” It’s nothing new to a lot of us, but it’s worth reminding ourselves to slow down, think, and prepare before we act.
Another observation was that while I was looking into this, it reinforced this notion that when you compare two things against the same model, you can gain some additional insights about each. The insight I got about EDR sensors was a deeper understanding of how they attempt to emulate a decision-making process that is somewhat similar to our own, even if it’s currently at somewhat of a nascent stage. And the second insight for operations was this quantification of tradecraft policy to give sanity checks on our intuition.
My next thought was that if we can actually quantify our tradecraft policy to some degree, how can that translate and get baked into the code-level for developed capabilities? And as is common in InfoSec, if we’re thinking about it, where has it already been done or attempted?
Alright, well that’s all I have. Thank you so much for taking the time to watch/read this and listen to me say a bunch of stuff ツ. I know that most of the technical details probably weren’t novel to many of you and my perspective is limited, but I hope this provided some food for thought and that I convinced some of you to look at things with this OODA Loop model. This model is something that’s much broader than its application for operations and EDR sensors.
I also have DMs open, and can continue this conversation where I can. Lastly, I wanted to thank those that reviewed this talk and thank the organizers of x33fcon for hosting this.