Join Viktor, a proud nerd and seasoned entrepreneur, whose academic journey at Santa Clara University in Silicon Valley sparked a career marked by innovation and foresight. From his college days, Viktor embarked on an entrepreneurial path, beginning with YippieMove, a groundbreaking email migration service, and continuing with a series of bootstrapped ventures.

Demystifying eBPF with Liz Rice: A Deep Dive into Kernel Programming and Security

19 May 2024 • 51 mins

In this episode, I’m joined by Liz Rice, a security expert and open-source advocate, for a deep dive into the fascinating world of eBPF. Liz’s expertise in kernel programming and security offers unique insights into how this technology is reshaping modern infrastructure.

We start by breaking down what eBPF actually is - dynamic programming of the Linux kernel to alter its behavior. What particularly caught my attention was how this technology has evolved far beyond its original purpose of packet filtering. Liz shares her introduction to eBPF through Thomas Graf’s presentation on Cilium at DockerCon 2017, while I highlight Brendan Gregg’s groundbreaking work at Netflix using eBPF for network diagnostics.

The conversation gets especially interesting when we explore how eBPF is revolutionizing traditional tools like iptables. Liz explains how eBPF’s efficiency makes complex tasks like network policy enforcement and zero-trust networking more achievable in modern cloud-native architectures. Her insights into combining eBPF with tools like WireGuard and IPsec for secure communication reveal the practical implications for modern infrastructure.

I was particularly intrigued by our discussion of Tetragon, a Cilium project that leverages eBPF for runtime security. Liz’s explanation of how it enables real-time, low-overhead monitoring and threat response showcases the practical applications of this technology. We also tackle recent supply chain security challenges, like the Log4j vulnerability, exploring how eBPF-based tools can adapt quickly to emerging threats.

If you’re interested in infrastructure security, kernel programming, or the future of cloud-native technologies, you’ll find plenty of practical insights here. Liz brings both deep technical knowledge and practical experience to the discussion, making complex kernel concepts accessible while maintaining their technical depth.

Transcript

[00:03] Viktor Petersson
Welcome back to Nerding Out with Viktor.
[00:06] Viktor Petersson
Today my guest is Liz Rice, who is a security geek and open source champion.
[00:11] Viktor Petersson
I got to know Liz a number of years ago when we were both part of the early London Kubernetes scene.
[00:16] Viktor Petersson
Since then, Liz has moved up in the world and written no less than two books.
[00:20] Viktor Petersson
The most recent one about eBPF, which I actually got a copy of right here, thanks to Liz.
[00:25] Liz Rice
Amazing.
[00:27] Viktor Petersson
And today's conversation will be largely around eBPF, but not limited to eBPF.
[00:33] Viktor Petersson
But without further ado, welcome to the show, Liz.
[00:35] Liz Rice
Thanks for having me, Viktor.
[00:37] Liz Rice
Good to be here.
[00:38] Viktor Petersson
It's always a pleasure catching up with you.
[00:40] Viktor Petersson
So I guess we should start with eBPF.
[00:44] Viktor Petersson
And so I guess a good question to frame that is, what's eBPF, and why should I care?
[00:51] Liz Rice
Sure.
[00:51] Liz Rice
So you might think, oh, it's eBPF.
[00:54] Liz Rice
That sounds like an acronym.
[00:55] Liz Rice
I want to know what it stands for.
[00:57] Liz Rice
I can tell you it kind of used to stand for extended Berkeley Packet Filter, but you can honestly forget that, because, like, it does so much more than packet filtering.
[01:08] Liz Rice
What it lets you do is program the kernel so we can change the way the kernel behaves dynamically.
[01:16] Liz Rice
And the reason why that's so powerful is that the kernel is involved with pretty much anything interesting that we do with computers.
[01:25] Liz Rice
So the kernel is really the bit of the operating system that interfaces with hardware.
[01:31] Liz Rice
So if you want to write to a file or, you know, get a network message or something like that, then the kernel is involved.
[01:38] Liz Rice
User space developers don't normally have to really think about it, but their programming languages are using these abstractions that call system calls that ask the kernel to do these things with hardware.
[01:52] Liz Rice
The kernel is also looking after things like permissions and privileges and coordinating multiple different processes.
[02:00] Liz Rice
So if we can change the way the kernel behaves, we can use that to influence all sorts of things in the system.
[02:10] Liz Rice
Really common use cases would be instrumenting events in the kernel and using that for observability; security, because we can affect whether or not things are allowed to happen.
[02:22] Liz Rice
So we can do some really cool dynamic security tooling; and networking, where we can essentially take network packets and kind of do whatever we want with them, possibly passing them back to the kernel's networking stack, or possibly doing something else.
[02:40] Liz Rice
So tons of really cool kind of infrastructure-level things that we can do with eBPF.
[02:47] Viktor Petersson
And if I'm not mistaken, in the early days, well, eBPF's claim to fame really was Brendan Gregg at Netflix trying to debug a firehose of data, where traditional network diagnostic tools could really not cope with the volume of traffic that Netflix had even back then.
[03:06] Viktor Petersson
And that's where it really started, if that is a correct assessment.
[03:10] Liz Rice
I mean, Brendan certainly was involved right from the early days and did some amazing work around using eBPF for that kind of observability, for tracing, for performance, you know, measurements.
[03:23] Liz Rice
Yeah, tons of really great work there.
[03:27] Liz Rice
In fact, behind me, I've got the poster for the eBPF documentary.
[03:31] Liz Rice
Little plug for that.
[03:34] Liz Rice
Alexei, who's at the top of the.
[03:36] Liz Rice
In the middle of that poster, Brendan is on there.
[03:40] Liz Rice
He's to the.
[03:43] Liz Rice
If we're looking at the screen, he's to the left of Alexei.
[03:46] Liz Rice
So that's Brendan Gregg.
[03:47] Liz Rice
Alexei's in the middle.
[03:49] Liz Rice
And he was at.
[03:51] Liz Rice
Well, what was Facebook back then?
[03:52] Liz Rice
Meta now.
[03:54] Liz Rice
And he had this kind of crazy idea to run programs in the.
[04:01] Liz Rice
In the kernel.
[04:02] Liz Rice
One of the consequences of Meta being involved really early on is, yeah, again, things being done at massive scale.
[04:10] Liz Rice
And every network packet that has been sent to or received from Facebook or Instagram or whatever else since, I want to say, 2016 from memory, has been processed by eBPF.
[04:25] Viktor Petersson
Oh, wow.
[04:26] Liz Rice
Which is massive.
[04:29] Viktor Petersson
That's.
[04:30] Viktor Petersson
Yeah, that is pretty crazy.
[04:31] Viktor Petersson
And,
[04:32] Viktor Petersson
Yeah, so I think I first came across eBPF when there was, I believe, a Falco port to eBPF.
[04:40] Viktor Petersson
And this must have been what, five, six years ago?
[04:43] Viktor Petersson
Probably something like that.
[04:45] Viktor Petersson
And that's when I started digging into eBPF and started playing a little bit with the tooling and got really interested.
[04:50] Viktor Petersson
And then I've seen quite a few talks around that since.
[04:53] Viktor Petersson
But you were, I believe you were at Aqua at the time, roughly, I believe.
[04:58] Viktor Petersson
Right.
[05:00] Viktor Petersson
Was that your first exposure to eBPF around that time period or when did you start diving into that?
[05:05] Liz Rice
So my first exposure to eBPF was seeing Thomas Graf, who I now work with at Isovalent, presenting on Cilium at a DockerCon.
[05:15] Liz Rice
And it was, I want to say 2017, might have even been 2016.
[05:21] Liz Rice
I think it was 2017.
[05:23] Liz Rice
And I was doing a talk about how containers work, and Thomas was doing this talk in the same track about Cilium and how it was using eBPF to do container networking.
[05:38] Liz Rice
And I remember thinking at the time, this is amazing, but back then, you needed a version of the kernel that was really cutting edge, that nobody, actually.
[05:48] Viktor Petersson
None of the cloud vendors ran it.
[05:50] Liz Rice
Yeah.
[05:50] Liz Rice
But I remember thinking at the time, this is going to be super interesting when people, you know, don't need to build their own kernel to try it out.
[05:58] Liz Rice
And so I'd kind of been keeping my eye on it.
[06:02] Liz Rice
And then when I was at Aqua, one of my colleagues was talking about using eBPF.
[06:08] Liz Rice
He was actually doing some kind of postgrad project using eBPF on Android, but he had the idea that perhaps we could use it for security purposes.
[06:21] Liz Rice
And yeah, around the same time, I think Falco was doing its port from kernel modules to eBPF. They still support both versions.
[06:31] Liz Rice
And then, so we built, at Aqua, a project called Tracee.
[06:38] Liz Rice
And then I spoke at the eBPF Summit in 2020, which Isovalent organised.
[06:46] Liz Rice
And I kind of realized through that event that although I felt like I knew something about eBPF, it was the tip of the iceberg, and there was this enormous additional amount of information, and that a lot of the experts, a ton of the expertise, was in Isovalent.
[07:08] Liz Rice
So that was the beginning of what has turned out to be a very fruitful conversation that we had back then about me getting involved with Isovalent.
[07:17] Viktor Petersson
Very nice.
[07:18] Viktor Petersson
I guess it started as a diagnostics tool, but evolved into far more of a programming language and a VM in a sense, from a programming perspective.
[07:31] Viktor Petersson
To run.
[07:34] Viktor Petersson
Talk to me a bit more about what you guys are doing right now at Isovalent with networking, routing and all these things around.
[07:43] Viktor Petersson
Because, I mean, it started with diagnostics, relatively easy to wrap your head around, but now we're doing far more complicated things like building networking and zero trust networking models around eBPF, on top of eBPF, I guess.
[07:58] Viktor Petersson
Walk me through how that actually works and, well, basically the architecture of how something can be accomplished using eBPF.
[08:07] Viktor Petersson
Really?
[08:07] Liz Rice
Yeah.
[08:08] Liz Rice
So the other character I should maybe point out from the poster, on the other side, to Alexei's right, is Daniel Borkmann, who is at Isovalent.
[08:21] Liz Rice
And he saw Alexei's idea about this kind of virtual machine in the kernel.
[08:29] Liz Rice
He saw that being posted to the mailing list.
[08:32] Liz Rice
And I think the way it was originally presented was maybe a little bit much for the kernel community to absorb.
[08:40] Liz Rice
But then he kind of looked at it and thought there's something really useful here.
[08:44] Liz Rice
And he was involved in kernel networking and he, I guess, saw the potential for using eBPF on the networking side.
[08:52] Liz Rice
So he worked with Alexei and they became the maintainers of the eBPF subsystem in the kernel at the time.
[09:02] Liz Rice
And I guess what Daniel could see, you know, what he was able to anticipate was we could intercept network packets and either manipulate them or drop them or redirect them in the kernel stack.
[09:22] Liz Rice
So we can insert eBPF programs into a variety of different places in the networking stack and make decisions about what to do with the packets that we see.
[09:32] Liz Rice
And for kernel, sorry, for container networking, this is super interesting because of the way we create network namespaces for different containers.
[09:44] Liz Rice
So typically, say you've got a container running on a host: the host has a network namespace, and we typically put the container into its own network namespace.
[09:58] Liz Rice
So if a packet comes, you know, it arrives over a physical interface into the host.
[10:04] Liz Rice
It gets traditionally routed all the way through the host's networking stack.
[10:10] Liz Rice
And then there's a virtual Ethernet connection into the container's network namespace, where it then goes through the whole networking stack again to get to the application.
[10:21] Liz Rice
And what we're able to do with eBPF is intercept the packet early on and say, well, I can see this is intended for the container, so rather than going all the way through the host network, I'll just send it directly to the virtual Ethernet connection and make that a much shorter path.
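The shortcut described above can be pictured with a toy Python model. This is only a sketch: the hop names (`host_netfilter`, `ebpf_hook` and so on) are labels invented for this illustration, not kernel symbols or Cilium APIs, and in a real system the decision is made by an eBPF program attached inside the kernel rather than by user-space code.

```python
# Toy model of packet paths; all hop names are hypothetical labels.
HOST_STACK = ["host_netfilter", "host_routing", "host_forwarding"]
CONTAINER_STACK = ["container_routing", "container_socket"]


def traditional_path():
    """Full host stack, then the veth pair, then the container's own stack."""
    return ["physical_nic"] + HOST_STACK + ["veth"] + CONTAINER_STACK


def ebpf_shortcut_path(dst_ip, container_ips):
    """An early hook spots a local container destination and redirects to its veth."""
    if dst_ip in container_ips:
        return ["physical_nic", "ebpf_hook", "veth"] + CONTAINER_STACK
    # Not for a local container: fall through to the normal host stack.
    return ["physical_nic", "ebpf_hook"] + HOST_STACK


if __name__ == "__main__":
    local = {"10.0.0.5"}
    print(len(traditional_path()), len(ebpf_shortcut_path("10.0.0.5", local)))
```

With these made-up hop lists, a packet for a local container skips every host-stack hop, while anything else carries on through the host stack as before.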
[10:39] Viktor Petersson
So you replace iptables entirely, because Docker container networking is basically black magic around iptables, essentially.
[10:48] Viktor Petersson
Right, which is very fragile in many ways.
[10:50] Liz Rice
And can be a performance hit, particularly in a dynamic world like Kubernetes where we're just creating and destroying IP addresses all the time.
[11:03] Liz Rice
And those require iptables to be rewritten.
[11:08] Liz Rice
The way iptables works, they have to be rewritten.
[11:11] Liz Rice
One change, the entire table has to be rewritten.
[11:15] Liz Rice
Doesn't scale very well.
[11:17] Liz Rice
But we, yeah, we can bypass all that with Cilium.
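A rough way to see why "one change, the entire table has to be rewritten" scales badly is to count writes. The sketch below assumes the behaviour as Liz describes it and contrasts it with a keyed map, which is the kind of structure eBPF programs typically use for state, where only the changed entry is written. The function names and the write-counting are hypothetical, for illustration only.

```python
def add_rule_by_rewrite(rules, new_rule):
    """Model 'one change rewrites the table': build a whole new list each time."""
    rebuilt = list(rules) + [new_rule]
    return rebuilt, len(rebuilt)  # (new table, entries written)


def add_entry_to_map(endpoints, ip, backend):
    """Model a keyed map: only the changed entry is written."""
    endpoints[ip] = backend
    return 1  # entries written


if __name__ == "__main__":
    rules, endpoints = [], {}
    rewrite_writes = map_writes = 0
    for i in range(1000):
        ip = f"10.0.{i // 256}.{i % 256}"
        rules, written = add_rule_by_rewrite(rules, (ip, "ACCEPT"))
        rewrite_writes += written
        map_writes += add_entry_to_map(endpoints, ip, "pod")
    print(rewrite_writes, map_writes)
```

In this model the rewrite approach does work proportional to the table size on every change, so total writes grow quadratically with the number of endpoints, while the map grows linearly.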
[11:20] Viktor Petersson
Okay.
[11:21] Liz Rice
And then we also get the option to look at those packets and say, use it for network policy so we can make decisions about dropping packets if they don't comply with the network policy.
[11:34] Liz Rice
So right from the really early days, you know, that initial demo that I saw Thomas doing back in 2017 was about network policies and the ability to drop packets and build these powerful network policies very easily.
[11:50] Viktor Petersson
Right.
[11:51] Viktor Petersson
And I guess there are two questions that come to mind around Docker networking, well, networking in a Kubernetes or cloud native workload.
[12:00] Viktor Petersson
The first one is, you've explained the physical routing to the container, but how would you do, in a zero trust world, VM-to-VM routing?
[12:13] Viktor Petersson
Is that something that's covered by users?
[12:15] Viktor Petersson
Because that would be, I guess, how do you see that?
[12:18] Viktor Petersson
I guess in the eBPF world, can you do host-to-host communication within that framework as well?
[12:24] Liz Rice
Yeah.
[12:24] Liz Rice
So if we think about it, you have to have a network connecting the machines, and, you know, we're not involved at the physical layer, but everything kind of above that we can be involved in.
[12:42] Liz Rice
Cilium's vision is really to enable all networking, you know, any workload to connect to any other workload without having to worry about whether it's in Kubernetes or on a legacy machine or where it is.
[12:59] Liz Rice
You know, in a kind of naive way, they're just IP addresses.
[13:05] Liz Rice
Now we might have to know how that IP address maps to a particular endpoint, but provided we know how to route to that IP address, we can.
[13:15] Liz Rice
Why can we not?
[13:18] Liz Rice
We can do that, but we can be quite sophisticated about knowing what all these different endpoints are,
[13:25] Liz Rice
having network policies, understanding things at layer seven as well.
[13:31] Liz Rice
So we could be doing things like domain-name-based network policies and API-aware network policies.
[13:41] Liz Rice
So for example, gRPC policies that are able to interpret different gRPC requests, or HTTP requests: only allow POSTs to this particular endpoint.
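The "only allow POSTs to this particular endpoint" idea can be sketched as a default-deny check on method and path. This is a simplified illustration, not Cilium's policy format: `HttpRule`, `is_allowed` and the `/orders` path are names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HttpRule:
    method: str
    path: str


def is_allowed(rules, method, path):
    """Default deny: a request passes only if some rule matches it exactly."""
    return any(r.method == method.upper() and r.path == path for r in rules)


if __name__ == "__main__":
    policy = [HttpRule("POST", "/orders")]  # hypothetical endpoint
    for method, path in [("POST", "/orders"), ("GET", "/orders"), ("POST", "/admin")]:
        print(method, path, "allow" if is_allowed(policy, method, path) else "deny")
```

A real API-aware policy would also have to parse the protocol and usually supports patterns rather than exact paths; exact matching keeps the sketch short.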
[13:59] Viktor Petersson
Right.
[14:01] Viktor Petersson
So are you able to do, I guess, encryption as well when you do node-to-node communication in the eBPF layer?
[14:08] Viktor Petersson
So you can actually do TLS directly in the eBPF layer?
[14:11] Liz Rice
Well, the way we typically do it is to leverage either WireGuard or IPsec.
[14:18] Liz Rice
You know, there's that old saying about don't roll your own crypto, so.
[14:25] Liz Rice
But we can do that again very efficiently. I'm more familiar with WireGuard than I am with IPsec at the kind of implementation layer.
[14:36] Liz Rice
And WireGuard presents itself as a virtual network interface, much like the way that containers present a virtual Ethernet interface.
[14:46] Liz Rice
So from a Cilium perspective, we can configure these interfaces and send packets.
[14:55] Liz Rice
We're making choices about where to send packets, and that can be to a WireGuard interface.
[15:04] Viktor Petersson
Okay, that makes a lot of sense.
[15:07] Viktor Petersson
One of the things that I got very excited about in the Kubernetes world in the early days, well, not so early days, was Istio.
[15:15] Viktor Petersson
And that is a project that had a lot of promise.
[15:19] Viktor Petersson
Everybody was really excited about it, but it kind of fell flat.
[15:23] Viktor Petersson
I know there are a lot of people who did implement it, but I haven't heard many success stories that didn't include a lot of engineers on pager duty 24/7 to monitor it.
[15:36] Viktor Petersson
Is that kind of the alternative to what you're looking at, or how do you see Istio fitting in together with this?
[15:44] Viktor Petersson
Or are they completely unrelated?
[15:46] Liz Rice
Yeah.
[15:46] Liz Rice
So it is really interesting when we think about what is a service mesh?
[15:52] Liz Rice
And if you ask two different people, you will get two different answers.
[15:56] Liz Rice
What you want from a service mesh depends on your requirements, your environment and what have you.
[16:02] Liz Rice
But really what service meshes are about is let's make sure applications can speak to each other.
[16:09] Liz Rice
Yeah, maybe there's some load balancing involved.
[16:12] Liz Rice
Maybe there's some, you know, things like rolling out load balancing between different versions of a particular service.
[16:19] Liz Rice
And I remember, early on when I joined Isovalent, talking to Thomas and he was saying, you know, we already have like 80% of a service mesh because, you know, we already have encryption, we already have load balancing.
[16:35] Liz Rice
Really, the only bit at that point we didn't have was ingress.
[16:38] Liz Rice
And we now support Ingress, and we now support Gateway API, which in the Kubernetes world is the kind of scalable big brother to Ingress.
[16:51] Liz Rice
And the advantage that we have in the Cilium world is that we're covering all the layers of the networking stack.
[16:59] Liz Rice
So Istio has kind of had to build additional resources to represent these kind of concepts that are really networking based that we can kind of handle within the networking stack.
[17:19] Liz Rice
So we think about, you know, our vision of just being able to connect everything to everything.
[17:25] Liz Rice
There's enough information in a Gateway API definition to tell us, well, okay, this request is supposed to be sent to this particular type of backend.
[17:36] Liz Rice
And we already had Kubernetes services, we already had the ability to load balance across them.
[17:42] Liz Rice
It's a relatively small lift to go from that to what people are calling a service mesh today.
[17:51] Liz Rice
So, yeah, I mean, when people say, okay, we're running Istio, how can we, you know, can Cilium do this in a more lightweight way?
[18:02] Liz Rice
You know, the answer is yes.
[18:04] Liz Rice
Do we reproduce all the Cilium resource types?
[18:06] Liz Rice
No.
[18:07] Liz Rice
All the Istio resource types?
[18:09] Liz Rice
No, we don't.
[18:11] Liz Rice
Do we use.
[18:12] Liz Rice
I guess the other point about it is the way that well, the way that we can avoid using.
[18:25] Liz Rice
So there's a new thing called ambient mesh coming into Istio, but let's just park that for a moment and talk about traditional Istio, where it uses sidecars in every pod.
[18:39] Liz Rice
And there are a few sort of downsides to sidecars that apply.
[18:47] Liz Rice
Whether we're talking about service mesh or any other kind of instrumentation, you have to get the sidecar container into your pod, which basically means restarting the pod.
[18:59] Liz Rice
So your instrumentation and your application lifecycle start being tightly coupled.
[19:08] Liz Rice
You have to be confident that all the pods are actually instrumented correctly, that the container is being injected, whether that's in CI/CD or some kind of webhook. It's got to happen. Whereas if we can do everything at the eBPF layer, we only have to instrument the host, we don't have to instrument every single individual pod.
[19:30] Liz Rice
So that saves us lots of resources.
[19:32] Liz Rice
It means we don't have to modify running pods.
[19:36] Liz Rice
We can just enable features using eBPF in the kernel layer and in the Cilium agent, and it just works for existing running applications.
[19:53] Liz Rice
The other cool thing is if you have a proxy running inside the pod, so we use the same proxy, we use Envoy, which is the same proxy that Istio uses.
[20:07] Viktor Petersson
Yeah.
[20:08] Liz Rice
But instead of having one inside every single pod, we have one per host.
[20:13] Liz Rice
In the sidecar model, every single packet has to traverse through the proxy and then out of the pod, whereas not every packet necessarily needs to be terminated.
[20:29] Liz Rice
If you've got a layer three packet or you've got UDP, you don't necessarily need to have gone through a proxy.
[20:37] Liz Rice
So we can avoid going to the proxy if it's not necessary.
[20:42] Liz Rice
And if you imagine you have two pods on the same host, if they're communicating with each other, it's going through two proxies in this sidecar model, because there's a sidecar in each pod, whereas if we've just got one proxy on the host, we only have to go through the proxy once, if at all.
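The proxy counting in that same-host example is simple enough to write down. The function below is a toy model of the argument only, covering just the case of two pods on one host; the model names `"sidecar"` and `"per_host"` and the `needs_l7` flag are invented for this sketch.

```python
def proxy_traversals_same_host(model, needs_l7):
    """Proxies crossed by traffic between two pods on the same host (toy model)."""
    if model == "sidecar":
        # One sidecar in the source pod and one in the destination pod.
        return 2
    if model == "per_host":
        # One shared proxy, skipped when no layer 7 handling is needed.
        return 1 if needs_l7 else 0
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    for model in ("sidecar", "per_host"):
        for needs_l7 in (True, False):
            print(model, needs_l7, proxy_traversals_same_host(model, needs_l7))
```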
[21:03] Viktor Petersson
Yeah.
[21:04] Liz Rice
So the performance is really quite significantly improved.
[21:09] Viktor Petersson
How do you deal with, I guess, attestation and workload identity?
[21:12] Viktor Petersson
Because that's two of the things that Istio does, right?
[21:15] Viktor Petersson
Because that's the mTLS or zero trust element of Istio.
[21:19] Viktor Petersson
Right.
[21:20] Viktor Petersson
If you're running a single, like one of the benefits, I guess, of running a sidecar is that you can tightly couple the identity with the workload, whereas if you're running on the host, I guess you have less tight mapping, I guess.
[21:36] Viktor Petersson
How are you thinking about that?
[21:37] Liz Rice
Yeah, so one point about mTLS is, absolutely, a lot of people have a requirement for all traffic to be encrypted at rest and in flight.
[21:53] Liz Rice
So you want to know that all connections are encrypted.
[21:57] Liz Rice
mTLS is not the only hammer for the encryption nail.
[22:01] Liz Rice
Right.
[22:02] Liz Rice
We talked about WireGuard and IPsec earlier.
[22:05] Liz Rice
So for a lot of use cases, if what you're looking for is encrypted traffic, you don't actually need mTLS from workload to workload.
[22:15] Viktor Petersson
Right.
[22:15] Liz Rice
You do want mTLS or something like it
[22:20] Liz Rice
if you are actually wanting to cryptographically verify the identities at either end of that connection.
[22:27] Liz Rice
And if you do, then mTLS is one way of solving that problem.
[22:33] Liz Rice
What we've done in Cilium is separate out the kind of handshake part that happens when you upgrade a TCP connection
[22:45] Liz Rice
to TLS; we've separated that from the encryption part.
[22:52] Liz Rice
So what we actually do is have the agent at either end use a SPIFFE identity.
[23:00] Liz Rice
We're using SPIFFE/SPIRE to get the cryptographic identity for each workload.
[23:06] Liz Rice
The agent actually does that handshake on behalf of the workload.
[23:12] Liz Rice
And if that handshake is successful, we then allow traffic to pass and we can use WireGuard as the encryption mechanism.
[23:22] Liz Rice
So the encryption and the identity verification are kind of separated out.
[23:29] Viktor Petersson
I see, I see.
[23:29] Viktor Petersson
But you still run some kind of PKI, then, within the cluster or within the workload as part of your agent server agent model, I guess.
[23:39] Liz Rice
Yeah.
[23:40] Liz Rice
We built this with a kind of pluggable API for the identity mechanism.
[23:46] Liz Rice
The out-of-the-box solution is SPIFFE/SPIRE, but it could be something else, right?
[23:52] Viktor Petersson
You could plug into your cloud based PKI if you wanted to, I presume.
[23:56] Viktor Petersson
Or.
[23:56] Liz Rice
Yeah, I mean, whether or not there's actually an implementation today against that API is, you know.
[24:03] Viktor Petersson
Yeah, but at the end of the day, it's X.509, right?
[24:06] Liz Rice
Yeah, yeah.
[24:06] Liz Rice
It's standard.
[24:07] Liz Rice
I.
[24:09] Viktor Petersson
Cool.
[24:10] Viktor Petersson
Let me take a step back and talk more about, I guess, the security of eBPF.
[24:16] Viktor Petersson
Obviously, you mentioned already there was a lot of reluctance in accepting this into the kernel early on, for obvious reasons.
[24:22] Viktor Petersson
Because now you can execute code from user space inside the kernel, which opens a Pandora's box of attack vectors, I guess.
[24:31] Viktor Petersson
What have you seen around that?
[24:33] Viktor Petersson
Obviously, it must be a reliable VM if it has made it into the mainline kernel.
[24:38] Viktor Petersson
There are a lot of paranoid people auditing that code before it makes it there.
[24:42] Viktor Petersson
What do you make of that?
[24:43] Viktor Petersson
And how do you see that?
[24:44] Liz Rice
Well, yeah, so the important thing about eBPF is it goes through this process of verification before the program is allowed to be, or as the program is loaded into the kernel, it gets verified, and if it doesn't pass verification, the kernel rejects it.
[25:02] Liz Rice
Now, what verification does is it ensures that memory access is safe, it ensures that programs will complete, that it can't crash.
[25:12] Liz Rice
You know, we're checking every single memory access to make sure there aren't any null pointer dereferences and things like that.
[25:22] Liz Rice
So that gives us some guarantees about the program being able to run successfully.
[25:32] Liz Rice
It can't bring down the kernel. What we can't do with the verifier is tell the difference between a malicious eBPF program and a, you know, a well-meaning eBPF program.
[25:45] Liz Rice
So a really great example is dropping packets.
[25:49] Liz Rice
You know, I might want to drop network packets because I'm enforcing a network policy, or if I'm a bad actor, I might want to drop network packets
[25:58] Liz Rice
to mess with your traffic, and the verifier has no way of knowing what my intent was when I wrote a program to drop certain network packets.
[26:09] Liz Rice
So this is where, you know, the provenance of your eBPF program is super important.
[26:16] Liz Rice
You need to trust your supplier of that program that the thing you're running is going to be, you know, well intentioned.
[26:27] Viktor Petersson
Yeah.
[26:28] Liz Rice
So supply chain security and signing and all that kind of thing.
[26:35] Liz Rice
Now you can absolutely do the kind of normal supply chain security around user space agents that typically come with an eBPF application.
[26:46] Liz Rice
So say, for example, if you're running Cilium, you don't actually have to get involved in the nitty-gritty of loading and unloading eBPF programs into the kernel.
[26:55] Liz Rice
There's user space agents to do that for you, right?
[27:02] Liz Rice
But if we're talking about a particular eBPF program and trying to have the kernel validate its provenance, you might think, well, can't you just sign the program?
[27:14] Liz Rice
The problem is you can't, because as you load the program into the kernel, it's actually getting adjusted by the user space loader to make sure that the bytecode matches the data structures in that kernel, the "compile once, run everywhere" approach.
[27:38] Liz Rice
So it means, because you're actually adjusting the instructions as you kind of load them, the program that you end up loading into the kernel is not the same as the program that was delivered, you know, that you downloaded from wherever you downloaded it, or that you compiled.
[28:00] Liz Rice
So you can't just sign, or you can't then have the kernel check that a signature matches, like, a hash of the instructions, because the instructions have changed.
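The signing problem can be shown with a small Python sketch using `hashlib`. Everything about the "program" here is hypothetical: instructions are modelled as `(opcode, operand)` tuples, `relocate` stands in for the loader's adjustment step, and `1256` is a made-up field offset. Real eBPF bytecode and relocation are far more involved; the only point is that a digest taken over the shipped instructions no longer matches once the loader has patched them.

```python
import hashlib


def digest(program):
    """Hash a program, modelled here as a list of (opcode, operand) tuples."""
    return hashlib.sha256(repr(program).encode("utf-8")).hexdigest()


def relocate(program, offsets):
    """Hypothetical loader step: swap symbolic field names for numeric offsets."""
    return [(op, offsets.get(arg, arg)) for op, arg in program]


if __name__ == "__main__":
    shipped = [("load_field", "task.pid"), ("exit", 0)]
    signed = digest(shipped)  # what a signature over the shipped file would cover
    loaded = relocate(shipped, {"task.pid": 1256})  # made-up offset for this kernel
    print("digest still matches:", digest(loaded) == signed)
```

This is why the proposals mentioned next look at verifying something other than a plain hash of the final instructions.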
[28:12] Liz Rice
So this is quite an interesting challenge for the eBPF kernel community at the moment.
[28:19] Liz Rice
And there are a few proposals around.
[28:24] Liz Rice
There's one I'm aware of where they're essentially validating that the user space agent that does the loading is signed and as expected, but having that verified.
[28:43] Viktor Petersson
From within, because, I mean, it sounds like eBPF is the ultimate bad actor toolkit in a sense, because you could essentially obfuscate or hide the content of, like, a line in the shadow file or password file or even the SSH config, and inject all kinds of rootkit backdoors that you would never see from user space because it's intercepted and obfuscated.
[29:08] Viktor Petersson
We even obfuscate.
[29:10] Viktor Petersson
Like, I've heard of attacks where they compromise, like, the ls command, for instance, so it doesn't output what you expect it to output and actually hides files and whatnot.
[29:19] Viktor Petersson
So I guess it's an interesting attack vector that's kind of new for this.
[29:23] Liz Rice
Yeah, yeah.
[29:24] Liz Rice
I mean, there's definitely, with great power comes great responsibility.
[29:30] Liz Rice
There are tons of interesting new types of attack that you could build with eBPF.
[29:37] Liz Rice
You can use eBPF to kind of monitor and observe eBPF.
[29:43] Viktor Petersson
Right.
[29:44] Liz Rice
So use eBPF to see what other eBPF programs are loaded, for example.
[29:50] Viktor Petersson
Becomes very meta.
[29:52] Liz Rice
Yeah, definitely.
[29:54] Liz Rice
But you can also get into a bit of an arms race there where you need the, you know, if the attacker kind of gets in there first, they might be able to obfuscate themselves.
[30:07] Liz Rice
You know, it's kind of like you want to know that the kernel that you boot is the kernel that you intended it to be.
[30:16] Viktor Petersson
Right?
[30:17] Viktor Petersson
Oh, we kind of got that with Secure Boot in a way.
[30:20] Viktor Petersson
Right.
[30:20] Viktor Petersson
But now, I guess eBPF is, in a sense, incompatible with Secure Boot, because it kind of makes it void in a sense, I guess.
[30:31] Liz Rice
Yeah, yeah.
[30:31] Liz Rice
I don't know whether there are, you know, you want to be the first eBPF.
[30:42] Liz Rice
Your eBPF monitoring code needs to be the first eBPF code loaded.
[30:47] Liz Rice
So whether.
[30:48] Liz Rice
Yeah, part of your.
[30:52] Liz Rice
I think there's more work to be done in that area.
[30:56] Viktor Petersson
Yeah.
[30:57] Viktor Petersson
And I think, I mean, there's probably a big opportunity for the Falcos of the world, right, to do IDS at the eBPF level and monitor all these bad actors.
[31:07] Viktor Petersson
And I guess that brings me to an interesting question that I've been pondering.
[31:11] Viktor Petersson
I've seen a lot on social media lately about the need for, I guess, antivirus in a traditional sense, more in the Linux world, which we've been blessed with not having to worry about for decades now.
[31:25] Viktor Petersson
But given things that transpired in the last few weeks, we're starting to get to a point where actually maybe we do need to think about these things again, and maybe that is something to do with eBPF.
[31:38] Viktor Petersson
What do you think?
[31:39] Liz Rice
So I feel like I need to mention the Cilium project called Tetragon, which is kind of like the next generation of Falco.
[31:50] Liz Rice
So Tetragon actually instruments some lower-level functions within the kernel.
[31:58] Liz Rice
So Falco is instrumenting system calls.
[32:00] Liz Rice
Tetragon is instrumenting sort of some well-known locations within the kernel, but the maintain.
[32:10] Liz Rice
So there are some reasons why that's actually more secure than the way that Falco does syscall attachment.
[32:18] Liz Rice
But the real improvement is that Tetragon can filter events within the kernel.
[32:24] Liz Rice
So with Falco, the eBPF code is going to tell you, let's say you're looking for file open events.
[32:31] Liz Rice
So it will report all of those file open events and then in user space filter out the ones that match the policy.
[32:39] Viktor Petersson
Right.
[32:40] Liz Rice
With Tetragon, it's actually pushing the policy into the kernel.
[32:44] Liz Rice
So you do the filtering in kernel, you end up sending a much much reduced set of events to user space, and therefore you get incredibly low overhead.
[32:58] Liz Rice
I think if you're using Falco with a significant set of policies, you will really notice the overhead.
[33:06] Liz Rice
I want to say there was a benchmark that the Tetragon team did.
[33:10] Liz Rice
I'm terrible at remembering numbers, but I'm pretty sure that the Falco implementation was some single-digit percentage of CPU and the Tetragon equivalent was like 0.1.
[33:23] Viktor Petersson
Oh wow.
[33:23] Viktor Petersson
That is a significant improvement.
[33:25] Viktor Petersson
Yes.
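To make the user-space versus in-kernel filtering point concrete, here is a toy Python model. It is only a sketch: it does not use eBPF, Falco, or Tetragon, and every name in it (`FileOpenEvent`, `userspace_filtering`, `in_kernel_filtering`, the sample paths) is made up for illustration. It counts how many events would have to cross from the kernel to user space under each approach.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileOpenEvent:
    """Hypothetical stand-in for a file-open event seen in the kernel."""
    pid: int
    path: str


def matches_policy(event, watched_prefixes):
    """Return True if the event touches a path the policy cares about."""
    return any(event.path.startswith(prefix) for prefix in watched_prefixes)


def userspace_filtering(events, watched_prefixes):
    """Report every event to user space, then apply the policy there."""
    sent_to_userspace = list(events)  # every event crosses the boundary
    alerts = [e for e in sent_to_userspace if matches_policy(e, watched_prefixes)]
    return len(sent_to_userspace), alerts


def in_kernel_filtering(events, watched_prefixes):
    """Apply the policy at the source; only matches cross the boundary."""
    sent_to_userspace = [e for e in events if matches_policy(e, watched_prefixes)]
    return len(sent_to_userspace), sent_to_userspace


# 998 uninteresting log-file opens plus two sensitive ones (made-up data).
events = [FileOpenEvent(pid=100 + i, path=f"/var/log/app/{i}.log") for i in range(998)]
events += [FileOpenEvent(pid=7, path="/etc/shadow"), FileOpenEvent(pid=8, path="/etc/passwd")]
policy = ("/etc/shadow", "/etc/passwd")

crossed_a, alerts_a = userspace_filtering(events, policy)
crossed_b, alerts_b = in_kernel_filtering(events, policy)
print(crossed_a, crossed_b, alerts_a == alerts_b)
```

Both approaches raise the same alerts; the difference is how much data has to be copied and processed to get there, which is where the overhead gap described above would come from.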
[33:28] Viktor Petersson
The other thing I wanted to cover was obviously the XZ vulnerability, or supply chain security attack, that happened.
[33:35] Viktor Petersson
Well, we're recording this on April 5.
[33:37] Viktor Petersson
I don't know when this will go live, but this transpired in the last, I guess, week now.
[33:42] Viktor Petersson
And I was curious about the blog post that was on Isovalent's blog about detecting this vulnerability.
[33:50] Viktor Petersson
And I was curious about the technical side of that, how do you actually detect those vulnerabilities or those.
[33:56] Viktor Petersson
That injection, I guess.
[33:57] Liz Rice
Yeah.
[33:58] Liz Rice
So that's using Tetragon.
[34:00] Liz Rice
I haven't looked at the blog post in enough detail to know exactly what events it's looking for.
[34:05] Liz Rice
But I think it's a really good example of how, within a really short space of time of that vulnerability being known, we were able to write a policy.
[34:17] Liz Rice
It's just a policy.
[34:18] Liz Rice
We didn't have to change Tetragon.
[34:20] Liz Rice
We wrote a policy that detects and informs.
[34:25] Liz Rice
Seeing that XZ.
[34:30] Viktor Petersson
Yeah, I mean, because that's.
[34:31] Viktor Petersson
It is something that I think.
[34:33] Viktor Petersson
I mean, supply chain security, as you alluded to before, is a massive field that is evolving quickly.
[34:39] Viktor Petersson
And I guess we're not quite ready for it, to be honest, in the open source world, because you start to see more and more of these state-sponsored attacks against the open source community and open source libraries maintained by a single guy. The OpenSSL attack springs to mind, where there are like three people on the planet that actually understand its source code and they're all working as volunteers on these projects.
[35:06] Viktor Petersson
Where's your head around this in terms of future of the industry, really, like supply chain security?
[35:11] Liz Rice
Yeah, I mean, as well as the sort of individuals who are involved and the fact there is, how do you keep track of what each individual is doing?
[35:21] Liz Rice
There's also just the sheer scale, the number of different possible dependencies that people have out there.
[35:27] Viktor Petersson
Yeah.
[35:27] Liz Rice
And I think, I mean, I've been saying for a really long time now that I think runtime security is really the important part of this.
[35:38] Liz Rice
So, I mean, I definitely believe in defence in depth, you know, don't get me wrong, the supply chain security stuff is incredible, incredibly important.
[35:46] Liz Rice
But what we really want to be able to do is detect malicious behavior or unexpected behavior at runtime, this whole kind of IDS idea.
[35:56] Liz Rice
And I think if we contrast this with networking, everybody has run firewalls, you know, enforcing network traffic drops for decades.
[36:10] Liz Rice
I'm going to say, you know, it's not a new thing to have a firewall that will enforce your network policy.
[36:19] Liz Rice
In the runtime world we seem to be, or in the kind of program execution world, there's a lot more hesitancy about this idea that like, well, if we try to spot something doing something malicious, we might inadvertently also catch, you know, expected behavior, and then we're going to get in the way of our programs and we're going to cause more problems than we solve.
[36:45] Liz Rice
And also the overhead concern about if we're going to monitor everything that our programs are doing, is there going to be any CPU left to actually do anything?
[36:55] Liz Rice
And I think this is where eBPF is proving to be incredibly powerful with implementations like Tetragon.
[37:03] Liz Rice
Because if we can find a way to express what intent, what, you know, what's intended in a much better, we don't want.
[37:14] Liz Rice
A lot of this has been done with, like, seccomp and trying to interpret this at the syscall layer, and who knows which syscall.
[37:23] Liz Rice
You know, if you're writing a web application, you have no idea which syscalls you're intending to call, but what you probably do know is things like whether you're expecting to talk to a database and what domain names you're expecting to communicate with, and what files you're expecting to access.
[37:41] Viktor Petersson
If you get a read on your password file from your Python library, there's probably a red flag there.
[37:46] Liz Rice
Exactly.
[37:47] Liz Rice
Yeah.
[37:48] Liz Rice
So if we can express those policies in ways that make sense to humans, so that we can say here's the set of files I'm happy to access, here's the set of domain names I'm happy to access, and eventually not just tell me if we see activity outside of these bounds but actually just prevent it, then that's something new that we can do with Tetragon.
[38:13] Liz Rice
In the eBPF world until now, we've had the ability to send an event to user space and say, oh, I think this is malicious, and then have user space go, this is a malicious process, so I'm going to kill the process.
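As a rough illustration of a policy written in human terms (files and domain names rather than syscalls), here is a small Python sketch. It is not Tetragon's policy format or API; the `WorkloadPolicy` class, its fields, and the sample paths and domains are all hypothetical. The `enforce` flag models the step from "tell me" to "prevent it".

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class WorkloadPolicy:
    """Hypothetical policy: what this workload is expected to touch."""
    allowed_file_globs: list = field(default_factory=list)
    allowed_domains: set = field(default_factory=set)
    enforce: bool = False  # False: only report; True: block as well

    def _verdict(self, allowed: bool) -> str:
        if allowed:
            return "allow"
        return "deny" if self.enforce else "alert"

    def check_file(self, path: str) -> str:
        """Classify a file access against the allowed glob patterns."""
        return self._verdict(any(fnmatchcase(path, g) for g in self.allowed_file_globs))

    def check_domain(self, domain: str) -> str:
        """Classify an outbound connection by destination domain name."""
        return self._verdict(domain.lower() in self.allowed_domains)


policy = WorkloadPolicy(
    allowed_file_globs=["/srv/app/*", "/tmp/app-*"],
    allowed_domains={"db.internal.example", "api.example.com"},
)

print(policy.check_file("/srv/app/config.yaml"))  # inside the bounds
print(policy.check_file("/etc/passwd"))           # outside: reported only
policy.enforce = True
print(policy.check_domain("attacker.example"))    # outside: blocked
```

The point of the sketch is the shape of the policy: an application owner can usually list the files and domains a service needs, even if they could not list its syscalls.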
[38:26] Liz Rice
But by that time it's done asynchronously, so bad things would have already happened.
[38:33] Viktor Petersson
Right.
[38:34] Liz Rice
With Tetragon, we can intercept those, you know, let's say it's a file open or a socket open.
[38:39] Liz Rice
We don't like the look of it.
[38:40] Liz Rice
It doesn't match policy.
[38:42] Liz Rice
We can actually kill the process synchronously from within the kernel.
[38:47] Liz Rice
So that syscall never gets a chance.
[38:50] Liz Rice
And it's much more powerful.
[38:52] Liz Rice
And yeah, if we can express the policies in a meaningful way, I think that's the future of security.
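The difference between killing a process asynchronously from user space and stopping it synchronously at the point of interception can be sketched with a toy simulation. This is plain Python, not kernel code; the function names, the `reaction_delay` parameter, and the sample operations are assumptions made purely for illustration.

```python
def is_malicious(operation):
    """Hypothetical policy check: flag reads of the shadow password file."""
    return operation == "open /etc/shadow"


def run_with_async_kill(operations, reaction_delay):
    """The detection event is handled later, so the process keeps running
    for `reaction_delay` more operations before it is killed."""
    completed = []
    kill_at = None
    for index, operation in enumerate(operations):
        if kill_at is not None and index >= kill_at:
            break  # the kill finally lands here
        completed.append(operation)  # the operation has already happened
        if kill_at is None and is_malicious(operation):
            kill_at = index + 1 + reaction_delay
    return completed


def run_with_sync_kill(operations):
    """The check runs inline, so the offending operation never completes."""
    completed = []
    for operation in operations:
        if is_malicious(operation):
            break  # stopped before the operation takes effect
        completed.append(operation)
    return completed


operations = [
    "open /srv/app/config.yaml",
    "open /etc/shadow",
    "connect attacker.example:443",
    "write /tmp/out",
]

async_result = run_with_async_kill(operations, reaction_delay=1)
sync_result = run_with_sync_kill(operations)
print(async_result)
print(sync_result)
```

In the asynchronous run the sensitive read and the outbound connection both complete before the kill lands; in the synchronous run neither does.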
[39:00] Viktor Petersson
There's probably like a Pareto rule at play there as well, where you could probably block 80% of attacks with just a very small subset of blocks, really.
[39:11] Viktor Petersson
Right.
[39:12] Viktor Petersson
Because there are some common behaviors, like your Go library on your web server or your app server is unlikely to need to open a socket or whatever it may be.
[39:20] Viktor Petersson
Right.
[39:21] Viktor Petersson
For a backdoor or something like that.
[39:22] Viktor Petersson
Right.
[39:24] Liz Rice
And I think many attacks, if they're intended to exfiltrate data in any way, they're going to have to do some kind of network activity, or maybe they're going to have to write a file to somebody's thumb drive or something.
[39:38] Viktor Petersson
But it's definitely, they need to get it out somehow.
[39:42] Liz Rice
Yeah.
[39:44] Liz Rice
Should be possible to write policies that contain what we expect to happen.
[39:52] Viktor Petersson
And I guess an argument for that approach is code obfuscation is becoming crazy good with AI these days.
[40:02] Viktor Petersson
There are so many ways to obfuscate code today that you can't write a regex to fix this.
[40:09] Viktor Petersson
It's near impossible because AI will think of ways to obfuscate things that nobody's ever thought about, and it can do it uniquely per payload for a million payloads.
[40:20] Viktor Petersson
Right?
[40:20] Liz Rice
Yeah.
[40:22] Viktor Petersson
But the underlying way it needs to either read data or send or write data is essentially the same.
[40:31] Viktor Petersson
Right.
[40:31] Viktor Petersson
So I would agree.
[40:32] Viktor Petersson
I think that playbook probably is a lot more sound to catch a lot of issues rather than trying to do it on a more tailored basis.
[40:41] Viktor Petersson
I guess.
[40:43] Viktor Petersson
So.
[40:44] Viktor Petersson
I guess what you're saying, what you're thinking, is IDS is the new antivirus for the Linux world, I guess, in eBPF's user space.
[40:54] Liz Rice
Yeah, yeah.
[40:56] Liz Rice
I mean, I don't know whether we'll continue using terms like IDS, and whether or not that has the right, you know, it doesn't necessarily leave people with the right kind of sense of excitement and power that they need.
[41:13] Viktor Petersson
Yeah, I mean it is funny that this comes up, I see it at least a few times a year at Screenly when we do, like, security audits or security surveys for customers.
[41:23] Viktor Petersson
And so often it comes up, like, what's your antivirus vendor for your Linux servers?
[41:28] Viktor Petersson
And I was like, I don't have one.
[41:33] Viktor Petersson
But that's still, I've heard this conversation from plenty of friends of mine who are CTOs for companies that do sell to enterprise, that they end up installing some McAfee garbage on their servers purely to make the buyers happy.
[41:49] Viktor Petersson
Not because it actually solves the problem, but because it's just the path of least resistance to get them off your back, get compliance off your back.
[41:57] Liz Rice
This is a problem with all sorts of things in the security world where there's a set of compliance checklists and people just want to know that you've ticked the box.
[42:08] Liz Rice
Whether or not that's a meaningful or sensible thing.
[42:13] Liz Rice
It's a bit like the password change rules.
[42:16] Liz Rice
You know, everybody has known for years that you shouldn't ask humans to change their passwords because they'll just fall into a pattern that makes it easier to identify what their password is.
[42:27] Liz Rice
And yet there are still, you know, organizations out there that will require, you know, Microsoft.
[42:35] Liz Rice
Yeah.
[42:38] Liz Rice
Really?
[42:39] Viktor Petersson
Yeah, I mean it largely comes back to these compliance frameworks.
[42:44] Viktor Petersson
Now we have a new NIST framework. I haven't read the new NIST one in depth, but hopefully that's gonna debunk some of the myths, particularly around passwords, because I think that password reset and the password strength, I think that derives from the first NIST standard, which is what, 15, 20 years old by now?
[43:01] Viktor Petersson
Well, at least ten years old.
[43:02] Liz Rice
I think you're right.
[43:03] Liz Rice
I think that has been, certainly some of the compliance frameworks have wised up to that.
[43:08] Viktor Petersson
Well, you would hope.
[43:09] Viktor Petersson
Well, now even, like, SBOMs are part of the new, I believe, or at least alluded to in the NIST 2.0 framework.
[43:15] Viktor Petersson
I believe so at least that's up a little bit there.
[43:20] Viktor Petersson
And maybe that's something I have covered in the past on previous episodes.
[43:25] Viktor Petersson
But I'm curious about your view of the SBOM world, because everybody talks about SBOMs these days and it's the cool thing to talk about, but I'm curious, where is your head around that and how do you see SBOMs fixing supply chain security?
[43:41] Liz Rice
Yeah, I mean it's a world that I'm less close to now in what I'm doing these days.
[43:46] Liz Rice
But from what I can see there's some good work being done.
[43:54] Liz Rice
I suppose I do worry a little bit that we'll come up with all these amazing ways of enumerating every possible piece of software and doing all these.
[44:07] Liz Rice
It's all about dependency management and vulnerability management.
[44:12] Liz Rice
But all of that work will always take time.
[44:15] Liz Rice
And if that distracts people from thinking about "but what about the runtime?" then, you know, I totally understand why supply chain security has had the attention it's had, you know, the bill of materials is an essential part of that thinking.
[44:36] Liz Rice
But I do worry a little bit that we might be, you know, I can't think of there's a, you know, an idiom, a cliche about, you know, looking over here because, you know.
[44:50] Viktor Petersson
Right.
[44:51] Liz Rice
Being distracted from the real.
[44:53] Viktor Petersson
Yeah.
[44:54] Viktor Petersson
I mean if it's purely an exercise about collecting this for the sake of collecting it, then I completely agree with you that if it's not actionable, it's kind of pointless, I guess.
[45:04] Liz Rice
Yeah.
[45:04] Liz Rice
And I think people worry about like false positives and just being overwhelmed and.
[45:10] Viktor Petersson
Yeah, well that's a massive one, right.
[45:13] Viktor Petersson
Like, it's trivial to pull an SBOM from most software stacks today.
[45:17] Viktor Petersson
It's pretty trivial to correlate that SBOM file with some CVE database, and then you get a list of, oh, you're vulnerable to these 20 CVEs, but in reality you're probably vulnerable to, like, maybe one of them, and many of them are going to be false positives, right?
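The correlation step described here can be sketched in a few lines of Python. Everything in this example is made up for illustration: the component names, the `EXAMPLE-` advisory IDs, the `fixed_in` field, and the simple dotted-number version comparison are assumptions, not a real SBOM format or vulnerability feed.

```python
def parse_version(text):
    """Turn a dotted version such as '1.4.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical SBOM: the components that ship in the application.
sbom = [
    {"name": "libalpha", "version": "1.4.2"},
    {"name": "libbeta", "version": "2.0.0"},
    {"name": "libgamma", "version": "0.9.1"},
]

# Hypothetical advisory feed: each entry names the first fixed version.
advisories = [
    {"id": "EXAMPLE-0001", "package": "libalpha", "fixed_in": "1.5.0"},
    {"id": "EXAMPLE-0002", "package": "libbeta", "fixed_in": "1.9.0"},
    {"id": "EXAMPLE-0003", "package": "libdelta", "fixed_in": "3.0.0"},
]


def match_advisories(sbom, advisories):
    """Return advisory IDs whose package is present at an unfixed version."""
    installed = {c["name"]: parse_version(c["version"]) for c in sbom}
    findings = []
    for advisory in advisories:
        version = installed.get(advisory["package"])
        if version is not None and version < parse_version(advisory["fixed_in"]):
            findings.append(advisory["id"])
    return findings


findings = match_advisories(sbom, advisories)
print(findings)
```

A match on name and version alone says nothing about whether the affected code path is ever exercised, which is why a list produced this way tends to contain the false positives mentioned above.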
[45:37] Liz Rice
Yeah, I think there's some interesting work.
[45:39] Liz Rice
One of my ex-colleagues from Aqua has a startup called Backslash where they're analyzing which of those vulnerabilities are actually reachable in your application.
[45:53] Liz Rice
I'm sure there are other startups doing similar work, and then there's also the approach of, you know, like Docker Slim, having a smaller image so that.
[46:03] Liz Rice
Yeah, you know, the smaller it is, the smaller the likelihood that you have that dependency, and reducing CVEs that way.
[46:10] Liz Rice
So there's definitely, you know, interesting things that can be done there.
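One way to picture the "actually reachable" idea is a walk over a call graph from the application's entry points. The sketch below is a generic breadth-first search in Python; it is not how Backslash or any other product works, and the call graph, function names, and `EXAMPLE-` advisory IDs are invented for illustration.

```python
from collections import deque


def reachable_functions(call_graph, entry_points):
    """Breadth-first walk of the call graph starting at the entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        current = queue.popleft()
        for callee in call_graph.get(current, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


# Hypothetical call graph: caller -> functions it calls.
call_graph = {
    "app.main": ["app.handle_request"],
    "app.handle_request": ["libalpha.parse"],
    "libalpha.parse": ["libalpha.validate"],
    "libbeta.legacy_decode": ["libbeta.unsafe_copy"],  # never called by the app
}

# Hypothetical mapping from advisory to the function it affects.
vulnerable_functions = {
    "EXAMPLE-0001": "libalpha.validate",
    "EXAMPLE-0002": "libbeta.unsafe_copy",
}

reachable = reachable_functions(call_graph, ["app.main"])
actionable = sorted(
    advisory for advisory, function in vulnerable_functions.items()
    if function in reachable
)
print(actionable)
```

Only the advisory whose affected function sits on a path from `app.main` survives the filter; the other dependency is present but its vulnerable function is never called in this toy graph.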
[46:17] Viktor Petersson
But yeah, currently, at least, I haven't seen that tool, but currently it's very much a manual effort of like, oh, we actually are not using this function that's affected by the CVE, so therefore it's kind of moot.
[46:30] Viktor Petersson
Yeah, but the path of least resistance is you should just upgrade it and not have to worry about it.
[46:36] Viktor Petersson
Right?
[46:36] Liz Rice
Yeah.
[46:37] Liz Rice
And I guess over the last few years, you know, we've seen a lot of auto remediation.
[46:41] Liz Rice
You know, press this button and GitHub will update the dependencies for you, that kind of thing.
[46:47] Liz Rice
It's good.
[46:49] Viktor Petersson
Absolutely.
[46:50] Viktor Petersson
Absolutely.
[46:50] Viktor Petersson
No, I think that's, I do think it's a good thing that we have more security-oriented thinking at least.
[46:58] Viktor Petersson
And supply chain is actually making the rounds in the popular press even today.
[47:04] Viktor Petersson
Almost the last thing I wanted to chat a bit about is compliance because I know you guys obviously work a lot with regulated industries where you need to make sure that you are compliant.
[47:16] Viktor Petersson
How do you find the dichotomy, I guess, that we spoke about before between checking all the boxes versus actually solving security?
[47:27] Viktor Petersson
How have you found those when you're working with these real-life customers, in particular in financial services where they are heavily regulated and all that, how have you found working in that world?
[47:38] Viktor Petersson
I guess.
[47:39] Liz Rice
Yeah.
[47:41] Liz Rice
It is obviously a, you know, a front of mind challenge for people in these regulated industries.
[47:49] Liz Rice
How do I make my environment compliant?
[47:52] Liz Rice
Can I, you know, can I use Kubernetes?
[47:55] Liz Rice
You know, if I do this, will I still be compliant?
[47:58] Liz Rice
So on and so forth.
[48:00] Liz Rice
We didn't talk about this beforehand, but this is going to sound as though we talked about it and I wanted to plug it, because we at Isovalent and our friends at ControlPlane came together and we've recently written a white paper about how you use Cilium in a Kubernetes environment to meet a lot of these compliance requirements around things like trust, to do with configuring policies.
[48:27] Liz Rice
Yeah, it kind of maps different requirements to some of those like NIST framework table of requirements.
[48:41] Liz Rice
So there's some great work gone in there by my colleagues.
[48:46] Viktor Petersson
Yeah, Andy was actually the first guest on the show.
[48:48] Viktor Petersson
So full circle.
[48:50] Viktor Petersson
Yes, we talk about a lot of these things, but not necessarily around regulated industry.
[48:55] Viktor Petersson
But yeah, I find it interesting when you deal with these regulated industries, because we come across a fair bit of them as well.
[49:03] Viktor Petersson
And it's, yeah, it's old meets the new because they want to be bleeding edge, but they are held back by a lot of these old methodologies, I guess, in a sense, and from these frameworks.
[49:16] Viktor Petersson
But yeah, I'm looking forward to taking a look at the white paper.
[49:19] Viktor Petersson
I'll probably chuck that into the description of the episode as well, if people want to nerd out about that.
[49:25] Viktor Petersson
And I think we've covered a lot of good ground today.
[49:29] Viktor Petersson
Is there anything else you want to cover?
[49:31] Viktor Petersson
Anything you think we missed that's worthwhile for viewers to think about or be aware of or other than plugging your book, of course.
[49:39] Liz Rice
Oh, there we go.
[49:40] Liz Rice
Yeah.
[49:43] Liz Rice
I guess the other thing that we didn't talk about is Cisco.
[49:46] Viktor Petersson
Oh, yes.
[49:48] Liz Rice
At time of recording, Isovalent is, you know, undergoing its acquisition.
[49:53] Liz Rice
So it was announced before Christmas and hasn't closed yet, but hopefully very soon.
[49:59] Liz Rice
And I guess quite a lot of people have been asking about, you know, what.
[50:03] Liz Rice
What does the acquisition mean for, like, Cilium, and for Isovalent customers, and for Isovalent employees as well.
[50:12] Liz Rice
And so I'm really excited about it because basically the whole team is moving as one into Cisco, in the security business group.
[50:26] Liz Rice
And we get to carry on doing what we're doing.
[50:28] Liz Rice
So we're going to carry on supporting Cilium.
[50:32] Liz Rice
I know Cisco is super excited about the open source work that we do and really supportive about that.
[50:40] Liz Rice
Yeah, it's like we get to carry on being Isovalent, but we have the resources.
[50:51] Viktor Petersson
It sounds like a good strategic fit for Cisco.
[50:53] Viktor Petersson
And I guess that would mean quite a few more trips to Milpitas for you then, I guess.
[50:59] Liz Rice
Yeah, I guess I'll find out after close.
[51:01] Viktor Petersson
Yeah.
[51:03] Viktor Petersson
Excellent.
[51:03] Viktor Petersson
All right.
[51:04] Viktor Petersson
Thank you so much for coming on the show, Liz.
[51:07] Viktor Petersson
Very much appreciate it.
[51:08] Viktor Petersson
It's been a lot of fun, and I have learned a lot more new things about eBPF, and I'm very much happy for that.
[51:15] Viktor Petersson
So thank you so much for coming on the show, and have a good one.
[51:19] Viktor Petersson
Thanks.
[51:19] Liz Rice
Thank you very much for having me.
[51:21] Viktor Petersson
Cheers.
[51:21] Liz Rice
Bye.
