
Nerding out about Prometheus and Observability with Julius Volz

15 JAN 2024 • 1 hour 5 mins

In this episode, I’m joined by Julius Volz, co-founder of Prometheus and founder of PromLabs, to explore the fascinating world of systems monitoring and observability. Julius’s journey from working on Borgmon at Google to co-creating Prometheus offers unique insights into how modern monitoring systems evolved.

We start with the technical foundations of Prometheus. What particularly caught my attention was Julius’s explanation of their dimensional data model and how it revolutionized metrics-based monitoring. His breakdown of common pitfalls, especially around metric design and “cardinality bombs,” provides invaluable guidance for anyone implementing Prometheus.

The conversation gets especially interesting when we dive into long-term data storage challenges. Julius shares practical insights about solutions like Cortex and Thanos, demonstrating how to handle large datasets effectively. His live demonstration of PromQL, showing functions like rate, irate, and increase, reveals the powerful querying capabilities that make Prometheus stand out.

I was particularly intrigued by our discussion of future trends in observability. Julius’s thoughts on eBPF integration, OpenTelemetry, and the OpenMetrics project show how the monitoring landscape continues to evolve. We also explore the simplicity of writing Prometheus exporters, highlighting how accessible the technology can be even for those with minimal coding experience.

If you’re interested in systems monitoring, observability, or infrastructure management, you’ll find plenty of practical insights here. Julius brings both deep technical knowledge and hands-on experience to the discussion, making complex monitoring concepts accessible while maintaining their technical depth.

Transcript

[00:00] Viktor Petersson
Hello and welcome to this episode of Nerding Out With Viktor.
[00:03] Viktor Petersson
Today.
[00:03] Viktor Petersson
I got a very special guest with me today, Julius from Prometheus.
[00:09] Viktor Petersson
Maybe we should start with doing a quick intro to yourself, Julius, for people who are not familiar with who you are, and with what Prometheus is, big picture.
[00:20] Julius Volz
Yeah.
[00:20] Julius Volz
So I'm Julius, I live in Berlin.
[00:23] Julius Volz
I am the co-founder of the open source Prometheus monitoring system, but also the founder and the sole person behind the company PromLabs.
[00:33] Julius Volz
So Prometheus is the open source, open governance monitoring system that is being developed and used by many people, and there are many companies around it; PromLabs is just myself, and it's just one of those companies.
[00:48] Viktor Petersson
Got it.
[00:48] Viktor Petersson
Perfect.
[00:49] Viktor Petersson
And I guess for those not familiar with Prometheus.
[00:53] Viktor Petersson
Do you want to give kind of a sense of how widely used Prometheus is today and how it's being used in general?
[01:01] Julius Volz
Yeah, I mean Prometheus has pretty much become the de facto standard in metrics based systems monitoring in the open source world.
[01:12] Julius Volz
At least.
[01:13] Julius Volz
There's some closed and hosted competitors of course, but at least for the metrics based monitoring, it is pretty much the standard.
[01:20] Julius Volz
And yeah, I mean so you can find it really everywhere from small startups to really big banks, corporations, enterprises, even people run it at home to monitor their homes.
[01:33] Julius Volz
And yeah, you can use it in the classic data center use case to monitor your IT in a data center, but people also monitor hardware with it like sensors and chips and wind parks and all that kind of stuff.
[01:50] Viktor Petersson
It's super interesting.
[01:52] Viktor Petersson
Let's take a stroll down memory lane, because I'm really curious about the early days of Prometheus and want to dive into that. If I'm not mistaken, it started around 2012 or so back at SoundCloud.
[02:05] Viktor Petersson
Back in those days.
[02:06] Viktor Petersson
Could you speak a bit more about what happened, what led to the invention of Prometheus, and what kind of pain points it was solving?
[02:16] Julius Volz
Yeah, exactly.
[02:17] Julius Volz
So this was 11 years ago, 2012.
[02:20] Julius Volz
So by now all of this is not as exciting maybe anymore, but back then it was.
[02:26] Julius Volz
So my job previous to SoundCloud was at Google as a site reliability engineer in one of the services.
[02:34] Julius Volz
And all the site reliability engineers at Google used a tool called Borgmon to monitor their production services at Google, whether this was Google Search or Gmail or the service I was on, which was an internal backup service.
[02:48] Julius Volz
And this was a tool that was very similar to what Prometheus is now.
[02:53] Julius Volz
So the idea was to collect time series, store them with a dimensional data model, so a metric name and then a set of labels attached to them, so you can see in detail where something happened.
[03:07] Julius Volz
Which is great especially for dynamic cloud based systems which Google already had back then.
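To make the dimensional data model concrete, here is what one such labeled time series looks like in standard Prometheus notation (the metric name and label values below are illustrative):

```
# One time series exists per unique combination of metric name and label set:
http_requests_total{job="api-server", instance="10.0.0.1:8080", method="POST", path="/search", status="500"}
```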
[03:13] Julius Volz
Now the thing is, after Google I went to SoundCloud, and Matt Proud, another ex-Googler, also went to SoundCloud at roughly the same time.
[03:21] Julius Volz
And we were basically hired to try and make SoundCloud more stable and reliable and faster, as platform or system engineers.
[03:31] Julius Volz
And we really found that the monitoring system world, especially in the open source world outside of Google, was severely lacking at that time.
[03:41] Julius Volz
Either the systems could only do alerting, but didn't really have any idea of history like time series, right?
[03:48] Julius Volz
Nagios didn't really have much of a data model to speak of or the alerting conditions you could create in there were very simplistic things and check scripts and so on.
[04:00] Julius Volz
And then you had systems like Graphite or OpenTSDB, which either didn't have a dimensional data model and/or didn't have a proper query language to do math between whole sets of numbers that are correlated in some way, and so on.
[04:16] Julius Volz
And then also efficiency of storage, UIs, all those things were kind of really lacking.
[04:22] Julius Volz
And so eventually we got to the point where we said, well, let's at least try, in our free time, like on the weekends and after work at first, to build something that is more similar to what we were used to at Google.
[04:34] Julius Volz
And this became Prometheus.
[04:37] Julius Volz
We threw it up on GitHub under an Apache license from day zero, but we hadn't really told many people about it yet.
[04:46] Julius Volz
Then from there on, maybe two months in, we had the roughest of prototypes that could collect some data, store it and show it.
[04:55] Julius Volz
But the first prototype is the easy part, and then comes the 99% of hard work.
[05:03] Julius Volz
So after that it was years of convincing people that it made sense to build our own monitoring system, then actually building it until it was stable and scalable enough and wouldn't just OOM all the time, and all those things.
[05:18] Julius Volz
And then also to have, at SoundCloud internally, enough of a killer use case for it. The context that made Prometheus so important at SoundCloud specifically is that SoundCloud had very early on built a cluster scheduler like Kubernetes, much simpler of course, before Docker, before almost anyone else in the world was using containers. They built this using Go version 0.9-something and raw LXC containers, and the hundreds of microservices running on that cluster would just be rescheduled on different hosts and different ports every time a new revision was rolled out.
[06:04] Julius Volz
And so that's what made it extra hard to monitor it.
[06:07] Julius Volz
With existing monitoring systems like Graphite for time series or Nagios for alerting, when there was a latency spike it was almost impossible to find out whether it was the entire service that was just getting slower or whether one specific instance was contributing to it, because we didn't have enough scalability to track per-instance, process-level stats and very short-lived time series and all that.
[06:33] Julius Volz
So yeah, the killer use case really started coming or I guess the first one of those was instrumenting this internal cluster scheduler system.
[06:43] Julius Volz
So every developer immediately now could see without doing anything, instance level stats of memory usage, CPU usage and so on.
[06:52] Julius Volz
Like all these per process, per container stats that are completely commonplace these days.
[06:58] Julius Volz
And they could key it by process, by revision, by different labels that they cared about, to track whether something was related to a revision or a specific host, et cetera.
[07:10] Julius Volz
Right, that was one thing that really helped it internally.
[07:15] Julius Volz
And then we needed a dashboard builder.
[07:18] Julius Volz
So I had built PromDash back then, before Grafana was out.
[07:22] Julius Volz
So Grafana didn't really exist yet.
[07:24] Julius Volz
PromDash doesn't exist anymore.
[07:26] Julius Volz
So by now we're also all just using Grafana.
[07:28] Julius Volz
There's some new competitors on the horizon now.
[07:32] Julius Volz
And yeah, I mean at some point we reached an internal threshold of where people started, you know, adopting this more and more, finding it so useful that eventually there was an edict saying no new service without Prometheus Metrics.
[07:47] Julius Volz
And over the years, yeah, we already kind of told some other ex-Googlers: hey, we are writing a system similar to Borgmon.
[07:59] Julius Volz
Maybe you want to use it at your next company as well.
[08:03] Julius Volz
So we attracted like a handful of external users already.
[08:06] Julius Volz
But we only really fully published this in the beginning of 2015, with a blog post from SoundCloud and another early user company.
[08:16] Julius Volz
And that's when things really exploded on Hacker News and everywhere else.
[08:20] Julius Volz
And you know, shortly after, we joined the CNCF, and now the project is completely neutrally hosted in a foundation that belongs to the CNCF.
[08:30] Julius Volz
So, for people who don't know, that is the Cloud Native Computing Foundation, which in turn belongs to the Linux Foundation.
[08:38] Julius Volz
And they were just setting that up originally, kind of Google together with the LF.
[08:42] Julius Volz
They set it up to house Kubernetes neutrally.
[08:45] Julius Volz
And we were the second project joining that.
[08:48] Julius Volz
Yeah, so I guess that's the beginnings of it all.
[08:51] Julius Volz
And since then everything has grown a lot and evolved a lot.
[08:56] Viktor Petersson
Yeah, that's.
[08:57] Viktor Petersson
I mean, I think there are.
[08:58] Viktor Petersson
The interesting thing here is that the timing really played in its favor, because I think it was just around the time when the whole ephemeral workload thing really started taking off. And obviously what you guys were doing predated Kubernetes, which is kind of the default runtime these days.
[09:15] Viktor Petersson
But back then, traditional monitoring tools, they were really not built for ephemeral workloads.
[09:21] Viktor Petersson
Right.
[09:21] Viktor Petersson
I think that's.
[09:22] Viktor Petersson
That was.
[09:22] Viktor Petersson
Yeah, it sounds like that was the big killer feature that really gained adoption.
[09:27] Viktor Petersson
Right.
[09:27] Julius Volz
Yeah, especially maybe not ephemeral in the sense of seconds because Prometheus is not the greatest at that.
[09:33] Julius Volz
Because you at least need something that can then keep the state of whatever just ran. But it's great for short-lived time series, or ephemeral workloads in the sense of things that maybe just run for a day and then get rescheduled with a new revision or on a new, you know, process or something.
[09:51] Julius Volz
So not just tracking one fixed set of identified time series over time, but really, you know, having a lot of churn in the identities of your time series over the day as developers roll out new revisions and scale down or up apps and so on.
[10:07] Viktor Petersson
Yeah.
[10:09] Viktor Petersson
So more ephemeral worker pools rather than ephemeral in the sense of like serverless, I guess.
[10:15] Julius Volz
Yeah, yeah.
[10:16] Julius Volz
You can also make that work, but you need some extra bits for that.
[10:19] Viktor Petersson
Yeah, yeah.
[10:22] Viktor Petersson
So one thing I'm really curious about: obviously you have been building and working with Prometheus now.
[10:28] Viktor Petersson
Well, since day zero by definition.
[10:31] Viktor Petersson
But what I'm really curious about is what is a common mistake you see in metrics?
[10:35] Viktor Petersson
I think I've seen some blog posts you've written about this.
[10:41] Viktor Petersson
There is some Prometheus consultancy out there that has written quite a few blog posts.
[10:45] Viktor Petersson
But I'm curious about your vantage point of that.
[10:47] Viktor Petersson
Like, what really are the most common mistakes people make when they're implementing Prometheus?
[10:54] Viktor Petersson
I guess in general.
[10:57] Julius Volz
Yeah.
[10:58] Julius Volz
And one of the companies writing blog posts about exactly this is myself.
[11:02] Julius Volz
So that is PromLabs.
[11:03] Julius Volz
Also, if you go to promlabs.com and you go to the blog section, there's an article called "Avoid these 6 mistakes when getting started with Prometheus", and the very clickbaity title leads you to the blog post mentioned.
[11:17] Viktor Petersson
I read the post some time ago.
[11:19] Julius Volz
Yeah, some of the common things that stump newbies and these are very common.
[11:25] Julius Volz
I think really the most common one is cardinality bombs.
[11:28] Julius Volz
This is just when you discover the concept of dimensionality and labels for the first time. Especially back then, when it was new, people were at first a little bit skeptical: what are these labels?
[11:42] Julius Volz
But then they said, oh, this is really useful.
[11:45] Julius Volz
Let's put everything into a label.
[11:46] Julius Volz
Like let's also split up all our time series by the exact user ID or an email or something of which there are an unbounded huge number of possible values.
[11:58] Julius Volz
And the main thing in Prometheus is that every set of labels generates one time series automatically.
[12:09] Julius Volz
The unique label set identifies a time series together with a metric name.
[12:13] Julius Volz
And so if you have one label in there that has a million different values, not only do you generate a million time series, but a million times whatever the number of possible values is on the other labels of that same metric.
[12:27] Julius Volz
So it all multiplies together, the cardinality.
[12:30] Julius Volz
And a big Prometheus server will maybe comfortably handle 10 or so million time series.
[12:39] Julius Volz
It really depends on how many resources you give it.
[12:42] Julius Volz
So several million time series are kind of normal for a big Prometheus server, maybe even more.
[12:48] Julius Volz
But that is your total budget.
[12:50] Julius Volz
And you want to kind of design your metrics in such a way that you stay under that total budget.
[12:57] Julius Volz
And how you exactly do that is kind of up to you.
[12:59] Julius Volz
Whether you have fewer processes to monitor, because the process is also a label dimension, or whether within a process you split up a metric by fewer labels.
[13:11] Julius Volz
Or maybe there's ways to summarize the potential different values within one label into groups so you don't have to.
[13:21] Julius Volz
Instead of tracking a million different status codes, maybe you can group them into error, success, or something else.
[13:27] Julius Volz
Or something like this.
[13:28] Julius Volz
Right?
[13:29] Julius Volz
A very common thing where this happens is also when people want to track HTTP request statistics and they put an entire path in there, like, hey, /posts/<user-ID>/<post-ID> or whatever.
[13:44] Julius Volz
And these IDs, right?
[13:45] Julius Volz
They can get very high cardinality.
[13:48] Julius Volz
And there you could also say, hey, before I track this, I will actually replace these high cardinality bits in my path with something that is just a placeholder.
[13:57] Julius Volz
So now I'm just generating one label value for that particular pattern of labels.
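As a rough illustration of how label cardinality multiplies, and of one common diagnostic pattern for finding offenders (the numbers and metric names below are hypothetical):

```
# Labels on one metric: method (3 values) * path (50) * status (10) * instance (100)
#   => up to 3 * 50 * 10 * 100 = 150,000 series: manageable.
# Add a user_id label with 10,000 values:
#   => up to 150,000 * 10,000 = 1.5 billion series: a cardinality bomb.

# A query pattern often used to see which metric names contribute the most series:
topk(10, count by (__name__) ({__name__=~".+"}))
```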
[14:06] Julius Volz
So that's one.
[14:08] Julius Volz
I'm just looking here for others.
[14:12] Julius Volz
The others are even a little bit more technical, I guess, that are in my blog post.
[14:16] Julius Volz
So I won't go into them too deeply, but I think this whole metrics design question is one of the biggest ones that everyone gets hit with initially.
[14:28] Viktor Petersson
Yeah, I can definitely relate to that.
[14:29] Viktor Petersson
When we first started adopting Prometheus, we tagged and over tagged everything and it became, like you correctly pointed out, kind of unusable after a while.
[14:40] Viktor Petersson
So less is more, I guess in some ways and start expanding from there.
[14:45] Viktor Petersson
Speaking of metrics in general, so those are some common mistakes with Prometheus.
[14:51] Viktor Petersson
If you put your SRE hat back on for a moment, what do you consider kind of the blueprint or the best practice for server monitoring in general?
[15:03] Viktor Petersson
Looking away from the application stack: more the metrics that you care about when you do observability, and how you go about doing that and setting up those dashboards.
[15:14] Viktor Petersson
So how do you think about that?
[15:16] Viktor Petersson
If you think from that angle.
[15:18] Julius Volz
You mean like monitoring servers?
[15:22] Viktor Petersson
So assuming we are.
[15:24] Viktor Petersson
Yes.
[15:25] Viktor Petersson
Like the health of the server in general.
[15:26] Viktor Petersson
Right.
[15:27] Viktor Petersson
And so what are the metrics that you care about in terms of monitoring a server in general really?
[15:34] Viktor Petersson
If you're thinking about best practices for somebody who hasn't had proper monitoring in place, perhaps, and is setting that up from day zero, I presume that would involve the node exporter.
[15:46] Viktor Petersson
Perhaps.
[15:47] Viktor Petersson
But I'm just curious what you think about that in general, how you see monitoring in general.
[15:55] Julius Volz
Right.
[15:55] Julius Volz
So yeah, when it comes to like monitoring hosts.
[15:58] Julius Volz
Right, that's what you mean.
[15:59] Viktor Petersson
Yes.
[16:01] Julius Volz
There's often not that much you actually want to alert on.
[16:04] Julius Volz
So it depends what your environment is, of course, and what your philosophy is.
[16:09] Julius Volz
But often for hosts, at least, there's the node exporter.
[16:13] Julius Volz
The node exporter is one of those agents that the Prometheus project officially offers; it exposes all kinds of metrics about the host it is running on to the Prometheus monitoring system.
[16:26] Julius Volz
So you can collect CPU and memory usage and network usage and everything you would expect these days. If you're running it on Linux, it gets them from the /proc filesystem, from the /sys filesystem, all these different kernel interfaces, and then just bridges them over to Prometheus metrics.
[16:42] Julius Volz
And so the first step is to just run the node exporter, which is really easy.
[16:46] Julius Volz
It's easy to deploy on every host you have and then just monitor it so you have the metrics.
[16:52] Julius Volz
You can configure it in different ways, for example to exclude some pseudo-filesystems or Docker container volumes and stuff like that.
[16:59] Julius Volz
A lot of those exclusion flags are set pretty reasonably by default, so often you can just start it without any flags and it'll work fine.
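A minimal sketch of what scraping it might look like in prometheus.yml, assuming the node exporter is listening on its default port 9100 (the host names are hypothetical):

```yaml
scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["host1.example.com:9100", "host2.example.com:9100"]
```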
[17:09] Julius Volz
And yeah, then there's just like a few things you might want to alert on.
[17:12] Julius Volz
Like, you know, back then it was pretty common to alert on stuff like load average being high or certain resource usage being high.
[17:22] Julius Volz
But the problem is that sometimes that doesn't really point to a real problem, and it might actually generate overly noisy alerts about conditions that don't immediately affect the service in a bad way.
[17:34] Julius Volz
And so alerting I would usually approach more from the SLA perspective: what does the user of whatever service you offer expect, what kind of agreement do you have with them, or what service level objective (SLO) did you set for yourself?
[17:54] Julius Volz
Like: this is the latency we want to have; the 90th percentile latency should always be below 100 milliseconds.
[18:04] Julius Volz
We should have 99.9% uptime over a given month, and so on.
[18:10] Julius Volz
And then have alerts based more on these user-visible metrics rather than on all possible underlying causal metrics.
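A hedged sketch of what such an SLO-style latency condition could look like in PromQL, assuming a histogram metric named http_request_duration_seconds (the metric name and threshold are illustrative):

```
# Fire when the 90th percentile request latency, averaged over 5m, exceeds 100ms:
histogram_quantile(0.9, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 0.1
```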
[18:19] Julius Volz
You still want to collect all those underlying causal host metrics to be able to figure out when you do get alerted what might be the reason, what might be the underlying cause, are there any weird things in your graphs and so on.
[18:31] Julius Volz
But host-wise there's not a lot you would alert directly on.
[18:36] Julius Volz
But of course there's some stuff like disk usage: at least if it's completely full, or if you can predict that it will be. Maybe it's already at 80% full and filling up quite fast, and you can use a linear prediction in the Prometheus query language, PromQL, to tell that it will be full in one day.
[18:55] Julius Volz
If it develops further like this, then it's still a good idea to alert on these imminent dangers to your actual service.
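The linear prediction Julius mentions is typically expressed with PromQL's predict_linear function; a minimal sketch using the standard node exporter filesystem metric:

```
# Fire if, extrapolating the last 6h of growth, the filesystem will be full within 24h:
predict_linear(node_filesystem_avail_bytes{fstype!~"tmpfs"}[6h], 24 * 3600) < 0
```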
[19:05] Julius Volz
But other than that I would be very conservative with actual host level alerts.
[19:09] Julius Volz
And again it depends a lot on your environment of course, because also some environments will be completely happy if one host completely dies, whereas others will just, you know, completely have an issue if you don't have that redundancy.
[19:26] Julius Volz
Right.
[19:27] Viktor Petersson
Yeah, I mean, I like the vantage point of SLOs or SLAs: what is customer-impacting, really.
[19:33] Viktor Petersson
I think that's a good view of what you should alert on.
[19:39] Viktor Petersson
I think that's a good guiding principle.
[19:42] Viktor Petersson
You kind of alluded to alerting, but do you want to speak a bit about the Alertmanager as well?
[19:46] Viktor Petersson
Because that's obviously an important cornerstone of Prometheus at large.
[19:51] Julius Volz
Yeah, so the way Prometheus alerting works in general is that first Prometheus collects all these dimensional metrics, and then for anything useful you want to do with the collected data, you will probably use the Prometheus query language, PromQL.
[20:07] Julius Volz
So this is like a dimensional, very useful functional query language for processing your time series data and generating some answers based on the selected data, whether you want to aggregate or select or do mathematical correlations, and so on.
[20:23] Julius Volz
And you can use this query language both for dashboarding, ad hoc debugging, and automation, but also for alerting.
[20:33] Julius Volz
So alerting uses the same PromQL query language.
[20:36] Julius Volz
So this is not the Alertmanager yet; that still happens in Prometheus.
[20:40] Julius Volz
The alerting rules are actually configured as part of the Prometheus server, and the Prometheus server regularly evaluates the PromQL expression contained in an alerting rule.
[20:53] Julius Volz
And if there is any output, any time series being returned from this PromQL expression in the rule, those outputs will become alerts, subject to some other thresholds that are configured in the rule, and so on.
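A minimal sketch of what such an alerting rule looks like in a Prometheus rule file (the expression, threshold, and labels are illustrative):

```yaml
groups:
  - name: example-alerts
    rules:
      - alert: HighErrorRate
        # Every time series this expression returns becomes a pending, then firing, alert.
        expr: sum by (path) (rate(http_requests_total{status="500"}[5m])) > 0.05
        # Only fire if the condition holds continuously for 10 minutes.
        for: 10m
        labels:
          severity: page
```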
[21:07] Julius Volz
And then once they do become alerts, they get sent to this separate server component that is called the Alert Manager.
[21:13] Julius Volz
And so the Alertmanager just receives fully baked, hey-here's-a-problem style alerts from various Prometheus servers in your infrastructure, and then it routes them based on their labels.
[21:29] Julius Volz
So basically the Alertmanager config is like a big routing tree, and you end up in some node of that routing tree based on what labels your alert had.
[21:38] Julius Volz
And you might route on stuff like the team or the service or the severity level of the alert, right?
[21:44] Julius Volz
And then in the routing node that you actually reach, you can configure all kinds of things: do I want to send this to Slack or to PagerDuty, for example, and to which team's Slack channel, and so on.
[21:59] Julius Volz
You can say, hey, throttle this in a certain way over time, group this by certain labels.
[22:10] Julius Volz
So, for example, hey, don't send me one alert for every host that is down, but send me one alert that contains all the hosts that are down.
[22:19] Julius Volz
So you can kind of bake many notifications into one with more detail so you don't get flooded.
[22:26] Julius Volz
And yeah, there are a bunch more settings possible in the Alertmanager, but basically this is the component that all your Prometheus servers send their raw alerts to, and it then routes and throttles and aggregates and finally dispatches your final alert notifications to a human.
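A minimal sketch of such a routing tree in an Alertmanager config (receiver names, channels, and matchers are illustrative; the Slack and PagerDuty receivers would also need credentials not shown here):

```yaml
route:
  receiver: default
  # Bake alerts that share these labels into one grouped notification.
  group_by: ["alertname", "cluster"]
  routes:
    - matchers:
        - severity="page"
      receiver: oncall
receivers:
  - name: default
    slack_configs:
      - channel: "#alerts"
  - name: oncall
    pagerduty_configs:
      - routing_key: "placeholder-key"
```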
[22:44] Viktor Petersson
Okay, that's perfect.
[22:46] Viktor Petersson
One thing that I think people who start using Prometheus eventually run into is long term storage, because obviously Prometheus is not designed for long term storage. It's designed for short term storage, where I think its default is 14 days, if I'm not mistaken, and then you can set it higher.
[23:06] Viktor Petersson
But I think it would probably top out at about a month or so on most hosts before it becomes useless.
[23:13] Viktor Petersson
Kind of. But what I mean is, what's your view on that?
[23:17] Viktor Petersson
There has been quite a lot of development here; Cortex is one of them, and quite a few others.
[23:22] Viktor Petersson
How do you see that evolving and what are you thinking around that?
[23:27] Viktor Petersson
How do you recommend people look at it?
[23:29] Julius Volz
Good point.
[23:30] Julius Volz
So Prometheus comes with its own built-in TSDB, a time series database storing files on local disk.
[23:39] Julius Volz
And I would have to say for many use cases that are not huge, this actually works pretty well even as a long term storage.
[23:45] Julius Volz
So there are people like me who have years of data in a single Prometheus database.
[23:51] Julius Volz
But of course if you do collect a lot of data about millions of different time series every couple of seconds, eventually you will hit the scalability bottleneck of one disk and this local TSDB implementation.
[24:06] Julius Volz
And depending on your requirements, you may also hit robustness or reliability issues when that one disk dies or you lose the data or something. There are ways to back it up, so you can still make it work.
[24:19] Julius Volz
So first of all I wanted to say that you actually can use the normal Prometheus TSDB quite well for certain long term storage scenarios, but eventually of course you will outgrow what a single node can do.
[24:33] Julius Volz
And there are different ways of getting around that.
[24:37] Julius Volz
So one is: hey, let's configure the Prometheus server to send all the data it collects, or a subset of it, to some remote storage system.
[24:48] Julius Volz
So it doesn't only store it locally, but forwards it to some remote endpoint.
[24:54] Julius Volz
If you want to, you can even turn off the local storage completely so it only forwards.
[24:59] Julius Volz
And then that remote endpoint needs to implement a protocol that we call remote write.
[25:05] Julius Volz
This is a protocol that the Prometheus monitoring system standardized.
[25:11] Julius Volz
And by now there are so many different integrations for that protocol.
[25:16] Julius Volz
Like, all the major cloud providers accept it.
[25:20] Julius Volz
If you go to, you know, Amazon, Google and all these, you can usually send Prometheus remote write there, or also to Chronosphere, Grafana Cloud, all these players. They have their own proprietary databases for how they scale and store that data in different ways, give you extra features on top, and give you a global view over all the data from your different Prometheus servers.
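Configuring that forwarding is a small block in prometheus.yml; a minimal sketch (the endpoint URL is hypothetical):

```yaml
remote_write:
  - url: "https://metrics.example.com/api/v1/write"
    # Optionally forward only a subset of the collected series:
    write_relabel_configs:
      - source_labels: [__name__]
        regex: "node_.*"
        action: keep
```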
[25:45] Julius Volz
So this is one option, and there are also open source alternatives to this. One is Cortex.
[25:53] Julius Volz
And yeah, this also came from a long overlap with Prometheus maintainers.
[26:00] Julius Volz
I was one of the original Cortex creators as well, along with Tom Wilkie.
[26:05] Julius Volz
And this is more of a clustered system that you can run locally, but it is quite complex.
[26:10] Julius Volz
So it's mainly for people who want to offer Prometheus data storage as a service in their big organization, as a central point where all the different teams can send their Prometheus data, and then it will offer you a PromQL-compatible API that you can use with Grafana and other tools, even Alertmanager, and so on.
[26:32] Julius Volz
So that is really a horizontally scalable multi tenant tool.
[26:37] Julius Volz
It is a bit more complex to deploy.
[26:40] Julius Volz
So most people who do want to have some kind of long term storage on premise tend to use a thing that works a little bit differently called Thanos.
[26:50] Julius Volz
And Thanos works in this way, at least the way it was originally envisioned.
[26:56] Julius Volz
Now it supports different modes as well.
[26:58] Julius Volz
is that you still run all your different Prometheus servers in your infrastructure, monitoring different services and regions and data centers and so on.
[27:09] Julius Volz
But now you want to add both a global view over them and long term storage, and then also some HA deduplication and so on.
[27:18] Julius Volz
This is what Thanos can do.
[27:20] Julius Volz
So you still have your Prometheus servers, but now you run little sidecars next to each of them, and they all integrate together with the Thanos Querier into this kind of mesh where you can query over all of them at once in a federated way. You give it a single PromQL query, and the querier knows which Prometheus servers it actually needs to hit, based on some labels in your query.
[27:43] Julius Volz
So it doesn't need to always go to all of them.
[27:46] Julius Volz
And these sidecars running next to each Prometheus server can also ship all the finished TSDB blocks.
[27:55] Julius Volz
So everything that is a bit older than like two or three hours goes to an object storage, like S3 or MinIO or GCS and others.
[28:05] Julius Volz
And then the Querier, the Thanos Querier can integrate that long term older data as well.
[28:12] Julius Volz
And so there it's really safely backed up, right? In any of those object storages you can easily replicate it and all that.
[28:20] Julius Volz
So yeah, you get pretty cheap, very long term storage, including some downsampling features and so on.
[28:28] Julius Volz
So most people who want to scale Prometheus internally, they just use Thanos.
[28:33] Julius Volz
And then others who really want to invest in a dedicated team running such a cluster for other teams would use Cortex.
[28:41] Julius Volz
There's also Mimir, which is a Grafana fork of Cortex now.
[28:45] Julius Volz
And then others again, they would choose some cloud services to do that.
[28:51] Viktor Petersson
Yeah, no, we've been using Thanos at Screenly for a while, and we use it both for metrics on our cluster infrastructure, but also on actual devices.
[29:01] Viktor Petersson
So pushing metrics from actual devices.
[29:03] Viktor Petersson
But one of the things I find great, this kind of beautiful simplicity of Prometheus, is really the output.
[29:12] Viktor Petersson
Because if you look at what a /metrics feed looks like, it is so simple, and that's kind of what's beautiful about it in many ways.
[29:21] Viktor Petersson
Because writing a Prometheus exporter is very trivial.
[29:26] Viktor Petersson
I mean, you can literally write one in Bash versus many other monitoring frameworks that I worked with in the past.
[29:32] Viktor Petersson
And I guess that, combined with the modularity of remote write, which is almost recursive, because you use the same mechanism to push to it as you do to push further out in the federation.
[29:46] Viktor Petersson
I find that a beautiful implementation.
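For readers who haven't seen one, this is roughly what such a /metrics feed looks like in the Prometheus text exposition format (the metric names and values are illustrative):

```
# HELP app_requests_total Total number of HTTP requests handled.
# TYPE app_requests_total counter
app_requests_total{method="get",status="200"} 1027
app_requests_total{method="post",status="500"} 3
# HELP app_temperature_celsius Current temperature reading.
# TYPE app_temperature_celsius gauge
app_temperature_celsius 21.4
```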
[29:51] Viktor Petersson
So, so far, what is the most interesting use case you've seen of Prometheus, or the most outside-the-box use case of Prometheus?
[29:59] Viktor Petersson
You spoke a bit about monitoring your home, but I'm curious, like what's the most crazy use case you've seen so far?
[30:07] Julius Volz
Yeah, the ones that I always find the coolest or most interesting are when I hear someone give a talk somewhere, or tell me about physical use cases.
[30:17] Julius Volz
So maybe I'll just mention like three or four different examples.
[30:21] Julius Volz
One was at PromCon.
[30:23] Julius Volz
So we also have a Prometheus conference, PromCon.
[30:26] Julius Volz
It was a talk about someone monitoring a big wind power park with Prometheus.
[30:31] Julius Volz
So he could even have the rotational angle of all the different, what do you call them, windmills or whatever they are.
[30:39] Julius Volz
Yeah, as a metric.
[30:41] Julius Volz
So like the angle as a metric, the wattage as a metric and everything.
[30:46] Julius Volz
So that was cool.
[30:47] Julius Volz
I heard from a big container shipping company, who shall remain unnamed, that they have Prometheus running on all of their vessels, and it reports home telemetry about the ship, about where it is and what it's doing.
[31:02] Julius Volz
That I found really cool.
[31:06] Julius Volz
One guy from Accenture once gave a meetup talk about the train system in Germany, the Deutsche Bahn, where you have all these displays, signage systems telling you when the next train is coming and which one it is.
[31:20] Julius Volz
There were almost 100,000 or so of those in Germany in the different train platforms.
[31:26] Julius Volz
And they had basically installed the node exporter or something akin to it in each of these.
[31:31] Julius Volz
And they were monitoring them remotely using Prometheus.
[31:35] Julius Volz
So that's kind of cool because then you're on a train platform somewhere and you're like, oh cool.
[31:39] Julius Volz
That's basically where my software is running, and you can show that.
[31:44] Julius Volz
So I really like those because, you know, all the data center use cases, they're cool.
[31:49] Julius Volz
Of course that's what Prometheus was made for.
[31:52] Julius Volz
But they're kind of the usual thing.
[31:55] Julius Volz
Maybe one tiny use case that someone also reported is they had mold in their bathroom at home and now basically they wanted to monitor the conditions to not get mold again.
[32:09] Julius Volz
So they just installed Prometheus on a little Raspberry Pi or something at home and monitored temperature, humidity, and air pressure, I think.
[32:18] Julius Volz
And then you can calculate in PromQL whether the dew point is reached, and send an alert in case that happens.
[32:24] Julius Volz
And then you would open the windows or something.
[32:26] Julius Volz
Right.
[32:27] Julius Volz
And so it really scales down to those really tiny use cases as well.
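A hedged sketch of what such a dew point condition could look like in PromQL, using the simple rule-of-thumb approximation that the dew point is about temperature minus (100 minus relative humidity) divided by 5, reasonably accurate above roughly 50% humidity; the sensor metric name is hypothetical:

```
# The gap between room temperature and the approximate dew point is (100 - RH) / 5,
# so alert when that gap drops below 2 degrees Celsius (condensation risk):
(100 - sensor_humidity_percent) / 5 < 2
```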
[32:34] Viktor Petersson
That's really cool.
[32:36] Viktor Petersson
So we covered quite a lot about PromQL, but maybe it warrants a bit of a deeper intro to what PromQL is as it's such an important building block in Prometheus.
[32:47] Viktor Petersson
And I'm not sure, do you want to speak about the background of it, like the history of it?
[32:55] Viktor Petersson
My understanding is it really derived from kind of reverse-engineering something that Google had, in a sense.
[33:00] Viktor Petersson
But maybe you want to speak a bit about the design and implementation, and perhaps do a quick demo of how it works, if that's...
[33:08] Julius Volz
Yeah, could do that.
[33:09] Julius Volz
So PromQL.
[33:10] Julius Volz
Yeah, it's the one unifying query language in the Prometheus ecosystem that you know, allows you to do dashboarding, alerting, debugging, other use cases as well.
[33:20] Julius Volz
Automation for example.
[33:23] Julius Volz
And it is a non-SQL-style query language where you really just write a functional expression: some function that might take other parameters that are, in turn,
[33:36] Julius Volz
again an expression that could take other parameters.
[33:38] Julius Volz
So it could form like an arbitrarily deep tree.
[33:41] Julius Volz
But eventually if you evaluate that tree, you'll get some kind of result so you can select some data.
[33:48] Julius Volz
Then you might.
[33:48] Julius Volz
If it's counters, then you could take the rate of increase around it and then maybe you would want to sum up all the individual rates but keep certain dimensions in the sum.
[33:59] Julius Volz
So we still group by the path but get rid of all the other, the instance-level, detail.
[34:04] Julius Volz
For example, you might have a whole set of error rates divided by a whole set of total rates, correlated on identical label sets, so you can get a whole list of error rate ratios out and then alert on that.
[34:22] Julius Volz
You can filter.
[34:24] Julius Volz
You have a bunch of different functions basically to compute different things.
[34:28] Julius Volz
Very geared around systems monitoring.
[34:31] Julius Volz
So not like deep machine learning statistics or so, but really quick functions that behave pretty simply to give you right now answers about your infrastructure and whether you should alert someone, for example.
[34:45] Julius Volz
And yeah, I mean, if you like, I could try sharing my screen and just showing you what the language looks like.
[34:53] Viktor Petersson
Let's do it.
[34:54] Julius Volz
Let's try that.
[34:55] Julius Volz
Okay, so I have this.
[34:59] Julius Volz
Let's see.
[35:00] Julius Volz
I have.
[35:02] Julius Volz
Should I share?
[35:03] Julius Volz
Yeah, I'll share the entire screen here.
[35:06] Julius Volz
Of course it only shows me.
[35:08] Julius Volz
Okay, I will do the entire screen because otherwise it does not show me the windows that I have on a different virtual desktop.
[35:15] Julius Volz
So let me know once you can see stuff.
[35:23] Viktor Petersson
I can see it.
[35:23] Julius Volz
I'll just maybe zoom in a little bit more even here.
[35:27] Julius Volz
So this is PromLens, one of the possible interfaces for dealing with PromQL.
[35:33] Julius Volz
Basically what you have here is you have an expression input box and you can, you know, first of all you can select some data, of course.
[35:41] Julius Volz
Now, okay, let me reload that.
[35:49] Julius Volz
So for example, some HTTP level counter metrics counting for three different demo service instances, which process they come from, which group of processes.
[36:01] Julius Volz
This is the job.
[36:03] Julius Volz
The method for which an HTTP request was handled, the path on which it happened, the status code that was the result of the request.
[36:12] Julius Volz
So more for the response status code, I guess.
[36:15] Julius Volz
And these are raw counter values.
[36:18] Julius Volz
So if you just graph them like this, they don't look too useful because these only go up basically starting from whenever this demo service was started.
[36:28] Julius Volz
So they start at zero and then they just go up.
[36:31] Julius Volz
So what you want to know is you want to see like how fast do they go up.
[36:35] Julius Volz
And I could add a rate function around this here.
[36:38] Julius Volz
Not going too much in detail now about how this works in detail.
[36:43] Julius Volz
But then you will see at every point in the graph averaged over a 5 minutes window.
[36:49] Julius Volz
What's the actual per second request rate for the different dimensional combinations Here you can see if I hover over this.
[36:57] Julius Volz
This is method get path API bar status 200 and you get that for the different other ones as well.
[37:04] Julius Volz
If we wanted to, we could now sum this and only keep the path dimension.
[37:09] Julius Volz
For example now we would have it only for every path instead of all the full details.
[37:14] Julius Volz
This is the summed up rates, aggregated rates.
[37:18] Julius Volz
I could still break out the method if I wanted to.
[37:23] Julius Volz
And since I'm using PromLens, it's showing me my textual query as a kind of laid-out tree here as well, where I can see how many results I have in each of the subnodes of my query.
[37:35] Julius Volz
Right?
[37:35] Julius Volz
This is like a full, could be a first class expression by itself.
[37:41] Julius Volz
This demo_api_request_duration_seconds_count metric name, and then this one, the rate call, also produces 27 results.
[37:50] Julius Volz
But those 27 results get shrunken down by the aggregation to only 5 results because we only have 5 path and method combinations that we end up with here, down here.
[38:02] Julius Volz
Now one further thing we could do is we could say, hey, let's only select the ones that ended with a 500 so errors, right?
[38:13] Julius Volz
This would be like error rates by path and method combination.
[38:19] Julius Volz
And now we could correlate those error rates for each path and method, which we can also see here as a table as current values.
[38:29] Julius Volz
We could correlate those to the total rates we get for the same dimensional combinations of method and path.
[38:36] Julius Volz
So now I could say, hey, divide this entire thing by basically the same thing, except below the division I'm not going to select only the errors, I'm going to select all the requests, to arrive at a total sum.
[38:53] Julius Volz
And I can indent this a little bit better here.
[38:56] Julius Volz
And yeah, now what happens is that this binary operator automatically does a join on identical label sets between the first and second operand.
[39:05] Julius Volz
We can also show that here how the matching is happening.
[39:08] Julius Volz
We see that some will not actually find matching label combinations that have 500 errors.
[39:15] Julius Volz
So those will not produce an output.
[39:19] Julius Volz
And yeah, by default that's what happens.
[39:21] Julius Volz
You can customize these binary operations to match on a subset of labels, to allow many-to-one or one-to-many matching.
[39:27] Julius Volz
I won't do that now.
[39:30] Julius Volz
So what we're getting now is basically ratios of how many errors there are.
[39:35] Julius Volz
I could multiply those by 100 to get percentages.
[39:40] Julius Volz
So we have 0.3% of errors in this first combination, for example, 0.7 in the second.
[39:48] Julius Volz
And now I could extend this to some kind of alerting rule where I say, hey, give me all the ones that are larger than 0.5%.
[39:57] Julius Volz
And so I just filtered this entire list of output time series down to the ones that have a sample value of larger than 0.5%.
[40:06] Julius Volz
And now I would get exactly the problematic label combination.
[40:11] Julius Volz
This is just one example of what you can do with PromQL.
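Pieced together, the query built up in this demo looks roughly like the following; the metric name is the one from Julius's demo service, but treat the exact expression as a reconstruction:

```
# Percentage of 500-status requests per path/method, filtered to ratios above 0.5%:
100 * (
    sum by (path, method) (rate(demo_api_request_duration_seconds_count{status="500"}[5m]))
  /
    sum by (path, method) (rate(demo_api_request_duration_seconds_count[5m]))
) > 0.5
```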
[40:13] Julius Volz
There's histogram stuff in there, there's other stuff, but it's really just working with this label-based data model. And yeah, if you want to learn more about PromQL and all the different constructs you can write, I also created a cheat sheet at promlabs.com, the PromQL cheat sheet.
[40:32] Julius Volz
You can find it also here under the resources.
[40:36] Julius Volz
And here I try to just list all the major patterns that people usually write when doing PromQL.
[40:44] Julius Volz
So this won't be everything, but you can open each of these patterns in PromLens by just clicking a button here, and then you have some demo data to actually work with, and you can play with the expression and change it and see what it does.
[40:56] Julius Volz
You can also let the different nodes get explained to you here what they actually do.
[41:04] Julius Volz
You get the explanation down here.
[41:07] Julius Volz
So yeah, that's just like a super duper brief intro to what PromQL looks like.
[41:13] Julius Volz
And yeah, importantly and I'll stop sharing here, it's only used for reading data.
[41:20] Julius Volz
So if you want to write data into Prometheus or delete some data that happens over totally different paths.
[41:26] Julius Volz
Like Prometheus writing into the TSDB after it has collected some data, or data being deleted once it has become too old, depending on your retention policies.
[41:37] Julius Volz
But PromQL itself always just gives you results back.
[41:42] Julius Volz
It's a read only language and really flexible at giving you precise answers and conditions.
[41:50] Viktor Petersson
So one thing, I mean, just a very simple question, I guess, around PromQL; maybe you can just explain it quickly. When you look at demo examples of Prometheus, particularly in the Stack Overflow forums and so on.
[42:05] Viktor Petersson
The most common functions are rate and irate.
[42:08] Viktor Petersson
Do you want to just help explain the difference between the two just quickly for.
[42:13] Viktor Petersson
Yeah, maybe people like myself.
[42:15] Julius Volz
Maybe I should also say: if you go to my YouTube channel, I think PromLabs is the channel name on YouTube,
[42:22] Julius Volz
I have a whole video about the three types of different rates and also the increase function that explains the exact differences.
[42:30] Julius Volz
Basically there are three different functions for telling how fast the counter is going up and those are rate, irate and increase.
[42:39] Julius Volz
The rate function gives you.
[42:42] Julius Volz
Maybe I should share my screen again, because it's so graphically demonstrable.
[42:48] Julius Volz
That totally makes sense.
[42:50] Julius Volz
So I'll just do that again.
[42:51] Julius Volz
So let me know again once you can see stuff.
[42:57] Julius Volz
So now I'm just in a vanilla Prometheus server expression entry interface.
[43:02] Julius Volz
So let's say I do the rate over these request counter metrics that I used earlier.
[43:11] Julius Volz
What we're doing now is at every resolution step in the graph we are basically selecting five minutes of past data.
[43:20] Julius Volz
So if we're producing like this output point in the graph, we're looking five minutes backwards.
[43:24] Julius Volz
We're taking all the raw data, the only-increasing counters, and we are kind of calculating an average of how fast per second the counter, or each of these counters, is increasing within that window.
[43:37] Julius Volz
And that per second increase value becomes the output point for this step in the graph.
[43:44] Julius Volz
And I can make these resolution steps a little bit more visible here.
[43:47] Julius Volz
If I set the resolution to five minutes, then we get a lot of explicitly visible resolution steps along the graph.
[43:55] Julius Volz
So you can really imagine that at every one of these steps we look back the amount of time that I specify in the expression.
[44:03] Julius Volz
And yeah, in this case this would be an entire step because this is five minutes and this is five minutes.
[44:11] Julius Volz
And it just summarizes that entire step's samples into an average per-second value.
[44:18] Julius Volz
Now what happens if I make this window a little bit smaller, but also I will go back to the original resolution and start with five minutes.
[44:30] Julius Volz
So five minutes, I'm smoothing over five minutes so it will be relatively smooth.
[44:36] Julius Volz
And if I make this averaging window smaller, like one minute, I get more spiky rates because now I'm averaging over fewer samples.
[44:45] Julius Volz
Same if I do 30 seconds; then it gets really spiky.
[44:49] Julius Volz
And I only scrape data every 15 seconds in this Prometheus server.
[44:53] Julius Volz
So eventually I even have to be careful that I don't make the window too small, because otherwise I will not always at every point be lucky enough to actually span two samples with my averaging window.
[45:09] Julius Volz
Right.
[45:09] Julius Volz
If they're 15 seconds apart, it could be that there are not two of them under a 20 seconds window.
[45:15] Julius Volz
This gets exacerbated if I do a 16 seconds window.
[45:19] Julius Volz
Then only sometimes are there actually going to be two samples under the rate window, to be able to even compute a rate.
[45:25] Julius Volz
And the graph will mostly be gappy.
[45:28] Julius Volz
So you do have to choose your window large enough to at least robustly select two samples.
[45:36] Julius Volz
But what if you wanted to say something like: hey, I want my graph to behave as spiky as possible, reacting as much as possible to the two most recent data points, while still not having to care too much about how small exactly I choose this rate window?
[45:56] Julius Volz
So that's the irate function, or also called instant rate function.
[46:01] Julius Volz
It also gives you a per second rate of increase.
[46:05] Julius Volz
But under this provided window here, it will only ever choose the latest two samples and see how much per second the counter increases between them.
[46:17] Julius Volz
So it doesn't matter anymore how large or small you make this here you will always get the same result providing I anchor it to a constant time here.
[46:27] Julius Volz
So one hour versus one minute it will look the same as long as you make this window large enough to always cover two data points.
[46:36] Julius Volz
Again, if I make it too small, you will run into issues again.
[46:39] Julius Volz
But that's basically what the irate function is about.
[46:43] Julius Volz
Normally I would not recommend using it unless you have a specific reason, because it will skip over most of the data in your range interval. You might have five minutes' worth of data in there, but now you're only looking at the latest two data points to give you some answer, and they might not really be representative.
[47:04] Julius Volz
And so usually people use rate.
[47:07] Julius Volz
It's good to average or smooth over some amount of time.
[47:10] Julius Volz
Not too long maybe.
[47:13] Julius Volz
But if you are in a super zoomed-in graph like this and you want to see really the latest developments, then maybe irate can make sense at times, right?
[47:25] Julius Volz
But yeah, use it sparingly.
[47:27] Julius Volz
And then a last function called increase is basically identical to rate except that it does not convert the output unit to per second.
[47:40] Julius Volz
So the shape of the graph, if I go back to the one hour graph, let's say, or let's go to two hours, the shape will look identical with rate and increase here.
[47:51] Julius Volz
The only difference is the Y axis.
[47:54] Julius Volz
So here we get a per second Y axis and if I go to increase with the same window now we get increases not per second but per one minute because I have a one minute window here.
[48:07] Julius Volz
Or if I do 15m, then we would have increases per 15 minutes.
[48:13] Julius Volz
And so you see again, but this kind of intermingles two aspects.
[48:18] Julius Volz
We intermingle the smoothing window size so how smooth the graph will look and the output unit.
[48:26] Julius Volz
So if I change this window, I will both change the smoothing and what unit I actually output, which at times might be actually what you want, right?
[48:35] Julius Volz
What's the total increase over one day or so?
[48:38] Julius Volz
Then you don't have to multiply it in the end with the number of seconds in a day, but at other times that's not actually what you want.
[48:45] Julius Volz
So most of the time by default the rate function is actually great.
[48:50] Julius Volz
It gives you a predictable per second output unit and that's great if you then want to combine it and divide it and so on with other constructs in your query.
[49:02] Julius Volz
If you keep everything at these base units, everything will stay more predictable and yeah, so that's why most of the time you will see just the rate function.
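Side by side, the three counter functions discussed here (the metric name is the one from the demo above; the window sizes are illustrative):

```
rate(demo_api_request_duration_seconds_count[5m])       # average per-second increase over the 5m window
irate(demo_api_request_duration_seconds_count[5m])      # per-second increase between the last two samples in the window
increase(demo_api_request_duration_seconds_count[15m])  # total increase over the 15m window (same shape as rate, different unit)
```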
[49:13] Viktor Petersson
Got it.
[49:13] Viktor Petersson
Okay, that's really cool.
[49:15] Julius Volz
Any other PromQL questions?
[49:17] Julius Volz
Or about anything on the screen?
[49:20] Viktor Petersson
Yeah, I think rate, I would presume, is your starting point for a lot of metrics when you're starting out. I guess that is the function that you start using most commonly.
[49:36] Julius Volz
Most common, yeah.
[49:38] Julius Volz
Rate and then summing of rates I would say is one of the most common things you will see.
[49:42] Julius Volz
Yeah, cool.
[49:44] Viktor Petersson
No, I think that's a good.
[49:45] Viktor Petersson
That's a good crash course and quick introduction.
[49:47] Julius Volz
Nice.
[49:48] Viktor Petersson
Cool.
[49:49] Viktor Petersson
Let's turn our eyes to the future a little bit and I'm curious about a few things to pick your brains on.
[49:56] Viktor Petersson
The first one is eBPF, which is all the rage these days in observability.
[50:03] Viktor Petersson
Is there any play for Prometheus to tap into eBPF for, I guess, collecting metrics with less resource overhead?
[50:13] Viktor Petersson
Or how do you see that changing observability in Prometheus?
[50:17] Julius Volz
Yeah, so myself, I'm not a big eBPF expert. Of course, it's an awesome kernel-level interface that allows you to tap into pretty much any place in the kernel and extract information or do stuff, whatever you want, in there.
[50:31] Julius Volz
There are eBPF exporters for Prometheus, so if you just Google "ebpf exporter", you will find one by Cloudflare, for example.
[50:40] Julius Volz
And yeah, you can configure it in different ways to give you all kinds of host metrics.
[50:46] Julius Volz
And I would actually also have to read the detailed documentation about what exactly it can do.
[50:53] Julius Volz
But yeah, basically it will enable you to collect metrics about all kinds of things you can, you know, instrument with eBPF.
[51:03] Viktor Petersson
You don't anticipate a world where an eBPF-based node exporter would replace the regular Go-based node exporter in the future?
[51:12] Julius Volz
It's a good question.
[51:13] Julius Volz
I mean, at the moment the question is what that would give us, if we could get at stats that are not exposed using the current interfaces, which are fine: the /proc filesystem, the /sys filesystem, and certain syscalls.
[51:31] Julius Volz
It could be that there are certain stats in the Linux kernel that we would also want to expose that would be possible in that way.
[51:39] Julius Volz
Let me just see if anyone has already added anything eBPF-wise to the node exporter.
[51:45] Julius Volz
You never know because there's so many different Prometheus components these days.
[51:50] Viktor Petersson
Right?
[51:51] Julius Volz
I don't find anything eBPF, at least module-wise, in the node exporter yet.
[51:58] Julius Volz
But of course it makes sense, because these are also node-level metrics, that you might at some point want to put something like that directly into the node exporter, and until that time run the separate eBPF exporter.
[52:12] Julius Volz
But yeah, I could guess at the upsides: being able to get at metrics that you don't otherwise have, being able to get at them without writing as much custom parsing code, because it's a bit annoying to parse those proc filesystem files sometimes, and also maybe getting at them more efficiently.
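As a tiny illustration of the hand-rolled parsing Julius means, here is a Go sketch that reads the 1-minute load average out of /proc/loadavg; the field layout is based on the documented format ("0.42 0.35 0.30 1/234 5678"), and the metric name in the output is just for flavor:

    package main

    import (
        "fmt"
        "os"
        "strconv"
        "strings"
    )

    func main() {
        data, err := os.ReadFile("/proc/loadavg")
        if err != nil {
            panic(err)
        }
        fields := strings.Fields(string(data))
        // First whitespace-separated field is the 1-minute load average.
        load1, err := strconv.ParseFloat(fields[0], 64)
        if err != nil {
            panic(err)
        }
        fmt.Printf("node_load1 %g\n", load1)
    }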
[52:35] Julius Volz
But so far it has not happened in the node exporter. I'd be totally for it, if it makes sense.
[52:43] Viktor Petersson
I'm just thinking you could increase the scraping frequency, for instance, to get metrics every second instead of every 15 seconds, perhaps without taking the...
[52:52] Julius Volz
I mean, most of the time the scraping frequency is not really bottlenecked on the producer of the metrics but more on the Prometheus server being able to ingest a lot of data.
[53:05] Julius Volz
It's fairly good at that, but that's still usually the bottleneck.
[53:11] Julius Volz
But yeah, there are some modules in the node exporter that take a little bit more time to produce data, some of which I think are also switched off by default.
[53:22] Julius Volz
So it could be that...
[53:24] Julius Volz
I'm not sure if there's something in there that could be made way more efficient by eBPF, because I hope the native kernel interfaces that expose it are already relatively efficient.
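For reference, the collectors Julius mentions as switched off by default are toggled with node exporter flags; to my understanding the processes collector is one example of a disabled-by-default collector, so enabling it looks roughly like:

    ./node_exporter --collector.processes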
[53:40] Viktor Petersson
Fair enough.
[53:43] Viktor Petersson
OpenTelemetry seems to be kind of taking, well, taking over observability at a pretty high pace right now.
[53:52] Viktor Petersson
It's gaining a lot of momentum.
[53:55] Viktor Petersson
Obviously Prometheus is kind of a building block in that.
[53:58] Viktor Petersson
How do you see OpenTelemetry as the observability stack of the future for cloud-native workloads, or how do you envision that?
[54:11] Julius Volz
Yeah, I mean, so OpenTelemetry came from a little bit of a different corner originally.
[54:17] Julius Volz
Very tracing-focused, with OpenTracing and OpenCensus then merging into OpenTelemetry.
[54:24] Julius Volz
So that was their prime, kind of initial use case, the one that was most stable.
[54:28] Julius Volz
And then there were logs and metrics, with metrics now also being more stable and people adopting it.
[54:34] Julius Volz
And they focus a lot on the instrumentation API aspect, like standardizing the instrumentation API.
[54:44] Julius Volz
What should it look like to instrument your application and what metadata should there be in certain interfaces for metrics and so on?
[54:52] Julius Volz
And in a lot of cases those are similar to what we do in Prometheus, but in some other cases they're a bit different.
[55:01] Julius Volz
They have slightly different metric types, and they have slightly different allowed character sets, especially in label names; those can be arbitrary UTF-8 characters, if I'm not mistaken.
[55:14] Julius Volz
Or at least we want to allow that now in Prometheus, as a result of wanting more OpenTelemetry compatibility.
[55:22] Julius Volz
And they also have different standardizations of what labels you always want to collect about where a given metric came from: the resource labels.
[55:36] Julius Volz
And in Prometheus we have something similar called target labels that Prometheus attaches to wherever it pulls metrics from.
[55:44] Julius Volz
But they're slightly different; in Prometheus they're more admin-configurable and less standardized.
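A sketch of what such admin-configured target labels look like in a scrape config; the label names and values here are made up:

    # prometheus.yml (fragment)
    scrape_configs:
      - job_name: "api"
        static_configs:
          - targets: ["10.0.0.5:8080"]
            labels:
              env: "prod"      # admin-chosen target labels, attached
              team: "payments" # to everything scraped from this job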
[55:51] Julius Volz
And so now there are efforts in the Prometheus team, wanting, of course, to still be relevant in this world.
[56:00] Julius Volz
And people want to use Prometheus as an OTel metrics store.
[56:05] Julius Volz
Prometheus will still only do the metrics part and not logging or tracing, but at least for the metrics part.
[56:11] Julius Volz
If you're using OpenTelemetry to instrument your apps, then we want to be able to store OTel metrics in a more native and better way.
[56:23] Julius Volz
You can already send OTel metrics to Prometheus in an experimental way.
[56:28] Julius Volz
There's an experimental OTLP receiver in there, and there are different ways of making it work.
[56:34] Julius Volz
You can either send the OTLP data directly to Prometheus using the new experimental interface, or you could bridge it earlier on into the Prometheus-native remote write format and send that to Prometheus.
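For the direct path, a sketch of how the experimental receiver is enabled; the feature flag and endpoint below match the Prometheus docs around the time of this recording, but treat them as assumptions and check your version:

    # Start Prometheus with the experimental OTLP receiver enabled.
    prometheus --enable-feature=otlp-write-receiver

    # OTel SDKs / collectors can then POST OTLP metrics to:
    #   http://localhost:9090/api/v1/otlp/v1/metrics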
[56:47] Julius Volz
But in Prometheus we want to extend the data model somehow to allow more characters, so we are able to store more of the data model that OTel has.
[57:00] Julius Volz
For metrics, we're looking at ways of supporting what they call, what's it called, delta temporality.
[57:11] Julius Volz
That's where you either send absolute counters that only increase over all of time, or you only send deltas from one push to the next.
[57:21] Julius Volz
And in a push-based monitoring system that's easier, because your process counts things from one push to the next and then resets the counter to zero, so at every push it only sends the delta that has happened between two pushes.
[57:37] Julius Volz
Right?
[57:39] Julius Volz
And the problem with a pull-based monitoring system like Prometheus is that pulls are supposed to be idempotent, and any service process that is being monitored by Prometheus could be monitored by either zero or many Prometheus servers at the same time.
[57:57] Julius Volz
So any scrape from a Prometheus server shouldn't just reset whatever the current count is.
[58:03] Julius Volz
So we have to add an additional little translation layer in there to potentially transform these delta temporality counters into absolute counters again, or support them natively, or something like this.
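A minimal Go sketch of what such a translation layer does conceptually, folding delta samples into absolute counters so that any number of pullers can scrape idempotently; this is an illustration, not the actual Prometheus implementation:

    package main

    import "fmt"

    // cumulative holds a running absolute counter per series, so delta
    // samples arriving via push can be exposed to any number of pullers
    // as ever-increasing counters.
    var cumulative = map[string]float64{}

    // ingestDelta folds one delta sample into the absolute counter.
    func ingestDelta(series string, delta float64) {
        cumulative[series] += delta
    }

    func main() {
        ingestDelta(`http_requests_total{path="/"}`, 3)
        ingestDelta(`http_requests_total{path="/"}`, 5)
        fmt.Println(cumulative[`http_requests_total{path="/"}`]) // 8
    }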
[58:23] Julius Volz
So this is still being hashed out and there's a couple of other things around mapping these resource and target labels.
[58:29] Julius Volz
Like, do we always take this huge amount of OTel labels and just put them onto the Prometheus metric?
[58:35] Julius Volz
That would get super messy.
[58:37] Julius Volz
Or do we let users configure which ones to map over?
[58:42] Julius Volz
Or do we always just put them into a separate metric that you then have to join in?
[58:47] Julius Volz
That's also annoying.
[58:48] Julius Volz
So there are definitely some little incompatibilities. You know, I would still recommend that people use native Prometheus instrumentation.
[58:59] Julius Volz
It's going to be way less code and it's going to be faster and simpler.
[59:04] Julius Volz
But if you do want to use OTel, if you're just interested in the metrics use case and you still want to use OTel, then we are going to keep working on making that better.
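To give a sense of how little code native instrumentation takes, here is a sketch with the official Go client; the metric name and port are placeholders:

    package main

    import (
        "net/http"

        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )

    // A single counter, registered once and incremented per request.
    var requestsTotal = prometheus.NewCounter(prometheus.CounterOpts{
        Name: "myapp_requests_total",
        Help: "Total requests handled.",
    })

    func main() {
        prometheus.MustRegister(requestsTotal)
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            requestsTotal.Inc()
            w.Write([]byte("ok"))
        })
        // Prometheus scrapes this endpoint.
        http.Handle("/metrics", promhttp.Handler())
        http.ListenAndServe(":8080", nil)
    }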
[59:16] Viktor Petersson
Okay, makes sense.
[59:17] Viktor Petersson
Yeah, I guess that's fundamentally a different thing between push and pull.
[59:20] Viktor Petersson
I mean, at least in our use case at Screenly, that was a bit of an inconvenience.
[59:26] Viktor Petersson
When you work with edge devices, for instance, you can't scrape them because obviously they're not in your data center.
[59:32] Viktor Petersson
So the push model there makes a lot more sense for our use case.
[59:35] Viktor Petersson
But I know Prometheus heavily favors pull versus push.
[59:40] Julius Volz
Yep.
[59:41] Julius Volz
For different reasons.
[59:42] Viktor Petersson
But yeah, absolutely, absolutely.
[59:44] Viktor Petersson
But I'm saying that might push that push use case a bit further, which we would welcome.
[59:52] Julius Volz
Makes sense for anyone with super highly segmented networks, or end devices at customers and so on. Polling is really great if you can reach everything easily in your own data center and so on.
[01:00:04] Julius Volz
But yeah, it gets harder as you break things up.
[01:00:08] Viktor Petersson
Absolutely, absolutely.
[01:00:09] Viktor Petersson
There's another project that I know you've been touching upon a little bit as well.
[01:00:13] Viktor Petersson
It's OpenMetrics.
[01:00:15] Viktor Petersson
Do you want to speak a bit about that and how that kind of...
[01:00:18] Viktor Petersson
There's a lot of "open" here, but...
[01:00:21] Julius Volz
How that relates to the whole thing? OpenMetrics was an idea of actually taking what we had as the Prometheus metrics text transfer protocol and just standardizing it more,
[01:00:34] Julius Volz
with some changes requested by other parties and stakeholders.
[01:00:39] Julius Volz
So this was an initiative started by Richard Hartmann of the Prometheus project, and he got others together from other companies and said, hey, let's do an RFC Internet standard.
[01:00:48] Julius Volz
And it made it pretty far, and it's also supported by Prometheus, but by now it was basically decided by the Prometheus team and the OpenMetrics team:
[01:01:01] Julius Volz
hey, we still want to do some things differently, so let's just actually merge OpenMetrics back into Prometheus again and not have it as a separate project anymore.
[01:01:12] Julius Volz
So it's just going to be archived in the CNCF. And it still, I think, explored a lot of the needs: if we want to standardize a metrics transfer protocol, or evolve Prometheus's format further, what do other parties actually want in such a thing?
[01:01:33] Julius Volz
And so it did help, but to my knowledge it will no longer exist as a separate project.
[01:01:41] Viktor Petersson
Okay, but it sounds like you made headway, in that you can now push Prometheus-style metrics to most cloud vendors. I guess it sounds like that kind of paved the way for that.
[01:01:54] Julius Volz
Yeah, that's a different thing, because with the OpenMetrics format, you have the Prometheus server and you have your monitored target.
[01:02:02] Julius Volz
That's what happens between those two.
[01:02:04] Julius Volz
And then the protocol that we use to speak to the outside world, that's the remote write protocol.
[01:02:08] Julius Volz
So that's a totally different one that is independent of the scrape format.
[01:02:15] Julius Volz
Right.
[01:02:15] Julius Volz
It has slightly different needs as well because you will want to be able to send multiple samples from multiple scrapes for many more different time series over this remote write protocol.
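Configuring that outbound side is a one-stanza affair; a minimal sketch with a hypothetical long-term storage endpoint:

    # prometheus.yml (fragment)
    remote_write:
      - url: "https://longterm-store.example.com/api/v1/write"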
[01:02:27] Julius Volz
Whereas the pull-based OpenMetrics format, or the previous text-based format on which it is based, is really for going to a single monitored process or target and pulling its current metric state.
[01:02:43] Julius Volz
Just the one number for each metric at this point in time.
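For comparison, this is roughly what that pulled metric state looks like in the text exposition format on which OpenMetrics was based; the metric is again a stock example:

    # HELP http_requests_total Total HTTP requests served.
    # TYPE http_requests_total counter
    http_requests_total{method="GET",path="/"} 1027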
[01:02:47] Viktor Petersson
Okay, that's perfect.
[01:02:50] Viktor Petersson
I think we're up on time.
[01:02:51] Viktor Petersson
But it's been very informative and helpful for me, and I think that's really good.
[01:02:57] Viktor Petersson
And do you want to do a quick shout-out? If somebody needs consulting services around Prometheus, where can they get in touch with you, and how can they...
[01:03:05] Viktor Petersson
Yeah, how can they...
[01:03:06] Julius Volz
Oh yeah.
[01:03:07] Viktor Petersson
Take their Prometheus game to the next level?
[01:03:09] Julius Volz
I'd love to.
[01:03:09] Julius Volz
So these days I mostly do training around Prometheus, no longer just, you know, per-hour consulting.
[01:03:16] Julius Volz
I have two main things I do besides other partnerships and still open source development.
[01:03:21] Julius Volz
So those are live trainings around the PromQL query language, where I really explain, over a multi-hour session, kind of with a setup like this one right now, with screen sharing and material and everything, how PromQL works, how you can use it, and all the different pitfalls, with exercises and so on.
[01:03:41] Julius Volz
So I do those, you know, quite often with teams in companies.
[01:03:46] Julius Volz
And the other thing is self-paced courses. You can find me at promlabs.com, where I think at the very top there's a live trainings link, and there's another link on the side for the self-paced courses that leads you to training.promlabs.com; those are training modules.
[01:04:05] Julius Volz
So courses that you can either buy as an individual or you can buy for your entire engineering team as a company to learn all the basics of Prometheus, like from what is Prometheus to all the advanced PromQL stuff and alerting and dashboarding and node exporter metrics and remote storage and all that.
[01:04:26] Julius Volz
So it's textual and images and screenshots and quizzes and interactive stuff where you set up all these components and string them together and build setups.
[01:04:37] Julius Volz
So it's not video content.
[01:04:39] Julius Volz
But that actually makes it possible for me to keep it up to date all the time, which would be really hard with videos.
[01:04:46] Julius Volz
So yeah, individuals buy them and companies do as well.
[01:04:50] Julius Volz
And of course I can recommend them.
[01:04:53] Viktor Petersson
Perfect.
[01:04:54] Viktor Petersson
Thank you so much for your time today, Julius.
[01:04:57] Viktor Petersson
It was a great pleasure having you on the show, and I'll talk to you soon.
[01:05:00] Julius Volz
Alrighty.
[01:05:00] Viktor Petersson
Thank you so much.
[01:05:01] Julius Volz
See you.
[01:05:01] Viktor Petersson
Thank you.
[01:05:02] Julius Volz
Bye.

Found an error or typo? File a PR against this file or the transcript.