
Nerding out about Prometheus and Observability with Julius Volz

15 JAN 2024 • 1 hour 5 mins

In this episode, I’m joined by Julius Volz, co-founder of Prometheus and founder of PromLabs, to explore the fascinating world of systems monitoring and observability. Julius’s journey from working on Borgmon at Google to co-creating Prometheus offers unique insights into how modern monitoring systems evolved.

We start with the technical foundations of Prometheus. What particularly caught my attention was Julius’s explanation of their dimensional data model and how it revolutionized metrics-based monitoring. His breakdown of common pitfalls, especially around metric design and “cardinality bombs,” provides invaluable guidance for anyone implementing Prometheus.

The conversation gets especially interesting when we dive into long-term data storage challenges. Julius shares practical insights about solutions like Cortex and Thanos, demonstrating how to handle large datasets effectively. His live demonstration of PromQL, showing functions like rate, irate, and increase, reveals the powerful querying capabilities that make Prometheus stand out.

I was particularly intrigued by our discussion of future trends in observability. Julius’s thoughts on eBPF integration, OpenTelemetry, and the OpenMetrics project show how the monitoring landscape continues to evolve. We also explore the simplicity of writing Prometheus exporters, highlighting how accessible the technology can be even for those with minimal coding experience.

If you’re interested in systems monitoring, observability, or infrastructure management, you’ll find plenty of practical insights here. Julius brings both deep technical knowledge and hands-on experience to the discussion, making complex monitoring concepts accessible while maintaining their technical depth.

Transcript

[00:00] Viktor Petersson
Hello and welcome to this episode of Nerding Out With Viktor.
[00:03] Viktor Petersson
Today.
[00:03] Viktor Petersson
I got a very special guest with me today, Julius from Prometheus.
[00:09] Viktor Petersson
Maybe we should start with doing a quick intro to yourself, Julius, for people who are not familiar with who you are, and with what Prometheus is, big picture.
[00:20] Julius Volz
Yeah.
[00:20] Julius Volz
So I'm Julius, I live in Berlin.
[00:23] Julius Volz
I am the co-founder of the open source Prometheus monitoring system, but also the founder and the sole person behind the company PromLabs.
[00:33] Julius Volz
So Prometheus is the open source, open governance monitoring system that is being developed and used by many people, and there are many companies around it; PromLabs is just myself, and it's just one of those companies.
[00:48] Viktor Petersson
Got it.
[00:48] Viktor Petersson
Perfect.
[00:49] Viktor Petersson
And I guess for those not familiar with Prometheus.
[00:53] Viktor Petersson
Do you want to give kind of a sense of how widely used Prometheus is today and how it's being used in general?
[01:01] Julius Volz
Yeah, I mean Prometheus has pretty much become the de facto standard in metrics based systems monitoring in the open source world.
[01:12] Julius Volz
At least.
[01:13] Julius Volz
There's some closed and hosted competitors of course, but at least for the metrics based monitoring, it is pretty much the standard.
[01:20] Julius Volz
And yeah, I mean so you can find it really everywhere from small startups to really big banks, corporations, enterprises, even people run it at home to monitor their homes.
[01:33] Julius Volz
And yeah, you can use it in the classic data center use case to monitor your IT in a data center, but people also monitor hardware with it like sensors and chips and wind parks and all that kind of stuff.
[01:50] Viktor Petersson
It's super interesting.
[01:52] Viktor Petersson
Let's take a stroll down memory lane, because I'm really curious about the early days of Prometheus and want to dive into that. If I'm not mistaken, it started around 2012 or so back at SoundCloud.
[02:05] Viktor Petersson
Back in those days.
[02:06] Viktor Petersson
Could you speak a bit more about what happened, what led to the invention of Prometheus, and what kind of pain points it was solving?
[02:16] Julius Volz
Yeah, exactly.
[02:17] Julius Volz
So this was 11 years ago, 2012.
[02:20] Julius Volz
So by now all of this is not as exciting maybe anymore, but back then it was.
[02:26] Julius Volz
So my job previous to SoundCloud was at Google as a site reliability engineer in one of the services.
[02:34] Julius Volz
And all the site reliability engineers at Google used a tool called Borgmon to monitor their production services at Google, whether this was Google Search or Gmail or the service I was on, which was an internal backup service.
[02:48] Julius Volz
And this was a tool that was very similar to what Prometheus is now.
[02:53] Julius Volz
So the idea was to collect time series, store them with a dimensional data model, so a metric name and then a set of labels attached to them, so you can see in detail where something happened.
[03:07] Julius Volz
Which is great especially for dynamic cloud based systems which Google already had back then.
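To make the dimensional data model concrete, here is what one such labeled time series looks like in standard Prometheus notation (the metric name and label values below are illustrative):

```
# One time series exists per unique combination of metric name and label set:
http_requests_total{job="api-server", instance="10.0.0.1:8080", method="POST", path="/search", status="500"}
```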
[03:13] Julius Volz
Now the thing is, after Google I went to SoundCloud, and Matt Proud, another ex-Googler, also went to SoundCloud at roughly the same time.
[03:21] Julius Volz
And we were basically hired to try and make SoundCloud more stable and reliable and faster, as platform or system engineers.
[03:31] Julius Volz
And we really found that the monitoring system world, especially in the open source world outside of Google, was severely lacking at that time.
[03:41] Julius Volz
Either the systems could only do alerting, but didn't really have any idea of history like time series, right?
[03:48] Julius Volz
Nagios didn't really have much of a data model to speak of or the alerting conditions you could create in there were very simplistic things and check scripts and so on.
[04:00] Julius Volz
And then you had systems like Graphite or OpenTSDB, which either didn't have a dimensional data model and/or didn't have a proper query language to do math between whole sets of numbers that are correlated in some way, and so on.
[04:16] Julius Volz
And then also efficiency of storage, UIs, all those things were kind of really lacking.
[04:22] Julius Volz
And so eventually we got to the point where we said, well, let's at least try, in our free time, like on the weekends and after work at first, to build something that is more similar to what we were used to at Google.
[04:34] Julius Volz
And this became Prometheus.
[04:37] Julius Volz
We threw it up on GitHub under an Apache license from day zero, but we hadn't really told many people about it yet.
[04:46] Julius Volz
Then from there on, maybe two months in, we had the roughest of prototypes that could collect some data, store it and show it.
[04:55] Julius Volz
But the first prototype is the easy part, and then comes the 99% of hard work.
[05:03] Julius Volz
So after that it was years of convincing people that it made sense to build our own monitoring system, then actually building it until it was stable and scalable enough and wouldn't just OOM all the time, and all those things.
[05:18] Julius Volz
And then also to have, at SoundCloud internally, enough of a killer use case for it. The context that made Prometheus so important at SoundCloud specifically is that SoundCloud had very early on built a cluster scheduler like Kubernetes, much simpler of course, before Docker, before almost anyone else in the world was using containers. They built this using Go version 0.9-something and raw LXC containers, and the hundreds of microservices running on that cluster would just be rescheduled on different hosts and different ports every time a new revision was rolled out.
[06:04] Julius Volz
And so that's what made it extra hard to monitor it.
[06:07] Julius Volz
With existing monitoring systems like Graphite for time series or Nagios for alerting, when there was a latency spike it was almost impossible to find out whether it was the entire service that was just getting slower or whether one specific instance was contributing to it, because we didn't have enough scalability to track per-instance, process-level stats and very short-lived time series and all that.
[06:33] Julius Volz
So yeah, the killer use case really started coming or I guess the first one of those was instrumenting this internal cluster scheduler system.
[06:43] Julius Volz
So every developer immediately now could see without doing anything, instance level stats of memory usage, CPU usage and so on.
[06:52] Julius Volz
Like all these per process, per container stats that are completely commonplace these days.
[06:58] Julius Volz
And they could key it by process, by revision, by different labels that they cared about, to track whether something was related to a revision or a specific host, et cetera.
[07:10] Julius Volz
Right, that was one thing that really helped it internally.
[07:15] Julius Volz
And then we needed a dashboard builder.
[07:18] Julius Volz
So I had built PromDash back then, before Grafana was out.
[07:22] Julius Volz
So Grafana didn't really exist yet.
[07:24] Julius Volz
PromDash doesn't exist anymore.
[07:26] Julius Volz
So by now we're also all just using Grafana.
[07:28] Julius Volz
There's some new competitors on the horizon now.
[07:32] Julius Volz
And yeah, I mean at some point we reached an internal threshold of where people started, you know, adopting this more and more, finding it so useful that eventually there was an edict saying no new service without Prometheus Metrics.
[07:47] Julius Volz
And over the years, yeah, we already kind of told some other ex-Googlers: hey, we are writing a system similar to Borgmon.
[07:59] Julius Volz
Maybe you want to use it at your next company as well.
[08:03] Julius Volz
So we attracted like a handful of external users already.
[08:06] Julius Volz
But we only really fully published this in the beginning of 2015, with a blog post from SoundCloud and another early user company.
[08:16] Julius Volz
And that's when things really exploded on Hacker News and everywhere else.
[08:20] Julius Volz
And you know, shortly after, we joined the CNCF, and now the project is completely neutrally hosted in a foundation that belongs to the CNCF.
[08:30] Julius Volz
So, for people who don't know, that is the Cloud Native Computing Foundation, which in turn belongs to the Linux Foundation.
[08:38] Julius Volz
And they were just setting that up originally, kind of Google together with the LF.
[08:42] Julius Volz
They set it up to house Kubernetes neutrally.
[08:45] Julius Volz
And we were the second project joining that.
[08:48] Julius Volz
Yeah, so I guess that's the beginnings of it all.
[08:51] Julius Volz
And since then everything has grown a lot and evolved a lot.
[08:56] Viktor Petersson
Yeah, that's.
[08:57] Viktor Petersson
I mean, I think there are.
[08:58] Viktor Petersson
The interesting thing here is that the timing really played in its favor, because I think it was just around the time when the whole ephemeral workload thing really started taking off. And obviously what you guys were doing predated Kubernetes, which is kind of the default runtime these days.
[09:15] Viktor Petersson
But back then, traditional monitoring tools, they were really not built for ephemeral workloads.
[09:21] Viktor Petersson
Right.
[09:21] Viktor Petersson
I think that's.
[09:22] Viktor Petersson
That was.
[09:22] Viktor Petersson
Yeah, it sounds like that was the big killer feature that really gained adoption.
[09:27] Viktor Petersson
Right.
[09:27] Julius Volz
Yeah, especially maybe not ephemeral in the sense of seconds because Prometheus is not the greatest at that.
[09:33] Julius Volz
Because you at least need something that can then keep the state of whatever just ran. But it's great for short-lived time series, or ephemeral workloads in the sense of things that maybe just run for a day and then get rescheduled with a new revision or on a new, you know, process or something.
[09:51] Julius Volz
So not just tracking one fixed set of identified time series over time, but really, you know, having a lot of churn in the identities of your time series over the day as developers roll out new revisions and scale down or up apps and so on.
[10:07] Viktor Petersson
Yeah.
[10:09] Viktor Petersson
So more ephemeral worker pools rather than ephemeral in the sense of like serverless, I guess.
[10:15] Julius Volz
Yeah, yeah.
[10:16] Julius Volz
You can also make that work, but you need some extra bits for that.
[10:19] Viktor Petersson
Yeah, yeah.
[10:22] Viktor Petersson
So one thing I'm really curious about: obviously you have been building and working with Prometheus now.
[10:28] Viktor Petersson
Well, since day zero by definition.
[10:31] Viktor Petersson
But what I'm really curious about is what is a common mistake you see in metrics?
[10:35] Viktor Petersson
I think I've seen some blog posts you've written about this.
[10:41] Viktor Petersson
There is some Prometheus consultancy out there that has written quite a few blog posts.
[10:45] Viktor Petersson
But I'm curious about your vantage point of that.
[10:47] Viktor Petersson
Like, what really are the most common mistakes people make when they're implementing Prometheus?
[10:54] Viktor Petersson
I guess in general.
[10:57] Julius Volz
Yeah.
[10:58] Julius Volz
And one of the companies writing blog posts about exactly this is myself.
[11:02] Julius Volz
So that is PromLabs.
[11:03] Julius Volz
Also, if you go to promlabs.com and you go to the blog section, there's an article called "Avoid these 6 mistakes when getting started with Prometheus", and the very clickbaity title leads you to the blog post mentioned.
[11:17] Viktor Petersson
I read the post some time ago.
[11:19] Julius Volz
Yeah, some of the common things that stump newbies and these are very common.
[11:25] Julius Volz
I think really the most common one is cardinality bombs.
[11:28] Julius Volz
This is just when you discover the concept of dimensionality and labels for the first time. Especially back then, when it was new, people were at first a little bit skeptical: what are these labels?
[11:42] Julius Volz
But then they said, oh, this is really useful.
[11:45] Julius Volz
Let's put everything into a label.
[11:46] Julius Volz
Like let's also split up all our time series by the exact user ID or an email or something of which there are an unbounded huge number of possible values.
[11:58] Julius Volz
And the main thing in Prometheus is that every set of labels generates one time series automatically.
[12:09] Julius Volz
The unique label set identifies a time series together with a metric name.
[12:13] Julius Volz
And so if you have one label in there that has a million different values, not only do you generate a million time series, but a million times whatever the number of possible values is on the other labels of that same metric.
[12:27] Julius Volz
So it all multiplies together, the cardinality.
[12:30] Julius Volz
And a big Prometheus server will maybe comfortably handle 10 or so million time series.
[12:39] Julius Volz
It really depends on how many resources you give it.
[12:42] Julius Volz
So several million time series are kind of normal for a big Prometheus server, maybe even more.
[12:48] Julius Volz
But that is your total budget.
[12:50] Julius Volz
And you want to kind of design your metrics in such a way that you stay under that total budget.
[12:57] Julius Volz
And how you exactly do that is kind of up to you.
[12:59] Julius Volz
Whether you have fewer processes to monitor, because the process is also a label dimension, or whether within a process you split up a metric by fewer labels.
[13:11] Julius Volz
Or maybe there's ways to summarize the potential different values within one label into groups so you don't have to.
[13:21] Julius Volz
Instead of tracking a million different status codes, maybe you can group them into error, success, or something else.
[13:27] Julius Volz
Or something like this.
[13:28] Julius Volz
Right?
[13:29] Julius Volz
A very common thing where this happens is also when people want to track HTTP request statistics and they put an entire path in there, like, hey, /posts/<user-ID>/<post-ID> or whatever.
[13:44] Julius Volz
And these IDs, right?
[13:45] Julius Volz
They can get very high cardinality.
[13:48] Julius Volz
And there you could also say, hey, before I track this, I will actually replace these high cardinality bits in my path with something that is just a placeholder.
[13:57] Julius Volz
So now I'm just generating one label value for that particular pattern of labels.
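As a rough illustration of how label cardinality multiplies, and of one common diagnostic pattern for finding offenders (the numbers and metric names below are hypothetical):

```
# Labels on one metric: method (3 values) * path (50) * status (10) * instance (100)
#   => up to 3 * 50 * 10 * 100 = 150,000 series: manageable.
# Add a user_id label with 10,000 values:
#   => up to 150,000 * 10,000 = 1.5 billion series: a cardinality bomb.

# A query pattern often used to see which metric names contribute the most series:
topk(10, count by (__name__) ({__name__=~".+"}))
```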
[14:06] Julius Volz
So that's one.
[14:08] Julius Volz
I'm just looking here for others.
[14:12] Julius Volz
The others are even a little bit more technical, I guess, that are in my blog post.
[14:16] Julius Volz
So I won't go into them too deeply, but I think this whole metrics design question is one of the biggest ones that everyone gets hit with initially.
[14:28] Viktor Petersson
Yeah, I can definitely relate to that.
[14:29] Viktor Petersson
When we first started adopting Prometheus, we tagged and over tagged everything and it became, like you correctly pointed out, kind of unusable after a while.
[14:40] Viktor Petersson
So less is more, I guess in some ways and start expanding from there.
[14:45] Viktor Petersson
Speaking of metrics in general, so those are some common mistakes with Prometheus.
[14:51] Viktor Petersson
If you put your SRE hat back on for a moment, what do you consider kind of the blueprint or the best practice for server monitoring in general?
[15:03] Viktor Petersson
Looking away from the application stack: more the metrics that you care about when you do observability, and how you go about doing that and setting up those dashboards.
[15:14] Viktor Petersson
So how do you think about that?
[15:16] Viktor Petersson
If you think from that angle.
[15:18] Julius Volz
You mean like monitoring servers?
[15:22] Viktor Petersson
So assuming we are.
[15:24] Viktor Petersson
Yes.
[15:25] Viktor Petersson
Like the health of the server in general.
[15:26] Viktor Petersson
Right.
[15:27] Viktor Petersson
And so what are the metrics that you care about in terms of monitoring a server in general really?
[15:34] Viktor Petersson
If you're thinking about best practices for somebody who hasn't had proper monitoring in place, perhaps, and is setting that up from day zero, I presume that would involve the node exporter.
[15:46] Viktor Petersson
Perhaps.
[15:47] Viktor Petersson
But I'm just curious what you think about that in general, how you see monitoring in general.
[15:55] Julius Volz
Right.
[15:55] Julius Volz
So yeah, when it comes to like monitoring hosts.
[15:58] Julius Volz
Right, that's what you mean.
[15:59] Viktor Petersson
Yes.
[16:01] Julius Volz
There's often not that much you actually want to alert on.
[16:04] Julius Volz
So it depends what your environment is, of course, and what your philosophy is.
[16:09] Julius Volz
But often for hosts, at least, there's the node exporter.
[16:13] Julius Volz
The node exporter is one of those agents that the Prometheus project officially offers; it exposes all kinds of metrics about the host it is running on to the Prometheus monitoring system.
[16:26] Julius Volz
So you can collect CPU and memory usage and network usage and everything you would expect these days. If you're running it on Linux, it gets them from the /proc filesystem, from the /sys filesystem, all these different kernel interfaces, and then just bridges them over to Prometheus metrics.
[16:42] Julius Volz
And so the first step is to just run the node exporter, which is really easy.
[16:46] Julius Volz
It's easy to deploy on every host you have and then just monitor it so you have the metrics.
[16:52] Julius Volz
You can configure it in different ways, for example to exclude some pseudo-filesystems or Docker container volumes and stuff like that.
[16:59] Julius Volz
A lot of those exclusion flags are set pretty reasonably by default, so often you can just start it without any flags and it'll work fine.
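A minimal sketch of what scraping it might look like in prometheus.yml, assuming the node exporter is listening on its default port 9100 (the host names are hypothetical):

```yaml
scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["host1.example.com:9100", "host2.example.com:9100"]
```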
[17:09] Julius Volz
And yeah, then there's just like a few things you might want to alert on.
[17:12] Julius Volz
Like, you know, back then it was pretty common to alert on stuff like load average being high or certain resource usage being high.
[17:22] Julius Volz
But the problem is that sometimes that doesn't really point to a real problem, and it might actually generate overly noisy alerts about conditions that don't immediately affect the service in a bad way.
[17:34] Julius Volz
And so alerting I would usually approach more from the SLA perspective: what does the user of whatever service you offer expect, what kind of agreement do you have with them, or what service level objective (SLO) did you set for yourself?
[17:54] Julius Volz
Like: this is the latency we want to have; the 90th percentile latency should always be below 100 milliseconds.
[18:04] Julius Volz
We should have 99.9% uptime over a given month, and so on.
[18:10] Julius Volz
And then have alerts based more on these user-visible metrics rather than on all possible underlying causal metrics.
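A hedged sketch of what such an SLO-style latency condition could look like in PromQL, assuming a histogram metric named http_request_duration_seconds (the metric name and threshold are illustrative):

```
# Fire when the 90th percentile request latency, averaged over 5m, exceeds 100ms:
histogram_quantile(0.9, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 0.1
```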
[18:19] Julius Volz
You still want to collect all those underlying causal host metrics to be able to figure out when you do get alerted what might be the reason, what might be the underlying cause, are there any weird things in your graphs and so on.
[18:31] Julius Volz
But host-wise there's not a lot you would alert directly on.
[18:36] Julius Volz
But of course there's some stuff like disk usage: at least if it's completely full, or if you can predict that it will be. Maybe it's already at 80% full and filling up quite fast, and you can use a linear prediction in the Prometheus query language, PromQL, to tell that it will be full in one day.
[18:55] Julius Volz
If it develops further like this, then it's still a good idea to alert on these imminent dangers to your actual service.
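The linear prediction Julius mentions is typically expressed with PromQL's predict_linear function; a minimal sketch using the standard node exporter filesystem metric:

```
# Fire if, extrapolating the last 6h of growth, the filesystem will be full within 24h:
predict_linear(node_filesystem_avail_bytes{fstype!~"tmpfs"}[6h], 24 * 3600) < 0
```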
[19:05] Julius Volz
But other than that I would be very conservative with actual host level alerts.
[19:09] Julius Volz
And again it depends a lot on your environment of course, because also some environments will be completely happy if one host completely dies, whereas others will just, you know, completely have an issue if you don't have that redundancy.
[19:26] Julius Volz
Right.
[19:27] Viktor Petersson
Yeah, I mean, I like the vantage point of SLOs or SLAs: what is customer-impacting, really.
[19:33] Viktor Petersson
I think that's a good view of what you should alert on.
[19:39] Viktor Petersson
I think that's a good guiding principle.
[19:42] Viktor Petersson
You kind of alluded to alerting, but do you want to speak a bit about the Alertmanager as well?
[19:46] Viktor Petersson
Because that's obviously an important cornerstone of Prometheus at large.
[19:51] Julius Volz
Yeah, so the way Prometheus alerting works in general is that first Prometheus collects all these dimensional metrics, and then for anything useful you want to do with the collected data, you will probably use the Prometheus query language, PromQL.
[20:07] Julius Volz
So this is like a dimensional, very useful functional query language for processing your time series data and generating some answers based on the selected data, whether you want to aggregate or select or do mathematical correlations, and so on.
[20:23] Julius Volz
And you can use this query language both for dashboarding, ad hoc debugging, and automation, but also for alerting.
[20:33] Julius Volz
So alerting uses the same PromQL query language.
[20:36] Julius Volz
So this is not the Alertmanager yet; that still happens in Prometheus.
[20:40] Julius Volz
The alerting rules are actually configured as part of the Prometheus server, and the Prometheus server regularly evaluates the PromQL expression contained in an alerting rule.
[20:53] Julius Volz
And if there is any output, any time series being returned from this PromQL expression in the rule, those outputs will become alerts, subject to some other thresholds that are configured in the rule, and so on.
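A minimal sketch of what such an alerting rule looks like in a Prometheus rule file (the expression, threshold, and labels are illustrative):

```yaml
groups:
  - name: example-alerts
    rules:
      - alert: HighErrorRate
        # Every time series this expression returns becomes a pending, then firing, alert.
        expr: sum by (path) (rate(http_requests_total{status="500"}[5m])) > 0.05
        # Only fire if the condition holds continuously for 10 minutes.
        for: 10m
        labels:
          severity: page
```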
[21:07] Julius Volz
And then once they do become alerts, they get sent to this separate server component that is called the Alert Manager.
[21:13] Julius Volz
And so the Alertmanager just receives fully baked, hey-here's-a-problem style alerts from various Prometheus servers in your infrastructure, and then it routes them based on their labels.
[21:29] Julius Volz
So basically the Alertmanager config is like a big routing tree, and you end up in some node of that routing tree based on what labels your alert had.
[21:38] Julius Volz
And you might route on stuff like the team or the service or the severity level of the alert, right?
[21:44] Julius Volz
And then in the routing node that you actually reach, you can configure all kinds of things: do I want to send this to Slack or to PagerDuty, for example, and to which team's Slack channel, and so on.
[21:59] Julius Volz
You can say, hey, throttle this in a certain way over time, group this by certain labels.
[22:10] Julius Volz
So, for example, hey, don't send me one alert for every host that is down, but send me one alert that contains all the hosts that are down.
[22:19] Julius Volz
So you can kind of bake many notifications into one with more detail so you don't get flooded.
[22:26] Julius Volz
And yeah, there are a bunch more settings possible in the Alertmanager, but basically this is the component that all your Prometheus servers send their raw alerts to, and it then routes and throttles and aggregates and finally dispatches your final alert notifications to a human.
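A minimal sketch of such a routing tree in an Alertmanager config (receiver names, channels, and matchers are illustrative; the Slack and PagerDuty receivers would also need credentials not shown here):

```yaml
route:
  receiver: default
  # Bake alerts that share these labels into one grouped notification.
  group_by: ["alertname", "cluster"]
  routes:
    - matchers:
        - severity="page"
      receiver: oncall
receivers:
  - name: default
    slack_configs:
      - channel: "#alerts"
  - name: oncall
    pagerduty_configs:
      - routing_key: "placeholder-key"
```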
[22:44] Viktor Petersson
Okay, that's perfect.
[22:46] Viktor Petersson
One thing that I think people who start using Prometheus eventually run into is long term storage, because obviously Prometheus is not designed for long term storage. It's designed for short term storage, where I think its default is 14 days, if I'm not mistaken, and then you can set it higher.
[23:06] Viktor Petersson
But I think it would probably top out at about a month or so on most hosts before it becomes useless.
[23:13] Viktor Petersson
Kind of. But what I mean is, what's your view on that?
[23:17] Viktor Petersson
There has been quite a lot of development here; Cortex is one of them, and quite a few others.
[23:22] Viktor Petersson
How do you see that evolving and what are you thinking around that?
[23:27] Viktor Petersson
How do you recommend people look at it?
[23:29] Julius Volz
Good point.
[23:30] Julius Volz
So Prometheus comes with its own built-in TSDB, a time series database storing files on local disk.
[23:39] Julius Volz
And I would have to say for many use cases that are not huge, this actually works pretty well even as a long term storage.
[23:45] Julius Volz
So there are people like me who have years of data in a single Prometheus database.
[23:51] Julius Volz
But of course if you do collect a lot of data about millions of different time series every couple of seconds, eventually you will hit the scalability bottleneck of one disk and this local TSDB implementation.
[24:06] Julius Volz
And depending on your requirements, you may also hit robustness or reliability issues when that one disk dies or you lose the data or something. There are ways to back it up, so you can still make it work.
[24:19] Julius Volz
So first of all I wanted to say that you actually can use the normal Prometheus TSDB quite well for certain long term storage scenarios, but eventually of course you will outgrow what a single node can do.
[24:33] Julius Volz
And there are different ways of getting around that.
[24:37] Julius Volz
So one is: hey, let's configure the Prometheus server to send all the data it collects, or a subset of it, to some remote storage system.
[24:48] Julius Volz
So it doesn't only store it locally, but forwards it to some remote endpoint.
[24:54] Julius Volz
If you want to, you can even turn off the local storage completely so it only forwards.
[24:59] Julius Volz
And then that remote endpoint needs to implement a protocol that we call remote write.
[25:05] Julius Volz
This is a protocol that the Prometheus monitoring system standardized.
[25:11] Julius Volz
And by now there are so many different integrations for that protocol.
[25:16] Julius Volz
Like, all the major cloud providers accept it.
[25:20] Julius Volz
If you go to, you know, Amazon, Google and all these, you can usually send Prometheus remote write there, or also to Chronosphere, Grafana Cloud, all these players. They have their own proprietary databases for how they scale and store that data in different ways, give you extra features on top, and give you a global view over all the data from your different Prometheus servers.
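Configuring that forwarding is a small block in prometheus.yml; a minimal sketch (the endpoint URL is hypothetical):

```yaml
remote_write:
  - url: "https://metrics.example.com/api/v1/write"
    # Optionally forward only a subset of the collected series:
    write_relabel_configs:
      - source_labels: [__name__]
        regex: "node_.*"
        action: keep
```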
[25:45] Julius Volz
So this is one option, and there are also open source alternatives to this. One is Cortex.
[25:53] Julius Volz
And yeah, this also came from a long overlap with Prometheus maintainers.
[26:00] Julius Volz
I was one of the original Cortex creators as well, along with Tom Wilkie.
[26:05] Julius Volz
And this is more of a clustered system that you can run locally, but it is quite complex.
[26:10] Julius Volz
So it's mainly for people who want to offer Prometheus data storage as a service in their big organization, as a central point where all the different teams can send their Prometheus data, and then it will offer you a PromQL-compatible API that you can use with Grafana and other tools, even Alertmanager, and so on.
[26:32] Julius Volz
So that is really a horizontally scalable multi tenant tool.
[26:37] Julius Volz
It is a bit more complex to deploy.
[26:40] Julius Volz
So most people who do want to have some kind of long term storage on premise tend to use a thing that works a little bit differently called Thanos.
[26:50] Julius Volz
And Thanos works in this way, at least the way it was originally envisioned.
[26:56] Julius Volz
Now it supports different modes as well.
[26:58] Julius Volz
is that you still run all your different Prometheus servers in your infrastructure, monitoring different services and regions and data centers and so on.
[27:09] Julius Volz
But now you want to add both a global view over them and long term storage, and then also some HA deduplication and so on.
[27:18] Julius Volz
This is what Thanos can do.
[27:20] Julius Volz
So you still have your Prometheus servers, but now you run little sidecars next to each of them, and they all integrate together with the Thanos Querier into this kind of mesh where you can query over all of them at once in a federated way. You give it a single PromQL query, and the querier knows which Prometheus servers it actually needs to hit, based on some labels in your query.
[27:43] Julius Volz
So it doesn't need to always go to all of them.
[27:46] Julius Volz
And these sidecars running next to each Prometheus server can also ship all the finished TSDB blocks.
[27:55] Julius Volz
So everything that is a bit older than like two or three hours goes to an object storage, like S3 or MinIO or GCS and others.
[28:05] Julius Volz
And then the Querier, the Thanos Querier can integrate that long term older data as well.
[28:12] Julius Volz
And so there it's really safely backed up, right? In any of those object storages you can easily replicate it and all that.
[28:20] Julius Volz
So yeah, you get pretty cheap, very long term storage, including some downsampling features and so on.
[28:28] Julius Volz
So most people who want to scale Prometheus internally, they just use Thanos.
[28:33] Julius Volz
And then others who really want to invest in a dedicated team running such a cluster for other teams would use Cortex.
[28:41] Julius Volz
There's also Mimir, which is a Grafana fork of Cortex now.
[28:45] Julius Volz
And then others again, they would choose some cloud services to do that.
[28:51] Viktor Petersson
Yeah, no, we've been using Thanos at Screenly for a while, and we use it both for metrics on our cluster infrastructure, but also on actual devices.
[29:01] Viktor Petersson
So pushing metrics from actual devices.
[29:03] Viktor Petersson
But one of the things I find great, this kind of beautiful simplicity of Prometheus, is really the output.
[29:12] Viktor Petersson
Because if you look at what a /metrics feed looks like, it is so simple, and that's kind of what's beautiful about it in many ways.
[29:21] Viktor Petersson
Because writing a Prometheus exporter is very trivial.
[29:26] Viktor Petersson
I mean, you can literally write one in Bash versus many other monitoring frameworks that I worked with in the past.
[29:32] Viktor Petersson
And I guess that, combined with the modularity of remote write, which is almost recursive, because you use the same mechanism to push to it as you do to push further out in the federation.
[29:46] Viktor Petersson
I find that a beautiful implementation.
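For readers who haven't seen one, this is roughly what such a /metrics feed looks like in the Prometheus text exposition format (the metric names and values are illustrative):

```
# HELP app_requests_total Total number of HTTP requests handled.
# TYPE app_requests_total counter
app_requests_total{method="get",status="200"} 1027
app_requests_total{method="post",status="500"} 3
# HELP app_temperature_celsius Current temperature reading.
# TYPE app_temperature_celsius gauge
app_temperature_celsius 21.4
```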
[29:51] Viktor Petersson
So, so far, what is the most interesting use case you've seen of Prometheus, or the most outside-the-box use case of Prometheus?
[29:59] Viktor Petersson
You spoke a bit about monitoring your home, but I'm curious, like what's the most crazy use case you've seen so far?
[30:07] Julius Volz
Yeah, the ones that I always find the coolest or most interesting are when I hear someone give a talk somewhere, or tell me about physical use cases.
[30:17] Julius Volz
So maybe I'll just mention like three or four different examples.
[30:21] Julius Volz
One was at PromCon.
[30:23] Julius Volz
So we also have a Prometheus conference, PromCon.
[30:26] Julius Volz
It was a talk about someone monitoring a big wind power park with Prometheus.
[30:31] Julius Volz
So he could even have the rotational angle of all the different, what do you call them, windmills or whatever they are.
[30:39] Julius Volz
Yeah, as a metric.
[30:41] Julius Volz
So like the angle as a metric, the wattage as a metric and everything.
[30:46] Julius Volz
So that was cool.
[30:47] Julius Volz
I heard from a big container shipping company, who shall remain unnamed, that they have Prometheus running on all of their vessels, and it reports home telemetry about the ship, about where it is and what it's doing.
[31:02] Julius Volz
That I found really cool.
[31:06] Julius Volz
One guy from Accenture once gave a meetup talk about the train system in Germany, the Deutsche Bahn, where you have all these displays, signage systems telling you when the next train is coming and which one it is.
[31:20] Julius Volz
There were almost 100,000 or so of those in Germany in the different train platforms.
[31:26] Julius Volz
And they had basically installed the node exporter or something akin to it in each of these.
[31:31] Julius Volz
And they were monitoring them remotely using Prometheus.
[31:35] Julius Volz
So that's kind of cool because then you're on a train platform somewhere and you're like, oh cool.
[31:39] Julius Volz
That's basically where my software is running, and you can show that.
[31:44] Julius Volz
So I really like those because, you know, all the data center use cases, they're cool.
[31:49] Julius Volz
Of course that's what Prometheus was made for.
[31:52] Julius Volz
But they're kind of the usual thing.
[31:55] Julius Volz
Maybe one tiny use case that someone also reported is they had mold in their bathroom at home and now basically they wanted to monitor the conditions to not get mold again.
[32:09] Julius Volz
So they just installed Prometheus on a little Raspberry Pi or something at home and monitored temperature, humidity, and air pressure, I think.
[32:18] Julius Volz
And then you can calculate in PromQL whether the dew point is reached, and send an alert in case that happens.
[32:24] Julius Volz
And then you would open the windows or something.
[32:26] Julius Volz
Right.
[32:27] Julius Volz
And so it really scales down to those really tiny use cases as well.
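A hedged sketch of what such a dew point condition could look like in PromQL, using the simple rule-of-thumb approximation that the dew point is about temperature minus (100 minus relative humidity) divided by 5, reasonably accurate above roughly 50% humidity; the sensor metric name is hypothetical:

```
# The gap between room temperature and the approximate dew point is (100 - RH) / 5,
# so alert when that gap drops below 2 degrees Celsius (condensation risk):
(100 - sensor_humidity_percent) / 5 < 2
```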
[32:34] Viktor Petersson
That's really cool.
[32:36] Viktor Petersson
So we covered quite a lot about PromQL, but maybe it warrants a bit of a deeper intro to what PromQL is as it's such an important building block in Prometheus.
[32:47] Viktor Petersson
And I'm not sure, do you want to speak about the background of it, like the history of it?
[32:55] Viktor Petersson
My understanding is it really derived from kind of reverse-engineering something that Google had, in a sense.
[33:00] Viktor Petersson
But maybe you want to speak a bit about the design and implementation, and perhaps do a quick demo of how it works, if that's...
[33:08] Julius Volz
Yeah, could do that.
[33:09] Julius Volz
So PromQL.
[33:10] Julius Volz
Yeah, it's the one unifying query language in the Prometheus ecosystem that you know, allows you to do dashboarding, alerting, debugging, other use cases as well.
[33:20] Julius Volz
Automation for example.
[33:23] Julius Volz
And it is a non-SQL-style query language where you really just write a functional expression: some function that might take other parameters that are, in turn,
[33:36] Julius Volz
again an expression that could take other parameters.
[33:38] Julius Volz
So it could form like an arbitrarily deep tree.
[33:41] Julius Volz
But eventually if you evaluate that tree, you'll get some kind of result so you can select some data.
[33:48] Julius Volz
Then you might.
[33:48] Julius Volz
If it's counters, then you could take the rate of increase around it and then maybe you would want to sum up all the individual rates but keep certain dimensions in the sum.
[33:59] Julius Volz
So we still group by the path but get rid of all the other, the instance-level, detail.
[34:04] Julius Volz
For example, you might have a whole set of error rates divided by a whole set of total rates, correlated on identical label sets, so you can get a whole list of error rate ratios out and then alert on that.
[34:22] Julius Volz
You can filter.
[34:24] Julius Volz
You have a bunch of different functions basically to compute different things.
[34:28] Julius Volz
Very geared around systems monitoring.
[34:31] Julius Volz
So not like deep machine learning statistics or so, but really quick functions that behave pretty simply to give you right now answers about your infrastructure and whether you should alert someone, for example.
[34:45] Julius Volz
And yeah, I mean, if you like, I could try sharing my screen and just showing you what the language looks like.
[34:53] Viktor Petersson
Let's do it.
[34:54] Julius Volz
Let's try that.
[34:55] Julius Volz
Okay, so I have this.
[34:59] Julius Volz
Let's see.
[35:00] Julius Volz
I have.
[35:02] Julius Volz
Should I share?
[35:03] Julius Volz
Yeah, I'll share the entire screen here.
[35:06] Julius Volz
Of course it only shows me.
[35:08] Julius Volz
Okay, I will do the entire screen because otherwise it does not show me the windows that I have on a different virtual desktop.
[35:15] Julius Volz
So let me know once you can see stuff.
[35:23] Viktor Petersson
I can see it.
[35:23] Julius Volz
I'll just maybe zoom in a little bit more even here.
[35:27] Julius Volz
So this is PromLens, one of the possible interfaces for dealing with PromQL.
[35:33] Julius Volz
Basically what you have here is you have an expression input box and you can, you know, first of all you can select some data, of course.
[35:41] Julius Volz
Now, okay, let me reload that.
[35:49] Julius Volz
So for example, some HTTP level counter metrics counting for three different demo service instances, which process they come from, which group of processes.
[36:01] Julius Volz
This is the job.
[36:03] Julius Volz
The method for which an HTTP request was handled, the path on which it happened, the status code that was the result of the request.
[36:12] Julius Volz
So more for the response status code, I guess.
[36:15] Julius Volz
And these are raw counter values.
[36:18] Julius Volz
So if you just graph them like this, they don't look too useful because these only go up basically starting from whenever this demo service was started.
[36:28] Julius Volz
So they start at zero and then they just go up.
[36:31] Julius Volz
So what you want to know is you want to see like how fast do they go up.
[36:35] Julius Volz
And I could add a rate function around this here.
[36:38] Julius Volz
Not going too much in detail now about how this works in detail.
[36:43] Julius Volz
But then you will see at every point in the graph averaged over a 5 minutes window.
[36:49] Julius Volz
What's the actual per second request rate for the different dimensional combinations Here you can see if I hover over this.
[36:57] Julius Volz
This is method get path API bar status 200 and you get that for the different other ones as well.
[37:04] Julius Volz
If we wanted to, we could now sum this and only keep the path dimension.
[37:09] Julius Volz
For example now we would have it only for every path instead of all the full details.
[37:14] Julius Volz
This is the summed up rates, aggregated rates.
[37:18] Julius Volz
I could still break out the method if I wanted to.
[37:23] Julius Volz
And since I'm using PromLens, it's showing me my textual query as a kind of laid-out tree here as well, where I can see how many results I have in each of the subnodes of my query.
[37:35] Julius Volz
Right?
[37:35] Julius Volz
This is like a full, could be a first class expression by itself.
[37:41] Julius Volz
This demo_api_request_duration_seconds_count metric name, and then this one, the rate call, also produces 27 results.
[37:50] Julius Volz
But those 27 results get shrunken down by the aggregation to only 5 results because we only have 5 path and method combinations that we end up with here, down here.
[38:02] Julius Volz
Now one further thing we could do is we could say, hey, let's only select the ones that ended with a 500 so errors, right?
[38:13] Julius Volz
This would be like error rates by path and method combination.
[38:19] Julius Volz
And now we could correlate those error rates for each path and method, which we can also see here as a table as current values.
[38:29] Julius Volz
We could correlate those to the total rates we get for the same dimensional combinations of method and path.
[38:36] Julius Volz
So now I could say, hey, divide this entire thing by basically the same thing, except below the division I'm not going to select only the errors, I'm going to select all the requests, to arrive at a total sum.
[38:53] Julius Volz
And I can indent this a little bit better here.
[38:56] Julius Volz
And yeah, now what happens is that this binary operator automatically does a join on identical label sets between the first and second operand.
[39:05] Julius Volz
We can also show that here how the matching is happening.
[39:08] Julius Volz
We see that some will not actually find matching label combinations that have 500 errors.
[39:15] Julius Volz
So those will not produce an output.
[39:19] Julius Volz
And yeah, by default that's what happens.
[39:21] Julius Volz
You can customize these binary operations to match on a subset of labels, to allow many-to-one or one-to-many matching.
[39:27] Julius Volz
I won't do that now.
[39:30] Julius Volz
So what we're getting now is basically ratios of how many errors there are.
[39:35] Julius Volz
I could multiply those by 100 to get percentages.
[39:40] Julius Volz
So we have 0.3% of errors in this first combination, for example, 0.7 in the second.
[39:48] Julius Volz
And now I could extend this to some kind of alerting rule where I say, hey, give me all the ones that are larger than 0.5%.
[39:57] Julius Volz
And so I just filtered this entire list of output time series down to the ones that have a sample value of larger than 0.5%.
[40:06] Julius Volz
And now I would get exactly the problematic label combination.
[40:11] Julius Volz
This is just one example of what you can do with PromQL.
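Pieced together, the query built up in this demo looks roughly like the following; the metric name is the one from Julius's demo service, but treat the exact expression as a reconstruction:

```
# Percentage of 500-status requests per path/method, filtered to ratios above 0.5%:
100 * (
    sum by (path, method) (rate(demo_api_request_duration_seconds_count{status="500"}[5m]))
  /
    sum by (path, method) (rate(demo_api_request_duration_seconds_count[5m]))
) > 0.5
```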
[40:13] Julius Volz
There's histogram stuff in there, there's other stuff, but it's really just working with this label-based data model. And yeah, if you want to learn more about PromQL and all the different constructs you can write, I also created a cheat sheet at promlabs.com, the PromQL cheat sheet.
[40:32] Julius Volz
You can find it also here under the resources.
[40:36] Julius Volz
And here I try to just list all the major patterns that people usually write when doing PromQL.
[40:44] Julius Volz
So this won't be everything, but you can open each of these patterns in PromLens by just clicking a button here, and then you have some demo data to actually work with, and you can play with the expression and change it and see what it does.
[40:56] Julius Volz
You can also let the different nodes get explained to you here what they actually do.
[41:04] Julius Volz
You get the explanation down here.
[41:07] Julius Volz
So yeah, that's just like a super duper brief intro to what PromQL looks like.
[41:13] Julius Volz
And yeah, importantly and I'll stop sharing here, it's only used for reading data.
[41:20] Julius Volz
So if you want to write data into Prometheus or delete some data that happens over totally different paths.
[41:26] Julius Volz
Like Prometheus writing into the TSDB after it has collected some data, or data being deleted once it has become too old, depending on your retention policies.
[41:37] Julius Volz
But PromQL itself always just gives you results back.
[41:42] Julius Volz
It's a read only language and really flexible at giving you precise answers and conditions.
[41:50] Viktor Petersson
So one thing, I mean, just a very simple question, I guess, around PromQL; maybe you can just explain it quickly. When you look at demo examples of Prometheus, particularly in the Stack Overflow forums and so on.
[42:05] Viktor Petersson
The most common functions are rate and irate.
[42:08] Viktor Petersson
Do you want to just help explain the difference between the two just quickly for.
[42:13] Viktor Petersson
Yeah, maybe people like myself.
[42:15] Julius Volz
Maybe I should also say: if you go to my YouTube channel, I think PromLabs is the channel name on YouTube,
[42:22] Julius Volz
I have a whole video about the three types of different rates and also the increase function that explains the exact differences.
[42:30] Julius Volz
Basically there are three different functions for telling how fast the counter is going up and those are rate, irate and increase.
[42:39] Julius Volz
The rate function gives you.
[42:42] Julius Volz
Maybe I should share my screen again, because it's so graphically demonstrable.
[42:48] Julius Volz
That totally makes sense.
[42:50] Julius Volz
So I'll just do that again.
[42:51] Julius Volz
So let me know again once you can see stuff.
[42:57] Julius Volz
So now I'm just in a vanilla Prometheus server expression entry interface.
[43:02] Julius Volz
So let's say I do the rate over these request counter metrics that I used earlier.
[43:11] Julius Volz
What we're doing now is at every resolution step in the graph we are basically selecting five minutes of past data.
[43:20] Julius Volz
So if we're producing like this output point in the graph, we're looking five minutes backwards.
[43:24] Julius Volz
We're taking all the raw data, the only-increasing counters, and we are kind of calculating an average of how fast per second the counter, or each of these counters, is increasing within that window.
[43:37] Julius Volz
And that per second increase value becomes the output point for this step in the graph.
[43:44] Julius Volz
And I can make these resolution steps a little bit more visible here.
[43:47] Julius Volz
If I set the resolution to five minutes, then we get a lot of explicitly visible resolution steps along the graph.
[43:55] Julius Volz
So you can really imagine that at every one of these steps we look back the amount of time that I specify in the expression.
[44:03] Julius Volz
And yeah, in this case this would be an entire step because this is five minutes and this is five minutes.
[44:11] Julius Volz
And it just summarizes that entire step's samples into an average per-second value.
[44:18] Julius Volz
Now what happens if I make this window a little bit smaller, but also I will go back to the original resolution and start with five minutes.
[44:30] Julius Volz
So five minutes, I'm smoothing over five minutes so it will be relatively smooth.
[44:36] Julius Volz
And if I make this averaging window smaller, like one minute, I get more spiky rates because now I'm averaging over fewer samples.
[44:45] Julius Volz
Same if I do 30 seconds; then it gets really spiky.
[44:49] Julius Volz
And I only scrape data every 15 seconds in this Prometheus server.
[44:53] Julius Volz
So eventually I even have to be careful that I don't make the window too small, because otherwise I will not always at every point be lucky enough to actually span two samples with my averaging window.
[45:09] Julius Volz
Right.
[45:09] Julius Volz
If they're 15 seconds apart, it could be that there are not two of them under a 20 seconds window.
[45:15] Julius Volz
This gets exacerbated if I do a 16 seconds window.
[45:19] Julius Volz
Then only sometimes are there actually going to be two samples under the rate window, to be able to even compute a rate.
[45:25] Julius Volz
And the graph will mostly be gappy.
[45:28] Julius Volz
So you do have to choose your window large enough to at least robustly select two samples.
[45:36] Julius Volz
But what if you wanted to say something like: hey, I want my graph to behave as spiky as possible, reacting as much as possible to the two most recent data points, while still not having to care too much about how small exactly I choose this rate window?
[45:56] Julius Volz
So that's the irate function, or also called instant rate function.
[46:01] Julius Volz
It also gives you a per second rate of increase.
[46:05] Julius Volz
But under this provided window here, it will only ever choose the latest two samples and see how much per second the counter increases between them.
[46:17] Julius Volz
So it doesn't matter anymore how large or small you make this here you will always get the same result providing I anchor it to a constant time here.
[46:27] Julius Volz
So one hour versus one minute it will look the same as long as you make this window large enough to always cover two data points.
[46:36] Julius Volz
Again, if I make it too small, you will run into issues again.
[46:39] Julius Volz
But that's basically what the irate function is about.
[46:43] Julius Volz
Normally I would not recommend using it unless you have a specific reason, because it will skip over most of the data in your range interval. You might have five minutes' worth of data in there, but now you're only looking at the latest two data points to give you some answer, and they might not really be representative.
[47:04] Julius Volz
And so usually people use rate.
[47:07] Julius Volz
It's good to average or smooth over some amount of time.
[47:10] Julius Volz
Not too long maybe.
[47:13] Julius Volz
But if you are in a super zoomed-in graph like this and you want to see really the latest developments, then maybe irate can make sense at times, right?
[47:25] Julius Volz
But yeah, use it sparingly.
[47:27] Julius Volz
And then a last function called increase is basically identical to rate except that it does not convert the output unit to per second.
[47:40] Julius Volz
So the shape of the graph, if I go back to the one hour graph, let's say, or let's go to two hours, the shape will look identical with rate and increase here.
[47:51] Julius Volz
The only difference is the Y axis.
[47:54] Julius Volz
So here we get a per second Y axis and if I go to increase with the same window now we get increases not per second but per one minute because I have a one minute window here.
[48:07] Julius Volz
Or if I do 15m, then we would have increases per 15 minutes.
[48:13] Julius Volz
And so you see again, but this kind of intermingles two aspects.
[48:18] Julius Volz
We intermingle the smoothing window size so how smooth the graph will look and the output unit.
[48:26] Julius Volz
So if I change this window, I will both change the smoothing and what unit I actually output, which at times might be actually what you want, right?
[48:35] Julius Volz
What's the total increase over one day or so?
[48:38] Julius Volz
Then you don't have to multiply it in the end with the number of seconds in a day, but at other times that's not actually what you want.
[48:45] Julius Volz
So most of the time by default the rate function is actually great.
[48:50] Julius Volz
It gives you a predictable per second output unit and that's great if you then want to combine it and divide it and so on with other constructs in your query.
[49:02] Julius Volz
If you keep everything at these base units, everything will stay more predictable and yeah, so that's why most of the time you will see just the rate function.
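Side by side, the three counter functions discussed here (the metric name is the one from the demo above; the window sizes are illustrative):

```
rate(demo_api_request_duration_seconds_count[5m])       # average per-second increase over the 5m window
irate(demo_api_request_duration_seconds_count[5m])      # per-second increase between the last two samples in the window
increase(demo_api_request_duration_seconds_count[15m])  # total increase over the 15m window (same shape as rate, different unit)
```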
[49:13] Viktor Petersson
Got it.
[49:13] Viktor Petersson
Okay, that's really cool.
[49:15] Julius Volz
Any other PromQL questions?
[49:17] Julius Volz
Or about anything on the screen?
[49:20] Viktor Petersson
Yeah, I think rate, I would presume, is your starting point for a lot of metrics when you're starting out. I guess that is the function that you start using most commonly.
[49:36] Julius Volz
Most common, yeah.
[49:38] Julius Volz
Rate and then summing of rates I would say is one of the most common things you will see.
[49:42] Julius Volz
Yeah, cool.
[49:44] Viktor Petersson
No, I think that's a good.
[49:45] Viktor Petersson
That's a good crash course and quick introduction.
[49:47] Julius Volz
Nice.
[49:48] Viktor Petersson
Cool.
[49:49] Viktor Petersson
Let's turn our eyes to the future a little bit and I'm curious about a few things to pick your brains on.
[49:56] Viktor Petersson
The first one is eBPF, which is all the rage these days in observability.
[50:03] Viktor Petersson
Is there any play for Prometheus to tap into eBPF for, I guess, collecting metrics with less resource overhead?
[50:13] Viktor Petersson
Or how do you see that changing observability in Prometheus?
[50:17] Julius Volz
Yeah, so myself, I'm not a big eBPF expert. Of course, it's an awesome kernel-level interface that allows you to tap into pretty much any place in the kernel and extract information or do stuff, whatever you want, in there.
[50:31] Julius Volz
There are eBPF exporters for Prometheus, so if you just Google "ebpf exporter", you will find one by Cloudflare, for example.
[50:40] Julius Volz
And yeah, you can configure it in different ways to give you all kinds of host metrics.
[50:46] Julius Volz
And I would actually also have to read the detailed documentation about what exactly it can do.
[50:53] Julius Volz
But yeah, basically it will enable you to collect metrics about all kinds of things you can, you know, instrument with eBPF.
[51:03] Viktor Petersson
You don't anticipate a world where an eBPF-based node exporter would replace the regular Go-based node exporter in the future?
[51:12] Julius Volz
It's a good question.
[51:13] Julius Volz
I mean, at the moment the question is what that would give us, if we could get at stats that are not exposed using the current interfaces, which are fine: the /proc filesystem, the /sys filesystem, and certain syscalls.
[51:31] Julius Volz
It could be that there are certain stats in the Linux kernel that we would also want to expose that would be possible in that way.
[51:39] Julius Volz
Let me just see if anyone has already added anything eBPF-wise to the node exporter.
[51:45] Julius Volz
You never know because there's so many different Prometheus components these days.
[51:50] Viktor Petersson
Right?
[51:51] Julius Volz
I don't find anything eBPF, at least module-wise, in the node exporter yet.
[51:58] Julius Volz
But of course it makes sense, because these are also node-level metrics, that you might at some point want to put something like that directly into the node exporter, and until that time run the separate eBPF exporter.
[52:12] Julius Volz
But yeah, I could guess at the upsides: being able to get at metrics that you don't otherwise have, being able to get at them without writing as much custom parsing code, because it's a bit annoying to parse those proc filesystem files sometimes, and also maybe getting at them more efficiently.
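As a tiny illustration of the hand-rolled parsing Julius means, here is a Go sketch that reads the 1-minute load average out of /proc/loadavg; the field layout is based on the documented format ("0.42 0.35 0.30 1/234 5678"), and the metric name in the output is just for flavor:

    package main

    import (
        "fmt"
        "os"
        "strconv"
        "strings"
    )

    func main() {
        data, err := os.ReadFile("/proc/loadavg")
        if err != nil {
            panic(err)
        }
        fields := strings.Fields(string(data))
        // First whitespace-separated field is the 1-minute load average.
        load1, err := strconv.ParseFloat(fields[0], 64)
        if err != nil {
            panic(err)
        }
        fmt.Printf("node_load1 %g\n", load1)
    }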
[52:35] Julius Volz
But so far it has not happened in the node exporter. I'd be totally for it, if it makes sense.
[52:43] Viktor Petersson
I'm just thinking you could increase the scraping frequency, for instance, to get metrics every second instead of every 15 seconds, perhaps without taking the...
[52:52] Julius Volz
I mean, most of the time the scraping frequency is not really bottlenecked on the producer of the metrics but more on the Prometheus server being able to ingest a lot of data.
[53:05] Julius Volz
It's fairly good at that, but that's still usually the bottleneck.
[53:11] Julius Volz
But yeah, there are some modules in the node exporter that take a little bit more time to produce data, some of which I think are also switched off by default.
[53:22] Julius Volz
So it could be that...
[53:24] Julius Volz
I'm not sure if there's something in there that could be made way more efficient by eBPF, because I hope the native kernel interfaces that expose it are already relatively efficient.
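For reference, the collectors Julius mentions as switched off by default are toggled with node exporter flags; to my understanding the processes collector is one example of a disabled-by-default collector, so enabling it looks roughly like:

    ./node_exporter --collector.processes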
[53:40] Viktor Petersson
Fair enough.
[53:43] Viktor Petersson
OpenTelemetry seems to be kind of taking, well, taking over observability at a pretty high pace right now.
[53:52] Viktor Petersson
It's gaining a lot of momentum.
[53:55] Viktor Petersson
Obviously Prometheus is kind of a building block in that.
[53:58] Viktor Petersson
How do you see OpenTelemetry as the observability stack of the future for cloud-native workloads, or how do you envision that?
[54:11] Julius Volz
Yeah, I mean, so OpenTelemetry came from a little bit of a different corner originally.
[54:17] Julius Volz
Very tracing-focused, with OpenTracing and OpenCensus then merging into OpenTelemetry.
[54:24] Julius Volz
So that was their prime, kind of initial use case, the one that was most stable.
[54:28] Julius Volz
And then there were logs and metrics, with metrics now also being more stable and people adopting it.
[54:34] Julius Volz
And they focus a lot on the instrumentation API aspect, like standardizing the instrumentation API.
[54:44] Julius Volz
What should it look like to instrument your application and what metadata should there be in certain interfaces for metrics and so on?
[54:52] Julius Volz
And in a lot of cases those are similar to what we do in Prometheus, but in some other cases they're a bit different.
[55:01] Julius Volz
They have slightly different metric types, and they have slightly different allowed character sets, especially in label names; those can be arbitrary UTF-8 characters, if I'm not mistaken.
[55:14] Julius Volz
Or at least we want to allow that now in Prometheus, as a result of wanting more OpenTelemetry compatibility.
[55:22] Julius Volz
And they also have different standardizations of what labels you always want to collect about where a given metric came from: the resource labels.
[55:36] Julius Volz
And in Prometheus we have something similar called target labels that Prometheus attaches to wherever it pulls metrics from.
[55:44] Julius Volz
But they're slightly different; in Prometheus they're more admin-configurable and less standardized.
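A sketch of what such admin-configured target labels look like in a scrape config; the label names and values here are made up:

    # prometheus.yml (fragment)
    scrape_configs:
      - job_name: "api"
        static_configs:
          - targets: ["10.0.0.5:8080"]
            labels:
              env: "prod"      # admin-chosen target labels, attached
              team: "payments" # to everything scraped from this job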
[55:51] Julius Volz
And so now there are efforts in the Prometheus team, wanting, of course, to still be relevant in this world.
[56:00] Julius Volz
And people want to use Prometheus as an OTel metrics store.
[56:05] Julius Volz
Prometheus will still only do the metrics part and not logging or tracing, but at least for the metrics part.
[56:11] Julius Volz
If you're using OpenTelemetry to instrument your apps, then we want to be able to store OTel metrics in a more native and better way.
[56:23] Julius Volz
You can already send OTel metrics to Prometheus in an experimental way.
[56:28] Julius Volz
There's an experimental OTLP receiver in there, and there are different ways of making it work.
[56:34] Julius Volz
You can either send the OTLP data directly to Prometheus using the new experimental interface, or you could bridge it earlier on into the Prometheus-native remote write format and send that to Prometheus.
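For the direct path, a sketch of how the experimental receiver is enabled; the feature flag and endpoint below match the Prometheus docs around the time of this recording, but treat them as assumptions and check your version:

    # Start Prometheus with the experimental OTLP receiver enabled.
    prometheus --enable-feature=otlp-write-receiver

    # OTel SDKs / collectors can then POST OTLP metrics to:
    #   http://localhost:9090/api/v1/otlp/v1/metrics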
[56:47] Julius Volz
But in Prometheus we want to extend the data model somehow to allow more characters, so we are able to store more of the data model that OTel has.
[57:00] Julius Volz
For metrics, we're looking at ways of supporting what they call, what's it called, delta temporality.
[57:11] Julius Volz
That's where you either send absolute counters that only increase over all of time, or you only send deltas from one push to the next.
[57:21] Julius Volz
And in a push-based monitoring system that's easier, because your process counts things from one push to the next and then resets the counter to zero, so at every push it only sends the delta that has happened between two pushes.
[57:37] Julius Volz
Right?
[57:39] Julius Volz
And the problem with a pull-based monitoring system like Prometheus is that pulls are supposed to be idempotent, and any service process that is being monitored by Prometheus could be monitored by either zero or many Prometheus servers at the same time.
[57:57] Julius Volz
So any scrape from a Prometheus server shouldn't just reset whatever the current count is.
[58:03] Julius Volz
So we have to add an additional little translation layer in there to potentially transform these delta temporality counters into absolute counters again, or support them natively, or something like this.
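A minimal Go sketch of what such a translation layer does conceptually, folding delta samples into absolute counters so that any number of pullers can scrape idempotently; this is an illustration, not the actual Prometheus implementation:

    package main

    import "fmt"

    // cumulative holds a running absolute counter per series, so delta
    // samples arriving via push can be exposed to any number of pullers
    // as ever-increasing counters.
    var cumulative = map[string]float64{}

    // ingestDelta folds one delta sample into the absolute counter.
    func ingestDelta(series string, delta float64) {
        cumulative[series] += delta
    }

    func main() {
        ingestDelta(`http_requests_total{path="/"}`, 3)
        ingestDelta(`http_requests_total{path="/"}`, 5)
        fmt.Println(cumulative[`http_requests_total{path="/"}`]) // 8
    }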
[58:23] Julius Volz
So this is still being hashed out and there's a couple of other things around mapping these resource and target labels.
[58:29] Julius Volz
Like, do we always take this huge amount of OTel labels and just put them onto the Prometheus metric?
[58:35] Julius Volz
That would get super messy.
[58:37] Julius Volz
Or do we let users configure which ones to map over?
[58:42] Julius Volz
Or do we always just put them into a separate metric that you then have to join in?
[58:47] Julius Volz
That's also annoying.
[58:48] Julius Volz
So there are definitely some little incompatibilities. You know, I would still recommend that people use native Prometheus instrumentation.
[58:59] Julius Volz
It's going to be way less code and it's going to be faster and simpler.
[59:04] Julius Volz
But if you do want to use OTel, if you're just interested in the metrics use case and you still want to use OTel, then we are going to keep working on making that better.
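To give a sense of how little code native instrumentation takes, here is a sketch with the official Go client; the metric name and port are placeholders:

    package main

    import (
        "net/http"

        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )

    // A single counter, registered once and incremented per request.
    var requestsTotal = prometheus.NewCounter(prometheus.CounterOpts{
        Name: "myapp_requests_total",
        Help: "Total requests handled.",
    })

    func main() {
        prometheus.MustRegister(requestsTotal)
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            requestsTotal.Inc()
            w.Write([]byte("ok"))
        })
        // Prometheus scrapes this endpoint.
        http.Handle("/metrics", promhttp.Handler())
        http.ListenAndServe(":8080", nil)
    }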
[59:16] Viktor Petersson
Okay, makes sense.
[59:17] Viktor Petersson
Yeah, I guess that's fundamentally a different thing between push and pull.
[59:20] Viktor Petersson
I mean, at least in our use case at Screenly, that was a bit of an inconvenience.
[59:26] Viktor Petersson
When you work with edge devices, for instance, you can't scrape them because obviously they're not in your data center.
[59:32] Viktor Petersson
So the push model there makes a lot more sense for our use case.
[59:35] Viktor Petersson
But I know Prometheus heavily favors pull versus push.
[59:40] Julius Volz
Yep.
[59:41] Julius Volz
For different reasons.
[59:42] Viktor Petersson
But yeah, absolutely, absolutely.
[59:44] Viktor Petersson
But I'm saying that might push that push use case a bit further, which we would welcome.
[59:52] Julius Volz
Makes sense for anyone with super highly segmented networks, or end devices at customers and so on. Polling is really great if you can reach everything easily in your own data center and so on.
[01:00:04] Julius Volz
But yeah, it gets harder as you break things up.
[01:00:08] Viktor Petersson
Absolutely, absolutely.
[01:00:09] Viktor Petersson
There's another project that I know you've been touching upon a little bit as well.
[01:00:13] Viktor Petersson
It's OpenMetrics.
[01:00:15] Viktor Petersson
Do you want to speak a bit about that and how that kind of...
[01:00:18] Viktor Petersson
There's a lot of "open" here, but...
[01:00:21] Julius Volz
How that relates to the whole thing? OpenMetrics was an idea of actually taking what we had as the Prometheus metrics text transfer protocol and just standardizing it more,
[01:00:34] Julius Volz
with some changes requested by other parties and stakeholders.
[01:00:39] Julius Volz
So this was an initiative started by Richard Hartmann of the Prometheus project, and he got others together from other companies and said, hey, let's do an RFC Internet standard.
[01:00:48] Julius Volz
And it made it pretty far, and it's also supported by Prometheus, but by now it was basically decided by the Prometheus team and the OpenMetrics team:
[01:01:01] Julius Volz
hey, we still want to do some things differently, so let's just actually merge OpenMetrics back into Prometheus again and not have it as a separate project anymore.
[01:01:12] Julius Volz
So it's just going to be archived in the CNCF. And it still, I think, explored a lot of the needs: if we want to standardize a metrics transfer protocol, or evolve Prometheus's format further, what do other parties actually want in such a thing?
[01:01:33] Julius Volz
And so it did help, but to my knowledge it will no longer exist as a separate project.
[01:01:41] Viktor Petersson
Okay, but it sounds like you made headway, in that you can now push Prometheus-style metrics to most cloud vendors. I guess it sounds like that kind of paved the way for that.
[01:01:54] Julius Volz
Yeah, that's a different thing, because with the OpenMetrics format, you have the Prometheus server and you have your monitored target.
[01:02:02] Julius Volz
That's what happens between those two.
[01:02:04] Julius Volz
And then the protocol that we use to speak to the outside world, that's the remote write protocol.
[01:02:08] Julius Volz
So that's a totally different one that is independent of the scrape format.
[01:02:15] Julius Volz
Right.
[01:02:15] Julius Volz
It has slightly different needs as well because you will want to be able to send multiple samples from multiple scrapes for many more different time series over this remote write protocol.
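Configuring that outbound side is a one-stanza affair; a minimal sketch with a hypothetical long-term storage endpoint:

    # prometheus.yml (fragment)
    remote_write:
      - url: "https://longterm-store.example.com/api/v1/write"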
[01:02:27] Julius Volz
Whereas the pull-based OpenMetrics format, or the previous text-based format on which it is based, is really for going to a single monitored process or target and pulling its current metric state.
[01:02:43] Julius Volz
Just the one number for each metric at this point in time.
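For comparison, this is roughly what that pulled metric state looks like in the text exposition format on which OpenMetrics was based; the metric is again a stock example:

    # HELP http_requests_total Total HTTP requests served.
    # TYPE http_requests_total counter
    http_requests_total{method="GET",path="/"} 1027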
[01:02:47] Viktor Petersson
Okay, that's perfect.
[01:02:50] Viktor Petersson
I think we're up on time.
[01:02:51] Viktor Petersson
But it's been very informative and helpful for me, and I think that's really good.
[01:02:57] Viktor Petersson
And do you want to do a quick shout-out? If somebody needs consulting services around Prometheus, where can they get in touch with you, and how can they...
[01:03:05] Viktor Petersson
Yeah, how can they...
[01:03:06] Julius Volz
Oh yeah.
[01:03:07] Viktor Petersson
Take their Prometheus game to the next level?
[01:03:09] Julius Volz
I'd love to.
[01:03:09] Julius Volz
So these days I mostly do training around Prometheus, no longer just, you know, per-hour consulting.
[01:03:16] Julius Volz
I have two main things I do besides other partnerships and still open source development.
[01:03:21] Julius Volz
So those are live trainings around the PromQL query language, where I really explain, over a multi-hour session, kind of with a setup like this one right now, with screen sharing and material and everything, how PromQL works, how you can use it, and all the different pitfalls, with exercises and so on.
[01:03:41] Julius Volz
So I do those, you know, quite often with teams in companies.
[01:03:46] Julius Volz
And the other thing is self-paced courses. You can find me at promlabs.com, where I think at the very top there's a live trainings link, and there's another link on the side for the self-paced courses that leads you to training.promlabs.com; those are training modules.
[01:04:05] Julius Volz
So courses that you can either buy as an individual or you can buy for your entire engineering team as a company to learn all the basics of Prometheus, like from what is Prometheus to all the advanced PromQL stuff and alerting and dashboarding and node exporter metrics and remote storage and all that.
[01:04:26] Julius Volz
So it's textual and images and screenshots and quizzes and interactive stuff where you set up all these components and string them together and build setups.
[01:04:37] Julius Volz
So it's not video content.
[01:04:39] Julius Volz
But that actually makes it possible for me to keep it up to date all the time, which would be really hard with videos.
[01:04:46] Julius Volz
So yeah, individuals buy them and companies do as well.
[01:04:50] Julius Volz
And of course I can recommend them.
[01:04:53] Viktor Petersson
Perfect.
[01:04:54] Viktor Petersson
Thank you so much for your time today, Julius.
[01:04:57] Viktor Petersson
It was a great pleasure having you on the show, and I'll talk to you soon.
[01:05:00] Julius Volz
Alrighty.
[01:05:00] Viktor Petersson
Thank you so much.
[01:05:01] Julius Volz
See you.
[01:05:01] Viktor Petersson
Thank you.
[01:05:02] Julius Volz
Bye.

Found an error or typo? File a PR against this file or the transcript.