Join Viktor, a proud nerd and seasoned entrepreneur, whose academic journey at Santa Clara University in Silicon Valley sparked a career marked by innovation and foresight. From his college days, Viktor embarked on an entrepreneurial path, beginning with YippieMove, a groundbreaking email migration service, and continuing with a series of bootstrapped ventures.

A deep dive into the SBOM format SPDX

16 JAN • 2025 50 mins

In this episode, I spoke with Kate Stewart from the Linux Foundation and Gary O’Neall, a long-time SPDX contributor, about the evolution of SPDX and its role in software transparency. We discussed how SPDX started as a tool for tracking open-source license compliance and grew to address broader needs in security and vulnerability management.

Kate and Gary walked through the technical challenges teams face when generating accurate SBOMs, including handling circular dependencies and dealing with uncertainty in software components. They shared practical examples from their work with various organizations and explained how these challenges influenced the development of SPDX tools and specifications.

We explored current efforts to integrate SBOM generation into build systems, looking at specific examples from the Zephyr and Yocto projects. The discussion covered ongoing work to implement build-time SBOM generation for the Linux kernel, highlighting both the technical approach and its practical benefits for development teams.

The conversation then turned to the growing regulatory requirements around SBOMs, particularly in safety-critical systems. Kate and Gary explained how SPDX 3.0 is being developed to handle these requirements while supporting modern CI/CD pipelines. They described the technical considerations behind maintaining compatibility with existing tools while adding features for new use cases.

SPDX remains an open, community-driven project that continues to evolve with industry needs. Whether you’re dealing with compliance, security vulnerabilities, or supply chain transparency, this episode provides concrete insights into how SPDX can help address these challenges in your software development workflow.

Transcript

[00:05] Viktor Petersson
Welcome back to another episode of Nerding Out with Viktor.
[00:08] Viktor Petersson
Today we're doing another SBOM episode, and we get to cover SPDX.
[00:12] Viktor Petersson
So with me today I have Kate Stewart and Gary O'Neall.
[00:15] Viktor Petersson
Welcome, both of you.
[00:17] Gary O’Neall
Thank you.
[00:18] Gary O’Neall
Thanks for having us.
[00:20] Viktor Petersson
Pleasure to have you on the show, Gary.
[00:22] Viktor Petersson
We've been corresponding quite a lot over the last few months.
[00:24] Viktor Petersson
We are on some working groups together.
[00:26] Viktor Petersson
So before we dive into the conversation, let's start with a round of introductions.
[00:33] Viktor Petersson
Gary, maybe, if you want to start.
[00:35] Gary O’Neall
Sure, sure.
[00:35] Gary O’Neall
So I've been working with SBOMs since way before it was popular, over 10 years.
[00:42] Gary O’Neall
I got into this business when my company was acquired by Microsoft and I went through due diligence and learned the importance of knowing what's in your software.
[00:51] Gary O’Neall
And so I've been doing.
[00:53] Gary O’Neall
I run a small consulting business that does analysis for a lot of acquisitions.
[00:58] Gary O’Neall
And I've been working with the Software Package Data Exchange almost from the very beginning.
[01:03] Gary O’Neall
And I'm kind of known a little bit as the tools guy.
[01:07] Gary O’Neall
So I've written a lot of tools to do SPDX.
[01:11] Viktor Petersson
Thank you, Gary.
[01:12] Viktor Petersson
Kate?
[01:13] Kate Stewart
Yeah, and I've been with the System Package Data Exchange, as it is now, because we actually rebranded our name recently to capture the fact that we're more than just software at this point.
[01:28] Kate Stewart
Pretty much since 2009, when I was having to ship BSPs as part of being at a semiconductor manufacturer, and we had to adhere to the actual license terms, actually know what was in them, and have a way we could share that with people.
[01:42] Kate Stewart
And so that was sort of the starting point.
[01:44] Kate Stewart
And Gary joined very shortly thereafter, making it real and helping us all keep this grounded in things that people can actually implement, and move it forward from there.
[01:55] Viktor Petersson
Amazing.
[01:55] Viktor Petersson
Thanks, Kate.
[01:56] Viktor Petersson
So I guess Gary, you kind of almost answered my first question.
[02:00] Viktor Petersson
And my first question was going to be what was your first introduction to SBOMs really?
[02:05] Viktor Petersson
Because obviously both you guys have been around for quite some time and maybe Gary, start with you.
[02:09] Viktor Petersson
You mentioned that it came out of M&A, right?
[02:13] Viktor Petersson
Which is something that I've heard before. So walk me through that process back then, how it looked, and how the landscape has really changed ever since.
[02:22] Gary O’Neall
Oh, yeah, it's changed quite a bit.
[02:23] Gary O’Neall
I mean, when we first got started, I really had to evangelize to people why you would want to track what's in your software.
[02:35] Gary O’Neall
And, you know, it started more from the licensing side and compliance.
[02:42] Gary O’Neall
And I think it was only two years into SPDX where we started to, you know, kind of shape it more for security use cases.
[02:50] Gary O’Neall
But, you know, in either case, the problem statement's the same: I got some software from somebody.
[02:57] Gary O’Neall
Whether you're acquiring it as a company or acquiring it to use in your own enterprise, you kind of want to know what's inside the software.
[03:04] Gary O’Neall
And you know, 10 to 15 years ago, it was like a total evangelism.
[03:08] Gary O’Neall
It's like, no, you really want to know what's in there.
[03:10] Gary O’Neall
It's like, I don't know, it's kind of expensive.
[03:13] Gary O’Neall
And then you finally get to something like an M&A transaction, and all of a sudden it's an emergency, and that's when I get called in.
[03:21] Gary O’Neall
So.
[03:21] Gary O’Neall
And now it's changed quite a bit, especially with the NTIA and the government regulations.
[03:27] Gary O’Neall
Now, for everybody, it's no longer evangelizing.
[03:30] Gary O’Neall
It's more a question of, now, how do I do it?
[03:32] Gary O’Neall
Which is, which is great to see, right?
[03:34] Viktor Petersson
Yeah, Absolutely.
[03:36] Viktor Petersson
And Kate, it sounds like you started on the semiconductor side of things, which is slightly different, but I see a big wave coming from that side as well.
[03:42] Viktor Petersson
So maybe shed some light on how that view has changed.
[03:48] Viktor Petersson
Yes.
[03:49] Kate Stewart
Well, for us to sell the silicon, we actually had to provide enablement, which meant things like a Linux kernel, compilers, toolchains, bootloaders and so on.
[04:02] Kate Stewart
And the easiest path for us to do that was open source.
[04:06] Kate Stewart
And so you got your basic bootloader, your kernel, and then a variety of user space packages, and you put them into a BSP, or board support package.
[04:15] Kate Stewart
For us to ship that board support package, we needed to actually understand what the licenses of all these open source components were and make sure we complied with their terms.
[04:24] Kate Stewart
The challenge was that I was doing this exercise in a semiconductor factory.
[04:29] Kate Stewart
Colleagues at MontaVista were doing the same exercise, colleagues at Wind River were doing the same exercise.
[04:33] Kate Stewart
We had no way of sharing it.
[04:36] Kate Stewart
And so we sort of tried to figure out how we could share this metadata about the packages we were looking at.
[04:41] Kate Stewart
And so we basically got together and started talking and started figuring out what properties we wanted to share.
[04:49] Kate Stewart
And that was the start of forming up the SPDX spec.
[04:53] Kate Stewart
And then, you know, we got use case after use case after use case.
[05:00] Kate Stewart
That has taken us down some interesting directions.
[05:04] Kate Stewart
And so once we were able to start sharing things, I guess with the 1.2 release, it was good, but then all of a sudden it was sort of like, well, how do we start dealing with all these abstractions that are in the industry?
[05:15] Kate Stewart
Like a group of things. In SPDX we call a group of anything a package, effectively.
[05:23] Kate Stewart
Now this gets overloaded with package managers and all the packaging technology.
[05:26] Kate Stewart
But realistically, for us, a group of things, like a tarball, is that.
[05:31] Kate Stewart
And having things as part of a file or a group of files has been a very powerful abstraction for us to use and leverage.
[05:41] Kate Stewart
And so we've been sort of working those use cases into the next rev, which was the 2.0.
[05:47] Kate Stewart
And 2.0 was kind of what we ended up refining and taking to ISO as a standard.
[05:53] Kate Stewart
And we also put the security stuff in there.
[05:55] Kate Stewart
Actually, at that time,
[05:56] Kate Stewart
a lot of the security stuff started showing up with references.
[05:59] Kate Stewart
So you could link packages or components, or tarballs for that matter, to vulnerabilities or CPEs and so forth.
[06:07] Kate Stewart
So we started having those use cases represented at that point and linking things forward.
[06:13] Kate Stewart
And like I say, we took it to ISO, and it's been out there
[06:16] Kate Stewart
since, I guess, we took this stuff there.
[06:20] Kate Stewart
It's about four or five years now, I guess, and we've been continuing to refine it.
[06:25] Kate Stewart
And more and more I've gotten really interested in how we deal with things for critical infrastructure and how we actually start looking at safety.
[06:35] Kate Stewart
And so this is one of the areas that I care about a lot and I'm focusing on.
[06:40] Kate Stewart
Gary's focusing on some other areas too, like services.
[06:44] Gary O’Neall
That's one of my favorite areas.
[06:46] Gary O’Neall
Yeah.
[06:48] Viktor Petersson
So it started from the semiconductor world, and I think we're almost seeing a big full circle there, I guess.
[06:55] Viktor Petersson
I mean, I've had some of the guys from coreboot on the show before, for instance, and open source firmware is definitely in vogue at the moment, and it goes hand in hand with SBOMs.
[07:08] Viktor Petersson
So it's kind of fun to see how that comes full circle.
[07:11] Viktor Petersson
And now we're treating them all the same and describing them using an SBOM.
[07:16] Viktor Petersson
Yeah, yeah, absolutely.
[07:18] Kate Stewart
I don't know if you've encountered Zephyr before.
[07:23] Viktor Petersson
Yes, yes.
[07:24] Kate Stewart
Well, basically we've got it set up there so that all you're pretty much doing is turning on a config option, and you're getting three SBOMs out automatically.
[07:34] Kate Stewart
So you've got all this information sitting there.
[07:36] Kate Stewart
And this is kind of, I think, where we need to take the industry, so that this isn't a fire drill; it just comes into the whole process automatically.
[07:46] Viktor Petersson
Yeah, absolutely.
[07:47] Viktor Petersson
I mean we've.
[07:49] Viktor Petersson
So on one of the tiger teams at CISA, I'm co-leading the work on the SBOM reference implementation.
[07:55] Viktor Petersson
We had the Zephyr team represented there as well.
[07:58] Viktor Petersson
And it's particularly when you go lower level, where you start to incorporate building SBOMs into, like, Makefiles and whatnot.
[08:05] Viktor Petersson
The toolchain is so very different from what it is for a microservice written in Python or JavaScript.
[08:13] Viktor Petersson
Like, the tooling is so polar opposite.
[08:15] Viktor Petersson
Right.
[08:15] Viktor Petersson
In terms of like how you interface with them.
[08:17] Viktor Petersson
But it seems like there's a lot more maturity coming in there.
[08:20] Viktor Petersson
And Zephyr is definitely,
[08:21] Viktor Petersson
and Yocto as well, to be fair, is doing a good job when it comes to that.
[08:27] Viktor Petersson
This is.
[08:28] Viktor Petersson
I mean, so this is super interesting and let's.
[08:31] Viktor Petersson
You kind of talked about this already, Kate.
[08:33] Viktor Petersson
But the history of SPDX really derives from license compliance.
[08:39] Viktor Petersson
Right.
[08:39] Viktor Petersson
That's kind of the angle from which you started the project.
[08:41] Viktor Petersson
Because I would imagine.
[08:43] Viktor Petersson
Well, you kind of mentioned already that security was not really part of the initial scope.
[08:47] Viktor Petersson
Right.
[08:48] Viktor Petersson
So it really started with license focus.
[08:50] Viktor Petersson
Is that.
[08:50] Viktor Petersson
Is that a correct assessment?
[08:54] Kate Stewart
It started with transparency as to what you've got there.
[08:57] Kate Stewart
We were recording the licensing so we could understand it.
[09:00] Kate Stewart
But transparency, I think, was the root cause.
[09:02] Kate Stewart
That was the root here, which is to understand what you've really got and how you can, you know, track dependencies and track interactions with things.
[09:12] Gary O’Neall
You know, another way of looking at it.
[09:14] Gary O’Neall
Yeah, you know, I think it is.
[09:16] Gary O’Neall
You could argue that it started with licensing, but.
[09:19] Gary O’Neall
But I think the more important dimension is it really started with compliance.
[09:23] Gary O’Neall
We had like a set of rules, you know, legal rules.
[09:26] Gary O’Neall
You had, you know, you didn't have just engineers looking at stuff.
[09:31] Gary O’Neall
You had other parties that want to make sure things comply.
[09:34] Gary O’Neall
And you have compliance and security as well.
[09:36] Gary O’Neall
You have a lot of compliance and security.
[09:37] Gary O’Neall
Security, especially if you're a financial institution.
[09:41] Gary O’Neall
So that was a big part of our initial focus is how do we provide information that enables proper compliance?
[09:48] Gary O’Neall
You know, starting with the licensing, but very shortly after that, in fact,
[09:54] Gary O’Neall
I remember the meeting we had, Kate.
[09:55] Gary O’Neall
We were over at the University of Nebraska at Omaha talking to one of the professors, and we pulled in their security professors.
[10:04] Gary O’Neall
We had a big discussion like, look, all this data,
[10:08] Gary O’Neall
It's the same data.
[10:09] Gary O’Neall
You just want to know what is in there and if you want to comply with your security Policy, you need almost exactly the same information, you know, that you need for license compliance.
[10:20] Gary O’Neall
So it really was a very short distance from starting with the licensing to discussing security.
[10:28] Kate Stewart
And when I sort of started discussing this stuff with the NTIA working groups as they were starting off, you know, the security people were going, oh, we don't like licensing.
[10:39] Kate Stewart
And we were going, well, actually, you kind of need to know the same properties here.
[10:42] Kate Stewart
Guys, this is not anything new here.
[10:46] Kate Stewart
Use what's already there and build from it.
[10:48] Kate Stewart
Don't try to create your own.
[10:52] Gary O’Neall
The hard part is knowing the dependencies, right?
[10:54] Gary O’Neall
Once you've got the dependency tree figured out, whether you use it to correlate with a vulnerability database or with, you know, a licensing data source, that's a lot easier than just figuring out what the dependencies are.
[11:09] Gary O’Neall
So very leverageable data.
[11:12] Viktor Petersson
Well, that is one of the biggest problems, isn't it?
[11:15] Viktor Petersson
When.
[11:15] Viktor Petersson
Well, I wouldn't say the biggest, but one of the big problems is transitive dependencies and tracking them throughout the toolchain.
[11:22] Viktor Petersson
That's where it gets very messy.
[11:24] Viktor Petersson
And I'm sure both of you have spent a lot of time trying to unravel this, right?
[11:30] Gary O’Neall
Yeah.
[11:31] Gary O’Neall
I mean, when.
[11:32] Gary O’Neall
I'll just give one example that affects the standard: you would like to believe that you could represent dependencies as a tree.
[11:42] Gary O’Neall
And in 90% of the time that's true.
[11:46] Gary O’Neall
You know, and then it's like, oh shoot, there's, you know, there's actually circular dependencies.
[11:51] Gary O’Neall
It's generally considered a bad practice to have many of these, but guess what?
[11:55] Gary O’Neall
It happens.
[11:56] Gary O’Neall
Oh shoot.
[11:57] Gary O’Neall
So not only is it not a tree, it's not even a directed [acyclic] graph.
[12:01] Gary O’Neall
It's a cyclic graph.
[12:02] Gary O’Neall
You can have cycles in it.
[12:04] Gary O’Neall
It's like, oh man, that really messes up the tooling.
[12:06] Gary O’Neall
But your spec has to support it.
[12:09] Gary O’Neall
It's not that the spec is complicated in supporting graphs; it's that the world is complicated, and we have to deal with that world and be able to represent it in our standards.
[12:21] Gary O’Neall
So, you know, that's some of the learnings.
[12:23] Gary O’Neall
It's like, oh gosh, I wish it was just a tree.
[12:26] Gary O’Neall
Oh my gosh, I do too.
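
To make Gary's point about cycles concrete, here is a minimal Python sketch. It is not from the episode and not taken from any SPDX tool; the function name `find_cycle` and the package names are hypothetical. It shows why tooling cannot assume a tree: a depth-first walk has to remember which nodes are on the current path, and a dependency that points back onto that path is a cycle.

```python
from typing import Dict, List


def find_cycle(graph: Dict[str, List[str]]) -> List[str]:
    """Return one dependency cycle as a list of nodes, or [] if there is none.

    `graph` maps a component to the components it depends on.
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    color = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> List[str]:
        color[node] = GREY
        path.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GREY:
                # `dep` is already on the current path, so we have looped back
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return []

    for node in list(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return []


# Hypothetical component names, for illustration only.
tree_like = {"app": ["libA", "libB"], "libA": ["libC"], "libB": ["libC"], "libC": []}
circular = {"app": ["libA"], "libA": ["libB"], "libB": ["libA"]}

print(find_cycle(tree_like))  # []
print(find_cycle(circular))   # ['libA', 'libB', 'libA']
```

Note that `tree_like` is already not a strict tree, since `libC` is shared by two parents; that case is harmless for the walk, while the cycle in `circular` is the one that would send a naive recursive traversal into an endless loop.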
[12:29] Kate Stewart
Like I say, it all started off with just simple packages, okay.
[12:33] Kate Stewart
And then packages into systems, and it was sort of like.
[12:38] Kate Stewart
And the world is getting even more complicated at this point in time, with things that are dependent on data, right?
[12:46] Kate Stewart
Let's not forget data here because, I don't know about you, but if you've got a self-driving car, there are a lot of data sets being pulled in to train up your car, the same way there's a lot of code being pulled in to build up your application.
[13:01] Kate Stewart
So understanding those data sets and having transparency on the data sets that are being used is part of the reason we've been heading more and more towards the system side.
[13:11] Viktor Petersson
Yeah.
[13:11] Viktor Petersson
On the AI data side, there are pretty big financial incentives not to reveal those data sets, as we've seen from the pretty big pushback on disclosing training data.
[13:21] Kate Stewart
Yeah, no, there may be incentives, but you need to know what you've got internally even if you're not exposing it externally.
[13:29] Kate Stewart
And realistically.
[13:30] Viktor Petersson
I meant more that they know.
[13:34] Viktor Petersson
Oh, I meant more that they very much know what's in it.
[13:37] Viktor Petersson
It's just if they disclose it, they will be in trouble.
[13:42] Kate Stewart
That certainly happens.
[13:43] Kate Stewart
But being able to know and query things in a structured fashion is one of the pieces that we're working towards making sure can happen at this point.
[13:53] Gary O’Neall
Yeah, graphs.
[13:56] Viktor Petersson
Yeah, absolutely.
[13:56] Viktor Petersson
And one of the things, I mean, not only graphs, but graphs of graphs almost.
[14:00] Viktor Petersson
Right.
[14:00] Viktor Petersson
Because one of the things we cover a lot in the CISA working group that Gary's part of is how do you represent an entire system?
[14:09] Viktor Petersson
To your point, Kate, you might have a microservice which in turn has three to five different SBOMs, but then you want to represent that and that's part of a bigger system.
[14:19] Viktor Petersson
So you have these hierarchies of SBOMs and that's where it gets really complicated.
[14:25] Kate Stewart
So this is what we did.
[14:26] Kate Stewart
This is the whole reason SPDX 3 exists.
[14:29] Kate Stewart
Okay.
[14:30] Viktor Petersson
Right.
[14:30] Kate Stewart
We have a common underlying element that can have any type of data hooked onto it.
[14:37] Kate Stewart
So initially, I think in the NTIA stuff, there was a whole concept of the crop circles.
[14:42] Kate Stewart
I don't know if people remember some of those discussions, where there's data associated with security, there's data associated with software, there's data associated with data, but there's metadata in with all these things.
[14:52] Kate Stewart
And what we've got in SPDX right now is the ability to create databases with a common schema, so you can actually link all of these things together and reference between them. You can reference from a build SBOM to the source SBOMs and know that you're coherent and things haven't been mucked up on you.
[15:11] Kate Stewart
And this data you can then query to understand how things change over time, which puts you in a lovely place if you're working with a CI flow and things are changing automatically, because all you're doing is adding a new element and changing some linkage.
[15:23] Kate Stewart
But you're doing this in a database, in a structured fashion.
[15:26] Kate Stewart
So at any point in time you can say, oh, what was the SBOM on this date?
[15:29] Kate Stewart
Oh, what was the SBOM on that date?
[15:30] Kate Stewart
And what was the SBOM of this subset?
[15:32] Kate Stewart
You know, you can basically export these out of databases now.
[15:36] Kate Stewart
And that's kind of what we need to do:
[15:37] Kate Stewart
scale, rather than patch, patch, clunk, clunk.
[15:42] Kate Stewart
Sorry.
[15:42] Kate Stewart
That's my view on it.
[15:43] Viktor Petersson
Yeah, no, speak.
[15:45] Viktor Petersson
Speak to me more about that because that's interesting, right?
[15:47] Viktor Petersson
Because, I mean, the way I've seen SBOMs, for as long as I've been exposed to them, is as a snapshot in time, right?
[15:54] Viktor Petersson
It's the truth for that CI/CD run, right?
[15:57] Viktor Petersson
And by the next run, it has likely changed, right?
[16:01] Kate Stewart
Yeah.
[16:02] Kate Stewart
So the thing is, what we were finding about five years ago is that every company we were working with was creating their own databases, okay?
[16:11] Kate Stewart
So all these large organizations had their own databases, and they were moving SPDX into the databases in a structured fashion.
[16:17] Kate Stewart
So we started talking.
[16:19] Kate Stewart
The tool-to-tool group was basically very focused on this.
[16:22] Kate Stewart
This was coming out of OMG, and we also were seeing evidence ourselves, from the people we were working with, that they were putting things into databases.
[16:30] Kate Stewart
The challenge became, okay, how do we structure our data underneath the covers here so that we can be in databases very seamlessly?
[16:39] Kate Stewart
If someone has to put out an SBOM that is their point in time, they can query it, but then they can also know, okay, how has it changed since then?
[16:48] Kate Stewart
And how are things going that way?
[16:50] Kate Stewart
And so having that internal database and having that graph, a knowledge graph effectively, of all your elements connected together becomes very powerful when we have to go to scale.
[17:02] Gary O’Neall
Scaling is a big one.
[17:04] Gary O’Neall
And if I could just take that a little step further with a couple of examples, because I'll give you a little insight into how our meetings run sometimes.
[17:13] Gary O’Neall
I'm always a big champion of compatibility with the prior versions, so people have to argue with me to get some changes in.
[17:21] Gary O’Neall
And a lot of the changes were made so that these systems can scale.
[17:28] Gary O’Neall
And just to be clear, SPDX is an interchange standard.
[17:32] Gary O’Neall
It's what data you send out of this database.
[17:36] Gary O’Neall
I don't think we want to be in the business of telling people how to design their internal systems.
[17:42] Gary O’Neall
That would be a little harder to adopt.
[17:44] Gary O’Neall
However, if the exchange data is structured in a certain way, you can implement much more scalable systems internally.
[17:53] Gary O’Neall
So one of the examples I'll point to was our representative from Microsoft, you know, which owns GitHub, who wanted to have a very scalable system where every single commit that goes in can give you a live, fresh update of your SBOM.
[18:11] Gary O’Neall
You talk about scale: you're not going to reread, rescan, regenerate, you know, redo all this really heavy lifting every time somebody does a commit.
[18:22] Gary O’Neall
You need to have things done at a very small atomic level and make things, you know, really accessible.
[18:28] Gary O’Neall
So a lot of the restructuring we've done is about how things like relationships work within the system. The way we meant it is that we mint elements, and you cannot change an element once it's minted.
[18:41] Gary O’Neall
So we've moved things out of the element so you didn't have to create a new one every time there's a commit for all the different properties.
[18:47] Gary O’Neall
So we did a lot of engineering around that to make the SPDX system a lot more scalable.
[18:54] Gary O’Neall
And of course, I always try to balance it out with keeping it compatible and keeping it simple, you know, for maybe the people that don't want to build these giant systems.
[19:02] Gary O’Neall
So there's a little tension there and I think we did an okay job at that.
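
As a rough illustration of the design Gary describes, immutable elements with the changeable linkage kept outside them, here is a small Python sketch. It is an assumption-laden toy, not the SPDX 3 data model: the class names, field names, `urn:example:` identifiers and the `dependsOn` label are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Element:
    """A minted record that can no longer be changed (hypothetical fields)."""
    element_id: str
    name: str


@dataclass(frozen=True)
class Relationship:
    """A link between two elements, stored outside the elements themselves."""
    source_id: str
    kind: str
    target_id: str


def targets(relationships: List[Relationship], source_id: str, kind: str) -> List[str]:
    """List the targets of all relationships of `kind` starting at `source_id`."""
    return [r.target_id for r in relationships
            if r.source_id == source_id and r.kind == kind]


app = Element("urn:example:app", "app")
relationships = [Relationship(app.element_id, "dependsOn", "urn:example:libA")]

# A later commit adds a dependency: append one relationship, leave `app` untouched.
relationships.append(Relationship(app.element_id, "dependsOn", "urn:example:libB"))

print(targets(relationships, app.element_id, "dependsOn"))
# ['urn:example:libA', 'urn:example:libB']
```

The point of the sketch is only the shape of the update: a commit that changes dependencies adds a small new record instead of forcing a new copy of the element with all its other properties.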
[19:06] Viktor Petersson
Let me.
[19:07] Viktor Petersson
So let me rephrase that, or to make sure I got it right.
[19:10] Viktor Petersson
So, like, you guys consider SPDX 3 more a database schema than a format.
[19:16] Viktor Petersson
Is that a fair recap of this?
[19:20] Gary O’Neall
I wouldn't, I would not say that.
[19:22] Gary O’Neall
There may be people in the community that would.
[19:26] Gary O’Neall
I mean, it is a schema.
[19:27] Gary O’Neall
I mean, very specifically, it's a SHACL/OWL schema that we generate a JSON Schema from and use for JSON-LD serializations and such.
[19:41] Gary O’Neall
So, I mean, it really is a schema, but it really is intended to be a schema for data you send from system A to system B, and people are using that same schema to implement internal systems as well.
[19:55] Gary O’Neall
But I personally would not say that's the design of it.
[20:01] Gary O’Neall
Now I will again say that there are people in our community that would say, yes, this is a prescription for how you should implement your system, but I personally would not say that it really is.
[20:11] Gary O’Neall
It's: let's create a standard that people can exchange system data around.
[20:16] Gary O’Neall
Yeah, I don't.
[20:17] Gary O’Neall
Kate, what do you think?
[20:18] Gary O’Neall
You might have a different opinion.
[20:19] Kate Stewart
Like I say, I think that's.
[20:21] Kate Stewart
We're threading a line here, and so it's not as black and white as Viktor would like, because it's basically there for people to implement internally.
[20:33] Kate Stewart
We aren't dictating what's in your database per se, because you may have other properties, you may have other things you want to keep.
[20:39] Viktor Petersson
Sure.
[20:40] Kate Stewart
And, you know, some of that other stuff you're keeping becomes fodder for the next rev of SPDX, quite frankly.
[20:47] Kate Stewart
But we are an interchange.
[20:51] Kate Stewart
But you want to interchange with modern systems; you don't want to interchange with things that are thrown out.
[20:57] Kate Stewart
And so CI is part of it now.
[21:00] Kate Stewart
Data: training data sets, validation data sets, testing data sets.
[21:06] Kate Stewart
And then quite frankly, as we go towards critical infrastructure and all the regulatory side around that which we've got to deal with, the safety profiles and the standards around safety are going to be key for making sure that we actually have the precision to know people stay safe.
[21:28] Kate Stewart
And so these are areas where we're expanding SPDX to make sure that we are handling it properly.
[21:36] Kate Stewart
But these are things that are going to keep on emerging and you know, the world isn't getting simpler, let's put it that way.
[21:42] Kate Stewart
It's getting a lot more complex.
[21:44] Viktor Petersson
Quite right.
[21:45] Viktor Petersson
And as far as safety, I would imagine that things like attestation would fall within scope of that as well to make sure you trust actual data and.
[21:53] Viktor Petersson
Or is that completely out of scope or.
[21:56] Kate Stewart
Well, there are ways of doing attestation.
[22:00] Kate Stewart
You're more likely to attest to an SBOM document or something like that.
[22:05] Kate Stewart
That said, we've got mechanisms built in, and we've had them pretty much since the 2.0, or actually was it the 1.2, days, where you're basically taking hashes
[22:16] Kate Stewart
on artifacts to make sure that you have a one-to-one match and things haven't shifted out from under you.
[22:23] Kate Stewart
So there are mechanisms there, and then the attestation is someone saying, yes, I've checked this, effectively.
[22:30] Kate Stewart
So people have generally put attestations around the SBOM data as opposed to inside the SBOM data, because that's what we've been finding.
[22:40] Kate Stewart
On the other hand, there are ways of doing it.
[22:42] Kate Stewart
If people want to put them in, they can; there are ways it can happen. Like I said, we've put things in for it.
[22:51] Kate Stewart
But realistically, most people do it as: here I am, providing this SBOM document to the government for this thing at this point in time, and I'm attesting that this is accurate.
[23:01] Kate Stewart
Right, right.
[23:02] Gary O’Neall
That's.
[23:03] Kate Stewart
That's kind of where it happens.
[23:04] Kate Stewart
And so in my mind it's mostly happening around it, but that's my perspective.
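
The hash mechanism Kate mentions can be sketched in a few lines of Python using the standard `hashlib` module. This is an illustration only, not code from any SPDX tool: the helper names are hypothetical, and SHA-256 is an assumed choice of algorithm. The idea is that a digest recorded when the SBOM was generated is later compared with a digest of the artifact you actually hold.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_recorded_hash(data: bytes, recorded_hex: str) -> bool:
    """True if the artifact bytes still match the hash recorded earlier."""
    return sha256_hex(data) == recorded_hex.strip().lower()


artifact = b"example artifact contents"   # stand-in for a real file's bytes
recorded = sha256_hex(artifact)           # what a generator would record at build time

print(matches_recorded_hash(artifact, recorded))                # True
print(matches_recorded_hash(artifact + b" tampered", recorded)) # False
```

A matching hash only shows that the artifact is the one the SBOM describes; the attestation, someone vouching that the SBOM is accurate, sits around the document, as described above.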
[23:09] Viktor Petersson
That's fair enough.
[23:10] Viktor Petersson
Then.
[23:11] Viktor Petersson
We always have the problem of unknowns, which is a big problem in the SBOM world.
[23:15] Viktor Petersson
Right.
[23:16] Viktor Petersson
Because just because you generate an SBOM from whatever piece of software, it doesn't mean that it's actually complete.
[23:21] Viktor Petersson
So the definition of complete is a bit of a, I think, hot potato in the SBOM world.
[23:27] Viktor Petersson
Right.
[23:29] Viktor Petersson
How do you guys think about this in general and like, where we're going with that?
[23:33] Kate Stewart
Well, this whole issue of known unknowns came up in the NTIA work.
[23:38] Kate Stewart
And we made sure our 2.2 version of SPDX, which is the one we took to ISO, actually had an explicit way of signaling a known unknown.
[23:46] Kate Stewart
And so known unknowns are a concept we are quite conscious of.
[23:50] Kate Stewart
And whether or not the information is complete, you can signal that.
[23:54] Kate Stewart
And you could do that in 2.2 and 2.3, as well as now in 3.0.
[23:58] Gary O’Neall
I would say we actually had it from the very beginning of 1.0, because there are a lot of properties where we had two special values, NONE and NOASSERTION.
[24:11] Gary O’Neall
So none means I know this does not exist.
[24:14] Gary O’Neall
I know there was never, you know, a statement around this, or I know there was no license associated with this.
[24:21] Gary O’Neall
That's a none.
[24:22] Gary O’Neall
No assertion means I don't know.
[24:24] Gary O’Neall
I know I don't know.
[24:26] Gary O’Neall
Therefore it's a no assertion.
[24:28] Gary O’Neall
So no assertion has been in there.
[24:29] Gary O’Neall
So we've carried that forward into 3.0, although in 3.0 we actually added a lot more flexibility in how you can state it.
[24:38] Gary O’Neall
So the relationship class actually has a new field that gives you a little bit more clarity on how much you know and how much you don't know.
[24:47] Gary O’Neall
If I can complicate it a little with that.
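To make the distinction Gary describes concrete, here is a minimal Python sketch. The two reserved strings, `NONE` and `NOASSERTION`, are the special values SPDX 2.x uses for fields such as `licenseConcluded`; the package records and the `classify_field` helper are hypothetical and only illustrate how a consumer might tell "known to be absent" apart from "known unknown".

```python
# SPDX 2.x reserves two special values for fields such as licenseConcluded.
NONE = "NONE"                # "I checked, and I know nothing applies here."
NOASSERTION = "NOASSERTION"  # "I don't know, and I'm saying so explicitly."


def classify_field(value):
    """Map a field value to what the SBOM author is actually claiming."""
    if value == NONE:
        return "known-absent"
    if value == NOASSERTION:
        return "known-unknown"
    return "stated"


# Hypothetical package records, trimmed down to the fields used here.
packages = [
    {"name": "libfoo", "licenseConcluded": "MIT"},
    {"name": "blob.bin", "licenseConcluded": "NOASSERTION"},
    {"name": "notes.txt", "licenseConcluded": "NONE"},
]

summary = {p["name"]: classify_field(p["licenseConcluded"]) for p in packages}
```

A consumer can then treat `known-unknown` entries as follow-up work rather than silently reading them as "no license".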
[24:51] Viktor Petersson
Yeah, I guess the problem is also like you don't know if there's a dependency in there that you haven't tracked.
[24:58] Viktor Petersson
Right.
[24:58] Viktor Petersson
That, you can't specify whatsoever.
[25:00] Viktor Petersson
That by definition you can't.
[25:03] Kate Steward
We can.
[25:03] Kate Steward
We have a way, and this is the two-step I was talking about: you can say, hey, is everything complete or not?
[25:10] Kate Steward
And you can put a no assertion on the completeness, which means I don't know if it's complete or not.
[25:14] Kate Steward
So you can signal that you have some potential.
[25:16] Kate Steward
It may all be fine, it may be complete, but you can't be authoritative at this point in time.
[25:21] Kate Steward
So you can be explicit about that.
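As a rough illustration of signaling completeness, the sketch below uses simplified stand-ins for SPDX 3.0 relationships. The `completeness` property and its values `complete`, `incomplete` and `noAssertion` follow the SPDX 3.0 model as I understand it; the records are not the full JSON-LD serialization, the element names are made up, and treating a missing value as `noAssertion` is an assumption of this sketch.

```python
# Simplified, hypothetical relationship records (not full SPDX 3.0 JSON-LD).
relationships = [
    {"from": "app", "relationshipType": "dependsOn",
     "to": ["libfoo", "libbar"], "completeness": "complete"},
    {"from": "libfoo", "relationshipType": "dependsOn",
     "to": ["zlib"], "completeness": "noAssertion"},
    {"from": "libbar", "relationshipType": "dependsOn",
     "to": [], "completeness": "incomplete"},
]


def not_authoritative(rels):
    """Return the elements whose dependency list is not claimed to be complete.

    Assumption: a missing completeness value is read as noAssertion.
    """
    return [
        r["from"]
        for r in rels
        if r.get("completeness", "noAssertion") != "complete"
    ]


needs_review = not_authoritative(relationships)
```

Here `needs_review` lists the elements where the author has either said the list is incomplete or declined to say either way.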
[25:25] Viktor Petersson
Fair enough, Fair enough.
[25:27] Viktor Petersson
The next thing I wanted to have a chat about was something that Gary and I have been trading quite a few conversations around, which is license auditing. I know Gary has spent a lot of his life trying to figure that out in terms of auditing SBOMs.
[25:42] Viktor Petersson
But that is a bigger and bigger problem that I've seen, at least in my firsthand experience, both in the case of duplication and also incorrect licensing.
[25:52] Viktor Petersson
So how do you guys see the state of that currently, and what's being done to improve the quality of license compliance and coherence, I guess, is a good way of phrasing that?
[26:07] Gary O’Neall
Do you want me to start?
[26:09] Kate Steward
You want to start?
[26:10] Gary O’Neall
Because I got so many ideas on this.
[26:16] Gary O’Neall
Let me go ahead and start that.
[26:17] Gary O’Neall
Just so there's like two different areas.
[26:20] Gary O’Neall
Some of this is a little bit of a hot button with me, but some of the SBOM tools out there will knowingly generate things that don't even parse as valid license expressions.
[26:35] Gary O’Neall
So there's definitely some work that needs to be done, you know, in the tooling side.
[26:39] Gary O’Neall
Just simple things, you know, does this license expression actually parse?
[26:43] Gary O’Neall
Is it valid?
[26:44] Gary O’Neall
And if it's not.
[26:45] Viktor Petersson
But let's maybe start there.
[26:47] Viktor Petersson
Like what's what?
[26:48] Viktor Petersson
Let's zoom in on that first because people who have watched this may not be aware of the whole license side of compliance.
[26:56] Gary O’Neall
Yeah, so within SPDX, and by the way, these license expressions are actually used in a lot of other standards, many other standards as it turns out.
[27:08] Gary O’Neall
And it's basically like a Boolean expression of licenses.
[27:12] Gary O’Neall
So you would say like, oh, I found a license, an open source license, GPL, and oh, I found an open source license, MIT.
[27:20] Gary O’Neall
So both of those apply.
[27:22] Gary O’Neall
So I would create an expression, MIT AND GPL.
[27:25] Gary O’Neall
And if I just find a blob of text that doesn't correlate with any known licenses, we have a way of doing what we call a LicenseRef.
[27:34] Gary O’Neall
So it's basically you create this thing that has the text and then you can refer to that in other parts of the SPDX document.
[27:42] Gary O’Neall
So these things are pretty well defined, they've been around, I don't know Kate, what about 10 years?
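The "does this license expression actually parse?" check Gary mentions can be sketched as a small recursive-descent validator. This is a simplified illustration, not the official SPDX tooling: it knows only a handful of identifiers (a real tool would load the full SPDX License List), matches identifiers and the `AND`, `OR` and `WITH` operators case-sensitively, and ignores the `DocumentRef-` prefix. All function names are made up for this example.

```python
import re

# Tiny illustrative subsets; a real tool would load the full SPDX lists.
KNOWN_LICENSES = {"MIT", "Apache-2.0", "GPL-2.0-only", "GPL-2.0-or-later", "BSD-3-Clause"}
KNOWN_EXCEPTIONS = {"Linux-syscall-note", "Classpath-exception-2.0"}

TOKEN = re.compile(r"\s*(\(|\)|[A-Za-z0-9.+-]+)")
LICENSE_REF = re.compile(r"LicenseRef-[A-Za-z0-9.-]+")


def tokenize(expr):
    """Split an expression into parentheses and identifier/operator tokens."""
    tokens, pos = [], 0
    expr = expr.strip()
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if not match:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def _parse_simple(tokens, pos):
    """license-id, license-id followed by '+', or LicenseRef-<idstring>."""
    if pos >= len(tokens):
        raise ValueError("expected a license identifier")
    tok = tokens[pos]
    base = tok[:-1] if tok.endswith("+") else tok
    if LICENSE_REF.fullmatch(tok) or base in KNOWN_LICENSES:
        return pos + 1
    raise ValueError(f"unknown license identifier: {tok}")


def _parse_with(tokens, pos):
    """Parenthesised expression, or a simple expression with optional WITH."""
    if pos < len(tokens) and tokens[pos] == "(":
        pos = _parse_or(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("missing closing parenthesis")
        return pos + 1
    pos = _parse_simple(tokens, pos)
    if pos < len(tokens) and tokens[pos] == "WITH":
        if pos + 1 >= len(tokens) or tokens[pos + 1] not in KNOWN_EXCEPTIONS:
            raise ValueError("unknown or missing exception identifier")
        pos += 2
    return pos


def _parse_and(tokens, pos):
    """AND binds tighter than OR."""
    pos = _parse_with(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "AND":
        pos = _parse_with(tokens, pos + 1)
    return pos


def _parse_or(tokens, pos):
    pos = _parse_and(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "OR":
        pos = _parse_and(tokens, pos + 1)
    return pos


def is_valid_expression(expr):
    """True if the whole string parses as a license expression."""
    try:
        tokens = tokenize(expr)
        return _parse_or(tokens, 0) == len(tokens)
    except ValueError:
        return False
```

With this sketch, `is_valid_expression("MIT AND GPL-2.0-only")` is accepted, while a free-form string such as `"MIT licence, see file"` is rejected at the comma.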
[27:49] Kate Steward
The thing to understand, to help simplify the problem: one of the big contributions SPDX has made to the ecosystem as a whole is the license list, where we've standardized a common set of licenses and have a group of lawyers that review them to say whether each one is unique or not, and have standard templates so that you can do matching and so forth. And then, quite frankly, we've gotten open source projects to adopt this.
[28:20] Kate Steward
So rather than a random wall of text that it takes some sort of artificial intelligence inferencing machine to figure out which license it really is.
[28:28] Kate Steward
You have a one line that you parse and then you can say yeah, it's this license.
[28:32] Kate Steward
And the developers like that, they don't want to have to cut things around, they want to be able to just put the licensing in.
[28:39] Kate Steward
And so this whole initiative to just put that in started with U-Boot out there and then various other projects have adopted it since then, including the Linux kernel and Zephyr and so forth.
[28:53] Kate Steward
And about five or six years ago, we did a full pass going through and looking at Zephyr, looking at all the licenses in the kernel, and cleaning them up and removing blocks of text.
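The "one line that you parse" Kate describes is the `SPDX-License-Identifier:` tag placed in a comment near the top of a source file. The sketch below pulls that tag out of a file's text. The `find_license_tag` helper, the 20-line search window and the handling of comment closers are choices made for this example, not part of any official tool.

```python
import re

SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def find_license_tag(source_text, max_lines=20):
    """Return the license expression from the first SPDX tag, or None."""
    for line in source_text.splitlines()[:max_lines]:
        match = SPDX_TAG.search(line)
        if not match:
            continue
        value = match.group(1).strip()
        # Drop a trailing comment closer such as "*/" or "-->".
        for closer in ("*/", "-->"):
            if value.endswith(closer):
                value = value[: -len(closer)].strip()
        return value
    return None


c_file = "// SPDX-License-Identifier: GPL-2.0-only\nint main(void) { return 0; }\n"
header = "/* SPDX-License-Identifier: Apache-2.0 */\n#pragma once\n"
untagged = "int x;\n"
```

Compared with matching a full block of license text, the tag gives a scanner a single, unambiguous string to read.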
[29:05] Gary O’Neall
Yeah, if I could just interject, I mean just when you mentioned the license list, I just want to say that we're actually expanding that same concept to other areas.
[29:15] Gary O’Neall
We actually have a hash algorithm group that started up that's doing the same curated list so that we could have identifiers and a group of experts.
[29:26] Gary O’Neall
I'm not one of the experts, but we have people that are expert in cryptography working on this group to basically associate certain known properties with the strength of the algorithm.
[29:41] Gary O’Neall
So be able to create and curate that as well.
[29:44] Gary O’Neall
So I think that's going to be just as valuable as that evolves.
[29:49] Gary O’Neall
And back on the licensing, I just wanted to mention the story of NPM, because I think this is a great story.
[29:57] Gary O’Neall
As you know, I've been doing these license audits for a long time.
[30:01] Gary O’Neall
I used to hate to see NPM projects. They were a total nightmare for licensing, because people could put in their metadata files whatever random string they wanted and call it a license.
[30:15] Gary O’Neall
So sometimes you wouldn't find anything, sometimes you would find things like "this license is", you know, something. You'd just find all kinds of random stuff.
[30:24] Gary O’Neall
I'm trying to remember the guy's name.
[30:26] Gary O’Neall
I should give him a shout out if I could remember his name.
[30:29] Gary O’Neall
But one of the maintainers in that group, in the NPM group, said, hey, let's use license expressions.
[30:36] Gary O’Neall
This is pretty well thought out.
[30:38] Gary O’Neall
We got the license list and they went through and they put in tools, they verified it.
[30:44] Gary O’Neall
Now I just love it when I see an NPM project.
[30:46] Gary O’Neall
It's still a little bit of a nightmare in the dependencies.
[30:49] Gary O’Neall
It's kind of hard, you know, the way they do dependencies.
[30:52] Gary O’Neall
I wish it was a little cleaner, but as far as the licenses go, it's great because you cannot upload a project without it having a valid license expression.
[31:01] Gary O’Neall
So it makes downstream life easier.
[31:03] Gary O’Neall
And that I think is what the key is to the solution to the licensing is that the package managers, you know, the upstream projects and the package managers fully implement the standard.
[31:16] Gary O’Neall
So those of us downstream, when we look at these, it's much easier to parse rather than having to deal with these random strings that we see.
[31:24] Viktor Petersson
I mean, it's funny, because as you immerse yourself in this world, almost all roads lead to package managers.
[31:30] Viktor Petersson
Because that's like, they are the canonical truth, more or less.
[31:34] Kate Steward
Yes, Kate, I'm going to disagree from the embedded side.
[31:37] Kate Steward
Just saying.
[31:38] Viktor Petersson
Fair enough.
[31:40] Kate Steward
It's all about the compiler.
[31:41] Kate Steward
Come on.
[31:43] Viktor Petersson
Right, Fair enough.
[31:44] Viktor Petersson
In higher level languages, I guess all roads lead to package managers.
[31:48] Viktor Petersson
Although you do see a push towards package managers in lower-level languages as well.
[31:54] Viktor Petersson
Like Conan for C for instance.
[31:57] Viktor Petersson
So it is making its way down to lower level as well.
[32:00] Viktor Petersson
Maybe not to the super low level of embedded code, but at least to the compiled level.
[32:07] Viktor Petersson
At least.
[32:09] Viktor Petersson
But that's.
[32:10] Viktor Petersson
So the SPDX licensing database has kind of emerged as the canonical source of truth for licensing these days.
[32:22] Viktor Petersson
You do have some friction with the OSI model.
[32:24] Viktor Petersson
Do you guys care to comment on the discrepancies between the two?
[32:29] Gary O’Neall
It's not a discrepancy.
[32:30] Gary O’Neall
We do have data management issues.
[32:33] Gary O’Neall
We actually work pretty closely with the OSI group.
[32:36] Gary O’Neall
Yeah, it's actually a pretty good relationship.
[32:39] Gary O’Neall
One of the things we've done on our license list is we've added an OSI column to our database where we can say whether it's OSI approved or not.
[32:50] Gary O’Neall
I've worked with the technical folks at OSI to see if we could automate that and make that a little bit more reliable.
[32:58] Gary O’Neall
Because what happens is, and I won't point fingers, but let's just say somebody changes the text of a license that you think should be the same and all of a sudden you got different text and it's like that doesn't really match anymore.
[33:11] Gary O’Neall
So the data gets out of sync and those are problems we're trying to work through.
[33:17] Gary O’Neall
I mean, there's another group too, which is the folks behind GPL.
[33:24] Gary O’Neall
FSF.
[33:25] Gary O’Neall
Yes.
[33:26] Viktor Petersson
Free Software Foundation.
[33:27] Gary O’Neall
Yeah.
[33:28] Gary O’Neall
They also are pretty passionate about their licenses and they have their license list as well.
[33:32] Gary O’Neall
And you know, we work with them as well.
[33:34] Kate Steward
Yeah, we had to make specific changes to keep them happy, actually, in the license list at one point in time, which made the kernel people unhappy.
[33:43] Kate Steward
So, you know, you just can't.
[33:47] Kate Steward
The other thing too that's been happening is, you know, this cleanup has been happening pretty much across the ecosystem and to the extent that like the Red Hat team has been slowly and surely cleaning up all of their packages and licensing, which makes it easier for a huge range of downstreams as well.
[34:05] Viktor Petersson
Yeah.
[34:05] Kate Steward
And so these sorts of, you know, initiatives of making it easier to just pull the data out and be accurate, you know, long, slow process.
[34:14] Kate Steward
But it makes things better in the long run too.
[34:17] Viktor Petersson
Absolutely.
[34:18] Viktor Petersson
And that's kind of what I meant by package manager, because that expands beyond just languages, but also DEB and RPM.
[34:25] Viktor Petersson
So the operating system vendors, they have a unique position where they can actually give you a truth that is otherwise hard to do.
[34:35] Viktor Petersson
But it is nice to see that this is sweeping the industry.
[34:38] Viktor Petersson
I know PyPI is undergoing a similar audit right now to the one NPM did a few years ago.
[34:44] Viktor Petersson
Right.
[34:44] Viktor Petersson
So now there's focus there to clean that up as well, to make sure.
[34:48] Viktor Petersson
So yeah, it's hard when the data is dirty.
[34:53] Viktor Petersson
It's hard to generate a good SBOM.
[34:56] Viktor Petersson
So I kind of want to turn my focus a bit over to your day job, I guess, Kate, now at the Linux Foundation.
[35:04] Viktor Petersson
So you guys are in a unique position as well in terms of like governors of a lot of software, all open source software.
[35:13] Viktor Petersson
What's the kind of view on projects that are under the Linux Foundation?
[35:20] Viktor Petersson
I'm thinking of CNCF and the other umbrella organizations under the foundation, in terms of mandating SBOMs and the quality of SBOMs as well.
[35:32] Viktor Petersson
Is there any conversation happening internally with regards to that as we speak?
[35:41] Kate Steward
Well, the projects I work for and work closely with are on the embedded side, so I can't speak too much beyond that.
[35:48] Kate Steward
But I do know that the LF is very much committed to having accurate SBOMs available for the projects.
[35:55] Kate Steward
And to that extent Gary and Jeff have been working on a project to basically scan everything and make it visible and have it published.
[36:05] Kate Steward
And we've actually been publishing source SBOM audits for various projects for about five years now.
[36:12] Kate Steward
Actually, there was an LF scanning repo on GitHub where this stuff was getting published, and there's continuing work to improve that.
[36:20] Kate Steward
So the LF is trying to help the projects as much as possible adopt best practices.
[36:25] Kate Steward
Gary, do you want to say anything more about what you and Jeff have been working on?
[36:28] Gary O’Neall
Yeah, so Jeff.
[36:31] Gary O’Neall
Jeff and I actually have worked together for many years.
[36:33] Gary O’Neall
We did audits together before he joined the Linux Foundation.
[36:36] Gary O’Neall
So I'm not an employee of the Linux Foundation myself, but I do provide support to a project and support to Jeff, you know, who does these scans.
[36:45] Gary O’Neall
So if I remember, and don't quote me on this, I'm probably off, but I think there's about a couple hundred, or like 250, different Linux Foundation projects that we kind of focus on and provide extra services for.
[37:00] Gary O’Neall
And one of those services is to do scanning, and these are source SBOMs.
[37:05] Gary O’Neall
So we actually scan and look for actual license text and it's somewhat labor intensive to sift through the false positives and that.
[37:12] Gary O’Neall
We've been doing that for years.
[37:14] Gary O’Neall
I've been working with Jeff and we're just starting to roll out more complete SBOMs.
[37:20] Gary O’Neall
And basically what we're doing is in addition to the source files, we're doing all of the dependencies that are represented in the metadata packages.
[37:32] Gary O’Neall
In fact, I'm using pretty much the same toolchain that the CISA reference implementation is using.
[37:39] Gary O’Neall
We happened to pick the same tools pretty much independently, which is kind of interesting.
[37:45] Gary O’Neall
And we have some of our own tooling that we wrote, this thing called Scaffold, and we'll be rolling that out for these 250 projects.
[37:54] Gary O’Neall
But the one thing I just want to mention, though: this is a service that the Linux Foundation will be providing, but it is not as good as what you could do if you did what the embedded team has been doing for years, which is producing the SBOM at build time.
[38:10] Gary O’Neall
Because we are doing a scan, we're actually looking at the metadata file, but you may have a metadata file that says something like get the latest version of this dependency.
[38:20] Gary O’Neall
Obviously we're not going to know exactly what version was pulled down when this particular executable was built.
[38:29] Gary O’Neall
But if you do a build SBOM, you know, you'll know that.
[38:33] Gary O’Neall
So whenever I talk about this, I'd certainly encourage people to use what we're producing out of these scans.
[38:40] Gary O’Neall
I think it's very usable.
[38:41] Gary O’Neall
I think it's as good or better than many of the SBOMs that I've seen, but it's not as good as what the teams could build themselves.
[38:48] Gary O’Neall
So kind of do what the Yocto and Zephyr teams have been doing, which is, you know, generate these SBOMs at build time to get the most accurate results.
[38:57] Gary O’Neall
But if you don't have that, you know, we'll be producing these additional ones.
[39:02] Gary O’Neall
And these are all public, they're in a public repo, so anybody that uses the open source can pull them down.
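Gary's point about metadata files that say "get the latest version" can be illustrated with a small check that flags dependency specifiers a scan cannot resolve to one exact version. The manifest below is a made-up, npm-style mapping of package name to version specifier, and the rule "exact means `MAJOR.MINOR.PATCH` and nothing else" is a deliberate simplification for this sketch.

```python
import re

# Simplification: only a bare MAJOR.MINOR.PATCH string counts as pinned.
EXACT_VERSION = re.compile(r"\d+\.\d+\.\d+")


def floating_dependencies(dependencies):
    """Return names whose specifier does not pin one exact version."""
    return sorted(
        name
        for name, spec in dependencies.items()
        if not EXACT_VERSION.fullmatch(spec)
    )


# Hypothetical manifest contents.
manifest_dependencies = {
    "left-pad": "1.3.0",   # pinned: a scan knows exactly what this is
    "express": "^4.18.2",  # range: resolved only at install time
    "lodash": "latest",    # whatever was newest when the build ran
    "chalk": "*",
}

unresolved = floating_dependencies(manifest_dependencies)
```

For every name in `unresolved`, a metadata scan can only record a range, whereas a build-time SBOM can record the version that was actually pulled down.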
[39:08] Viktor Petersson
Okay, interesting.
[39:09] Viktor Petersson
And I believe, was it the Yocto or was it the Zephyr project that modified the compiler to basically do it in the compiler?
[39:18] Viktor Petersson
I think it was the Zephyr project that actually modified it to use the output from GCC, if I'm not mistaken, and then generate SBOMs based on that.
[39:26] Kate Steward
I think you're thinking more towards Yocto.
[39:29] Kate Steward
Zephyr is using CMake and it's putting instrumentation in CMake to basically pull out the debug information.
[39:37] Kate Steward
Realistically, Yocto is also pulling out the debug information out of the build flow.
[39:41] Kate Steward
And that's where the source of truth is in terms of what you're actually getting with your image, and work is ongoing to make something like this available for the Linux kernel too.
[39:51] Kate Steward
So that will be nice when that finally emerges.
[39:55] Viktor Petersson
That was going to be my next question, actually.
[39:56] Viktor Petersson
Like, I've heard rumors about this for a while, that the SBOM for the kernel is coming.
[40:02] Viktor Petersson
I guess you are kind of involved indirectly or directly with that, Kate.
[40:05] Viktor Petersson
I imagine, yeah.
[40:07] Kate Steward
Basically the same person who did it for Zephyr has turned his hand to looking at the kernel problem and so is working with Greg and some of the others in the kernel to make sure we can get something a little bit better.
[40:17] Kate Steward
So hopefully we'll be having something we feel comfortable with and making it visible with some noises, I guess, once it's there, because that'll be useful.
[40:28] Viktor Petersson
Do we have any timeline for that?
[40:31] Kate Steward
There's beta versions sitting around now.
[40:33] Kate Steward
It's just a question of making sure that it's something that we're all comfortable with.
[40:37] Kate Steward
I mean, how many hours there are in the day and how many trips you are taking is kind of the challenge here for us all.
[40:48] Viktor Petersson
I think.
[40:49] Viktor Petersson
Fair enough.
[40:49] Viktor Petersson
I mean, I think that when that hits, that will be a big massive shift for the industry.
[40:57] Viktor Petersson
I think from my vantage point, having SBOMs for the kernel, having proper SBOMs for all the Linux distros, that's when we will really have a big shift in the adoption.
[41:10] Kate Steward
I think I'll point out right now you can get an SBOM for the kernel if you use Yocto.
[41:15] Kate Steward
Just saying.
[41:17] Viktor Petersson
Fair enough, that's a fair point.
[41:20] Kate Steward
Okay, but yeah, pulling it in from the builds and things like that.
[41:24] Kate Steward
But yeah, no, you can generate a build SBOM with Yocto for the kernel, and you've been able to do that for a couple of years now.
[41:30] Kate Steward
It's just that if you're not working within Yocto and you want to have it, then you need to look for different mechanisms.
[41:37] Kate Steward
So that's what we're trying to get.
[41:38] Viktor Petersson
Yeah, absolutely.
[41:40] Viktor Petersson
Yeah.
[41:40] Viktor Petersson
No, I think at the end of the day it's the person who builds the kernel that needs to generate the SBOM, and that kind of falls then on the distributions, the distros essentially.
[41:54] Viktor Petersson
Right.
[41:54] Viktor Petersson
And they I guess use different tooling.
[41:57] Viktor Petersson
So there might be a cascading there.
[41:59] Viktor Petersson
But yeah, I don't know how custom the kernels are these days in the distributions; they're probably pretty heavily patched from the upstream kernel by now, I would imagine.
[42:08] Viktor Petersson
So there might be some lag there involved as well.
[42:13] Viktor Petersson
No, this has been super interesting. Now, in terms of the compliance landscape going forward:
[42:24] Viktor Petersson
I'm thinking both about CRA that's hitting Europe, which obviously can drive a big part, but also in terms of like compliance for.
[42:33] Viktor Petersson
I know NIST2, which was announced earlier last year, kind of stops just shy of mandating SBOMs.
[42:41] Viktor Petersson
But, I think it's PCI DSS 4.0, you can interpret that as kind of mandating SBOMs.
[42:47] Viktor Petersson
So I'm curious about how you guys are seeing the driving force for SBOMs.
[42:53] Viktor Petersson
Beyond the executive order which kind of spearheaded this in many ways.
[43:00] Kate Steward
I'm seeing it from the regulatory side and functional safety, we need that transparency.
[43:06] Kate Steward
And, you know, you want to know what's running in the nuclear power plant. Someone needs to know, even if you don't from a public side or on a personal level. You want the FDA to have actually reviewed all that's going into a medical device, especially with something that's implanted in your body.
[43:23] Viktor Petersson
Okay, yes, oh yes, absolutely.
[43:26] Kate Steward
These are things that we need that transparency when we start to have safety as an element here.
[43:31] Kate Steward
And you know, because open source changes so rapidly because there are security fixes showing up all the time.
[43:36] Kate Steward
The other thing you really want to do is you want to know, do I actually really need to fix this or not?
[43:41] Kate Steward
And so being able to have a higher precision and being able to get rid of the whole class of false positives at the component level is also an area I think we'll be heading.
[43:51] Viktor Petersson
So you believe the legal is the driving force here for change?
[43:57] Viktor Petersson
Really?
[43:58] Kate Steward
No.
[43:59] Kate Steward
Well, I'd say safety is a driving force for change, regulatory is a driving force for change rather than legal compliance.
[44:05] Kate Steward
But it's compliance, yes, compliance, yes to compliance with licensing.
[44:11] Viktor Petersson
Okay, fair enough.
[44:13] Viktor Petersson
Gary, what's your.
[44:14] Gary O’Neall
Yeah, I agree with safety.
[44:15] Gary O’Neall
I think safety is going to be a big driver.
[44:17] Gary O’Neall
You know, especially, you know, when you pull in AI self driving cars, all the data sets being used, there's a lot of regulatory compliance that's being thought about that I think is coming down the road.
[44:30] Gary O’Neall
The other thing I'll just mention, another dimension to this is the international dimension.
[44:34] Gary O’Neall
We're seeing lots of regulations come from different countries, they're similar, but not exactly the same.
[44:40] Gary O’Neall
One little, you know, kind of technical story on this is we have this tool that was formerly known as the NTIA compliance checker, which takes an SPDX file and tells you whether you comply with the NTIA minimum.
[44:55] Gary O’Neall
Well, we had to add a whole new command structure because you want to see does it meet the German standards, does it meet the Japanese standards?
[45:03] Gary O’Neall
You know, so does it meet the CRA minimums?
[45:06] Gary O’Neall
You know, so, you know, we're having to evolve our tooling to deal with these multiple regulations which have a common core, but they're just a little different.
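To give a feel for what a minimum-elements check involves, here is a hedged Python sketch. It is not the tool Gary refers to. The field names (`creationInfo`, `creators`, `created`, `relationships`, `packages`, `name`, `versionInfo`, `supplier`, `SPDXID`) follow the SPDX 2.x JSON layout, and the categories follow the NTIA minimum-elements data fields; treating `NOASSERTION` as "missing" and the sample document itself are assumptions of this example.

```python
# Package-level fields mapped to the NTIA data field each one satisfies.
PACKAGE_FIELDS = {
    "name": "component name",
    "versionInfo": "version",
    "supplier": "supplier name",
    "SPDXID": "unique identifier",
}


def missing_minimum_elements(doc):
    """List the minimum elements that the SPDX 2.x JSON document lacks."""
    problems = []
    creation = doc.get("creationInfo", {})
    if not creation.get("creators"):
        problems.append("document: author of SBOM data")
    if not creation.get("created"):
        problems.append("document: timestamp")
    if not doc.get("relationships"):
        problems.append("document: dependency relationships")
    for pkg in doc.get("packages", []):
        label = pkg.get("name") or pkg.get("SPDXID") or "<unnamed>"
        for field, meaning in PACKAGE_FIELDS.items():
            value = pkg.get(field)
            # Assumption: an explicit NOASSERTION does not satisfy the minimum.
            if not value or value == "NOASSERTION":
                problems.append(f"{label}: {meaning}")
    return problems


# Hypothetical, heavily trimmed document.
sample_doc = {
    "creationInfo": {"creators": ["Tool: example"], "created": "2025-01-16T00:00:00Z"},
    "relationships": [
        {"spdxElementId": "SPDXRef-app", "relationshipType": "DEPENDS_ON",
         "relatedSpdxElement": "SPDXRef-zlib"},
    ],
    "packages": [
        {"name": "app", "SPDXID": "SPDXRef-app", "versionInfo": "1.0.0",
         "supplier": "Organization: Example"},
        {"name": "zlib", "SPDXID": "SPDXRef-zlib", "supplier": "NOASSERTION"},
    ],
}

problems = missing_minimum_elements(sample_doc)
```

Supporting several regimes with a common core, as Gary describes, could then be a matter of swapping in a different table of required fields per regime; the actual requirements of each regime are not modeled here.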
[45:15] Viktor Petersson
Well, CRA is going to open a whole different can of worms because, yes, the first draft is there from the EU, but then you have the regional implementations of CRA, right?
[45:25] Viktor Petersson
So that's probably gonna add a lot of arguments to your command line tooling.
[45:30] Kate Steward
And then India has its own take on things and China has its own take on things, as does, you know, Japan and Germany and so forth.
[45:37] Kate Steward
So we're seeing a lot of variants and everyone has specific things they care about.
[45:42] Kate Steward
But the regulatory side is definitely a factor here, no question.
[45:47] Viktor Petersson
Yeah, I mean for me the reason why I'm excited about SBOMs in general is.
[45:51] Viktor Petersson
I mean it's funny to be excited by the JSON element because it's really not that sexy.
[45:56] Viktor Petersson
But it's what it enables and what it spans.
[46:00] Viktor Petersson
The world I anticipate is that in the next wave of regulations, be it SOC 2, NIST, or ISO, all these frameworks will mandate SBOMs in the next version or the one after that. I'm fairly confident of that.
[46:17] Viktor Petersson
CRA just kind of opened this and started making it part of the public debate, and now it's going to be part of all these frameworks.
[46:25] Viktor Petersson
That's at least that's the way I see this path going forward.
[46:30] Gary O’Neall
Yeah, I think one of the interesting data sides of it is the geography.
[46:34] Gary O’Neall
You know, how do you know what geography software originated in?
[46:38] Gary O’Neall
You know, in the services profile that I'm working on geography is very important because you know there's certain regulatory issues as far as, you know whether your data is actually stored on a server, you know where that service actually is.
[46:53] Gary O’Neall
So including this geography information is interesting, but export regulations also require it.
[46:59] Gary O’Neall
Now with self-driving cars, there are regulations about where the software for that self-driving car can originate.
[47:07] Gary O’Neall
It's just going to be just a real challenge.
[47:09] Gary O’Neall
I don't think the challenge is representing the data.
[47:11] Gary O’Neall
I think the job for SPDX is pretty easy because the country codes and everything is pretty well standardized.
[47:18] Gary O’Neall
But how are you going to determine these things is the interesting thing, especially for open source projects.
[47:23] Gary O’Neall
So some nice, interesting problems to work on.
[47:26] Viktor Petersson
Absolutely.
[47:28] Viktor Petersson
I mean, if the code lives in GitHub, you have the committer of that code, but you have no way of knowing geographically where this person is, and what if it's a constellation of people?
[47:39] Viktor Petersson
Then you have even less way of expressing that.
[47:42] Viktor Petersson
Right?
[47:42] Gary O’Neall
Yeah.
[47:43] Gary O’Neall
And what if somebody, you know, that's a US citizen moves to this other geography?
[47:47] Gary O’Neall
There's all kinds of really interesting edge cases to think about.
[47:52] Gary O’Neall
You know, in terms of what does this really mean as far as determining the origin of this software or data?
[48:00] Viktor Petersson
Yeah, absolutely.
[48:03] Viktor Petersson
This has been super interesting.
[48:05] Viktor Petersson
I really appreciate both of you taking time out of your busy days, and I am looking forward to seeing SPDX 3 getting wider adoption across tooling, because I know, as Gary is acutely aware, there is a problem with tooling keeping up with the latest version.
[48:20] Viktor Petersson
So I'm looking forward to seeing that adoption creep up across all the tools. Any closing notes from either of you before we wrap up for today?
[48:30] Gary O’Neall
Let me just.
[48:31] Gary O’Neall
I'll just make one quick comment.
[48:32] Gary O’Neall
I'll turn it over to Kate, but I just want to mention SPDX is an open community.
[48:37] Gary O’Neall
We're welcoming of all contributors, whether it's tooling, improving the spec.
[48:42] Gary O’Neall
Just go to spdx.dev and you'll find out how you can participate.
[48:47] Gary O’Neall
But join us. You know, if you see an issue with our standard, if you think there's something off, you know, come in.
[48:52] Gary O’Neall
We're easy to work with and very happy to have more contributors to the spec.
[48:57] Gary O’Neall
So I just wanted to mention that for the group, and it's great talking to you, Viktor, and as always, Kate, and thanks for having us.
[49:05] Kate Steward
Yeah, it's been really good.
[49:06] Kate Steward
Viktor, I'd also bring up that if you're interested in working on tools and helping us to move some tools forward to SPDX 3, we definitely would welcome all help here.
[49:18] Kate Steward
We don't have, you know, we don't have large corporate organizations really pushing for things behind us, so it makes it a little bit more interesting to make the enablement happen.
[49:28] Kate Steward
And so this is our second transition.
[49:31] Kate Steward
We did a transition between 1.2 and 2.0, but there was a lot fewer tools out there at the time.
[49:37] Kate Steward
And now there's a whole ecosystem out there.
[49:41] Kate Steward
So now we're basically going through this transition.
[49:43] Kate Steward
But I think what's there is a good base for us to evolve and hopefully the next set of transitions will not be as interesting because we will be adding more things for a while.
[49:54] Kate Steward
So I think we're on three for a little bit.
[49:57] Kate Steward
Yes, or Gary and I will be retired by then before the transition happens.
[50:04] Kate Steward
Well, we'll see.
[50:06] Viktor Petersson
Perfect.
[50:07] Viktor Petersson
Again, thank you so much for coming on the show and have a good day.
[50:11] Viktor Petersson
Cheers.
[50:12] Gary O’Neall
Thank you.
