It is not effective if it just takes a simple base64 encode to bypass. If Claude is trivially able to find that it is malicious then Pypi is being negligent.
It is not effective if it just takes a simple base64 encode to bypass. If Claude is trivially able to find that it is malicious then Pypi is being negligent.
The package in question was live for 46 minutes. It generally takes longer than that for security partners to scan and flag packages.
PyPI doesn't block package uploads awaiting security scanning - that would be a bad idea for a number of reasons, most notably (in my opinion) that it would be making promises that PyPI couldn't keep and lull people into a false sense of security.
I realize this is controversial (and many Python folks would claim anti ethical). But I keep wondering if requiring a small payment for registering and updating packages would help. The money could go to maintaining pypix as well as automated AI analysis. Folks who really couldn't afford it could apply for sponsorship.
Very much not speaking for the PSF here, but my personal opinion on why that wouldn't work is that Python is a global language and collecting fees on a global basis is inherently difficult - and we don't want to discriminate against people in countries where the payment infrastructure is hard to support.
PyPI has paid organization accounts now which are beginning to form a meaningful revenue stream: https://docs.pypi.org/organization-accounts/pricing-and-paym...
Plus a small fee wouldn't deter malware authors, who would likely have easy access to stolen credit cards - which would expose PyPI to the chargebacks and fraudulent transactions world as well!
I don't think people want to pay for that.
If pypi charges money, python libraries will suddenly have a lot of "you can 'uv add git+https://github.com/project/library'" instead of 'uv add library'.
I also don't think it would stop this attack, where a token was stolen.
If someone's generating pypi package releases from CI, they're going to register a credit card on their account, make it so CI can automatically charge it, and when the CI token is stolen it can push an update on the real package owner's dime, not the attackers, so it's not a deterrent.
Also, the iOS app store is an okay counter example. It charges $100/year for a developer account, but still has its share of malware (certainly more than the totally free debian software repository).
TBH there isn't much difference in pulling directly from GH.
Though I do like your Apple counterexample.
It should not let people download unscanned dependencies without a warning and asking the user to override and use a potentially insecure package. If such security bug is critical enough to need to bypass this time (spoiler: realistically it is not actually that bad for a security fix to be delayed) they can work with the pypi security team to do a quicker manual review of the change.
sadly I still worry about that. An install fails once, you you hard code the --force flag in all your CI/CD jobs and we are back in the same place again. I am not sure what the answer is, though problems...
Adding a hardcoded flag is not the same as asking the user if they want potential malware. If CI/CD is broken they should revert the change to pinned dependencies instead of trying to install a bleeding edge version of a new dependency that hasn't been scanned yet.
What happens then if the security scanners say something is safe and it turns out not to be?
I don't think PyPI should be in the business of saying if a piece of software is safe to install or not.
Then it will be downloadable and then it's up to your own security scanners to catch it. If you find it, it should be reported to pypi and then the scanner should be improved to catch that kind of bypass the next time it comes around. In such a world I don't think pypi is acting negligent.
Would you happen to know where the latency comes from between upload and scanning? Would more resources for more security scanner runners to consume the scanner queue faster solve this? Trying to understand if there are inherent process limitations or if a donation for this compute would solve this gap.
(software supply chain security is a component of my work)
He said, "pypi doesn't block upload on scanning"; that's part of where the latency comes from. The other part is simply the sheer mass of uploads, and that there's not money in doing it super quickly.
I agree that's a bad idea to do so since security scanning is inherently a cat and mouse game.
Let's hypothetically say pypi did block upload on passing a security scan. The attacker now simply creates their own pypi test package ahead of time, uploads sample malicious payloads with additional layers of obfuscation until one passes the scan, and then uses that payload in the real attack.
Pypi would also probably open source any security scanning code it adds as part of upload (as it should), so the attacker could even just do it locally.
I suppose my argument is that pypi could offer the option to block downloads to package owners until a security scan is complete (if scanning will always take ~45-60 minutes), and if money is a problem, money can solve the scanning latency. Our org scans all packages ingested into artifact storage and requires dependency pinning, and would continue to do so, but more options (when cheap) are sometimes better imho. Also, not everyone has enterprise resources for managing this risk. I agree it is "cat and mouse" or "whack-a-mole", and always will be (ie building and maintaining systems of risk mitigation and reduction). We don't not do security scanning simply because adversaries are always improving, right? We collectively slow attackers down, when possible.
("slow is smooth, smooth is fast")
I don't know that myself but Mike Fiedler is the person to reach out to, he runs security for PyPI and is very responsive. security@pypi.org