I think the point around incorporating MFA into the automated publishing flow isn't getting enough attention.
I've got no problem with an MFA prompt to confirm a publish triggered by a CI workflow - but last I looked, doing that meant a convoluted process of opening an HTTPS tunnel out (via a third-party service) just so you could supply the code.
I'd love to see either npm or GitHub provide an easy, out-of-the-box way for me to provide/confirm a code during CI.
Publishing a package involves two phases: uploading the package to npmjs, and making it available to users. Right now these two phases are bundled into a single operation.
I think the right way to approach this is to unbundle uploading packages from publishing them (i.e. making them available to end users).
CI systems should be able to build & upload packages in a fully automated manner.
Publishing an uploaded package should require a human to log into the npmjs website, go through MFA, and manually make the package live.
Completely agree tbh, and that would be one of my preferred approaches should npm be the actor to implement a solution.
I also think it makes sense for GitHub to implement the ability to mark a workflow as sensitive and requiring "sudo mode" (MFA prompt) to run. It's not miles away from what they already do around requiring maintainer approval to run workflows on PRs.
Ideally both of these would exist, as not every npm package is published via GitHub Actions (or any CI system), and not every GitHub workflow taking a sensitive action is publishing an npm package.
Which CI would that be?
npm should require this for packages that have a large enough blast radius.
npm should require it for all packages.
I'm starting to feel that maybe the entire concept of "publishing packages" isn't really needed? Instead, the VCS can be used as the source of truth, with no extra publishing step required.
This is how Go works: you import by URL, e.g. "example.com/whatever/pkgname", which is presumed to be a VCS repo (git, mercurial, subversion, etc.). Versioning is done by VCS tags and branches. You "publish" by adding a tag.
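To make that concrete, here's a minimal sketch of what the consuming side looks like - the module path, package name, and versions below are made up, but the shape is real:

    // go.mod of a consuming project -- the import path doubles as the
    // download location, so there is no separate registry upload step.
    module example.com/myapp

    go 1.22

    // Fetched straight from the repo (through proxy.golang.org by default).
    // The version is just a VCS tag: the library author "publishes" v1.4.3
    // by running `git tag v1.4.3 && git push origin v1.4.3` -- nothing else.
    require github.com/someuser/somepkg v1.4.2

And since go.sum pins a hash of the module contents, a tag that later gets moved to point at different code should fail with a checksum mismatch at download time.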
While VCS repos can be and have been compromised, this removes an entire attack surface from the equation. If you read every commit, or a diff between two tags, then you've seen it all - no need to also diff the .tar.gz packages. I believe this would have prevented this entire incident, and I believe it would also have prevented the one from a few weeks ago (AFAIK that also relied only on compromised npm accounts, not a compromised VCS?).
The main downside is that moving a repo is a bit harder, since the import path will change from "host1.com/pkgname" to "otherhost.com/pkgname", or "github.com/oneuser/repo" to "github.com/otheruser/repo". Arguably, this is a feature – opinions are divided.
Other than that, I can't really think of any advantage a "publish package" step adds? Maybe I'm missing something? But to me it seems like a relic from the old "upload tar archive to FTP" days before VCS became ubiquitous (or nigh-ubiquitous, anyway).
There's also a cost: installs take much longer, you need the full toolchain installed, and builds are no longer reproducible due to variations in the local build environment. If everything you do is a first-party CI build of a binary image you deploy, that's okay, but for tools you're installing outside of that kind of environment it adds friction.
Agreed. In the JS world? Hell no. Ironically, doing a local build would itself pull in a bunch of dependencies, whereas now you can at least, technically, have a single already-built dependency.
None of these are problems for Go: the pull-through proxy is fast and removes the need for a toolchain if you just want to download, and Go builds are fully bit-for-bit reproducible.
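For what it's worth, the "no toolchain" part is because the proxy is just a documented HTTP protocol (GET <proxy>/<module>/@v/<version>.info, .mod, or .zip). Rough sketch below, using golang.org/x/text v0.14.0 purely as an example module; a plain curl against the same URL works just as well:

    // Sketch: the module proxy speaks plain HTTP, so nothing Go-specific is
    // needed just to download a pinned module version.
    package main

    import (
        "fmt"
        "io"
        "net/http"
    )

    func main() {
        // Proxy protocol: GET $GOPROXY/<module>/@v/<version>.info (or .mod / .zip)
        resp, err := http.Get("https://proxy.golang.org/golang.org/x/text/@v/v0.14.0.info")
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        body, _ := io.ReadAll(resp.Body)
        fmt.Println(string(body)) // e.g. {"Version":"v0.14.0","Time":"..."}
    }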
That would be an impossible expectation to put on the Go toolchain. The pull-through proxy can't magically avoid the need to transfer all dependencies to my device, including any native code or other resources. Large projects are going to need to download stuff - think about how some cloud clients generate code dynamically from API definitions, or how many codecs wrap native code.
Similarly, newer versions of Go change the compiler (which, to be fair, is a good thing), so even if I start with the same source in Git I might not end up with the same compiled bytes.
Again, none of this is a bad thing: it just means that I want to compile binaries and ship those, so they don't unexpectedly change in the future and my CI pipeline doesn't need a full Go build stage when all I want is to use Crane to do something with a container.
sometimes i think shipping source + compiler would be faster...
the other day i was wondering why the terraform aws provider binary was now around 800MB compiled https://github.com/hashicorp/terraform-provider-aws/issues/3...
We have a terraform monorepo with many small workspaces (i.e. state files). The amount of disk space used by the .terraform directories on a fully init'ed clone is wild.
As a lot of these npm "packages" are glorified code snippets that should never have been individual libraries, perhaps this would drive people to standardise and improve the build tooling, or even move towards having sensibly sized libraries?
Yes, there’s widespread recognition that the small standard library makes JavaScript uniquely dependent on huge trees of packages, and that many of them (e.g. is-arrayish from last week) are no longer necessary but still linger from the era where it was even worse.
However, this isn’t a problem specific to JavaScript – for example, Python has a much richer standard library and we still see the same types of attacks on PyPI. The entire open source world has been built on a concept of trust which was arguably always more optimistic than realistic, and everyone is pivoting – especially after cryptocurrency’s inherent insecurity created enough of a profit margin to incentivize serious attacks.
Doubt that many read those diffs of dependencies.