Recommend using `pnpm` for transparent, correct disk-space saving

App developers: please consider using pnpm. YNH core developers: please consider recommending, or non-strictly enforcing, the use of pnpm in place of npm in app packaging.

Why

I want to suggest that app developers consider, and favor, pnpm when packaging JavaScript apps. I suggest this because pnpm transparently shares downloaded dependencies among all apps on the system that rely on it. Each unique package@version is downloaded once per host into a shared, content-addressable store keyed by the hash of the downloaded content. For each project managed by pnpm, the packages in node_modules are then linked to that system-wide store (hard links into the store, plus symlinks inside node_modules) instead of being copied.
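To make the sharing idea concrete, here is a minimal shell sketch of a content-addressable store with hard links. It is a toy model only: the function names, the `some-lib` package and the directory layout are made up for illustration and do not match pnpm's real store format. It assumes GNU coreutils (`sha256sum`), as found on Debian.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy content-addressable store (hypothetical layout, not pnpm's real one).
# add_to_store FILE STORE_DIR -> prints the path of the stored copy.
add_to_store() {
  local file="$1" store="$2" hash
  hash="$(sha256sum "$file" | cut -d' ' -f1)"
  mkdir -p "$store"
  # Content is copied in only the first time this hash is seen.
  [ -e "$store/$hash" ] || cp "$file" "$store/$hash"
  printf '%s\n' "$store/$hash"
}

# link_into_project STORED_FILE DEST -> hard-links the stored copy into a project.
link_into_project() {
  local stored="$1" dest="$2"
  mkdir -p "$(dirname "$dest")"
  ln -f "$stored" "$dest"
}

# Demo in a throwaway directory: two "apps" depend on the same file.
demo="$(mktemp -d)"
printf 'module.exports = 42;\n' > "$demo/index.js"
stored="$(add_to_store "$demo/index.js" "$demo/store")"
link_into_project "$stored" "$demo/app-a/node_modules/some-lib/index.js"
link_into_project "$stored" "$demo/app-b/node_modules/some-lib/index.js"

# Both paths print the same inode number, so the data exists once on disk.
ls -i "$demo/app-a/node_modules/some-lib/index.js" \
      "$demo/app-b/node_modules/some-lib/index.js"
```

Hard links only work within one filesystem, so in this model the store and the apps have to live on the same partition.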

I share this because I’m investigating what is filling my disk. node_modules directories often come up as heavy hitters, and mitigating disk usage often requires coordination beyond the level of individual app packaging.

Specific Inspiration

I ran out of disk space on my server, which I share with friends and which hosts YunoHost. I happened to resolve the immediate problem with journalctl --rotate and journalctl --vacuum-time=2w, and I have other strategies I want to push on.

Other disk pressure

One cause in particular is the inclusion of Cypress, which belongs in the dev (not production) dependencies of a project’s package.json, and so should probably not get installed by ynh-packaged apps. Cypress, a web testing framework (presumably used by upstream projects for end-to-end integration tests), accounts for 606MB of 1.9GB and 643MB of 1.5GB in my two most disk-hungry applications, Uptime-Kuma and Monica, respectively. It’s installed and consuming disk space even though it should probably never run on my machine.
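As a rough sketch of how a packaging script could skip dev dependencies, the helper below picks a production-only install command based on which lockfile is present. The function name is hypothetical. The flags used are `--omit=dev` for npm (version 7 and later) and `--prod` with `--frozen-lockfile` for pnpm. This only helps when upstream lists Cypress under devDependencies, which I have not checked for either app.

```shell
# production_install_cmd DIR -> prints a production-only install command for DIR.
# Hypothetical helper: it only prints the command, it does not run it.
production_install_cmd() {
  local dir="$1"
  if [ -f "$dir/pnpm-lock.yaml" ]; then
    # pnpm project: skip devDependencies and refuse to modify the lockfile.
    echo "pnpm install --prod --frozen-lockfile"
  elif [ -f "$dir/package-lock.json" ]; then
    # npm project with a lockfile: clean, reproducible install without dev deps.
    echo "npm ci --omit=dev"
  else
    # No lockfile found: fall back to a plain install without dev deps.
    echo "npm install --omit=dev"
  fi
}
```

Calling `production_install_cmd /path/to/app` (a placeholder path) prints the command so it can be reviewed before it is run inside the app directory.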

Uptime-Kuma uses another 510MB for node_modules (again, somewhere pnpm can help), 582MB for my status time-series data, and 200MB for an opaque SHA512-keyed .npm cache.
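For anyone who wants to take the same kind of measurement, here is a small sketch that lists every top-level node_modules directory under a given root with its size in KiB, largest first. The function name is made up, and `/var/www` in the usage note is only an assumption about where apps are installed.

```shell
# list_node_modules_usage ROOT -> "SIZE_KIB PATH" per node_modules directory, largest first.
list_node_modules_usage() {
  local root="$1"
  # -prune keeps find from descending into a node_modules directory once matched,
  # so nested node_modules are counted as part of their parent, not listed again.
  find "$root" -type d -name node_modules -prune -exec du -sk {} + | sort -rn
}
```

Something like `list_node_modules_usage /var/www | head` would then show the biggest offenders first.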

Additions

It seems that Yarn, with its PnP (Plug'n'Play) strategy, addresses this as well through a shared global cache. However, with enableGlobalCache at its default of false, it makes a copy for each project. So Yarn does not, by itself, by default, or in most configurations, solve this issue as well as pnpm does.

To me it’s not clear if pnpm is the silver bullet you want it to be.

First, apps often depend on a specific NodeJS version, hence maybe different versions of npm, hence maybe different versions of pnpm.

Second, apps often depend on very specific versions of packages, and I’m not convinced that the overlap between NodeJS apps is going to be large enough to make pnpm worthwhile … (maybe it is, I don’t know the NodeJS ecosystem well enough)

I appreciate the answer!

If it deduplicated a 1.5GB library once, for a user with at least 5 applications, then it would be worthwhile to my mind. (I don’t know the stats, I don’t know if they’re collected). I understand that this is a complicated ecosystem, and that supporting a different package manager from the upstream could very well represent an untenable maintenance burden.

At more length than I could quickly muster:

Some indirect indications of project maturity, and errors worth documenting for users:

Not holding you to anything, but I’m curious which risks loom largest in your mind about this direction, especially if someone else worked on maintaining it.


To more specifically address the concern you explicitly raised:

In essence:

  • npx pnpm@<version> <command>

or:

  • corepack prepare pnpm@7.13.6 --activate

I would bet good money that different versions of pnpm would share the content-addressable package store seamlessly without conflict at the same location.