Pkg ecosystem: Learning from others' mistakes


#1

Let’s learn from others’ mistakes instead of repeating them. Specifically, the debacle that is node.js / npm. See the following two recent train-wrecks:

The first one is the infamous left-pad: This was essentially a 10-line package that got pulled in as a transitive dependency into unholy amounts of code. At some point, the maintainer “unpublished” this package over an unrelated dispute, and giant parts of the ecosystem broke down (unsatisfiable dependencies).

The second one is more recent and still unfolding: A less ridiculous package got used as a transitive dependency in a lot of packages (2 million downloads / week, almost 1600 directly dependent packages and who knows how many indirect reverse dependencies). At some point, the original author passed on maintainership for lack of interest. The new maintainer promptly backdoored the package. This was discovered today, after 2+ months.

Let us reflect for a moment on what we can do better, especially before practices that turn out to be harmful get entrenched.

Typical linux distributions have their act together: Failure on such a monumental scale is rare. Node.js / npm does not have its act together.

This is not about bad actors, or failure of judgement when passing on maintainership, or failure of judgement when pulling in dependencies. I don’t even think that this is a technological failure. Human error is a symptom of a fragile system.

We want a system / community / package manager / ecosystem that accounts for lazy programmers and human error.

I’m not against the existence of footguns; I’m against the path of least resistance being a loaded footgun. I’ll post my own thoughts on what to learn below, in a couple of minutes. But I think this issue is worth discussing from more viewpoints than just my own, especially since there are much more experienced people around.


#2

There is also this: https://incolumitas.com/2016/06/08/typosquatting-package-managers/.


#3

Ok, let’s compare to a system that mostly works. For example, archlinux. “Official packages” have official maintainers that curate upstream changes. Then, there is the AUR: These are “unofficial packages”, with a pretty low bar to entry, and slightly lower ease of installation. Backdoored AUR packages have happened, and got pulled.

I am advocating for a somewhat similar system: There should be a relatively small subset of “curated” packages (that cannot have uncurated dependencies). And then there should be a larger subset of “uncurated” packages. Installation of uncurated packages need not be complicated: For example something like (v1.0) pkg> add uncurated foobar, with mandatory warning and display of a list of all transitive dependencies, as well as suitable overrides for running silently.
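To make the "mandatory warning and display of a list of all transitive dependencies" concrete, here is a minimal sketch in Python. It is not Pkg code: the dependency graph, the curated set and the function names are all made up for illustration, and a real implementation would read this information from the registries.

```python
from collections import deque

# Hypothetical dependency graph: package name -> direct dependencies.
DEPS = {
    "foobar": ["DataStructures", "tinypad"],
    "DataStructures": [],
    "tinypad": ["leftish"],
    "leftish": [],
}

# Hypothetical contents of the curated registry.
CURATED = {"DataStructures"}


def transitive_deps(package, deps):
    """Return every direct and indirect dependency of `package`."""
    seen = set()
    queue = deque(deps.get(package, []))
    while queue:
        name = queue.popleft()
        if name in seen:
            continue
        seen.add(name)
        queue.extend(deps.get(name, []))
    return seen


def uncurated_warning(package, deps, curated):
    """Build the warning shown before installing `package`, or None if all is curated."""
    to_install = transitive_deps(package, deps) | {package}
    risky = sorted(name for name in to_install if name not in curated)
    if not risky:
        return None
    return "Warning: installing uncurated packages: " + ", ".join(risky)


print(uncurated_warning("foobar", DEPS, CURATED))
```

The point of the sketch is only that the warning is cheap to compute: one graph traversal plus one set lookup per package.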

This is made more important by the philosophy of “no batteries included” / “packages are first class” / excision from Base and eventual move from stdlib to packages, as opposed to e.g. python. There is a large difference in whom I am granting code execution on my machine: DataStructures and BenchmarkTools (both could be stdlib, trust-wise) vs the eventual left-pads of the julia world.

This would also answer the perennial question: Why is $feature not part of the language and instead a package? Nobody would ask this of a linux distribution, because all packages can be expected to work: If the package screws up, then the buck stops somewhere. Instead, “why is there no curated package for $feature” would become a valid criticism.

As opposed to julia and node.js: Being a registered package is no high bar.

This would also significantly improve discovery: Users can scan the curated section first, at least if their desired functionality is not entirely niche.

What limits the size of the curated registry? Basically this is a matter of building trust, between the old maintainers and newcomers. If we cannot maintain a large curated registry, for lack of people who have already proven to be reliable and are willing to maintain packages, then the curated subset needs to be small. Better a small curated subset than an empty one.

Seeing that a lot of trusted (core)devs of julialang are also working on many important packages, there is already a seed for such a system. The “trusted maintainers” would not need to take over package development; they would instead simply be the only ones with permission to tag new versions in the curated registry, and would be expected to feel appropriately sorry if they unintentionally sign off on a backdoor. They would not be expected to read all commits going into a new release; they would be expected to attest that the development leading to the new release was generally sane.

As another point, we would immediately get a reasonable target for PkgEval :wink:


#4

Also, cf this.

This issue is structural, not individual.

The specific incident triggering this post appears to be relatively benign: A targeted attack against bitcoin wallets instead of widespread intrusion, pivoting and month-long exfiltration of confidential data. :popcorn:


#5

Something along these lines seems reasonable, though of course you have the problem of “who maintains the maintainers?”


#6

I think the communities to learn from are linux/BSD distros: Most distros get this right most of the time (in spite of debian’s openssl RNG trainwreck). One can look at what e.g. debian, freebsd, archlinux, gentoo do.

I’m not an active package maintainer for any distro, so my experience is limited. But none of these projects take the “wild west” approach, like e.g. npm and pkg. Since the “core” part of the ecosystem is relatively small, and the base of trusted devs relatively large, I’d hope that some subset (without any outgoing dependencies leaving that subset) can be selected, and maintenance of maintainers would be a non-problem until more experience has been collected.

Most niche packages don’t need to be curated. The problem is not dependencies in user code (users explicitly research and decide to use a package); it is transitive dependencies: We should not demand that users skim the issue tracker / codebase of every single transitive dependency every time. Even more, the risk is in culture: A world of tiny packages, where many popular packages install another 1k packages of dubious origin with no quality control whatsoever, is just not sustainable, and asking npm users to personally vet transitive deps is a bad joke.

To prevent that: Mild quality control through curated repos, mild pressure to only have transitive deps inside the curated world, mild pressure to consolidate trivial packages into larger ones, mild pressure to rather duplicate 20 lines of code than pull in a dep.


#7

Manpower, primarily. While the problem you describe has the potential to be acute at some point, its solution requires a lot of (volunteer) work. Someone has to do the curating.

That said, the situation in Julia does not appear to be that dismal. While many packages have a single maintainer in the formal sense, widely used ones usually get contributions from other people, many of whom keep an eye on the changes. This is not foolproof or perfect, but could mitigate the problems somewhat.

So, to recap, you are quietly working in your free time on a very monotonous yet demanding task, but if disaster hits you will take (part of) the blame. I am not sure this is a dream job.

I am not sure I understand this proposal.


#8

Can you explain why you consider this a relevant example for the Julia ecosystem? My understanding is that with the current (and forthcoming) registry setup, one could not do the same thing in Julia, as it is not possible to “unpublish” packages, and definitely not without the assistance of the registry maintainers.


#9

You can remove the repository, there are currently no official backups/mirrors.


#10

There is a big related discussion on slack, if you’re quick enough to look:
https://julialang.slack.com/archives/C680MM7D4/p1543248105775800


#11

There is no problem of manpower: The required effort is one of the curators or curator-trusted devs hitting “tag new release”, which is a tiny tiny fraction of the effort needed to develop the code that went into the new release.

The curation would not serve to do actual coding work. It would serve to transmit trust. The problem is entirely social: it is about building a community of trusted people who are willing to take responsibility and oversight. This doesn’t diminish the effort going into a curated repository, but it is an effort of a different nature than coding.

So, to recap, you are quietly working in your free time on a very monotonous yet demanding task, but if disaster hits you will take (part of) the blame. I am not sure this is a dream job.

If you push any code on github this is already the case (or do you seriously think you won’t be blamed if you compromise users by pushing a backdoor?). There is ample precedent of curated software repositories working well: About every Linux / BSD distro. There is also ample precedent of failure. There are a lot of ideological, economic, social and technical factors playing into this. Just some examples:

  1. The windows world, aka DLL-hell. The main thing to notice is complete decentralization with complete abdication of both technical and social responsibility of all involved parties. You get a chaotic hellscape of people bundling their dependencies (and never fixing security vulns in their bundled libpng), and users voluntarily installing malware. Well, at least users own their machines and microsoft failed at becoming a tyrant overlord. Software distribution on windows machines becomes bearable if you install cygwin.

  2. The Android world. Like the windows world, only worse: distribution is centralized with no meaningful curation, and the trust relationship between developers and users is intentionally broken by google (download from the play store with no meaningful control that the purported author of a package is the real author). Also, unsophisticated users don’t own their machines anymore.

  3. The iPhone world. A dystopian hellscape where “owners” don’t own their devices anymore. Imagine Steve Jobs’s boot stamping on your face, forever (iOS is functionally emulating a Harvard architecture). Curation is still minimal, but at least not entirely meaningless, and users get really good exploit mitigation out of the deal (and by paying $100/year for a dev license they can run their own code on “their own” devices; heck, it is probably even possible to make julia work with a dev license). Our Microsoft/Palladium-nightmare from the late 90s come true.

  4. Julia/pkg, node.js/npm. Technical responsibility is shouldered, dependencies and versioning are properly managed. Users own their machines, and it is a joy to use the systems. However, we see a complete abdication of social responsibility.

  5. About every free software distro. Both technical and social responsibility are taken. The system is not decentralized; instead you have a hierarchical federated system of curation. Different repos (core/extra/community/AUR/archzfs) have different curation standards. The entire thing is also federated one level up: You can use a different distro. For users choosing a distro, the social aspect of curation standards of repositories is probably more important than the technical aspects that separate distros. You can fork entire distros, including their curation work, like ubuntu did with debian.

I am arguing that (5) is a better model to follow than (4): We should not abdicate responsibility for the social aspect of deciding which code to trust to run on users’ machines, and of actively shaping an ecosystem that is sane. If there is consensus that this responsibility should be shouldered, then it is possible to gradually improve the state of things and learn from mistakes. But it is a binary decision of whether to do any curation at all.


#12

I am not sure I understand the proposal. If no one reviews code, where is the effective curation coming from?

I am not sure that “being blamed” is an effective deterrent for people who have decided to backdoor software. But this is beside the point — you were talking about curators “being blamed” because they let a backdoor slip by.

Now, if I understand correctly, you are proposing that they just hit the tag button without any effective review. Sure, I could “blame” them, but I doubt it is going to do any good to anyone; as I implicitly accepted a system where I know they don’t effectively review code.


#13

I’m proposing that we get a new repository for pkg: “trusted” vs “untrusted”. Only trusted community members can tag new releases into the “trusted” repo, and tags carry cryptographic hashes / signatures, possibly with a central backup to prevent depublication. Packages in the curated section can only have dependencies inside the curated section, so it is a self-contained subset of the ecosystem. Initially, the core devs decide on which other devs to invite to trusted status, and which packages to put into the curated sections. Once there is no overlap between trusted curators and devs who are somewhat active in a package’s development, the curated release cannot get updated anymore (or the package needs to be kicked into uncurated status).
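The two rules in this proposal (curated packages only depend on curated packages, and a curated release needs a trusted curator among the active devs) can be checked mechanically. A rough Python sketch, where the data layout and the function names are assumptions made for illustration and not an existing registry format:

```python
def closure_violations(curated, deps):
    """Map each curated package to its direct dependencies outside the curated set."""
    violations = {}
    for pkg in sorted(curated):
        outside = sorted(d for d in deps.get(pkg, []) if d not in curated)
        if outside:
            violations[pkg] = outside
    return violations


def can_tag_curated(package, active_devs, trusted_curators):
    """A curated release needs at least one trusted curator among the active devs."""
    return bool(set(active_devs.get(package, [])) & set(trusted_curators))


# Hypothetical example data.
deps = {"A": ["B"], "B": ["C"], "C": []}
active_devs = {"A": ["alice", "bob"]}

print(closure_violations({"A", "B"}, deps))
print(can_tag_curated("A", active_devs, {"bob"}))
```

Checking direct dependencies is enough: if every curated package only has curated direct dependencies, then by induction every transitive dependency is curated as well.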

The curator does not attest “I have audited the entire package”; instead, the curator attests “this package is under active development by sane people with sane review processes, and I have looked at a small subset of the development leading to the new release. I would trust that code on my machine. I can recommend that this package meets $quality_guideline.”

The debian maintainer for openssl is not responsible for catching heartbleed; he is responsible for catching left_pad. The actual openssl people are responsible for catching heartbleed.

And yes, this means that one-man-packages probably don’t belong in the curated section. This is fine, and nobody prevents users from installing uncurated packages. Users just get a warning that they are trusting a single person without oversight with code-exec on their machine; and larger packages are discouraged from adding one-man-packages as (direct or indirect) dependencies without taking part in the maintenance labor.


#14

I’d prefer a decentralized process.

Maybe each user could have a list of people they trust. If I’m paranoid, I only put myself on the list, and then I have to sign off on every new version of every package I’m using.

If I’m more trusting, I put a couple of other people on the list, and automatically get new versions as soon as they’ve either signed off on them or published them on a server that I trust. If I know that somebody is both trustworthy and careful about whom they add to their list, I might merge their list with mine. (This might apply to e.g. the core developers or my local sysadmin.)

Those who are not worried about malicious code can skip the list and automatically accept the latest version of every registered package, as is the current default.
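A small Python sketch of how such per-user trust lists could drive updates. Everything here is hypothetical (the names, the sign-off table, the idea that versions arrive ordered oldest to newest), and it assumes sign-offs are already authenticated somehow:

```python
def merged_trust(own, *others):
    """Merge my trust list with lists published by people I trust."""
    result = set(own)
    for other in others:
        result |= set(other)
    return result


def latest_acceptable(versions, signoffs, trusted):
    """Return the newest version signed off by someone in `trusted`, or None.

    `versions` is ordered oldest to newest; `signoffs` maps version -> set of signers.
    Passing trusted=None models the current default: take the latest version.
    """
    if trusted is None:
        return versions[-1] if versions else None
    for version in reversed(versions):
        if signoffs.get(version, set()) & trusted:
            return version
    return None


# Hypothetical example data.
versions = ["1.0.0", "1.1.0", "1.2.0"]
signoffs = {"1.0.0": {"me"}, "1.1.0": {"alice"}, "1.2.0": {"mallory"}}

print(latest_acceptable(versions, signoffs, {"me"}))                           # paranoid
print(latest_acceptable(versions, signoffs, merged_trust({"me"}, {"alice"})))  # more trusting
print(latest_acceptable(versions, signoffs, None))                             # current default
```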


#15

Too trusting. :wink:


#16

I’m far from an expert in this sort of thing, but I have been playing around with this for the past few months. So take this all with a grain of salt…

There are two distinct problems here: (1) allowing users to identify people that they trust to write good code, and (2) providing a mechanism for developers to sign off on the code.

For problem (1), I think that we can draw inspiration from openPGP’s web of trust. The basic idea is that every person has a secret, and they can use this secret to endorse other people. In PGP, this endorsement simply means that the endorser is satisfied that the endorsee is the person that they say they are. There are also different levels of endorsement that can be given.

In a software development context, we could adapt this to mean that the endorser thinks that the endorsee is a trustworthy dev. Here trustworthiness means both that the dev won’t deliberately insert malware into their code, and also that they won’t accidentally write insecure code that another party can exploit.

Now a user can use this web of trust to determine who they trust, and how much they want to trust the endorsements of the people that they trust. As Stefan pointed out in Slack, if the core Julia devs wanted to backdoor users, they would have already. So, at a minimum, they should probably be trusted.
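One simple way to express "how much I trust the endorsements of the people I trust" is a hop limit on the endorsement graph. The Python sketch below is only an illustration of that idea, not how openPGP computes validity; the graph and the names in it are invented.

```python
from collections import deque


def trusted_within(root, endorsements, max_hops):
    """Return everyone reachable from `root` in at most `max_hops` endorsement steps."""
    hops = {root: 0}
    queue = deque([root])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not follow endorsements beyond the hop limit
        for endorsed in endorsements.get(person, ()):
            if endorsed not in hops:
                hops[endorsed] = hops[person] + 1
                queue.append(endorsed)
    return set(hops) - {root}


# Hypothetical endorsement graph: endorser -> people they endorse.
endorsements = {"me": ["core_dev"], "core_dev": ["alice"], "alice": ["bob"]}

print(trusted_within("me", endorsements, 1))
print(trusted_within("me", endorsements, 2))
```

With a limit of 1 only direct endorsements count; raising the limit trades safety for coverage.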

Turning to problem (2), we need a way for devs to endorse packages as well. In my mind, this should be done on a per version basis. We could once again use the web-of-trust model to implement this. Individual devs could endorse a particular version of a package (perhaps by endorsing the checksum of the package). The strength of their endorsement will be related to the degree to which they have audited that particular version of the package. Obviously, the maintainer of the package will be able to sign off on the new version, and for centrally important packages, there is often a community of developers who track the changes of the package, and will be able to sign off as well.
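Endorsing a checksum rather than a version number could look roughly like the Python sketch below. The record layout and the numeric audit levels are assumptions for illustration, and the sketch deliberately leaves out the signatures a real system would need to prove who made each endorsement.

```python
import hashlib


def checksum(content: bytes) -> str:
    """SHA-256 digest of the released package content."""
    return hashlib.sha256(content).hexdigest()


def endorsement_strength(content, endorsements, trusted):
    """Highest audit level given by a trusted endorser to exactly this content.

    `endorsements` is a list of (endorser, digest, level) records; 0 means
    nobody I trust has endorsed this content.
    """
    digest = checksum(content)
    levels = [level for who, endorsed_digest, level in endorsements
              if endorsed_digest == digest and who in trusted]
    return max(levels, default=0)


# Hypothetical release and endorsement records (levels: 1 = skimmed, 3 = audited).
release = b"module Foo end"
tampered = b"module Foo; evil() end"
records = [("maintainer", checksum(release), 3), ("alice", checksum(release), 1)]

print(endorsement_strength(release, records, {"alice"}))
print(endorsement_strength(tampered, records, {"maintainer", "alice"}))
```

Because the endorsement is tied to the digest, any change to the content, however small, silently drops all existing endorsements.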

The big downside to this approach is one that is shared by all web-of-trust systems: it is a lot of work to maintain, and so people generally don’t do it. I think that this system has the potential to solve a lot of the trust problems with OSS, but I would love to hear your thoughts…


#17

As Stefan pointed out in Slack, if the core Julia devs wanted to backdoor users, they would have already. So, at a minimum, they should probably be trusted.

Where’s the evidence they haven’t done this? :laughing:

Though if I’m being honest, if the julia devs want access to my computer, they can have it… I suspect they’d end up making things better


#18

From doc

Last but not least, Pkg is designed to support federated package registries. This means that it allows multiple registries managed by different parties to interact seamlessly. In particular, this includes private registries which can live behind corporate firewalls. You can install and update your own packages from a private registry with exactly the same tools and workflows that you use to install and manage official Julia packages.

This could be interesting too:

Is there functionality which could satisfy what @foobar_lv2 is proposing?

What I am missing is something like Pkg.add("foobar", registry="curated"), but maybe it is already implemented somehow?


#19

FYI: There was a lot of discussion on slack. The general opinion is against the curated subset idea (many people think it is unworkable because the additional code-review workload would burn people out).

Cross-posting my most recent plea:

My viewpoint is not so much what features could have prevented the specifics of the npm backdoor disaster. It is more: What would the ecosystem need to look like to be resilient to such disasters? If almost unmaintained one-man-projects become transitive deps of significant parts of the ecosystem, then everything else is putting lipstick on a pig. Sure, npm’s distribution of minified files and lack of signatures is a dumpster fire that we won’t repeat, but it is not the root cause.

The root cause is that the path of least resistance for node dev X is to add left-pad to their deps, and packages downstream of X don’t complain. Once the ecosystem is at such a point, individual devs can’t really do anything about this: Es gibt kein richtiges Leben im falschen [Adorno, “Wrong life cannot be lived rightly”].

Julia’s current path encourages infrastructure packages (lots of downstream reverse deps) outside of julialang’s control. MacroTools, DataStructures, etc. Future excisions will increase that.

Now, I’m not saying that DataStructures.jl or CategoricalArrays.jl will go bad: They won’t. That is precisely my point of a curated subset: Make it easier for devs to know which packages are under de-facto quality control by the community, as opposed to one-man-projects. Make the path of least resistance healthy for the ecosystem. This is both by reducing resistance on the good path (discoverability of well-managed packages with good quality control) and by increasing resistance on the bad path (e.g.: pkg shows by default all uncurated transitive deps that will be installed, with a scary warning. This is super mild nudging!).

We win a lot if we establish a culture with formalized multiple tiers of package/version trustworthy-ness / blessed-ness. I don’t want to increase code-review or cause lots of extra work for anyone. I want to move to a system where everybody knows which packages are infrastructure and OK deps, and which ones are “expensive” deps. Maybe that is enough.

Maybe formal code-reviews for new releases of curated/infrastructure packages would be needed at some point; or maybe some web-of-trust / ownership-warnings / etc are needed in the future. But that is a future hypothetical step, and a curated registry is a necessary first step.


#20

I am participating in the Slack discussion, but thought I’d summarize my thoughts here since Slack has short-term memory :slight_smile:

My first thought…

This is a good discussion to have, but like most good discussions when it comes to Julia, I am sure the core devs have thought about it and the current state represents some deep reflection by people who know more than me so I chime in with humble spirit.

My second thought…

Related to the first, I am pretty sure that having a “Curated” repository was baked into the design of Pkg3. If I remember correctly (:older_man:) that was the name of the original repository. In any case, the capability is there to have an LTS curated repo for those doing work in regulated industries with higher security / audit standards and JuliaPro seems like a good step in that direction.

The challenge is, of course as @foobar_lv2 mentioned, that having a curated repo will require MORE work from already resource-constrained people. Another issue is, as @ChrisRackauckas points out, the maturity level of the ecosystem. A lot of major packages are still undergoing nearly daily bug fixes etc. That makes it difficult to maintain a stable curated repo when changes are still happening rapidly.

I am not so concerned about having a curated repo right now. It is needed and I know it will come when the time is right.

My third thought…

In light of the recent flatmap-stream issue, I think finding some way to protect Julia users from something like that happening is more pressing. If a similar thing happened here, and it could, that would be a major hit to reputation and confidence in Julia. If I were a major corporation evaluating a proposal to use Julia and, in the middle of my review, a backdoor was discovered in a popular package, I’d close the folder with a big red stamp: “Declined”.

So a lot of the Slack discussion has been around finding a more immediate solution to this reputation risk.

Some of the discussion has been about developing a “trust” framework somehow. Package maintainers would be assigned a trust score and we build a framework around that. I personally have an allergic reaction to a “Trust Score”. Haven’t you seen Black Mirror’s “Nosedive” or China’s “Social Credit”? :sweat_smile: I think the potential for unintended consequences is quite high with this idea.

Another option discussed with @StefanKarpinski is to have Pkg flag a warning when updating packages that meet certain specified criteria. Again, some of the discussion was around a “trust score”, but I had a different thought. I don’t trust “trust”.

When updating a package, Pkg can identify who merged the latest version. If that person does not appear in the merge history, i.e. the latest version is from a new “merger”, then Pkg issues a warning before automatically updating.

Then the challenge is to avoid massive and annoying warnings. Say a package gains a new developer with merge rights: left unchecked, their first merged version would trigger warnings for every user. To avoid this, a previously accepted “merger” can first merge an “Introducing developer X” version. As soon as this introductory merge is performed, all warnings about the new developer X stop, because X is now in the merge history.
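Here is a rough Python sketch of that check, including the introductory merge. The history format (a list of version, merger and an optional introduced developer) is invented for illustration; Pkg does not expose merge history like this. The sketch scans a whole history, whereas Pkg would only need to look at the latest version when updating.

```python
def versions_to_warn(merge_history):
    """Return the versions merged by someone not yet in the merge history.

    `merge_history` is ordered oldest to newest; each entry is
    (version, merger, introduces), where `introduces` is None or the name
    of a developer being vouched for by this merge.
    """
    known = set()
    flagged = []
    for index, (version, merger, introduces) in enumerate(merge_history):
        # The very first release has no history to compare against.
        was_known = index == 0 or merger in known
        if not was_known:
            flagged.append(version)
        known.add(merger)
        # Only an already accepted merger can introduce a newcomer.
        if introduces is not None and was_known:
            known.add(introduces)
    return flagged


# Hypothetical history: alice introduces bob; mallory shows up unannounced.
history = [
    ("0.1.0", "alice", None),
    ("0.2.0", "alice", "bob"),
    ("0.3.0", "bob", None),
    ("0.4.0", "mallory", None),
    ("0.5.0", "mallory", None),
]

print(versions_to_warn(history))
```

Note the consequence visible in the example: the warning fires once per new merger, so the second release by the unannounced merger passes silently.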

Summary:

Please no “trust” system. I don’t trust “trust”.

Let’s implement a simple check in Pkg to warn when the latest version of a package is merged by a new “merger”. This is simple and probably represents 70% of a full solution.