Package licenses: Contemplations and considerations

I’ve recently been giving more thought to the way I license my packages. While I don’t anticipate the choice becoming practically significant any time soon, I’ve realised that the topic probably deserves more consideration than I’ve previously given it.

Why we need to think about licenses

While many of us want to “just do (F)OSS” without paying much attention to licenses, they are an unavoidable part of the landscape due to the way intellectual property law has developed over the last several decades.

I’ve recently been perusing Lawrence Rosen’s Open Source Licensing: Software Freedom and Intellectual Property Law, which has been a useful reference for clarifying my understanding of several thorny nuances. As you might have guessed, Lawrence Rosen is not a layman but an adjunct professor of law at Columbia University who also provides legal advice to the OSI.

When I release a package of mine to the public, I hope, as I think many of us do, that it will be of as broad benefit to the community as possible, and that the community will help improve it if it is indeed found useful.

The OSI gives us a wide spectrum of licenses to choose from, striking various balances of rights and responsibilities, written by people with varying amounts of legal experience.

The MIT license

The MIT license is predominant in the Julia package ecosystem and is arguably the cheerleader for ‘permissive’ licenses. It’s famously short; in fact, it’s so concise that I can easily include the “here you go” clause here for reference:

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

[warranty disclaimer]

Isn’t that short and sweet! As I’ve mulled this over, I’ve realised there are two ways in which the MIT license falls short.

Limitations of the MIT license

Firstly, it’s too succinct. I think this is largely a consequence of the time it was written (late 1980s). For instance, it only explicitly mentions how users can interact with the implementation shared, not the methods. Software IP developed significantly in courts after the MIT license was written, and patents started being applied to the methods/processes of software. More recent licenses such as the Apache v2 license, GPL v3, MPL v2, and EPL v2 contain explicit provisions dealing with both copyright and patent rights. To quote Rosen on the MIT license:

This improves on the BSD license by specifically mentioning all of the exclusive rights under copyright law and almost all of the exclusive rights under patent law (e.g., “make” is omitted, but that is probably unnecessary given the other verbs in that sentence). No longer are we limited by the BSD’s reference to redistribution and use. On the other hand, the new phrase deal in the software has no precise legal meaning. In light of the longer list of rights in the MIT license grant, it appears not to limit copyright or patent rights in any way. Like the BSD license that preceded it, the scope of the patent grant in the MIT license is implicit rather than explicit. This means that a licensee cannot be sure that the implied patent rights granted by MIT are broad enough to cover derivative works.

Secondly, much of the way we hope people will use our packages and share improvements is based solely on assumptions of goodwill. While nice, hope is not a strategy. Some licenses require that modifications to software that are then distributed must also be shared back. These are the “copyleft” licenses.

Copyleft and its challenges

Copyleft licenses also introduce another key property: being viral. If GPL-v3 licensed code is incorporated into a larger work, the larger work as a whole must satisfy the requirements of the GPL-v3 license. When people and companies complain about copyleft licenses being impractical and making software hard to adopt, it is this provision that’s responsible. Some people believe the entire software ecosystem should be copyleft and viral, but I’ll leave that debate aside.

Hybrid or “Weakly Copyleft” Licenses

This brings me to an interesting category of licenses that I’ve seen termed “hybrid” and “weakly copyleft”. The three most well-known such licenses are:

  • Mozilla Public License (MPL),
  • European Union Public License (EUPL), and
  • GNU Lesser General Public License (LGPL)

The LGPL is a little messy in two respects:

  • It contains imprecise legal terms (from the analysis I’ve seen by Rosen)
  • It doesn’t precisely define the extent of the license with respect to a modified work

Here’s a relevant extract from Rosen’s writings:

These sections of the LGPL are an impenetrable maze of technological babble. They should not be in a general-purpose software license. The LGPL even concedes that “the threshold for this to be true is not precisely defined by law.” (LGPL section 5.) A licensee under these provisions won’t have a clue how extensive his or her good faith efforts must be when creating a derivative work in accordance with sections 2(d) and 5 of the LGPL.

From my reading, both the EUPL and the MPL are simply better licenses: they are less ambiguous, have more thorough definition sections, and, like the Apache license, address patent rights. To quote Rosen once again:

The MPL is a serious license. I will direct much less criticism to the structure and terms of the MPL in this book than to the other licenses I’ve already written about, because the MPL is a high-quality, professional legal accomplishment in a commercial setting

The MPL v2 makes no requirements on how MPL-licensed work is used: it can be incorporated into larger works and used with any mix of licenses, including in proprietary/commercial settings. In this way it is much like the MIT and other “permissive” licenses. The MPL, however, requires that direct improvements to the licensed work be shared (reciprocity), and that when distributing a larger work that includes an MPL component, notice of the MPL component be distributed too (credit).

I essentially see this as a higher quality version of the MIT license, with mild extra stipulations that community-minded individuals will already be following (sharing direct improvements back, letting people know when you use it), and no restriction to its use in wider/commercial projects.

The EUPL seems very similar to the MPL, but is fantastically multi-lingual (it’s in 23 languages, where all have equal value), more widely interoperable, and covers SaaS usage too.

I find this mix of characteristics quite attractive. I think the MPL and EUPL both do a meaningfully better job expressing the spirit in which I share my work, while being better written than back-of-napkin licenses like MIT.

What are your thoughts on licensing choices? Have you looked at the MPL and/or EUPL? Is this a topic you’ve devoted much attention to?

I’d be interested in hearing other people’s thoughts and experiences.




I believe that for many of us coming from a scientific background, what really matters is that we can use the software freely without legal restrictions, whatever that means. Because we have neither the time nor the bandwidth to dive into legal terms, we just want to adopt the most widely adopted FOSS license out there, which is MIT in the case of Julia and Python.

Choosing this “default” license has benefits:

  1. Approval of FOSS software in industry is more straightforward
  2. Users don’t have to think twice before adding a dependency

If you are really concerned with the way people use your software, maybe you need to consider closing the source? Legal terms don’t inhibit people from using the software in ways you don’t like, unfortunately.


Yes, yes, and a resounding yes! I’m personally inclined to use EUPLv2 everywhere now. Have some helpful links added to your collection:

I’m not particularly concerned with how people use my software, but I am concerned with how people treat others wanting to use that software, including derivatives. I share my software freely and openly because I want others to use it, and I want others that use my software to continue doing that with their modifications of my software, so that more people continue to benefit from it. MIT just doesn’t allow that kind of sharing, it’s more a “I don’t care at all what happens” licence.

Closing the software down entirely would, ironically, prevent people from using it entirely! I just don’t have the capacity (or ability, or even opportunity) to create a business out of arbitrary code. Not even mentioning that it’s generally not possible to make a business out of every software project.


Thanks for clarifying. Can you share an example? :slight_smile: What treatment exactly? What is a practical situation where MIT fails in your view?

MIT doesn’t govern at all how modifications should be shared, so some third party/company is free to just take your code and relicense it to sell it, with or without their modifications applied. That’s just a consequence of the license:

to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software

See also licensing - Relicensing an MIT licensed project under the GPL that has non code contributions from others - Open Source Stack Exchange.

Or consider cases where you’ve made some nifty new algorithmic discovery - MIT doesn’t prevent a third party from swooping in and claiming a patent on that discovery, collecting royalties in the process (of course, the patent office in theory checks for prior work, but in practice…). See also the commentary of the FSF on it (in spite of their questionable stance on copyleft):

For substantial programs it is better to use the Apache 2.0 license since it blocks patent treachery.

No, under these considerations, I cannot in good conscience say that the MIT license guarantees that what used to be freely accessible also stays freely accessible.


Thanks @Sukera for the additional clarifications.

I can see how this can be a concern for people, especially if the package is an “app”, or can be seen as a final product.

Most packages I develop are libraries, and so I really want to see people building products out of them.

In my experience this is not a realistic scenario in the sense that the company/institution wouldn’t allow opening the source code before the submission of the patent.

I think the key point here is that if there’s sufficient novelty for aspects of the work to be patentable, regardless of who might own a patent, it’s not clear that releasing code under the MIT license actually allows people to use the work without exposure to patent liability.

And nothing stops them from doing that with MPL or EUPL :slight_smile: It’s just that they can’t hide that fact if they’re building e.g. a SaaS. With MIT, they can, because accessing a service over the web is not the same as receiving a copy of the software. There isn’t even any obligation to inform people that someone has built something with your code, so you’re even missing out on the credit for your code.

Licensing doesn’t only concern companies or institutions though. As a private developer, I certainly can’t be bothered to check whether my code would be viable for a patent. I simply can’t afford to pay patent lawyers to look over every little thing, nor do I want to.

Not to oversimplify, but it really does seem to me that the MPL and EUPL can be roughly described as: MIT + more robust legalese + reciprocity + credit, with no restrictions on use.


So what is the difference in effect between a license explicitly forbidding patenting the algorithm that you invented and published in your package, and a liberal license like MIT? In both cases the only thing of relevance would be prior art, as I see it. If the patent office doesn’t find your prior art, they would issue the patent, whatever the license of the package, which they are not aware of anyway. Then you can come and claim your prior art, pay money and win the case :wink:

The MIT license just doesn’t explicitly address the patent side of IP at all.

In this regard, what licenses like Apache/MPL/EUPL do is provide extra guarantees to contributors and users. More specifically, they include explicit clauses that grant users a license to any patents held by contributors. This means contributors cannot later claim patent infringement on the use of their contributions, and also reduces the risk of third parties patenting the software and restricting its use. This protects users and contributors from patent litigation over the licensed software.

Since the MIT license leaves this ambiguous, you can’t clearly point to a part of the license and instead would have to rely on certain interpretations and proving prior art in court: a much more complicated and costly process.

Do I anticipate this coming up much (particularly with my own packages)? Not at all! But why not opt for a license that just doesn’t have this ambiguity/hole in the first place?


It’s actually also a bit worse than that - since it’s left ambiguous, it’s much harder to sue for damages on top of invalidating the patent. Invalidating a granted patent because of prior art is one thing, but suing for damages on top? An entirely different beast. If it’s explicitly forbidden, it’s much easier to do that.

Not to mention that both of those lawsuits are very likely to be super expensive, so reducing the number of legal arguments you have to make in such a case is, to me at least, obviously beneficial.


On the foibles of the MIT license, I’ve just come across another lawyer writing on it that may be of interest to some folks: https://writing.kemitchell.com/2016/09/21/MIT-License-Line-by-Line.html.

The conclusion is it’s mostly fine and has generally served well, but is “by no means a panacea for all software IP ills”.

Here's a critique within the piece relevant to the current discussion:

Grant Scope

to deal in the Software without restriction,

From the licensee’s point of view, these are the seven most important words in The MIT License. The key legal concerns are getting sued for copyright infringement and getting sued for patent infringement. Neither copyright law nor patent law uses “to deal in” as a term of art; it has no specific meaning in court. As a result, any court deciding a dispute between a licensor and a licensee would ask what the parties meant and understood by this language. What the court will see is that the language is intentionally broad and open-ended. It gives licensees a strong argument against any claim by a licensor that they didn’t give permission for the licensee to do that specific thing with the software, even if the thought clearly didn’t occur to either side when the license was given.

including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so,

No piece of legal writing is perfect, “fully settled in meaning”, or unmistakably clear. Beware anyone who pretends otherwise. This is the least perfect part of The MIT License. There are three main issues:

First, “including without limitation” is a legal antipattern. It crops up in any number of flavors:

  • “including, without limitation”
  • “including, without limiting the generality of the foregoing”
  • “including, but not limited to”
  • many, many pointless variations

All of these share a common purpose, and they all fail to achieve it reliably. Fundamentally, drafters who use them try to have their cake and eat it, too. In The MIT License, that means introducing specific examples of “dealing in the Software”—“use, copy, modify” and so on—without implying that licensee action has to be something like the examples given to count as “dealing in”. The trouble is that, if you end up needing a court to review and interpret the terms of a license, the court will see its job as finding out what those fighting meant by the language. If the court needs to decide what “deal in” means, it cannot “unsee” the examples, even if you tell it to. I’d argue that “deal in the Software without restriction” alone would be better for licensees. Also shorter.

Second, the verbs given as examples of “deal in” are a hodgepodge. Some have specific meanings under copyright or patent law, others almost do or just plain don’t:

  • “use” appears in United States Code title 35, section 271(a), the patent law’s list of what patent owners can sue others for doing without permission.
  • “copy” appears in United States Code title 17, section 106, the copyright law’s list of what copyright owners can sue others for doing without permission.
  • “modify” doesn’t appear in either copyright or patent statute. It is probably closest to “prepare derivative works” under the copyright statute, but may also implicate improving or otherwise derivative inventions.
  • “merge” doesn’t appear in either copyright or patent statute. “Merger” has a specific meaning in copyright, but that’s clearly not what’s intended here. Rather, a court would probably read “merge” according to its meaning in industry, as in “to merge code”.
  • “publish” doesn’t appear in either copyright or patent statute. Since “the Software” is what’s being published, it probably hews closest to “distribute” under the copyright statute. That statute also covers rights to perform and display works “publicly”, but those rights apply only to specific kinds of copyrighted work, like plays, sound recordings, and motion pictures.
  • “distribute” appears in the copyright statute.
  • “sublicense” is a general term of intellectual property law. The right to sublicense means the right to give others licenses of their own, to do some or all of what you have permission to do. The MIT License’s right to sublicense is actually somewhat unusual in open-source licenses generally. The norm is what Heather Meeker calls a “direct licensing” approach, where everyone who gets a copy of the software and its license terms gets a license direct from the owner. Anyone who might get a sublicense under the MIT License will probably end up with a copy of the license telling them they have a direct license, too.
  • “sell copies of” is a mongrel. It is close to “offer to sell” and “sell” in the patent statute, but refers to “copies”, a copyright concept. On the copyright side, it seems close to “distribute”, but the copyright statute makes no mention of sales.
  • “permit persons to whom the Software is furnished to do so” seems redundant of “sublicense”. It’s also unnecessary to the extent folks who get copies also get a direct license.

Lastly, as a result of this mishmash of legal, industry, general-intellectual-property, and general-use terms, it isn’t clear whether The MIT License includes a patent license. The general language “deal in” and some of the example verbs, especially “use”, point toward a patent license, albeit a very unclear one. The fact that the license comes from the copyright holder, who may or may not have patent rights in inventions in the software, as well as most of the example verbs and the definition of “the Software” itself, all point strongly toward a copyright license. More recent permissive open-source licenses, like Apache 2.0, address copyright, patent, and even trademark separately and specifically.


I’ve probably thought more about open source licensing than most people. You can certainly try non-MIT licenses if you want, but there are dangers to engaging in license innovations. Everybody knows what the industry consensus is on how MIT works (which may or may not match what the license says exactly, but there is some legal relevance to the common understanding also) and that makes it easy for people to use, contribute to, etc. The key thing I would say is that you have to be clear in your mind on what you want to achieve by the licensing choice, and you have to be sure that the (significant) downsides of going against the bulk of the ecosystem will be worth it.

In particular, you should consider that non-MIT licenses will prevent the code in your package from being incorporated elsewhere in the Julia ecosystem. Now you may say “if that needs to happen, I’ll just relicense it”, but with the standard distributed copyright system, you don’t have that option if you’ve taken contributions. To fix this, you’ll likely need some sort of CLA or additional side agreement that permits such relicensing, but now you’re engaging in full-on license innovation.

Mind you, I’m not saying license innovation is bad. But I am saying that it will cause pain, and you better be sure that the pain is worth it. For example, when I released Cedar, I made a very conscious licensing choice not to license the whole part MIT (though most of it is MIT dual-licensed for precisely the re-use reason). However, I spent many $10k on expensive lawyers to make sure to get this reasonable - it’s easy to screw up, but I thought it was worth it in that case. We’re also very carefully tracking contributor IP rights in case re-licensing becomes necessary. I’ve never thought it to be worth it in any other case, and I’ve written a lot of software.


Thanks for those insights Keno!

I presume you mean socially from people not wanting to read a license they’re not already familiar with? Or are you just talking about viral licenses?

It’s worth noting that GitHub does have T&C’s that affect this:

Additionally, unless there is a Contributor License Agreement to the contrary, whenever you make a contribution to a repository containing notice of a license, you license your contribution under the same terms, and agree that you have the right to license your contribution under those terms.

So there’s no need for license innovation to permit relicensing :slightly_smiling_face:


Thank you for this write-up. I appreciate your overview of licenses, it’s very informative.

I’ve been grappling with the question of licensing recently as I’ve prepared to open-source a project I’ve been working on for a while. The project is part of a larger initiative, a search system, and while I intend to open-source much of that system there are some pieces for which an open-source license is inappropriate. Aside from those pieces, however, an open-source license is not just a nice sentiment, a pleasant thought; it is critical to the operation of the system. GPL’s sentiment is compelling, but gets sticky when modifications are upstreamed into a project which has proprietary components.

This circumstance comes with a couple requirements for an appropriate license:

  • A copyleft condition for modifications
  • Permission to use in a larger work without requiring the disclosure of the larger work’s source

I was interested in AGPL because of its specification of “network use” as distribution, but it gets stuck in the same copyright mess as GPL when it’s used within a larger system. I’ve been considering LGPL, but haven’t been quite comfortable with it. Your Rosen quote about LGPL makes me feel a little less crazy, I felt like half the license was effectively irrelevant to ownership, copyright, etc.

MPL is on my radar, but I haven’t taken a serious look. Your post makes an interesting case and I’ll definitely dig into it some more. And I don’t know if I’ve even encountered EUPL, I’m also intrigued.

Beyond the project I just mentioned, there was a very brief period when I considered licensing VimBindings.jl as LGPL solely to ensure that if a cloud provider came along with some “web terminal” that used the package they’d have to share improvements. Then I got a headache thinking about it and just went with MIT.

Needless to say it’s a subject I’ve given some attention to. Not exactly an urgent issue, but it’s not nothing, especially for work that toes the line between proprietary and open.


No, I mean in the sense of “this package has become ubiquitous, let’s put it in Base/stdlib” or “oh, turns out these three packages all have the same basic functionality, let’s create an MIT-licensed base package that makes them interoperable”.

That’s not the issue. The issue is if you take contributions to a (for argument’s sake) GPL-licensed package, but later (for one of the reasons above or another one), you would like to relicense to MIT, this is not possible without additional terms governing your right to do that. The GitHub T&C clause just means that the license clause in the parent repository applies. In this case it would mean that if I contribute to your GPL-licensed project, I also license my contribution as GPL. However, that package is then functionally always GPL, because you cannot relicense it without my permission and I may be off on a 3 year long spiritual journey in Bali. FWIW, the FSF knows this, which is why they always require copyright assignment in all of their GPL projects. Using GPL without that is a bit of a more recent innovation by others who use the GPL. Some of those people fully understood those implications and saw it as a feature, but a lot of people did not.

I just used the GPL as an example here, so I could make the FSF point ;). Same consideration applies to other licenses. But my overall point here is not even that, it’s that this stuff is hard to get right, so you better be sure.


That is trivially disprovable; section 6 of the EUPLv2 (I quote in full):

  6. Chain of Authorship

The original Licensor warrants that the copyright in the Original Work granted hereunder is owned by him/her or licensed to him/her and that he/she has the power and authority to grant the Licence.

Each Contributor warrants that the copyright in the modifications he/she brings to the Work are owned by him/her or licensed to him/her and that he/she has the power and authority to grant the Licence.

Each time You accept the Licence, the original Licensor and subsequent Contributors grant You a licence to their contributions to the Work, under the terms of this Licence.

That’s right, the EUPL has a CLA built in!

I guess the other point I’m making is: If you engage in license innovation (or possibly even if you don’t), have a clear mechanism for fixing things later, because it’s likely to go wrong the first time. That said, I will note that even that comes with challenges, because then the question becomes “who gets to decide” and open source developers often have a pretty allergic reaction to non-meritocratic control arrangements. You can probably do it for closely-developed packages where you expect to do the majority of the work and take the occasional contribution, but for community-developed packages it can cause friction.


Yes, relatively common for newer licenses, but that’s the same as the GitHub T&C - it does not permit later relicensing should that become necessary. Also, the issue of whether you can actually do that is undecided and the lawyers are split on it (same as the MIT patent issue).
