How bad is type piracy actually?

In the world of C, typedef/struct definitions mostly reside in header files and can be included at will, so a type can be made available for interfacing between libraries.
In the world of Julia, type definitions reside within module code, and reusing them in another module is type piracy.

So I wonder: what can I do to reuse C-API-like struct definitions across module boundaries? For some time I wondered whether a module that ONLY defines the type could be used, but even that is still type piracy.

Any good ideas?

Hunh? You’re free to reuse code across module boundaries. The thing you don’t want to do is drastically change behaviors based upon what modules are loaded:

using Doohickeys
# using Gizmos

The behavior of a Doohickey shouldn’t change based upon whether you’re using Gizmos or not. That’s all type piracy means.
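As a rough sketch of that failure mode, the snippet below fills in hypothetical bodies for `Doohickeys` and `Gizmos` (the `Doohickey` struct, its `size` field, and the `describe` function are made up for illustration and do not come from any real package). The modules are defined inline so the example is self-contained; with real packages the same effect would come from `using Gizmos`.

```julia
module Doohickeys
export Doohickey, describe

struct Doohickey
    size::Int
end

# Doohickeys owns both the function and the type.
describe(d::Doohickey) = "doohickey of size $(d.size)"
end

using .Doohickeys

d = Doohickey(3)
without_gizmos = describe(d)

module Gizmos
import ..Doohickeys: Doohickey, describe

# Gizmos owns neither `describe` nor `Doohickey`, yet replaces the method.
# Julia may print a "method overwritten" warning here.
describe(d::Doohickey) = "gizmo-flavored doohickey"
end

# The same call on the same value now behaves differently,
# purely because Gizmos has been loaded.
with_gizmos = describe(d)
```

The two results differ even though no code touching `d` changed, which is the situation to avoid.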

I see. What is the actual definition of type piracy?

“Type piracy” refers to the practice of extending or redefining methods in Base or other packages on types that you have not defined. In some cases, you can get away with type piracy with little ill effect. In extreme cases, however, you can even crash Julia (e.g. if your method extension or redefinition causes invalid input to be passed to a ccall). Type piracy can complicate reasoning about code, and may introduce incompatibilities that are hard to predict and diagnose.

https://docs.julialang.org/en/v1/manual/style-guide/#Avoid-type-piracy-1

Right, type piracy is defining function f(x::T) where you “own” neither f nor T. If f is your own function, or T is a type you defined, there’s no issue at all. But even if neither is true, it still might be OK. It’s just a smell that something bad might be happening, such as:

  • The author/designer of f really did not want it to support type T, so you’re misunderstanding what the function is supposed to mean.
  • The code is in the wrong place, and should be moved to where f or T is defined.

But if everything seems to be in order, pirate away.
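The ownership rule can be sketched with one hypothetical package, `MyPackage`, that depends on another hypothetical package, `Shapes` (all names here are invented for illustration). Only the last definition counts as piracy, and even that one may be harmless, since it fills in a comparison that would otherwise be a MethodError.

```julia
module Shapes
export Circle, area

struct Circle
    r::Float64
end

area(c::Circle) = pi * c.r^2
end

module MyPackage
import ..Shapes: Circle, area

struct Square
    side::Float64
end

# Owns the type (Square) but not the function (Shapes.area): fine.
area(s::Square) = s.side^2

# Owns the function (describe) but not the type (Circle): fine.
describe(c::Circle) = "circle of radius $(c.r)"

# Owns neither Base.isless nor Circle: this is type piracy.
Base.isless(a::Circle, b::Circle) = a.r < b.r
end

using .Shapes

square_area = area(MyPackage.Square(2.0))
description = MyPackage.describe(Circle(1.0))
ordered = Circle(1.0) < Circle(2.0)   # works only because MyPackage is loaded
```

If the author of `Shapes` later decides circles should compare differently, or not at all, the pirated `isless` is where the conflict would show up.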

Preferably in code that does not end up in packages though. In scripts/user code/private packages/projects it can be OK and sometimes necessary; but for code that is meant to be reused it should be avoided.

(I think I have asked the wrong question…) Can you clarify what “own” means?

My example would be:
Module WindowingSystem defines the type Twindow, plus functions like WindowingSystem.create(title) that return a Twindow.
Module DrawingSystem defines a function DrawIntoWindow(w::Twindow), importing WindowingSystem to get Twindow.

Occasionally there are some really good reasons to pirate. From Julia’s documentation:

Another example might be a package that acts as a thin wrapper for some C code, which another package might then pirate to implement a higher-level, Julia-friendly API.

The “FriendlyWrapper.jl” package would implement Base methods (which it doesn’t own) on “Wrapper.jl” types (which it doesn’t own). The person who wrote Wrapper.jl might want to keep the code “lean” and only support the API that s/he’s familiar with from C, which is that person’s prerogative. But someone who wants a more Julian interface should not be forced to re-wrap the base C library just to support a more comfortable interface for people coming from the Julia world.
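A minimal sketch of that arrangement, assuming a made-up `CBuffer` type and C-style accessor functions (`cbuffer_size`, `cbuffer_get`) standing in for whatever Wrapper.jl would really expose; a plain Julia vector plays the role of the C-side memory:

```julia
module Wrapper
# Hypothetical thin wrapper: exposes only the C-style API.
struct CBuffer
    data::Vector{UInt8}
end

cbuffer_size(b::CBuffer) = length(b.data)
cbuffer_get(b::CBuffer, i::Integer) = b.data[i]
end

module FriendlyWrapper
import ..Wrapper: CBuffer, cbuffer_size, cbuffer_get

# Pirated on purpose: neither Base.length/Base.getindex nor CBuffer
# belongs to this module. The methods only forward to Wrapper's own API.
Base.length(b::CBuffer) = cbuffer_size(b)
Base.getindex(b::CBuffer, i::Integer) = cbuffer_get(b, i)
end

buf = Wrapper.CBuffer(UInt8[10, 20, 30])
n = length(buf)     # Julian spelling of cbuffer_size(buf)
second = buf[2]     # Julian spelling of cbuffer_get(buf, 2)
```

Because the pirated methods add behavior that was previously missing rather than changing existing behavior, code that uses only `Wrapper` keeps working the same way.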

This comes up in practice. Many widget toolkits (e.g., Gtk.jl) are pretty huge on their own, but people also want to combine them with Observables.jl/Reactive.jl/whatever to build a nicer GUI programming workflow. It’s OK to write glue packages that bring together multiple components of the ecosystem and effectively create something new.

Ownership comes down to this: in which module did the definition first appear? For example, Base defines length, so all packages that define a new Base.length method are extending something, Base.length, that they don’t own. But that’s not piracy if you’re defining Base.length(x::T) for a specific T that is defined in the same module/package as the new length method.
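Applied to the WindowingSystem/DrawingSystem question, a sketch might look like the following. The struct field and function bodies are assumptions made up for illustration, and `Base.show` plays the role that `Base.length` plays in the explanation above. Neither module commits piracy.

```julia
module WindowingSystem
export Twindow, create

struct Twindow
    title::String
end

create(title::AbstractString) = Twindow(title)

# Extends Base.show (not owned) for Twindow (owned here): not piracy.
Base.show(io::IO, w::Twindow) = print(io, "Twindow(", repr(w.title), ")")
end

module DrawingSystem
import ..WindowingSystem: Twindow

# DrawIntoWindow first appears in this module, so DrawingSystem owns it.
# Using someone else's type as an argument is ordinary reuse: not piracy.
DrawIntoWindow(w::Twindow) = "drawing into " * w.title
end

using .WindowingSystem

w = create("main")
shown = sprint(show, w)
drawn = DrawingSystem.DrawIntoWindow(w)
```

Piracy would only enter the picture if DrawingSystem added a method to a function it does not own (say, `WindowingSystem.create` or `Base.show`) with only types it does not own in the signature.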

FWIW, the fact that standard libraries in Julia do type piracy (e.g. run rand(3,3) * rand(3,3) in the REPL without loading any standard libraries) means that we cannot safely “filter out” unused standard libraries when using PackageCompiler.jl to create “apps”. This leads to a bloated size of the resulting app. For example, we can’t safely filter out LinearAlgebra, which means we can’t filter out the OpenBLAS library either, and that library is huge (60 MB).

I wonder if I should at least collapse VectorizationBase.jl and SIMDPirates.jl into a single library, since it is unlikely anyone would want to use VectorizationBase.jl on its own.
To avoid piracy, SLEEFPirates.jl should also be wrapped into the same module, but AccurateArithmetic.jl currently depends on SIMDPirates.jl and not on SLEEFPirates.jl, so rolling them together would make that dependency a bit heavier.
It would also make it more difficult to swap special function implementations, which I would likely still want to do.

Another reason I have them spread across libraries is that my *Pirates.jl libraries started as forks, and I don’t want to obscure the original author’s contributions. One of the motivations for forking was to be able to define new methods without pirating, at least when defining “own” more loosely as a “blessed” set of packages allowed to extend types defined in VectorizationBase.jl. E.g., I won’t change the behavior of anyone’s existing code using SIMD.jl.

These things can happen, but in the ideal case they imply some level of implicit cooperation between Wrapper.jl and FriendlyWrapper.jl. Otherwise Wrapper.jl could add methods for the same signatures as an enhancement, without a breaking API change, and those would collide with the pirated ones. If this cooperation cannot be ensured and the code is used in production, it would be tempting to just fork Wrapper.jl as WrapperDoneRight.jl.

Sticking to not doing type piracy has the great advantage of not requiring any kind of coordination, as long as SemVer is observed, which scales much better.

Various plausible scenarios exist where type piracy makes sense; my main use case for it is hotfixing issues in semi-abandoned packages. It’s not that one should never do it. After all, the language does not forbid it explicitly. One should just always recognize the implicit cost. Practically, this cost is high enough to discourage casual use.

To me there’s a huge difference between:

f(::AbstractArray) = ...    # original package

pirated by:

f(::Ref) = ...

which just implements something that would be a MethodError otherwise, vs.

f(::Vector{Int}) = 10

which creates painful-to-track bugs that depend on which package is loaded.
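The contrast can be made concrete with a small sketch. The module names `Original`, `GapFiller`, and `Overrider` and the body of the original `f` are assumptions for illustration; only the method signatures come from the lines above.

```julia
module Original
export f
f(x::AbstractArray) = length(x)   # hypothetical body for the original method
end

using .Original

v = [1, 2, 3]
before = f(v)                     # dispatches to f(::AbstractArray)

# Before any piracy, calling f on a Ref is simply a MethodError.
ref_was_error = try
    f(Ref(1))
    false
catch err
    err isa MethodError
end

module GapFiller
import ..Original: f
# Pirated, but only fills a gap: nothing that worked before changes.
f(r::Ref) = 1
end

module Overrider
import ..Original: f
# Pirated and more specific than the original method:
# silently changes results for existing callers.
f(::Vector{Int}) = 10
end

ref_now = f(Ref(1))               # newly supported
after = f(v)                      # same call as `before`, different answer
```

Loading `GapFiller` leaves `f(v)` untouched, while loading `Overrider` changes it from 3 to 10 without any error or warning, which is what makes the second kind of piracy hard to track down.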