Type alias and type piracy

Hi
If I declare an alias such as const MyType = YourType{2} and I implement, let's say, Base.show(io::IO, x::MyType, ...), I suppose it is type piracy.
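To make the premise concrete, here is a minimal sketch with stand-in names (YourPkg, YourType and describe are hypothetical, not from any real package). It shows that an alias is the very same type object, so a method written against the alias lands in the other package's method table for the other package's type:

```julia
# Stand-in for somebody else's package (hypothetical names).
module YourPkg
struct YourType{N} end
describe(::YourType) = "generic"
end

# An alias does not create a new type, only a new name for the same one.
const MyType = YourPkg.YourType{2}
same_type = MyType === YourPkg.YourType{2}

# This extends YourPkg's function for YourPkg's type: that is type piracy,
# even though the signature only mentions "MyType".
YourPkg.describe(::MyType) = "special"

a = YourPkg.describe(YourPkg.YourType{2}())  # now hits the pirated method
b = YourPkg.describe(YourPkg.YourType{3}())  # other parameters are unaffected
```

Every user of YourPkg who loads this code sees the changed behaviour for YourType{2}, whether or not they know about MyType.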

However, when composing different packages it is tempting to commit such piracy. Here is an example:
I want to perform operations on arrays of measurements from the package Measurements.jl

julia>  using Measurements

julia> N  = 1_000_000
1000000

julia> m = measurement.(randn(N,5), rand(N,5));

julia> m[1]
-0.032 ± 0.9

julia> weightedmean(m)
-0.413120726 ± 9.6e-8

Each element of m contains a value and its standard deviation (and other things needed to propagate correlations), and I can compute the weighted mean of m using weightedmean(m).

However, many operations on such measurements, such as weightedmean, involve the precision (the inverse of the variance) rather than the standard deviation. For better performance (among other advantages), I prefer to bypass the sqrt(inv( )) by storing the precision directly, using the very convenient ZippedArrays package to define the alias WeightedArray{T,N}, which is an AbstractArray{Measurement{T},N}:

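using Measurements, ZippedArrays

# Alias: a ZippedArray of two arrays (values, precisions) whose elements are Measurement{T}
const WeightedArray{T,N} = ZippedArray{Measurement{T},N,2,I,Tuple{A,B}} where {A<:AbstractArray{T,N},B<:AbstractArray{T,N},I}

# Build an element from the zipped (value, precision) pair, converting precision to a std
ZippedArrays.build(::Type{Measurement{T}}, (val, weight)::Tuple) where {T} = measurement(val, sqrt(inv(weight)))

# Accessors for the two underlying arrays
get_data(x::WeightedArray) = x.args[1]
get_precision(x::WeightedArray) = x.args[2]

# Weighted mean computed directly from the stored precisions (no sqrt/inv round trip)
function Measurements.weightedmean(A::WeightedArray)
    precision = get_precision(A)
    data = get_data(A)
    Σprecision = sum(precision)

    measurement(mapreduce(x -> x[1] * x[2] / Σprecision, +, zip(data, precision)), sqrt(1 / Σprecision))
end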

Coded like this it is much more performant:

julia> M = WeightedArray(m);

julia> M isa AbstractArray{<:Measurement}
true

julia> m isa AbstractArray{<:Measurement}
true

julia> @btime weightedmean(m)
  28.816 ms (8 allocations: 76.30 MiB)
-0.413120726 ± 9.6e-8

julia> @btime weightedmean(M)
  10.765 ms (2 allocations: 96 bytes)
-0.413120726 ± 9.6e-8

The weightedmean of Measurements.jl could be more efficient (and I have already proposed a PR for that), but the issue is more general.
Coded like this, function Measurements.weightedmean(A::WeightedArray) is type piracy, as I own neither weightedmean nor the ZippedArray type that WeightedArray aliases.

My question is: what is the best way to avoid such type piracy without re-coding most of the methods of one package or the other?

If you want to share this code without waiting for a contribution to the original package, put the implementation in a separate function in a separate package. If it has the same name, users can just import it by name and use it wherever they would use the original function. Otherwise you can do just about whatever you want on your own.

My problem is a bit more general: I like to compose packages (it's a really nice feature of Julia), and in trying to be as generic as possible I sometimes run into such type piracy issues (in that case I also overload methods from other packages).

If I understand correctly, you propose keeping the same name for the function without importing the original one, and calling it as MyPackage.weightedmean(...) (without exporting it), maybe flagging it with public.
Is that what you mean?

What @Benny is implicitly suggesting, by “waiting for a contribution to the original package”, is to make a PR instead of doing type piracy.

I wasn’t making that suggestion because FerreolS already knew to make PRs: implement weightedmean(iterable; dims=:) by FerreolS · Pull Request #182 · JuliaPhysics/Measurements.jl · GitHub.

The problem with type piracy isn’t the direct fact that a method is being replaced. After all, when you do it on your own and take care not to cause any serious issues, things can work fine. The issue is that when pirates share code with other people, those people have to deal with the chaos of at least 2 versions of method tables failing to coexist and the associated precompilation errors.

If you need people to use your weightedmean implementation right now, the best way is a separate module, function, and method. The intended usage is really up to you, but personally I’d like to just write using PendingPRPackage: weightedmean. Unless I’m doing anything crazy, it’d just seamlessly replace weightedmean calls even when I imported other names from Measurements. You can usually subsume the original implementation, e.g. by making a fallback method for PendingPRPackage.weightedmean that forwards to Measurements.weightedmean; if not, I’d resort to module-qualified calls. It’s rough, but it’s a lot better than type piracy. It doesn’t have to be a package, either; if the code is as small as that linked PR’s, then I’d include a file with a small module until the PR pans out.
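A minimal sketch of that layout, runnable without any packages: the module Upstream stands in for Measurements, and PrecisionData is a hypothetical owned type, so every name here except weightedmean and PendingPRPackage is an assumption made for illustration. The point is that PendingPRPackage.weightedmean is a new function with its own method table, and its fallback forwards to the upstream one:

```julia
# Stand-in for the upstream package (plays the role of Measurements here).
module Upstream
export weightedmean
# Inverse-variance weighted mean of (value, std) pairs; returns (mean, std).
function weightedmean(obs)
    w = [inv(s^2) for (_, s) in obs]
    v = [x for (x, _) in obs]
    return (sum(v .* w) / sum(w), sqrt(inv(sum(w))))
end
end

# Hypothetical package holding the not-yet-merged implementation.
module PendingPRPackage
import ..Upstream   # brings in the module name only, not its weightedmean

# A type this package owns: values stored next to their precisions (1/σ²).
struct PrecisionData{T}
    values::Vector{T}
    precisions::Vector{T}
end

# A new function owned by this package, with a specialised method...
function weightedmean(d::PrecisionData)
    Σw = sum(d.precisions)
    return (sum(d.values .* d.precisions) / Σw, sqrt(inv(Σw)))
end

# ...and a fallback that forwards everything else to the upstream function.
weightedmean(args...; kwargs...) = Upstream.weightedmean(args...; kwargs...)
end

using .PendingPRPackage: weightedmean  # this one name covers both cases

obs = [(1.0, 0.5), (2.0, 0.5)]
d = PendingPRPackage.PrecisionData([1.0, 2.0], [4.0, 4.0])
r_fallback = weightedmean(obs)  # forwarded to Upstream.weightedmean
r_special = weightedmean(d)     # handled by the specialised method
```

Nothing in Upstream's method table is touched, so loading this module cannot change the behaviour of code that calls Upstream.weightedmean directly.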


Thanks.
I will not make a PR for that, as it would add a dependency on ZippedArrays.jl, and as I may overload other methods from other packages in a similar manner.

So I will not import Measurements.weightedmean; instead I will declare a public function weightedmean in my package, with a fallback weightedmean(x...) = Measurements.weightedmean(x...).

You don’t need to make it a full dependency if you put the additional code into a package extension. Then Julia just loads the code when ZippedArrays.jl happens to be loaded as well.


I’m not sure my problem is completely solved, as I have another piracy issue here:

Base.view(A::WeightedArray, I...) = WeightedArray(view(get_data(A), I...), view(get_precision(A), I...))

As usual, I own neither view nor ZippedArray. The solution MyPkg.view() seems a bit painful if I have to overload many Base methods.

I’m thinking about recoding the ZippedArrays.jl functionality so that I really own WeightedData.
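A rough sketch of what owning the array type could look like, under stated assumptions: the element type Weighted{T} is a stand-in for Measurement{T} so that the code runs without any packages, and the field names are hypothetical. Because WeightedData is defined here, extending Base.size, Base.getindex and Base.view for it is not piracy:

```julia
# Element type standing in for Measurement{T} (assumption, keeps the sketch package-free).
struct Weighted{T}
    val::T
    precision::T   # inverse variance
end

# An array type owned by this code; values and precisions are stored separately.
struct WeightedData{T,N,A<:AbstractArray{T,N},B<:AbstractArray{T,N}} <: AbstractArray{Weighted{T},N}
    data::A
    precision::B
    function WeightedData(data::A, precision::B) where {T,N,A<:AbstractArray{T,N},B<:AbstractArray{T,N}}
        axes(data) == axes(precision) ||
            throw(DimensionMismatch("data and precision must have the same axes"))
        return new{T,N,A,B}(data, precision)
    end
end

get_data(x::WeightedData) = x.data
get_precision(x::WeightedData) = x.precision

# Minimal read-only AbstractArray interface for the owned type.
Base.size(x::WeightedData) = size(x.data)
Base.getindex(x::WeightedData, I::Int...) = Weighted(x.data[I...], x.precision[I...])

# A view of a WeightedData is a WeightedData backed by views of both parents.
Base.view(x::WeightedData, I...) =
    WeightedData(view(get_data(x), I...), view(get_precision(x), I...))

W = WeightedData([1.0 2.0; 3.0 4.0], fill(4.0, 2, 2))
V = view(W, :, 2)   # second column, no copy of the underlying arrays
first_elem = V[1]
```

The cost of this route is re-implementing whatever parts of the array interface (setindex!, similar, broadcasting) the package actually needs, which the zipped-array package would otherwise provide.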

My personal opinion is that minor type piracy can often be fine.
For example, when some method throws an error and you can be reasonably sure that neither the upstream package nor another package will add it. In this case, piracy will not change any existing behavior, just turn an error into working code.
It can also be fine if your package is aimed mostly at being imported by the end user, and not at being a deep dependency of something else.

But if the method already does something before your overload, be very cautious, piracy can really be dangerous in that case!


Btw:

  • Curious what ZippedArrays.jl does that StructArrays.jl doesn’t? They seem similar in scope, while the latter is orders of magnitude more popular and well-maintained. Maybe it already has the view functionality you need, for example.
  • From your example, it seems like you don’t need correlations between measurements. Measurements.jl has huge overhead in this scenario (not for weightedmean though, because weightedmean already ignores correlations), and you may want to consider solutions that don’t track correlations, e.g. Uncertain.jl:

julia> M = Measurements.measurement.(rand(10^3), 0.1*rand(10^3))
julia> U = Uncertain.Value.(rand(10^3), 0.1*rand(10^3))

julia> @btime mean($M)
  303.099 ms (501500 allocations: 22.96 MiB)
julia> @btime mean($U)
  19.917 μs (0 allocations: 0 bytes)

The main reason is that @emmt, who maintains ZippedArrays.jl, has his office next to mine, and as he often does real-time computing, his packages are usually very efficient. But I will have a look at StructArrays.jl, although I suppose I will run into the same piracy issues.

Indeed, I don’t need correlations for now, and I also considered your package to replace the same kind of structure I had already coded in my package. In fact, I borrowed your use of mapreduce for my PR on weightedmean in Measurements.jl. But as it is quite new, I was not sure it was mature enough (not much documentation), and writing ±ᵤ is painful in the REPL. Also, I may need to track correlations in further operations on the computed weighted mean (I know that I can convert Uncertain to Measurement and back).