Distributions.jl interop for probprog

For probabilistic programming in particular, it’s critical to be able to use distribution-related functions (especially logpdf) in non-standard ways. For example, I would love to be able to call logpdf(TDist(nu),x) where nu and/or x have type

  • ApproxFun.Fun
  • SymPy.Sym
  • ModelingToolkit.Operation

It seems there are at least two things that get in the way:

  • checks like Distributions.@check_args that can’t be determined statically
  • type constraints like logpdf(::Normal, ::Real)

Neither of these is in itself a bad thing. But it seems critical to have an easy workaround for both.

For the first case, one idea is for Distributions.jl to have an unexported “unsafe” kernel _logpdf with no check. Then the exported logpdf could wrap this in the appropriate check. It may be possible to use a similar trick to have something available without the type constraint, but this is less clear to me. Of course it would be better to be able to force other libraries’ types to be ::Real (if they really ought to be), but I don’t know that this can be done without either lots of forking or lots of wrappers.

If Distributions.jl isn’t general-purpose, people will write their own alternatives, and we’ll have lots of duplicated effort. There’s already some of this - I had to fork Distributions to get it to play nice with SymPy, and (as I understand it) the Gen team built their own version of it to have nice gradients.

Does an approach like the above make sense, or would it break other things? Is there a better way to go about it? Are there other capabilities specific to probabilistic programming that Distributions.jl should provide?

cc: @cpfiffer @torfjelde @mohamed82008 @alex-lew @mbesancon @zennatavares @willtebbutt @Elrod

1 Like

For that, I think we had the same discussion earlier in Type constraints in Distributions and StatsFuns.

1 Like

So we did, thanks for linking. Do you think having f(x::Real) call a generic, no-bounds-check, unexported _f(x) could solve both problems?

It could but I don’t think that’s a neat fix. Perhaps, someone else has a more useful opinion on this. A keyword argument can also work to remove the check.

2 Likes

I actually like the idea from an ease-of-use persepctive, but I wonder if someone could speak to what the performance of this would be. Does the fact that the “workhorse” function doesn’t actually have formal typing reduce performance?

This really only addresses the @check_args call. I think probably the bigger hamstring is the type constraints. Maybe there’s like a subtyping trick that’s possible?

As I understand there’s no penalty for generic functions, as long as they’re type-stable. Code specializes each time it’s called with a different argument type.

I like the idea of a keyword argument to remove checks

Wouldn’t the body of the function still need to check the keyword, making this still a runtime check?

Indeed, we could make this the last positional argument with Val{true} and Val{false} to dispatch on

1 Like

Not necessarily. Constant prop on the literal will likely kick in.

1 Like

Any way to encourage this? Does inlining help?

Otherwise, I like @mbesancon’s Val types idea, but we could even make it struct Unsafe end and effectively push the check to the type level

Ya this is addressed in the other topic I linked Type constraints in Distributions and StatsFuns.

1 Like

The first official release of Distributions.jl was in 2013, 6 years ago, before the actual release of Julia 0.3. IMO a lot of its design reflects that, eg

  1. it is full of ostensibly performance-enhancing checks and conversions which may not needed anymore, as the compiler got way smarter,

  2. the idiomatic Julia style for writing generic code has matured a lot since these days, using traits instead of elaborate type hierarchies, and duck typing where one can get away with it.

This is natural — code ages, and needs to be dusted off occasionally.

A complete, breaking rewrite & refactor by a small team of experienced Julia programmers would probably be the best way to remedy this, but I am not sure how to engineer the consensus necessary for that. It may come to a complete rewrite in a parallel package.

A lot of the checks could probably be removed and the code would just error anyway for incompatible dimensions. All subtypes of Real, and vectors with elements ::Real (not necessarily Vector{<:Real}, for well-known reasons, eg Any[1.0, 1]) should just work.

But for other types, eg symbolic, you may have to provide your own methods if they cannot trickle through an interface designed for numbers. Eg I am not sure what ApproxFun.Fun should do as an argument.

3 Likes

I just remembered there is already a NoArgCheck empty struct which can be used for that purpose, with the inner constructor doing nothing, the “standard construction method” checking args and the T(params, ::NoArgCheck) bypassing that

Removing forced check at every distribution construction happening on:
https://github.com/JuliaStats/Distributions.jl/pull/942

1 Like

Can you (or @cscherrer) expain how it helps this this particular use case?

My understanding is that for values that are ::Real or arrays of such, and support the usual operations on numbers that checks use (eg <, etc), skipping the checks is not needed to just make things work (they are useful as an optimization though, but that’s another use case, which I suppose motivated your PR).

For other types, eg symbolic values, it is very likely that a different implementation is needed for most distributions, as special functions won’t play nice with these objects (some pdfs would be an exception). This will require another method definition anyway.

Yes I think the check_args issues is mostly an optimization, Chad reported the two but my understanding is that they are not related. The arg_check was the easy one because consensual, lots of use cases remove the needs for verification.

Now removing all type bounds is a bit more tricky and doubling the functions not a nice solution either. My feeling is that the symbolic libraries mentioned should use a wrapper type which subtypes Real or Number

1 Like

If folks here want to take a look, the PR for optional removal of @check_args has been merged to master. If this seems ok, we will release v0.21.1 for people to have it (nothing breaking, the default is still to check).

Also, I’d be glad to get more opinions on #945 from @richardreeve, which fixes the semantics of the type hierarchy, which has been blocking us for some PRs (including Dirac for example)

3 Likes