Suggestion: Fortran-like Keyword to Assert Function is Pure

That’s a great question. What are these purity annotations useful for?

1 Like

Some examples from broadcasting:

  • In f.(u) .+ g.(v') (with u, v both length-N vectors) you could perhaps do N evaluations of f, not N^2 (see the counting sketch after this list). But for something like f(x) = x + rand() this would change the answer, so perhaps that's impure for this purpose.

  • In the gradient calculation for f.(u), knowing that f does not close over other parameters would, I think, simplify how we handle it. Here f(x) = y[1] + x would be pure under Mason's macro's definition, but this needs something stronger.
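
To make the first point concrete, here is a quick count of what the fused broadcast currently does (the counter is only for illustration): f runs once per output element, i.e. N^2 times, even though only N distinct values are needed.

const calls = Ref(0)            # count evaluations of f
f(x) = (calls[] += 1; x^2)
g(x) = x + 1

u = rand(3); v = rand(3)
f.(u) .+ g.(v')                 # fused into a single 3×3 broadcast
calls[]                         # 9 == N^2 evaluations of f, not N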

Edit – it seems that Base.issingletontype(typeof(f)) is a reasonable test for the 2nd case. The 1st seems to want a weaker definition than @pure's; it's not the method table that's the concern. Whether this could be inferred, I don't know. It could also be provided either when defining or when calling the function, like @purecast f.(u) .+ g.(v').
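
For reference, that test behaves as follows (plain Julia; h and k are just illustrative names): a top-level function's type is a singleton, while the type of a closure that captures a local variable is not.

h(x) = 2x                              # no captured state
Base.issingletontype(typeof(h))        # true

k = let y = [1.0]
    x -> y[1] + x                      # closes over the local y
end
Base.issingletontype(typeof(k))        # false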

4 Likes

Currently, the way I'm doing this is to check a method's lowered code with another function that I call explicitly. So it happens after parsing, but before the method is run (and it's only partially effective).

But the parser/lowering is putting those references to variables in Main into the code in the first place, so at that point in the process we could at least deal with globals used within a function? I understand that the called methods may not be defined yet, so it can't be recursive.
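
Roughly, that kind of check might look like this (a sketch only; is_nonconst_global and touches_global_state are hypothetical names, and a real check would need more care with imported and unresolved bindings):

# Flag GlobalRefs to bindings that are not declared const.
is_nonconst_global(x) = x isa GlobalRef && !isconst(x.mod, x.name)
is_nonconst_global(ex::Expr) = any(is_nonconst_global, ex.args)

# Scan a method's lowered code; this sees only the method body itself,
# not the methods it calls, hence "only partially effective".
function touches_global_state(f, argtypes)
    ci = only(code_lowered(f, argtypes))
    return any(is_nonconst_global, ci.code)
end

y = 1.0
g(x) = x + y                             # reads the non-const global y
touches_global_state(g, Tuple{Float64})  # true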

Another level could be that marked no-global methods are only allowed to call other marked no-global methods, which could be checked during first-run compilation. This type of purity seems more difficult, but not impossible.

But this illustrates that these questions are not easy to answer sensibly without a solid understanding of how the compiler operates. We mostly don't know what the "many things are possible" things are. Listing some reasonable options would help me think about this.

2 Likes

e.g. improve constant prop in gcd by simeonschaub · Pull Request #41258 · JuliaLang/julia · GitHub

1 Like

But if the compiler has to verify these annotations anyway, then it could have come to the same conclusion without annotations that a method has whatever kind of purity is needed. In that case, there are two ways the annotations could help:

  1. They're not checked; they are a promise by the user about the behavior of the method in question, which the compiler cannot verify. This is like Base.@pure, which is occasionally useful but way too easy to misuse, which can lead to crashes.

  2. They are checked, and when present they force the compiler to do more work than it would normally do in order to check them.

5 Likes

Out of curiosity, why does Base.@pure have issues with the global method table being mutable?
Why can’t it be invalidated?

It’d be convenient if functions like exp or log could be recognized as pure.
A common example (involving log) where this would help:

using Distributions
d = Normal(12, 3);
x = rand(d,400);
sum(Base.Fix1(logpdf, d), x)

This would be much faster if the log(3) could be hoisted out of the loop.

julia> using Distributions, BenchmarkHistograms

julia> d = Normal(12,3); x = rand(d, 400);

julia> function fastlogpdf(x, d::Normal)
           lp = zero(promote_type(eltype(x), eltype(d)))
           μ = d.μ; σ = d.σ
           @inbounds @simd for i ∈ eachindex(x)
               lp += (x[i] - μ)^2
           end
           -0.5lp/σ^2 - length(x)*(0.5*log(2π) + log(σ))
       end
fastlogpdf (generic function with 1 method)

julia> sum(Base.Fix1(logpdf, d), x) ≈ fastlogpdf(x, d)
true

julia> @benchmark sum(Base.Fix1(logpdf, $d), $x)
samples: 10000; evals/sample: 8; memory estimate: 0 bytes; allocs estimate: 0
ns

 (3118.0 - 3143.0]  ██████████████████████████████ 9734
 (3143.0 - 3168.0]  ▏1
 (3168.0 - 3193.0]   0
 (3193.0 - 3217.0]   0
 (3217.0 - 3242.0]   0
 (3242.0 - 3267.0]  ▊213
 (3267.0 - 3291.0]  ▏24
 (3291.0 - 3316.0]  ▏6
 (3316.0 - 3341.0]  ▏6
 (3341.0 - 3366.0]  ▏1
 (3366.0 - 3390.0]  ▏1
 (3390.0 - 3415.0]  ▏3
 (3415.0 - 3440.0]   0
 (3440.0 - 3464.0]  ▏1
 (3464.0 - 5118.0]  ▏10

                  Counts

min: 3.119 μs (0.00% GC); mean: 3.135 μs (0.00% GC); median: 3.130 μs (0.00% GC); max: 5.118 μs (0.00% GC).

julia> @benchmark fastlogpdf($x, $d)
samples: 10000; evals/sample: 989; memory estimate: 0 bytes; allocs estimate: 0
ns

 (46.1 - 46.8]  ██████████████████████████████ 9526
 (46.8 - 47.5]  █291
 (47.5 - 48.2]  ▋159
 (48.2 - 49.0]  ▏5
 (49.0 - 49.7]  ▏3
 (49.7 - 50.4]   0
 (50.4 - 51.1]  ▏1
 (51.1 - 51.9]  ▏1
 (51.9 - 52.6]   0
 (52.6 - 53.3]   0
 (53.3 - 54.0]   0
 (54.0 - 54.8]  ▏1
 (54.8 - 55.5]  ▏1
 (55.5 - 56.2]  ▏2
 (56.2 - 88.1]  ▏10

                  Counts

min: 46.054 ns (0.00% GC); mean: 46.395 ns (0.00% GC); median: 46.352 ns (0.00% GC); max: 88.143 ns (0.00% GC).

This sort of code is common in Bayesian models, and is many times slower than it has to be.

5 Likes

There are 68 years of effort by thousands of experts and major vendors like IBM, Intel, Cray, NVIDIA, NAG, … behind the Fortran compilers. I doubt that any of them have been "potato" for decades, as you say, unless you provide evidence for such claims.

3 Likes

A mechanism like this would be useful for avoiding “confusing” behavior of, e.g., rand.() .+ sprand(3, 0.1) (edit: it’s already mentioned by mcabbott Suggestion: Fortran-like Keyword to Assert Function is Pure - #45 by mcabbott):

julia> a = sprand(3, 0.1)
3-element SparseVector{Float64, Int64} with 1 stored entry:
  [1]  =  0.474913

julia> y = rand.() .+ a
3-element SparseVector{Float64, Int64} with 3 stored entries:
  [1]  =  1.39306
  [2]  =  0.976046
  [3]  =  0.976046

We have y[2] == y[3] because the broadcasting implementation assumes purity.

We can avoid this if the purity information is available at the API level. That is to say, the sparse broadcasting implementation could try to evaluate lower-dimensional sub-arguments as a pure function first. If that succeeds, it is justified in broadcasting the pre-evaluated results. Otherwise, it needs to broadcast all non-pure arguments first and then materialize the result.
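
As a rough sketch of that choice (everything here is hypothetical: ispure stands in for whatever purity query would exist, and the placeholder below simply always answers false):

using SparseArrays

ispure(f) = false   # placeholder for a hypothetical purity query

# Only take the sparse path, which may evaluate the kernel once for the shared zero
# and reuse the result, when the kernel is known to be pure; otherwise densify so the
# kernel runs independently for every element.
function purity_aware_broadcast(f, a::SparseVector)
    if ispure(f)
        return f.(a)            # sparse path: result for zero entries may be shared
    else
        return f.(Vector(a))    # dense fallback: one evaluation per element
    end
end

# purity_aware_broadcast(x -> rand() + x, sprand(3, 0.1)) then gives three distinct
# random values, unlike the fused sparse broadcast shown above.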



I wonder if there is a way to handle the purity "check" as a (more or less) simple frontend lowering problem. I think this is desirable since, for a dynamic language like Julia, it is an important property that this be independent of inference. For example, the compiler (or even a macro) can, at a relatively early stage of compilation, transform all calls f(args...) (including getindex etc.) in a pure function body to (say)

Base.pureapply(f, args...)

This implements the invariant that a pure function only calls pure functions. There is no need to know the types of args. The default implementation of Base.pureapply can, say, throw an error. [1]

A definition of pure function can be done with

@effectfree foo(args...) = $body

which is lowered to

Base.pureapply(::typeof(foo), args...) = $body
foo(args...) = Base.pureapply(foo, args...)

where all calls in $body are replaced with pureapply.
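
A minimal sketch of that rewriting (not part of the proposal itself: pureapply, rewrite_calls, and @effectfree here are stand-ins defined in the current module rather than in Base, only the short form foo(args...) = body is handled, and hygiene is handled by wholesale escaping):

pureapply(f, args...) = error("$f is not known to be pure")   # default: refuse

# Recursively rewrite every call f(args...) into pureapply(f, args...).
rewrite_calls(ex) = ex
function rewrite_calls(ex::Expr)
    args = map(rewrite_calls, ex.args)
    return ex.head === :call ? Expr(:call, :pureapply, args...) : Expr(ex.head, args...)
end

macro effectfree(def)
    sig, body = def.args                # assumes short form: foo(args...) = body
    fname = sig.args[1]
    fargs = sig.args[2:end]
    newbody = rewrite_calls(body)
    return esc(quote
        function $fname end             # make sure the generic function exists
        pureapply(::typeof($fname), $(fargs...)) = $newbody
        $fname($(fargs...)) = pureapply($fname, $(fargs...))
    end)
end

bar(x) = x + 1
@effectfree foo(x) = 2 * bar(x)
# foo(1)                                # errors: `*` and `bar` have no pureapply methods yet
pureapply(::typeof(bar), x) = bar(x)    # assert (trust) that bar is pure
pureapply(::typeof(*), x, y) = x * y    # likewise for *
foo(1)                                  # == 4, going through pureapply all the way down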

(A similar technique could be useful for annotating that certain functions are blocking (have no yield point), which helps us write correct and efficient code in a cooperative task system.)

A trickier situation is when the implementation uses temporary buffers but makes sure the effect is not observable:

function foo(n)
    tmp = zeros(n)
    acc = 0
    for i in eachindex(tmp)
        tmp[i] = acc
        acc += i
    end
    return sum(tmp)
end

Maybe we need an “unsafe” variant for this:

@effectfree_trustme foo(n) = $body

Note that we can safely use pureapply(sum, tmp) but, of course, tmp[i] = acc cannot be pure. Making it "work" (i.e., rigorously checkable at least at run time) would require a "linear update" version of setindex! and of any other mutation-based API.


  1. To make this usable without inference in a composable manner, it's probably necessary to use the "easier to ask for forgiveness than permission" (EAFP) pattern by returning Union{Nothing,Some}. On one hand, the definition of purity makes the use of EAFP very attractive (as nothing is observable when "asking" fails). On the other hand, implementing the lifting is tricky when there is control flow and there are closures. ↩︎

5 Likes

I thought it was a funny thing to say, but I am also aware that it’s reductive at best which is why I crossed it out.

I did my master’s thesis in Fortran, and that’s the full extent of my experience. Writing Fortran was awful, but I’m sure there is nothing wrong with the compiler.

2 Likes

Learn Fortran 2018 if you get back to it in the future, and please do not present reductive arguments (for any topic or language) unless you are confident of their truth.

1 Like

I hope the Julia community can avoid insulting other people's tools. It may be funny to some, but it will only give us a bad name.

7 Likes

Purity annotations can help by indicating to the compiler that checking for purity is a good idea – functions not labeled pure can reasonably be guessed to be impure, so the compiler can avoid checking whether or not they’re pure. At the same time, the compiler could be more aggressive in optimizing pure functions to take advantage of their purity, without worrying that these checks will end up being a waste of time.

To give a particular example of this:
it is not just dynamic dispatch between methods of a function, some of which are pure (in some sense) and others not.
We also need to worry about pure methods that are not just overwritten, but invalidated by more specific ones.

Consider: let's say we have a package Bar.jl that defines a function that is clearly pure
(maybe we have to annotate that, maybe we don't; it doesn't change the discussion that follows much):

bar(x) = x

And let's say your package Foo.jl uses Bar.jl, and it wants to check that its foo function is (in some sense) pure:

using Bar
@ensure_this_is_pure function foo(x)
     2*bar(x)
end

All is well in the world.

Now you go and load a library Qux.jl, which also happens to use Bar.jl for something unrelated, and it contains the code:

Bar.bar(x::Qux) = rand()

Does this now trigger an error, because this Qux type could get passed into foo, which would make it impure?
I guess not until it is actually called and the JIT compiles it, because the alternative is too awful: unrelated code from somewhere distant throwing errors.
Spooky errors at a distance.

As was said, all things are possible, though.
It's just hard, some things have to be worked out,
and it has to be determined whether they have an answer that is useful.

4 Likes

Thanks for the explanation, this makes a lot of sense. I think that yeah, not throwing errors until JIT compilation is the best choice. Linters and other tools can help with static checking to catch these sorts of cases ahead of time; they'll be imperfect, but no language can ever catch 100% of bugs. Although, if a pure function calls another function which has both pure and impure methods, it should probably select a pure method even if there is a more specific impure one, which I think would reduce the risk of this kind of thing happening. (This has the coincidental benefit of making compilation slightly faster, since it narrows down the list of methods you're searching through.)

In the case of overwriting, I think you could have pure functions be governed by the same rules as constants – overwriting a pure function throws an error. You could overwrite a pure function in the REPL, but you’d get the same warning about previously-compiled functions possibly breaking.

I think this is a dealbreaker. You would have two categories of functions: the pure/fast/inflexible ones, and the impure/slow/flexible ones. Julia would not have dynamic dispatch for every function anymore, and you would lose flexibility because someone in your stack was paranoid about performance and declared everything pure. Also, you would need some boundary or special syntax, otherwise how can someone declare a pure function with multiple methods? Either you restrict pure functions to a single method, or you need a way to say “these are all the methods of this pure function, no more will come after, no need to worry about the mutable method table”. If you allow adding pure methods to a pure function after the first use of the function, you either lose the guarantee that dynamic dispatch works (i.e., the more specific pure method will not be called) or you lose purity (i.e., the pure function can return a different result for the same parameters because now there is a more specific method).

Maybe purity could be something defined at the call site, by adding a Val{:pure}/Val{:impure} as the first argument of the methods and using the current dispatch machinery, under the contract that nobody will ever add new methods taking Val{:pure} as the first parameter outside of some boundary (or after the first such call).
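
For illustration, that call-site idea might look roughly like this (a sketch of the pattern only; mylog is a made-up function, not an actual proposal):

mylog(::Val{:pure}, x) = log(x)        # the "frozen" entry point; contract: no more
                                       # methods taking Val{:pure} will ever be added
mylog(x) = mylog(Val(:pure), x)        # ordinary, still-extensible entry point

# A caller (or the compiler) that needs the purity guarantee dispatches explicitly:
mylog(Val(:pure), 3.0)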

1 Like

How often do you actually overwrite methods in your code, outside of the REPL? I’m not talking about adding an additional method that dispatches on different types. I’m talking about completely replacing a method with a different method that dispatches on the exact same types.

I'm not sure frequency factors into it; it's allowed and possible, so you can't assume it's not done.

And this problem happens with normal overloading too:

a = 3
f(x::Number) = 1
f(a) == 1          # true: dispatches to the Number method
f(x::Integer) = 2  # add a more specific method, without overwriting anything
f(a) != 1          # now true: the same call hits the new Integer method
2 Likes

This is not the only situation that can change what methods get called given the same set of arguments. A package could have a generic fallback on abstract types, then another package or user could come along, add a subtype of that abstract type, and add a more specific method.
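
A small illustration of that scenario (made-up names; plain Julia, no purity machinery involved):

# In the original package:
abstract type AbstractThing end
describe(::AbstractThing) = "generic"        # fallback, arguably "pure"

summarize(things) = join(map(describe, things), ", ")

# Later, in another package (or user code):
struct Special <: AbstractThing end
describe(::Special) = string(rand())         # more specific, and impure

# summarize now behaves impurely for collections containing a Special,
# even though nothing in the original package was overwritten:
summarize([Special(), Special()])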

2 Likes

Yes, I’m aware. If you’re not overwriting the method, I’m saying that the more specific method should be ignored when called from inside a pure function if it’s not pure. This strikes me as a good thing – if a more specific method is impure while the more general method is pure, they’re probably doing different things and aren’t interchangeable, and function calls relying on purity can reasonably be inferred to want to use the pure method.

With regards to frequency, I’m saying that disallowing this – complete overwriting of pure methods, rather than just adding a new impure method to a function – would presumably not be a big burden on anyone, since it’s very rare that people want to completely overwrite a method. (Especially one that you’d think of as pure.)

Thanks for giving the example of normal overloading, though – I see now why the proposal of just defaulting to the most specific pure method wouldn’t work, since even introducing a new pure method can create impure behavior. Yeah, that’s a real problem, and I’ll have to think about it.