Does Julia cache version of function with given input value?

Say I am doing y = x[perm] for a permutation perm. I believe that Julia would dispatch and compile a version of getindex(x, perm) for the types I called it with. Now if I call y1 = x1[perm] again with the same types but a different vector x1, Julia would just reuse the already-compiled function. Please correct me if I am wrong here.

Since perm is the same for both calls, it would be efficient to cache the function x -> x[perm] for later reuse. Does Julia do this automatically?

What if I wrap that step inside a function?

function dowork(x)
    # some steps
    y = x[perm]
    # other steps
end

My guess is that the answer to both questions is NO (but for different reasons). So I need to ask for it explicitly, by defining x -> x[perm].

Correct.

No. Julia caches the compiled version of that anonymous function, but not the lookup result. perm is a variable captured from outside the function and could thus change, affecting a future call to your anonymous function. x is a parameter passed into the anonymous function; the function alone does not even fix the type of x, so the getindex call may do something entirely different from indexing into an actual array. What happens depends on what gets passed in.
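A minimal sketch of that capture behaviour (variable names are my own); rebinding the global perm changes what the anonymous function does on its next call:

```julia
perm = [2, 1, 3]           # non-const global, referenced by the closure below
f = x -> x[perm]

a = f([10, 20, 30])        # uses the current value of perm

perm = [3, 2, 1]           # rebinding the global affects future calls
b = f([10, 20, 30])        # same x, same f, different result
```

Here a == [20, 10, 30] but b == [30, 20, 10], so the "result" of f cannot be cached by value: it depends on state outside the function.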

No. The results of function calls are never cached automatically, except when you explicitly ask for a cache, e.g. by defining your function with something like Memoize.jl, which trades memory usage for execution speed (a win only if the function is expensive enough).
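To make the trade-off concrete, here is a hand-rolled version of the idea that Memoize.jl automates with its @memoize macro (the cache and function names are my own):

```julia
# Explicit result caching: store computed answers in a Dict keyed by the
# argument, so repeated calls with the same input skip the computation.
const square_cache = Dict{Int,Int}()

function cached_square(x::Int)
    get!(square_cache, x) do
        x^2               # runs only on a cache miss
    end
end

cached_square(4)   # computes 16 and stores it
cached_square(4)   # returns the stored 16 without recomputing
```

Note that this is only safe because cached_square depends on nothing but its argument; the anonymous function x -> x[perm] above reads the global perm, so caching its results by x alone would silently go stale.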


Julia's functions are not mathematical functions in the sense that their syntax/statements are the values they represent (as in Haskell). They are imperative steps for a machine to execute.


Which part are you asking about here? Whether the compilation of x[perm] is reused or whether the result of that is reused when you later write x1[perm]?

The first one: whether the compilation of x[perm] is reused. Also, if I declare const perm = [1, 2, 3], would that change anything? I have in mind examples like addone(x) = x + 1 or addconst(x) = x + c for a global constant c. I should have run @code_lowered on these myself, but maybe you have the answer at hand.

Under no circumstances will Julia "cache" values/results automatically.

No. The computation is done at runtime, unless the value constant-propagates into your function and you cannot observe the difference between it being computed at compile time vs. at runtime. That is still different from caching calculations, though.

Compiling the method getindex for argument types Vector{Int} and Int (array and index, respectively) is different to running that compiled method and returning a result. During compilation, only types are visible.
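A small illustration of that separation (values and names are my own): specialization happens per type signature, so every Vector{Int} goes through the same compiled method instance, whatever its contents.

```julia
perm = [2, 1]

x1 = [10, 20]
x2 = [30, 40]

# Both calls dispatch on the types (Vector{Int}, Vector{Int}); the
# compiled getindex method is created once for that signature and then
# reused, regardless of the values stored in x1, x2, or perm.
x1[perm]    # the result is computed at runtime from the current values
x2[perm]    # same compiled code, different data, different result
```

The first call gives [20, 10] and the second [40, 30]: one compiled method, two runtime results, and nothing about the values was baked in at compile time.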


It sounds like there’s a little bit of confusion/something is unclear about types and values and how both of them behave in an imperative computer program. May I ask, what’s your background?

I have a physics background and am not very well-versed in programming. Sorry for the basic question; I should have learned the Julia basics properly.

julia> const one = 1
1

julia> add1(x) = x + 1
add1 (generic function with 1 method)

julia> addone(x) = x + one
addone (generic function with 1 method)

julia> addvar(x) = x + y
addvar (generic function with 1 method)

julia> y = 1
1

julia> [add1(1) addone(1) addvar(1)]
1×3 Matrix{Int64}:
 2  2  2

julia> @code_llvm add1(1)
;  @ REPL[75]:1 within `add1'
define i64 @julia_add1_2957(i64 signext %0) {
top:
; ┌ @ int.jl:87 within `+'
   %1 = add i64 %0, 1
; └
  ret i64 %1
}

julia> @code_llvm addone(1)
;  @ REPL[76]:1 within `addone'
define i64 @julia_addone_2959(i64 signext %0) {
top:
; ┌ @ int.jl:87 within `+'
   %1 = add i64 %0, 1
; └
  ret i64 %1
}

julia> @code_llvm addvar(1)
;  @ REPL[77]:1 within `addvar'
define nonnull {}* @julia_addvar_2961(i64 signext %0) {
top:
  %1 = alloca [2 x {}*], align 8
  %gcframe2 = alloca [4 x {}*], align 16
  %gcframe2.sub = getelementptr inbounds [4 x {}*], [4 x {}*]* %gcframe2, i64 0, i64 0
  %.sub = getelementptr inbounds [2 x {}*], [2 x {}*]* %1, i64 0, i64 0
  %2 = bitcast [4 x {}*]* %gcframe2 to i8*
  call void @llvm.memset.p0i8.i32(i8* nonnull align 16 dereferenceable(32) %2, i8 0, i32 32, i1 false)
  %3 = call {}*** inttoptr (i64 4348951440 to {}*** ()*)() #4
  %4 = bitcast [4 x {}*]* %gcframe2 to i64*
  store i64 8, i64* %4, align 16
  %5 = bitcast {}*** %3 to i64*
  %6 = load i64, i64* %5, align 8
  %7 = getelementptr inbounds [4 x {}*], [4 x {}*]* %gcframe2, i64 0, i64 1
  %8 = bitcast {}** %7 to i64*
  store i64 %6, i64* %8, align 8
  %9 = bitcast {}*** %3 to {}***
  store {}** %gcframe2.sub, {}*** %9, align 8
  %10 = load atomic i64, i64* inttoptr (i64 5759690280 to i64*) unordered, align 8
  %11 = getelementptr inbounds [4 x {}*], [4 x {}*]* %gcframe2, i64 0, i64 2
  %12 = bitcast {}** %11 to i64*
  store i64 %10, i64* %12, align 16
  %13 = call nonnull {}* @jl_box_int64(i64 signext %0)
  %14 = getelementptr inbounds [4 x {}*], [4 x {}*]* %gcframe2, i64 0, i64 3
  store {}* %13, {}** %14, align 8
  store {}* %13, {}** %.sub, align 8
  %15 = getelementptr inbounds [2 x {}*], [2 x {}*]* %1, i64 0, i64 1
  %16 = bitcast {}** %15 to i64*
  store i64 %10, i64* %16, align 8
  %17 = call nonnull {}* @jl_apply_generic({}* inttoptr (i64 4463799936 to {}*), {}** nonnull %.sub, i32 2)
  %18 = load i64, i64* %8, align 8
  store i64 %18, i64* %5, align 8
  ret {}* %17
}

julia> z = 1
1

julia> typeof(z)
Int64

julia> addint(x) = x + z
addint (generic function with 1 method)

julia> @code_llvm addint(1)
;  @ REPL[85]:1 within `addint'
define nonnull {}* @julia_addint_2963(i64 signext %0) {
top:
  %1 = alloca [2 x {}*], align 8
  %gcframe2 = alloca [4 x {}*], align 16
  %gcframe2.sub = getelementptr inbounds [4 x {}*], [4 x {}*]* %gcframe2, i64 0, i64 0
  %.sub = getelementptr inbounds [2 x {}*], [2 x {}*]* %1, i64 0, i64 0
  %2 = bitcast [4 x {}*]* %gcframe2 to i8*
  call void @llvm.memset.p0i8.i32(i8* nonnull align 16 dereferenceable(32) %2, i8 0, i32 32, i1 false)
  %3 = call {}*** inttoptr (i64 4348951440 to {}*** ()*)() #4
  %4 = bitcast [4 x {}*]* %gcframe2 to i64*
  store i64 8, i64* %4, align 16
  %5 = bitcast {}*** %3 to i64*
  %6 = load i64, i64* %5, align 8
  %7 = getelementptr inbounds [4 x {}*], [4 x {}*]* %gcframe2, i64 0, i64 1
  %8 = bitcast {}** %7 to i64*
  store i64 %6, i64* %8, align 8
  %9 = bitcast {}*** %3 to {}***
  store {}** %gcframe2.sub, {}*** %9, align 8
  %10 = load atomic i64, i64* inttoptr (i64 5681719064 to i64*) unordered, align 8
  %11 = getelementptr inbounds [4 x {}*], [4 x {}*]* %gcframe2, i64 0, i64 2
  %12 = bitcast {}** %11 to i64*
  store i64 %10, i64* %12, align 16
  %13 = call nonnull {}* @jl_box_int64(i64 signext %0)
  %14 = getelementptr inbounds [4 x {}*], [4 x {}*]* %gcframe2, i64 0, i64 3
  store {}* %13, {}** %14, align 8
  store {}* %13, {}** %.sub, align 8
  %15 = getelementptr inbounds [2 x {}*], [2 x {}*]* %1, i64 0, i64 1
  %16 = bitcast {}** %15 to i64*
  store i64 %10, i64* %16, align 8
  %17 = call nonnull {}* @jl_apply_generic({}* inttoptr (i64 4463799936 to {}*), {}** nonnull %.sub, i32 2)
  %18 = load i64, i64* %8, align 8
  store i64 %18, i64* %5, align 8
  ret {}* %17
}

julia> using BenchmarkTools

julia> @btime add1(1)
  0.085 ns (0 allocations: 0 bytes)
2

julia> @btime addone(1)
  0.078 ns (0 allocations: 0 bytes)
2

julia> @btime addvar(1)
  26.882 ns (0 allocations: 0 bytes)
2

julia> @btime addint(1)
  34.177 ns (0 allocations: 0 bytes)
2

You're seeing the effect of y being a non-constant global variable.

This is not effective benchmarking: the call is constant-folded away, so the reported time is not real (it is less than one CPU cycle).
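To defeat the constant folding, a common BenchmarkTools idiom is to interpolate the argument through a Ref so its value is not a literal known to the compiler (sketch below, assuming BenchmarkTools is installed; add1 is redefined here for self-containment):

```julia
using BenchmarkTools

add1(x) = x + 1

# Interpolating a Ref and dereferencing it inside the benchmarked
# expression hides the value from the compiler, so the addition is
# actually measured instead of being folded to a constant.
x = 1
@btime add1($(Ref(x))[])
```

With this pattern the reported time should be on the order of a nanosecond or more, a plausible cost for the operation, rather than the sub-cycle figure printed above.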
