Save dispatched function for later call

Hey Julianners,
Is it possible to save a function call like this?

  • not plain “Base.+”, but the dispatched version, like +(::Float32, ::Float32)

So you mean you want to get the specialized method object, right? What’s the use of that?

I am 100% sure which method will run, and I don’t want millions upon millions of multiple-dispatch lookups to happen due to the saved “+” and the other functions.

Yes, the specialized method should be saved and called an incredibly large number of times. Is that possible?

It won’t always be dynamic dispatch, if that’s what you’re worried about. You likely have an incorrect mental model of how Julia works.

If you already know which method will run, then I guess the types of the inputs are fixed. Julia will select that specialized method automatically when compiling the call, so you don’t need to do anything.
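
For example (an illustrative snippet, not from the original post; add32 is a hypothetical wrapper):

add32(x::Float32, y::Float32) = x + y   # hypothetical wrapper with concrete argument types
@code_typed add32(1.0f0, 2.0f0)         # body is just Base.add_float -- no dispatch left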

But maybe you want the method object for other purposes. Can you be more specific about what you want to do with these specialized methods?

Yeah, I can be more specific.
I want something like this:

deref(ref::Base.RefValue{T}) where T = ref[]   # unwrap a Ref
deref(val) = val                               # pass plain values through

ops = [+, -, +]
out = [Ref(0), Ref(0), Ref(0)]
ins = [(2, 3), (4, 5), (out[1], out[2])]       # the third op reads the first two outputs

for (i, op) in enumerate(ops)
    in1, in2 = deref(ins[i][1]), deref(ins[i][2])
    out[i][] = op(in1, in2)
end

So I want to create a sort of computation graph, develop different optimizations for this operation list, and do a lot of other research along these lines.

I am afraid that, because the functions are stored in an array, “op(…)” is a bit abstract and the compiler can’t figure out which of the [+,-,+…] functions will run at compile time.
In any case, I specifically don’t want it to figure that out! I want to specify the method directly, because I don’t want the compiler or the runtime to carry any load from dispatching to the appropriate method.

You may find Dagger.jl interesting, because it has some sort of computation graph internally.

Yes, I know Dagger.jl. I want this to be different in many ways, and I didn’t see how I could apply Dagger.jl effectively in my scenario, even though its scheduler sounds really great. I also find it very sluggish…

So I get it. You want some “cheap” runtime dispatches, right? You still need to perform some dispatch here, since the input types are unknown (they can be a RefValue or an Int). But you don’t want to perform the unnecessary method-table lookup (since you only want to invoke one specific method).

There are two issues here:

  1. ins is a Vector{Any} here, so ins[i] is also inferred as Any. You must perform a runtime dispatch in this case, since you may store anything in ins and the elements can have any type. If you only store RefValue{Int} and Int, you can type ins as Vector{Union{RefValue{Int},Int}} instead of Vector{Any} (see the sketch after this list).
  2. Since you store the functions in a vector ops, I guess you want to give these functions a fixed call type, like Int32xInt32->Int32. Currently there is no official way to do this, only experimental support (search for FunctionWrappers.jl on GitHub). It should “just” work for you.
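
A minimal sketch of the first fix (illustrative values, not from the original post):

deref(ref::Base.RefValue) = ref[]   # as in the earlier post
deref(val) = val
ins = Union{Base.RefValue{Int}, Int}[Ref(1), 2, Ref(3)]
total = sum(deref, ins)   # 6; the compiler can union-split instead of fully dispatching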

But before you fix these issues: I think you’re just doing things the wrong way. You should not do this inside the language. If you want to create a computation graph, you should work directly on Julia’s typed IR (or create your own IR / typed AST / graph DSL) and perform code generation like a real compiler.


Yeah, indeed that was too bare an example. To resolve the first issue you mention:

IN1 = Float32[2]
IN2 = Float32[3]
IN3 = Float32[7]
IN4 = Float32[5]
ops = [+, -, +]
outs = Float32[0, 0, 0]
ins = [
  (Base.RefArray(IN1, 1), Base.RefArray(IN2, 1)),
  (Base.RefArray(IN3, 1), Base.RefArray(IN4, 1)),
  (Base.RefArray(outs, 1), Base.RefArray(outs, 2)),  # reads the first two results
]

for (i, op) in enumerate(ops)
  outs[i] = op(ins[i][1][], ins[i][2][])   # dereference each RefArray, then apply op
end

For the second issue: FunctionWrappers.jl sounds extremely promising! I hope it really works!

Doesn’t the typed AST sound too low-level for my task? To me this computation-list approach sounds like a very easy solution for my specific case; if I could work with a TypedFunction it would be great.

What you’re doing right now is basically implementing a dynamically dispatched interpreter, using existing Julia functionality for all those dynamic parts. Those parts are so loaded with lookups that a good step up would be to just generate Julia code with metaprogramming in the first place.

If Julia’s semantics aren’t useful for your interpretation of operations (which I doubt a little), writing your own language (and grammar, parser, lexer, compiler, etc.) to avoid dynamic dispatch (so your language would have to be strongly statically typed) would seem like the only remaining option.

I love Julia. I don’t want a new language. I believe Julia is really extensible and has close to no limits.
I am just looking for a solution, and I feel it was worth asking the question: FunctionWrappers.jl looks really promising, but I have to test that it really works the way we believe.

Are your actual arrays very much bigger than in your example? If not, have you considered working with tuples instead? They retain the type information of each element statically.
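
For instance (a hypothetical sketch, not from the thread):

ops = (+, -, +)                                   # Tuple{typeof(+), typeof(-), typeof(+)}
apply_all(ops, a, b) = map(op -> op(a, b), ops)   # unrolled, every call statically dispatched
apply_all(ops, 2, 3)                              # (5, -1, 5)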

I agree with @Sukera, you are just trying to implement an interpreter (or a virtual machine) over an extremely simple arithmetic language. We can do this easily in Haskell in fewer than 100 lines of code. If you believe this counts as a computation graph, you might just be wasting your time.

It’s indeed possible to twist function pointers in Julia, just like FunctionWrappers.jl does, to achieve a more general interpreter (for arbitrary types and functions). And if you know how to do this in C/C++, then you can do the same kind of low-level thing in Julia (you just need to use Julia’s internals). But this is neither easy nor beneficial. We already have a task scheduler: you can directly spawn the computation and let Julia do it. Tasks do have some overhead and may not be appropriate if you have lots of small computations. But if the computation is small, then we shouldn’t spawn it at all (besides, a dynamic interpreter also has some small overheads, because we need to pack and unpack parameters to execute an opcode).
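
For example, a minimal sketch of spawning work as a task (illustrative only; this tiny computation is far too small to be worth a task):

t = Threads.@spawn 2 + 3   # schedule on Julia's built-in task scheduler
fetch(t)                   # 5; the task overhead dwarfs the actual work here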

Instead, you should look into Debugger.jl or Julia’s abstract-interpreter framework. That work is much closer to what you are doing here.

PS: If you really, really want to know how to get function pointers: use which(f, tt) to get the Method object. The specializations field of the Method stores the MethodInstances for this call. Then the cache field of a MethodInstance stores a CodeInstance, which is just like a handle. The specptr field of the CodeInstance stores the function pointer of the specialized code. The invoke field also stores a pointer; the difference is that the pointer in invoke is a wrapper of the pointer in specptr. It has the signature void* f(void* tparams, void** args, int32 nargs), which is the same for all functions, so you can ccall it in a uniform way (there is no dispatch, since the types of the inputs are fixed).
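
A hedged sketch of that chain (these are unexported internals; field layouts change across Julia versions):

m = which(+, Tuple{Float32, Float32})   # Method for +(::Float32, ::Float32)
# m.specializations     -> MethodInstance(s) for concrete signatures
# MethodInstance.cache  -> CodeInstance
# CodeInstance.specptr / CodeInstance.invoke -> raw function pointers; the
# latter is ccall-able with the uniform void* f(void*, void**, int32) ABI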

Yes, sorry. These are just sample data; in development and production these will be very big arrays, or in some cases tensors.

I understand Haskell would mean fewer than 100 lines of code for this specific case. But our company’s research codebase is in Julia, and it would be a real waste of time to port everything over seamlessly. Also, if it is possible and fast in Julia, why would we mix languages?

Thank you for all the points, I will try to check them!

Sadly, FunctionWrappers.jl wasn’t clear enough for me and I couldn’t apply it appropriately. :frowning:

Why not take the representation you have and “compile” it to Julia expressions (i.e., write a macro), and then let the Julia compiler compile that to machine code? If you “know” the types of the inputs, then you simply write the Julia expressions with type annotations.

a = 1
b = 2
foo = Expr(:call, :+, a::Int, b::Int)   # builds the expression :(1 + 2)
eval(foo)
# 3

Instead of simple expressions you could generate functions that carry out the operations of interest.
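
For example, a minimal sketch (run_ops! is a hypothetical name): generate one function for the whole op list, eval it once, and then call it as many times as you like at full compiled speed.

ex = :(function run_ops!(outs, a, b)
    outs[1] = a + b
    outs[2] = a - b
    outs[3] = a + b
    return outs
end)
eval(ex)                         # pay the eval/compile cost once
run_ops!(zeros(Int, 3), 2, 3)    # [5, -1, 5]; later calls are ordinary compiled code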

The problem here is the speed; that way is extremely inefficient.
eval shouldn’t be called this many times, because it is very time-consuming:

using BenchmarkTools

function f(ops)
  for op in ops
    eval(op)    # evaluates each stored Expr at global scope -- slow
  end
end

function g(ops, a, b)
  for op in ops
    +(a, b)     # an ordinary compiled call, for comparison
  end
end

a=2
b=3

plus = Expr(:call, :+, a::Int, b::Int)
ops=[plus, plus, plus]

@btime f($ops)
@btime g($ops, $a, $b)

For me it shows:
163.592 μs (99 allocations: 6.56 KiB)
1.828 ns (0 allocations: 0 bytes)

Maybe this is the important point to resolve? As far as I can tell, FunctionWrappers.jl is exactly the right solution to your original problem. For example:

julia> using FunctionWrappers: FunctionWrapper

julia> ops = [+, *]
2-element Vector{Function}:
 + (generic function with 190 methods)
 * (generic function with 328 methods)

julia> wrapped_ops = [FunctionWrapper{Int, Tuple{Int, Int}}(op) for op in ops]
2-element Vector{FunctionWrapper{Int64, Tuple{Int64, Int64}}}:
 FunctionWrapper{Int64, Tuple{Int64, Int64}}(Ptr{Nothing} @0x00007fe24935fed0, Ptr{Nothing} @0x00007fe297964a78, Base.RefValue{typeof(+)}(+), typeof(+))
 FunctionWrapper{Int64, Tuple{Int64, Int64}}(Ptr{Nothing} @0x00007fe249360190, Ptr{Nothing} @0x00007fe297964a80, Base.RefValue{typeof(*)}(*), typeof(*))

julia> using BenchmarkTools

julia> @btime (for i in 1:2; z[i] = $ops[i](x, y); end) setup=(x = 1; y = 2; z = zeros(Int, 2))
  60.315 ns (0 allocations: 0 bytes)

julia> @btime (for i in 1:2; z[i] = $wrapped_ops[i](x, y); end) setup=(x = 1; y = 2; z = zeros(Int, 2))
  12.344 ns (0 allocations: 0 bytes)

wrapped_ops is a concretely typed Vector of function wrappers, so we can iterate over it and call the wrapped operations much more efficiently.
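
Applied to your original example, that might look like this (a sketch assuming Float32 inputs, as in your earlier post):

using FunctionWrappers: FunctionWrapper
const F32Op = FunctionWrapper{Float32, Tuple{Float32, Float32}}
ops = F32Op[F32Op(+), F32Op(-), F32Op(+)]   # concretely typed vector of wrappers
outs = Float32[0, 0, 0]
outs[1] = ops[1](2.0f0, 3.0f0)              # fixed signature, no method-table lookup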

If you find the FunctionWrapper syntax awkward, I wrote a macro here: Can FunctionWrappers.jl express higher order functions? - #4 by rdeits, which might help:

wrapped_ops = [@fn((Int, Int) -> Int)(op) for op in ops]

or, equivalently:

wrapped_ops = @fn((Int, Int) -> Int).(ops)

I still don’t really understand how you’re planning to use this. If you’re planning to execute each input exactly once, then the compilation time will be unacceptable. If you’re planning to take some user input and then do some intense calculations with it, reusing the operations over and over in a loop, then the compilation time will matter less, and the fact that the operations are fast will be advantageous.