Computed goto (or labels as values) in Julia?

Hi,

I’m currently interested in improving the performance of Julia code that emulates a finite state machine. It reads input data byte by byte and branches on each byte. A branch looks like this:

byte = data[p+=1]
if byte in 0x30:0x39
    # do something...
    @goto state7
elseif byte in 0x2e:0x2e
    # do something...
    @goto state2
else
    @goto exit
end

Recently I read a blog post saying that computed goto (or labels as values) is used to improve the performance of the virtual machines of interpreted languages such as Python. It looks very similar to what I do in FSM code, and I’d like to try it in Julia. Julia also has a @goto statement, but I have no idea whether it is possible to do the same thing with it. #5410 may be related but is not the same. If it were possible, the code might look like:

byte = data[p+=1]
@goto dispatch[byte]
@label number
    # do something...
    @goto state7
@label dot
    # do something...
    @goto state2
@label default
    @goto exit

I’m not sure whether it would actually improve performance, but if it does, I’d like to introduce it in Automa.jl.

No, we currently don’t have computed goto, though it is used in flisp and the GC.

Your current usage shouldn’t need it, though. LLVM should be smart enough to lower the branch into a switch, and that will be equivalent to (or better than) what you want to change it to, since it is much easier for the compiler to figure out the target table itself. Of the two advantages mentioned in that blog post, the first one doesn’t apply here and could also be achieved with an assume function. The second one is basically jump threading; since you aren’t doing that here, LLVM might do it for you anyway, and it’s a less useful optimization on modern hardware, so I don’t think it’ll make a huge difference.
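If you want to check what LLVM actually does with a branch like yours, you can look at the IR it generates. Below is a minimal sketch, assuming a made-up next_state function in the same shape as your branch (the states and return values are only for illustration):

using InteractiveUtils

# A made-up byte dispatch in the same shape as the FSM branch above.
function next_state(byte::UInt8)
    if byte in 0x30:0x39      # '0'-'9'
        return 7
    elseif byte == 0x2e       # '.'
        return 2
    elseif byte == 0x2b       # '+'
        return 3
    elseif byte == 0x2d       # '-'
        return 4
    else
        return 0              # exit
    end
end

# Inspect the optimized LLVM IR; depending on the number and density of the
# cases, the if/elseif chain may show up as a single switch instruction
# (or even a table lookup) rather than a series of conditional branches.
@code_llvm next_state(0x2e)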


Thanks a lot.

Your explanation makes sense to me. I read the generated code and found that some if-else branches are transformed into switch instructions in the LLVM code. I still think the performance of complicated if-else branches might be improved by computed goto, but creating so many 256-word dispatch tables may just be a waste of memory.

switch statements are lowered to jump tables, so LLVM will create a 256-entry jump table if it thinks that’s better than a bounds check plus a smaller jump table. I’d usually trust LLVM to make low-level decisions like this.

Is the (still open) issue below just a cost model thing?

It’s a fixed thing…

So, is it possible to get a jump-threaded table with Julia as in that post? Or have CPUs gotten so much better in four years that the 20–25% advantage for interpreters no longer holds?

Jump threading is not related to any Julia code in this thread (not the OP’s original code and not the linked issue).
Five years is quite a long time in CPU design, and the branch misprediction penalty is much smaller now. In particular, the branch predictor is said to have been completely redesigned in Haswell and to handle repetitive patterns much better.



"""

    function labeledgoto(x,y,z)

    labeledgoto example

    very useful, keep it in Julia

"""

​

function labeledgoto(x,y,z)

#wonderfully useful 

#keep it in Julia    

    println("x=$x")

    if (y > 1.e0)

        @goto a33

    end

    println("y=$y")

    @goto a31

    @label a33

    println("@label a33")

    return

    @label a31 

    println("@label a31")

    return 

end

println("with arguments 1.,3.,1.")
labeledgoto(1., 3., 1.)

println("with arguments 1.,1.,1.")
labeledgoto(1., 1., 1.)

with arguments 1.,3.,1.
x=1.0
@label a33
with arguments 1.,1.,1.
x=1.0
y=1.0
@label a31

Yichao is absolutely right, LLVM is smart enough, and I have a neat example to illustrate. Not only does LLVM lower the branch into a switch, it also compiles it into an indirect jump!
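Here is a sketch of the kind of example meant (the dispatch function and its byte cases are made up for illustration, and whether a jump table is emitted depends on LLVM's cost model). Each branch does distinct work, so the blocks cannot be merged into a simple value lookup:

using InteractiveUtils

# A dense, made-up byte dispatch where each branch does distinct work.
function dispatch(byte::UInt8)
    if byte == 0x30
        println("state: integer")
    elseif byte == 0x31
        println("state: fraction")
    elseif byte == 0x32
        println("state: exponent")
    elseif byte == 0x33
        println("state: sign")
    elseif byte == 0x34
        println("state: whitespace")
    else
        println("state: exit")
    end
    return nothing
end

# The if/elseif chain typically becomes a single switch in the LLVM IR ...
@code_llvm dispatch(0x30)

# ... and when LLVM decides a jump table is worthwhile, the native code does
# an indirect jump through that table (e.g. jmpq *%rax on x86-64), which is
# exactly a computed goto generated for you by the compiler.
@code_native dispatch(0x30)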