Match LLVM with Julia source?


#1

I have reason to suspect that incorrect LLVM is being generated for my code in some circumstances. Poring over 3000+ lines of LLVM is out of the question, but I might be able to figure out 10-15 lines if I knew where to look.

I’m wondering: how exactly does the LLVM emitted by code_llvm() match up with the source? There are labels like L1241 in LLVM, but they seem to point to irrelevant lines. Is there a way to insert a “marker” statement into my Julia source that will compile to an easy-to-locate statement of LLVM?


#2

On the master branch of Julia, we’ve added printing of line number information. Putting in some obvious token (like a call to a no-op @noinline function) can also help narrow down the search. Running with ./julia -O0 runs fewer optimizations, which can make the IR more explicit (although also much longer). Finally, a label such as L1241 refers to the statement index of the corresponding label in the @code_typed output.
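
A minimal sketch of the marker idea, assuming Julia 1.x (where code_llvm lives in InteractiveUtils and accepts an IO argument). The names llvm_marker_here, MARKER_HITS and classify are hypothetical. The marker bumps a counter instead of being a pure no-op, on the assumption that a newer compiler might otherwise delete a call with no side effects.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Hypothetical marker. The counter update is a side effect, so the
# call should survive optimization and remain visible in the IR.
const MARKER_HITS = Ref(0)
@noinline function llvm_marker_here()
    MARKER_HITS[] += 1
    return nothing
end

# Hypothetical function under study.
function classify(x::Int)
    if x > 0
        llvm_marker_here()   # expected to appear as a `call` in the IR
        return x + 1
    end
    return x - 1
end

# Capture the IR as a string and print only the lines that mention the marker.
ir = sprint(io -> code_llvm(io, classify, (Int,)))
for line in split(ir, '\n')
    occursin("llvm_marker_here", line) && println(line)
end
```

The IR just before and after the printed call line is the region that corresponds to the source around the marker.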


#3

Thanks for the help! Using these techniques plus poring over GDB, I have tracked down the exact location of the ReadOnlyMemoryError() in my code, but I still don’t know what is causing it. It appears to be a bug either in the Julia run-time system or the LLVM run-time system.

My code (0.6.2) looks like this:

            if whichderiv == -1
                [snip]
            elseif whichderiv == 0
                [snip]
            elseif whichderiv == 1
                [snip]
            elseif whichderiv == 2
                [snip]
            elseif whichderiv == 3
                [snip]
            else
                @assert false
            end

This compiles into a switch statement in LLVM:

  switch i64 %1053, label %L888 [
    i64 -1, label %if202
    i64 0, label %if313
    i64 1, label %if314
    i64 2, label %if315
    i64 3, label %if338
  ]

This compiles to the following native code, which has a jump-table in it:

        movq    -48(%rbp), %rax
        leaq    1(%rax,%r15), %rax
        cmpq    $4, %rax
        ja      L16870
        movabsq $140732104772572, %rcx  # imm = 0x7FFEBF1D0BDC
        movslq  (%rcx,%rax,4), %rax
        addq    %rcx, %rax
        movq    -360(%rbp), %r15
        jmpq    *%rax

And the problem is that the jump-table entries are corrupted, so that the jmpq instruction lands in never-never land. The table is already corrupted when the routine starts for the first time after compilation, so apparently my code is not to blame. The offsets in the table look reasonable relative to one another, but they are offsets with respect to the wrong base address. The problem is inconsistent: it occurs on Linux but not on Windows, and seemingly trivial changes to how I run my code fix the problem (but there are other suspicious errors later).
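
To make the failure mode concrete, here is a rough model of what the dispatch sequence above computes. It assumes the leaq yields whichderiv + 1 and that each table entry is a signed 32-bit offset added to the table's base address. The function name jump_target and the offsets are hypothetical; only the base address is taken from the listing.

```julia
# Model of the native dispatch: bounds check, load a signed 32-bit
# offset from the table, add it to the table base.
function jump_target(table_base::Integer, offsets::Vector{Int32}, whichderiv::Integer)
    idx = Int64(whichderiv) + 1          # assumed effect of leaq: -1..3 maps to 0..4
    # cmpq $4 / ja is an unsigned comparison, so a negative idx also
    # falls through to the default label (modelled here as `nothing`).
    reinterpret(UInt64, idx) > 4 && return nothing
    return Int64(table_base) + Int64(offsets[idx + 1])   # movslq + addq
end

base    = 140732104772572                     # 0x7FFEBF1D0BDC, from the movabsq above
offsets = Int32[-400, -320, -250, -180, -96]  # made-up offsets, one per case

for w in -1:3
    println(w, " => ", jump_target(base, offsets, w))
end
```

In this model, if the offsets were computed relative to a different base than the one loaded by movabsq, every target would be shifted by the same amount. That would be consistent with entries that look reasonable relative to one another but land in the wrong place.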


#4

That might make sense. I think I’ve seen issues and/or fixes for LLVM where it emits the wrong relocations or does the wrong fixups for jump tables (assuming the small code model, or the wrong signedness) when it should be noting our CodeModel::Large and avoiding such assumptions. Not sure if that really helps, but it at least suggests that you’re on the right track.


#5

Thanks very much! With the clue that CodeModel is involved, I found the right string to google and found some relevant information here:

https://groups.google.com/forum/#!topic/llvm-dev/kA3TnRQ9oA4

But fixing LLVM is definitely outside my scope of knowledge, so I’d rather just work around the problem. Possibly I could avoid the bug in LLVM if I prevent Julia from compiling my if statements into LLVM switch statements. And presumably I could prevent this by obfuscating the fact that the if branches take cases on a single variable. I can experiment with this (try some obfuscation of the if-conditions; check the LLVM; repeat as necessary). If you have suggestions for a better workaround, that would be great.
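
The try, check, repeat loop could be partly automated along these lines. This is only a sketch, assuming Julia 1.x; pick is a hypothetical stand-in for the real routine and has_switch is a hypothetical helper. Whether a switch shows up for any given formulation depends on the Julia and LLVM versions, so no particular result is claimed.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Hypothetical stand-in for the real routine: branches on a single integer.
function pick(whichderiv::Int)
    if whichderiv == -1
        return 10
    elseif whichderiv == 0
        return 20
    elseif whichderiv == 1
        return 30
    elseif whichderiv == 2
        return 40
    elseif whichderiv == 3
        return 50
    else
        error("unexpected whichderiv")
    end
end

# True if the optimized IR for f(types...) contains an LLVM `switch` instruction.
function has_switch(f, types)
    ir = sprint(io -> code_llvm(io, f, types))
    return any(line -> startswith(lstrip(line), "switch "), split(ir, '\n'))
end

println(has_switch(pick, (Int,)))
```

A false result here only says there is no switch at the IR level. The backend may still build a jump table on its own, so the output of code_native would need checking as well.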


#6

I just opened an issue about the problem with LLVM’s switch statement on GitHub, and Yichao Yu responded immediately that the problem is fixed in master. So rather than obfuscating the if statements, I’ll start porting my code to 0.7.0-DEV to see if the problem goes away.


#7

See also the issue I linked for the two possible changes (llvm patch or julia patch) if you want to backport/compile your own version.