Relocation target is out of range using JITLink

Hi there,

I am trying my luck at porting Julia to riscv64. I know this is not a supported platform, but with a few dirty hacks I can almost build a sysimage. I get stuck while bootstrapping SuiteSparse/src/cholmod.jl with the following awkward relocation range issue.

...
LinearAlgebra  ──── 99.693033 seconds                                                                                
Markdown  ───────── 11.953065 seconds                                                                                
Printf  ───────────  1.165579 seconds                                                                                
Random  ───────────  5.017103 seconds
Tar  ──────────────  2.277063 seconds
Dates  ──────────── 24.510604 seconds                                                                                
Distributed  ──────  9.259273 seconds                                                                                
Future  ───────────  0.034906 seconds                                                                                
InteractiveUtils  ─  5.119594 seconds                                                                                
LibGit2  ────────── 11.526592 seconds                                                                                
Profile  ──────────  3.791027 seconds
SparseArrays  ───── 23.531039 seconds
UUIDs  ────────────  0.101738 seconds                                                                                
REPL  ───────────── 38.194260 seconds                                                                                
SharedArrays  ─────  4.289493 seconds                                                                                
Statistics  ───────  1.150079 seconds
JIT session error: In graph globals-jitted-objectbuffer, section .text: relocation target "jl_RTLD_DEFAULT_handle" at address 0x4001bfa918 is out of range of R_RISCV_PCREL_HI20 fixup at 0x423718e03c (jlplt_ijl_alloc_array_1d_16507, 0x423718e000 + 0x3c)
Failure value returned from cantFail wrapped call
Failed to materialize symbols: { (JuliaOJIT, { ccall_ijl_alloc_array_1d_16506, jlplt_ijl_alloc_array_1d_16507_got, jlplt_ijl_alloc_array_1d_16507 }) }
UNREACHABLE executed at /usr/lib/llvm/14/include/llvm/Support/Error.h:786!

As this arch is relatively new, the old LLVM RuntimeDyld API doesn't support it, and the riscv spec currently has no large code model. The good thing is that JITLink has full support for riscv, so I switched to JITLink with CodeModel=Medium (the same as the small code model on arm64).

Similar to arm64, riscv is a RISC machine with fixed 4-byte instructions (ignoring the C extension). The R_RISCV_PCREL_HI20 and R_RISCV_PCREL_LO12 relocation pair lets the machine address a symbol at a ±2GB offset from the current PC, and is usually used with an auipc + ld/addi instruction pair. This should be analogous to adrp + ldr on arm64, which I recall has a ±4GB range (I might be wrong, I'm not an arm expert).
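To make the ±2GB figure concrete, here is a minimal Python sketch of how a PC-relative offset is split across the two instructions. The function names are made up for illustration; the only assumption is the usual convention that auipc carries a signed 20-bit upper part and the following ld/addi carries a signed 12-bit lower part.

```python
def split_pcrel(offset):
    """Split a PC-relative byte offset into (hi20, lo12) for auipc + ld/addi."""
    # Adding 0x800 rounds to the nearest 4 KiB boundary, so that the
    # remainder fits in a *signed* 12-bit immediate.
    hi20 = (offset + 0x800) >> 12
    lo12 = offset - (hi20 << 12)  # always in [-2048, 2047]
    return hi20, lo12


def fits_pcrel_hi20(offset):
    """True if the upper part fits in auipc's signed 20-bit immediate."""
    hi20, _ = split_pcrel(offset)
    return -(1 << 19) <= hi20 < (1 << 19)


if __name__ == "__main__":
    hi, lo = split_pcrel(0x12345FFF)
    print(hex(hi), lo)
    # Largest reachable positive offset, and the first one that is too far.
    print(fits_pcrel_hi20(2**31 - 2**11 - 1), fits_pcrel_hi20(2**31 - 2**11))
```

Because of the rounding step, the reachable window is shifted slightly: roughly -2GB - 2KB up to +2GB - 2KB - 1 from the auipc instruction.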

Now, looking at the JIT error, the distance is 0x423718e000 - 0x4001bfa918 = 0x2355936E8 (roughly 8.8 GiB), which is well beyond the valid range for riscv, and for arm64 as well. I wonder whether a similar issue has come up on arm64 before, and whether there is a way to work around it. I also noticed that JITLink with the small code model (medium/medany in riscv terms) is used only on macOS ARM64.
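The same arithmetic as a small Python check, using the two addresses from the error message above. The range tests are simplified assumptions (a signed 32-bit window for auipc-based addressing, and about ±4GB for adrp, ignoring page rounding), not a copy of what JITLink actually does.

```python
GIB = 1 << 30

target = 0x4001bfa918  # address of jl_RTLD_DEFAULT_handle (from the log)
fixup = 0x423718e03c   # fixup site: 0x423718e000 + 0x3c (from the log)

# A PC-relative relocation encodes target - pc.
offset = target - fixup

# Assumed range checks, see the note above.
riscv_ok = -(1 << 31) <= offset + 0x800 < (1 << 31)
adrp_ok = -(1 << 32) <= offset < (1 << 32)

print(f"offset = {offset:#x} ({offset / GIB:.2f} GiB)")
print("fits riscv auipc pair:", riscv_ok)
print("fits arm64 adrp pair: ", adrp_ok)
```

Measured from the exact fixup address rather than the block base, the distance is 0x3c bytes larger than the figure above, which makes no difference to the conclusion.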

I am still a beginner exploring the llvm and julia world. Hope I can get some hints and insights from experts here.

Btw, my fork with minimal riscv64 support is at https://github.com/alexfanqi/julia/tree/riscv64 in case people would like to test it out.

I would just skip SuiteSparse support for now (if that is actually your problem). It is only needed for sparse matrices, so it is not that useful for many people. I believe it's the last GPL dependency, and there is an option in the Makefile to strip it out. I think long-term it will be dropped from Julia and moved to a package, since it's a GPL dependency.

You can also skip the completely unrelated(?) PCRE but that’s going to break a lot of code, maybe even Base. There is a replacement RE2 regex lib that may actually be better…

Thanks for the tips. The build progresses further with that. But I found that this issue does not actually seem to be skippable, and it is probably going to be a major blocker for the riscv port. It may be related to the trampoline stubs in cli/trampolines/ and to libblastrampoline, which is needed for openblas. But it could also be down to how JITLink manages memory and addressing ranges.

I currently have this trampoline stub, which gives ± 2GB addressing range and I thought should be pretty much equivalent to adrp+ldr+br in aarch64.

auipc t3, %pcrel_hi(_name)
ld t3, %pcrel_lo(_name)(t3)
jr t3

But in reality there are cases, like the one in the original question, where the distance between two symbols is much larger than 4GB, which would exceed the limit of the small code model on arm64 as well. So this issue can show up anywhere a jitted symbol is further from one of julia's symbols than the addressable range (±2GB on riscv, ±4GB on arm64), like this one I hit in a new place.

JIT session error: In graph _spawn_primitive-jitted-objectbuffer, section .text: relocation target "jl_world_counter" at address 0x4001bde008 is out of range of R_RISCV_PCREL_HI20 fixup at 0x422625d776 (julia__spawn_primitive_5350, 0x422625c000 + 0x1776)
Failure value returned from cantFail wrapped call
Failed to materialize symbols: { (JuliaOJIT, { jfptr__spawn_primitive_5351, julia__spawn_primitive_5350, jlcapi_uv_return_spawn_5355 }) }

I am very curious why this error doesn't appear on arm64 macOS. I am not familiar with how LLVM handles this differently on the two archs. Also, on the Julia side, could this issue be solved by some form of memory management?

https://github.com/rust-lang/rust/issues/59802 this might be of help.

Yeah, I set this via CodeModel::Medium from the start because I happen to know that lli in LLVM needs --codemodel=medium for JITLink to work on riscv. Following this hint, I then tried setting Reloc::PIC_. That doesn't work either. Perhaps I will need to write a custom JITLink memory manager that allocates memory near libjulia-codegen.so. Or is it possible to access these symbols via the PLT/GOT?

Oh, wait. I missed a Reloc::Static in another place. Changing it to Reloc::PIC_ fixed this problem.

Any chance this has been fixed yet? It wouldn't shock me if the changes made to make M1 chips more reliable have fixed this.

I think it is riscv specific. The medium code model is preferred: the small code model is too small and a large code model doesn't exist. Reloc::PIC_ is needed for riscv as long as external symbols are imported as absolute symbols.
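My understanding of why PIC helps, as a toy Python model rather than anything taken from LLVM: with a static relocation model the code addresses the external symbol directly with a PC-relative pair, while with PIC it loads the symbol's full 64-bit address from a GOT slot that the JIT linker can place near the code. The GOT slot address below is hypothetical, and the range check is a simplified assumption.

```python
def in_pcrel_range(target, pc):
    """Assumed auipc-pair reach: offset + 0x800 must fit in signed 32 bits."""
    off = target - pc
    return -(1 << 31) <= off + 0x800 < (1 << 31)


code_addr = 0x423718e03c    # fixup site from the first error
symbol_addr = 0x4001bfa918  # jl_RTLD_DEFAULT_handle in the loaded library
got_slot_addr = code_addr + 0x1000  # hypothetical: slot allocated next to the code

# Static model: the code must reach the symbol itself.
direct_ok = in_pcrel_range(symbol_addr, code_addr)

# PIC model: the code only has to reach the GOT slot, which holds the
# full 64-bit address of the symbol.
got = {got_slot_addr: symbol_addr}
got_ok = in_pcrel_range(got_slot_addr, code_addr)
resolved = got[got_slot_addr]

print("direct reference in range:", direct_ok)
print("GOT slot in range:        ", got_ok)
print("address loaded via GOT:   ", hex(resolved))
```

The point is only that the distance that matters changes from code-to-symbol to code-to-GOT-slot, and the latter is under the JIT's control.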

Those changes do help with other aspects, though. I now have a working Julia on riscv64 (GitHub: alexfanqi/julia).