For my use case it seems libjulia-codegen and libLLVM are not required anymore at runtime, which is also the “promise” of trimming. Do you know which library requires them? I am pretty sure I am using some of the artifacts (it doesn’t run without them), but haven’t looked too much into which ones.
I get the “ccall requires compiler” error; the package is HiGHS.jl, see Creating fully self-contained and pre-compiled library - #15 by asprionj. I don’t get any more detail in the error message, since it happens at runtime… so I don’t know exactly which library it is, as there are multiple artifacts required (through the chain of dependencies of HiGHS.jl).
My question basically is whether any ccall requires the two libs (LLVM and codegen), or whether this only applies if the artifacts are left in the default folder that “normal julia” relies on…
I see. But JET.@test_opt reports no errors? What about juliac --trim?
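For reference, a minimal sketch of what I mean by checking with JET (the entry-point function and its input are hypothetical placeholders, not from this thread):
using JET
# substitute your own top-level function and a representative input:
JET.@report_opt solve_problem(example_input())  # lists each potential runtime-dispatch site
JET.@test_opt solve_problem(example_input())    # same analysis, as a pass/fail test assertion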
I don’t know what this is supposed to mean, and I also wouldn’t expect ccall to require compiling Julia. The message shows up here (and possibly elsewhere, I haven’t checked), and this error is described in the PythonCall FAQ as a libstdc++ incompatibility with the Python side. It’s not apparent whether that’s related to HiGHS. I think it was implied, but do you not run into this error when running the same or equivalent code in typical REPL usage?
Thanks for the hints, will check them all later and update here. In particular, I haven’t worked with JET so far, so I should give it a try.
- When I run the code in the REPL (or just by command-line execution of julia), there are no such problems.
- juliac --trim does not work due to the regression between trimming and binding partitions, as described in a former post: Creating fully self-contained and pre-compiled library - #14 by asprionj
Ok, so one difference in our setups currently is that I just leave the artifacts where they are (and thus where they are expected by “normal julia”). On Windows, that’s something like C:\Users\<username>\.julia\artifacts. In contrast, you
just copy over the share directory that PackageCompiler comes up with
But where exactly do you put this share directory? I could try to manually do the same, just to see if that resolves the problem. Maybe the “runtime part of julia” is required if it needs to find artifacts “dynamically” at runtime…? I’d then probably also have to use the --relative-rpath option, I guess? Or does this just apply to the julia libraries, not the artifacts?
If you want to get rid of libcodegen and libllvm you need juliac --trim, and to get this to work it helps to appease JET. In my tests this totally works, even with many packages. I also have a much simpler repo that does just a simple trimming of NonlinearSolve.jl: GitHub - RomeoV/TrimSimpleNonlinearSolve.jl.
I have to get into JET etc. a bit. For now I just use the jq-1.0 branch (a complete and much better rewrite) of JSON.jl. Having a JSON interface is the first step I need to take.
There’s a long output from @report_opt; it’s quite deeply nested, and all the actual “runtime dispatch” errors are 1-5 levels below this:
││││││││││┌ print(io::IOBuffer, x::Type) @ Base ./strings/io.jl:35
│││││││││││┌ show(io::IOBuffer, x::Type) @ Base ./show.jl:968
Now it’s a bit hard for me to understand. First, I don’t print/show anything in the function I want to compile. Second, if the problem is in Base, it’s probably a long way (perhaps even impossible) for me to “fix”, right?
EDIT: FYI, I gave juliac --trim a try anyway, and both on nightly and rc I get something around 100 trim-verify errors.
To understand the state of --trim better I recommend watching Jeff’s JuliaCon 2025 State of --trim talk here. In general, trimming will for the most part not “just work” for any Julia packages, as it only supports a subset of Julia functionality, and in particular no runtime dispatch and limited dynamic dispatch.
Therefore, most established libraries that have not been adapted with trimming in mind won’t work well with --trim, and may never work. In particular, file IO is inherently type unstable if you don’t know ahead of time what you’ll be dealing with. The same goes for, e.g., CLI parsing.
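As a tiny illustration (not code from the thread) of why generically parsed data defeats inference, and how a type assertion can locally recover it:
# a generic parser can only hand back `Any`-typed values, since the data's shape is
# only known at runtime, so every access below would be a dynamic dispatch:
cfg = Dict{String,Any}("tolerance" => 1e-6, "name" => "run1")
tol = cfg["tolerance"]            # inferred as Any
tol2 = cfg["tolerance"]::Float64  # an explicit type assertion restores a concrete type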
The JuliaHub team has made an HDF5 implementation that is trimmable: GitHub - gbaraldi/StaticHDF5.jl: Read and write julia arrays to HDF5 (the author of this package is also one of “the” juliac guys). Perhaps you can first convert your json file to hdf5 and then use that library. I have made a trimmable CLI parser GitHub - RomeoV/TrimmableCLIParser.jl if you are also looking for something like that.
Regarding having 100+ JET errors, that’s quite normal, but sometimes it only takes one fix to collapse all the errors. However, sometimes it also needs a complete redesign of how something works to be fully type inferrable. It really depends on the library.
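If most of the reported frames bottom out in Base (like the print/show chains quoted above), it can also help to narrow JET’s report to your own code first; a sketch, where the module MyLib and the call are hypothetical:
using JET
# only report problems whose frames originate in MyLib, hiding pure-Base noise:
JET.@report_opt target_modules=(MyLib,) MyLib.solve_problem(example_input())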
My personal recommendation if you want to go forward: Make a number of self-contained minimum working/failing examples, similar to my GitHub - RomeoV/TrimSimpleNonlinearSolve.jl repo. Perhaps one for your HiGHS interface with hardcoded data, one for your File IO, etc. Make sure you find something that is fully JET compliant and try to --trim it, so that for everything on top of that you can bisect the changes to find out what made the code non-trimmable. Then, start opening concrete discussions about very concrete problems either here, on Slack static-compilation channel, on Zulip, or in an issue of the repo.
Some library maintainers are interested in pushing this forward (e.g. some of the SciML and the JuliaHub people). I haven’t talked to the JuMP people, perhaps @odow can comment on this himself. If the library maintainers are interested in pushing their library towards trimmability, it can be helpful to contribute tests to the upstream repos that test common functionality for @test_opt and work with them to fix issues. This seems daunting at first but becomes easier as you become more familiar with JET etc. For your own use, sometimes you can also just fork their repo and delete/hardcode some choices that are tripping up JET but where you don’t need the flexibility. However, of course it would be better to keep everything upstreamed.
Finally, it is not always necessary to actually remove all JET or trimming-verifier errors. Essentially, if you only have errors on code paths that you will never actually hit, you are “fine”.
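A sketch of the kind of upstream test mentioned above (the package, function, and problem constructor are hypothetical):
using Test, JET
using SomePackage
@testset "inference smoke tests (trimming prerequisite)" begin
    # fails if JET finds possible runtime dispatch anywhere in this call chain
    JET.@test_opt SomePackage.solve(SomePackage.example_problem())
end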
I haven’t talked to the JuMP people, perhaps @odow can comment on this himself. If the library maintainers are interested in pushing their library towards trimmability,
I have no personal plans to work on anything juliac-related while it is experimental and requires an unreleased version of Julia. I’ll take a look in 6 months or so.
But I’m happy to review/merge PRs if people want to do the work to find what changes are needed.
Thanks for the tutorial and for sharing your observations. I could reproduce compiling hello.exe on the newly released final Julia 1.12 version on Windows; however, I cannot confirm that the DLLs you list above are sufficient for a relocatable binary. To ensure that no DLLs are loaded from other installed applications, you can temporarily clear the Path environment variable for the shell session. Using PowerShell:
PS> $Env:Path = 'C:\Windows;C:\Windows\system32;'
PS> .\hello.exe
Hello, World!
Assuming that Julia 1.12 was installed with Juliaup, the following DLLs from the .julia\juliaup\julia-1.12.0+0.x64.w64.mingw32\bin folder are required next to the hello.exe file on my PC:
libatomic-1.dll 0.26 MB
libblastrampoline-5.dll 2.26 MB
libgcc_s_seh-1.dll 0.93 MB
libgfortran-5.dll 11.53 MB
libgmp-10.dll 1.05 MB
libjulia-internal.dll 14.49 MB
libjulia.dll 0.21 MB
libmpfr-6.dll 2.51 MB
libopenblas64_.dll 37.79 MB
libopenlibm.dll 0.52 MB
libpcre2-8.dll 0.76 MB
libquadmath-0.dll 1.15 MB
libstdc++-6.dll 25.02 MB
libwinpthread-1.dll 0.32 MB
In total and including the hello.exe executable (1.67 MB), these add up to exactly 100 MB.
I used the following source file and command to compile the binary:
function @main(args::Vector{String})::Cint
    println(Core.stdout, "Hello, World!")
    return 0
end
PS> julia $Env:USERPROFILE\.julia\juliaup\julia-1.12.0+0.x64.w64.mingw32\share\julia\juliac\juliac.jl --experimental --output-exe hello --trim hello.jl
The release highlights describe an alternative, canonical way using a command line interface from the JuliaC.jl package:
After installing the JuliaC package as an app (pkg> app add JuliaC), and adding the .julia/bin folder to the Path, I was able to compile the example “Hello world” binary using the command from the link. This automatically copies the required DLLs into a build/bin folder. The total size grows to 111 MB for the executable plus 29 DLLs, but this way is certainly more convenient than manually copying the DLL files from the Julia folder.
Does this work for compiling shared libraries?
I found that cpu_target is not supported now, so the migration is not guaranteed.
Hi everyone, we’re now using JuliaC.jl to create a shared library, with all dependencies bundled – awesome! Thanks a lot to the folks making this possible!
Now, two points are still a bit tricky, and we ran into problems that are (most probably) related to them.
First, thread safety of the JuliaC-generated shared library. When embedding Julia, one has to take care to use only the jl_init-ing thread or Julia-spawned threads for all further jl_* calls (Embedding Julia · The Julia Language). How does this transfer over to a JuliaC-generated shared lib? Should only a single thread load the library and then handle all calls to the Base.@ccallable functions? We’re getting segfaults after around 70 calls when we call from different threads, but none when calling from a single thread.
Is this documented somewhere and we just missed it?
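For what it’s worth, one diagnostic we could try (a sketch only; the function names are made up) is to serialize all entries into the library behind a single lock on the Julia side, to see whether the crashes come from concurrent entry as such rather than from which thread is calling:
const ENTRY_LOCK = ReentrantLock()

Base.@ccallable function encode_message(input::Ptr{UInt8}, len::Csize_t)::Ptr{UInt8}
    lock(ENTRY_LOCK)
    try
        data = unsafe_wrap(Array, input, Int(len); own = false)
        return do_encode(data)  # hypothetical worker that returns a malloc'd buffer
    finally
        unlock(ENTRY_LOCK)
    end
end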
Second, a low-level detail on memory handling and the julia-side GC. I first had something like this:
buf_size = io.size # io is an IOBuffer holding an encoded protobuf-message
out_ptr = Ptr{UInt8}(Libc.malloc(buf_size + 4))
unsafe_copyto!(out_ptr, pointer(vcat(sizeinfo_bytes, e.io.data[1:buf_size])), buf_size + 4)
return out_ptr
Now, my AI coding assistant mentioned (while working on the first issue) that the Julia GC might collect the temporary array created via vcat while the copying process was still ongoing. The proposed solution is:
payload = take!(e.io)
buf_size = length(payload)
out_ptr = Ptr{UInt8}(Libc.malloc(buf_size + 4))
# directly write size information
unsafe_store!(out_ptr, UInt8((buf_size >> 24) & 0xff))
unsafe_store!(out_ptr + 1, UInt8((buf_size >> 16) & 0xff))
unsafe_store!(out_ptr + 2, UInt8((buf_size >> 8) & 0xff))
unsafe_store!(out_ptr + 3, UInt8(buf_size & 0xff))
# Copy payload from Julia vector
GC.@preserve payload begin
    Base.unsafe_copyto!(out_ptr + 4, pointer(payload), buf_size)
end
return out_ptr
Is this actually “better”? Is it actually required to avoid any unsafe behaviour?
Thanks a lot in advance!
It is better because it avoids allocating e.io.data[1:buf_size], it avoids allocating the concatenated array, and it is clearer what is happening.
It is safer because a derived pointer is only valid while the object it points to is preserved (which it is if it is stored in a global variable or is inside a GC.@preserve block). In your first code, it is not preserved, so the pointer is not valid to use for anything.
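A minimal illustration of that rule (not code from the thread):
v = rand(UInt8, 16)
p = pointer(v)            # derived pointer: only meaningful while `v` is kept alive
GC.@preserve v begin
    b = unsafe_load(p)    # safe: `v` cannot be collected inside this block
end
# with no other reference keeping `v` alive, dereferencing `p` outside the
# GC.@preserve block would be undefined behaviour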
But accessing io.size and io.data is not recommended in the first place because they are internal fields. Typically you would take! the data from the buffer.
Anyway, good job, robot.
This was fixed in 1.12, so you shouldn’t be seeing it with JuliaC.
Thanks a lot for the explanations. And the robot did use take! instead of accessing internal fields of io. The first implementation is on me.
So it is OK now to call julia (when embedded) from different OS threads? Or is this only valid for binaries generated with JuliaC / PackageCompiler?
But then, why am I getting segfaults when using multiple threads to call into the shared library, and none when I funnel all calls through a single thread? I have to say that, in the “minimum non-working example” we tried to come up with, we only get the segfault when stopping everything. That is, running e.g. 200 calls (from Java, BTW), split amongst 5 threads, all works, and only when the loop ends do we get the segfault. I have to double-check whether the behaviour is different in the real setup…