Building a stand-alone helloworld produces many DLLs, totaling 200MB

There’s already a PR to Julia for this functionality:
https://github.com/JuliaLang/julia/pull/32273

As always, if each person asking for a feature rolled up their sleeves and helped implement that feature, we’d already have it :smile:

8 Likes

Trust me, @jpsamaroo, if I helped write that functionality, we’d be in an even worse position. Lol

4 Likes

I know it’s followed by a smiley, but the truth is that we will only get the feature if people with the expertise to understand and implement it get involved. People who are 15 years of full-time study away from being highly skilled compiler writers are never going to implement that feature in less than 15 years, even though some compiler experts could perhaps do it with maybe 2 months of full-time work…

If the feature is super important, the only reasonable way to change the priorities of the team is to pay large sums of money to the experts who could do it.

7 Likes

bump

Not sure why you are bumping this. However, if we’re already reviving this old topic, there has been important progress recently: https://github.com/JuliaLang/julia/pull/41936

1 Like

I was bumping this because I just got an email about Julia 1.7.0 and remembered that this is the only issue keeping Julia from being my home language. So I was wondering whether there had been any important progress recently; thanks for your answer. :slight_smile:

1 Like

Basically the last remaining blocker for code that doesn’t need the compiler is to “tree shake” the sysimage, i.e. remove all dead code paths and unused data objects. For many simple programs/libraries, this would probably bring the total on-disk cost to <50MB.

I don’t think anyone is currently working on this, but if you can parse Julia’s coverage data, you can use it to figure out which functions are actually used by a given run of the program, and then try to delete every other function from the sysimage via ObjectFile.jl or some other binary-patching tool.
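For anyone curious, here is a minimal sketch of the coverage-parsing half of that idea (the `.cov` format details are from memory, and the actual sysimage-patching step via ObjectFile.jl is left out entirely, that's the hard part):

```julia
# Rough sketch: collect the executed source lines from Julia's
# --code-coverage output (*.cov files). This is only the "which code
# ran" half; mapping lines back to functions and deleting the rest
# from the sysimage is not shown.

# Each line of a .cov file starts with a hit count (or "-" for
# non-executable lines), followed by the original source line.
function covered_lines(covfile::AbstractString)
    live = Int[]
    for (lineno, line) in enumerate(eachline(covfile))
        m = match(r"^\s*(\d+)\s", line)
        if m !== nothing && parse(Int, m.captures[1]) > 0
            push!(live, lineno)
        end
    end
    return live
end

# Walk a directory tree and report the executed lines per .cov file.
function coverage_report(dir::AbstractString = ".")
    report = Dict{String,Vector{Int}}()
    for (root, _, files) in walkdir(dir), f in files
        endswith(f, ".cov") || continue
        path = joinpath(root, f)
        report[path] = covered_lines(path)
    end
    return report
end

# Usage, after e.g. `julia --code-coverage=user hello.jl`:
# for (file, lines) in coverage_report()
#     println(file, ": ", length(lines), " executed lines")
# end
```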

2 Likes

50MB :flushed:
I don’t want to print the entire Wikipedia, just Hello World for starters :stuck_out_tongue:
By now I might know enough about parsing and compiling to give this a try at some point. But why would you expect a tree-shaken binary to be that massive?

The main thing is that if you want arbitrary Julia code to run standalone, you need to bundle LLVM which is a pretty big dependency.

1 Like

Note the “probably” and “<”; I have no idea how small we can get, but I imagine we can get pretty competitive with, e.g., C or C++.

I don’t, I’m just being conservative.

1 Like

I wonder if JET would be able to do this tree shaking? @aviatesk

What if we sacrifice eval, generics, and dynamic code generation? Is there any critical feature of Julia that needs LLVM?

No (although a lot of people would consider generics and dynamic code generation critical features of Julia).
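To illustrate the distinction (this example is mine, not from the discussion above): the first function below can be compiled ahead of time for a concrete signature, while the second builds new code at runtime, which is exactly what requires shipping a compiler (or falling back to the interpreter):

```julia
# Statically compilable: one concrete method instance, no runtime codegen.
add_doubles(x::Float64, y::Float64) = x + y

# Needs runtime code generation: the expression only exists at runtime.
function run_user_formula(src::AbstractString, x::Float64)
    f = eval(Meta.parse("x -> " * src))   # new code created at runtime
    return Base.invokelatest(f, x)        # call past the world-age barrier
end

# add_doubles(1.0, 2.0)             # fine in a static, LLVM-free binary
# run_user_formula("x^2 + 1", 3.0)  # needs eval, hence codegen or interpretation
```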

2 Likes

Indeed. But in some cases you just want to compile a single method for a fixed tuple of input types. Are there any major obstacles to making small binaries in those cases?

1 Like

Just as an example to emphasize this: lots of established large (millions of lines of code) HPC codes in quantum chemistry only operate with a very basic set of number types (they’re often written in Fortran or C). Since rewriting the entire code base in Julia isn’t going to happen any time soon, it would be great for growing Julia in these areas if one could at least implement a few new features/functions in Julia, compile them to a library, and readily attach them to the big code base. Restricting oneself to, say, just Float64 input and output for the library functions isn’t really a limitation here, since the big Fortran code won’t ever call with anything else anyway. And even if I initially can’t use certain language features, I’d still prefer writing the new functionality in Julia. Lots of new features that get added to such code bases aren’t complex in the computer-science sense (using fancy language features) but only in the algorithmic/science sense. The final code might eventually just be a bunch of nested loops with tons of indices :slight_smile:
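For illustration, roughly what that could look like (the module, function name, and physics below are made up; Base.@ccallable gives the method a C-callable entry point, and PackageCompiler.jl can then bundle it into a shared library, if I recall correctly via its create_library function):

```julia
# Hypothetical "one new kernel in Julia, C ABI out" sketch.
# Only Float64 data crosses the boundary, matching the restriction above.
module ChemKernels

Base.@ccallable function pair_energy(r::Ptr{Float64}, n::Cint)::Float64
    acc = 0.0
    for i in 1:n
        acc += 1.0 / unsafe_load(r, i)^6   # toy pairwise term over n distances
    end
    return acc
end

end # module
```

The existing Fortran/C code would then call pair_energy through the shared library like any other C function.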

8 Likes

You can actually compile (some) Julia programs down to 4KB or less, but there’s a huge catch (eBPF).*

Julia doesn’t need LLVM, except for code generation, which is strictly speaking optional, see also (recently merged): separate codegen/LLVM from julia runtime by JeffBezanson · Pull Request #41936

We don’t need to sacrifice generics, if I understand things correctly. You can compile ahead-of-time for all the types you will encounter and thus drop LLVM (since this was only recently merged, I believe it’s only an option on Julia 1.8 master). And EVEN with eval at runtime, LLVM could still be optional. Julia isn’t only a compiler, it’s also an interpreter, and you can ask for some code to be interpreted with Base.Experimental.@compiler_options compile=min (note that it sets “compiler options for code in the enclosing module”; in theory some such option could loosen the restrictions and be more automatic). That’s currently a strategy to lower latency on first use (TTFP, time to first plot) for code that’s not speed-critical. Plots.jl was the first to use the earlier, not-as-good Experimental.@optlevel two years ago.
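For concreteness, the per-module knob mentioned above looks like this (a minimal sketch; other options such as infer and max_methods exist too):

```julia
# Everything defined inside this module is run with minimal compilation,
# trading peak runtime speed for essentially no codegen latency.
module SlowButInstant

Base.Experimental.@compiler_options compile=min optimize=0

greet(name) = println("Hello, ", name, "!")

end # module

# SlowButInstant.greet("world")
```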

Doing this manually is probably not user-friendly; my point is that it’s possible to have some eval at runtime, and someone could make a user-friendly compiler with options to allow for it, or, if you do not need eval at runtime, to just skip LLVM. PackageCompiler.jl is the go-to option; it was slow way back, and seemingly just as slow when I tried recently, but otherwise not hard to use. I didn’t try out all the features, but I know you can optionally drop some dependencies.
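As a reference point, a minimal PackageCompiler.jl invocation looks roughly like this (“HelloApp” is a placeholder project that defines a julia_main() entry point; keyword names are to the best of my memory):

```julia
using PackageCompiler

# Build a stand-alone app from the HelloApp project directory.
create_app("HelloApp", "HelloAppCompiled";
           filter_stdlibs = true,   # drop standard libraries the app never loads
           incremental = false,     # build the sysimage from scratch
           force = true)            # overwrite a previous build
```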

In theory you can drop (almost) all dependencies (e.g. one large one, OpenBLAS, is already optional/switchable). What you want to keep is e.g. the garbage collector, or you severely restrict the kinds of programs you can run.

Actually with:

the GC isn’t used. That’s another notable Julia compiler, though a specialized one. And yes, I believe you do have generics for Julia code that runs on the GPU.

I’m not too worried about having to keep the GC, as I think it adds only a few kilobytes (or possibly about 2 MB or less, as with Go) at most. (I’m assuming here; I don’t know the size of Julia’s generational GC, but the GC in 1959 Lisp must have been only a few KB, the generational idea not yet having been invented; it’s useful but not strictly needed.)

This runtime provides the garbage collector amongst other services such as runtime reflection and stacktrace information. This is the main reason why a simple Hello World application results in like a 2 MB executable binary.

That’s also likely a realistic minimum for static Julia executables with all of Julia’s capabilities included.

* Now, about going smaller, e.g. under 4 KB: it’s possible, since it’s already been done with BPFnative.jl (one of my favorite Julia packages, not because I need it, but because of the proof-of-concept of small/Linux-kernel Julia code) for eBPF VM Linux kernel code. I’m sure BPFnative.jl produces VM-code “binaries” smaller than 4 KB, since that is/was the limit when it was in development:

Compiling to eBPF does get rid of all dependencies, including the GC, which also limits the type of programs you can write. For a “Hello world” eBPF program, see below (I link to the interesting trivia about supporting some form of strings without a GC; I’m not sure where the output goes, maybe “hello world” can only go into the Linux kernel log?):

Some other relevant packages:

I’ve not looked into this, but it seemed relevant:

and this, from the same author, is intriguing (relevant or not?):

is an embedded virtual machine […]
It is a simple Forth-like environment interpreted (in the future hopefully also JIT-ed) in Julia, which you can easily extend in native Julia to provide an API to the scripts

7 Likes

Yeah, eBPF is a solid way to compile very simple programs down to something tiny and portable. Pair BPFnative.jl with ubpf, and you have yourself a compiler and runtime. At some point I’ll get ubpf support added to BPFnative.jl so that this is easy to do directly from Julia.

2 Likes

Yes, my main point was that small binaries are (already) possible (with eBPF), with restrictions. But it also seems we’re not very far from small binaries (not using eBPF) with no restrictions (or very few, that do not matter much).

I think most will be very satisfied with a 2 MB target (a huge improvement), and that seems easily attainable. I’m not even sure any code needs to be written, possibly just some non-default options used; or if code does need to be written, then only a small amount. I can’t say for sure how long that will take, since even small changes can take a long time if they’re on nobody’s radar (or not high-priority).

The eBPF option, however, is non-Turing-complete in the kernel (for good reasons). I believe you can disable the verifier, at least for userspace, to get full Turing-completeness. And you can relax the size limit to 1M instructions, and seemingly further, assuming I’m understanding the next tweet correctly:

U32_MAX would mean we have no bound other than the max size that can be passed to the kernel on an unsigned 32-bit integer. There is a bound through the complexity limit, so this answer is wrong.

But eBPF is also restrictive in other ways, even for userspace. First off, I believe all your code would need to be (restricted) Julia code, with no C/C++ dependencies etc.? I suppose malloc/free works, but not the GC; could the GC be added? And do you know anything about the WebAssembly option? It’s conceptually similar: it has no GC, but a GC for WebAssembly is being worked on, or you could provide your own, i.e. Julia’s (in theory).

3 Likes