Why can Julia be better about boxing types than OCaml?

mrufsvold · August 15, 2025, 1:42pm

I was reading this article about OCaml’s special int type:

The tl;dr is that OCaml represents data as a 64-bit header followed by the actual data. This includes int64 and, often, floats. However, they special case the default int type so that the least significant bit is always set to true. This allows them to check that value is an int, not a pointer.

As a compiled language with a rich type system, I’m shocked that there isn’t enough information for the complier to elide tags and unbox almost everything. However, I don’t know any OCaml folks, so I thought I’d try the inverse question here:

Why doesn’t julia need to box isbits types in most cases?

Edit: Some additional context from another article:

a float value is represented by a pointer to a block (tag 253) holding one data word (64-bit) or two (32-bit). Floats are thus boxed, and a single float value requires three or four words (one for representing the pointer, one for the block header, and one or two for the float bits). Note that this representation is only required when floats are stored in data structures, or passed from one function to another. Within a single function’s body, local float values can be represented unboxed in memory or in registers, and the OCaml compiler apply such optimisations. There are also plans to allow unboxed calling conventions between functions.
About unboxed float arrays | LexiFi

So local floats are stored on the stack, but if you have a field in a record, then it’s actually a pointer to a heap allocated float (unless all the fields are floats).

jling · August 15, 2025, 1:51pm

may be of interests

mbauman · August 15, 2025, 2:45pm

Yeah, this is the crux right here (from the presentation above):

Julia adds a third option:

Why not both? Compile specializations when we know the type, otherwise box dynamic/type-unstable code.

stevengj · August 15, 2025, 3:40pm

Note that this is an old trick, also called “tagged pointers”, which has been used in e.g. some implementations of Scheme and JavaScript. A related technique is NaN boxing, which mixes pointers and Float64 instead of integers, and has been used for e.g. JavaScript and Lua.

Here is a nice old (2011) blog post on these techniques: Value Representations in JavaScript Implementations.

Of course, this was not entirely new to Julia. Optimizing compilers for other dynamic languages, like JavaScript, have done this too. However, Julia was designed from the beginning to be friendly to devirtualization and inference, while remaining dynamic, rather than trying to retrofit it onto an existing dynamic language.

mrufsvold · August 15, 2025, 3:45pm

The speaker addresses the “why not both” a little ways into the talk with “It explodes compilation time and can be quite brittle to surprising edge cases.”

He didn’t have a call us out like that.

Tamas_Papp · August 17, 2025, 2:35pm

I have to admit that I understood what “devirtualization” is when I started programming Julia. When I wrote C++ (decades ago), I implicitly assumed that because the compiler could do this, it was done. Only after I understood how Julia does this did I recognize why it is not so obvious to do this always in C++.

Topic		Replies	Views
Boxing obvious cases where the type is clear Performance	13	692	December 7, 2023
This month in Julia world - 2023-10 Newsletter	1	1588	November 2, 2023
N-bit Integers in Julia Internals & Design bitint	10	1303	February 19, 2023
Win32 bit versus Win64 bit Julia General Usage	5	2694	June 6, 2017
Potential solution to the overflow problem; 64-bit considered harmful, 32- or 21-bit better General Usage integer-overflow	6	3736	October 18, 2021

Why can Julia be better about boxing types than OCaml?

Related topics