Endian in C++20 vs Julia; and byteswap in C++23

If you never use reinterpret, directly or indirectly, I think this is a non-issue in Julia; otherwise it could affect the correctness of possibly any Julia program.

I wish we could simply assume little-endian (we can on x86 and on the Arm platforms we support), and I think we mostly can, with PowerPC being, I think, the major big-endian exception. So this is very much trivia for most users (note that network byte order is also not a problem; you just need to know about it and handle it correctly).

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0463r1.html

Does Julia have something similar to check endianness? Using reinterpret is advanced, and I can mostly think of it being done on floating-point numbers to view them as integers, to access their components. Julia probably has portable code for that; are there other major uses of reinterpret?
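To make the floating-point use case concrete, here is a minimal sketch (the variable names are only for illustration, not from any library). Note that reinterpreting a scalar between same-sized types is a plain bit cast and does not depend on byte order; the host’s byte order only becomes visible when an array is viewed with a narrower element type. Base also offers exponent and significand as portable accessors.

```julia
# View a Float64 as its raw 64-bit pattern (a bit cast; byte order plays no role here).
x = 1.0
bits = reinterpret(UInt64, x)

# Pick apart the IEEE 754 binary64 fields with shifts and masks.
sign_bit      = bits >> 63
exponent_bits = (bits >> 52) & 0x7ff           # biased exponent
fraction_bits = bits & 0x000fffffffffffff

# Portable alternatives from Base that need no reinterpret at all.
e = exponent(8.0)        # unbiased exponent, since 8.0 = 1.0 * 2^3
s = significand(8.0)

# Byte order only shows up when an array is viewed with a narrower element type:
# the order of these four bytes depends on the host.
bytes = collect(reinterpret(UInt8, UInt32[0x01020304]))
```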

Does Julia have something similar to:
https://en.cppreference.com/w/cpp/numeric/byteswap


There are a bunch of endianness-related functions in the “I/O and Network” section of the Julia documentation.

I’ve used one of them once but that was years ago. Don’t know if this helps but thought it might be worth pointing out in case you haven’t seen them.


For what it’s worth, Julia currently basically only runs on little-endian systems. Making it work on big-endian is perhaps possible, but in practice it would require someone spending lots of time chasing down all the hardcoded assumptions about endianness.

Julia only runs on powerpc64le, which is the 64-bit little endian PowerPC architecture.


Yes, it has bswap. Along with ltoh and htol etc. to convert the host’s native byte order to/from little-endian, and ENDIAN_BOM to explicitly check the endianness.

So yes, Julia already has all the facilities you need to write code that works on any endianness.
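As a small sketch of how these Base functions behave (the variable names are only for illustration):

```julia
x = 0x11223344                    # a UInt32 literal

# bswap reverses the bytes unconditionally, on any host.
swapped = bswap(x)

# ENDIAN_BOM tells you the host's native byte order.
is_little = ENDIAN_BOM == 0x04030201

# htol/ltoh convert between host order and little-endian;
# hton/ntoh convert between host order and network (big-endian) order.
# On a little-endian host, htol is the identity and hton is a byte swap;
# on a big-endian host it is the other way around.
le = htol(x)
be = hton(x)

# Converting to a fixed order and back always round-trips.
roundtrip_le = ltoh(htol(x))
roundtrip_be = ntoh(hton(x))
```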

As @giordano points out, however, since Julia currently only runs on little-endian systems there is probably lots of Julia code that implicitly assumes little-endianness that would have to be fixed if you were ever to port Julia to a big-endian system.

Right now it’s hard to foresee Julia being ported to any big-endian architecture, since little-endian is so dominant.


Right, and probably in C++ too! I just noticed the trivia that C++20 supports big-endian better, which made me wonder what Julia does. And since big-endian is not important at all (on any non-legacy hardware), why introduce this in C++20? I can only think of legacy hardware like SPARC64, which supports big-endian only and actually has tier 1 support in Zig, plus the IBM s390x mainframe, the only big-endian platform I can think of that will live on for a bit longer.

Could or should Julia support big-endian, at best as second-class? I mean, could it implicitly do htol when reinterpreting, i.e. a no-op on little-endian and, on big-endian, no longer zero-cost but still a cheap bswap?

Most upcoming languages, e.g. Rust, have tier 1 support only for little-endian targets.

MIPS was commonly big-endian, but its Chinese offshoot, LoongArch, is not:

  • All LoongArch systems are little-endian.
  • LoongArch is not binary compatible with either MIPS or RISC-V, although the ISA and ABI show heavy influence of the two.

Rust has in tier 2:

|target|notes|
|---|---|
|loongarch64-unknown-linux-gnu|LoongArch64 Linux, LP64D ABI (kernel 5.19, glibc 2.36)|
|loongarch64-unknown-linux-musl|LoongArch64 Linux, LP64D ABI (kernel 5.19, musl 1.2.5)|
|powerpc-unknown-linux-gnu|PowerPC Linux (kernel 3.2, glibc 2.17)|
|powerpc64-unknown-linux-gnu|PPC64 Linux (kernel 3.2, glibc 2.17)|
|powerpc64le-unknown-linux-gnu|PPC64LE Linux (kernel 3.10, glibc 2.17)|
|riscv64gc-unknown-linux-gnu|RISC-V Linux (kernel 4.20, glibc 2.29)|
|riscv64gc-unknown-linux-musl|RISC-V Linux (kernel 4.20, musl 1.2.3)|
|s390x-unknown-linux-gnu|S390x Linux (kernel 3.2, glibc 2.17)|

|target|std|notes|
|---|---|---|
|aarch64-unknown-linux-ohos|✓|ARM64 OpenHarmony|

and in tier 3:

|target|std|host|notes|
|---|---|---|---|
|aarch64-nintendo-switch-freestanding|*||ARM64 Nintendo Switch, Horizon|
|aarch64-unknown-teeos|?||ARM64 TEEOS|
|aarch64_be-unknown-linux-gnu|✓|✓|ARM64 Linux (big-endian)|
|armeb-unknown-linux-gnueabi|✓|?|Arm BE8 the default Arm big-endian architecture since Armv6.|
|i686-unknown-hurd-gnu|✓|✓|32-bit GNU/Hurd|
|i686-unknown-redox|✓||i686 Redox OS|

I’m more interested in supporting potentially interesting operating systems (and maybe older game hardware).

To answer my own question: in Julia, to test or confirm that you are running on a little-endian machine, you would do:

if ENDIAN_BOM == 0x04030201

but I’m wondering: so that you don’t have to do that explicitly, or use htol, which does something similar implicitly, could code be made to just work?

ENDIAN_BOM

The 32-bit byte-order-mark indicates the native byte order of the host machine. Little-endian machines will contain the value 0x04030201. Big-endian machines will contain the value 0x01020304.


You should rarely need to check ENDIAN_BOM explicitly.

A common situation in which you might care about endian-ness is for binary I/O — you (or the spec) decide whether you want the format to be little-endian or big-endian, and use htol or hton when writing, respectively, and ltoh or ntoh when reading.
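A minimal sketch of that pattern, using an in-memory IOBuffer in place of a real file (the variable names and the two-field layout are made up for illustration):

```julia
value = UInt32(0x01020304)

# Writing: convert from host order to the order the format specifies.
out = IOBuffer()
write(out, htol(value))    # field stored little-endian
write(out, hton(value))    # field stored big-endian (network order)
bytes = take!(out)         # the serialized bytes, identical on every host

# Reading: convert from the format's order back to host order.
inp = IOBuffer(bytes)
a = ltoh(read(inp, UInt32))
b = ntoh(read(inp, UInt32))
```

Because the conversion happens at the I/O boundary, the rest of the program only ever sees values in native order.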

Better yet, use a portable binary format like HDF5.jl that already takes care of endian-ness for you.

In some cases you might check ENDIAN_BOM as an optimization, in case you can skip an htol/ltoh pass over an array. (It might be nice, in theory, to have optimized broadcast/broadcast!/map methods for htol and ltoh that can eliminate them in the common case where they are the identity. e.g. x .= htol.(x) in principle could be completely eliminated on a little-endian machine.)
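One way that optimization might look today, as a sketch only: to_little_endian! is a hypothetical helper, not a Base function, and it assumes an array of fixed-width integers.

```julia
# Hypothetical helper: convert an array in place from host order to little-endian,
# skipping the pass entirely when the host is already little-endian.
function to_little_endian!(x::AbstractArray{<:Integer})
    if ENDIAN_BOM == 0x04030201
        return x              # htol is the identity here, so there is nothing to do
    end
    x .= htol.(x)             # on a big-endian host this byte-swaps each element
    return x
end

v = UInt32[1, 2, 3]
r = to_little_endian!(v)

# After the call, the in-memory bytes are little-endian on any host.
first_bytes = collect(reinterpret(UInt8, v))[1:4]
```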
