Constant-time arithmetic operations

Julia has a great potential as a language for cryptographic implementations, the niche Rust is now beginning to occupy. There is one big obstacle for that: uncertainty about constant-time guarantees of operations. Rust is based on LLVM too, and it has those — what’s Julia’s status? Is it enough to avoid branching? For reference, this library is used for constant-time algorithms in Rust.

One thing that you can do in Rust that I don’t know how to replicate in Julia is making a value opaque to the compiler, namely preventing it from knowing that an integer containing a boolean value is maximum 1 bit in size (see black_box() in the link above). The Rust implementation uses assembler for that. Or would that be a non-issue in Julia?

1 Like

For research, potentially yes, but for practical implementations used in the wild, I am not sure. GC should have even larger implications for time-based attacks.

All issues can be worked around with enough effort, but these applications may not be a comparative advantage of Julia.

1 Like

To be honest, I’m not sure how GC would be an issue if the event of an object going out of scope is not data-dependent, which, as it seems to me, it wouldn’t be if there are no branches in the code operating with secret data.

Also, if that really is an issue, how hard would it be to add a function to temporarily turn GC off? Most cryptographic operations working on secret data don’t really need a lot of garbage collection.

You can write practically non-allocating code in Julia, but there are no facilities for ensuring this in a verifiable way.

I think that using Julia for practical crypto (with a moderately well-equipped and motivated adversary) is dubious. And while you can tweak the language in various ways, the consensus is that good crypto solutions are built from the ground up with the appropriate design decisions.

Julia targets an entirely different (one could say opposite) niche: convenient fast prototyping for generic, composable scientific programming code.

Why would you want to ensure it though? What is the problem with allocations that are not data-dependent?

I am still unclear as to why, besides the problem I expressed in my open post.

How else would you trust an implementation?

We are not talking about “I have checked with @benchmark and there were no allocations”, but rather “I can prove that there are no data-dependent allocations because …”.

How hard would that be? If you have a piece of code without any conditionals, the only way a data-dependent allocation can occur is if there is some bug (or malicious code) in Julia compiler. So are you arguing that it will be too hard to verify that the compiler doesn’t have such bugs? In this case the question of allocations seems to be irrelevant, because if you don’t trust the compiler, it doesn’t matter whether it guarantees you no allocations (or anything else) or not.

I admit I don’t know much about this topic, I’m just trying to understand why is there such an opposition to using Julia for crypto.

I am afraid you still don’t understand. The issue is not what happens with the current Julia implementation, but what the language guarantees generally. Theoretically, a piece of nontrivial non-allocating code could be allocating tomorrow. Types could become uninferred and values boxed if compiler heuristics change, etc. It’s not like C.

That said, it is conceivable that you can use Julia like C and get away with it: declare and assert all types, always work with preallocated constructs, only use primitives with concrete types, etc. But this begs the question: what’s the advantage of Julia? why not program in C then?

I don’t think there is an opposition — you are free to use Julia for crypto, or submit PRs that make it easier to do so. It’s just that frequently, the question of

is impossible to answer without a lot of work and trade-offs. Specifically, if you want to use a nontrivial subset of Julia for crypto, a lot of rules about GC and the compiler would have to be made really, really explicit at a level that would make a lot of reasonable (but currently unforeseen) changes breaking.

You can use ASM via LLVM IR so I guess you can do a similar thing? But IIUC you need something like asm volatile to tell the optimizer to not touch the code. I wonder if something like this is already possible. I couldn’t find a relevant topic in the LLVM manual when skimming it but obviously “skimming it” is not enough.

BTW, black_box function you quoted does not seem to use volatile even though Rust’s asm! macro seems to have this option. Maybe that’s because it’s only about beating the Rust compiler and not LLVM? I don’t know anything about ASM or Rust so I may be saying something stupid, tough.

Reading comments like this, it feels like Rust is not there yet:

(Of course, it’s possible that they are writing this while trying very hard to standardizing things.)

1 Like

BTW, black_box thing sounds a lot like clobber/escape suggested in https://github.com/JuliaCI/BenchmarkTools.jl/pull/92

I think an API for telling the compiler (Julia+LLVM) to not optimizing anything is important outside cryptography, especially in benchmarks. I think it’s reasonable to have this in julia itself.

1 Like

It is my understanding that for the Julia compiler, leaving a lot of low-level details unspecified is a feature, as it allows optimizations and experimentation without breaking changes.

I agree that not specifying too much about the performance is important for evolving the language. But I’m looking forward to seeing the core devs to come up with more good UIs for programmer-compiler communication for the performance. We already have some, like @inbounds, @inline, and @generated. I don’t think I’m not crazy to expect more to come.

Also, IIUC, what cryptography needs here is a way to tell the compiler (Julia+LLVM) to not do any optimization. I think it’s much more reasonable API to have than requesting “at least this much of optimization” (e.g., something like @always_inline). I think this API is also beneficial for almost all Julia users as this is very useful in reliable benchmarks.

Asm is opaque to LLVM.
I’m using asm to specifically emit vfmadd231 (c = a * b + c) in certain cases in LoopVectorization, as LLVM will sometimes emit vfmadd213 (b = a * b + c) + vmovapd (c = b) instead (the vmovapd would be unnecessary with 231), but this prevents llvm from changing the asm, so I need to switch from the asm to the llvm intrinsic in cases where llvm may want to combine it with a load or select, to make it possible.
While the asm itself is opaque, I think llvm can still reorder things, but I’m pretty sure it would not know anything about the value of what gets returned. Meaning I believe black_box would work as intended.

1 Like

Hi thanks for the info :slight_smile: Do you know if it is an API guarantee or does it just happen to be the case for now? I don’t know anything about cryptography but IIUC they need to explicitly turn off the optimizer for some piece of code.

But I actually only care about using something similar for benchmarks (then “happens to be the case” is OK). Looking at https://github.com/JuliaCI/BenchmarkTools.jl/pull/92, it seems writing asm is enough?

I’d move forward with the code in the PR. I haven’t tested it, but I’m think llvm will move or optimize-away asm if it can.
Meaning if the asm has no side effects, and you aren’t using a return value, it’ll get rid of the dead code. With black_box, you’re presumably using the return value for something. clobber and escape have side effects.

Maybe someone who knows more about compilers can chime in, but while the LLVM language reference doesn’t say anything about it being guaranteed opaque, I suspect (a) changing that would violate the spirit of asm call (the asm was almost certainly written to avoid the optimizer – either to explicitly guarantee optimal code, like with BLAS, or improve benchmark fidelity), and (b) making it not opaque to the optimizer would require converting the asm into a representation it understands. I find it unlikely that this will change.

IIUC the compiler may optimize the inline assembly if it can. GCC and Clang has asm volatile for the programmer to explicitly tell them to not touch it. At least that’s what learnt from watching the talk explaining clobber() and escape():

This used is in google/benchmark, too: