There is no locking going on here; thread safety comes only from the atomic reads. The difference lies in the way global bindings are implemented:
When a value is assigned to a binding, a separate box is allocated for each new value, and the module’s binding table then contains a pointer to that box. This adds an extra layer of indirection compared to constant refs, which always refer to the same location in memory, so their memory address can be inlined in codegen.
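To make the comparison concrete, here is a minimal sketch of the two patterns being contrasted. It assumes a Julia version with typed global syntax (1.8 or later, as far as I know); the names `counter`, `COUNTER_REF`, `bump_global!` and `bump_ref!` are made up for illustration and do not come from this PR.

```julia
# Typed global: the type is fixed, but each assignment stores a new value
# behind the binding, so an access goes through the binding first.
global counter::Int = 0

# Constant ref: the binding always refers to the same Ref object,
# so its address can be inlined into the compiled code.
const COUNTER_REF = Ref(0)

# Hypothetical helper that updates the typed global.
function bump_global!()
    global counter
    counter += 1
    return counter
end

# Hypothetical helper that mutates the contents of the constant ref.
function bump_ref!()
    COUNTER_REF[] += 1
    return COUNTER_REF[]
end
```

Both functions behave the same from the caller's point of view; the difference is only in how the access is compiled. Comparing the output of `@code_llvm bump_global!()` and `@code_llvm bump_ref!()` should show the extra load for the typed global, though the exact output depends on the Julia version.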
This is potentially fixable by adding special handling for concretely typed globals, but that will require some careful thought. Fixing this is not a very high priority at the moment, since code this performance-sensitive should generally not be accessing global state anyway.