Avogadro's number as a floating-point value

Ok, on this topic: What on earth were the SI people thinking when they specced the current value of Avogadro’s number?

julia> Int128(6.02214076e23)
602214075999999987023872

I get that they wanted to spec a nice round number. Nice round numbers for fundamental physics constants are always nice.

But couldn’t they have chosen an integer that is exactly representable in Float64 instead of 6.02214076×10^23, i.e. 6_02214076 * Int128(10)^15?

Some rounding error is inevitable anyway in subsequent calculations (you usually multiply N_A by something). It is insignificant in practice for most applications.
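
To put a number on that rounding error, here is a minimal sketch. The variable names (`exact`, `stored`, `abs_err`, `rel_err`) are introduced purely for illustration and do not come from the thread.

```julia
# Exact SI value as an integer: 602214076 × 10^15
exact = Int128(602214076) * Int128(10)^15

# The integer that the Float64 literal actually stores
stored = Int128(6.02214076e23)

abs_err = exact - stored      # absolute gap, counted in particles
rel_err = abs_err / exact     # relative error (a Float64)

println("absolute error: ", abs_err)
println("relative error: ", rel_err)
println("below eps(Float64)/2: ", rel_err < eps(Float64) / 2)
```

The gap is roughly 1.3×10^7 particles out of about 6×10^23, which is smaller than the half-ulp relative error that any single Float64 operation can introduce anyway.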

Well, since SI is mostly about decimals, they are clearly not very compute friendly. Personally I think of SI as a standard for communication and interchange, not for computing. In a way, Float64 and such should be outside their purview. Plus isn’t Avogadro a definition in SI, not a physical constant?

I would usually choose my own pi, g, or Avogadro precision depending on the application. Fortunately, Julia does as well with pi as I would ever imagine.

You can define a precision-adaptive Avogadro constant in the same way as Julia does for pi:

julia> Base.@irrational Nₐ big"6.02214076e23"

julia> Nₐ
Nₐ = 6.02214076e23...

julia> Nₐ * 1.0 # Float64
6.02214076e23

julia> Nₐ * 1.0f0 # Float32
6.0221406f23

julia> Nₐ * big"1.0" # BigFloat
6.02214076e+23

julia> setprecision(5)
5

julia> Nₐ * big"1.0" # BigFloat, 5 bits
6.04e+23

Perhaps it is best to define it as a BigInt, as Avogadro’s number is not irrational, and nowadays it is even a defined constant with no measurement error (so no ... needed).

No, because then any computations with Float64 will get promoted to BigFloat and be slow. By using the @irrational type the precision is automatically adapted to the surrounding calculation, even for values that are not technically irrational numbers.
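
The promotion behaviour described here can be checked with a small sketch. It uses `π` as a stand-in for any `Irrational` constant, and `n_big` is a hypothetical name for an Avogadro value stored as a `BigInt`.

```julia
# BigInt stand-in for an exactly stored Avogadro number
n_big = big(602214076) * big(10)^15

# Mixing a BigInt with machine floats promotes the result to BigFloat
t_bigint_f64 = typeof(n_big * 1.0)
t_bigint_f32 = typeof(n_big * 1.0f0)

# An Irrational such as π adapts to the other operand instead
t_irr_f64 = typeof(π * 1.0)
t_irr_f32 = typeof(π * 1.0f0)

println((t_bigint_f64, t_bigint_f32, t_irr_f64, t_irr_f32))
```

With the `BigInt` constant every product ends up as a `BigFloat`, whereas the `Irrational` takes on the float type it is combined with.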

That is really good and maybe something I need, but too small for a package haha

Or use JuliaPhysics/PhysicalConstants.jl

PhysicalConstants.jl/src/codata2018.jl at master · JuliaPhysics/PhysicalConstants.jl (github.com)

It is a definition. And when they redefined it, a couple years ago, the process was afaik “most convenient / round integer number within measurement error of the previous definition”.

My beef is that humans have a very good way of referencing the number of molecules in a mole: they write N_A and know that it’s roughly 6.02e23 (depending on how often they work with it, they may know more or fewer digits).

However, only digital computers will actually work with that number to high precision. Making that number exactly representable in Float32 was probably off the table (inconsistent with previous definition), but they could have chosen a Float64-representable number.

So there was a trade-off between: Can conveniently write it down exactly in decimal with chalk on a blackboard, vs can conveniently exactly represent it on a digital computer.

I think they chose wrong. This choice introduces (tiny, imperceptible) error in virtually all computations using that; you might say who cares?

But this introduces lots of complexity and design choices to scientific software. When you do computations in DoubleDouble, then you can’t just take your Float64-N_A and convert it, no, in e.g. C++ you need to define AVOGADRO_DOUBLE and AVOGADRO_DOUBLEDOUBLE and so on, and what do you do if desired float-precision is a template parameter?

Many programmers are pathological perfectionists. Julia can represent Avogadro’s number in a context-specific precision, just like pi or e. But this sucks compared to “clean Float64 that is also an Int128”.

It would be a simpler world if SI had recognized that round binary numbers are sometimes better than round decimal numbers, and nowadays the primary user beyond the first 5 digits are digital computers.

Perhaps SI could have left part of it to IEEE. True physical constants seem subject to revision as measurements improve, unlike units like 1 mega=10^6 and 1 s=vibrations (I think?). So why not have semantic version numbers for Avogadro, and let them evolve? People then need to specify Avogadro v1.18, BigFloat v1.1, Julia 2.0…

AFAIK there’s no expectation that integers be perfectly representable in binary floating point. Sure, you could adjust the mole to have 602214075999999987023872 atoms, but that is a lot more to write than 6.02214076×10^23. Concise decimal, the more popular way to read and write numbers, seems to be their priority.

Avogadro’s number may also be my new go-to example of floating point precision: an integer at that scale needs 79 bits (log2(6.02214076e23) ≈ 78.99), yet the 53 bits in Float64 can only represent the nearby values 602214075999999987023872 and 602214076000000054132736, because its unit in the last place at that scale is Int128(eps(6.02214076e23)) == 67108864, not 1 like an integer’s.
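
The bit counting can be spelled out as a short sketch (all variable names are illustrative). It also counts the trailing zero bits of the exact integer, which a float gets for free through its exponent.

```julia
x = 6.02214076e23
exact = Int128(602214076) * Int128(10)^15

total_bits = ndigits(exact, base=2)    # bits needed to write the integer
free_zeros = trailing_zeros(exact)     # trailing zero bits, absorbed by the exponent
needed = total_bits - free_zeros       # significand bits an exact float would need

ulp = Int128(eps(x))                   # spacing between adjacent Float64 values here
below = Int128(x)                      # nearest Float64 at or below the exact value
above = Int128(nextfloat(x))           # next representable Float64

println("bits: ", total_bits, ", trailing zeros: ", free_zeros, ", needed: ", needed)
println("ulp: ", ulp, " between ", below, " and ", above)
```

Since the significand would need more bits than the 53 that Float64 provides, the exact value falls strictly between two neighbouring doubles.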

The fact Base.Irrational promotes down to interacting size is cool, and seems to do what is wanted. But it still annoys me as a hack which piggybacks on some other concept. Specifically, consider:

julia> sqrt(2.0f0)*5.0
7.071067690849304

which promotes to Float64, but doesn’t have the accuracy of a Float64. In fact, the accuracy flows from the Float32 of the sqrt. So the issue of promoting up or down is not a physical constants or Irrational issue.
Now comes the part stasists will tear down with some good reasons quickly - we could create two Float types: ExactFloat and RoundedFloat (in various sizes). When interacting with an ExactFloat the type size is promoted upward, and when interacting with RoundedFloat it would promote downwards.
For example:

julia> sqrt(2.0f0)*5.0  # `sqrt` returns a RoundedFloat32
7.071068f0              # this will be a RoundedFloat32

I don’t see that as accuracy flowing one way or the other, it’s just the necessary behavior for referential transparency, which is something you want programming languages to respect as much as possible: to understand what this code is doing you can replace sqrt(2.0f0) with its value, independently of the surrounding code, and it doesn’t change the result.
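
A small sketch of that substitution argument, with hypothetical variable names: widening the Float32 result to Float64 is exact, so replacing the call by its value gives the same product.

```julia
s32 = sqrt(2.0f0)                  # a Float32 value
mixed = s32 * 5.0                  # Float32 * Float64 promotes to Float64

# Replace the call by its value, widened exactly to Float64
substituted = Float64(s32) * 5.0

# Computing everything in Float64 gives a different number
full = sqrt(2.0) * 5.0

println(mixed === substituted)
println(mixed == full)
```

The first comparison holds and the second does not: promotion keeps the value of `sqrt(2.0f0)` intact rather than recomputing it at higher precision.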

I would rather say “no it doesn’t”.

The purpose of Avogadro’s number is to count the uncountable, namely, atoms. The original definition was the number of nuclei in 12 grams of ¹²Carbon. This number has never been determined experimentally, down to the last particle, and it is unclear if it ever will be. The 2019 definition truncates to the precision obtained by the Avogadro Coordination project, using absurdly pure crystals of ²⁸Silicon, polished to a previously-unseen degree of roundness. A quest by the best metrologists, over more than a decade, to perform the most ambitious measurement ever undertaken. That’s as far as they got.

The number which concerns you, 6.02214075999999987023872e23, is this number with seven insignificant figures of additional and spurious precision. It is both accurate and precise well past the point where it reflects any known physical facts about the universe. It is 6.02214076e23, not an approximation of it, and certainly not an incorrect or lossy encoding of it. It is the same number.

Any operation which becomes imprecise from using this number, does so because floating point is inherently lossy. It wouldn’t matter if you started with an exactly representable number, because 6.0221407600000000000000e23 is not a more accurate number, it just has different, spurious, insignificant figures of false precision. The difference between these two numbers is not real, it’s an artifact of the imagination. As soon as you perform any numeric operation on 6.02214075999999987023872e23, you get another number which is also not perfectly representable, except by chance. If you do many of these operations carelessly, you might end up with a significantly-incorrect digit. But the starting point will have made no difference at all in reaching that result.

6.02214076e23 is not followed by a pretty trail of zeros, it stops existing at the last significant figure.

Maybe we should denote Avogadro’s number as an integer since it is in fact a large integer by definition.

julia> const Nₐ = int128"602214076000000000000000"
602214076000000000000000

julia> Nₐ ≈ 6.02214076e23
true

julia> BigFloat(Nₐ)
6.02214076e+23

julia> BigFloat(Nₐ) == Nₐ # no need to use approx
true

Reference: https://www.nist.gov/si-redefinition/meet-constants

This appears to be optimal:

julia> const Nₐ = int128"602214076000000000000000"
602214076000000000000000

julia> Int128(convert(Float64, Nₐ))
602214075999999987023872

julia> f1 = Int128(convert(Float32, Nₐ))
602214064354984894398464

julia> f1plus = Int128(nextfloat(convert(Float32, Nₐ)))
602214100383781913362432

julia> abs(Nₐ - f1) < abs(Nₐ - f1plus)
true

It isn’t actually an irrational number, and the conversion logic does the right thing in finding the best approximation available.
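
Building on that, one possible way to get the constant at a requested precision is to convert the exact integer on demand. This is only a sketch: the function name `avogadro` is hypothetical and not part of any package mentioned in the thread.

```julia
# Convert the exact integer to whichever float type the caller asks for.
avogadro(::Type{T}) where {T<:AbstractFloat} = T(Int128(602214076) * Int128(10)^15)

na64 = avogadro(Float64)
na32 = avogadro(Float32)
nabig = avogadro(BigFloat)

println(na64)
println(na32)
println(nabig)
```

Unlike the `@irrational` approach, the caller has to name the type explicitly, but the stored definition stays an exact integer and each conversion rounds to the nearest representable value.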

Unitful (naturally) has mol as a unit, which is the way to go if doing practical calculations with molar masses. It’s unclear to me how to convert moles to number-of-particles using Unitful, but casting that definition of Nₐ to BigFloat (if the value isn’t an integer) and multiplying by the value would give a correct result, which could then be truncated to whatever degree of precision was useful. In no sense would such a result be inaccurate or imprecise, except in the limited sense that the Float type might show an inappropriate level of precision. For calculations of the physical world, Float64 is manifestly adequate.

The significant decimal digits of Nₐ correspond to the value before 2019, when Nₐ was experimentally determined (the number of atoms in 12 g of ¹²C). After 2019 Nₐ became a fixed number (similar to the speed of light, 299_792_458 m/s). Perhaps one could have made Nₐ “compute friendly” by slightly modifying the experimental value. But this would not be “human memory friendly”. There are people who work with the number more or less daily and have memorized it to full precision.

Indeed. If experiments determine the value of Avogadro’s number (which is no longer the same thing as the Avogadro constant, which is the proper name of the value we’ve been discussing) to a higher degree of precision, SI may choose to update the standard to reflect that. Or not, if they decide that continuity and consistency is more important than having the constant continue to reflect the best available value of Avogadro’s number as originally defined.

I hope they do, because in practice the mole is used under the assumption that n mol * molecular-weight-dalton = n gram, which since 2019 must be treated as an approximation. Prior to 2019, this was the definition of the mol, and the dalton, and by derivation, the Avogadro constant.

Knowing chemists (having been one in a previous life), if a better value for Avogadro’s number is found, they’ll use it in preference to whatever SI declares the Avogadro constant to be. Although one must be doing some… unusual calculations, for additional precision to show up in the result in a relevant way, unusual calculations do happen.

I agree with @mkitti that the best way to represent the constant is as an Int128, but it’s important to understand that for practical reasons, 6.02214076×10^23 should be treated as though it’s a decimal float, with only the precision shown in the scientific notation of the number. This is unlike, say, the definition of a second, where exactly 9192631770 transitions of ¹³³Cesium occur in one second (under the assumed experimental conditions).

Btw, don’t use a Float32 for that value:

julia> Int(Float32(9192631770))
9192631296

julia> Int(Float64(9192631770))
9192631770
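
The same check can be wrapped in a small helper. This is a sketch only: `fits_exactly` is a hypothetical name, and the round trip through `Int128` assumes the value is within Int128 range.

```julia
# True when the integer n survives a round trip through float type T unchanged.
fits_exactly(::Type{T}, n::Integer) where {T<:AbstractFloat} = Int128(T(n)) == n

cs = 9192631770                        # caesium transitions per second
cs_f32 = fits_exactly(Float32, cs)
cs_f64 = fits_exactly(Float64, cs)

# Up to these limits every integer is exactly representable
limit32 = Int(maxintfloat(Float32))    # 2^24
limit64 = Int(maxintfloat(Float64))    # 2^53

println("Float32 exact: ", cs_f32, ", Float64 exact: ", cs_f64)
println("limits: ", limit32, " and ", limit64)
```

The caesium frequency sits between the two limits, which is why Float64 stores it exactly and Float32 does not.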

My own two cents on Avogadro’s number.

My field is mainly equations of state (given a set of parameters, predict properties of a fluid and how it behaves in equilibrium situations). In particular, some equations of state (EoS for short) require the specification of the gas constant (R = N_A * k_B, where k_B is Boltzmann’s constant). For example, the current European standard EoS for natural gas (imagine you measure the temperature and pressure in a pipe, and you calculate how much volume is passing through that pipe using the reference EoS) was defined before the change of N_A to an exact constant, and because of that, the value of the gas constant differs between the old and new reference EoS.

There are other types of EoS, known as Statistical Associating Fluid Theory (SAFT), and those basically determine the properties per molecule, so you need to rescale by N_A to obtain bulk properties. In particular, I was porting a NIST library that implements some high-accuracy approximations for a type of SAFT EoS (which require evaluating such EoS in extended precision), and the message on one of the commits says it all: