Llama2-7b: difference in inference between Float16 and Float32

My first thought was: are you sure it's not bfloat16? It seems not. But either 16-bit format in Julia rounds after every operation, losing accuracy, so the error accumulates.
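To illustrate what I mean (a minimal Julia sketch of my own; the 0.001 values and the count are made up purely for illustration): a naive loop sum shows how the per-operation rounding in Float16 piles up, while Float32 stays close to the true value.

```julia
# Naive loop sum so that every addition rounds to the element type's precision
# (Julia's built-in `sum` uses pairwise summation, which would hide the effect).
function naive_sum(xs)
    acc = zero(eltype(xs))
    for x in xs
        acc += x          # each += rounds the result to eltype(xs)
    end
    return acc
end

xs32 = fill(0.001f0, 100_000)   # 100_000 copies of 0.001 as Float32
xs16 = Float16.(xs32)           # the same data stored as Float16

naive_sum(xs32)   # close to 100.0
naive_sum(xs16)   # stalls around 4: once the accumulator is large enough,
                  # adding 0.001 rounds away entirely
```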

Are you running the model on the GPU? It might be that GPUs do all operations with a larger accumulator. I'm not sure a CPU has that capability unless you cast to Float32 or Float64, and you would likely need to do that explicitly.
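Roughly what I mean, as a sketch (my own illustration, not anyone's actual kernel): keep the data stored in Float16 but carry the running value in a wider Float32 accumulator, which is more or less what mixed-precision GPU hardware does for you and what you would have to write out explicitly on the CPU.

```julia
# Dot product of Float16 vectors with an explicit Float32 accumulator.
function dot_widened(x::AbstractVector{Float16}, y::AbstractVector{Float16})
    acc = 0.0f0                                  # wider accumulator
    @inbounds for i in eachindex(x, y)
        acc += Float32(x[i]) * Float32(y[i])     # widen before multiply/add
    end
    return Float16(acc)                          # round back to storage format once, at the end
end

x = Float16.(randn(Float32, 10_000) .* 0.01f0)
y = Float16.(randn(Float32, 10_000) .* 0.01f0)
dot_widened(x, y)
```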

The Llama2 models were trained using bfloat16, but the original inference uses float16. The checkpoints uploaded on the Hub use `torch_dtype = 'float16'`, which will be used by the `AutoModel` API to cast the checkpoints from `torch.float32` to `torch.float16`.

The dtype of the online weights is mostly irrelevant unless you are using `torch_dtype="auto"` when initializing a model with `model = AutoModelForCausalLM.from_pretrained("path", torch_dtype="auto")`. The reason is that the model will first be downloaded (using the dtype of the checkpoints online), then it will be cast to the default dtype of torch (`torch.float32`), and finally, if a `torch_dtype` is provided in the config, it will be used.

Training the model in float16 is not recommended and is known to produce `nan`; as such, the model should be trained in bfloat16.
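To see the failure mode that warning is about (my own Julia sketch, not something from the model card): Float16's small exponent range overflows to Inf for fairly modest values, and Inf arithmetic then turns into NaN, whereas bfloat16 keeps Float32's exponent range and only gives up mantissa bits.

```julia
floatmax(Float16)             # 6.55e4 — easy to exceed with unscaled activations or gradients
Float16(1f5)                  # Inf16: already past the representable range
Float16(1f5) - Float16(9f4)   # Inf16 - Inf16 == NaN16
floatmax(Float32)             # ≈ 3.4e38 — bfloat16 shares this 8-bit exponent range,
                              # so overflow-driven NaNs are far less likely
```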

I couldn't confirm, since that link doesn't work. Also, it was most likely trained on a GPU, thus not really in 16 bits only. Maybe it's just natural that you can't use Float16, at least on CPUs. Besides, it's very much slower, and really only thought of as a storage format.

Note, in case it's helpful to you:

C++ (but not C) has bfloat16 since C++23, i.e. std::bfloat16_t (it also has std::float16_t, which C has as well):
https://en.cppreference.com/w/cpp/types/floating-point

@Oscar_Smith Maybe Julia should add bfloat16, to catch up with C++'s future… though a package is just as good (a different argument can be made for standardized languages and their stdlibs). Maybe there's no need to have it in non-standardized Julia; rather, Float16 could even be excised…?
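For what it's worth, a package already fills this gap today. A minimal sketch assuming the registered BFloat16s.jl package and the `BFloat16` type it provides (I haven't double-checked the current API, so treat the details as an assumption):

```julia
using BFloat16s      # assumed package providing a BFloat16 type

x = BFloat16(1/3)    # construct from a Float64
y = x + x            # arithmetic works; Float32's exponent range, but far fewer mantissa bits
Float32(y)           # widen back to Float32 when more precision is needed
```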
