It seems quite straightforward, if a bit messy, to implement a ‘good enough’ version for ints and maybe floats, e.g. by supplying a set of candidate types to check against (and a tolerance for floats).
My use case is basically dataframes built from multiple files read using Distributed, and I’m hoping that by squeezing the element type I can return more data to the host without running out of memory.
The fact that my searches turned up nothing makes me think this might not be a meaningful thing to do, though.
Since the JDF package is a bit large if this is all you want, you can also use something like this:
"""
    squeezeeltype(x; tol_kw...)

Return a collection with the narrowest element type that can hold
all of its elements. Only really meant to work on `Number` types.
Keyword arguments are passed to `isapprox`, which is used to
decide whether a floating point value may be stored at lower precision.
"""
function squeezeeltype(x; tol_kw...)
    # Unsigned types are only safe if nothing is negative:
    # promote_type(Int8, UInt8) is UInt8, which cannot hold -1.
    nonneg = !any(y -> y isa Real && y < zero(y), x)
    T = mapreduce(y -> _mintype(y, nonneg; tol_kw...), promote_type, x)
    convert.(T, x)
end

function _mintype(x::AbstractFloat, nonneg::Bool; tol_kw...)
    (isnan(x) || isinf(x)) && return Float16
    for T in (Float16, Float32, Float64)
        abs(x) <= floatmax(T) &&
            isapprox(T(x), x; tol_kw...) &&
            return T
    end
    BigFloat
end

function _mintype(x::Integer, nonneg::Bool; kw...)
    # nonneg describes the whole collection, not just this element
    nonneg ?
        _mintype_int(x, (UInt8, UInt16, UInt32, UInt64, UInt128)) :
        _mintype_int(x, (Int8, Int16, Int32, Int64, Int128))
end

function _mintype_int(x::Integer, Ts)
    for T in Ts
        typemin(T) <= x <= typemax(T) && return T
    end
    BigInt
end

# fallback
_mintype(x, nonneg::Bool; kw...) = typeof(x)
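One thing to keep in mind about the tolerance keywords: as far as I know, the default `rtol` of `isapprox` for two floats of different precision is taken from the less precise one (`sqrt(eps(Float16))`, roughly 3%), so without any keywords nearly every in-range value passes the `Float16` check. A small standalone sketch of that behaviour, using only `isapprox` from Base (the value `0.1` and the tolerance `1e-6` are just illustrations):

```julia
x = 0.1  # a Float64 that is not exactly representable in Float16 or Float32

# Default tolerance: derived from the less precise of the two argument types.
loose = isapprox(Float16(x), x)

# rtol = 0 (with the default atol = 0) only accepts exact round trips.
exact16 = isapprox(Float16(x), x; rtol = 0)
exact32 = isapprox(Float32(x), x; rtol = 0)

# A tolerance between the two precisions rejects Float16 but accepts Float32.
mid16 = isapprox(Float16(x), x; rtol = 1e-6)
mid32 = isapprox(Float32(x), x; rtol = 1e-6)

println((loose, exact16, exact32, mid16, mid32))
```

So if precision matters, passing an explicit `rtol` (or `rtol = 0` for lossless squeezing) is probably what you want.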
This might be a prime use case for @nospecialize, but I’m really no expert.
You could make the argument that floats holding whole-number values should return an Integer type. You can slightly modify the AbstractFloat case to do this using isinteger(x).
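For the round-float idea, here is a minimal sketch of what the modified check could look like. `mintype_roundfloat` is a made-up standalone name, not part of the code above, and capping it at `Int64` is my own assumption:

```julia
# Hypothetical helper: pick an integer type for floats that hold whole numbers,
# otherwise keep the float's own type.
function mintype_roundfloat(x::AbstractFloat)
    # isinteger is false for NaN and Inf, so those fall through to typeof(x).
    if isinteger(x)
        for T in (Int8, Int16, Int32, Int64)
            typemin(T) <= x <= typemax(T) && return T
        end
    end
    return typeof(x)
end

println(mintype_roundfloat(3.0))     # whole number, small
println(mintype_roundfloat(-129.0))  # whole number, just outside Int8
println(mintype_roundfloat(0.5))     # not a whole number
```

I would only rely on this when the whole column is round. If round and non-round values are mixed, `promote_type` goes back to a float type, which may be too narrow for the large round values.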
You can also surely do the above (probably faster) using the bit representation of the numbers (i.e. checking which is the first nonzero byte with bit shifts, etc.), but that’s left as an exercise for the reader.
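A rough sketch of the bit-based idea for the simplest case, a vector of unsigned integers. `mintype_bits` is a hypothetical name, and I'm assuming the element type is one of the built-in `UInt8` to `UInt128` types. It uses `leading_zeros` instead of manual shifts, and ORs all elements together so only one check is needed per column:

```julia
# Hypothetical helper: narrowest unsigned type for a vector of unsigned integers,
# found from the position of the highest set bit.
function mintype_bits(xs::AbstractVector{T}) where {T<:Unsigned}
    # OR-ing all elements preserves the highest set bit of the largest value.
    acc = reduce(|, xs; init = zero(T))
    # Number of bits actually needed (0 for an all-zero or empty vector).
    nbits = 8 * sizeof(T) - leading_zeros(acc)
    for U in (UInt8, UInt16, UInt32, UInt64)
        nbits <= 8 * sizeof(U) && return U
    end
    return UInt128
end

println(mintype_bits(UInt64[1, 255]))
println(mintype_bits(UInt64[3, 256]))
```

Signed values would need extra care (two's complement negatives have all their high bits set), so I've left those out here.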