Change default precision


#1

Something I find myself doing often is writing an algorithm in Float64 (without bothering with types, as I would in Matlab, say), and then trying to change the default precision to Float32 or BigFloat (to test the sensitivity to numerical errors, to see if I can gain in performance…). Julia makes it very easy to write type-agnostic functions, but typically

  • I will not bother writing fully generic functions, and will have a number of zeros(n), randn(n) or the like in my functions
  • The top-level script still has to pick a type, and will call zeros(n), randn(n) etc.

The correct way to go about trying my code in Float32 / BigFloat would be to make sure each of my functions is type-agnostic, define a global DEFAULT_TYPE constant, and use that in my top-level scripts. This can be done, but is a bit annoying (and requires thinking explicitly about types when coding).

Wouldn’t it be great if this could be done with a one-line change at the top-level of the script? (or with a command-line option) The only thing that would have to be changed is that a number of functions (zeros/randn and many others) produce Float64 by default. If I understand correctly https://github.com/JuliaLang/julia/pull/23205, this could be done by having a DEFAULT_FLOAT_TYPE() = Float64 function somewhere, replacing all explicit uses of Float64 in Base with this, and then overwriting that method. Would that be feasible?
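A minimal sketch of that mechanism, where DEFAULT_FLOAT_TYPE and my_zeros are made-up names standing in for what Base would do internally:

```julia
# Hypothetical sketch: DEFAULT_FLOAT_TYPE and my_zeros are stand-ins
# for the real Base machinery behind zeros(n), randn(n), etc.
DEFAULT_FLOAT_TYPE() = Float64
my_zeros(n) = zeros(DEFAULT_FLOAT_TYPE(), n)

eltype(my_zeros(3))              # Float64

# Overwriting the method changes every caller (at the cost of
# invalidating and recompiling everything that depends on it):
DEFAULT_FLOAT_TYPE() = BigFloat
eltype(my_zeros(3))              # BigFloat
```

Note that the second call only picks up the new method once control returns to the top level, because of how method redefinition works in Julia.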

A concern is that it would force recompilation of all code, including library code (I want my computation to be done in BigFloat, but not necessarily all the internals in the plotting library, say). This might be acceptable, though.


#2

You could either

  1. always convert to the type of a specific argument, which you pass around, or

  2. convert to the type of a global constant (would need to restart the kernel though).

Eg

like(x::T, y) where T = convert(T, y)
like(x::T, y::T) where T = y

## just pass around x for type determination
foo(x, y) = like(x, y+2)

## or have a constant, change this and restart
const x = 1f0

foo(y) = like(x, y+2)
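Concretely, option 1 propagates the precision of the first argument (repeating the definitions so the snippet runs on its own):

```julia
# Option 1: convert to the type of an argument you pass around.
like(x::T, y) where {T} = convert(T, y)
like(x::T, y::T) where {T} = y

foo(x, y) = like(x, y + 2)

foo(1.0f0, 1)    # 3.0f0, a Float32
foo(big"1.0", 1) # a BigFloat
```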

#3

Yes, I can do that. In fact, in many cases it’s simpler than that: I can get the type from somewhere and just call randn(eltype(arr), n) or something like this. My point is that this requires additional machinery which can be cumbersome to code, because it requires that I think about types when writing code (which I often don’t want to do). Here’s a stupid example, but hopefully it shows what I mean:

function compute_rand(d)
    A = randn(length(d), length(d))
    A = A+A'+Diagonal(d)
    eig(Symmetric(A))[1][1]
end
compute_rand(randn(10))

If I want to do this on Float32, I have to change as follows:

function compute_rand(d)
    A = randn(eltype(d), length(d), length(d))
    A = A+A'+Diagonal(d)
    eig(Symmetric(A))[1][1]
end
flttype = Float32
compute_rand(randn(flttype,10))

Of course on this particular example it’s not that big a deal, but on more involved code it can be a pain to be careful about writing completely generic code. Ideally, I’d only have to do set_default_float_type(Float32) and that’d be it.
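To be explicit about what that would mean: set_default_float_type does not exist in Julia, but one conceivable way such a switch could be implemented is eval-based method redefinition, with all the recompilation caveats mentioned earlier:

```julia
# set_default_float_type is hypothetical (it does not exist in Julia);
# one conceivable implementation redefines a zero-argument function:
default_float() = Float64
set_default_float_type(T::Type) = @eval default_float() = $T

set_default_float_type(Float32)
default_float()   # Float32 (once control returns to the top level)
```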


#4

I understand the problem, but AFAICT making any kind of outcome depend on a mutable global is in fundamental conflict with the performance model of Julia.

Perhaps I write different kinds of functions, but I find type stability/propagation pretty straightforward with the existing language features. Eg consider

function compute_rand(d)
    ix = indices(d, 1)
    A = similar(d, (ix, ix))
    randn!(A)
    A .+= A' + Diagonal(d)
    eig(Symmetric(A))[1][1]
end

compute_rand(randn(10))
compute_rand(randn(Float32, 10))

#5

I understand the problem, but AFAICT making any kind of outcome depend on a mutable global is in fundamental conflict with the performance model of Julia.

Not really:

default_float() = Float64
n_randn(n) = randn(default_float(), n)

display(n_randn(2))
default_float() = Float32
display(n_randn(2))

so you could define randn(n) = randn(default_float(), n) and change default_float() later. I understand it may have other problems down the road (forcing recompilation, contaminating other packages), but in principle it’s very much feasible.

Regarding your function, I agree it’s not that hard to write generic code, but don’t tell me you find writing this as simple and natural as the naive version above :slight_smile: This is what I meant above: when writing numerical code, I don’t want to think about types (and I don’t have to unless I want to change type). I just want to have my cake and eat it too, isn’t that what julia’s about?


#6

For performance reasons, using the constructs for type stability/propagation has become a habit for me, even for simple functions. Perhaps your code can be said to be more “natural”, but consistently following a few simple rules (try to use similar, zero, one, indices, …) without thinking about whether I would or would not need them in a particular case actually saves cognitive resources for me.


#7

Of course it is much simpler and more natural and easier to understand than the version above.

If you don’t want to think about types, then you should just make your function work on one type and convert everything to that.


#8

What scares me about this is that if it’s allowed, a silent global constant or environment variable could mean that the file is supposed to be used very differently from what you can actually see in the code itself. I don’t like that idea.

One thing you could do though is write a macro that finds all literals and replaces them with what you want. For example:

@parse_precision (Float32,Int8) begin
    # ... your code ...
end

which reads all of your code and replaces floating-point definitions with parse(Float32,"...") and similarly for ints (or maybe something a little smarter so it still inlines). Make it so you can throw it on functions or wrap large pieces of code, and that would be a nifty one-macro package.
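A rough sketch of how such a macro could work (names are made up, and this simplified version handles only Float64 literals and converts at run time rather than inlining):

```julia
# Hypothetical sketch of the proposed macro: walk the expression tree
# and wrap every Float64 literal in a conversion to the target type.
function retype(ex, T)
    ex isa Float64 && return :($T($ex))
    ex isa Expr && return Expr(ex.head, (retype(a, T) for a in ex.args)...)
    return ex
end

macro with_precision(T, block)
    esc(retype(block, T))
end

y = @with_precision Float32 begin
    0.5 * 2.0 + 0.25
end
# y == 1.25f0, a Float32
```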


#9

On the off-chance this isn’t a troll, I’ll feed: I don’t see in what sense it could be considered simpler. It may be “easier to understand” in the sense that it’s easier to see what’s going on in terms of types, memory allocations and so on. That’s definitely a useful point of view when writing tight loops where you care about memory allocations; it’s not when you have Matlab-style code that just manipulates arrays, or when you just want to try an algorithm without caring about performance. (Not every part of a code is performance-critical, and Julia in general is very good at “hybrid” programming: write Matlab-style code, profile, rewrite tight loops in C style when needed.)

@ChrisRackauckas Who’s to say randn(1) should return a Float64? Global state is always problematic, I grant you that, but it’s awkward to get around in many cases (number of threads, default RNG, treatment of subnormals, bigfloat precision…). Your @parse_precision is nice, but still not going to help me if I have a randn(2,2) lying around in a function.


#10

I would say that the “correct” way, certainly the recommended way if you are trying to write library code that will be used by others, is to always determine the precision used in a function from the function arguments.

In most cases, you can get the precision from the type of floating-point data passed to a function, but occasionally (e.g. rand(Float64)) you will need to pass an explicit type.
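A small illustration of both cases (precision taken from the data, plus an explicit type where values are created from scratch; noisy_copy is just an invented example):

```julia
# Invented example: the working precision T comes from the argument's
# element type, and is passed explicitly where new values are created.
function noisy_copy(x::AbstractVector{T}) where {T<:AbstractFloat}
    x .+ randn(T, length(x))
end

eltype(noisy_copy(randn(Float32, 4)))  # Float32
eltype(noisy_copy(randn(4)))           # Float64
```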


#11

That’s what I mean by “type-agnostic”, sorry if the naming was confusing. I know that’s the correct way, but I’m not writing library code to be used by others, and neither do 99% of users.


#12

Its documentation? (Is this a trick question? :wink:)

It is somewhat unfortunate that you consider replies that don’t share your POV trolling.


#13

I 100% agree with this. Any function should just be written generically and I don’t think it’s much more difficult to do anyways.

But I do see an issue in the top-level call scripts. For example, when I am writing code that is supposed to call NN library code, all of my “setup code” has to be converted to Float32 to do well on the GPU, so I find myself writing Float32(0.5) or Float32[0.2,0.4,0.4], i.e. having to remember to put Float32 all over the place so that the library functions get the right types. Catching the rand case here is indeed difficult, as @antoine-levitt noted, and I have found myself accidentally ending up with a few arrays of Float64 because they were generated by rand/randn and I forgot to declare the type to be Float32 (and then I have to add a CLArray in front of everything, etc.). There must be something that can be done to make this kind of code easier.


#14

Its documentation? (Is this a trick question? :wink:)

You know what I mean: it’s like this currently because it’s convenient and simple, but there’s no fundamental reason why the default precision should be Float64 (if I were writing GPU code, I’d probably want 32-bit precision as the default).

It is somewhat unfortunate that you consider replies that don’t share your POV trolling.

Sorry for the unnecessarily aggressive tone, I shouldn’t have. I don’t consider all replies trolling, just this particular one: I still don’t see how

    ix = indices(d, 1)
    A = similar(d, (ix, ix))
    randn!(A)

could be “of course simpler” than

A = randn(length(d), length(d))

My point is: I know writing generic functions is the correct thing to do. It is however a burden on my mind I don’t need when writing the type of code I write (I don’t care about reuse, I don’t care about other users, I care about implementing my algorithms, seeing if they work, and throwing most of them away because they’re rubbish). This is what I imagine users (as opposed to Base or package developers) do as well.

It seemed possible to keep writing my dirty non-generic code and switch easily to Float32/BigFloat, so I thought I’d ask if that was something people would consider a good idea. I see that the answer is overwhelmingly “no, write generic code”, so I’ll shut up now :slight_smile:


#15

It’s simpler in the sense that you can look at the code and actually understand what it is supposed to do, without hidden global state that suddenly changes the program in a very non-obvious way (i.e. you won’t see the effect immediately).

In which case simply mixing types isn’t a problem either, which means that you shouldn’t do this if you care about performance, and it’s useless when you don’t.


#16

randn(T, length(d), length(d)) isn’t much longer.

It can. You can also overwrite those.


#17

If you have a function that builds up a large sparse matrix and then diagonalizes it (for instance), you likely do not care about performance in the assembly phase (which is going to be negligible in most cases), but changing the type of the float is going to impact the performance of the eigensolver.

I agree with your point on global state: it’s nasty and hard to think about. However, it’s also extremely convenient, and unavoidable in many cases (Julia itself has a few instances already).

antoine-levitt:
but still not going to help me if I have a randn(2,2) lying around in a function.

It can. You can also overwrite those.

Exactly. In fact I’ve thought about making a package that would do this. My concerns are

  1. that basically relies on a grep Float64 of Julia Base, and therefore might break whenever methods are added/modified, which is why I thought it would be better to have this directly in the base language

  2. what happens if somewhere deep in PyPlot (say) some method calls randn(2,2)?
    Will that method be affected by the change? Will the whole PyPlot package get recompiled?

If 2 is not a problem, I’ll make a package and try it out.


#18

Put the type on the container and the conversion will happen automatically.

Nothing that affects the behavior of code that I’m aware of, and we are getting rid of them as much as possible.

Absolutely NOT. That’ll be terrible.

AFAICT you shouldn’t care about this, and you should not change base functions so that this won’t happen.


#19

antoine-levitt:
Will that method be affected by the change? Will the whole PyPlot package get recompiled?
AFAICT you shouldn’t care about this and you should not change base function so that this won’t happen.

Isn’t this what you meant before by overwriting randn(2,2)?

I was thinking of this:

function my_fun(n)
    sum(randn(n)) # just an example, but actually do something more complicated
end

import Base.randn
Base.randn(dims::Integer...) = randn(Float64, dims...)
display(my_fun(2))

Base.randn(dims::Integer...) = randn(Float32, dims...)
display(my_fun(2))

My point is that I want to do this in Float32, but without passing a type around in my_fun (because I’m a bad person who doesn’t want to write generic code)


#20

No, I mean to use a macro to rewrite it.

The type is usually easily inferable from the parameters, but no, we’ll not accept any change to Base or packages that uses global state like this.