Side Effects of println?

weech · May 9, 2019, 5:03pm

I’m getting some really strange behavior, and I’m not sure what to make of it. Essentially the values in an array change based on making a call to println elsewhere in the script. I can’t figure out how to reproduce it in a smaller example, but basically the code structure goes like

# Do lots of things
println(inconsequential_value)
# Do lots of things
ao = parseseries("ao.txt") # Function to read in data
println(cor(ao, pc)) # Gives correct value
# Plot ao and pc

for the correct case and

# Do lots of things
ao = parseseries("ao.txt") # Function to read in data
println(cor(ao, pc)) # Gives incorrect value
# Plot ao and pc incorrectly

for the bad case, where the difference is making a call to println earlier in the code. Here are images of the two arrays. I can’t give the actual bad values because calling println corrects the data (but passing it on to PyCall does not). What’s interesting is it gives the same wrong values every time. What I would like is some guidance on how to go about debugging this error. I went through my script commenting things out until the error went away, and I was left with a lot of unrelated stuff that has to execute before the error will appear. I’m not sure how to narrow it down further.
The bad data:

The correct data:

For the record the parseseries function looks like:

normalize(series) = (series .- mean(series)) / std(series)
function parseseries(filename)
    open(filename) do f
        data = Vector{Vector{Float64}}()
        for line in eachline(f)
            tokens = filter(x -> length(x) > 1, split(line, " "))
            year = parse(Int, tokens[1])
            !(year in 1979:2010) && continue
            jan = parse(Float64, tokens[2])
            feb = parse(Float64, tokens[3])
            dec = parse(Float64, tokens[end])
            push!(data, [jan, feb, dec])
        end
        results = Vector{Float64}(undef, 31)
        for year in 1:30
            results[year] = mean([data[year][3], data[year+1][1], data[year+1][2]])
        end
        return normalize(results)
    end
end

but I don’t think that’s the issue. Maybe it’s a GC issue or something.

Edited to fix typo.

Tamas_Papp · May 9, 2019, 5:10pm

That’s the least likely.

Note that in

results = Vector{Float64}(undef, 31)
for year in 1:30
    results[year] = mean([data[year][3], data[year+1][1], data[year+1][2]])
end

the element results[31] is never assigned, so it holds whatever happened to be in that memory, which is basically random. This could be the cause.

Be very, very careful with undef.

EDIT: You can protect yourself from errors like this (= hardcoding the wrong size) in many ways. One of them is

for year in eachindex(results)
    results[year] = ...
end
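
As a rough sketch of that suggestion, the loop bounds and the array size can both be derived from the data, so there is no hardcoded 31 or 30 to get out of sync. The `data` array below is a made-up stand-in for the parsed file (32 years of [jan, feb, dec] values), not the real series; a comprehension is shown as a second option that never creates an uninitialized array at all.

```julia
using Statistics  # provides mean

# Hypothetical stand-in for the parsed file: 32 years of [jan, feb, dec].
data = [[Float64(y), Float64(y) + 1, Float64(y) + 2] for y in 1:32]

# Size the output from the data instead of hardcoding 31.
results = Vector{Float64}(undef, length(data) - 1)
for year in eachindex(results)
    # December of this year, January and February of the next.
    results[year] = mean([data[year][3], data[year+1][1], data[year+1][2]])
end

# Alternative: a comprehension has no slot that can be left unassigned.
results2 = [mean([data[y][3], data[y+1][1], data[y+1][2]]) for y in 1:length(data)-1]
```

Either form writes every element, so nothing uninitialized reaches normalize.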

weech · May 9, 2019, 5:20pm

That solved the inconsistencies, thanks for your help. It’s still weird how all of the values in the array were incorrect with the bug though, unless there’s another bug in NCAR Graphics that made them all appear to be less than 0 when there was garbage in the last place.
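
One explanation that does not need a second bug: parseseries passes results through normalize, which subtracts the mean and divides by the standard deviation of the whole array. A single garbage element therefore shifts every element, not just the last one. A minimal sketch with made-up numbers, assuming the garbage value happens to be large:

```julia
using Statistics  # provides mean and std

normalize(series) = (series .- mean(series)) / std(series)

good = [1.0, 2.0, 3.0, 4.0]          # made-up values
bad  = [1.0, 2.0, 3.0, 4.0, 1.0e6]   # same values plus a made-up "garbage" last element

normalize(good)        # spread around zero, as expected
normalize(bad)[1:4]    # all four are negative and nearly identical
```

With a large garbage value the mean is pulled far above the real data, so all of the real entries end up below zero after normalization.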

jeff.bezanson · May 9, 2019, 5:31pm

Is that really true? Once you already have the ao array, printing it looks fine and every operation on it after that is fine, but if you plot it instead of printing it, it has the wrong values? The value you get in an undef array can depend on what operations have been done before, but it shouldn’t mysteriously change once the array is created. Maybe instead, every value except the last one is ok?

weech · May 9, 2019, 5:42pm

Yes, that is true.

ao = parseseries("ao.txt")
println(cor(ao, pc))

prints -0.16844440419944623 which is obviously using bad values.

ao = parseseries("ao.txt")
println(ao)
println(cor(ao, pc))

prints the array with -0.0395386 in the last spot and prints 0.6317015985345612

ao = parseseries("ao.txt")
println(cor(ao, pc))
println(ao)

prints 0.6317015985345612 and then the array with -0.0395386 in the last spot. I can take it further and plot the data before printing out the array, and it still fixes the data.

jeff.bezanson · May 9, 2019, 5:54pm

Ok, I think this is still just the same issue. It looks like it’s the presence of println in the code that changes the uninitialized value, not that running println is changing the value in the existing array. Are all these statements running at the top level, or are some inside functions? If the latter, then changing the code will hit different paths in the compiler, which could certainly change later uninitialized values.
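
Because the contents of an undef array depend on what ran before, one way to make this kind of bug show up the same way every time is to fill the array with a sentinel while debugging. This is only a sketch of that idea, not something from the original script; the loop body is a placeholder for the real computation.

```julia
# Use NaN as a sentinel instead of undef while debugging.
results = fill(NaN, 31)
for year in 1:30                      # same off-by-one as in the original loop
    results[year] = Float64(year)     # placeholder for the real computation
end

# Any slot the loop missed is still NaN, regardless of what ran earlier.
missed = findall(isnan, results)
println(missed)
```

A NaN also propagates through mean, std and cor, so the bad case would print NaN rather than a plausible-looking wrong correlation.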

weech · May 9, 2019, 5:56pm

Yes, this is all inside functions. It is strange, but undefined behavior is supposed to be strange.

jeff.bezanson · May 9, 2019, 5:58pm

Good, that’s a relief!
