Side Effects of println?

I’m getting some really strange behavior, and I’m not sure what to make of it. Essentially the values in an array change based on making a call to println elsewhere in the script. I can’t figure out how to reproduce it in a smaller example, but basically the code structure goes like

# Do lots of things
println(inconsequential_value)
# Do lots of things
ao = parseseries("ao.txt") # Function to read in data
println(cor(ao, pc)) # Gives correct value
# Plot ao and pc

for the correct case and

# Do lots of things
ao = parseseries("ao.txt") # Function to read in data
println(cor(ao, pc)) # Gives incorrect value
# Plot ao and pc incorrectly

for the bad case, where the only difference is the call to println earlier in the code. Here are images of the two arrays. I can’t give the actual bad values because calling println corrects the data (but passing it on to PyCall does not). What’s interesting is that it gives the same wrong values every time.

What I would like is some guidance on how to go about debugging this error. I went through my script commenting things out until the error went away, and I was left with a lot of unrelated stuff that has to execute before the error will appear. I’m not sure how to narrow it down further.
The bad data: (image not preserved)

The correct data: (image not preserved)

For the record the parseseries function looks like:

normalize(series) = (series .- mean(series)) / std(series)
function parseseries(filename)
    open(filename) do f
        data = Vector{Vector{Float64}}()
        for line in eachline(f)
            tokens = filter(x -> length(x) > 1, split(line, " "))
            year = parse(Int, tokens[1])
            !(year in 1979:2010) && continue
            jan = parse(Float64, tokens[2])
            feb = parse(Float64, tokens[3])
            dec = parse(Float64, tokens[end])
            push!(data, [jan, feb, dec])
        end
        results = Vector{Float64}(undef, 31)
        for year in 1:30
            results[year] = mean([data[year][3], data[year+1][1], data[year+1][2]])
        end
        return normalize(results)
    end
end

but I don’t think that’s the issue. Maybe it’s a GC issue or something.

Edited to fix typo.

A GC issue is the least likely explanation.

Note that in

results = Vector{Float64}(undef, 31)
for year in 1:30
    results[year] = mean([data[year][3], data[year+1][1], data[year+1][2]])
end

the element results[31] is never assigned, so it holds whatever happened to be in that memory when the array was allocated. This could be the cause.

Be very, very careful with undef.
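One way to make a missed slot visible while debugging is to fill the array with NaN instead of using undef, since NaN propagates through mean, std, and cor. This is only a minimal sketch: the loop body is a placeholder for the real winter-mean computation, and `missed` is a name made up for the example.

```julia
using Statistics

# Allocate with a sentinel instead of undef, so unassigned slots are obvious.
results = fill(NaN, 31)

# Same off-by-one as in the question: only 30 of the 31 slots get written.
for year in 1:30
    results[year] = Float64(year)  # placeholder for the real computation
end

# Any index still holding NaN was never assigned.
missed = findall(isnan, results)
println("unassigned indices: ", missed)

# NaN propagates, so downstream statistics fail loudly instead of silently.
println("mean(results) = ", mean(results))
```

With undef the same bug yields an arbitrary number that may happen to look plausible, which is why it is so hard to spot.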

EDIT: You can protect yourself from errors like this (hardcoding the wrong size) in many ways. One of them is

for year in eachindex(results)
    results[year] = ...
end
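Another option, sketched here under the assumption that `data` holds one `[jan, feb, dec]` vector per year as in the question, is to avoid preallocating at all and let a comprehension determine the length. The `data` values below are made up for illustration, and `winters` is a hypothetical name.

```julia
using Statistics

# Hypothetical stand-in for the parsed file: 32 years of [jan, feb, dec].
data = [[Float64(y), Float64(y) + 1, Float64(y) + 2] for y in 1:32]

# One winter mean per pair of consecutive years (Dec of year y, Jan and Feb
# of year y + 1). The result length follows from length(data), so there is
# no hardcoded size and no uninitialized slot.
winters = [mean([data[y][3], data[y + 1][1], data[y + 1][2]])
           for y in 1:length(data) - 1]

println(length(winters))
```

Because the array is built from the values themselves, every element is defined by construction.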

That solved the inconsistencies, thanks for your help. It’s still weird how all of the values in the array were incorrect with the bug though, unless there’s another bug in NCAR Graphics that made them all appear to be less than 0 when there was garbage in the last place.

Is that really true? Once you already have the ao array, printing it looks fine and every operation on it after that is fine, but if you plot it instead of printing it, it has the wrong values? The value you get in an undef array can depend on what operations have been done before, but it shouldn’t mysteriously change once the array is created. Maybe instead, every value except the last one is ok?
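One possible reason every plotted value could look wrong, offered as a sketch rather than a diagnosis: normalize subtracts the mean and divides by the standard deviation of the whole vector, so a single garbage entry shifts and rescales all the others. The numbers below are invented; `1.0e12` is a hypothetical stand-in for whatever the uninitialized slot happened to contain.

```julia
using Statistics

# Same definition as in the question.
normalize(series) = (series .- mean(series)) / std(series)

# 31 well-behaved values, then the same vector with garbage in the last slot.
good = collect(range(-1.0, 1.0; length=31))
bad = copy(good)
bad[end] = 1.0e12  # hypothetical garbage value

normalized_good = normalize(good)
normalized_bad = normalize(bad)

# The outlier dominates the mean, so every other entry ends up below zero.
println(all(<(0), normalized_bad[1:30]))
println(extrema(normalized_good))
```

If the garbage happens to be close to zero instead, the normalized series looks almost right, which would be consistent with the two cases giving such different correlations.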


Yes that is true.

ao = parseseries("ao.txt")
println(cor(ao, pc))

prints -0.16844440419944623 which is obviously using bad values.

ao = parseseries("ao.txt")
println(ao)
println(cor(ao, pc))

prints the array with -0.0395386 in the last spot and prints 0.6317015985345612

ao = parseseries("ao.txt")
println(cor(ao, pc))
println(ao)

prints 0.6317015985345612 and then array with -0.0395386 in the last spot. I can take it further and plot the data before printing out the array, and it still fixes the data.

Ok, I think this is still just the same issue. It looks like it’s the presence of println in the code that changes the uninitialized value, not that running println is changing the value in the existing array. Are all these statements running at the top level, or are some inside functions? If the latter, then changing the code will hit different paths in the compiler, which could certainly change later uninitialized values.

Yes, this is all inside functions. It is strange, but undefined behavior is supposed to be strange.

Good, that’s a relief! 😅
