Baffling bug counting nonzero elements in a vector

I have been banging my head against the following bug for more than a couple of days now, and I’m still baffled. I want to count the number of zero and nonzero elements in a couple of vectors, as shown below. The counts get printed by the @show statement labelled “90”. However, for some reason m and m0 both print as zero, even though the @show statements labelled 82 and 91 (which are executed immediately before and after the counting loops) show a bunch of nonzero elements in the “unaryIdToFeatValList” vector. (The binaryIdToFeatValList vector doesn’t have this problem.) The declaration of unaryIdToFeatValList (a member of a struct) is:

...
    unaryIdToFeatValList::Vector{Tuple{UInt32, UInt16, UInt16}} 
@show 82, base.unaryIdToFeatValList[fvi-10:fvi]
m0 = m = x = 0 
for e in base.unaryIdToFeatValList
    x += 1
    UInt(e[1]) == 0 ? (m0 += 1) : (m += 0)
end
n0 = n = y = 0
for e in base.binaryIdToFeatValList
    y+=1
    e[1] == 0 ? n0 += 1 : n += 1
end
@show 90, m, m0, x, n0, n ,y
@show 91, base.unaryIdToFeatValList[fvi-10:fvi]

A sample output is:

(82, base.unaryIdToFeatValList[fvi - 3:fvi]) = (82, Tuple{UInt32, UInt16, UInt16}[(0x00000007, 0x0003, 0x0000), (0x00000008, 0x0003, 0x0000), (0x00000009, 0x0003, 0x0000), (0x00000001, 0x0004, 0x0000)])
(90, m, m0, x, n0, n, y) = (90, 0, 0, 659, 257499955, 45, 257500000)
(91, base.unaryIdToFeatValList[fvi - 3:fvi]) = (91, Tuple{UInt32, UInt16, UInt16}[(0x00000007, 0x0003, 0x0000), (0x00000008, 0x0003, 0x0000), (0x00000009, 0x0003, 0x0000), (0x00000001, 0x0004, 0x0000)])

Both arrays can grow quite large (~500,000,000+ elements), pushing the memory limits of my laptop. I don’t know if there are possibly subtle side effects from overextending my code? Or maybe I’m just overlooking something really stupid. Any feedback would be much appreciated.

It’s a very good idea to give a reproducer in such questions. Many times I have wanted to ask a question like this and then found the answer myself while trying to minimize the reproducer (because it’s a form of troubleshooting/diagnosis).
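For illustration, a minimal reproducer for this question might look like the sketch below. The three tuples are made-up stand-ins for base.unaryIdToFeatValList, and count_first_field is a hypothetical helper name; wrapping the loop in a function is my own choice to keep the snippet independent of where it is run.

```julia
# Hypothetical helper: runs the counting loop from the question unchanged.
function count_first_field(v)
    m0 = m = x = 0
    for e in v
        x += 1
        # Same ternary as in the question, including the `m += 0` branch.
        UInt(e[1]) == 0 ? (m0 += 1) : (m += 0)
    end
    return m, m0, x
end

# Made-up sample data with the same element type as the question.
v = Tuple{UInt32, UInt16, UInt16}[(0x00000007, 0x0003, 0x0000),
                                  (0x00000008, 0x0003, 0x0000),
                                  (0x00000001, 0x0004, 0x0000)]

m, m0, x = count_first_field(v)
@show m, m0, x   # m and m0 both stay 0 even though x counts 3 elements
```

With only three elements the symptom is the same as in the original output, which makes the counting line much easier to inspect.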

However, I have a hunch your issue may be that you’re not using the global keyword. The result would be that you’re just creating new variables m and m0 in local scope.

Actually, there are quite a few oddities in your question/code.

I think the ternary (cond ? a : b) always evaluates both increments. Use if instead.

BTW, instead of UInt(e[1]) == 0 it could be prettier to use iszero(e[1]) or iszero(first(e)).
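As a rough sketch of that suggestion, the whole counting loop can also be replaced by count with a predicate. The sample vector below is made up for illustration, and n_zero / n_nonzero are hypothetical names.

```julia
# Made-up sample data; only the first tuple has a zero in its first field.
v = Tuple{UInt32, UInt16, UInt16}[(0x00000000, 0x0001, 0x0000),
                                  (0x00000007, 0x0003, 0x0000),
                                  (0x00000009, 0x0003, 0x0000)]

# count(f, itr) returns how many elements satisfy the predicate f.
n_zero = count(e -> iszero(first(e)), v)
n_nonzero = length(v) - n_zero

@show n_zero, n_nonzero
```

This avoids hand-written increments altogether, so there is no branch in which a typo like += 0 could hide.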

Why would you add the identity element (m += 0)? I guess this is a typo?

These two seem inconsistent:

No, an error would be thrown at m += 0 when it tries to access a nonexistent m before assignment. You’re right that a global keyword would be necessary in a file or @eval statement, but it’s likely that this code is being run in the REPL or a Jupyter notebook. (Personal pet peeve at how globals are treated differently there).

No, cond ? x : y works just like if cond x else y end, you’re thinking of ifelse(cond, x, y).
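A small sketch of the difference: the hypothetical helper note records which branches were actually evaluated. Everything here is illustrative and not taken from the original code.

```julia
calls = String[]

# Hypothetical helper: records that it was called, then returns `value`.
note(name, value) = (push!(calls, name); value)

# Ternary: only the selected branch is evaluated.
r1 = true ? note("a", 1) : note("b", 2)
ternary_calls = copy(calls)

empty!(calls)

# ifelse is an ordinary function, so both arguments are evaluated
# before the call, and only then is one of them returned.
r2 = ifelse(true, note("a", 1), note("b", 2))
ifelse_calls = copy(calls)

@show ternary_calls ifelse_calls
```

Both r1 and r2 end up as 1; the difference is only in which side effects happen along the way.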

I agree this explains why m doesn’t change. But if UInt(e[1]) == 0 was ever true, then m0 should change. OP has not provided the binaryid___ values, but the few values of unaryid___ shown indeed never have 0 at any tuple’s 1st element. The 0s are at the 3rd element.

Ah, the m += 0 was the problem! I analyzed that line countless times without catching it. Sorry to bother you all, but thanks so much for the fresh eyes!
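For anyone landing here later, a corrected version of the counting loop could look like the sketch below. The function name count_zero_nonzero is hypothetical, the sample values are the four tuples from the printed output above, and putting the loop in a function is a choice of mine that also sidesteps the global-scope question raised earlier.

```julia
# Hypothetical helper: counts tuples whose first field is zero / nonzero.
function count_zero_nonzero(v)
    m0 = m = 0
    for e in v
        if iszero(first(e))
            m0 += 1
        else
            m += 1   # this was `m += 0` in the question
        end
    end
    return m0, m
end

# The four tuples shown in the sample output of the question.
sample = Tuple{UInt32, UInt16, UInt16}[(0x00000007, 0x0003, 0x0000),
                                       (0x00000008, 0x0003, 0x0000),
                                       (0x00000009, 0x0003, 0x0000),
                                       (0x00000001, 0x0004, 0x0000)]

m0, m = count_zero_nonzero(sample)
@show m0, m
```

None of these four tuples has a zero first field, so all of them land in the nonzero count.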
