Last row is broken when iterating an array

When I load a CSV file and iterate over its rows with a custom iterator, the last row comes out broken in Julia 1.0. I had to update the iterator for 1.0, but the basic code worked in 0.6. I also tested it in 0.7, where it ran with no warnings.

This behavior looks like a bug to me, possibly related to #28763. Also, if you have a better idea for this use case using built-in libraries, please feel free to share.

using DelimitedFiles # for readdlm

mutable struct mytbl # a custom object
    source
    function mytbl(csvdata::Base.GenericIOBuffer)
        source = readdlm(csvdata, ',')
        source = convert(Array, source[2:end,:]) # clear the headers
        new(source)
    end
end

Base.length(it::mytbl) = size(it.source, 1)

function Base.iterate(it::mytbl, (el, i)=(it.source[1,:], 1))
    return i >= length(it) ? nothing : (el, (it.source[i + 1,:], i + 1))
end

# some sample data
TABLE_CAST = """id,height,age,name,occupation
1,10.0,1,string1,2012-06-15 00:00:00
2,10.1,2,string2,2013-06-15 01:00:00
3,10.2,3,string3,2014-06-15 02:00:00
4,10.3,4,string4,2015-06-15 03:00:00
5,10.4,5,string5,2016-06-15 04:00:00
"""

table =  mytbl(IOBuffer(TABLE_CAST))

[ row for row in table ]
# the last element is #undef

[ row[1] for row in table ]
# Returns: [1, 2, 3, 4, 229445824]

[ row[2] for row in table ]
# Returns: [10.0, 10.1, 10.2, 10.3, 4.94e-324]

Note that I’ve tested this with more and with fewer rows, and the last row is always…well, broken.

Can you reproduce the problem by looking directly at the result of readdlm, without the mytbl wrapper? In any case, I’d file an issue on GitHub.

You can try CSV.jl or CSVFiles.jl, which are more powerful anyway, especially when columns have different types.

Thanks @nalimilan, advice taken.

readdlm loads the table just fine. It’s during the iteration that passing the element around seems to break things. Please advise if I should file an issue.
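For anyone who wants to repeat that check, here is a minimal sketch that inspects the output of readdlm directly, with no wrapper type. The name `sample` is just an illustrative copy of the data above. It uses the `header=true` keyword, which makes readdlm return the data and the header row separately, so the manual `source[2:end,:]` slice is not needed.

```julia
using DelimitedFiles # for readdlm

# Illustrative copy of the sample data from the first post.
sample = """id,height,age,name,occupation
1,10.0,1,string1,2012-06-15 00:00:00
2,10.1,2,string2,2013-06-15 01:00:00
3,10.2,3,string3,2014-06-15 02:00:00
4,10.3,4,string4,2015-06-15 03:00:00
5,10.4,5,string5,2016-06-15 04:00:00
"""

# With header=true, readdlm returns a (data, header) tuple.
data, header = readdlm(IOBuffer(sample), ','; header=true)

println(size(data))   # number of rows and columns, header excluded
println(data[end, :]) # the last row, read without any custom iterator
println(header)       # column names as a 1-row matrix
```

If the last row prints correctly here, the parsing step can be ruled out and the iterator is the place to look.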

Meanwhile I’ve managed to work around the problem with this iterator:

Base.length(it::mytbl) = size(it.source, 1)
function Base.iterate(it::mytbl, (el, i)=(nothing, 1))
    if i > length(it); return nothing; end
    return (it.source[i,:], (nothing, i + 1))
end
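Since the workaround no longer uses the element carried in the state, the state can probably be reduced to the row index alone. A minimal sketch, assuming a stand-in type `RowTable` (a hypothetical name, not the `mytbl` type above) that wraps an already-parsed matrix:

```julia
# Hypothetical stand-in for mytbl that wraps an already-parsed matrix.
struct RowTable
    source::Matrix{Any}
end

Base.length(it::RowTable) = size(it.source, 1)

# The state is just the index of the next row to return.
function Base.iterate(it::RowTable, i=1)
    i > length(it) && return nothing
    return (it.source[i, :], i + 1)
end

tbl = RowTable(Any[1 10.0 "string1"; 2 10.1 "string2"; 3 10.2 "string3"])

# Collect the first field of every row through the custom iterator.
ids = [row[1] for row in tbl]
println(ids)

# Alternative without a custom type: a generator over the row indices.
ids_alt = [row[1] for row in (tbl.source[i, :] for i in axes(tbl.source, 1))]
println(ids_alt)
```

The generator form may be enough when the wrapper type has no other purpose, since it needs neither `length` nor `iterate` definitions.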

That sounds weird. I’d definitely file an issue.

Done, thanks:

https://github.com/JuliaLang/julia/issues/29443
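
A note for later readers, which I have not checked against the linked issue: tracing the original iterator by hand suggests it stops one element early. It returns nothing as soon as `i >= length(it)`, so the row carried in the state at `i == length(it)` is never yielded, while `length` still reports the full row count. A comprehension appears to size its result from `length`, which would leave the last slot unfilled. The sketch below uses a hypothetical `Rows` type with the same iterate logic and counts what a plain for loop receives.

```julia
# Hypothetical stand-in for mytbl, holding an already-parsed matrix.
struct Rows
    source::Matrix{Int}
end

Base.length(it::Rows) = size(it.source, 1)

# Same logic as the iterator in the first post.
function Base.iterate(it::Rows, (el, i)=(it.source[1, :], 1))
    return i >= length(it) ? nothing : (el, (it.source[i + 1, :], i + 1))
end

# A plain for loop only calls iterate, so this counts what is really produced.
function count_yielded(it)
    n = 0
    for _ in it
        n += 1
    end
    return n
end

rows = Rows([1 10; 2 20; 3 30; 4 40; 5 50])
yielded = count_yielded(rows)
println("length reports ", length(rows), ", iterate yields ", yielded)
```

If the two numbers differ, the mismatch between `length` and `iterate` is a likely explanation for the uninitialized last element, and the workaround above fixes it because its check is `i > length(it)`.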