"error in running finalizer" in CSV.jl

Hi,

When using CSV.Rows from CSV 0.9.4 repeatedly (using Revise to re-run a test), I get the following error:

error in running finalizer: Base.IOError(msg="unlink("<home>\\AppData\\Local\\Temp\\1\\jl_9056.tmp"): permission denied (EACCES)", code=-4092)
uv_error at .\libuv.jl:97 [inlined]
unlink at .\file.jl:958
#rm#12 at .\file.jl:276
rm##kw at .\file.jl:267 [inlined]
#81 at <home>\.julia\packages\CSV\b4GfC\src\rows.jl:138

Unfortunately, I haven’t managed to produce a small test-case that shows the bug yet, but the code is roughly:

using Revise, CSV
entr(["."]) do
    it = CSV.Rows("large-file.csv", header=0)
    (header, state) = iterate(it)
    while true
        next = iterate(it, state)
        next === nothing && break
        (row, state) = next
    end
end

I’ll keep working on a reduced test case that actually fails, but in the meantime, does anyone have any insight into what I might be doing wrong?

This error occurs when CSV.jl tries to clean up a temp file it used while parsing and removing that temp file fails. It’s not clear why that would fail, unless maybe you don’t have permission to delete files in that temp directory?
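
To make the failure mode concrete, here is a minimal sketch of a finalizer that removes a temp file. It is not CSV.jl's actual code: `TempBacked` and `make_temp_backed` are hypothetical names, and the idea that the temp file is memory-mapped is an assumption. As far as I know, Windows generally refuses to delete a file while a mapping of it is still alive, which would surface as EACCES from `unlink`; on Linux or macOS the same `rm` should succeed.

```julia
using Mmap

# Hypothetical stand-in for an object that owns a temp file and a
# memory-mapped view of its contents (an assumption, not CSV.jl's design).
mutable struct TempBacked
    path::String
    data::Vector{UInt8}
end

function make_temp_backed(bytes::Vector{UInt8})
    path, io = mktemp()          # creates the temp file and opens it
    write(io, bytes)
    close(io)
    # Map the file; the mapping stays valid after the handle is closed.
    data = open(path, "r") do f
        Mmap.mmap(f, Vector{UInt8}, length(bytes))
    end
    obj = TempBacked(path, data)
    # force=true only ignores a missing file; other errors (such as
    # EACCES) still propagate and get reported by the runtime as
    # "error in running finalizer".
    finalizer(o -> rm(o.path; force=true), obj)
    return obj
end

obj = make_temp_backed(collect(codeunits("a,b\n1,2\n")))
```

If anything else keeps `obj.data` alive when `obj` is finalized, the mapping is still open at the moment `rm` runs, which is the situation the sketch is meant to show.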

Thanks for the quick reply!

I can delete the file from Explorer, so I assume I have permission. I wondered whether the file might still be open somewhere else in CSV.jl. Might there be another finalizer that needs to be called, or a GC that needs to happen, before that file is closed?

The error seems to happen (in the full code) the second time through the entr loop, so I have to wonder whether Revise is playing a part. I’m also reading several files in rapid succession (but not simultaneously). Is there any chance of a collision on a temp file name?
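
On the collision question, a quick sketch using only Base: `mktemp` creates each file as it hands back the path, so names returned while the earlier files still exist cannot repeat. Whether CSV.jl uses `mktemp` or something else for its temp file is an assumption here, and `make_temp_paths` is a hypothetical helper name.

```julia
# Create n temp files in rapid succession and return their paths.
function make_temp_paths(n::Integer)
    paths = String[]
    for _ in 1:n
        path, io = mktemp()   # file is created on disk at this point
        close(io)
        push!(paths, path)
    end
    return paths
end

paths = make_temp_paths(100)
@show allunique(paths)        # checks that no path was handed out twice
foreach(rm, paths)            # clean up
```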

As I said, I’ll see if I can get to the point that I can replicate it.

This code seems to exhibit the error:

using CSV
function read()
    it = CSV.Rows(open("file.csv", "r"), header=0)
    (header, _) = iterate(it)
    header[1]
end
field = read()
GC.gc(true)
println(field)

In order to trigger the problem:

  • you need to pass CSV.Rows an IOStream, otherwise it doesn’t create a temp file
  • you need to hold onto a string that you read from the file (does this string refer to the contents of the temp file?)
  • you need to finalize the iterator while the string is still alive.

I would have thought that a workaround would be to return String(header[1]) from read, but that doesn’t seem to work.
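
Going by the first point above, one possible workaround is to pass the file path rather than an IOStream, so that no temp file is created in the first place. This is an untested sketch under that assumption; `first_field` is a hypothetical helper name, and it assumes the first field is not `missing`.

```julia
using CSV

# Same idea as the repro, but CSV.Rows gets a path instead of an IOStream.
function first_field(path::AbstractString)
    rows = CSV.Rows(path; header=0)
    next = iterate(rows)
    next === nothing && return nothing   # empty file
    (row, _) = next
    return String(row[1])                # would throw if the field were missing
end
```

For example, `println(first_field("file.csv"))` would replace the `read()` call in the repro.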

An even easier test case is:
julia -e "using CSV; CSV.Rows(open(\"file.csv\", \"r\"))"

A colleague has confirmed that they get the same problem. Note that this is with CSV 0.9.4 on Windows 10 (for some reason I can’t upgrade to 0.9.10; I’m not sure what’s pinning that…)