Massive `readstring` -> Too many open files


#1

My program processes a large number of files using `readstring`. Eventually it dies with Too many open files. I guess the reason is the way the `read` function is implemented: it just calls `open` but never calls `close`, and it doesn’t use a `do` block either. Should I report a bug, or am I misunderstanding something?


#2

`readstring` appears to close the file properly by using the callback version: https://github.com/JuliaLang/julia/blob/8eca02717bcea524877fad01aacedc807898bcd1/base/io.jl#L537. Which function are you calling?


#3

I thought it was the `readstring` from base/process.jl that gets called. It invokes `read`, which, as far as I can see, doesn’t call `close`.

I’m not sure how Julia decides which of the two `readstring` methods (io.jl vs. process.jl) to call.


#4

If you are reading files, then you are not calling the one for `AbstractCmd` (the method in process.jl). See the docs for how these work. The one for commands is closed after the read.


#5

Now I see. It seems that calling `readstring("/path/to/file")` turns into the following chain of calls:

(1) readstring("/path/to/file") → 
(2) open(readstring, "/path/to/file") → 
(3) readstring(file_handle) → 
(4) read(file_handle) → ...

And `open` (2) does close the handle after `readstring` (3) returns.
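
A minimal sketch of step (2), assuming a scratch file created with `mktemp` (hypothetical, only for illustration). It captures the handle that `open` passes to the callback and checks afterwards that it has been closed. On Julia 1.0 and later `readstring(handle)` is spelled `read(handle, String)`, which is what the sketch uses.

```julia
# Hypothetical scratch file, used only for this illustration.
path, io = mktemp()
write(io, "hello")
close(io)

# Keep a reference to the handle so it can be inspected after `open` returns.
captured = Ref{Any}(nothing)

# The do-block is the callback form: open(f, path).
contents = open(path) do handle
    captured[] = handle
    read(handle, String)   # spelled readstring(handle) on Julia 0.5/0.6
end

@show contents
@show isopen(captured[])   # reports whether the handle is still open after the call

rm(path)
```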

It is still unclear, though, why many `readstring` calls lead to Too many open files. I can’t provide a minimal working example so far (the program that crashed with this is huge). I’ll try to do it asap.


#6

I found another place in the project where an `open` was not balanced with a `close`.
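
For readers hitting the same error, here is a sketch of what an unbalanced `open` can look like next to a balanced one. The function names and the scratch file are hypothetical and are not taken from the project discussed here.

```julia
# Leaky: the handle is never closed explicitly, so its descriptor stays in use
# until the garbage collector happens to finalize the stream.
function first_line_leaky(path)
    io = open(path)
    return readline(io)
end

# Balanced: the do-block form closes the handle when the block returns,
# and also when the body throws.
function first_line_balanced(path)
    open(path) do io
        readline(io)
    end
end

# Hypothetical scratch file for trying both variants.
path, io = mktemp()
write(io, "first\nsecond\n")
close(io)

leaky = first_line_leaky(path)
balanced = first_line_balanced(path)
rm(path)
```

Both variants return the same line; the difference only shows up when the leaky one is called many times in a loop and descriptors pile up faster than they are finalized.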


#7

For future reference: not sure what OS you are on, but at least on Linux it is easy to check /proc/self/fd and see which files are open; that should give you an idea of when/where it happens. I imagine Windows/Mac have similar tools.
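
A rough sketch of that check done from inside the Julia process itself (Linux only). The helper names `open_fd_count` and `open_fd_targets` are made up for this example; they rely only on `readdir` and `readlink` from Base.

```julia
# Each entry in /proc/self/fd is one descriptor held by the current process.
open_fd_count() = length(readdir("/proc/self/fd"))

# Resolve every descriptor to the file, pipe, or socket it points at.
function open_fd_targets()
    targets = String[]
    for name in readdir("/proc/self/fd")
        try
            push!(targets, readlink(joinpath("/proc/self/fd", name)))
        catch
            # The descriptor was closed between listing and readlink; skip it.
        end
    end
    return targets
end

before = open_fd_count()
path, io = mktemp()      # hypothetical scratch file; `io` is an open handle
during = open_fd_count()
close(io)
after = open_fd_count()
rm(path)

println("before=$before during=$during after=$after")
```

Logging `open_fd_count()` at a few points in a long-running program, or printing `open_fd_targets()` when the count looks too high, should narrow down where handles are being left open.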


#8

I’m on Linux. This will be helpful, thank you very much!


#9

`lsof` exists on many UNIX dialects (including macOS) and does this in a more general manner.