Massive `readstring` -> Too many open files

My program processes a large number of files using readstring. Eventually it dies with Too many open files. I guess the reason is the way the read function is implemented: it just calls open but never calls close, nor does it use a do block. Should I report a bug? Or am I misunderstanding something?

readstring appears to be properly closing the file https://github.com/JuliaLang/julia/blob/8eca02717bcea524877fad01aacedc807898bcd1/base/io.jl#L537 by using the callback version. Which function are you calling?

I thought the readstring being called was the one from base/process.jl. It invokes read, which, AFAICS, doesn’t call close.

I’m not sure how Julia decides which of the two readstring methods (io.jl vs. process.jl) to call.
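
Julia picks between the two by the type of the argument: a string path matches one method, a command object matches another. A minimal sketch for checking this, assuming Julia 1.0 or later, where `readstring(x)` has been replaced by `read(x, String)`. The `which` lookups only inspect the method table and do not read anything:

```julia
# Assumes Julia >= 1.0, where read(x, String) replaced readstring(x).
# `which` reports the method that dispatch would select for the given
# argument types, without calling it.
m_file = which(read, Tuple{String, Type{String}})  # path given as a string
m_cmd  = which(read, Tuple{Cmd, Type{String}})     # external command

println(m_file)  # the printed method includes its source file and line
println(m_cmd)
```

On the Julia version discussed in this thread, the interactive equivalent would presumably be `@which readstring("/path/to/file")`.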

If you are using the one for files, then it’s not the one for AbstractCmd. See the docs for how these work. The one for commands is closed after the read.

Now I see. It seems that calling readstring("/path/to/file") turns into the following chain of calls:

(1) readstring("/path/to/file") → 
(2) open(readstring, "/path/to/file") → 
(3) readstring(file_handle) → 
(4) read(file_handle) → ...

And open (2) does close the handle after readstring (3) returns.
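
The closing behaviour of step (2) can be checked directly. This is a sketch assuming Julia 1.0 or later, with `read(fh, String)` standing in for `readstring(fh)`; `captured` is a made-up name used only to keep a reference to the handle for inspection afterwards:

```julia
# Create a small temporary file to read back.
path, tmp = mktemp()
write(tmp, "hello")
close(tmp)

# Keep a reference to the handle so it can be inspected afterwards.
# `captured` is a hypothetical name used only for this check.
captured = Ref{IO}()

contents = open(path) do fh
    captured[] = fh
    read(fh, String)      # Julia >= 1.0 spelling of readstring(fh)
end

println(contents)
println(isopen(captured[]))   # reports whether the handle is still open

rm(path)
```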

It remains unclear, though, why many readstring calls lead to Too many open files. I can’t provide a minimal working example so far (the program that crashed with this is huge). I’ll try to do it ASAP.

I found another place in the project where an open was not balanced with a close.
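
For anyone hitting the same thing, an unbalanced open tends to look like the first function below. The function names are hypothetical and the project's real code is not shown in this thread; the other two functions are the usual ways to balance it:

```julia
# Hypothetical helper names, for illustration only.

# Unbalanced: the handle is never closed explicitly, so the descriptor
# stays open at least until the stream object is garbage collected.
function first_line_leaky(path)
    io = open(path)
    return readline(io)
end

# Balanced with try/finally: close runs even if readline throws.
function first_line_explicit(path)
    io = open(path)
    try
        return readline(io)
    finally
        close(io)
    end
end

# Balanced with a do block: open closes the handle when the block exits.
function first_line_do(path)
    open(path) do io
        readline(io)
    end
end
```

Called in a tight loop over many files, the first variant can exhaust the per-process descriptor limit before the garbage collector gets around to the abandoned handles.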

For future reference: I’m not sure what OS you are on, but at least on Linux it is easy to check /proc/self/fd and see which files are open. That should give you an idea of when and where it happens. I imagine Windows and Mac have similar tools.
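
The same check can be done from inside the Julia process. A Linux-only sketch, where `count_open_fds` is a hypothetical helper name and the function returns `nothing` where /proc/self/fd does not exist:

```julia
# Linux-only sketch: /proc/self/fd has one entry per open descriptor.
# `count_open_fds` is a hypothetical helper; it returns nothing on
# systems without /proc/self/fd.
function count_open_fds()
    dir = "/proc/self/fd"
    isdir(dir) || return nothing
    return length(readdir(dir))
end

before = count_open_fds()
# ... run the suspicious code here ...
after = count_open_fds()

if before !== nothing
    println("descriptors before: ", before, ", after: ", after)
end
```

Listing the directory may itself use a descriptor briefly, so the absolute number can be off by one; the difference between two calls is the useful part.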

I’m on Linux. This will be helpful, thank you very much!

lsof exists on many UNIX dialects (including Mac) and does this in a more general manner.
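
A sketch of calling it on the current Julia process, assuming Julia 1.0 or later and that `lsof` is on the PATH; `lsof_self` is a hypothetical helper name:

```julia
# Hypothetical helper: returns lsof's report for the current process,
# or nothing when lsof is not installed.
function lsof_self()
    Sys.which("lsof") === nothing && return nothing
    pid = getpid()
    # -p restricts the listing to the given process ID.
    # ignorestatus avoids an exception if lsof exits with a nonzero code.
    return read(ignorestatus(`lsof -p $pid`), String)
end

report = lsof_self()
report === nothing || print(report)
```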
