Error opening too many connections

xiaodai · October 1, 2019, 2:44pm

I am trying to write a parallel algorithm to write to many files. Is there a way to detect how many connections I can open? Or a way to rate limit my code?

ios = open.(
"file" .* string.(1:3000),
Ref("w"),
)

gives the error

ERROR: SystemError: opening file "a.jdf\\x2045": Too many open files
Stacktrace:
 [1] #systemerror#44(::Nothing, ::typeof(systemerror), ::String, ::Bool) at .\error.jl:134
 [2] systemerror at .\error.jl:134 [inlined]
 [3] #open#516(::Nothing, ::Nothing, ::Nothing, ::Bool, ::Nothing, ::typeof(open), ::String) at .\iostream.jl:254
 [4] #open at .\none:0 [inlined]
 [5] open(::String, ::String) at .\iostream.jl:310
 [6] _broadcast_getindex_evalf at .\broadcast.jl:630 [inlined]
 [7] _broadcast_getindex at .\broadcast.jl:603 [inlined]
 [8] _getindex at .\broadcast.jl:627 [inlined]
 [9] _broadcast_getindex at .\broadcast.jl:602 [inlined]
 [10] getindex at .\broadcast.jl:563 [inlined]
 [11] macro expansion at .\broadcast.jl:909 [inlined]
 [12] macro expansion at .\simdloop.jl:77 [inlined]
 [13] copyto! at .\broadcast.jl:908 [inlined]
 [14] copyto! at .\broadcast.jl:863 [inlined]
 [15] copy at .\broadcast.jl:839 [inlined]
 [16] materialize at .\broadcast.jl:819 [inlined]
 [17] savejdf(::String, ::DataFrame) at c:\git\JDF\src\JDF.jl:64
 [18] top-level scope at none:0

pixel27 · March 19, 2020, 1:22pm

I feel like this question is too open-ended. Each process can only have a finite number of files open at the same time. Sorry, there is no way around that. The real question is what you are doing with those open files.

Do you need them all open at once because you are reading from or writing to them simultaneously? If so, ouch, you are probably out of luck: you would have to implement something that opens and closes the files “as needed”, and the performance is probably going to suck.
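For what it's worth, the “as needed” approach could look roughly like the sketch below. `append_chunk` is a made-up helper name, not something from a package, and it assumes each file only ever needs appending to.

```julia
# Hypothetical helper (not from any package): append one chunk to a file,
# holding the file open only for the duration of the write.
function append_chunk(path, chunk)
    open(path, "a") do io   # "a" creates the file if needed and appends
        write(io, chunk)
    end
end

dir = mktempdir()
path = joinpath(dir, "file1")
append_chunk(path, "first,")
append_chunk(path, "second")
```

Only one file is open per call, so the limit is never an issue, but every write pays for an open and a close.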

Or can you “batch” your work so it only processes N files at a time? Then you can probably work something out, but you would have to update your code to process the data in these batches.
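A rough sketch of the batching idea, assuming the data for each file is already available up front. `write_in_batches` and the `batchsize` value are made up for illustration; pick a batch size comfortably below whatever your limit is.

```julia
# Hypothetical helper: write payloads[i] to paths[i], keeping at most
# `batchsize` files open at any moment.
function write_in_batches(paths, payloads; batchsize = 256)
    for idx in Iterators.partition(eachindex(paths), batchsize)
        ios = IOStream[]
        try
            # Open only the files that belong to this batch.
            for i in idx
                push!(ios, open(paths[i], "w"))
            end
            for (io, i) in zip(ios, idx)
                write(io, payloads[i])
            end
        finally
            # Close the whole batch before the next one is opened.
            foreach(close, ios)
        end
    end
end

dir = mktempdir()
paths = joinpath.(dir, "file" .* string.(1:1000))
payloads = string.(1:1000)
write_in_batches(paths, payloads; batchsize = 100)
```

The `finally` block closes whatever was opened even if one `open` or `write` fails partway through a batch.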

If you know the maximum number of files you will want open, you might be able to just change some limits.
On Linux the usual default limit is 1024 open files, but that can be changed. On Windows (which you appear to be on) I did find:

It seems to have a default of 512 and a hard limit of 8,192 files, and you might be able to change it with a ccall… maybe.

waralex · March 19, 2020, 5:24pm

On Linux and macOS you can see the limits with

> ulimit -a
-t: cpu time (seconds)              unlimited
-f: file size (blocks)              unlimited
-d: data seg size (kbytes)          unlimited
-s: stack size (kbytes)             8192
-c: core file size (blocks)         0
-m: resident set size (kbytes)      unlimited
-u: processes                       46553
-n: file descriptors                1024
-l: locked-in-memory size (kbytes)  16384
-v: address space (kbytes)          unlimited
-x: file locks                      unlimited
-i: pending signals                 46553
-q: bytes in POSIX msg queues       819200
-e: max nice                        0
-r: max rt priority                 0
-N 15:                              unlimited

And you can set the limit of max open files for the current shell session (up to the hard limit) with

ulimit -n <new limit>

waralex · March 19, 2020, 5:30pm

Are you sure that writing 3000 files at the same time is a good idea? I know several systems that need a higher ulimit -n to work, but they do not write to thousands of files at the same time; they keep descriptors open for their own needs, especially for reading.

johnh · March 19, 2020, 7:14pm

As @waralex says - what is the use case here? It sounds interesting!

xiaodai · March 19, 2020, 11:46pm

I don’t actually need 3000. I am just writing out the columns of a DataFrame, so I can limit the number of open files to the number of threads I have.
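Something like the sketch below is what I have in mind. It is not the actual JDF code: `write_columns` is a made-up name, the columns are plain name => vector pairs rather than a DataFrame, and the data is dumped as raw bytes just for illustration.

```julia
using Base.Threads: @threads

# Hypothetical helper: `columns` is a collection of name => vector pairs.
# Each loop iteration opens one file, writes one column, and closes the file,
# so each task started by @threads holds at most one open file at a time.
function write_columns(dir, columns)
    cols = collect(columns)
    @threads for i in eachindex(cols)
        name, data = cols[i]
        open(joinpath(dir, string(name)), "w") do io
            write(io, data)   # raw bytes of the vector
        end
    end
end

dir = mktempdir()
cols = [:a => collect(1:10), :b => collect(11:20)]
write_columns(dir, cols)
```

Since `open(...) do ... end` closes the file when the block exits, the number of simultaneously open files stays around the number of threads no matter how many columns there are.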
