Segfault while loading images in multiple threads

Hello,

I tried to process images in parallel but the load operation results in a Segmentation Fault. Is loading images not thread safe?

using Images, FileIO
Threads.@threads for path in readdir("img"; join=true)
  load(path)
end

The img directory contains only PNG images. Running the above script with a single thread works, but with multiple threads (julia --threads 4) it gives me a segmentation fault:

signal (11): Segmentation fault
in expression starting at none:1
in expression starting at none:1
unknown function (ip: 0x7fbd79b0d298)
unknown function (ip: 0x7fbd79b0d32d)
unknown function (ip: 0x7fbd79b0e80a)
unknown function (ip: 0x7fbd79b0d294)
unknown function (ip: 0x7fbd79b0d725)
unknown function (ip: 0x7fbd79b0d32d)
unknown function (ip: 0x7fbd79b0c8b7)
unknown function (ip: 0x7fbd79b0e80a)
unknown function (ip: 0x7fbd79b0d766)
unknown function (ip: 0x7fbd79b0d725)
unknown function (ip: 0x7fbd79b0d32d)
unknown function (ip: 0x7fbd79b0c8b7)
unknown function (ip: 0x7fbd79b0d32d)
unknown function (ip: 0x7fbd79b0d766)
unknown function (ip: 0x7fbd79b0e80a)
unknown function (ip: 0x7fbd79b0d32d)
unknown function (ip: 0x7fbd79b0d725)
unknown function (ip: 0x7fbd79b0d32d)
unknown function (ip: 0x7fbd79b0d9ec)
unknown function (ip: 0x7fbd79b0e80a)
unknown function (ip: 0x7fbd79b0dafc)
unknown function (ip: 0x7fbd79b0d725)
unknown function (ip: 0x7fbd79b0ea0a)
unknown function (ip: 0x7fbd79b0d9ec)
unknown function (ip: 0x7fbd79b12347)
unknown function (ip: 0x7fbd79b0dafc)
jl_restore_incremental at /usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7fbd79b0ea0a)
unknown function (ip: 0x7fbd79b12347)
jl_restore_incremental at /usr/bin/../lib/libjulia.so.1 (unknown line)
_include_from_serialized at ./loading.jl:681
_require_search_from_serialized at ./loading.jl:782
_require_search_from_serialized at ./loading.jl:782
Errors encountered while loading "/path/img/000811.png".
All errors:Errors encountered while loading "/path/img/000001.png".

===========================================Errors encountered while loading "/path/img/001081.png".

All errors:
===========================================
All errors:
===========================================
_require at ./loading.jl:1007
_require at ./loading.jl:1007
require at ./loading.jl:928
require at ./loading.jl:928
require at ./loading.jl:923
require at ./loading.jl:923
unknown function (ip: 0x7fbd79b1cac0)
unknown function (ip: 0x7fbd79b1cac0)
unknown function (ip: 0x7fbd79b1e16e)
unknown function (ip: 0x7fbd79b1e16e)
jl_toplevel_eval_in at /usr/bin/../lib/libjulia.so.1 (unknown line)
jl_toplevel_eval_in at /usr/bin/../lib/libjulia.so.1 (unknown line)
eval at ./boot.jl:331 [inlined]
topimport at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:13
eval at ./boot.jl:331 [inlined]
topimport at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:13
checked_import at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:30
checked_import at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:30
#load#28 at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:195
#load#28 at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:195
load at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:184 [inlined]
#load#14 at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:133 [inlined]
load at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:133 [inlined]
macro expansion at /path/open_images.jl:4 [inlined]
#3#threadsfor_fun at ./threadingconstructs.jl:81
load at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:184 [inlined]
#load#14 at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:133 [inlined]
load at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:133 [inlined]
macro expansion at /path/open_images.jl:4 [inlined]
#3#threadsfor_fun at ./threadingconstructs.jl:81
#3#threadsfor_fun at ./threadingconstructs.jl:48
#3#threadsfor_fun at ./threadingconstructs.jl:48
unknown function (ip: 0x7fbd49e314ac)
unknown function (ip: 0x7fbd49e314ac)
unknown function (ip: 0x7fbd79b053b9)
unknown function (ip: 0x7fbd79b053b9)
unknown function (ip: (nil))
nknown function (ip: (nil))
Allocations: 11229080 (Pool: 11225888; Big: 3192); GC: 8
Allocations: 11229080 (Pool: 11225888; Big: 3192); GC: 8
Segmentation fault (core dumped)

I’m using Julia version 1.5.2.

Do you use the lock=true argument in Base.open?

I just tried this and got a similar error:

Threads.@threads for path in readdir("img"; join=true)
    open(path, "r"; lock = true) do io
        load(io)
    end
end
Errors encountered while loading nothing.
All errors:
===========================================
Errors encountered while loading nothing.
Errors encountered while loading nothing.All errors:

===========================================Errors encountered while loading nothing.

Errors encountered while loading nothing.All errors:

All errors:
===========================================
All errors:
===========================================
===========================================
Errors encountered while loading nothing.
All errors:
===========================================
concurrency violation detectedconcurrency violation detected
===========================================

===========================================concurrency violation detected

===========================================concurrency violation detected

===========================================
Fatal error:


Fatal error:
concurrency violation detected
===========================================
concurrency violation detected
===========================================

Fatal error:
concurrency violation detectedconcurrency violation detected
===========================================

===========================================concurrency violation detected

===========================================concurrency violation detected

Fatal error:
===========================================


Fatal error:
concurrency violation detected
===========================================
concurrency violation detected
===========================================

Fatal error:

signal (11): Segmentation fault
in expression starting at none:1
unknown function (ip: 0x7f01122bfe6c)
unknown function (ip: 0x7f01122c6817)
jl_restore_incremental at /usr/bin/../lib/libjulia.so.1 (unknown line)
_include_from_serialized at ./loading.jl:681
_require_search_from_serialized at ./loading.jl:782
_tryrequire_from_serialized at ./loading.jl:712
_require_search_from_serialized at ./loading.jl:771
_tryrequire_from_serialized at ./loading.jl:712
_require_search_from_serialized at ./loading.jl:771
_require at ./loading.jl:1007
require at ./loading.jl:928
require at ./loading.jl:923
unknown function (ip: 0x7f01122d0ac0)
unknown function (ip: 0x7f01122d216e)
jl_toplevel_eval_in at /usr/bin/../lib/libjulia.so.1 (unknown line)
eval at ./boot.jl:331 [inlined]
topimport at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:13
checked_import at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:30
#load#28 at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:195
load at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:184 [inlined]
#load#14 at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:133 [inlined]
load at /home/someuser/.julia/packages/FileIO/wN5rD/src/loadsave.jl:133 [inlined]
#1 at /path/open_images.jl:5 [inlined]
#open#287 at ./io.jl:325
open##kw at ./io.jl:323 [inlined]
macro expansion at /path/open_images.jl:4 [inlined]
#3#threadsfor_fun at ./threadingconstructs.jl:81
#3#threadsfor_fun at ./threadingconstructs.jl:48
unknown function (ip: 0x7f00da62ef3c)
unknown function (ip: 0x7f01122b93b9)
unknown function (ip: (nil))
Allocations: 13377958 (Pool: 13373909; Big: 4049); GC: 10
Segmentation fault (core dumped)

Perhaps you could give my new actor library YAActL a try (see the announcement):

The following installs a very simple file server:

julia> using YAActL, FileIO

julia> fs = Actor(load)
Channel{YAActL.Message}(sz_max:32,sz_curr:0)

then you can do:

julia> Threads.@threads for file in readdir("img"; join=true)
           img = call!(fs, file)
           # then do the processing in parallel
       end

This is the same as your code above, but it works (without a lock). It opens the files sequentially and serves the content over the fs channel. Then the threads can compute in parallel. This is thread-safe by design.

It works, but if I understand correctly the loading and decoding is now done in a single thread. Can’t Julia load images in parallel?

Thank you for trying!

  1. The loading is done in one thread (the one the actor resides on). It serves the images over the fs channel to the threads that called it.
  2. The threads then do the processing of the images (in your case 4 at a time) in parallel.
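For readers without YAActL, the two steps above can be approximated with a plain Channel from Base. This is only a rough sketch of the same idea, not how YAActL is implemented; the function name serve_and_process and its loader/process arguments are made up for illustration.

```julia
# One task loads files sequentially and feeds a buffered channel;
# several worker tasks take items from the channel and process them in parallel.
function serve_and_process(loader, process, paths; nworkers = Threads.nthreads())
    # The producer task is bound to the channel, so the channel closes
    # automatically once every path has been served.
    served = Channel{Any}(32) do ch
        for path in paths
            put!(ch, (path, loader(path)))   # all file access happens in this one task
        end
    end

    results = Dict{String,Any}()
    results_lock = ReentrantLock()           # protects the shared Dict

    @sync for _ in 1:nworkers
        Threads.@spawn for (path, item) in served
            out = process(item)              # runs concurrently on the worker tasks
            lock(results_lock) do
                results[path] = out
            end
        end
    end
    return results
end
```

With FileIO loaded, a call could look like serve_and_process(load, img -> size(img), readdir("img"; join=true)), where size stands in for the real processing.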

This only makes sense if the processing takes significantly longer than the loading; otherwise you won’t gain much from multithreading (see Amdahl’s law). The reading of the files can’t be parallelized. It’s the same with a lock: it creates a queue that makes tasks wait until they can read the file.

You can parallelize all operations after the file access, since only the reading needs to happen in a single thread. But this must be implemented in a library or by the user. Amdahl’s law suggests keeping the part that cannot be parallelized as small as possible.
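To put numbers on the Amdahl’s law argument, here is a small sketch. The 80/20 split between parallel and serial work is a made-up example, not a measurement from this thread.

```julia
# Amdahl's law: p is the fraction of the runtime that can be parallelized,
# n is the number of threads. The serial fraction (1 - p) bounds the speedup.
amdahl_speedup(p, n) = 1 / ((1 - p) + p / n)

# Hypothetical workload: 20% serial file reading, 80% parallel processing.
for n in (1, 2, 4, 8)
    println(n, " threads: ", round(amdahl_speedup(0.8, n); digits = 2), "x")
end
```

Even with unlimited threads, this hypothetical workload could never get faster than 1 / 0.2 = 5x, which is why the serial part should be kept small.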

That’s probably true in most situations, including mine, but there’s still the decoding step that could be parallelized. I guess I could separate the loading and decoding, but that doesn’t seem to be very easy since load only accepts a filename or a stream.
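One possible way to separate the two steps is to read the raw bytes serially and hand each byte vector to a decoder inside a threaded loop. The sketch below is untested against real images; read_then_decode and its decode argument are hypothetical names.

```julia
# Read all files serially as raw bytes, then decode them in parallel.
function read_then_decode(decode, paths)
    raw = [read(p) for p in paths]             # serial part: plain byte reads
    out = Vector{Any}(undef, length(raw))
    Threads.@threads for i in eachindex(raw)   # parallel part: CPU-bound decoding
        out[i] = decode(raw[i])
    end
    return out
end
```

Since load accepts a stream, something like read_then_decode(bytes -> load(IOBuffer(bytes)), paths) might work. That assumes FileIO can detect the format from the stream contents and that the decoding package has already been loaded before the threaded loop starts.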

Anyway, I was able to run my process. And, surprisingly, saving the resulting images in multiple threads does work.

Thank you for your help.

Please remind me what makes that impossible.

I perhaps should have written that it is not practical:

Hardware file access works sequentially. If you allow parallel cores to access the file system concurrently, then somewhere in between you have to manage the switching, continually saving and restoring the state of each concurrent file access. This is possible, but it involves considerable complexity and may be error-prone.

See also: A related stack overflow question
Interesting too: Multi-threaded File IO

So no reason not to try to split the load over a couple of threads then, in particular as there’s decoding to do as well. It seems to me like there’s some thread unsafety in FileIO because reading directly with PNGFiles works fine for me.

using PNGFiles
filenames = filter(x->endswith(x, ".png"), readdir("."; join=true));
x = Vector{Any}(undef, length(filenames));
@sync for (i, path) in enumerate(filenames)
    Threads.@spawn begin
        x[i] = PNGFiles.load(path)
    end
end

This is probably not a good way to get performance but at least there should be no fundamental problem with multithreading it.

Update: When I try this with 100 png files of size 4096x4096 and 8 threads I get a factor 4 speedup over sequential read. Seems fairly decent.
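For anyone who wants to repeat this kind of comparison, a minimal timing sketch follows. The helper name time_loading is made up, and the numbers will depend heavily on disk, image size, and thread count.

```julia
# Compare serial and threaded loading times for the same list of paths.
function time_loading(loader, paths)
    loader(first(paths))                       # warm-up, so compilation is not timed
    t_serial = @elapsed foreach(loader, paths)
    t_threaded = @elapsed begin
        Threads.@threads for p in paths
            loader(p)
        end
    end
    return (serial = t_serial, threaded = t_threaded, speedup = t_serial / t_threaded)
end
```

A call such as time_loading(PNGFiles.load, filenames) would return both times and their ratio. Keep in mind that a second pass over the same files may be served from the operating system's file cache, which can flatter whichever variant runs last.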

FileIO and ImageIO do lazy loading of the underlying file IO packages, so what may be happening is multiple threads trying to load the package at the same time, if it hasn’t happened already in the session.
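A generic way to make this kind of lazy, first-use initialization safe is to guard it with a lock, so that only one task runs the initialization and the others wait for the result. The sketch below illustrates the pattern in plain Julia; it is not taken from FileIO or ImageIO, and make_lazy is a hypothetical helper.

```julia
# Wrap an expensive one-time initialization so that concurrent callers
# cannot run it more than once.
function make_lazy(init)
    lk = ReentrantLock()
    value = Ref{Any}(nothing)
    ready = Ref(false)
    return function ()
        lock(lk) do
            if !ready[]
                value[] = init()   # only the first caller gets here
                ready[] = true
            end
            value[]                # later callers just read the cached value
        end
    end
end

# Stand-in for "import the backend package on first use".
get_backend = make_lazy(() -> :PNGFiles)
```

Every thread can then call get_backend() freely; the first call initializes and the rest reuse the stored value.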

The first example may work if the underlying IO package is first loaded on the main thread, either by invoking a single image load first, or by explicitly loading ImageIO, PNGFiles, ImageMagick, etc. with using:

using Images, FileIO, ImageIO, PNGFiles
Threads.@threads for path in readdir("img"; join=true)
  load(path)
end

Note that if the file format changes, it may try to load different packages and hit the same problem.
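The other variant mentioned above, invoking a single load before the threaded loop, could look roughly like this. The function name load_all is hypothetical, and the loader is passed in as an argument so the same pattern works with FileIO's load or any other loading function.

```julia
# Load the first file alone so any lazy package loading happens on a single
# thread, then load the remaining files in parallel.
function load_all(loader, paths)
    isempty(paths) && return Any[]
    out = Vector{Any}(undef, length(paths))
    out[1] = loader(paths[1])                  # serial warm-up call
    Threads.@threads for i in 2:length(paths)
        out[i] = loader(paths[i])
    end
    return out
end
```

With using Images, FileIO in place, the call would be load_all(load, readdir("img"; join=true)). The caveat about mixed formats still applies: the warm-up only covers the format of the first file.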

These two PRs should fix the original example with the ImageIO backend. Only the first is needed if you’re using either the ImageMagick or QuartzImageIO backend.
https://github.com/JuliaIO/FileIO.jl/pull/276
https://github.com/JuliaIO/ImageIO.jl/pull/11

Master on both packages should now be fixed.

@sigmike, would you mind checking with your setup? We can release if it’s fixed.

In a fresh session:

pkg> add FileIO#master ImageIO#master
julia> using Images, FileIO
julia> Threads.@threads for path in readdir("img"; join=true)
    load(path)
end

Thanks!

Yes it works. And it was about 3 times faster to load all the images with 4 threads than with 1.

Thank you.

Great, thanks!