Benchmark tools breaks when using "Threads.@threads"?

Ahmed_Salih · April 27, 2019, 6:54pm

Hey guys!

I am trying to benchmark a function in a package I made. The package can be found here, https://github.com/AhmedSalih3d/PostSPH.jl, and the function inside of it I am testing is readVtkArray. When I run the function alone everything works fine, but if I try to do anything with @benchmark or @btime I see:

using PostSPH
using BenchmarkTools
cd(raw"path-with-vtk-files")
@btime k = readVtkArray("parts",Cat(2))

I get the error:

Error in file number 3

Error thrown in threaded loop on thread 1: MethodError(f=typeof(Base.convert)(), args=(Array{Float32, N} where N, 1.#QNAN), world=0x0000000000006420)Error in file number 6

Error thrown in threaded loop on thread 2: MethodError(f=typeof(Base.convert)(), args=(Array{Float32, N} where N, 1.#QNAN), world=0x0000000000006420)

This does not occur if I run it with @time or with nothing at all. There I get the expected result:

k = readVtkArray("parts",Cat(2))
11-element Array{Array{Float32,N} where N,1}:
 [0.0 0.0 0.0; 0.0 0.0 0.0; … ; 0.0 0.0 0.0; 0.0 0.0 0.0]
 [0.0 0.0 0.0; 0.0 0.0 0.0; … ; 0.00718685 0.0 0.00367148; -0.00353015 0.0 4.91067e-6]
 [0.0 0.0 0.0; 0.0 0.0 0.0; … ; -0.000894189 0.0 0.000899359; 0.000442034 0.0 0.00166281]
 [0.0 0.0 0.0; 0.0 0.0 0.0; … ; -0.000594841 0.0 -0.00052736; -0.000520424 0.0 0.00080863]
 [0.0 0.0 0.0; 0.0 0.0 0.0; … ; -0.000276363 0.0 -0.000111521; -0.000175042 0.0 0.000154482]
 [0.0 0.0 0.0; 0.0 0.0 0.0; … ; -5.00944e-5 0.0 -0.00012329; -0.000177598 0.0 6.39014e-5]
 [0.0 0.0 0.0; 0.0 0.0 0.0; … ; -3.33737e-5 0.0 -4.81802e-5; -0.000104715 0.0 7.41634e-6]
 [0.0 0.0 0.0; 0.0 0.0 0.0; … ; -0.000103613 0.0 -4.02584e-5; -9.69871e-5 0.0 2.61577e-5]
 [0.0 0.0 0.0; 0.0 0.0 0.0; … ; -4.87995e-5 0.0 -4.44328e-5; -6.59069e-5 0.0 5.18157e-6]
 [0.0 0.0 0.0; 0.0 0.0 0.0; … ; -4.91999e-5 0.0 -2.78963e-5; -1.20714e-5 0.0 1.5948e-5]
 [0.0 0.0 0.0; 0.0 0.0 0.0; … ; -6.2435e-6 0.0 2.14577e-5; 1.46149e-5 0.0 -1.46728e-5]

I don’t know how to make a minimal working example for this, so I made a Dropbox folder with some .vtk files, if anyone wants to test for themselves. The GitHub link contains a README with installation instructions, and the source code is in src.

A .vtk file is a Visualization Toolkit file, used for example to visualize simulations in ParaView. If anyone could point me to why this would occur when using “Threads.@threads”, I would be very happy.

Kind regards

kristoffer.carlsson · April 27, 2019, 7:08pm

Are you using IO in the threaded loop?

Ahmed_Salih · April 27, 2019, 7:21pm

Yes, at least I think so. Inside this code snippet:

k = Vector{Array{catType[typ]}}(undef, nFilenames)
Threads.@threads for i = 1:nFilenames::Number
    try
        @inbounds k[i] = readVtk(filenames[i], typ,PosTyp)
    catch
        #Since DualSPHysics starts from 0000 - Test
        println("Error in file number ",i-1)
        @inbounds k[i] = NaN
    end
end

I use threads on the for loop in which I call “readVtk” which opens an IOStream.

kristoffer.carlsson · April 27, 2019, 8:05pm

IO in threaded regions is, AFAIU, not supported.

Ahmed_Salih · April 27, 2019, 8:10pm

Ah okay, how would I go about benchmarking then? Benchmarking has sometimes worked when the files were bigger, which seems weird. By using @time and documenting the results?

Kind regards

yuyichao · April 27, 2019, 8:19pm

Your problem is likely not related to the benchmark. It is your code that uses threads, not the benchmark.

It’s also likely not IO related. For one it’ll usually crash. This particular printing also won’t run unless you have an error so it won’t affect working code.

You most likely have another race condition or bug in your code. The benchmark code simply runs the code many more times than you usually do, and that exposes the bug.
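A minimal sketch of how the underlying failure could be surfaced, based only on the snippet above. The bare catch discards whatever exception readVtk threw, and the fallback `k[i] = NaN` tries to store a scalar in a vector of arrays, which appears to be what produces the MethodError on convert in the log. Here `fake_read` is a hypothetical stand-in for readVtk, and the 0×0 placeholder array is an assumption about what a sensible fallback might be, not the package's behavior:

```julia
# Hypothetical stand-in for readVtk: fails on one "file" to mimic a bad input.
function fake_read(i)
    i == 3 && error("could not parse file $i")
    return zeros(Float32, 2, 3)
end

# Same kind of container as in the snippet: a vector whose elements are arrays.
k = Vector{Array{Float32}}(undef, 4)

# Storing a scalar NaN fails, because a Float64 cannot be converted to an Array.
nan_fails = try
    k[1] = NaN
    false
catch err
    err isa MethodError
end

# Alternative: keep the exception for later inspection and store a placeholder
# of the right type instead of NaN.
failures = Vector{Union{Nothing,Exception}}(nothing, length(k))
Threads.@threads for i in eachindex(k)
    try
        k[i] = fake_read(i)
    catch err
        failures[i] = err                    # each iteration writes only its own slot
        k[i] = Array{Float32}(undef, 0, 0)   # empty array as placeholder
    end
end

# Indices of the inputs that failed, plus the reason for each.
failed = findall(!isnothing, failures)
for i in failed
    println("file index ", i, " failed: ", failures[i])
end
```

With the real exception printed, it should be easier to tell whether the failure comes from the input files or from the threading.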

Ahmed_Salih · April 27, 2019, 8:49pm

Are you sure? Couldn’t it be because the benchmarking tool tries to run several instances of the same code at once, and therefore ends up reading from the same file multiple times?

Kind regards

yuyichao · April 27, 2019, 9:12pm

Well, only you know your code so if you cannot run your code multiple times then you cannot use benchmark tools.

In any case, that’s still unrelated to the interaction between threading and benchmark.

Ahmed_Salih · April 27, 2019, 9:15pm

In theory it is unrelated I guess, but I expected that to benchmark it would run my code once, finish everything, then start over, finish everything and so on. Seems like it just spawns multiple instances and then tries to run them all at once, and that is why my code would break - as far as I understand.

Then I guess making a for loop with @time and saving maybe 50 iterations would give a good estimate?
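A rough sketch of that manual approach, assuming a hypothetical `work()` function in place of the real readVtkArray call. It uses `@elapsed` rather than `@time` so that each timing is returned as a number and can be stored:

```julia
using Statistics  # standard library, provides median

# Hypothetical workload standing in for readVtkArray("parts", ...).
work() = sum(sqrt, 1:100_000)

work()  # warm-up call so that compilation time is not measured

# Run the code 50 times, strictly one after another, and keep each timing.
times = [@elapsed(work()) for _ in 1:50]

println("minimum: ", minimum(times), " s")
println("median:  ", median(times), " s")
```

The minimum and median are usually more informative than the mean here, since occasional GC pauses can inflate individual samples.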

ffevotte · April 27, 2019, 9:25pm

Everything seems to work on my machine:

sh$ JULIA_NUM_THREADS=4 julia

julia> include("readVtk_readbytesslow.jl")
readVtkArray (generic function with 1 method)

julia> for _ in 1:10_000
         readVtkArray("parts")
       end

julia> using BenchmarkTools
julia> @btime readVtkArray("parts");
  5.204 ms (454 allocations: 994.30 KiB)

Could you run the same kind of tests and report whether you have errors when stress-testing with @btime and/or a simple for loop? And maybe let the number of threads vary if you can?

yuyichao · April 27, 2019, 9:29pm

No that is not happening. As I said, all the threads are in your code.

Ahmed_Salih · April 27, 2019, 9:30pm

Hmmm, okay I guess that Threads.@threads might not be the problem then… what you are running is an older version of my code simplified to only read one specific array type. I will try testing different array types now, thanks!

Ahmed_Salih · April 27, 2019, 9:37pm

@ffevotte I tested the same array now with my PostSPH package and it still gives an error for me. Would you kindly check with my Github package? The command is just:

@btime k = readVtkArray("parts",PostSPH.Idp)

Ahmed_Salih · April 27, 2019, 9:38pm

What you are saying makes sense to me now, thanks for explaining.

ffevotte · April 27, 2019, 9:45pm

Yep, this version sometimes fails. And a simple for loop is enough to trigger the error:

$ JULIA_NUM_THREADS=4 julia

julia> using PostSPH

julia> for _ in 1:10_000
         k = PostSPH.readVtkArray("parts",PostSPH.Idp)
       end
Error in file number 0

Error thrown in threaded loop on thread 0: MethodError(f=typeof(Base.convert)(), args=(Array{Int32, N} where N, nan), world=0x00000000000063ea)Error in file number 3

Error thrown in threaded loop on thread 1: MethodError(f=typeof(Base.convert)(), args=(Array{Int32, N} where N, nan), world=0x00000000000063ea)Error in file number 3

signal (11): Segmentation fault
in expression starting at no file:0
unknown function (ip: 0x7f5a3323a63d)
unknown function (ip: 0xcd6dae54cd0ef76c)
Allocations: 11194468 (Pool: 10923590; Big: 270878); GC: 142
Segmentation fault

This confirms that, as @yuyichao said, the error is not related to @btime. And hopefully the differences between the “old” and “new” versions give hints as to where to look for an error…

Ahmed_Salih · April 27, 2019, 9:56pm

Yeah, I just went looking through my code, and it seems the difference might be that in the GitHub version I use “readuntil” and “read”, while in the bare-minimum file on Dropbox I use “readuntil” and “readbytes”. I chose the first approach since it is much faster. I have a hard time spotting other major differences. At least I know now that it is not Threads.@threads which caused the error.

Thanks for your time. I don’t really understand how it can work properly without benchmarking and then suddenly fail when I start to benchmark, but maybe in a few weeks I will find out why.

Kind regards

Ahmed_Salih · April 27, 2019, 10:02pm

Okay, I went and checked on a different data set and now it works:

@benchmark k = readVtkArray("PartAll",PostSPH.Points)
BenchmarkTools.Trial:
  memory estimate:  643.19 MiB
  allocs estimate:  24552
  --------------
  minimum time:     418.663 ms (0.00% GC)
  median time:      447.024 ms (0.00% GC)
  mean time:        562.856 ms (21.67% GC)
  maximum time:     915.151 ms (43.00% GC)
  --------------
  samples:          9
  evals/sample:     1

Seems like there was an error in the Dropbox files. The reason the code in Dropbox did not fail might be that readbytes will always give some output, unlike the version on GitHub. Thanks for your time guys, sorry that it was a dumb mistake in the end, I learned a lot.

ffevotte · April 27, 2019, 10:03pm

Let me try to explain this once more: your code is probably flawed in such a way that it fails roughly once every 10_000 runs because of a race condition between threads that occurs only rarely. Your code does not “work properly”. It merely seems to work most of the time in regular use.

The easiest way to expose the race condition is to run your code a large number of times, until the chain of events that leads to it is triggered. This is something that @btime happens to do, but you can also reproduce it with a simple for loop.
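A small sketch of such a stress loop, which records exceptions instead of stopping at the first one. Both `stress` and `make_flaky` are hypothetical helpers written for this illustration. The flaky function fails deterministically on every 1000th call, so it only imitates code that breaks rarely and is not itself a race condition:

```julia
# Run f() n times in sequence and return whatever was thrown along the way.
function stress(f, n)
    errors = Any[]
    for _ in 1:n
        try
            f()
        catch err
            push!(errors, err)
        end
    end
    return errors
end

# Hypothetical flaky function: fails on every `period`-th call, standing in
# for code that only breaks occasionally.
function make_flaky(period)
    calls = Ref(0)
    return function ()
        calls[] += 1
        calls[] % period == 0 && error("failure on call $(calls[])")
        return calls[]
    end
end

flaky = make_flaky(1000)
errs = stress(flaky, 10_000)
println(length(errs), " failures in 10_000 runs")
```

Note that a loop like this can only collect Julia exceptions. A hard crash such as the segmentation fault shown earlier would still terminate the process.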
