Help needed getting started with threads!

I removed it and the termination code disappeared! :person_shrugging:

I guess I should check that the last few Excel files were actually created before the script terminated…

And you’re positive the code does not appear in the equivalent non-threaded version?

Maybe XLSX.jl is not thread-safe and some global resource is getting corrupted? I’m afraid I can’t be of much further help here…

There were already lots of helpful discussions here, but I just wanted to drop this in case it helps getting the overall design of the multi-threaded part right:

(But it is also a good idea to check whether the library you are using has some problems with threading.)


And you’re positive the code does not appear in the equivalent non-threaded version?

Yes, certain.

Maybe XLSX.jl is not thread-safe and some global resource is getting corrupted? I’m afraid I can’t be of much further help here…

Maybe…

(But it is also a good idea to check whether the library you are using has some problems with threading.)

I don’t know how to do this (other than trying to use it and failing).

a blog post tutorial

Thank you for this. During my trials yesterday I had already found and read this.

Your first example says:

@time begin           
    task1 = @spawn (println(threadid()); sleep(1))    
    task2 = @spawn (println(threadid()); sleep(1))
    wait.([task1, task2])
end

I don’t understand the purpose of sleep and wait here. I would have thought it was to do what @sync does, but in a later example you have both sleep and @sync.

Just to be clear, all the credit for the blog post with the examples and nice explanations goes to @Satvik but I think I can answer at least part of your questions:

sleep(1) just does nothing for 1 second (during this time Julia can switch to other tasks, because the sleeping task has nothing to do). That simulates a function call that would take around 1 second to complete. If you were to only run println(threadid()) you couldn’t really tell whether the multi-threading worked or not, since the code executes in the blink of an eye.

Instead of the whole expression (println(threadid()); sleep(1)) you would put whatever function you want to execute in each Task.

wait blocks the calling code until the given tasks have completed (otherwise the main Julia process where you create the tasks could already exit and the tasks would be “lost”). @sync essentially does the same, but with nicer syntax (it’s easy to forget to call wait for one of many tasks, but @sync covers all of them).
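
To make the comparison concrete, here is a minimal sketch of both variants side by side. The function slow_square is a made-up stand-in for whatever real work each task would do, and the 0.5 second sleep is only there to make it "slow".

```julia
using Base.Threads: @spawn

# Hypothetical stand-in for real work: pretend each call takes a while.
slow_square(x) = (sleep(0.5); x^2)

# Variant 1: keep the Task handles and wait for each of them explicitly.
tasks = map(1:4) do i
    @spawn slow_square(i)
end
foreach(wait, tasks)           # blocks until every task has finished
results_wait = fetch.(tasks)   # fetch returns the value each task computed

# Variant 2: @sync waits for every @spawn that appears inside its block.
results_sync = zeros(Int, 4)
@sync for i in 1:4
    @spawn results_sync[i] = slow_square(i)
end

println(results_wait)
println(results_sync)
```

Both variants end up with the same results; the difference is only that with @sync there is no list of tasks to keep track of.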


I don’t know enough about the inner workings of the package (or Julia) either, I’m afraid. At least the exit code means nothing to me.

Can you create a minimal and self-contained example that consistently produces the error for you? Sharing this here or in an issue on GitHub is probably the best way forward.

It looks like simply creating lots of Excel files is not the main issue; at least this example seems to run fine. (I took the content from the documentation of XLSX.jl and added a sleep(2) in the function that writes the file, to see whether it makes any difference if many files are (potentially) open at the same time. It didn’t make a difference for me.)

using XLSX

function write_test_content(filepath)
    XLSX.openxlsx(filepath, mode="w") do xf
        sheet = xf[1]
        XLSX.rename!(sheet, "new_sheet")
        sheet["A1"] = "this"
        sheet["A2"] = "is a"
        sheet["A3"] = "new file"
        sheet["A4"] = 100

        sleep(2)

        # will add a row from "A5" to "E5"
        sheet["A5"] = collect(1:5) # equivalent to `sheet["A5", dim=2] = collect(1:5)`

        # will add a column from "B1" to "B4"
        sheet["B1", dim=1] = collect(1:4)

        # will add a matrix from "A7" to "C9"
        sheet["A7:C9"] = [ 1 2 3 ; 4 5 6 ; 7 8 9 ]
    end
end

mkpath("test_xlsx_files")

Threads.@sync for i in 1:100
    Threads.@spawn write_test_content(joinpath("test_xlsx_files", "file_$(i).xlsx"))
end

I can see all the files with the correct content. Do you get the same behavior? If yes, then the problem has more likely something to do with the content or the way it is written to the files in your case.

EDIT: This is the version of Julia/the package I used:

julia> versioninfo()
Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (x86_64-apple-darwin22.4.0)
  CPU: 8 × Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, icelake-client)
Threads: 4 default, 0 interactive, 2 GC (on 8 virtual cores)
  [fdbf4ff8] XLSX v0.10.2

sleep isn’t doing any useful work in any of the examples; the point is just to demonstrate the characteristics of Threads.@spawn. sleep is a function that requires essentially no processing and yields right away. Even if you’re only running on a single core, several concurrent copies of sleep(1) will still take only ~1 second in total. This is similar to waiting on file reads or network requests.

hash_lots is the opposite: a function that uses the CPU the whole time, and therefore only speeds up with the number of cores. The examples try to show that Threads.@spawn does concurrency, parallelization, and task switching.
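
A rough way to see the difference on your own machine is sketched below. Note that busy_hash is a hypothetical stand-in written for this post, not the hash_lots from the blog, and the iteration count is arbitrary; the timings will depend on how many threads Julia was started with.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical CPU-bound stand-in: repeatedly hashes a value,
# so it keeps one core busy for the whole call.
function busy_hash(n)
    h = UInt(0)
    for i in 1:n
        h = hash(i, h)
    end
    return h
end

# Eight concurrent sleeps: each task yields right away, so the total should
# be roughly 1 second no matter how many threads are available.
t_sleep = @elapsed @sync for _ in 1:8
    @spawn sleep(1)
end

# Eight CPU-bound tasks: these can only overlap on separate threads, so the
# total should shrink as nthreads() grows (e.g. start with `julia -t 4`).
busy_hash(10)  # warm-up call so that compilation is not part of the timing
t_cpu = @elapsed @sync for _ in 1:8
    @spawn busy_hash(20_000_000)
end

println("threads = ", nthreads(),
        ", sleep: ", round(t_sleep; digits=2), " s",
        ", cpu: ", round(t_cpu; digits=2), " s")
```

Running it once with a single thread and once with several should show the sleep timing staying about the same while the CPU timing changes.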

For your specific problem, one thing I see in XLSX.jl is that it sometimes uses Dicts as caches under the hood (see e.g. XLSX.jl/src/types.jl at master · felipenoris/XLSX.jl · GitHub). Dicts aren’t threadsafe, so that might be why your code is crashing. Are you getting segmentation faults? Unfortunately there’s no particularly easy workaround if that’s the case.
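
For state you control yourself, the usual pattern is to guard the shared Dict with a lock, as in the sketch below. The names cached_square!, cache and cache_lock are made up for illustration. Whether XLSX.jl's caches are actually shared between files is an assumption on my part, and you cannot add a lock inside the library from the outside; the closest equivalent would be wrapping the library calls themselves in a lock, which serializes the writing.

```julia
using Base.Threads: @spawn

# Do the "expensive" work in parallel, but let only one task at a time
# touch the shared Dict.
function cached_square!(cache, cache_lock, i)
    value = i^2                # work that is safe to run concurrently
    lock(cache_lock) do
        cache[i] = value       # serialized write to the shared Dict
    end
    return value
end

# Hypothetical shared cache, standing in for library-internal state.
cache = Dict{Int,Int}()
cache_lock = ReentrantLock()

@sync for i in 1:1_000
    @spawn cached_square!(cache, cache_lock, i)
end

println(length(cache))
```

Without the lock, concurrent writes to the same Dict can corrupt it, and the failure may show up much later and somewhere unrelated.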


No. No stacktrace or error message at all, just the termination code I reported before: terminated with exit code: -1073740940.

 *  The terminal process "C:\Users\TGebbels\.julia\juliaup\julia-1.10.5+0.x64.w64.mingw32\bin\julia.exe '--color=yes', '--startup-file=no', '--history-file=no', 'c:\Users\TGebbels\.vscode\extensions\julialang.language-julia-1.120.2\scripts\debugger\run_debugger.jl', '\\.\pipe\vsc-jl-dbg-8456b44a-84e1-4207-bbe6-e4ef97a4444c', '\\.\pipe\vsc-jl-dbg-a9faa164-10bc-449a-b912-b6604e377fb2', '\\.\pipe\vsc-jl-cr-298a8501-1dec-4ec6-a109-c8a5c411e93e'" terminated with exit code: -1073740940. 

Not much for me to go on.
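
For what it's worth, the number itself can at least be decoded. On Windows, large negative exit codes like this one are usually NTSTATUS values printed as signed 32-bit integers, so reinterpreting the bits as unsigned gives the code to look up. A small sketch:

```julia
# Exit code as reported by the terminal (a signed 32-bit integer).
exit_code = Int32(-1073740940)

# Reinterpret the same 32 bits as unsigned to get the NTSTATUS value.
ntstatus = reinterpret(UInt32, exit_code)

println("0x", string(ntstatus; base=16))
```

This gives 0xc0000374, which Microsoft's NTSTATUS list names STATUS_HEAP_CORRUPTION. That would fit with memory being corrupted by concurrent access, although on its own it does not say which code caused it.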

@Sevi My full code successfully produces 335 Excel files across 5 different directories (the outer loop I mentioned before). Only for the last directory, it stops after 143 files when I’m expecting 156. I don’t think there is anything special about those missing files - after all I can write them successfully if I do everything sequentially - but I’ll look a bit closer.
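
To find out exactly which files are missing rather than just counting them, comparing the expected paths against the disk is enough. The sketch below uses a scratch directory and made-up file names so that it runs on its own; in the real script, expected would be the list of paths the loop is supposed to write.

```julia
# Hypothetical setup: a scratch directory in which 3 of 5 expected files exist.
dir = mktempdir()
expected = [joinpath(dir, "file_$(i).xlsx") for i in 1:5]
foreach(touch, expected[1:3])

# Compare what should exist with what is actually on disk.
missing_files = filter(!isfile, expected)
foreach(println, basename.(missing_files))
```

If the missing files turn out to be the ones whose tasks were started last, that points at the process dying mid-run rather than at anything specific to those files.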