Is it safe to modify the same DataFrame in a multithreaded loop?
@threads for i in 1:size(df)[1]
df.column[i] = some_function(…)
end
Thank you in advance.
Best
Generally, yes. See https://github.com/JuliaData/DataFrames.jl/issues/1905
I think this would be very slow due to type instability (especially when some_function also depends on an element of df): the type of df.column is not known at compile time. It is better to broadcast your function over a complete column; this would not be multithreaded, but it would probably be much faster overall.
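A minimal sketch of both options, assuming DataFrames.jl is installed. The body of `some_function`, the input column `x`, and the helper `fill_column!` are hypothetical and only for illustration. Passing the column vectors into a function (a "function barrier") lets the compiler see their concrete types, so the threaded loop no longer suffers from the instability described above.

```julia
using DataFrames
using Base.Threads: @threads

# Hypothetical stand-in for the `some_function` in the question.
some_function(x) = x^2 + 1

# Function barrier: inside this function the concrete types of `dest`
# and `src` are known to the compiler, so the loop body is type-stable.
function fill_column!(dest::AbstractVector, src::AbstractVector)
    @threads for i in eachindex(dest, src)
        dest[i] = some_function(src[i])  # each iteration writes a distinct index
    end
    return dest
end

df = DataFrame(x = 1:10, column = zeros(Int, 10))

# Option 1: threaded loop behind a function barrier.
# `df.column` returns the stored vector itself, so it is filled in place.
fill_column!(df.column, df.x)

# Option 2: single-threaded broadcast over the whole column.
df.broadcasted = some_function.(df.x)
```

Which option is faster depends on how expensive `some_function` is. For a cheap function the broadcast is likely enough, while threading tends to pay off only when each call does substantial work.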
Yes, you are right.
I just wonder how safe it is to modify the same data frame from different threads (though certainly never the same row of the df).
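On the safety question, here is a rough sketch of the distinction as I understand it, assuming DataFrames.jl is installed. The column names `id` and `result` are made up for illustration. Writing to distinct rows of an ordinary `Vector` column from different threads does not overlap. Operations that change the structure of the data frame, such as `push!`, are not thread-safe on their own, so this sketch guards them with a lock.

```julia
using DataFrames
using Base.Threads: @threads

df = DataFrame(id = 1:8, result = zeros(Float64, 8))

# Take the column vectors out once; each iteration then touches only row i.
ids, result = df.id, df.result
@threads for i in eachindex(ids, result)
    result[i] = sqrt(ids[i])   # distinct index per iteration, no overlap
end

# Structural changes (adding rows here) are serialized with a lock.
# The order in which rows are appended is not deterministic.
lk = ReentrantLock()
@threads for i in 9:12
    lock(lk) do
        push!(df, (id = i, result = sqrt(i)))
    end
end
```

One caveat: this reasoning applies to regular `Vector` columns. Packed storage such as `BitVector` shares memory words between neighbouring elements, so concurrent writes to different indices can still race there.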