no, the data and output array are already loaded
did it, but the threaded run still takes about twice as long as the single-threaded one...
quick update... I found that I can get the @threads parallelization to take about the same amount of time as the single-threaded run on the login node, but not on the compute node obtained through srun --pty...
(on login node)
julia> @btime coreTEM!($space_selected_models, $space_forcing[space_index], $space_spinup_forcing[space_index], $loc_forcing_t,
$space_output[space_index], $space_land[space_index], $tem_info)
29.113 s (7 allocations: 51.33 KiB)
julia> @btime runTEM!($info.models.forward, $run_helpers.space_forcing, $run_helpers.space_spinup_forcing, $run_helpers.loc_forcing_t,
$run_helpers.space_output, $run_helpers.space_land, $run_helpers.tem_info)
23.544 s (88 allocations: 56.81 KiB)
(on srun node)
julia> @btime coreTEM!($space_selected_models, $space_forcing[space_index], $space_spinup_forcing[space_index], $loc_forcing_t,
$space_output[space_index], $space_land[space_index], $tem_info)
20.707 s (7 allocations: 51.33 KiB)
julia> @btime runTEM!($info.models.forward, $run_helpers.space_forcing, $run_helpers.space_spinup_forcing, $run_helpers.loc_forcing_t,
$run_helpers.space_output, $run_helpers.space_land, $run_helpers.tem_info)
39.177 s (88 allocations: 56.81 KiB)
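Since the two nodes behave so differently, it may be worth comparing what Julia can actually see on each of them. Below is a minimal sketch of such a check; `thread_report` and `threads_used` are hypothetical helper names, not part of the model code, and whether a thread/CPU mismatch is the cause here is only an assumption that has not been measured above.

```julia
using Base.Threads

# Report what this Julia session can use on the current node.
function thread_report()
    return (
        julia_threads = nthreads(),        # threads Julia was started with
        cpu_threads   = Sys.CPU_THREADS,   # logical CPUs Julia detects
        # Slurm sets this only when --cpus-per-task was requested; "unset" otherwise.
        slurm_cpus    = get(ENV, "SLURM_CPUS_PER_TASK", "unset"),
    )
end

# Record which thread handles each iteration of a @threads loop.
# threadid() is used for reporting only, never for indexing shared state.
function threads_used(n::Integer = 4 * nthreads())
    ids = zeros(Int, n)
    @threads for i in 1:n
        ids[i] = threadid()
    end
    return sort(unique(ids))
end

println(thread_report())
println("threads that did work: ", threads_used())
```

Running this once on the login node and once inside the srun session shows whether both sessions start with the same number of Julia threads. Note that `Sys.CPU_THREADS` may not reflect the number of cores Slurm actually allocated to the job, so the `SLURM_CPUS_PER_TASK` value is the more relevant one on the compute node. If the job has fewer cores than Julia has threads, the threads would compete for the same cores, which could be one explanation for the slower `runTEM!` timing there.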