The difference is that in the broadcasting version you create dsi[1:end-ss] once and use it for all elements in ts, whereas in the loop version you create it for every element.
With day19/input.txt from the asnoyman/AoC2024 repository on GitHub (main branch) as the input file (which does not seem to be the same one you're using, since the output is not the same), and using f1 and f1a as above, plus f1b, which is like f1a but with
dsi_end = dsi[1:end-ss]
for e in ts
    if endswith(dsi_end, e)
        cm[length(e)+ss] += cm[ss]
    end
end
I get
julia> @btime sum(f1($ds, $ts, i) for i in 1:length($ds); init=0) # broadcasted version
65.037 ms (250735 allocations: 17.22 MiB)
758890600222015
julia> @btime sum(f1a($ds, $ts, i) for i in 1:length($ds); init=0) # dsi[1:end-ss] inside the loop
101.799 ms (37804 allocations: 7.92 MiB)
758890600222015
julia> @btime sum(f1b($ds, $ts, i) for i in 1:length($ds); init=0) # dsi[1:end-ss] outside of the loop
61.943 ms (37804 allocations: 7.92 MiB)
758890600222015
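
To see the slicing effect in isolation, here is a minimal, self-contained sketch. The function names and the sample data are made up for illustration; they are not the f1a/f1b from this thread and they only count matches instead of updating cm. It assumes ASCII strings, so that byte indexing with `1:end-ss` is always valid.

```julia
# Hypothetical helpers for illustration only (not the f1a/f1b discussed above).
# Both count how many elements of `ts` are suffixes of `dsi` with its last
# `ss` characters dropped.

# Slice inside the loop: `dsi[1:end-ss]` builds a new String on every iteration.
function count_suffixes_inner(dsi::String, ts::Vector{String}, ss::Int)
    n = 0
    for e in ts
        if endswith(dsi[1:end-ss], e)
            n += 1
        end
    end
    return n
end

# Slice hoisted out of the loop: the String is built once and reused.
function count_suffixes_hoisted(dsi::String, ts::Vector{String}, ss::Int)
    dsi_end = dsi[1:end-ss]
    n = 0
    for e in ts
        if endswith(dsi_end, e)
            n += 1
        end
    end
    return n
end

# Made-up ASCII sample data, not taken from the input file.
dsi = "brwrrgbur"
ts = ["r", "wr", "b", "g", "bwu", "rb", "gb", "br"]

inner = count_suffixes_inner(dsi, ts, 2)
hoisted = count_suffixes_hoisted(dsi, ts, 2)
println((inner, hoisted))
```

Both functions return the same count; only the number of times the slice is built differs. Timing them with @btime, as above, should show whether the hoisting matters for your data.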