It’s a small question, but I feel the urge to ask:
I find that generators are lazy:
julia> const a = fill(0, 4);
julia> function f!(i)
           i == 3 && error()
           a[i] = i
           -i
       end;
julia> foreach(println, f!(i) for i = 1:4)
-1
-2
ERROR:
Stacktrace:
[1] error()
@ Base ./error.jl:45
[2] f!
@ ./REPL[3]:2 [inlined]
[3] #2
@ ./none:-1 [inlined]
[4] iterate
@ ./generator.jl:48 [inlined]
[5] foreach(f::typeof(println), itr::Base.Generator{UnitRange{Int64}, var"#2#3"})
@ Base ./abstractarray.jl:3188
[6] top-level scope
@ REPL[4]:1
julia> a
4-element Vector{Int64}:
1
2
0
0
So here you see that -1 and -2 get printed before the second argument of foreach has been fully evaluated: the generator is consumed lazily, one element at a time (as opposed to foreach(println, [f!(i) for i = 1:4]), where the whole array is built first).
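To make the laziness concrete, here is a minimal sketch (the helper g is made up):

```julia
# Generators defer evaluation; comprehensions evaluate eagerly.
g(i) = (println("evaluating ", i); i^2)

gen = (g(i) for i in 1:3)   # nothing printed: no element evaluated yet
vec = [g(i) for i in 1:3]   # prints "evaluating 1/2/3" right now

first(gen)                  # only here does g(1) actually run
```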
Now I want the behavior of the non-lazy [] case. But I worry that it allocates a Vector, which is not a lightweight object in Julia. My intuition says there should be a more efficient container than a disposable [] here, or else a fused foreach method.
So,
is there a solution better than []? (I guess NTuple is not because the length in practice might be large)
PS the background is that I’m doing foreach(wait, [Threads.@spawn(f(j)) for j = 1:1000000])
But according to my above findings, I suspect that it’s incorrect to write waitall(Threads.@spawn(f(j)) for j = 1:1000000).
Am I correct?
I don’t understand what you’re after. If you don’t want a generator, i.e. the “lazy” evaluation, you need to store the output of f! (or @spawn) somewhere. If it doesn’t fit in a tuple/on the stack, it must be heap-allocated. A vector is fine for that.
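For the @spawn case specifically, the eager version matters for a second reason. A sketch, with a made-up workload f:

```julia
f(j) = j^2  # hypothetical workload

# Eager: the comprehension spawns all tasks up front, so they can run
# concurrently; foreach then just waits for each one in turn.
tasks = [Threads.@spawn f(j) for j in 1:8]
foreach(wait, tasks)

# Lazy version for contrast: each task is only spawned when foreach
# reaches it and is waited on immediately -- effectively sequential:
# foreach(wait, (Threads.@spawn f(j) for j in 1:8))
```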
But I think there should be a leaner data structure (something less general) than the standard Vector,
because a Vector has other features, e.g. indexing and variable length, which are not needed for my particular usage.
This sounds a bit like an XY problem, and I don’t really agree with the assessment:
Of course, there are a few bytes required on top of the bare elements of the vector, but it doesn’t quite get more lightweight than this (except maybe Memory which was already mentioned). If your collection has more than a couple elements, the size of the elements will dominate the small overhead of the rest.
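A sketch of that Memory alternative, assuming Julia 1.11+ where Memory{T} is exported:

```julia
# Memory{T} is a flat, fixed-length buffer: like Vector, but without
# the resizability machinery, so with marginally less overhead.
m = Memory{Int}(undef, 4)
for i in eachindex(m)
    m[i] = -i
end
foreach(println, m)   # eager consumption, same as with a Vector
```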
Anyhow, before “optimizing away” something like the allocation of the vector in your example, make sure you have identified the performance bottlenecks of the code as a whole e.g. by profiling it.
It’s hard to say for sure from your isolated example, but if the allocation itself is really taking up a significant portion of time/memory, then you could think about changing the overall algorithm (you can loop over the array twice, once just to detect errors; preallocate a big vector and re-use it all the time; use an IOBuffer to store whatever you want to print until you decide if you actually want to print it; etc.). Otherwise, I wouldn’t bother with this particular allocation (as @sgaure said, if you want to store the results somewhere then you have to also put them somewhere in memory…).
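One of those strategies sketched: preallocate a buffer once and reuse it across calls (the names are made up):

```julia
const buf = Vector{Int}(undef, 4)

function run!(buf, f)
    # First pass: evaluate everything, so errors surface before any output.
    for i in eachindex(buf)
        buf[i] = f(i)
    end
    # Second pass: consume the already-computed values.
    foreach(println, buf)
end

run!(buf, i -> -i)   # repeated calls reuse buf; no per-call allocation
```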
If you want the tasks to fail early, rather than finish all and then realize that one of them had an error, have a look at the failfast keyword of waitall:
help?> waitall
search: waitall waitany wait iswritable

  waitall(tasks; failfast=true, throw=true) -> (done_tasks, remaining_tasks)

  Wait until all the given tasks have been completed.

  If failfast is true, the function will return when at least one of the given tasks is finished by exception. If throw is true, throw CompositeException when one of the completed tasks has failed.

  failfast and throw keyword arguments work independently; when only throw=true is specified, this function waits for all the tasks to complete.

  The return value consists of two task vectors. The first one consists of completed tasks, and the other consists of uncompleted tasks.
foreach(wait, ...) would wait until all tasks are finished and then return, whereas waitall by default errors as soon as one of the tasks errors (or as soon as it notices).
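A sketch of that difference, assuming a Julia version recent enough to have waitall (the workload is made up); passing throw=false lets you inspect the outcome instead of getting a CompositeException:

```julia
tasks = map(1:4) do j
    Threads.@spawn begin
        sleep(0.1j)                     # task j finishes after ~0.1j seconds
        j == 2 && error("task 2 failed")
        j
    end
end

# failfast=true (the default) returns as soon as task 2 errors,
# without waiting for tasks 3 and 4 to finish.
done, remaining = waitall(tasks; throw=false)
```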
You are falling into the premature-optimization trap here, which is causing you to overthink something that has no real impact. Fortunately, for your program to be fast it is not important for everything it does to be as fast as possible; it’s sufficient for the parts that take the most time to be as efficient as possible.
The work you do in the tasks should outweigh the cost of allocating a vector to store their references by many orders of magnitude, so you do not need to waste time/brain cycles thinking about it. Should that not be the case, you should rather change your setup and probably not use Tasks at all.
These characteristics do not get in the way in any meaningful sense. They don’t require any allocations beyond 40 bytes for the Vector wrapper (24 bytes) around a Memory (16 bytes). Indexing calls getindex, which is inlined and eliminated by the compiler, just like getproperty is, and most often the indirection due to the resizability of the Vector is also eliminated. Compared to, say, a bare pointer, these extra characteristics of a Vector can for most purposes be thought of as a minor compile-time cost.