I’ve been reading some basic Julia tutorials and I have a question:
Could anyone explain the difference between iterators, collections and arrays, please?
I’m coming from R, where there aren’t iterators or collections, and I don’t understand why we need them.
It’s not necessary to store every single number in order to iterate over a range of equally spaced numbers, so Julia takes advantage of this. Rather than create a temporary vector, Julia uses a data type that uses less memory and does the same thing.
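A rough way to see this for yourself, assuming a recent Julia 1.x session (the variable names are only for illustration): a range stores just its endpoints, while collect materializes every element.

```julia
r = 1:1_000_000          # UnitRange{Int}: stores only the start and stop values
v = collect(r)           # Vector{Int}: stores all one million elements

println(sizeof(r))       # size of the range object in bytes
println(sizeof(v))       # size of the vector's data in bytes

# Both give the same result when iterated or reduced
println(sum(r) == sum(v))
```

The range stays the same size no matter how many numbers it spans, whereas the vector grows with its length.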
My understanding is that iterators simply allow you to loop over objects. In R, you loop over an integer index. In Julia, it is also possible to loop over objects. For example, you can iterate over an array of arrays:
data = [rand(2,2) for i in 1:10]
for d in data
    println(d)
end
Using iterators like this is certainly not necessary, but it can be convenient and easier to read. You can also iterate over both the index and the element with enumerate(), or over several objects in lockstep with zip().
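As a small sketch of those two helpers (labels and scores are made-up example data):

```julia
labels = ["a", "b", "c"]
scores = [10, 20, 30]

# enumerate yields (index, element) pairs
for (i, label) in enumerate(labels)
    println(i, ": ", label)
end

# zip walks several iterables in lockstep, stopping at the shortest one
for (label, score) in zip(labels, scores)
    println(label, " => ", score)
end
```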
By the way, if you are simply initializing an array, you can use
Just to be clear, a range like 1:10 in Julia is not just an iterator, that is, any type you can loop over (one that implements start, next, and done in Julia 0.6 and earlier, or iterate in later versions, and usually eltype and length). It is also a subtype of AbstractVector, with all the usual array methods such as getindex and ndims, so you can mostly treat it as a drop-in replacement for a read-only array.
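A quick check of that claim, which should behave the same on any Julia 1.x version:

```julia
r = 1:10

println(r isa AbstractVector)        # ranges are part of the array type hierarchy
println(r[3])                        # indexing goes through getindex, no vector is allocated
println(ndims(r))                    # a range is one-dimensional
println(length(r), " ", eltype(r))   # the usual collection queries work too
println(r .* 2)                      # broadcasting works as it would on a Vector
```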
That’s the key. Since it doesn’t allocate memory, it will usually work wherever it is only read. 1:4 never makes an array, but A=1:4; A[1] still works. But since there is no array in memory to actually write to, A[1] = 4 fails unless you first collect it into a real array. So if you pass it into algorithms which use the array but don’t write into it, 99% of the time you’re fine. The other 1% is someone too strictly typing their dispatches (i.e. a bug to report).
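Here is a minimal sketch of the read-only behavior. The exact error type raised by the failed write differs between Julia versions, so the example only records that some error was thrown:

```julia
r = 1:4
println(r[1])                    # reading works

# Writing fails because there is no backing storage to write into
write_failed = try
    r[1] = 4
    false
catch
    true
end
println(write_failed)

a = collect(r)                   # materialize into a real Vector{Int}
a[1] = 4                         # now the write succeeds
println(a)
```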
It is possible that some of those are outdated — Julia evolves very rapidly. Read the manual.
Iteration (traversal of a collection) is implemented using generic functions in Julia. This means that for each type, you can specify how it is traversed. This has various advantages: some structures have a layout which favors a certain kind of traversal, and in some cases, the values can be generated very cheaply on demand, as for 1:10. This is a big advantage compared to R, where 1:10000 means that you actually allocate that vector.
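To make "you can specify how it is traversed" concrete, here is a small sketch assuming Julia 1.0 or later, where the interface is a single iterate function. Countdown is a hypothetical type invented for this example:

```julia
# Hypothetical type for illustration: counts down from `start` to 1
struct Countdown
    start::Int
end

# iterate(c) begins the traversal; iterate(c, state) continues it.
# Returning `nothing` signals that the traversal is finished.
Base.iterate(c::Countdown, state = c.start) =
    state < 1 ? nothing : (state, state - 1)

# Optional but useful: lets collect and friends preallocate correctly
Base.length(c::Countdown) = max(c.start, 0)
Base.eltype(::Type{Countdown}) = Int

for n in Countdown(3)
    println(n)
end
```

No values are stored anywhere; each one is computed on demand from the previous state, which is the same idea a range uses.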
In principle, every function that expects an iterable object should be able to deal with any type that implements the interface. collect is a workaround for when that is not the case: e.g. collect(1:10) converts the range to a vector [1,2,3,4,5,6,7,8,9,10]. As a user, design your code so that it works with all iterables (simply not restricting the type will be fine in most cases). If you encounter restrictive behavior in a library, report an issue.
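A short sketch of the difference, using two hypothetical helper functions (sum_of_squares and strict_sum are made up for this example):

```julia
# Hypothetical helper: no type annotation, so any iterable is accepted
sum_of_squares(itr) = sum(x -> x^2, itr)

println(sum_of_squares(1:3))                 # range
println(sum_of_squares([1, 2, 3]))           # vector
println(sum_of_squares(x for x in 1:3))      # generator

# Overly strict signature: rejects ranges even though the body would work on them
strict_sum(v::Vector{Int}) = sum(v)

println(applicable(strict_sum, 1:3))         # is there a method for a range?
println(strict_sum(collect(1:3)))            # collect is the workaround
```

Loosening the signature to strict_sum(v::AbstractVector), or dropping the annotation, would make the collect call unnecessary.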