I re-discovered this old thread, which I started well more than a year ago, when I was a complete newbie:
I wanted to write a findall-like function that partitions the indices into two sets according to a criterion:
function partition(vals, thresh)
    idx_l = Vector{keytype(vals)}()
    idx_h = Vector{keytype(vals)}()
    for i in eachindex(vals)
        (vals[i] < thresh) ? push!(idx_l, i) : push!(idx_h, i)
    end
    (idx_l, idx_h)
end
Note the keytype(vals) bit. People argued: why dwell on the type of the index? Why not just use Int[]? Fair enough.
But I’ve recently realized that the above function works for a Dict, as is! If you used Int, it wouldn’t work for a Dict unless its keys happened to be integers.
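To make the Dict point concrete, here is a minimal sketch. The name partition_keys and the sample data are made up for illustration, and I iterate over keys(vals) rather than eachindex(vals), which should be equivalent for these two containers:

```julia
# Hypothetical name, to keep it apart from `partition` in the post.
function partition_keys(vals, thresh)
    # keytype is Int for a Vector and the key type for a Dict
    idx_l = Vector{keytype(vals)}()
    idx_h = Vector{keytype(vals)}()
    for k in keys(vals)
        vals[k] < thresh ? push!(idx_l, k) : push!(idx_h, k)
    end
    return idx_l, idx_h
end

v = [5, 1, 7, 2]
d = Dict("a" => 5, "b" => 1, "c" => 7, "d" => 2)

lo_v, hi_v = partition_keys(v, 4)   # integer indices
lo_d, hi_d = partition_keys(d, 4)   # String keys; Dict iteration order is unspecified
```

With Int[] in place of Vector{keytype(vals)}(), the second call would fail at the first push!, because a String key cannot be converted to an Int.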
For this particular case, the point became moot because an index-free solution was found:
function partition(v, thresh)
    idx_l = findall(<(thresh), v)
    idx_h = setdiff(eachindex(v), idx_l)
    (idx_l, idx_h)
end
which also works for a Dict. (I’m not comfortable with this solution, though, because the types of idx_l and idx_h are different.)
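One way to look at the type mismatch, and a possible way around it, is sketched below. The name partition_findall is hypothetical, and the variant is my own assumption rather than anything proposed in the thread: it calls findall twice, once with the test and once with its negation, so both results come from the same function at the cost of a second pass over the data.

```julia
# Hypothetical name; a two-pass variant of the findall-based version.
function partition_findall(v, thresh)
    idx_l = findall(<(thresh), v)
    idx_h = findall(x -> !(x < thresh), v)  # exact complement of the first test
    return idx_l, idx_h
end

d = Dict("a" => 5, "b" => 1, "c" => 7, "d" => 2)

# The setdiff approach from the post, applied to a Dict.
lo = findall(<(4), d)
hi = setdiff(keys(d), lo)
@show typeof(lo) typeof(hi)      # print the two container types to compare them

# The two-pass variant.
lo2, hi2 = partition_findall(d, 4)
@show typeof(lo2) typeof(hi2)
```

Whether the extra pass is worth it depends on how much the matching return types matter to the caller.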
In general, you should assume as little as possible about your variables, as long as your code remains simple. For example, you shouldn’t assume that your index is one particular integer type unless doing so is necessary or significantly simplifies your code.
Another example is Float64. I usually write Float64 without thinking much. But sometimes I ask myself, why am I writing Float64 when any real number will do? Should I be writing Real instead?
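A small sketch of the difference, with two made-up functions (halve_f64 and halve_real are hypothetical names used only here):

```julia
# Hypothetical functions, for illustration only.
halve_f64(x::Float64) = x / 2    # accepts Float64 and nothing else
halve_real(x::Real)   = x / 2    # accepts any real number

a = halve_real(3)        # Int argument
b = halve_real(1//3)     # Rational argument, result stays Rational
c = halve_real(2.0f0)    # Float32 argument, result stays Float32

# halve_f64(3) would throw a MethodError, since no method accepts an Int.
```

As far as I understand, the looser annotation should not cost any speed, because Julia compiles a specialized version of the method for each concrete argument type it is called with.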
But this is hard for a newcomer to learn. There are no tutorials or textbooks that tell you something like the above, and the community doesn’t seem to have a consensus on the “best practice”.