Type of array index?

rocco_sprmnt21 · August 17, 2023, 1:11pm

using InvertedIndices

function partition_vec1(v, thresh)
    idx_l = findall(<(thresh), v)
    (idx_l, Not(idx_l))
end

ryofurue · August 17, 2023, 3:14pm

Interesting, but the use of Not(idx_l) is too limited:

julia> v = rand(1:10, 10)
# . . . 
julia> a, b = partition_vec1(v, 7)
([1, 2, 3, 5, 7, 8, 10], InvertedIndex{Vector{Int64}}([1, 2, 3, 5, 7, 8, 10]))

julia> a
7-element Vector{Int64}:
  1
  2
  3
  5
  7
  8
 10

julia> b
InvertedIndex{Vector{Int64}}([1, 2, 3, 5, 7, 8, 10])

julia> collect(b)
ERROR: MethodError: no method matching length(::InvertedIndex{Vector{Int64}})

The InvertedIndex object itself doesn’t know what it’s the inverse of. As a result, you can use it only as an index into the same-sized array.

I didn’t say this in my original post, but I need to use both idx_l and idx_h independently of the original array, as

for i in idx_h

tomerarnon · August 24, 2023, 11:49pm

You can do b = setdiff(eachindex(v), a) if you prefer.
An allocation-free way would be b = Iterators.filter(∉(a), eachindex(v))
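
A small sketch comparing the two suggestions, in case it helps. The sample vector and the Set-based variant are assumptions added for illustration and are not from the thread; ∉(a) on a plain vector scans a for every index, so a Set lookup may be worth considering for long inputs.

```julia
v = [3, 9, 1, 8, 2, 7]
a = findall(<(7), v)                 # indices of elements below the threshold

# Eager: allocates a new vector holding the complement of `a`.
b_eager = setdiff(eachindex(v), a)

# Lazy: nothing is materialized until the iterator is consumed.
b_lazy = Iterators.filter(∉(a), eachindex(v))

# Optional variant (an assumption, not from the thread): membership tests
# against a Set avoid a linear scan of `a` for every index.
a_set = Set(a)
b_lazy_set = Iterators.filter(i -> i ∉ a_set, eachindex(v))

# Both forms can be used directly in a loop, independently of how `a` was built.
for i in b_lazy
    println(i, " => ", v[i])
end
```

The eager form gives you a real Vector you can index and reuse; the lazy forms recompute the membership test each time they are iterated.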

tomerarnon · August 25, 2023, 6:23am

I have now actually read the whole thread (sorry for contributing so irresponsibly before).
If you only need the partitioned indices for iteration, you may prefer not to allocate them at all:

function partition_indices(f, v)
    inds = eachindex(v)
    Iterators.filter(i -> f(v[i]), inds), Iterators.filter(i -> !f(v[i]), inds)
end

Technically, you’ll call f twice as many times this way, but that may very well be cheaper than allocating two arrays, depending on your use case.

If you collect the two iterators (not recommended), you will see they are what you want:

julia> v = rand(Bool, 10)'
1×10 adjoint(::Vector{Bool}) with eltype Bool:
 1  0  1  0  1  0  0  0  1  1

julia> collect.(partition_indices(>(0.5), v))
([1, 3, 5, 9, 10], [2, 4, 6, 7, 8])

As a bonus, the type of eachindex doesn’t really matter to you, as per your original question.
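
To see the laziness and the doubled predicate calls mentioned above, here is a minimal sketch. The counting wrapper `counted` and the sample data are hypothetical and exist only to observe how often the predicate runs.

```julia
function partition_indices(f, v)
    inds = eachindex(v)
    Iterators.filter(i -> f(v[i]), inds), Iterators.filter(i -> !f(v[i]), inds)
end

v = [0.9, 0.1, 0.7, 0.3]

# Hypothetical counting wrapper around the predicate x > 0.5.
calls = Ref(0)
counted(x) = (calls[] += 1; x > 0.5)

parts = partition_indices(counted, v)
calls_before = calls[]        # no work has been done yet: the filters are lazy

hi = collect(parts[1])        # indices where the predicate holds
lo = collect(parts[2])        # indices where it does not
calls_after = calls[]         # the predicate ran once per element for each iterator
```

Consuming both iterators evaluates the predicate 2 * length(v) times in total, which is the cost being traded against the two allocations.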

Andy_Zhang · July 12, 2024, 10:59pm

I have the same question and found this topic, and I then found the answer in the Julia documentation. The answer in this topic is not very precise.

The official documentation answers your question in detail: Single- and multi-dimensional Arrays · The Julia Language

For “standard arrays” and “standard indices”:

In the expression A[I_1, I_2, ..., I_n], each I_k may be a scalar index, an array of scalar indices, or an object that represents an array of scalar indices and can be converted to such by to_indices:

A scalar index. By default this includes:

Non-boolean integers

CartesianIndex{N}s, which behave like an N-tuple of integers spanning multiple dimensions (see below for more details)

You cannot use a single Bool as a scalar index. (Arrays of Bools are a separate case: they act as logical masks.)
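
A short sketch of these rules on a plain Vector and Matrix; the sample values are made up for illustration.

```julia
v = [10, 20, 30]

# A single Bool is rejected as a scalar index.
scalar_bool_rejected = try
    v[true]
    false
catch err
    err isa ArgumentError
end

# A Bool array of matching length acts as a logical mask instead.
mask = [true, false, true]
selected = v[mask]

# Non-boolean integers and CartesianIndex values are accepted scalar indices.
A = [1 2; 3 4]
same_element = A[1, 2] == A[CartesianIndex(1, 2)]
```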

ryofurue · July 13, 2024, 2:37am

That’s because my initial question wasn’t formulated well. Through this discussion, I learned that

my real question was “How to find out the type of the index of the given array?” and
the answer was/is keytype(v).

This was a better way to pose my question because you get a better solution to the same problem:

function partition_vals(vals, thresh)
  idx_l = Vector{keytype(vals)}()
  idx_h = Vector{keytype(vals)}()
  for i in eachindex(vals)
    (vals[i] < thresh) ? push!(idx_l, i) : push!(idx_h, i)
  end
  (idx_l, idx_h)
end

Why is it better than Vector{Int}()? Because the above function works not only with Vectors but also with Dicts (and anything that implements indexing [ ]).
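
As an illustration of the Vector and Dict cases, here is the same function applied to both. The `temps` dictionary and the sample vector are hypothetical data, not from the thread.

```julia
function partition_vals(vals, thresh)
    idx_l = Vector{keytype(vals)}()
    idx_h = Vector{keytype(vals)}()
    for i in eachindex(vals)
        (vals[i] < thresh) ? push!(idx_l, i) : push!(idx_h, i)
    end
    (idx_l, idx_h)
end

# Vector: keytype is Int, so the result holds integer positions.
vec_lo, vec_hi = partition_vals([5, 30, 7], 20)

# Dict: keytype is the key type of the dictionary (String here).
temps = Dict("mon" => 12, "tue" => 25, "wed" => 18)   # hypothetical data
dict_lo, dict_hi = partition_vals(temps, 20)

# Dict iteration order is unspecified, so compare the keys as sets.
dict_lo_set = Set(dict_lo)
```

With Vector{Int}() hard-coded, the Dict call would fail when trying to push a String key into an integer vector.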

bertschi · July 13, 2024, 6:26am

Agreeing that a generic solution is better whenever possible. Seems that your function still has a bug though:

julia> A = [1 2; 3 4];

julia> partition_vals(A, 3)
ERROR: MethodError: Cannot `convert` an object of type Int64 to an object of type CartesianIndex{2}

The reason is, as alluded to above, that keytype does not necessarily match the element type of eachindex. So, a proper generic version of your function could either use

eltype(eachindex(vals)) together with iterating over i in eachindex(vals)
or keytype(vals) together with iterating over i in keys(vals).

Both should work for vectors, dicts and arrays. For the latter, the meaning of indices is slightly different though as a multi-dimensional array can be accessed either by a linear index, i.e., as provided by eachindex, or a multi-dimensional index, e.g., a CartesianIndex. Here is a small example:

julia> A[1, 2]  # multi-dimensional index
2

julia> A[CartesianIndex(1, 2)]  # same as above
2

julia> A[3]  # linear index
2

julia> eachindex(A)  # gives linear indices
Base.OneTo(4)

julia> eltype(eachindex(A))  # type of linear indices, i.e., like typeof(first(eachindex(A))), but works for empty vectors
Int64

julia> keys(A)  # gives multi-dimensional indices
CartesianIndices((2, 2))

julia> keytype(A)  # type of keys
CartesianIndex{2}
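
Putting the two options together, a minimal sketch of both generic variants could look like this. The function names `partition_keys` and `partition_linear` are made up for illustration; only the pairing of keytype with keys, and of eltype(eachindex(...)) with eachindex, comes from the discussion above.

```julia
# Variant 1: keytype(vals) paired with keys(vals).
function partition_keys(vals, thresh)
    idx_l = Vector{keytype(vals)}()
    idx_h = Vector{keytype(vals)}()
    for k in keys(vals)
        (vals[k] < thresh) ? push!(idx_l, k) : push!(idx_h, k)
    end
    (idx_l, idx_h)
end

# Variant 2: eltype(eachindex(vals)) paired with eachindex(vals).
function partition_linear(vals, thresh)
    inds = eachindex(vals)
    idx_l = Vector{eltype(inds)}()
    idx_h = Vector{eltype(inds)}()
    for i in inds
        (vals[i] < thresh) ? push!(idx_l, i) : push!(idx_h, i)
    end
    (idx_l, idx_h)
end

A = [1 2; 3 4]

# Matrix with keys: the results are CartesianIndex{2} values.
key_lo, key_hi = partition_keys(A, 3)

# Matrix with eachindex: the results are linear (column-major) Int indices.
lin_lo, lin_hi = partition_linear(A, 3)

# For a Vector, both variants return the same integer positions.
vec_keys = partition_keys([5, 30, 7], 20)
vec_linear = partition_linear([5, 30, 7], 20)
```

Either result can be used to index back into A; the choice only affects whether you get multi-dimensional or linear indices for arrays.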
