Is there a function for creating n (nearly) evenly-spaced integers for indexing?

parb · January 12, 2024, 8:07pm

Hi all,

I’d like to downsample arrays with something like

longarray = rand(100)
numsamples = 30
sampleindices = evenintegers(1, length(longarray), numsamples)
downsampled = longarray[sampleindices]

I realize it’s not obvious what “even” means here, since the differences between the integers cannot all be exactly the same, and it’s not clear whether the first and last integers should be included.

What I do now instead is this, which feels like I’m overlooking some built-in functionality.

sampleindices = convert.(
    Int,
    round.(range(1, length(longarray), numsamples), digits=0)
)

Can I reduce these last three lines to a single statement using built-in Julia?

Thanks!

rafael.guerra · January 12, 2024, 8:30pm

Those three lines could be shortened to:

round.(Int, range(1, length(longarray), numsamples))

adienes · January 12, 2024, 8:32pm

julia> collect(Iterators.partition(1:37, 5))
8-element Vector{UnitRange{Int64}}:
 1:5
 6:10
 11:15
 16:20
 21:25
 26:30
 31:35
 36:37

oheil · January 12, 2024, 8:33pm

round.(Int,LinRange(1,100,30))

shorter but basically the same.

Mason · January 12, 2024, 8:43pm

For what the OP wants here, I think you’d actually want to write

(first(chunk) for chunk in Iterators.partition(longarray, numsamples))

mbauman · January 12, 2024, 8:44pm

Even better: put this into the array indexing expression so you can use begin and end directly:

A[round.(Int, range(begin, end, numsamples))]

jar1 · January 12, 2024, 8:52pm

firstindex(xs) and lastindex(xs) can be used instead of 1 and length(xs) to avoid problems with OffsetArrays.
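
A minimal sketch of that suggestion wrapped in a helper. The name `evenindices` is made up for illustration and is not a Base function; the sketch assumes Julia 1.7 or later (for the three-argument positional `range`) and `n >= 2`.

```julia
# Hypothetical helper (not part of Base): n nearly evenly spaced indices of xs,
# including the first and last index. Assumes n >= 2 and Julia >= 1.7.
evenindices(xs, n) = round.(Int, range(firstindex(xs), lastindex(xs), n))

longarray = rand(100)
sampleindices = evenindices(longarray, 30)   # Vector{Int} of 30 indices
downsampled = longarray[sampleindices]
```

Because it only asks the array for its first and last index, the same helper should also work for arrays whose indices do not start at 1.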

Dan · January 12, 2024, 8:59pm

Maybe you are looking for chunks from ChunkSplitters.jl package?

The docs are here:

Use it perhaps as (there is much more in the docs):

julia> using ChunkSplitters

julia> map(first∘first, chunks(longarray, 30, :batch))
30-element Vector{Int64}:
  1
  5
  9
 13
 17
 21
 25
 29
 33
 37
  ⋮
 92
 95
 98

might be the goal?

The indices can then be used as:

longarray[map(first∘first, chunks(longarray, 30, :batch))]

This allows a completely independent sample to be produced as:

using ChunkSplitters
using IterTools: nth
longarray[map(Base.Fix2(nth, 2)∘first, chunks(longarray, 30, :batch))]

parb · January 12, 2024, 9:08pm

The replies by @oheil, @rafael.guerra, and @mbauman are nice because they use Base functions and look simple to an occasional programmer, but I guess the answer is effectively no: There’s no (single) function to index my long arrays evenly at n points.

adienes · January 12, 2024, 9:09pm

What’s wrong with Iterators.partition? It’s almost exactly what you’re looking for.

Mason · January 12, 2024, 9:13pm

How so? It’s just breaking the array up into chunks, not sampling it.

Dan · January 12, 2024, 9:14pm

Julia is trying to spread simple functions across more packages rather than have a monolithic Base (AFAIK, as I’m not a core dev).

So the ChunkSplitters, IterTools usage I’ve mentioned looks pretty nice to Julians.

It can also be wrapped in a simple one-line function, with a name of your choice, if needed.

adienes · January 12, 2024, 9:22pm

I mean, you can partition eachindex. I guess it’s not an exact match
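
A small sketch of why partitioning `eachindex` is close but not an exact match: the chunk length has to be an integer, so the number of chunks (and therefore of chunk starts) rarely equals the requested number of samples. The variable names below are only for illustration.

```julia
longarray = rand(100)
numsamples = 30

# Two candidate chunk lengths: rounded down and rounded up.
chunklen_floor = length(longarray) ÷ numsamples      # 3
chunklen_ceil  = cld(length(longarray), numsamples)  # 4

# Take the first index of every chunk as the sample position.
starts_floor = [first(c) for c in Iterators.partition(eachindex(longarray), chunklen_floor)]
starts_ceil  = [first(c) for c in Iterators.partition(eachindex(longarray), chunklen_ceil)]

length(starts_floor)  # 34 chunk starts, more than numsamples
length(starts_ceil)   # 25 chunk starts, fewer than numsamples
```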

Dan · January 12, 2024, 9:29pm

Another nice combination with ChunkSplitters:

using ChunkSplitters
longarray = rand(100);

indices = first(getchunk(longarray, 1, length(longarray)÷30, :scatter),30)
samples = longarray[indices]

This method is also quite efficient (as the indices returned from getchunk and first are integer ranges).

parb · January 12, 2024, 9:30pm

Sure - I certainly don’t wanna speak for Julians. I’ll speak for my future self ~3 months from now when I next code something and need to down-sample a long array: I’m not gonna remember the composition of first with first, nor what the right parameters were for chunks.

I was hoping for a simple builtin because that’s something I might remember. In Python’s numpy I remember it, because it’s “just” a call to “as type Int” on the linspace.
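
For comparison, numpy’s `astype(int)` truncates toward zero rather than rounding, so the closest Julia analogue of that idiom would use `trunc` instead of `round`. A small sketch of the difference, assuming Julia 1.7 or later for the positional `range(start, stop, length)`:

```julia
r = range(1, 100, 30)          # 30 evenly spaced Float64 values from 1 to 100

truncated = trunc.(Int, r)     # drops the fractional part
rounded   = round.(Int, r)     # nearest integer

# Both keep the endpoints 1 and 100; some interior indices differ by one.
```

Either variant gives 30 usable indices; rounding keeps each index closer to its ideal fractional position.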

Dan · January 12, 2024, 9:35pm

If you only need one such subsample (and not several), I think this might be good (though not superefficient because of randomness):

using StatsBase
sample(longarray, 30; replace=false)

and as the Romans used to say:

if you haven’t imported StatsBase, you haven’t done anything.

Okay, joke aside, the above is not a downsample, so maybe:

longarray[first(1:length(longarray)÷30:end, 30)]

which is short enough to remember and in Base.

lmiq · January 13, 2024, 1:26pm

Isn’t just

longarray[begin:30:end]

enough here? (Maybe just adjusting the step if the number of samples is given)

tbeason · January 13, 2024, 2:36pm

This is also what I expected to see suggested

rafael.guerra · January 13, 2024, 2:39pm

It doesn’t seem to be enough as OP wants 30 indices nearly evenly distributed, while that will give him only 4.

rocco_sprmnt21 · January 13, 2024, 9:30pm

A[range(start=begin, step=end ÷ 30, length=30)]
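
To compare the Base-only suggestions in this thread side by side, here is a small sketch using the example sizes from the original question (100 elements, 30 samples). The variable names are only for illustration.

```julia
longarray = rand(100)
numsamples = 30

# Fixed step of 30: far fewer samples than requested.
fixedstep = longarray[begin:30:end]

# Integer step derived from the sample count: exactly equal spacing,
# but the range stops short of the end of the array.
stepsize = length(longarray) ÷ numsamples                                  # 3
stepped  = range(firstindex(longarray), step=stepsize, length=numsamples)  # 1:3:88

# Rounded floating-point range: spans the whole array, spacing varies by one.
rounded = round.(Int, range(firstindex(longarray), lastindex(longarray), numsamples))

length(fixedstep)  # 4
last(stepped)      # 88
last(rounded)      # 100
```

The trade-off is between perfectly regular spacing that ignores the tail of the array and spacing of 3 or 4 that covers it from the first to the last element.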
