How simple get vector from ranges?

Martin_Florek · August 24, 2017, 12:03pm

In Matlab I have vector and vector of ranges.

a = [10., 9., 8., 7., 6., 5., 4., 3., 2., 1.]
dr = [1:2, 4:7]

If I want a new vector from ranges I will call.

mo = a(dr)
mo = 10 9 7 6 5 4

In Julia I started with the function of map() and I got this.

jHelp = map(x -> a[x], dr)
JHelp = Array{Array{Float64,1},1}[[10., 9.], [7., 6., 5., 4.]]

After much effort , I worked on the same result as Matlab with the help of mapreduce().
.

jo = mapreduce(x -> a[x], vcat, dr)
jo = [10., 9., 7., 6., 5., 4.]

There is some simpler alternative or this is the best possible solution?
Thanks.

mauro3 · August 24, 2017, 12:08pm

Does this help:

julia> [collect(1:4); collect(6:9)]                                                                                                                                                                                
8-element Array{Int64,1}:                                                                                                                                                                                          
 1                                                                                                                                                                                                                 
 2                                                                                                                                                                                                                 
 3                                                                                                                                                                                                                 
 4                                                                                                                                                                                                                 
 6                                                                                                                                                                                                                 
 7                                                                                                                                                                                                                 
 8                                                                                                                                                                                                                 
 9

collect turns the range into a vector, then the ; inside the [] concatenates. That should help you make the dr into a plain vector which you can use to index a.

tkoolen · August 24, 2017, 12:11pm

How about

using Base.Iterators: flatten
[x for (index, x) in enumerate(a) if index in flatten(dr)]

or

[a[i] for i in flatten(dr)]

although I quite like your mapreduce solution.

yuyichao · August 24, 2017, 12:50pm

julia> a = [10., 9., 8., 7., 6., 5., 4., 3., 2., 1.]
10-element Array{Float64,1}:
d 10.0
  9.0
  8.0
  7.0
  6.0
  5.0
  4.0
  3.0
  2.0
  1.0

julia> dr = [1:2; 4:7]
6-element Array{Int64,1}:
 1
 2
 4
 5
 6
 7

julia> a[dr]
6-element Array{Float64,1}:
 10.0
  9.0
  7.0
  6.0
  5.0
  4.0

yuyichao · August 24, 2017, 12:57pm

And note that you don’t have a vector of ranges in matlab. Ranges isn’t even a (user visible) type in matlab AFAICT.

tkoolen · August 24, 2017, 1:02pm

Sure, but creating a dense index vector in this manner would result in double the allocations asymptotically compared to the mapreduce or flatten approaches.

Martin_Florek · August 24, 2017, 1:30pm

@mauro3 and @yuyichao It seems that the semicolon could be a solution, even without the use of collect() function. The question remains how to create / transmit such a vector that is unnecessarily long. In the sense that I am working with big data and transfer the vector of ranges through multiple functions.

Array{UnitRange{Int64},1}[1:2, 4:7]
Array{Int64,1}[1:2; 4:7] # with semicolon ';'

A possible solution would be to transfer a vector as Array {UnitRange {Int64}, 1} and call it as Array {Int64,1} directly with vcat(). What do you think?

julia> a[vcat(dr...)]
6-element Array{Float64,1}:
 10.0
  9.0
  7.0
  6.0
  5.0
  4.0

Thanks and thank @yuyichao for correction about AFAICT.

mauro3 · August 24, 2017, 1:47pm

vcat(dr...) produces the same vector [1:2; 4:7], so it does not save memory. Also your mapreduce solution produces two vectors which then get vcated. I think @tkoolen’s second solution is best.

tkoolen · August 24, 2017, 1:49pm

Ah yes of course, I missed that.

mkborregaard · August 24, 2017, 2:03pm

EDIT: Just realized this is identical to @yuyichao’s opst
I may be misunderstanding the question completely, but is the issue not just that you use semicolon rather than comma for concatenation in Julia?

a = [10., 9, 8, 7, 6, 5, 4, 3, 2, 1]
dr = [1:2; 4:7]
a[dr]
#6-element Array{Float64,1}:
# 10.0
#  9.0
#  7.0
#  6.0
#  5.0
#  4.0

tkoolen · August 24, 2017, 3:07pm

For fun I benchmarked the vcat-the-indices approach and the flatten approach for large a and dr (with dr a Vector{UnitRange{Int}} as in the original post, to allow for fair comparison).


using BenchmarkTools
using Base.Iterators: flatten

a = rand(10_000_000)
dr = [10 : 100, 150 : 6_000, 10_000 : 8_000_000]

function f1(a, dr)
    a[vcat(dr...)]
end

function f2(a, dr)
    [a[i] for i in flatten(dr)]
end

@benchmark f1($a, $dr)

@benchmark f2($a, $dr)

Results for f1 (vcat-the-indices):

BenchmarkTools.Trial: 
  memory estimate:  122.01 MiB
  allocs estimate:  7
  --------------
  minimum time:     42.596 ms (21.99% GC)
  median time:      43.553 ms (24.15% GC)
  mean time:        44.995 ms (26.19% GC)
  maximum time:     110.922 ms (69.96% GC)

and for f2 (flatten):

BenchmarkTools.Trial: 
  memory estimate:  65.00 MiB
  allocs estimate:  28
  --------------
  minimum time:     72.419 ms (0.00% GC)
  median time:      74.665 ms (0.00% GC)
  mean time:        86.345 ms (8.86% GC)
  maximum time:     233.150 ms (54.92% GC)
  --------------
  samples:          43
  evals/sample:     1

So about double the allocations for f1 as I expected, but f1 is surprisingly faster!

tkoolen · August 24, 2017, 3:19pm

The surprising results above are because length is not defined for the Flatten object returned by flatten, which means that the size of the result cannot be determined in advance. If I define:

Base.length(fl::Base.Iterators.Flatten{<:AbstractVector{<:UnitRange}}) = sum(length, fl.it)
Base.iteratorsize(::Base.Iterators.Flatten{<:AbstractVector{<:UnitRange}}) = Base.HasLength()

then the benchmark results for the flatten approach become

BenchmarkTools.Trial: 
  memory estimate:  61.00 MiB
  allocs estimate:  5
  --------------
  minimum time:     21.147 ms (1.61% GC)
  median time:      27.357 ms (20.71% GC)
  mean time:        28.169 ms (22.88% GC)
  maximum time:     90.278 ms (76.97% GC)
  --------------
  samples:          178
  evals/sample:     1

Maybe a more general version of the methods above could make it into Base? We’re getting a lot of mileage out of this simple question.

(As an aside, how awesome is it that you can just tinker like this in Julia!)

Edit: Julia issue: https://github.com/JuliaLang/julia/issues/23431.

Martin_Florek · August 25, 2017, 7:04am

Yes, it is awesome.
Thanks, I did not even know that Iteration utilities exist. This is the new feature of Julia 0.6. In Julia 0.5, mapreduce is a good choice in term of speed, but the allocation is a little more than f1. For completeness, I add the benchmark for the mapreduce in the version 0.6.

function f3(a, dr)
    mapreduce(x -> a[x], vcat, dr)
end

julia> @benchmark f1($a, $dr)
BenchmarkTools.Trial:
  memory estimate:  122.01 MiB
  allocs estimate:  8
  --------------
  minimum time:     70.672 ms (16.28% GC)
  median time:      72.660 ms (16.37% GC)
  mean time:        76.670 ms (19.92% GC)
  maximum time:     158.054 ms (61.80% GC)
  --------------
  samples:          66
  evals/sample:     1

julia> @benchmark f2($a, $dr)
BenchmarkTools.Trial:
  memory estimate:  65.00 MiB
  allocs estimate:  28
  --------------
  minimum time:     108.974 ms (0.00% GC)
  median time:      113.256 ms (0.00% GC)
  mean time:        123.818 ms (8.49% GC)
  maximum time:     234.858 ms (49.67% GC)
  --------------
  samples:          32
  evals/sample:     1

julia> @benchmark f3($a, $dr)
BenchmarkTools.Trial:
  memory estimate:  122.05 MiB
  allocs estimate:  10
  --------------
  minimum time:     61.970 ms (18.74% GC)
  median time:      63.719 ms (18.74% GC)
  mean time:        66.872 ms (22.33% GC)
  maximum time:     144.829 ms (64.45% GC)
  --------------
  samples:          75
  evals/sample:     1

After adding hack.

julia> @benchmark f2($a, $dr)
BenchmarkTools.Trial:
  memory estimate:  61.00 MiB
  allocs estimate:  5
  --------------
  minimum time:     30.712 ms (0.91% GC)
  median time:      37.693 ms (15.98% GC)
  mean time:        40.333 ms (19.18% GC)
  maximum time:     118.468 ms (71.25% GC)
  --------------
  samples:          124
  evals/sample:     1

bionicinnovations · January 20, 2023, 8:56am

Does this help?

julia> dr = [1:2, 4:7]
2-element Vector{UnitRange{Int64}}:
 1:2
 4:7

julia> vcat(collect.(dr)...)
6-element Vector{Int64}:
 1
 2
 4
 5
 6
 7

carstenbauer · January 20, 2023, 8:59am

Please note that this thread hasn’t seen any activity for 5 years and it’s generally better to avoid reviving such old threads (and implicitly ping all people involved) without good reason.

Topic		Replies	Views
How to get vector from set of ranges? General Usage	7	643	May 24, 2022
How to break apart a UnitRange? General Usage	11	545	August 6, 2020
Range with extra end? Is posible? General Usage	5	452	February 12, 2020
Concatenate Generators Performance	18	789	April 20, 2021
+ for Unit Range. Is my REPL broken? New to Julia question	12	138	March 18, 2025

How simple get vector from ranges?

Related topics