Can I get this syntax to work using zip?

Ahmed_Salih · September 10, 2021, 9:14pm

Hello!

Suppose I have two points defined as:

p1 = (1.0,1.0)
p2 = (1.0,1.025)

Then I want to calculate the pair-wise distance between the two as such:

for (i,j) in zip(p1,p2)
    diff = i - j
end

And I want the output to be a tuple the same size as p1 and p2, with in this case values of:

(0.0,0.025)

I am aware this simple example can be done using broadcasting (.-). I am specifically asking whether it could work the way I want using zip.

Kind regards

mcabbott · September 10, 2021, 9:24pm

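The obvious way to write it with zip doesn’t in fact return a tuple, because zip doesn’t return one either: it gives a lazy iterator, and map over that collects into a Vector. You could fix it by defining your own zip for tuples, or just use the fact that map accepts multiple arguments and walks through them in lockstep, like zip: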

julia> map(zip((1.0, 2.0), (3.0, 5.0))) do (i,j)
         diff = i - j
       end
2-element Vector{Float64}:
 -2.0
 -3.0

julia> _zip(ts::Tuple...) = ntuple(i -> map(t -> t[i], ts), minimum(length, ts));

julia> map(_zip((1.0, 2.0), (3.0, 5.0))) do (i,j)
          i - j  # diff
       end
(-2.0, -3.0)

julia> map((1.0, 2.0), (3.0, 5.0)) do i,j
         i - j
       end
(-2.0, -3.0)
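
To make the difference in return types concrete, here is a small sketch using the same numbers as above (the names via_zip and via_map are only for illustration):

```julia
t1 = (1.0, 2.0)
t2 = (3.0, 5.0)

# map over a zip iterator collects the results into a Vector.
via_zip = map(((i, j),) -> i - j, zip(t1, t2))

# map over the tuples themselves returns a Tuple.
via_map = map((i, j) -> i - j, t1, t2)

println(typeof(via_zip))
println(typeof(via_map))
```

The values are the same in both cases; only the container differs.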

Ahmed_Salih · September 10, 2021, 9:35pm

Thanks, that works for me! It is perhaps better to just use map in this case, yes

I have one pretty basic question though: how would I assign the end result of map to something, as I usually would with a = 1 etc.? I can’t seem to get it to work when using map.

If I put what you have written in a function, though, it spits out what I want:

function pairwise_dist(p1,p2)
    map(p1,p2) do i,j
        diff = i - j
    end
    
end

So I am just asking to learn

Kind regards

mcabbott · September 10, 2021, 9:39pm

Writing diff = i - j inside the function body here is just a note to yourself, probably not best practice. To assign what the whole map expression returns, you need a = map(...) even if there’s a do involved. (Every expression in Julia returns something, so you will also see things like b = if x<0 ... else ... end.)
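
A minimal sketch of the point about assignment, reusing the tuples from the first post (the names d, x and sign_label are made up for illustration):

```julia
p1 = (1.0, 1.0)
p2 = (1.0, 1.025)

# The whole map ... do ... end block is a single expression, so it can
# sit on the right-hand side of an assignment.
d = map(p1, p2) do i, j
    i - j
end

# The same holds for if ... else ... end.
x = -3
sign_label = if x < 0
    "negative"
else
    "non-negative"
end

println(d)
println(sign_label)
```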

Ahmed_Salih · September 10, 2021, 10:11pm

I renamed diff to d and I see, it worked - thanks!

rafael.guerra · September 10, 2021, 11:20pm

The following looks nice too:

dif = Tuple(i-j for (i,j) in zip(p1,p2))

Ahmed_Salih · September 11, 2021, 8:06am

Thanks! map was more performant for me, so will stick to that, but that shows it is possible.

Kind regards

genkuroki · September 11, 2021, 8:31am

Input:

a = (1.0, 2.0)
b = (3.0, 5.0)
b .- a

Output:

(2.0, 3.0)

Input:

using BenchmarkTools
@btime $(Ref(b))[] .- $(Ref(a))[]

Output:

  1.400 ns (0 allocations: 0 bytes)
(2.0, 3.0)

rafael.guerra · September 11, 2021, 8:37am

Note OP’s premise:

genkuroki · September 11, 2021, 9:29am

I’m sorry for missing it.

We can type-stably create a tuple using map and zip by doing the following, but it will be less efficient because it goes through a Vector.

using BenchmarkTools

f(a::NTuple{N}, b::NTuple{N}) where N =
    NTuple{N}(map(((i, j),) -> j - i, zip(a, b)))

a = (1.0, 2.0)
b = (3.0, 5.0)
@btime f($a, $b)

Output:

 30.885 ns (1 allocation: 96 bytes)
(2.0, 3.0)

The following similar code using zip is more efficient (but more complex than b .- a).

g(a::NTuple{N}, b::NTuple{N}) where N =
    NTuple{N}(j - i for (i, j) in zip(a, b))

@btime g($(Ref(a))[], $(Ref(b))[])

Output:

  1.300 ns (0 allocations: 0 bytes)
(2.0, 3.0)

Using @code_native, I have found that the following three functions generate the same native code for a = (1.0, 2.0) and b = (3.0, 5.0):

F(a, b) = b .- a

g(a::NTuple{N}, b::NTuple{N}) where N =
    NTuple{N}(j - i for (i, j) in zip(a, b))

h(a::NTuple{N}, b::NTuple{N}) where N =
    ntuple(i -> b[i] - a[i], N)

Conclusion: Using g(a, b), we can do exactly the same as b .- a using zip.

Seif_Shebl · September 11, 2021, 12:18pm

I’m not sure if I missed something but this naive thing should be really fast:

p1 = (1.0,1.0)
p2 = (1.0,1.025)
pairwise_dist(p1,p2) = ntuple(i -> p1[i]-p2[i], length(p1))

@benchmark pairwise_dist($p1,$p2)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  0.001 ns … 0.100 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     0.001 ns             ┊ GC (median):    0.00%
 Time  (mean ± σ):   0.028 ns ± 0.044 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇ ▂
  0.001 ns       Histogram: frequency by time        0.1 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Ahmed_Salih · September 11, 2021, 12:57pm

Thanks for your answer. You have to benchmark like this:

@btime pairwise_dist($Ref(p1)[],$Ref(p2)[])
  163.787 ns (5 allocations: 160 bytes)
(0.0, -0.02499999999999991)

Your benchmark is currently unrealistically fast because the compiler has realized that you do not use the value for anything (that is what I have been told, at least).
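
One thing that may be worth checking in the benchmark line above: `$Ref(p1)[]` is not grouped the same way as the `$(Ref(b))[]` form used earlier in the thread. The sketch below uses ordinary quote interpolation as a stand-in for what a macro sees. It assumes BenchmarkTools receives the same parsed expression, in which case p1 and p2 would be looked up as non-constant globals, which could explain the allocations.

```julia
p1 = (1.0, 1.0)

# `$Ref(p1)[]` splices in only the function `Ref`; `p1` stays a symbol
# that is resolved as a global when the expression is later evaluated.
ex_partial = :($Ref(p1)[])

# `$(Ref(p1))[]` splices in the already-constructed Ref holding the tuple.
ex_full = :($(Ref(p1))[])

println(typeof(ex_partial.args[1]))  # a call expression
println(typeof(ex_full.args[1]))     # a Ref wrapping the tuple
```

With the parenthesized form the tuple itself is carried into the expression, which matches how the earlier posts wrote their benchmarks.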

Kind regards

Seif_Shebl · September 11, 2021, 4:52pm

I’m not sure if Ref is very important here, but my function is as fast as the a .- b thing, so it doesn’t really matter.

pairwise_dist(p1,p2) = ntuple(i -> p1[i]-p2[i], length(p1))

function test_Pairs1(arr)
    s = 0.0
    for ps in arr 
        p1, p2 = ps
        s += sum(pairwise_dist(p1,p2))
    end 
    s 
end

function test_Pairs2(arr)
    s = 0.0
    for ps in arr 
        p1, p2 = ps
        s += sum(p1 .- p2)
    end 
    s 
end

arr = [( (rand(),rand()), (rand(),rand()) ) for i in 1:10^6]

@benchmark test_Pairs1($arr)
@benchmark test_Pairs2($arr)

@benchmark test_Pairs1($arr)
BenchmarkTools.Trial: 2314 samples with 1 evaluation.
 Range (min … max):  1.894 ms …   3.209 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.113 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.144 ms ± 114.584 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

              ▁▆██▆▅▄▃▄▁▁
  ▂▁▂▂▂▂▁▂▁▂▂▅████████████▅▄▄▄▃▃▄▃▃▃▃▃▃▃▃▃▂▂▃▂▃▂▃▂▂▂▂▂▂▂▂▂▂▂▂ ▄
  1.89 ms         Histogram: frequency by time        2.61 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

@benchmark test_Pairs2($arr)
BenchmarkTools.Trial: 2318 samples with 1 evaluation.
 Range (min … max):  1.901 ms …   3.543 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.111 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.140 ms ± 115.493 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

             ▄▆█▇██▅▅▅▂▄▁
  ▂▁▁▂▁▁▁▁▁▂▅████████████▅▅▄▃▃▃▄▃▃▃▃▃▃▂▃▃▃▃▂▃▂▃▃▂▂▃▂▂▂▂▂▂▂▂▂▂ ▄
  1.9 ms          Histogram: frequency by time        2.62 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

rafael.guerra · September 11, 2021, 5:10pm

@Seif_Shebl, your function looks similar to @genkuroki’s, no?

mcabbott · September 11, 2021, 5:27pm

The fast functions here should all compile down to the same thing. While nothing actually happens in 0.001 ns, getting that does tell you that it’s reduced to something very simple.

julia> @code_typed pairwise_dist_ntuple(p1, p2)
CodeInfo(
1 ─ %1 = Base.getfield(p1, 1, true)::Float64
│   %2 = Base.getfield(p2, 1, true)::Float64
│   %3 = Base.sub_float(%1, %2)::Float64
│   %4 = Base.getfield(p1, 2, true)::Float64
│   %5 = Base.getfield(p2, 2, true)::Float64
│   %6 = Base.sub_float(%4, %5)::Float64
│   %7 = Core.tuple(%3, %6)::Tuple{Float64, Float64}
└──      return %7
) => Tuple{Float64, Float64}

julia> @code_typed pairwise_dist_map(p1, p2)
CodeInfo(
1 ─ %1 = Base.getfield(p1, 1, true)::Float64
│   %2 = Base.getfield(p2, 1, true)::Float64
│   %3 = Base.sub_float(%1, %2)::Float64
│   %4 = Base.getfield(p1, 2, true)::Float64
│   %5 = Base.getfield(p2, 2, true)::Float64
│   %6 = Base.sub_float(%4, %5)::Float64
│   %7 = Core.tuple(%3, %6)::Tuple{Float64, Float64}
└──      return %7
) => Tuple{Float64, Float64}

julia> @code_typed g(p1, p2)  # NTuple(generator)
CodeInfo(
1 ─ %1 = Base.getfield(a, 1, true)::Float64
│   %2 = Base.getfield(b, 1, true)::Float64
│   %3 = Base.sub_float(%2, %1)::Float64
│   %4 = Base.getfield(a, 2, true)::Float64
│   %5 = Base.getfield(b, 2, true)::Float64
│   %6 = Base.sub_float(%5, %4)::Float64
│   %7 = Core.tuple(%3, %6)::Tuple{Float64, Float64}
└──      return %7
) => Tuple{Float64, Float64}
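
The definitions of pairwise_dist_ntuple and pairwise_dist_map are not shown above. A plausible reconstruction, assuming they are simply the ntuple and map variants from the earlier posts under new names, might look like this:

```julia
# Assumed definitions: these names appear in the @code_typed calls above,
# but their bodies are a guess based on earlier posts in this thread.
pairwise_dist_ntuple(p1, p2) = ntuple(i -> p1[i] - p2[i], length(p1))
pairwise_dist_map(p1, p2) = map((i, j) -> i - j, p1, p2)

p1 = (1.0, 1.0)
p2 = (1.0, 1.025)

# Both variants should agree with plain broadcasting on these inputs.
same = pairwise_dist_ntuple(p1, p2) == pairwise_dist_map(p1, p2) == p1 .- p2
println(same)
```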
