Adding list of tuples

Phuntsho · October 2, 2022, 7:07am

I have two long lists of tuples, for instance A1 = [(2,3),(4,5)…] and A2 = [(4,6),(2,8)…]. The first element of each tuple is common to both lists. I want to produce a new list, A3, such that whenever the first elements match, the second elements are added. I can do this with a for loop and an if condition, as shown below:

A3 = []
for i in A1
    for j in A2
        if i[1] == j[1]
            push!(A3, (i, i[2] + j[2]))
        end
    end
end

Is there any way I can achieve the same result in one line, using something like the filter function? Thank you in advance.

jar1 · October 2, 2022, 8:17am

You can probably use Iterators.product to replace the double loop, then filter or Iterators.filter to replace the if, then map or Iterators.map to replace the push!.
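A minimal sketch of that idea, using the two-element sample lists that appear elsewhere in this thread. The variable name `matches` is only illustrative. `Iterators.product(A2, A1)` is used (rather than `(A1, A2)`) because the first argument varies fastest, so this ordering visits pairs in the same order as the nested loop:

```julia
A1 = [(2, 3), (4, 5)]
A2 = [(4, 6), (2, 8)]

# Lazily keep only the (j, i) pairs whose first elements match.
matches = Iterators.filter(((j, i),) -> i[1] == j[1], Iterators.product(A2, A1))

# Build the same elements the loop pushes: (i, i[2] + j[2]).
A3 = map(((j, i),) -> (i, i[2] + j[2]), matches)

# A comprehension expresses the same thing without the Iterators machinery.
A3_comprehension = [(i, i[2] + j[2]) for i in A1 for j in A2 if i[1] == j[1]]
```

Both forms still compare every element of A1 with every element of A2, just like the original loop.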

lmiq · October 2, 2022, 9:39am

Use Tuple{Int,Int}[] to initialize A3 (instead of []) if performance is of any concern.

(Given that, and with the code inside a function, the loop is fine imo; I don’t see a one-liner making it clearer.)
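A sketch of what that could look like. The function name `add_matching` is hypothetical, and the sketch assumes each result should be (key, sum) so that it fits the Tuple{Int,Int} element type; the original loop pushes the whole tuple i instead:

```julia
# Hypothetical helper: add second elements wherever first elements match.
function add_matching(A1, A2)
    A3 = Tuple{Int,Int}[]          # concretely typed result vector
    for i in A1, j in A2
        if i[1] == j[1]
            push!(A3, (i[1], i[2] + j[2]))
        end
    end
    return A3
end

result = add_matching([(2, 3), (4, 5)], [(4, 6), (2, 8)])
```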

Dan · October 2, 2022, 9:46am

D = Dict(first.(A1) .=> last.(A1))
[(k, v + D[k]) for (k, v) in A2 if haskey(D, k)]

This can also fit in one line ;). It assumes the first elements of the tuples are unique within A1 and within A2, which looks reasonable from the question.

In essence, this is a database join operation (for massive A1 and A2 it might be more efficient to consider using a database).

rocco_sprmnt21 · October 2, 2022, 6:36pm

This could be a solution.
It also works if the first element of the pair is not unique.

using DataFrames

A1 = [(2,3),(4,5)]
A2 = [(4,6),(2,8)]
df1 = DataFrame(x=first.(A1), y=last.(A1))
df2 = DataFrame(x=first.(A2), y=last.(A2))

combine(groupby(vcat(df1, df2), :x), :y => sum)

Alternatively:

A=vcat(A1,A2)
df=DataFrame(x=first.(A),y=last.(A))
udf=unstack(df,:x,:y, valuestransform=sum)

but perhaps the most “natural” is the following


d1=Dict(Pair(e...) for e in A1)
d2=Dict(Pair(e...) for e in A2)

mergewith(+, d1, d2)

rafael.guerra · October 2, 2022, 11:16pm

Perhaps it could be written more simply as:

d = mergewith(+, Dict(A1), Dict(A2))

And then to get the output as per OP do:

A3 = [(a, d[a[1]]) for a in A1]

lmiq · October 3, 2022, 12:13am

I just want to stress that:

Dicts are almost certainly slower than vectors of tuples.
There is no reason whatsoever to use any package or fancy syntax for this. The original proposal of the OP is perfectly fine if Tuple{Int,Int}[] is used to initialize the resulting array and all of it is put inside a function.
That will almost certainly be faster than any of the alternatives proposed here.
IMO, the loop is much clearer.

The fact that one can write something like the OP did and get close to the best one can get, just by being explicit about the logic of what one wants to do, is a fundamental feature of Julia.

edit: I’m not sure whether the proposals here do the same thing as the OP’s proposal (or whether they are what was expected). It seems that people assumed the first element of the tuple is equivalent to a dictionary key, which I’m not sure is the case (are they unique?). Probably more information is needed to understand what the best approach is.
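To illustrate why the proposals may not be equivalent, here is a small sketch with made-up data in which key 7 appears only in A1. The loop-style result keeps only keys present in both lists, while mergewith also keeps unmatched keys; the variable names are illustrative only:

```julia
A1 = [(2, 3), (4, 5), (7, 1)]   # key 7 has no partner in A2
A2 = [(4, 6), (2, 8)]

# Same logic as the nested loop: only matching keys contribute.
loop_result = [(i[1], i[2] + j[2]) for i in A1 for j in A2 if i[1] == j[1]]

# mergewith keeps every key from either dictionary.
merged = mergewith(+, Dict(A1), Dict(A2))

# If a key repeats within one list, Dict keeps only the last value for it.
dups = Dict([(2, 3), (2, 10)])
```

So the dictionary versions match the loop only when every key is unique within each list and present in both.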

jar1 · October 3, 2022, 12:39am

It’s good that Julia can make loops performant when they are needed. However, capturing the logic of the operation in a named function like map allows thinking and communicating at a higher level than state-machine operations, so I like to use these functions when I can.

Here’s a C++ perspective on it:

https://belaycpp.com/2021/06/22/dont-use-raw-loops/

lmiq · October 3, 2022, 12:49am

There are cases and cases, but I don’t generally agree with that. Very often code becomes impossible to understand after being written with clever combinations of higher-level functions. Many, many times, the loop is by far the clearest thing to read.

Jollywatt · October 3, 2022, 3:40am

I have nothing new to add, except that this is how I’d write the OP’s loop:

A3 = Tuple{Int,Int}[]
for i in A1, j in A2
    # store the key i[1] (not the whole tuple i) so the element fits Tuple{Int,Int}
    i[1] == j[1] && push!(A3, (i[1], i[2] + j[2]))
end

(Pretty much the same!)

aplavin · October 3, 2022, 5:49am

The requested operation is a join of these two lists.

julia> using FlexiJoins

julia> map(p -> (p.A1, p.A1[2] + p.A2[2]), innerjoin((;A1, A2), by_key(first)))
2-element StructArray(::Vector{Tuple{Int64, Int64}}, ::Vector{Int64}) with eltype Tuple{Tuple{Int64, Int64}, Int64}:
 ((2, 3), 11)
 ((4, 5), 11)

rocco_sprmnt21 · October 3, 2022, 7:29am

Putting it all together, it could look like this …

[Tuple(d) for d in mergewith(+, Dict(A1), Dict(A2))]
