# Add elements to tuples or arrays

Hi everyone, do you know a way to have this working on Julia REPL?

``````>>> a = [1,2,3]
>>> a + 
[1, 2, 3, 4]
``````
``````julia> a=[1, 2, 3]

3-element Vector{Int64}:

1

2

3

julia> push!(a,4)

4-element Vector{Int64}:

1

2

3

4
``````
1 Like

Thank you for your reply, but in that case, I can’t use append or push, because I’m iterating over the array. Let me explain with an example:

``````julia> a = [(1,2), (3,4)]
julia> for (i, t) in enumerate(a)
push!(a, reverse(t))
end
``````

It will generate an infinite loop

What do you want to obtain? Is it `a = [(1,2), (3,4), (2,1), (4,3)]`?

For this example, yes, but I wanna a generic solution. For example, for this case, I know that I can solve it with the code below (just renaming the variables and separating into different lists):

``````function compound(SMILES::String)
benzene = smilestomol(SMILES)
benzene_ed = []
i_vert = []
j_vert = []
for edge in benzene.edges
push!(benzene_ed, edge, reverse(edge))
end
for ed in benzene_ed
push!(i_vert, ed)
push!(j_vert, ed)
end

neighbor_list = tuple(i_vert, j_vert)
return neighbor_list
end
``````

I really would like to know if there is a way of adding elements to arrays inside of an iteration, without using push! or append!

Without loops?

``````append!(a, reverse.(a))
``````

There are options other than `append!` and `push!` but they are generally less efficient. There is `a = [a; reverse.(a)]` or `a = vcat(a, reverse.(a))`, …etc.

No, I wanna use it inside of an iterative loop

If `push!` and the like are expensive for your application, you can pre-allocate the output array once and fill the individual elements inside the loop using the indices. Like:

``````i_vert = Vector{eltype(benzene_ed)}(undef,100)
...
i_vert[i] = benzene_ed[i]
``````

I’m not sure if I understood correctly.

`enumerate` isn’t normally what you want to use on vectors, it does not give you indices, it just counts. It’s better to use `pairs` or `eachindex`, depending on what you need. It’s unfortunate that `enumerate` is so widely known and (mis)used on vectors, and `pairs` is apparently so little known.

``````1.7.1> a = [(1,2), (3,4)]
2-element Vector{Tuple{Int64, Int64}}:
(1, 2)
(3, 4)

1.7.1> for (i, t) in pairs(a)
push!(a, reverse(t))
end

1.7.1> a
4-element Vector{Tuple{Int64, Int64}}:
(1, 2)
(3, 4)
(2, 1)
(4, 3)
``````

Or

``````1.7.1> for i in eachindex(a)
push!(a, reverse(a[i]))
end
``````

BTW, these are very negative for performance:

``````benzene_ed = []
i_vert = []
j_vert = []
``````

Untyped containers are a big red flag, if you ever have `x = []` somewhere in your code, it’s a bad sign, since they create a `Vector{Any}` for which the compiler cannot create specialized (fast) code.

5 Likes

You could also avoid the infinite loop, by looping over a view of `a`:

``````a = [(1,2), (3,4)]
for t in view(a,:)
push!(a, reverse(t))
end
``````
1 Like

It is not clear what you don’t like about those, is it the syntax only?

2 Likes

I think it’s probably a misunderstanding. The problem isn’t with `push!` or `append!`, but with `enumerate` and with `for val in itr`.

2 Likes

I think this is clearly an instance of the XY problem, you are asking for something that you think it is the solution instead of just asking how to solve your real problem. It was not clear (maybe it is not yet, but I think I understand now) what property this function that is not `push!` or `append!` should have to satisfy your requirements.

It seems to me that your problem has nothing to do with `push!` or `append!`, but the fact that the algorithm (or at least the loop inside it) is badly designed. If you wanna grow a `Vector` adding an element for each element originally inside it, then just looping over the `Vector` will not work, as you noticed this ends up as an infinite loop. The first solutions I can think of are either:

1. Make a copy of the original `Vector`, iterate over the copy, but make changes to the original, this way the loop will stop where after all the original elements were processed.
2. (Probably the best.) Just save the length of the original `Vector` in a variable, and iterate from 1 to this length, this way the loop will also stop after all the original elements were processed, even if the `Vector` grew in the meantime .
1 Like

This is the same as using `pairs` or `eachindex`, as suggested upthread, but those will also handle generic indexing (like zero-based).

For me it was a novelty that `enumerate` works that way. It is not clear in the documentation that it has such a fundamental implementation difference relative to `pairs`, etc. Not sure how to search for `for val in vec` in the docs, I think that is not clear there either.

It should not be that rare to iterate over an array adding elements to it, I’m surprised nobody came with that issue before (particularly for the `for val in vec` syntax, which is very common).

2 Likes

Yes, I know, but why these are a solution was not explained anywhere in the thread. I tried to explain the source of the problem and give basic understandable alternatives. As it seemed to me OP needed that more than "just use `eachindex` or `pairs`".

Not sure how to search for for val in vec in the docs, I think that is not clear there either.

The structure `for val in vec` calls Base.iterate. There is also “Manual > Control Flow > Repeated Iteration: Loops” but, while they show the `for val in vec` syntax, they do not mention `Base.Iterate` or the detail that the original length is not cached. Only “Manual > Interfaces > Iteration” gets closer to giving answers.

For me it was a novelty that enumerate works that way. It is not clear in the documentation that it has such a fundamental implementation difference relative to pairs, etc.

Yes, I think maybe a documentation PR would be adequate. Maybe I will start one later if nobody else want dibs. But seems that everyone (including myself) was under a false impression about `pairs`, because `pairs` documentation explicitly says:

Mutation of the bounds of the underlying array will invalidate this iterator.

So it should not be used in the case above. The `eachindex` documentation does not make the same comment.

It should not be that rare to iterate over an array adding elements to it, I’m surprised nobody came with that issue before (particularly for the for val in vec syntax, which is very common).

Yes, I agree. But almost every time I do it, I want to iterate over the newly added elements too (almost every graph algorithm will want that) so this behaviour was always what I wanted and expected.

1 Like

Okay, perhaps my post was less useful than I hoped. But I would not recommend `1:length()`, that’s a bad habit that should be ditched as soon as possible. The natural alternative to `enumerate` is `pairs`.

My real problem is in finding the chemical structure cycles recursively via DFS, the iterations problem I reported is in the find_cycles function. I want the dumb_cycles function to return an array composed of the arrays with the vertices that make up the cycle.

``````using Pkg
using LinearAlgebra
using PyCall
using Statistics
using DataStructures
``````
``````function convert_neighbor_list(nl)
n_vert = maximum.(nl) + 1
K = []
V = []

for i in 1:n_vert
#make two different lists and merge it to a dict
push!(K, i)
push!(V, Int64[])
end
new = OrderedDict(zip(K, V))

for (i_v, j_v) in zip(nl, nl)
push!(new[i_v + 1], j_v)
end

return new
end
``````
``````function find_cycles(i_vert, cnl, max_length, cur_parth, passed_edges)

if length(cur_parth) == max_length
return []
end

acc_cycles = []
sort_cycles = []
res = []

neighbs = cnl[i_vert+1]
for n in neighbs
edge = (minimum([i_vert, n]), maximum([i_vert, n]))

if edge ∉ passed_edges
if n in cur_parth[2:end]
return []
end
end
end

for n in neighbs
edge = (minimum([i_vert, n]), maximum([i_vert, n]))

if edge ∉ passed_edges
if n == cur_parth
return [cur_parth]
end
end
end

for n in neighbs
edge = (minimum([i_vert, n]), maximum([i_vert, n]))

if edge ∉ passed_edges
cycs = find_cycles(n, cnl, max_length, append!(cur_parth, [n]), append!(passed_edges, [edge]))
for cyc in cycs
sorted_cyc = tuple(cyc)
if sorted_cyc ∉ sort_cycles
append!(sort_cycles, sorted_cyc)
append!(acc_cycles, cyc)
end
end
end
end

return acc_cycles
end
``````
``````function dumb_cycle_detection(ase_atoms_no_h, max_length)
neighborlist = pyimport("ase.neighborlist")
neighbor_list = neighborlist.neighbor_list("ij", ase_atoms_no_h, 2.0)
cycles = []
sorted_cycles = []
n_vert = maximum(neighbor_list)
cnl  = convert_neighbor_list(neighbor_list)
for i_vert in range(1, n_vert+1)
cycs = find_cycles(i_vert-1, cnl, max_length, [i_vert-1], [])
for cyc in cycs
sorted_cyc = tuple(cyc)
if sorted_cyc ∉ sorted_cycles
append!(sorted_cycles, sorted_cyc)
append!(cycles, cyc)
end
end
end
return cycles
end
``````
``````ase_io = pyimport("ase.io")
ase = pyimport("ase")
ase_atoms_no_h = ase.Atoms([a for a in ase_atoms if a.symbol != 'H'])
a = dumb_cycle_detection(ase_atoms_no_h, 6) ## max_length is the maximum size of the cycle that I wanna find
print(a)
``````

The output should be:

``````[[0, 2, 3, 4, 5, 1], [1, 5, 9, 8, 7, 6], [4, 5, 9, 10, 11, 12], [6, 7, 24, 25, 26, 27], [7, 8, 20, 22, 23, 24], [8, 9, 10, 19, 21, 20], [10, 11, 16, 17, 18, 19], [11, 12, 13, 14, 15, 16], [15, 16, 17, 163, 164, 165], [23, 24, 25, 43, 44, 45], [28, 29, 33, 32, 31, 30], [29, 33, 37, 36, 35, 34], [32, 40, 39, 38, 37, 33], [34, 35, 52, 53, 54, 55], [35, 36, 48, 50, 51, 52], [36, 37, 38, 47, 49, 48], [38, 39, 44, 45, 46, 47], [39, 40, 41, 42, 43, 44], [51, 73, 72, 71, 53, 52], [56, 57, 61, 60, 59, 58], [57, 61, 65, 64, 63, 62], [60, 68, 67, 66, 65, 61], [62, 63, 80, 81, 82, 83], [63, 80, 79, 78, 76, 64], [64, 76, 77, 75, 66, 65], [66, 67, 72, 73, 74, 75], [67, 68, 69, 70, 71, 72], [79, 80, 81, 99, 100, 101], [84, 85, 89, 88, 87, 86], [85, 90, 91, 92, 93, 89], [88, 89, 93, 94, 95, 96], [90, 91, 108, 109, 110, 111], [91, 92, 104, 106, 107, 108], [92, 93, 94, 103, 105, 104], [94, 103, 102, 101, 100, 95], [95, 96, 97, 98, 99, 100], [107, 129, 128, 127, 109, 108], [112, 113, 117, 116, 115, 114], [113, 117, 121, 120, 119, 118], [116, 124, 123, 122, 121, 117], [118, 119, 136, 137, 138, 139], [119, 120, 132, 134, 135, 136], [120, 121, 122, 131, 133, 132], [122, 123, 128, 129, 130, 131], [123, 124, 125, 126, 127, 128], [135, 136, 137, 155, 156, 157], [140, 141, 145, 144, 143, 142], [141, 146, 147, 148, 149, 145], [144, 145, 149, 150, 151, 152], [146, 147, 164, 165, 166, 167], [147, 164, 163, 162, 160, 148], [148, 160, 161, 159, 150, 149], [150, 151, 156, 157, 158, 159], [151, 152, 153, 154, 155, 156]]
``````

My real problem is in finding the chemical structure cycles recursively via DFS, the iterations problem I reported is in the find_cycles function. I want the dumb_cycles function to return an array composed of the arrays with the vertices that make up the cycle.

``````using Pkg
using LinearAlgebra
using PyCall
using Statistics
using DataStructures
``````
``````function convert_neighbor_list(nl)
n_vert = maximum.(nl) + 1
K = []
V = []

for i in 1:n_vert
#make two different lists and merge it to a dict
push!(K, i)
push!(V, Int64[])
end
new = OrderedDict(zip(K, V))

for (i_v, j_v) in zip(nl, nl)
push!(new[i_v + 1], j_v)
end

return new
end
``````
``````function find_cycles(i_vert, cnl, max_length, cur_parth, passed_edges)

if length(cur_parth) == max_length
return []
end

acc_cycles = []
sort_cycles = []
res = []

neighbs = cnl[i_vert+1]
for n in neighbs
edge = (minimum([i_vert, n]), maximum([i_vert, n]))

if edge ∉ passed_edges
if n in cur_parth[2:end]
return []
end
end
end

for n in neighbs
edge = (minimum([i_vert, n]), maximum([i_vert, n]))

if edge ∉ passed_edges
if n == cur_parth
return [cur_parth]
end
end
end

for n in neighbs
edge = (minimum([i_vert, n]), maximum([i_vert, n]))

if edge ∉ passed_edges
cycs = find_cycles(n, cnl, max_length, append!(cur_parth, [n]), append!(passed_edges, [edge]))
for cyc in cycs
sorted_cyc = tuple(cyc)
if sorted_cyc ∉ sort_cycles
append!(sort_cycles, sorted_cyc)
append!(acc_cycles, cyc)
end
end
end
end

return acc_cycles
end
``````
``````function dumb_cycle_detection(ase_atoms_no_h, max_length)
neighborlist = pyimport("ase.neighborlist")
neighbor_list = neighborlist.neighbor_list("ij", ase_atoms_no_h, 2.0)
cycles = []
sorted_cycles = []
n_vert = maximum(neighbor_list)
cnl  = convert_neighbor_list(neighbor_list)
for i_vert in range(1, n_vert+1)
cycs = find_cycles(i_vert-1, cnl, max_length, [i_vert-1], [])
for cyc in cycs
sorted_cyc = tuple(cyc)
if sorted_cyc ∉ sorted_cycles
append!(sorted_cycles, sorted_cyc)
append!(cycles, cyc)
end
end
end
return cycles
end
``````
``````ase_io = pyimport("ase.io")
``````[[0, 2, 3, 4, 5, 1], [1, 5, 9, 8, 7, 6], [4, 5, 9, 10, 11, 12], [6, 7, 24, 25, 26, 27], [7, 8, 20, 22, 23, 24], [8, 9, 10, 19, 21, 20], [10, 11, 16, 17, 18, 19], [11, 12, 13, 14, 15, 16], [15, 16, 17, 163, 164, 165], [23, 24, 25, 43, 44, 45], [28, 29, 33, 32, 31, 30], [29, 33, 37, 36, 35, 34], [32, 40, 39, 38, 37, 33], [34, 35, 52, 53, 54, 55], [35, 36, 48, 50, 51, 52], [36, 37, 38, 47, 49, 48], [38, 39, 44, 45, 46, 47], [39, 40, 41, 42, 43, 44], [51, 73, 72, 71, 53, 52], [56, 57, 61, 60, 59, 58], [57, 61, 65, 64, 63, 62], [60, 68, 67, 66, 65, 61], [62, 63, 80, 81, 82, 83], [63, 80, 79, 78, 76, 64], [64, 76, 77, 75, 66, 65], [66, 67, 72, 73, 74, 75], [67, 68, 69, 70, 71, 72], [79, 80, 81, 99, 100, 101], [84, 85, 89, 88, 87, 86], [85, 90, 91, 92, 93, 89], [88, 89, 93, 94, 95, 96], [90, 91, 108, 109, 110, 111], [91, 92, 104, 106, 107, 108], [92, 93, 94, 103, 105, 104], [94, 103, 102, 101, 100, 95], [95, 96, 97, 98, 99, 100], [107, 129, 128, 127, 109, 108], [112, 113, 117, 116, 115, 114], [113, 117, 121, 120, 119, 118], [116, 124, 123, 122, 121, 117], [118, 119, 136, 137, 138, 139], [119, 120, 132, 134, 135, 136], [120, 121, 122, 131, 133, 132], [122, 123, 128, 129, 130, 131], [123, 124, 125, 126, 127, 128], [135, 136, 137, 155, 156, 157], [140, 141, 145, 144, 143, 142], [141, 146, 147, 148, 149, 145], [144, 145, 149, 150, 151, 152], [146, 147, 164, 165, 166, 167], [147, 164, 163, 162, 160, 148], [148, 160, 161, 159, 150, 149], [150, 151, 156, 157, 158, 159], [151, 152, 153, 154, 155, 156]]