It is not clear what you don’t like about those, is it the syntax only?
I think it’s probably a misunderstanding. The problem isn’t with push!
or append!
, but with enumerate
and with for val in itr
.
I think this is clearly an instance of the XY problem, you are asking for something that you think it is the solution instead of just asking how to solve your real problem. It was not clear (maybe it is not yet, but I think I understand now) what property this function that is not push!
or append!
should have to satisfy your requirements.
It seems to me that your problem has nothing to do with push!
or append!
, but the fact that the algorithm (or at least the loop inside it) is badly designed. If you wanna grow a Vector
adding an element for each element originally inside it, then just looping over the Vector
will not work, as you noticed this ends up as an infinite loop. The first solutions I can think of are either:
- Make a copy of the original
Vector
, iterate over the copy, but make changes to the original, this way the loop will stop where after all the original elements were processed. - (Probably the best.) Just save the length of the original
Vector
in a variable, and iterate from 1 to this length, this way the loop will also stop after all the original elements were processed, even if theVector
grew in the meantime .
This is the same as using pairs
or eachindex
, as suggested upthread, but those will also handle generic indexing (like zero-based).
For me it was a novelty that enumerate
works that way. It is not clear in the documentation that it has such a fundamental implementation difference relative to pairs
, etc. Not sure how to search for for val in vec
in the docs, I think that is not clear there either.
It should not be that rare to iterate over an array adding elements to it, I’m surprised nobody came with that issue before (particularly for the for val in vec
syntax, which is very common).
Yes, I know, but why these are a solution was not explained anywhere in the thread. I tried to explain the source of the problem and give basic understandable alternatives. As it seemed to me OP needed that more than “just use eachindex
or pairs
”.
Not sure how to search for for val in vec in the docs, I think that is not clear there either.
The structure for val in vec
calls Base.iterate. There is also “Manual > Control Flow > Repeated Iteration: Loops” but, while they show the for val in vec
syntax, they do not mention Base.Iterate
or the detail that the original length is not cached. Only “Manual > Interfaces > Iteration” gets closer to giving answers.
For me it was a novelty that enumerate works that way. It is not clear in the documentation that it has such a fundamental implementation difference relative to pairs, etc.
Yes, I think maybe a documentation PR would be adequate. Maybe I will start one later if nobody else want dibs. But seems that everyone (including myself) was under a false impression about pairs
, because pairs
documentation explicitly says:
Mutation of the bounds of the underlying array will invalidate this iterator.
So it should not be used in the case above. The eachindex
documentation does not make the same comment.
It should not be that rare to iterate over an array adding elements to it, I’m surprised nobody came with that issue before (particularly for the for val in vec syntax, which is very common).
Yes, I agree. But almost every time I do it, I want to iterate over the newly added elements too (almost every graph algorithm will want that) so this behaviour was always what I wanted and expected.
Okay, perhaps my post was less useful than I hoped. But I would not recommend 1:length()
, that’s a bad habit that should be ditched as soon as possible. The natural alternative to enumerate
is pairs
.
My real problem is in finding the chemical structure cycles recursively via DFS, the iterations problem I reported is in the find_cycles function. I want the dumb_cycles function to return an array composed of the arrays with the vertices that make up the cycle.
using Pkg
using LinearAlgebra
using PyCall
using Statistics
using DataStructures
function convert_neighbor_list(nl)
n_vert = maximum.(nl)[1] + 1
K = []
V = []
for i in 1:n_vert
#make two different lists and merge it to a dict
push!(K, i)
push!(V, Int64[])
end
new = OrderedDict(zip(K, V))
for (i_v, j_v) in zip(nl[1], nl[2])
push!(new[i_v + 1], j_v)
end
return new
end
function find_cycles(i_vert, cnl, max_length, cur_parth, passed_edges)
if length(cur_parth) == max_length
return []
end
acc_cycles = []
sort_cycles = []
res = []
neighbs = cnl[i_vert+1]
for n in neighbs
edge = (minimum([i_vert, n]), maximum([i_vert, n]))
if edge ∉ passed_edges
if n in cur_parth[2:end]
return []
end
end
end
for n in neighbs
edge = (minimum([i_vert, n]), maximum([i_vert, n]))
if edge ∉ passed_edges
if n == cur_parth[1]
return [cur_parth]
end
end
end
for n in neighbs
edge = (minimum([i_vert, n]), maximum([i_vert, n]))
if edge ∉ passed_edges
cycs = find_cycles(n, cnl, max_length, append!(cur_parth, [n]), append!(passed_edges, [edge]))
for cyc in cycs
sorted_cyc = tuple(cyc)
if sorted_cyc ∉ sort_cycles
append!(sort_cycles, sorted_cyc)
append!(acc_cycles, cyc)
end
end
end
end
return acc_cycles
end
function dumb_cycle_detection(ase_atoms_no_h, max_length)
neighborlist = pyimport("ase.neighborlist")
neighbor_list = neighborlist.neighbor_list("ij", ase_atoms_no_h, 2.0)
cycles = []
sorted_cycles = []
n_vert = maximum(neighbor_list[1])
cnl = convert_neighbor_list(neighbor_list)
for i_vert in range(1, n_vert+1)
cycs = find_cycles(i_vert-1, cnl, max_length, [i_vert-1], [])
for cyc in cycs
sorted_cyc = tuple(cyc)
if sorted_cyc ∉ sorted_cycles
append!(sorted_cycles, sorted_cyc)
append!(cycles, cyc)
end
end
end
return cycles
end
ase_io = pyimport("ase.io")
ase_atoms = ase_io.read("donut-6-b3lyp-opt.xyz")
ase = pyimport("ase")
ase_atoms_no_h = ase.Atoms([a for a in ase_atoms if a.symbol != 'H'])
a = dumb_cycle_detection(ase_atoms_no_h, 6) ## max_length is the maximum size of the cycle that I wanna find
print(a)
The output should be:
[[0, 2, 3, 4, 5, 1], [1, 5, 9, 8, 7, 6], [4, 5, 9, 10, 11, 12], [6, 7, 24, 25, 26, 27], [7, 8, 20, 22, 23, 24], [8, 9, 10, 19, 21, 20], [10, 11, 16, 17, 18, 19], [11, 12, 13, 14, 15, 16], [15, 16, 17, 163, 164, 165], [23, 24, 25, 43, 44, 45], [28, 29, 33, 32, 31, 30], [29, 33, 37, 36, 35, 34], [32, 40, 39, 38, 37, 33], [34, 35, 52, 53, 54, 55], [35, 36, 48, 50, 51, 52], [36, 37, 38, 47, 49, 48], [38, 39, 44, 45, 46, 47], [39, 40, 41, 42, 43, 44], [51, 73, 72, 71, 53, 52], [56, 57, 61, 60, 59, 58], [57, 61, 65, 64, 63, 62], [60, 68, 67, 66, 65, 61], [62, 63, 80, 81, 82, 83], [63, 80, 79, 78, 76, 64], [64, 76, 77, 75, 66, 65], [66, 67, 72, 73, 74, 75], [67, 68, 69, 70, 71, 72], [79, 80, 81, 99, 100, 101], [84, 85, 89, 88, 87, 86], [85, 90, 91, 92, 93, 89], [88, 89, 93, 94, 95, 96], [90, 91, 108, 109, 110, 111], [91, 92, 104, 106, 107, 108], [92, 93, 94, 103, 105, 104], [94, 103, 102, 101, 100, 95], [95, 96, 97, 98, 99, 100], [107, 129, 128, 127, 109, 108], [112, 113, 117, 116, 115, 114], [113, 117, 121, 120, 119, 118], [116, 124, 123, 122, 121, 117], [118, 119, 136, 137, 138, 139], [119, 120, 132, 134, 135, 136], [120, 121, 122, 131, 133, 132], [122, 123, 128, 129, 130, 131], [123, 124, 125, 126, 127, 128], [135, 136, 137, 155, 156, 157], [140, 141, 145, 144, 143, 142], [141, 146, 147, 148, 149, 145], [144, 145, 149, 150, 151, 152], [146, 147, 164, 165, 166, 167], [147, 164, 163, 162, 160, 148], [148, 160, 161, 159, 150, 149], [150, 151, 156, 157, 158, 159], [151, 152, 153, 154, 155, 156]]
My real problem is in finding the chemical structure cycles recursively via DFS, the iterations problem I reported is in the find_cycles function. I want the dumb_cycles function to return an array composed of the arrays with the vertices that make up the cycle.
using Pkg
using LinearAlgebra
using PyCall
using Statistics
using DataStructures
function convert_neighbor_list(nl)
n_vert = maximum.(nl)[1] + 1
K = []
V = []
for i in 1:n_vert
#make two different lists and merge it to a dict
push!(K, i)
push!(V, Int64[])
end
new = OrderedDict(zip(K, V))
for (i_v, j_v) in zip(nl[1], nl[2])
push!(new[i_v + 1], j_v)
end
return new
end
function find_cycles(i_vert, cnl, max_length, cur_parth, passed_edges)
if length(cur_parth) == max_length
return []
end
acc_cycles = []
sort_cycles = []
res = []
neighbs = cnl[i_vert+1]
for n in neighbs
edge = (minimum([i_vert, n]), maximum([i_vert, n]))
if edge ∉ passed_edges
if n in cur_parth[2:end]
return []
end
end
end
for n in neighbs
edge = (minimum([i_vert, n]), maximum([i_vert, n]))
if edge ∉ passed_edges
if n == cur_parth[1]
return [cur_parth]
end
end
end
for n in neighbs
edge = (minimum([i_vert, n]), maximum([i_vert, n]))
if edge ∉ passed_edges
cycs = find_cycles(n, cnl, max_length, append!(cur_parth, [n]), append!(passed_edges, [edge]))
for cyc in cycs
sorted_cyc = tuple(cyc)
if sorted_cyc ∉ sort_cycles
append!(sort_cycles, sorted_cyc)
append!(acc_cycles, cyc)
end
end
end
end
return acc_cycles
end
function dumb_cycle_detection(ase_atoms_no_h, max_length)
neighborlist = pyimport("ase.neighborlist")
neighbor_list = neighborlist.neighbor_list("ij", ase_atoms_no_h, 2.0)
cycles = []
sorted_cycles = []
n_vert = maximum(neighbor_list[1])
cnl = convert_neighbor_list(neighbor_list)
for i_vert in range(1, n_vert+1)
cycs = find_cycles(i_vert-1, cnl, max_length, [i_vert-1], [])
for cyc in cycs
sorted_cyc = tuple(cyc)
if sorted_cyc ∉ sorted_cycles
append!(sorted_cycles, sorted_cyc)
append!(cycles, cyc)
end
end
end
return cycles
end
ase_io = pyimport("ase.io")
ase_atoms = ase_io.read("donut-6-b3lyp-opt.xyz")
ase = pyimport("ase")
ase_atoms_no_h = ase.Atoms([a for a in ase_atoms if a.symbol != 'H'])
a = dumb_cycle_detection(ase_atoms_no_h, 6) ## max_length is the maximum size of the cycle that I wanna find
print(a)
The output should be:
[[0, 2, 3, 4, 5, 1], [1, 5, 9, 8, 7, 6], [4, 5, 9, 10, 11, 12], [6, 7, 24, 25, 26, 27], [7, 8, 20, 22, 23, 24], [8, 9, 10, 19, 21, 20], [10, 11, 16, 17, 18, 19], [11, 12, 13, 14, 15, 16], [15, 16, 17, 163, 164, 165], [23, 24, 25, 43, 44, 45], [28, 29, 33, 32, 31, 30], [29, 33, 37, 36, 35, 34], [32, 40, 39, 38, 37, 33], [34, 35, 52, 53, 54, 55], [35, 36, 48, 50, 51, 52], [36, 37, 38, 47, 49, 48], [38, 39, 44, 45, 46, 47], [39, 40, 41, 42, 43, 44], [51, 73, 72, 71, 53, 52], [56, 57, 61, 60, 59, 58], [57, 61, 65, 64, 63, 62], [60, 68, 67, 66, 65, 61], [62, 63, 80, 81, 82, 83], [63, 80, 79, 78, 76, 64], [64, 76, 77, 75, 66, 65], [66, 67, 72, 73, 74, 75], [67, 68, 69, 70, 71, 72], [79, 80, 81, 99, 100, 101], [84, 85, 89, 88, 87, 86], [85, 90, 91, 92, 93, 89], [88, 89, 93, 94, 95, 96], [90, 91, 108, 109, 110, 111], [91, 92, 104, 106, 107, 108], [92, 93, 94, 103, 105, 104], [94, 103, 102, 101, 100, 95], [95, 96, 97, 98, 99, 100], [107, 129, 128, 127, 109, 108], [112, 113, 117, 116, 115, 114], [113, 117, 121, 120, 119, 118], [116, 124, 123, 122, 121, 117], [118, 119, 136, 137, 138, 139], [119, 120, 132, 134, 135, 136], [120, 121, 122, 131, 133, 132], [122, 123, 128, 129, 130, 131], [123, 124, 125, 126, 127, 128], [135, 136, 137, 155, 156, 157], [140, 141, 145, 144, 143, 142], [141, 146, 147, 148, 149, 145], [144, 145, 149, 150, 151, 152], [146, 147, 164, 165, 166, 167], [147, 164, 163, 162, 160, 148], [148, 160, 161, 159, 150, 149], [150, 151, 156, 157, 158, 159], [151, 152, 153, 154, 155, 156]]
I didn’t look at the details, but from the discussion above you should use for (_, cyc) in pairs(cycs)
here.
Thank you, have you an alternative for using ase
with pycall? I’d like to use a julia package to extract the neighborlists from xyz files
I had never heard of ase before, thus no idea.
Not really, the code does not seem to be appending to cycs
while traversing it, so I am not sure anymore if this is the reason of the infinite loop.
I did not inspect all the code but:
for (i_v, j_v) in zip(nl[1], nl[2])
push!(new[i_v + 1], j_v)
end
Is probably wrong, because you are changing the identifier of the nodes but not the references to them. This is, you are saying that now node 0
is node 1
, node 1
is node 2
and so on, but the j_v
is referring to the nodes by their old numbers, starting at 0
. I see some +1
and -1
in some variables that may be trying to fix this inconsistency, however, the ideal solution is to either change all indexes to base 1 at the start and then reduce 1 at after the end, or do not change the indexes at all (you are using an OrderedDict
not a Vector
, you do not need to start at base 1 if this is were you will keep all the references).
You should probably make the .xyz
files available to us if you want us to run the code trying to find the problem. Even if most people (I included) will probably not install Python and some Python library to run the code. Instead, a code with the object ase_atoms_no_h
hardcoded would be ideal for us to check for problems.
The xyz file
168
Properties=species:S:1:pos:R:3 a=T pbc="F F F"
C -12.39796606 5.75557728 -0.00041200
C -11.19877446 5.02970593 -0.00032500
C -12.39394703 7.14159987 -0.00041400
C -11.19681055 7.84012337 -0.00032400
C -9.96744844 7.16661422 -0.00023300
C -9.96819366 5.74383274 -0.00024000
C -11.18330163 3.57177356 -0.00031300
C -9.94599750 2.87057938 -0.00023200
C -8.72064338 3.60828242 -0.00016200
C -8.72310290 5.02638964 -0.00016200
C -7.49496079 5.73540164 -0.00008200
C -7.47119819 7.16548340 -0.00006000
C -8.69832579 7.88433709 -0.00012800
C -8.64244256 9.28197414 -0.00008900
C -7.43982971 9.95371555 0.00002900
C -6.21247989 9.27715558 0.00010500
C -6.22187177 7.86472647 0.00003800
C -4.97636979 7.12169735 0.00007500
C -5.03553510 5.75021698 0.00006300
C -6.26161667 5.01942371 -0.00000900
C -7.48271125 2.90026830 -0.00008500
C -6.28203360 3.61980199 -0.00001300
C -7.50012903 1.47302251 -0.00008200
C -8.65702774 0.73407711 -0.00014500
C -9.92446254 1.43904343 -0.00021700
C -11.14174854 0.72262286 -0.00026600
C -12.34257130 1.44521981 -0.00034900
C -12.36447512 2.82254806 -0.00037400
C -11.18344950 -7.85916442 -0.00005400
C -9.95523438 -7.18356547 -0.00004800
C -12.38177302 -7.16267824 -0.00009000
C -12.38815194 -5.77666563 -0.00012300
C -11.19019759 -5.04875581 -0.00012000
C -9.95840276 -5.76078798 -0.00008100
C -8.68488954 -7.89912934 -0.00001100
C -7.45898928 -7.17818687 -0.00000900
C -7.48518665 -5.74814774 -0.00003800
C -8.71453531 -5.04122633 -0.00007300
C -8.71448956 -3.62311770 -0.00009900
C -9.94109821 -2.88750101 -0.00014100
C -11.17720663 -3.59079847 -0.00015600
C -12.35965442 -2.84358337 -0.00020600
C -12.34009569 -1.46621958 -0.00024100
C -11.14050425 -0.74158212 -0.00022600
C -9.92200007 -1.45593064 -0.00016900
C -8.65576609 -0.74880892 -0.00014100
C -7.49761137 -1.48578514 -0.00010700
C -7.47776502 -2.91299915 -0.00008800
C -6.25306491 -5.03007198 -0.00003100
C -6.27586411 -3.63048832 -0.00005500
C -5.02574116 -5.75877619 0.00000000
C -4.96423994 -7.13015173 0.00002500
C -6.20847340 -7.87530143 0.00002300
C -6.19667499 -9.28771224 0.00005400
C -7.42286943 -9.96636153 0.00005600
C -8.62662507 -9.29666966 0.00002400
C 1.21450823 -13.61472667 0.00022800
C 1.24353338 -12.21325957 0.00019700
C 0.01216833 -14.30425732 0.00023100
C -1.19134122 -13.61676991 0.00020300
C -1.22274459 -12.21535646 0.00017100
C 0.00979104 -11.50461068 0.00016900
C 2.49840294 -11.47089481 0.00019400
C 2.48700183 -10.04876049 0.00015900
C 1.23545432 -9.35642485 0.00013300
C 0.00856839 -10.06760901 0.00013700
C -1.21952599 -9.35851261 0.00011100
C -2.46989304 -10.05297615 0.00011200
C -2.47887384 -11.47512729 0.00014000
C -3.71720311 -12.12555146 0.00013700
C -4.90025658 -11.41993218 0.00010900
C -4.92801419 -10.01873685 0.00008300
C -3.70011978 -9.32065305 0.00008300
C -3.67939065 -7.87050088 0.00005500
C -2.46207174 -7.23599761 0.00005600
C -1.21614479 -7.93241823 0.00008200
C 1.22964557 -7.93033819 0.00010200
C 0.00617072 -7.25028844 0.00007900
C 2.47438565 -7.23180047 0.00009300
C 3.69278161 -7.86423016 0.00011300
C 3.71598009 -9.31434484 0.00015200
C 4.94506338 -10.01033595 0.00018300
C 4.91968930 -11.41157691 0.00021900
C 3.73783892 -12.11920941 0.00022400
C 12.39796607 -5.75557729 0.00006900
C 11.19877447 -5.02970594 0.00003000
C 12.39394714 -7.14159888 0.00013300
C 11.19681155 -7.84012350 0.00015900
C 9.96744844 -7.16661423 0.00012300
C 9.96819366 -5.74383275 0.00006200
C 11.18330164 -3.57177357 -0.00004200
C 9.94599650 -2.87057927 -0.00006400
C 8.72064339 -3.60828242 -0.00002400
C 8.72310291 -5.02638964 0.00003200
C 7.49496079 -5.73540164 0.00005800
C 7.47119819 -7.16548341 0.00011100
C 8.69832579 -7.88433709 0.00014500
C 8.64244256 -9.28197414 0.00019700
C 7.43982971 -9.95371556 0.00021100
C 6.21247977 -9.27715658 0.00017500
C 6.22187277 -7.86472660 0.00012800
C 4.97636979 -7.12169735 0.00009600
C 5.03553511 -5.75021699 0.00004900
C 6.26161657 -5.01942471 0.00002800
C 7.48271126 -2.90026831 -0.00004800
C 6.28203249 -3.61980289 -0.00002200
C 7.50012804 -1.47302240 -0.00009600
C 8.65702674 -0.73407700 -0.00012700
C 9.92446254 -1.43904344 -0.00012900
C 11.14174854 -0.72262287 -0.00019800
C 12.34257131 -1.44521982 -0.00017100
C 12.36447512 -2.82254807 -0.00009200
C 11.18345049 7.85916430 -0.00054900
C 9.95523438 7.18356546 -0.00041400
C 12.38177390 7.16267712 -0.00065100
C 12.38815183 5.77666463 -0.00061500
C 11.19019759 5.04875580 -0.00048200
C 9.95840277 5.76078797 -0.00038900
C 8.68488966 7.89913033 -0.00029400
C 7.45898939 7.17818786 -0.00018300
C 7.48518666 5.74814773 -0.00017700
C 8.71453543 5.04122732 -0.00026800
C 8.71448858 3.62311780 -0.00023600
C 9.94109721 2.88750112 -0.00031000
C 11.17720663 3.59079846 -0.00043100
C 12.35965442 2.84358336 -0.00049100
C 12.34009569 1.46621958 -0.00042100
C 11.14050325 0.74158222 -0.00029400
C 9.92199909 1.45593074 -0.00025500
C 8.65576510 0.74880902 -0.00015400
C 7.49761050 1.48578624 -0.00008500
C 7.47776402 2.91299926 -0.00011800
C 6.25306492 5.03007198 -0.00006300
C 6.27586311 3.63048842 -0.00003700
C 5.02574017 5.75877631 0.00002100
C 4.96424006 7.13015272 0.00000800
C 6.20847340 7.87530143 -0.00006800
C 6.19667511 9.28771322 -0.00002000
C 7.42286943 9.96636153 -0.00014100
C 8.62662507 9.29666965 -0.00027900
C -1.21450933 13.61472577 0.00072000
C -1.24353337 12.21325957 0.00055900
C -0.01216931 14.30425744 0.00079100
C 1.19134024 13.61677001 0.00069700
C 1.22274459 12.21535645 0.00053600
C -0.00979104 11.50461067 0.00047600
C -2.49840294 11.47089480 0.00047000
C -2.48700182 10.04876048 0.00032700
C -1.23545432 9.35642485 0.00026700
C -0.00856839 10.06760901 0.00033000
C 1.21952599 9.35851261 0.00024600
C 2.46989315 10.05297714 0.00028200
C 2.47887385 11.47512729 0.00042100
C 3.71720312 12.12555145 0.00043800
C 4.90025670 11.41993317 0.00030300
C 4.92801419 10.01873684 0.00015300
C 3.70011990 9.32065404 0.00017000
C 3.67939076 7.87050186 0.00006700
C 2.46207284 7.23599848 0.00002800
C 1.21614479 7.93241823 0.00010800
C -1.22964458 7.93033807 0.00012700
C -0.00616972 7.25028833 0.00005400
C -2.47438477 7.23179936 0.00006600
C -3.69278061 7.86423003 0.00012300
C -3.71598008 9.31434484 0.00023600
C -4.94506239 10.01033583 0.00025100
C -4.91968830 11.41157679 0.00040300
C -3.73783892 12.11920941 0.00051400
@DNF, @Henrique_Becker, The variables
> ase_atoms_no_h
PyObject Atoms(symbols='C168', pbc=False)
> neighborlist = pyimport("ase.neighborlist")
> neighbor_list = neighborlist.neighbor_list("ij", ase_atoms_no_h, 2.0)
> print(neighbor_list)
([0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 11, 12, 12, 12, 13, 13, 14, 14, 15, 15, 15, 16, 16, 16, 17, 17, 17, 18, 18, 19, 19, 19, 20, 20, 20, 21, 21, 22, 22, 23, 23, 23, 24, 24, 24, 25, 25, 25, 26, 26, 27, 27, 28, 28, 29, 29, 29, 30, 30, 31, 31, 32, 32, 32, 33, 33, 33, 34, 34, 34, 35, 35, 35, 36, 36, 36, 37, 37, 37, 38, 38, 38, 39, 39, 39, 40, 40, 40, 41, 41, 42, 42, 43, 43, 43, 44, 44, 44, 45, 45, 45, 46, 46, 47, 47, 47, 48, 48, 48, 49, 49, 50, 50, 51, 51, 51, 52, 52, 52, 53, 53, 53, 54, 54, 55, 55, 56, 56, 57, 57, 57, 58, 58, 59, 59, 60, 60, 60, 61, 61, 61, 62, 62, 62, 63, 63, 63, 64, 64, 64, 65, 65, 65, 66, 66, 66, 67, 67, 67, 68, 68, 68, 69, 69, 70, 70, 71, 71, 71, 72, 72, 72, 73, 73, 73, 74, 74, 75, 75, 75, 76, 76, 76, 77, 77, 78, 78, 79, 79, 79, 80, 80, 80, 81, 81, 81, 82, 82, 83, 83, 84, 84, 85, 85, 85, 86, 86, 87, 87, 88, 88, 88, 89, 89, 89, 90, 90, 90, 91, 91, 91, 92, 92, 92, 93, 93, 93, 94, 94, 94, 95, 95, 95, 96, 96, 96, 97, 97, 98, 98, 99, 99, 99, 100, 100, 100, 101, 101, 101, 102, 102, 103, 103, 103, 104, 104, 104, 105, 105, 106, 106, 107, 107, 107, 108, 108, 108, 109, 109, 109, 110, 110, 111, 111, 112, 112, 113, 113, 113, 114, 114, 115, 115, 116, 116, 116, 117, 117, 117, 118, 118, 118, 119, 119, 119, 120, 120, 120, 121, 121, 121, 122, 122, 122, 123, 123, 123, 124, 124, 124, 125, 125, 126, 126, 127, 127, 127, 128, 128, 128, 129, 129, 129, 130, 130, 131, 131, 131, 132, 132, 132, 133, 133, 134, 134, 135, 135, 135, 136, 136, 136, 137, 137, 137, 138, 138, 139, 139, 140, 140, 141, 141, 141, 142, 142, 143, 143, 144, 144, 144, 145, 145, 145, 146, 146, 146, 147, 147, 147, 148, 148, 148, 149, 149, 149, 150, 150, 150, 151, 151, 151, 152, 152, 152, 153, 153, 154, 154, 155, 155, 155, 156, 156, 156, 157, 157, 157, 158, 158, 159, 159, 159, 160, 160, 160, 161, 161, 162, 162, 163, 163, 163, 164, 164, 164, 165, 165, 165, 166, 166, 167, 167], [2, 1, 5, 6, 0, 3, 0, 4, 2, 5, 3, 12, 1, 4, 9, 7, 1, 27, 8, 6, 24, 9, 7, 20, 5, 10, 8, 9, 11, 19, 12, 16, 10, 4, 11, 13, 12, 14, 13, 15, 14, 16, 165, 11, 15, 17, 16, 18, 163, 19, 17, 21, 10, 18, 8, 22, 21, 19, 20, 20, 23, 24, 22, 45, 7, 23, 25, 24, 26, 43, 25, 27, 6, 26, 29, 30, 28, 33, 34, 28, 31, 30, 32, 40, 31, 33, 29, 32, 37, 29, 35, 55, 34, 36, 52, 35, 37, 48, 36, 38, 33, 37, 39, 47, 40, 38, 44, 39, 32, 41, 42, 40, 41, 43, 25, 44, 42, 39, 45, 43, 23, 46, 44, 47, 45, 38, 46, 49, 36, 49, 50, 47, 48, 48, 51, 50, 73, 52, 53, 51, 35, 71, 54, 52, 55, 53, 54, 34, 57, 58, 61, 56, 62, 56, 59, 58, 60, 59, 68, 61, 60, 57, 65, 57, 63, 83, 80, 62, 64, 63, 76, 65, 66, 61, 64, 65, 67, 75, 66, 68, 72, 60, 67, 69, 68, 70, 69, 71, 53, 70, 72, 67, 71, 73, 74, 72, 51, 73, 75, 66, 74, 77, 77, 78, 64, 76, 75, 76, 79, 78, 80, 101, 79, 81, 63, 80, 82, 99, 83, 81, 62, 82, 85, 86, 84, 90, 89, 87, 84, 86, 88, 87, 89, 96, 88, 93, 85, 91, 111, 85, 90, 92, 108, 91, 93, 104, 89, 92, 94, 103, 93, 95, 94, 96, 100, 88, 95, 97, 96, 98, 97, 99, 81, 98, 100, 95, 99, 101, 79, 100, 102, 103, 101, 94, 102, 105, 106, 105, 92, 104, 103, 107, 104, 129, 106, 108, 91, 107, 109, 108, 110, 127, 109, 111, 90, 110, 113, 114, 112, 117, 118, 112, 115, 114, 116, 124, 115, 117, 113, 116, 121, 113, 119, 139, 118, 120, 136, 119, 121, 132, 117, 120, 122, 121, 123, 131, 124, 122, 128, 123, 116, 125, 126, 124, 127, 125, 126, 128, 109, 123, 129, 127, 107, 128, 130, 129, 131, 122, 130, 133, 120, 133, 134, 131, 132, 132, 135, 134, 136, 157, 137, 135, 119, 136, 138, 155, 137, 139, 118, 138, 141, 142, 140, 146, 145, 140, 143, 142, 144, 143, 145, 152, 149, 141, 144, 141, 147, 167, 164, 148, 146, 160, 149, 147, 150, 145, 148, 151, 149, 159, 150, 152, 156, 144, 153, 151, 152, 154, 153, 155, 137, 154, 156, 151, 155, 157, 135, 156, 158, 157, 159, 161, 158, 150, 148, 162, 161, 159, 160, 163, 160, 164, 162, 17, 163, 165, 147, 15, 166, 164, 167, 165, 166, 146])
> a = convert_neighbor_list(neighbor_list)
> print(a)
OrderedDict{Any, Any}(1 => [2, 1], 2 => [5, 6, 0], 3 => [3, 0], 4 => [4, 2], 5 => [5, 3, 12], 6 => [1, 4, 9], 7 => [7, 1, 27], 8 => [8, 6, 24], 9 => [9, 7, 20], 10 => [5, 10, 8], 11 => [9, 11, 19], 12 => [12, 16, 10], 13 => [4, 11, 13], 14 => [12, 14], 15 => [13, 15], 16 => [14, 16, 165], 17 => [11, 15, 17], 18 => [16, 18, 163], 19 => [19, 17], 20 => [21, 10, 18], 21 => [8, 22, 21], 22 => [19, 20], 23 => [20, 23], 24 => [24, 22, 45], 25 => [7, 23, 25], 26 => [24, 26, 43], 27 => [25, 27], 28 => [6, 26], 29 => [29, 30], 30 => [28, 33, 34], 31 => [28, 31], 32 => [30, 32], 33 => [40, 31, 33], 34 => [29, 32, 37], 35 => [29, 35, 55], 36 => [34, 36, 52], 37 => [35, 37, 48], 38 => [36, 38, 33], 39 => [37, 39, 47], 40 => [40, 38, 44], 41 => [39, 32, 41], 42 => [42, 40], 43 => [41, 43], 44 => [25, 44, 42], 45 => [39, 45, 43], 46 => [23, 46, 44], 47 => [47, 45], 48 => [38, 46, 49], 49 => [36, 49, 50], 50 => [47, 48], 51 => [48, 51], 52 => [50, 73, 52], 53 => [53, 51, 35], 54 => [71, 54, 52], 55 => [55, 53], 56 => [54, 34], 57 => [57, 58], 58 => [61, 56, 62], 59 => [56, 59], 60 => [58, 60], 61 => [59, 68, 61], 62 => [60, 57, 65], 63 => [57, 63, 83], 64 => [80, 62, 64], 65 => [63, 76, 65], 66 => [66, 61, 64], 67 => [65, 67, 75], 68 => [66, 68, 72], 69 => [60, 67, 69], 70 => [68, 70], 71 => [69, 71], 72 => [53, 70, 72], 73 => [67, 71, 73], 74 => [74, 72, 51], 75 => [73, 75], 76 => [66, 74, 77], 77 => [77, 78, 64], 78 => [76, 75], 79 => [76, 79], 80 => [78, 80, 101], 81 => [79, 81, 63], 82 => [80, 82, 99], 83 => [83, 81], 84 => [62, 82], 85 => [85, 86], 86 => [84, 90, 89], 87 => [87, 84], 88 => [86, 88], 89 => [87, 89, 96], 90 => [88, 93, 85], 91 => [91, 111, 85], 92 => [90, 92, 108], 93 => [91, 93, 104], 94 => [89, 92, 94], 95 => [103, 93, 95], 96 => [94, 96, 100], 97 => [88, 95, 97], 98 => [96, 98], 99 => [97, 99], 100 => [81, 98, 100], 101 => [95, 99, 101], 102 => [79, 100, 102], 103 => [103, 101], 104 => [94, 102, 105], 105 => [106, 105, 92], 106 => [104, 103], 107 => [107, 104], 108 => [129, 106, 108], 109 => [91, 107, 109], 110 => [108, 110, 127], 111 => [109, 111], 112 => [90, 110], 113 => [113, 114], 114 => [112, 117, 118], 115 => [112, 115], 116 => [114, 116], 117 => [124, 115, 117], 118 => [113, 116, 121], 119 => [113, 119, 139], 120 => [118, 120, 136], 121 => [119, 121, 132], 122 => [117, 120, 122], 123 => [121, 123, 131], 124 => [124, 122, 128], 125 => [123, 116, 125], 126 => [126, 124], 127 => [127, 125], 128 => [126, 128, 109], 129 => [123, 129, 127], 130 => [107, 128, 130], 131 => [129, 131], 132 => [122, 130, 133], 133 => [120, 133, 134], 134 => [131, 132], 135 => [132, 135], 136 => [134, 136, 157], 137 => [137, 135, 119], 138 => [136, 138, 155], 139 => [137, 139], 140 => [118, 138], 141 => [141, 142], 142 => [140, 146, 145], 143 => [140, 143], 144 => [142, 144], 145 => [143, 145, 152], 146 => [149, 141, 144], 147 => [141, 147, 167], 148 => [164, 148, 146], 149 => [160, 149, 147], 150 => [150, 145, 148], 151 => [151, 149, 159], 152 => [150, 152, 156], 153 => [144, 153, 151], 154 => [152, 154], 155 => [153, 155], 156 => [137, 154, 156], 157 => [151, 155, 157], 158 => [135, 156, 158], 159 => [157, 159], 160 => [161, 158, 150], 161 => [148, 162, 161], 162 => [159, 160], 163 => [163, 160], 164 => [164, 162, 17], 165 => [163, 165, 147], 166 => [15, 166, 164], 167 => [167, 165], 168 => [166, 146])
That neighbor list is distance based (easy) or is it based on the chemical nature and valence of each atom?
Unfortunately, I found the original code to be too confusing, and in the end I rewrote the whole algorithm. One thing I was not sure during the implementation is if you are interested only on cycles of exactly length 6? Or any length smaller than 6 will do? My implementation only gets the cycles of size 6, but is easy to change this.
using LinearAlgebra
using Statistics
using DataStructures
const edges = ([0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 11, 12, 12, 12, 13, 13, 14, 14, 15, 15, 15, 16, 16, 16, 17, 17, 17, 18, 18, 19, 19, 19, 20, 20, 20, 21, 21, 22, 22, 23, 23, 23, 24, 24, 24, 25, 25, 25, 26, 26, 27, 27, 28, 28, 29, 29, 29, 30, 30, 31, 31, 32, 32, 32, 33, 33, 33, 34, 34, 34, 35, 35, 35, 36, 36, 36, 37, 37, 37, 38, 38, 38, 39, 39, 39, 40, 40, 40, 41, 41, 42, 42, 43, 43, 43, 44, 44, 44, 45, 45, 45, 46, 46, 47, 47, 47, 48, 48, 48, 49, 49, 50, 50, 51, 51, 51, 52, 52, 52, 53, 53, 53, 54, 54, 55, 55, 56, 56, 57, 57, 57, 58, 58, 59, 59, 60, 60, 60, 61, 61, 61, 62, 62, 62, 63, 63, 63, 64, 64, 64, 65, 65, 65, 66, 66, 66, 67, 67, 67, 68, 68, 68, 69, 69, 70, 70, 71, 71, 71, 72, 72, 72, 73, 73, 73, 74, 74, 75, 75, 75, 76, 76, 76, 77, 77, 78, 78, 79, 79, 79, 80, 80, 80, 81, 81, 81, 82, 82, 83, 83, 84, 84, 85, 85, 85, 86, 86, 87, 87, 88, 88, 88, 89, 89, 89, 90, 90, 90, 91, 91, 91, 92, 92, 92, 93, 93, 93, 94, 94, 94, 95, 95, 95, 96, 96, 96, 97, 97, 98, 98, 99, 99, 99, 100, 100, 100, 101, 101, 101, 102, 102, 103, 103, 103, 104, 104, 104, 105, 105, 106, 106, 107, 107, 107, 108, 108, 108, 109, 109, 109, 110, 110, 111, 111, 112, 112, 113, 113, 113, 114, 114, 115, 115, 116, 116, 116, 117, 117, 117, 118, 118, 118, 119, 119, 119, 120, 120, 120, 121, 121, 121, 122, 122, 122, 123, 123, 123, 124, 124, 124, 125, 125, 126, 126, 127, 127, 127, 128, 128, 128, 129, 129, 129, 130, 130, 131, 131, 131, 132, 132, 132, 133, 133, 134, 134, 135, 135, 135, 136, 136, 136, 137, 137, 137, 138, 138, 139, 139, 140, 140, 141, 141, 141, 142, 142, 143, 143, 144, 144, 144, 145, 145, 145, 146, 146, 146, 147, 147, 147, 148, 148, 148, 149, 149, 149, 150, 150, 150, 151, 151, 151, 152, 152, 152, 153, 153, 154, 154, 155, 155, 155, 156, 156, 156, 157, 157, 157, 158, 158, 159, 159, 159, 160, 160, 160, 161, 161, 162, 162, 163, 163, 163, 164, 164, 164, 165, 165, 165, 166, 166, 167, 167], [2, 1, 5, 6, 0, 3, 0, 4, 2, 5, 3, 12, 1, 4, 9, 7, 1, 27, 8, 6, 24, 9, 7, 20, 5, 10, 8, 9, 11, 19, 12, 16, 10, 4, 11, 13, 12, 14, 13, 15, 14, 16, 165, 11, 15, 17, 16, 18, 163, 19, 17, 21, 10, 18, 8, 22, 21, 19, 20, 20, 23, 24, 22, 45, 7, 23, 25, 24, 26, 43, 25, 27, 6, 26, 29, 30, 28, 33, 34, 28, 31, 30, 32, 40, 31, 33, 29, 32, 37, 29, 35, 55, 34, 36, 52, 35, 37, 48, 36, 38, 33, 37, 39, 47, 40, 38, 44, 39, 32, 41, 42, 40, 41, 43, 25, 44, 42, 39, 45, 43, 23, 46, 44, 47, 45, 38, 46, 49, 36, 49, 50, 47, 48, 48, 51, 50, 73, 52, 53, 51, 35, 71, 54, 52, 55, 53, 54, 34, 57, 58, 61, 56, 62, 56, 59, 58, 60, 59, 68, 61, 60, 57, 65, 57, 63, 83, 80, 62, 64, 63, 76, 65, 66, 61, 64, 65, 67, 75, 66, 68, 72, 60, 67, 69, 68, 70, 69, 71, 53, 70, 72, 67, 71, 73, 74, 72, 51, 73, 75, 66, 74, 77, 77, 78, 64, 76, 75, 76, 79, 78, 80, 101, 79, 81, 63, 80, 82, 99, 83, 81, 62, 82, 85, 86, 84, 90, 89, 87, 84, 86, 88, 87, 89, 96, 88, 93, 85, 91, 111, 85, 90, 92, 108, 91, 93, 104, 89, 92, 94, 103, 93, 95, 94, 96, 100, 88, 95, 97, 96, 98, 97, 99, 81, 98, 100, 95, 99, 101, 79, 100, 102, 103, 101, 94, 102, 105, 106, 105, 92, 104, 103, 107, 104, 129, 106, 108, 91, 107, 109, 108, 110, 127, 109, 111, 90, 110, 113, 114, 112, 117, 118, 112, 115, 114, 116, 124, 115, 117, 113, 116, 121, 113, 119, 139, 118, 120, 136, 119, 121, 132, 117, 120, 122, 121, 123, 131, 124, 122, 128, 123, 116, 125, 126, 124, 127, 125, 126, 128, 109, 123, 129, 127, 107, 128, 130, 129, 131, 122, 130, 133, 120, 133, 134, 131, 132, 132, 135, 134, 136, 157, 137, 135, 119, 136, 138, 155, 137, 139, 118, 138, 141, 142, 140, 146, 145, 140, 143, 142, 144, 143, 145, 152, 149, 141, 144, 141, 147, 167, 164, 148, 146, 160, 149, 147, 150, 145, 148, 151, 149, 159, 150, 152, 156, 144, 153, 151, 152, 154, 153, 155, 137, 154, 156, 151, 155, 157, 135, 156, 158, 157, 159, 161, 158, 150, 148, 162, 161, 159, 160, 163, 160, 164, 162, 17, 163, 165, 147, 15, 166, 164, 167, 165, 166, 146])
function edges2neighbors(xs, ys)
# The input is base-zero, this is the reason for the "+1".
max_vertex = max(maximum(xs), maximum(ys)) + 1
neighbors = Vector{Int}[Int[] for i in 1:max_vertex]
for (i_v, j_v) in zip(xs, ys)
# Again, we updated every node number from base zero to base one.
push!(neighbors[i_v + 1], j_v + 1)
end
return neighbors
end
function inner_dfs_cycle_detection!(cycles, candidate, neighbors, cycle_length)
@assert length(candidate) <= cycle_length
for next_vertex in neighbors[last(candidate)]
if next_vertex == first(candidate)
# The cycle needs to have exactly length 6?
if length(candidate) < cycle_length
continue
else
push!(cycles, copy(candidate))
end
elseif next_vertex in candidate
# Do not cycle back to the middle of candidate.
continue
elseif length(candidate) < cycle_length
# If the candidate does not have a subcycle and it is smaller
# than the desired cycle length, then recurse.
push!(candidate, next_vertex)
inner_dfs_cycle_detection!(cycles, candidate, neighbors, cycle_length)
end
end
return
end
function dfs_cycle_detection(neighbors, cycle_length)
cycles = Vector{eltype(neighbors)}()
for initial_vertex in eachindex(neighbors)
inner_dfs_cycle_detection!(cycles, [initial_vertex], neighbors, cycle_length)
end
return cycles
end
neighbors = edges2neighbors(edges[1], edges[2])
cycles_base_1 = dfs_cycle_detection(neighbors, 6)
cycles_base_0 = map(vertices -> vertices .- 1, cycles_base_1)
println(cycles_base_0)
and the output
[[0, 2, 3, 4, 5, 1], [0, 2, 3, 4, 5, 1], [1, 5, 4, 3, 2, 0], [2, 3, 4, 5, 1, 6], [3, 4, 5, 1, 6, 7], [4, 5, 1, 6, 7, 8], [5, 1, 6, 7, 8, 9], [6, 7, 8, 9, 5, 1], [6, 7, 8, 9, 5, 1], [7, 8, 9, 5, 1, 6], [7, 8, 9, 5, 1, 6], [8, 9, 5, 1, 6, 7], [8, 9, 5, 1, 6, 7], [9, 5, 1, 6, 7, 8], [9, 5, 1, 6, 7, 8], [10, 9, 5, 1, 6, 7], [11, 12, 4, 5, 1, 6], [12, 4, 5, 1, 6, 7], [13, 12, 4, 5, 1, 6], [14, 13, 12, 4, 5, 1], [15, 14, 13, 12, 4, 5], [16, 11, 12, 4, 5, 1], [17, 16, 11, 12, 4, 5], [18, 19, 21, 20, 8, 9], [20, 8, 9, 5, 1, 6], [22, 20, 8, 9, 5, 1], [23, 24, 7, 8, 9, 5], [24, 7, 8, 9, 5, 1], [25, 24, 7, 8, 9, 5], [26, 25, 24, 7, 8, 9], [27, 6, 7, 8, 9, 5], [30, 28, 29, 33, 32, 40], [31, 30, 28, 29, 33, 32], [31, 30, 28, 29, 33, 32], [32, 40, 39, 38, 37, 36], [33, 29, 28, 30, 31, 32], [33, 29, 28, 30, 31, 32], [34, 29, 28, 30, 31, 32], [35, 34, 29, 28, 30, 31], [36, 35, 34, 29, 28, 30], [37, 36, 35, 34, 29, 28], [38, 37, 36, 35, 34, 29], [43, 25, 24, 7, 8, 9], [44, 39, 40, 32, 31, 30], [45, 23, 24, 7, 8, 9], [46, 47, 38, 37, 36, 35], [47, 38, 37, 36, 35, 34], [48, 36, 35, 34, 29, 28], [49, 47, 38, 37, 36, 35], [50, 48, 36, 35, 34, 29], [51, 50, 48, 36, 35, 34], [52, 53, 71, 70, 69, 68], [56, 57, 61, 60, 59, 58], [56, 57, 61, 60, 59, 58], [57, 61, 60, 59, 58, 56], [57, 61, 60, 59, 58, 56], [58, 56, 57, 61, 60, 59], [58, 56, 57, 61, 60, 59], [59, 58, 56, 57, 61, 60], [59, 58, 56, 57, 61, 60], [60, 59, 58, 56, 57, 61], [60, 59, 58, 56, 57, 61], [61, 60, 59, 58, 56, 57], [61, 60, 59, 58, 56, 57], [62, 57, 61, 60, 59, 58], [63, 80, 79, 78, 76, 77], [64, 63, 80, 79, 78, 76], [64, 63, 80, 79, 78, 76], [67, 66, 65, 61, 60, 59], [68, 60, 59, 58, 56, 57], [69, 68, 60, 59, 58, 56], [70, 69, 68, 60, 59, 58], [72, 67, 66, 65, 61, 60], [75, 66, 65, 61, 60, 59], [78, 76, 77, 75, 66, 65], [79, 78, 76, 77, 75, 66], [80, 79, 78, 76, 77, 75], [81, 80, 79, 78, 76, 77], [82, 83, 62, 57, 61, 60], [83, 62, 57, 61, 60, 59], [85, 84, 86, 87, 88, 89], [88, 87, 86, 84, 85, 90], [89, 88, 87, 86, 84, 85], [89, 88, 87, 86, 84, 85], [91, 90, 111, 110, 109, 108], [92, 91, 90, 111, 110, 109], [93, 89, 88, 87, 86, 84], [95, 94, 103, 102, 101, 79], [96, 88, 87, 86, 84, 85], [97, 96, 88, 87, 86, 84], [98, 97, 96, 88, 87, 86], [99, 81, 80, 79, 78, 76], [100, 95, 94, 103, 102, 101], [100, 95, 94, 103, 102, 101], [101, 79, 78, 76, 77, 75], [102, 103, 94, 93, 89, 88], [104, 106, 107, 129, 128, 123], [105, 104, 106, 107, 129, 128], [106, 107, 129, 128, 123, 124], [108, 91, 90, 111, 110, 109], [108, 91, 90, 111, 110, 109], [109, 108, 91, 90, 111, 110], [109, 108, 91, 90, 111, 110], [110, 109, 108, 91, 90, 111], [110, 109, 108, 91, 90, 111], [111, 90, 91, 92, 93, 89], [114, 112, 113, 117, 116, 124], [115, 114, 112, 113, 117, 116], [115, 114, 112, 113, 117, 116], [116, 124, 123, 122, 121, 117], [116, 124, 123, 122, 121, 117], [117, 113, 112, 114, 115, 116], [117, 113, 112, 114, 115, 116], [118, 113, 112, 114, 115, 116], [119, 118, 113, 112, 114, 115], [120, 119, 118, 113, 112, 114], [121, 117, 113, 112, 114, 115], [122, 121, 117, 113, 112, 114], [125, 126, 127, 128, 123, 124], [125, 126, 127, 128, 123, 124], [128, 123, 124, 116, 115, 114], [130, 129, 107, 106, 104, 105], [131, 122, 121, 117, 113, 112], [132, 120, 119, 118, 113, 112], [133, 131, 122, 121, 117, 113], [134, 132, 120, 119, 118, 113], [135, 134, 132, 120, 119, 118], [138, 137, 136, 135, 134, 132], [139, 118, 113, 112, 114, 115], [141, 140, 142, 143, 144, 145], [142, 140, 141, 146, 147, 164], [143, 142, 140, 141, 146, 147], [144, 143, 142, 140, 141, 146], [145, 149, 150, 151, 152, 144], [145, 149, 150, 151, 152, 144], [146, 141, 140, 142, 143, 144], [147, 164, 163, 162, 160, 148], [147, 164, 163, 162, 160, 148], [149, 150, 151, 152, 144, 143], [152, 144, 143, 142, 140, 141], [153, 152, 144, 143, 142, 140], [154, 153, 152, 144, 143, 142], [155, 137, 136, 135, 134, 132], [156, 151, 150, 149, 145, 141], [157, 135, 134, 132, 120, 119], [158, 157, 135, 134, 132, 120], [159, 161, 160, 148, 149, 150], [162, 163, 164, 165, 15, 14], [165, 15, 14, 13, 12, 4]]
EDIT: I just perceived that I get all classes of symmetry for each cycle, but I think it is better to remove this in postprocessing. Transforming each cycle in a Set
of edges
and calling unique
should reduce the cycle list to only one representation of each cycle.
Hey, @Henrique_Becker thank you for rewriting the code:
using MolecularGraph
using LinearAlgebra
using Statistics
using DataStructures
function compound(SMILES::String)
mol = smilestomol(SMILES)
mol_ed = [] # molecule_edges extracted from MolecularGraph
i_vert = []
j_vert = []
for edge in mol.edges
push!(mol_ed, edge, reverse(edge))
end
for ed in mol_ed
push!(i_vert, ed[1])
push!(j_vert, ed[2])
end
neighbour_list = tuple(i_vert, j_vert)
return neighbour_list
end
function edges2neighbors(xs, ys)
max_vertex = max(maximum(xs), maximum(ys))
neighbours = Vector{Int}[Int[] for i in 1:max_vertex]
for (i_v, j_v) in zip(xs, ys)
push!(neighbours[i_v], j_v)
end
return neighbours
end
a = edges2neighbors(compound("C1=CC=CC=C1")[1], compound("C1=CC=CC=C1")[2])
print(a)
function inner_dfs_cycle_detection!(cycles, candidate, neighbours, cycle_length)
@assert length(candidate) <= cycle_length
for next_vertex in neighbours[last(candidate)]
if next_vertex == first(candidate)
if length(candidate) < cycle_length
continue
else
push!(cycles, copy(candidate))
end
elseif next_vertex in candidate
# Do not cycle back to the middle of candidate.
continue
elseif length(candidate) < cycle_length
# If the candidate does not have a subcycle and it is smaller
# than the desired cycle length, then recurse.
push!(candidate, next_vertex)
inner_dfs_cycle_detection!(cycles, candidate, neighbours, cycle_length)
end
end
return
end
function dfs_cycle_detection(neighbours, cycle_length)
cycles = Vector{eltype(neighbours)}()
for initial_vertex in eachindex(neighbours)
inner_dfs_cycle_detection!(cycles, [initial_vertex], neighbours, cycle_length)
end
return cycles
end
cycles_base_1 = dfs_cycle_detection(a, 6)
print(cycles_base_1)
# output: [[1, 2, 3, 4, 5, 6], [2, 1, 6, 5, 4, 3], [3, 2, 1, 6, 5, 4], [3, 2, 1, 6, 5, 4], [4, 3, 2, 1, 6, 5], [4, 3, 2, 1, 6, 5], [5, 4, 3, 2, 1, 6], [5, 4, 3, 2, 1, 6], [6, 5, 4, 3, 2, 1], [6, 5, 4, 3, 2, 1]]
I have added the compound
function, that converts SMILES to a list of edges. But now, I have another issue: this code you wrote returns the same cycle with different orders, as you can see in the output:
[[1, 2, 3, 4, 5, 6], [2, 1, 6, 5, 4, 3], [3, 2, 1, 6, 5, 4], [3, 2, 1, 6, 5, 4], [4, 3, 2, 1, 6, 5], [4, 3, 2, 1, 6, 5], [5, 4, 3, 2, 1, 6], [5, 4, 3, 2, 1, 6], [6, 5, 4, 3, 2, 1], [6, 5, 4, 3, 2, 1]]
I would like to return an array with only different cycles, because I’m dealing with molecules, which are instances of undirected graphs.
And I would like to identify cycles with less than 4 vertices