What's wrong with this ordering function?

Hi, I have a vector of tuples like this:

julia> results
75-element Vector{NamedTuple{(:program, :mean, :stddev, :score), Tuple{String, Float64, Float64, Float64}}}:
 (program = "fac_ite_#0", mean = 0.1477079562429519, stddev = 8.965478287926575e-5, score = -0.06097735542175287)
 (program = "fac_ite_#1", mean = 0.13010204848206308, stddev = 6.923264304409717e-5, score = 0.009826443216060715)
 (program = "fac_ite_#10", mean = 0.10947370857251733, stddev = 6.976366587666448e-5, score = 0.14378340152471253)
 (program = "fac_ite_#15", mean = 0.11059185070945224, stddev = 7.254103233143866e-5, score = 0.15766269973779562)
 (program = "fac_ite_#2", mean = 0.12221405698367882, stddev = 7.477544607742762e-5, score = 0.05580326600855921)
 (program = "fac_ite_#20", mean = 0.10772543458615948, stddev = 6.960714418039649e-5, score = 0.1653078690858445)
 (program = "fac_ite_#25", mean = 0.10650477327554919, stddev = 6.857422631766746e-5, score = 0.17065332566869135)
 (program = "fac_ite_#3", mean = 0.11824904662880127, stddev = 7.429346888172688e-5, score = 0.07889582241278754)
 (program = "fac_ite_#30", mean = 0.10625841189197173, stddev = 6.959355064254072e-5, score = 0.17426188827933753)
 (program = "fac_ite_#4", mean = 0.11433780060046601, stddev = 7.299977613589191e-5, score = 0.09862263576748763)
 (program = "fac_ite_#5", mean = 0.11272528955290134, stddev = 7.064808980164764e-5, score = 0.10949668028816667)
 (program = "fac_ite_#6", mean = 0.11126031712652233, stddev = 7.088068504976988e-5, score = 0.1213827294786406)
 (program = "fac_ite_#7", mean = 0.11073424242720002, stddev = 7.002658613610886e-5, score = 0.1284360016392515)
 (program = "fac_ite_#8", mean = 0.109135500404454, stddev = 6.966014017507641e-5, score = 0.1348827831740274)
 (program = "fac_ite_#9", mean = 0.10900493206100066, stddev = 6.790336528418506e-5, score = 0.13809270029953202)
 (program = "fac_rec_#0", mean = 0.12155528397719238, stddev = 7.506046842635152e-5, score = -0.10031619914589512)
 ⋮
 (program = "fib_rec_#9", mean = 0.10037452585794363, stddev = 8.255459694197043e-5, score = -0.0752284296467097)
 (program = "sum_ite_#0", mean = 0.14880711744945885, stddev = 7.98402310775401e-5, score = -0.05049779547456698)
 (program = "sum_ite_#1", mean = 0.1295029928557673, stddev = 7.523374334382274e-5, score = 0.0320259875759747)
 (program = "sum_ite_#10", mean = 0.10418347662606556, stddev = 6.527527454615916e-5, score = 0.17511419809617046)
 (program = "sum_ite_#15", mean = 0.10245673050054228, stddev = 6.64395742113198e-5, score = 0.18899282630454417)
 (program = "sum_ite_#2", mean = 0.12077588344188206, stddev = 7.45998297684706e-5, score = 0.08224123433723218)
 (program = "sum_ite_#20", mean = 0.10090606902161019, stddev = 6.557520246018644e-5, score = 0.19728101642830714)
 (program = "sum_ite_#25", mean = 0.10056196982588159, stddev = 6.744626622462643e-5, score = 0.20170497131118817)
 (program = "sum_ite_#3", mean = 0.11569219451102845, stddev = 7.083805286022843e-5, score = 0.10756296803872933)
 (program = "sum_ite_#30", mean = 0.10059041558688528, stddev = 6.894619614006776e-5, score = 0.20553708503652035)
 (program = "sum_ite_#4", mean = 0.11214372821887807, stddev = 6.880032768868152e-5, score = 0.127896132461527)
 (program = "sum_ite_#5", mean = 0.11009824692575194, stddev = 6.474106908381416e-5, score = 0.14032455804124885)
 (program = "sum_ite_#6", mean = 0.10811706080709338, stddev = 6.670300472668178e-5, score = 0.15169113139251172)
 (program = "sum_ite_#7", mean = 0.10683071076707522, stddev = 6.64957367053159e-5, score = 0.1583529542786568)
 (program = "sum_ite_#8", mean = 0.10531787359937127, stddev = 6.588181532743541e-5, score = 0.16527795163401604)
 (program = "sum_ite_#9", mean = 0.10471148561664549, stddev = 6.791208027962913e-5, score = 0.16912543669787075)

and I want to sort it by its program field. Specifically, I want to sort first by the “alphabetical” part of the string and then by the numerical part, so that I get:

 (program = "fac_ite_#0", mean = 0.1477079562429519, stddev = 8.965478287926575e-5, score = -0.06097735542175287)
 (program = "fac_ite_#1", mean = 0.13010204848206308, stddev = 6.923264304409717e-5, score = 0.009826443216060715)
 (program = "fac_ite_#2", mean = 0.12221405698367882, stddev = 7.477544607742762e-5, score = 0.05580326600855921)
 (program = "fac_ite_#3", mean = 0.11824904662880127, stddev = 7.429346888172688e-5, score = 0.07889582241278754)
 (program = "fac_ite_#4", mean = 0.11433780060046601, stddev = 7.299977613589191e-5, score = 0.09862263576748763)
 (program = "fac_ite_#5", mean = 0.11272528955290134, stddev = 7.064808980164764e-5, score = 0.10949668028816667)
 (program = "fac_ite_#6", mean = 0.11126031712652233, stddev = 7.088068504976988e-5, score = 0.1213827294786406)
 (program = "fac_ite_#7", mean = 0.11073424242720002, stddev = 7.002658613610886e-5, score = 0.1284360016392515)
 (program = "fac_ite_#8", mean = 0.109135500404454, stddev = 6.966014017507641e-5, score = 0.1348827831740274)
 (program = "fac_ite_#9", mean = 0.10900493206100066, stddev = 6.790336528418506e-5, score = 0.13809270029953202)
 (program = "fac_ite_#10", mean = 0.10947370857251733, stddev = 6.976366587666448e-5, score = 0.14378340152471253)
 (program = "fac_ite_#15", mean = 0.11059185070945224, stddev = 7.254103233143866e-5, score = 0.15766269973779562)
 (program = "fac_ite_#20", mean = 0.10772543458615948, stddev = 6.960714418039649e-5, score = 0.1653078690858445)
 (program = "fac_ite_#25", mean = 0.10650477327554919, stddev = 6.857422631766746e-5, score = 0.17065332566869135)
 (program = "fac_ite_#30", mean = 0.10625841189197173, stddev = 6.959355064254072e-5, score = 0.17426188827933753)
 (program = "fac_rec_#0", mean = 0.12155528397719238, stddev = 7.506046842635152e-5, score = -0.10031619914589512)
 ⋮

Now, I thought I could use sort() with a custom lt function, like this:

function program_lt(t1, t2)
    p1 = t1[:program]
    p2 = t2[:program]
    regexp = r"(?<program>.+)_#(?<arg>\d+)"
    ms1 = match(regexp, p1)
    ms2 = match(regexp, p2)
    if ms1[:program] < ms2[:program]
        return true
    elseif parse(Int, ms1[:arg]) < parse(Int, ms2[:arg])
        return true
    else
        return false
    end
end

results = sort(results, lt = program_lt)

That is, I extract the program name, match it against a regex, and compare the pieces: if the alphabetical parts are in order, return true; otherwise compare the numerical parts. However, the result is not what I want, and if I call sort() repeatedly the results toggle between two orders:

julia> results = sort(results, lt = program_lt)
75-element Vector{NamedTuple{(:program, :mean, :stddev, :score), Tuple{String, Float64, Float64, Float64}}}:
 (program = "fac_ite_#0", mean = 0.1477079562429519, stddev = 8.965478287926575e-5, score = -0.06097735542175287)
 (program = "fac_rec_#0", mean = 0.12155528397719238, stddev = 7.506046842635152e-5, score = -0.10031619914589512)
 (program = "fib_ite_#0", mean = 0.1458960827370358, stddev = 8.554196354742933e-5, score = -0.05039813508627159)
 (program = "fib_rec_#0", mean = 0.14228406115315737, stddev = 8.544853312883077e-5, score = -0.10133855204596641)
 (program = "sum_ite_#0", mean = 0.14880711744945885, stddev = 7.98402310775401e-5, score = -0.05049779547456698)
 (program = "fac_ite_#1", mean = 0.13010204848206308, stddev = 6.923264304409717e-5, score = 0.009826443216060715)
 (program = "fac_rec_#1", mean = 0.12158719919580394, stddev = 7.517642334769933e-5, score = -0.10583091579590989)
 (program = "fib_ite_#1", mean = 0.14595474206397455, stddev = 8.452173753036977e-5, score = -0.05515017169425661)
 (program = "fib_rec_#1", mean = 0.142340762308992, stddev = 8.460140844216113e-5, score = -0.10685326869598119)
 (program = "sum_ite_#1", mean = 0.1295029928557673, stddev = 7.523374334382274e-5, score = 0.0320259875759747)
 (program = "fac_ite_#2", mean = 0.12221405698367882, stddev = 7.477544607742762e-5, score = 0.05580326600855921)
 (program = "fac_rec_#2", mean = 0.11750301626096656, stddev = 7.386295199634167e-5, score = -0.08400628736602754)
 (program = "fib_ite_#2", mean = 0.12860894640174395, stddev = 7.755149182771144e-5, score = 0.010880984966395824)
 (program = "fib_rec_#2", mean = 0.11003817104692622, stddev = 7.492814443160189e-5, score = -0.08131677139403698)
 (program = "sum_ite_#2", mean = 0.12077588344188206, stddev = 7.45998297684706e-5, score = 0.08224123433723218)
 (program = "fac_ite_#3", mean = 0.11824904662880127, stddev = 7.429346888172688e-5, score = 0.07889582241278754)
 ⋮
 (program = "sum_ite_#15", mean = 0.10245673050054228, stddev = 6.64395742113198e-5, score = 0.18899282630454417)
 (program = "fac_ite_#20", mean = 0.10772543458615948, stddev = 6.960714418039649e-5, score = 0.1653078690858445)
 (program = "fac_rec_#20", mean = 0.11573289604106572, stddev = 7.102680322718689e-5, score = -0.07287845990829539)
 (program = "fib_ite_#20", mean = 0.10230674410585487, stddev = 8.722624702189476e-5, score = 0.19723920974149012)
 (program = "fib_rec_#20", mean = 0.10027904087785651, stddev = 0.0001321510164681812, score = -0.0785518018394757)
 (program = "sum_ite_#20", mean = 0.10090606902161019, stddev = 6.557520246018644e-5, score = 0.19728101642830714)
 (program = "fac_ite_#25", mean = 0.10650477327554919, stddev = 6.857422631766746e-5, score = 0.17065332566869135)
 (program = "fac_rec_#25", mean = 0.11568707504228318, stddev = 7.125371756640872e-5, score = -0.07206233613001)
 (program = "fib_ite_#25", mean = 0.10310719640375986, stddev = 9.070924672145898e-5, score = 0.20159143303661267)
 (program = "fib_rec_#25", mean = 0.10005283909201772, stddev = 0.00010831775241116769, score = -0.07855180183947576)
 (program = "sum_ite_#25", mean = 0.10056196982588159, stddev = 6.744626622462643e-5, score = 0.20170497131118817)
 (program = "fac_ite_#30", mean = 0.10625841189197173, stddev = 6.959355064254072e-5, score = 0.17426188827933753)
 (program = "fac_rec_#30", mean = 0.11554067804395603, stddev = 7.285678318161415e-5, score = -0.07935789218401491)
 (program = "fib_ite_#30", mean = 0.10433860458767284, stddev = 8.387413952198711e-5, score = 0.205081843084081)
 (program = "fib_rec_#30", mean = 0.1001192711105907, stddev = 0.00021624437745492892, score = -0.08301788805013971)
 (program = "sum_ite_#30", mean = 0.10059041558688528, stddev = 6.894619614006776e-5, score = 0.20553708503652035)

julia> results = sort(results, lt = program_lt)
75-element Vector{NamedTuple{(:program, :mean, :stddev, :score), Tuple{String, Float64, Float64, Float64}}}:
 (program = "fac_ite_#0", mean = 0.1477079562429519, stddev = 8.965478287926575e-5, score = -0.06097735542175287)
 (program = "fac_ite_#1", mean = 0.13010204848206308, stddev = 6.923264304409717e-5, score = 0.009826443216060715)
 (program = "fac_ite_#2", mean = 0.12221405698367882, stddev = 7.477544607742762e-5, score = 0.05580326600855921)
 (program = "fac_ite_#3", mean = 0.11824904662880127, stddev = 7.429346888172688e-5, score = 0.07889582241278754)
 (program = "fac_ite_#4", mean = 0.11433780060046601, stddev = 7.299977613589191e-5, score = 0.09862263576748763)
 (program = "fac_ite_#5", mean = 0.11272528955290134, stddev = 7.064808980164764e-5, score = 0.10949668028816667)
 (program = "fac_ite_#6", mean = 0.11126031712652233, stddev = 7.088068504976988e-5, score = 0.1213827294786406)
 (program = "fac_ite_#7", mean = 0.11073424242720002, stddev = 7.002658613610886e-5, score = 0.1284360016392515)
 (program = "fac_ite_#8", mean = 0.109135500404454, stddev = 6.966014017507641e-5, score = 0.1348827831740274)
 (program = "fac_ite_#9", mean = 0.10900493206100066, stddev = 6.790336528418506e-5, score = 0.13809270029953202)
 (program = "fac_ite_#10", mean = 0.10947370857251733, stddev = 6.976366587666448e-5, score = 0.14378340152471253)
 (program = "fac_ite_#15", mean = 0.11059185070945224, stddev = 7.254103233143866e-5, score = 0.15766269973779562)
 (program = "fac_ite_#20", mean = 0.10772543458615948, stddev = 6.960714418039649e-5, score = 0.1653078690858445)
 (program = "fac_ite_#25", mean = 0.10650477327554919, stddev = 6.857422631766746e-5, score = 0.17065332566869135)
 (program = "fac_ite_#30", mean = 0.10625841189197173, stddev = 6.959355064254072e-5, score = 0.17426188827933753)
 (program = "fac_rec_#0", mean = 0.12155528397719238, stddev = 7.506046842635152e-5, score = -0.10031619914589512)
 ⋮
 (program = "fib_rec_#30", mean = 0.1001192711105907, stddev = 0.00021624437745492892, score = -0.08301788805013971)
 (program = "sum_ite_#0", mean = 0.14880711744945885, stddev = 7.98402310775401e-5, score = -0.05049779547456698)
 (program = "sum_ite_#1", mean = 0.1295029928557673, stddev = 7.523374334382274e-5, score = 0.0320259875759747)
 (program = "sum_ite_#2", mean = 0.12077588344188206, stddev = 7.45998297684706e-5, score = 0.08224123433723218)
 (program = "sum_ite_#3", mean = 0.11569219451102845, stddev = 7.083805286022843e-5, score = 0.10756296803872933)
 (program = "sum_ite_#4", mean = 0.11214372821887807, stddev = 6.880032768868152e-5, score = 0.127896132461527)
 (program = "sum_ite_#5", mean = 0.11009824692575194, stddev = 6.474106908381416e-5, score = 0.14032455804124885)
 (program = "sum_ite_#6", mean = 0.10811706080709338, stddev = 6.670300472668178e-5, score = 0.15169113139251172)
 (program = "sum_ite_#7", mean = 0.10683071076707522, stddev = 6.64957367053159e-5, score = 0.1583529542786568)
 (program = "sum_ite_#8", mean = 0.10531787359937127, stddev = 6.588181532743541e-5, score = 0.16527795163401604)
 (program = "sum_ite_#9", mean = 0.10471148561664549, stddev = 6.791208027962913e-5, score = 0.16912543669787075)
 (program = "sum_ite_#10", mean = 0.10418347662606556, stddev = 6.527527454615916e-5, score = 0.17511419809617046)
 (program = "sum_ite_#15", mean = 0.10245673050054228, stddev = 6.64395742113198e-5, score = 0.18899282630454417)
 (program = "sum_ite_#20", mean = 0.10090606902161019, stddev = 6.557520246018644e-5, score = 0.19728101642830714)
 (program = "sum_ite_#25", mean = 0.10056196982588159, stddev = 6.744626622462643e-5, score = 0.20170497131118817)
 (program = "sum_ite_#30", mean = 0.10059041558688528, stddev = 6.894619614006776e-5, score = 0.20553708503652035)

julia> results = sort(results, lt = program_lt)
75-element Vector{NamedTuple{(:program, :mean, :stddev, :score), Tuple{String, Float64, Float64, Float64}}}:
 (program = "fac_ite_#0", mean = 0.1477079562429519, stddev = 8.965478287926575e-5, score = -0.06097735542175287)
 (program = "fac_rec_#0", mean = 0.12155528397719238, stddev = 7.506046842635152e-5, score = -0.10031619914589512)
 (program = "fib_ite_#0", mean = 0.1458960827370358, stddev = 8.554196354742933e-5, score = -0.05039813508627159)
 (program = "fib_rec_#0", mean = 0.14228406115315737, stddev = 8.544853312883077e-5, score = -0.10133855204596641)
 (program = "sum_ite_#0", mean = 0.14880711744945885, stddev = 7.98402310775401e-5, score = -0.05049779547456698)
 (program = "fac_ite_#1", mean = 0.13010204848206308, stddev = 6.923264304409717e-5, score = 0.009826443216060715)
 (program = "fac_rec_#1", mean = 0.12158719919580394, stddev = 7.517642334769933e-5, score = -0.10583091579590989)
 (program = "fib_ite_#1", mean = 0.14595474206397455, stddev = 8.452173753036977e-5, score = -0.05515017169425661)
 (program = "fib_rec_#1", mean = 0.142340762308992, stddev = 8.460140844216113e-5, score = -0.10685326869598119)
 (program = "sum_ite_#1", mean = 0.1295029928557673, stddev = 7.523374334382274e-5, score = 0.0320259875759747)
 (program = "fac_ite_#2", mean = 0.12221405698367882, stddev = 7.477544607742762e-5, score = 0.05580326600855921)
 (program = "fac_rec_#2", mean = 0.11750301626096656, stddev = 7.386295199634167e-5, score = -0.08400628736602754)
 (program = "fib_ite_#2", mean = 0.12860894640174395, stddev = 7.755149182771144e-5, score = 0.010880984966395824)
 (program = "fib_rec_#2", mean = 0.11003817104692622, stddev = 7.492814443160189e-5, score = -0.08131677139403698)
 (program = "sum_ite_#2", mean = 0.12077588344188206, stddev = 7.45998297684706e-5, score = 0.08224123433723218)
 (program = "fac_ite_#3", mean = 0.11824904662880127, stddev = 7.429346888172688e-5, score = 0.07889582241278754)
 ⋮
 (program = "sum_ite_#15", mean = 0.10245673050054228, stddev = 6.64395742113198e-5, score = 0.18899282630454417)
 (program = "fac_ite_#20", mean = 0.10772543458615948, stddev = 6.960714418039649e-5, score = 0.1653078690858445)
 (program = "fac_rec_#20", mean = 0.11573289604106572, stddev = 7.102680322718689e-5, score = -0.07287845990829539)
 (program = "fib_ite_#20", mean = 0.10230674410585487, stddev = 8.722624702189476e-5, score = 0.19723920974149012)
 (program = "fib_rec_#20", mean = 0.10027904087785651, stddev = 0.0001321510164681812, score = -0.0785518018394757)
 (program = "sum_ite_#20", mean = 0.10090606902161019, stddev = 6.557520246018644e-5, score = 0.19728101642830714)
 (program = "fac_ite_#25", mean = 0.10650477327554919, stddev = 6.857422631766746e-5, score = 0.17065332566869135)
 (program = "fac_rec_#25", mean = 0.11568707504228318, stddev = 7.125371756640872e-5, score = -0.07206233613001)
 (program = "fib_ite_#25", mean = 0.10310719640375986, stddev = 9.070924672145898e-5, score = 0.20159143303661267)
 (program = "fib_rec_#25", mean = 0.10005283909201772, stddev = 0.00010831775241116769, score = -0.07855180183947576)
 (program = "sum_ite_#25", mean = 0.10056196982588159, stddev = 6.744626622462643e-5, score = 0.20170497131118817)
 (program = "fac_ite_#30", mean = 0.10625841189197173, stddev = 6.959355064254072e-5, score = 0.17426188827933753)
 (program = "fac_rec_#30", mean = 0.11554067804395603, stddev = 7.285678318161415e-5, score = -0.07935789218401491)
 (program = "fib_ite_#30", mean = 0.10433860458767284, stddev = 8.387413952198711e-5, score = 0.205081843084081)
 (program = "fib_rec_#30", mean = 0.1001192711105907, stddev = 0.00021624437745492892, score = -0.08301788805013971)
 (program = "sum_ite_#30", mean = 0.10059041558688528, stddev = 6.894619614006776e-5, score = 0.20553708503652035)

What am I doing wrong? Thanks

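You want to compare the number after the # only when the alphabetical parts are equal. In your version the elseif branch also runs when the first name is greater than the second, so the function can return true for both (a, b) and (b, a), and sort() cannot produce a consistent order.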
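To see the inconsistency concretely, here is a small check. It is only a sketch: `program_lt_original` is a renamed copy of the function from the question, and `a` and `b` are made-up one-field tuples.

```julia
# The comparator from the question, copied under a different (hypothetical) name.
function program_lt_original(t1, t2)
    regexp = r"(?<program>.+)_#(?<arg>\d+)"
    ms1 = match(regexp, t1[:program])
    ms2 = match(regexp, t2[:program])
    if ms1[:program] < ms2[:program]
        return true
    elseif parse(Int, ms1[:arg]) < parse(Int, ms2[:arg])
        return true
    else
        return false
    end
end

# Two made-up entries with different names and different numbers.
a = (program = "fac_ite_#1",)
b = (program = "sum_ite_#0",)

program_lt_original(a, b)  # true, because "fac_ite" < "sum_ite"
program_lt_original(b, a)  # also true: the name test fails, but then 0 < 1
```

A valid `lt` must never say that a comes before b and that b comes before a at the same time, which is why the sorted result is unstable.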
Try this (not optimized, just rearranged your code a bit):

function program_lt(t1, t2)
    p1 = t1[:program]
    p2 = t2[:program]
    # Split each name into its alphabetical part and its numeric argument.
    regexp = r"(?<program>.+)_#(?<arg>\d+)"
    ms1 = match(regexp, p1)
    ms2 = match(regexp, p2)
    if ms1[:program] > ms2[:program]
        return false
    elseif ms1[:program] == ms2[:program]
        # Same alphabetical part: the numeric argument decides.
        return parse(Int, ms1[:arg]) < parse(Int, ms2[:arg])
    else
        return true
    end
end
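An alternative that avoids writing the comparison by hand is to sort with a key via the `by` keyword. The sketch below assumes every name has the form `<name>_#<number>`; `program_key` and `sample` are hypothetical names invented for this example and are not from the original post. Tuples in Julia compare element by element, so a `(String, Int)` key gives the "name first, then number" order directly.

```julia
# Hypothetical helper: turn "fac_ite_#10" into the key ("fac_ite", 10).
function program_key(name)
    m = match(r"^(?<program>.+)_#(?<arg>\d+)$", name)
    m === nothing && error("unexpected program name: $name")
    return (String(m[:program]), parse(Int, m[:arg]))
end

# Small made-up sample with the same `program` field as in the question.
sample = [
    (program = "fac_ite_#10", score = 0.14),
    (program = "sum_ite_#0", score = -0.05),
    (program = "fac_ite_#2", score = 0.06),
    (program = "fac_rec_#0", score = -0.10),
]

# Tuples are compared lexicographically: first the name, then the number.
sorted = sort(sample, by = t -> program_key(t.program))

[t.program for t in sorted]
# ["fac_ite_#2", "fac_ite_#10", "fac_rec_#0", "sum_ite_#0"]
```

Because the ordering comes from the built-in tuple comparison, it cannot be inconsistent the way a hand-written `lt` can.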

Thanks! You’re right, I just figured that out and was about to answer myself. I feel a bit awkward now haha.

That’s not necessary. It happens often enough to all of us.

You might find the NaturalSort package useful.
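A minimal sketch of how that could look, assuming the package exports a comparator called `natural` that can be passed as `lt` (as I remember it from the package README; please check there). The `programs` vector is made up for illustration.

```julia
using NaturalSort  # third-party package, install with `] add NaturalSort`

# Made-up names in the same format as in the question.
programs = ["fac_ite_#10", "fac_ite_#2", "fac_rec_#0", "fac_ite_#1"]

# `natural` compares runs of digits as numbers instead of character by character.
sort(programs, lt = natural)

# For the vector of named tuples from the question, combine it with `by`:
# sort(results, by = t -> t.program, lt = natural)
```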

Awesome!!!
( I really wonder why people are reasoning about the Julia Ecosystem… )