Can this code be made faster

Sanji_Vinsmoke · August 2, 2020, 8:11am

I attempted the Advent od Code 2015 day 9 solution in julia and here is my code for the first part (for this input 9.txt file refer to the link to the problem):

using Combinatorics, BenchmarkTools

distances = readlines("9.txt")

cities = unique!(vcat(map(string,unique!([split(dist)[1] for dist in distances])),map(string,unique!([split(dist)[3] for dist in distances]))))

function find_distance(a::String,b::String)::Int64
    for dist in distances
        if (a == split(dist)[1] && b == split(dist)[3]) ||  (a == split(dist)[3] && b == split(dist)[1])
            return parse(Int64,last(split(dist)))
        end
    end
end

function find_tour_length(a)::Int64
    dist = 0
    for i in 1:(length(a) - 1)        
        dist += find_distance(a[i],a[i+1])        
    end
    return dist
end

all_tours = permutations(cities)

solve() = println(minimum([find_tour_length(tour) for tour in all_tours]))
@btime solve()

Running this code, it takes 7.232 secs.

Can this code be made any faster?
Thank you in advance.

GunnarFarneback · August 2, 2020, 8:34am

There is much that can be done to make it faster but what really strikes the eye is the repeated looping, splitting, and parsing done in find_distance. You really don’t want to be doing all that over and over again. Parsing the distances once and storing the results in a dict should give you a huge improvement to get started with.

Sanji_Vinsmoke · August 2, 2020, 8:37am

Thanks a lot. I will change the code right away.
Also, can you please tell me more on how can this code me made more faster?

DNF · August 2, 2020, 11:11am

You should read the performance tips section of the manual: Performance Tips · The Julia Language

Tip number one is to avoid global variables, in particular you are reading distances from a global variable inside the function find_distance. Instead, you should pass distances as an input to the function.

Also avoid unnecessarily redoing the same thing over and over and over and over, as @GunnarFarneback points out.

DNF · August 2, 2020, 11:20am

If you would like more help on speeding up the code, you can provide a snippet from your input text file. That would make it possible to run your code and identify speedup options.

Sanji_Vinsmoke · August 2, 2020, 11:28am

Here you go,

Tristram to AlphaCentauri = 34
Tristram to Snowdin = 100
Tristram to Tambi = 63
Tristram to Faerun = 108
Tristram to Norrath = 111
Tristram to Straylight = 89
Tristram to Arbre = 132
AlphaCentauri to Snowdin = 4
AlphaCentauri to Tambi = 79
AlphaCentauri to Faerun = 44
AlphaCentauri to Norrath = 147
AlphaCentauri to Straylight = 133
AlphaCentauri to Arbre = 74
Snowdin to Tambi = 105
Snowdin to Faerun = 95
Snowdin to Norrath = 48
Snowdin to Straylight = 88
Snowdin to Arbre = 7
Tambi to Faerun = 68
Tambi to Norrath = 134
Tambi to Straylight = 107
Tambi to Arbre = 40
Faerun to Norrath = 11
Faerun to Straylight = 66
Faerun to Arbre = 144
Norrath to Straylight = 115
Norrath to Arbre = 135
Straylight to Arbre = 127

This is the entire text file.

anon37204545 · August 2, 2020, 12:24pm

using Combinatorics, BenchmarkTools

const distances = split.(readlines("9.txt"))
const cities = unique(
    vcat(map(x -> getindex(x, [1, 3]), distances)...)
)

function find_distance(a::AbstractString, b::AbstractString)::Int64
    for dist in distances
        if (a == dist[1] && b == dist[3]) || (a == dist[3] && b == dist[1])
            return parse(Int64,last(dist))
        end
    end
end

function find_tour_length(a)::Int64
    dist = 0
    for i in 1:(length(a) - 1)
        dist += find_distance(a[i], a[i+1])
    end
    return dist
end

const all_tours = permutations(cities)

solve() = println(minimum(find_tour_length, all_tours))
@btime solve()

Before:

  8.306 s (145555211 allocations: 5.61 GiB)

After:

 60.115 ms (201607 allocations: 14.77 MiB)

I get the same result, i.e. 251. Are you satisfied with the answer?

Sanji_Vinsmoke · August 2, 2020, 1:45pm

Wow!
Thank you so much .
Also, was my code consuming 5.7GB of space earlier and now it is consuming 14MB?
If yes, then why was it consuming so much memory?

anon37204545 · August 2, 2020, 4:45pm

The major thing to do is split. the distances on the beginning and use dist instead of split(dist) in every instance after. That way you avoid doing a costy operation of splitting a string repeatedly.
Doing that gets the performance to 867.532 ms (22639690 allocations: 979.76 MiB).
Passing a function to run while finding a minimum, as in minimum(find_tour_length, all_tours)) loops the array only once (to apply the function and compare the values at the same time), instead of twice, to apply the function, and then compare.
Some optimizations help to unclutter the loops in cities. You don’t need to map(string, ...) (i.e. convert to type String) because without it you get type SubString, which is basically just a string. The problem was that you set find_distance’s argument types to String instead of AbstractString, which would accept both String and SubString.

You can even drop ::Int64 for function’s return argument, since compiler deduces that automatically.

I don’t understand completely how the memory management works in Julia, so someone else probably can explain better why is my code so memory-efficient.

If I solved your issue, you can also mark my post as “Solution”, so other people with similar problem can find it more easily.

Oscar_Smith · August 2, 2020, 4:58pm

The reason for the difference in memory allocation is that prior to 1.5 (which will be out very soon) each SubString requires allocating an object. Each object is very small (probably around 24 bytes), but you were creating about 100 million of these objects.

Topic		Replies	Views
Improving performance of mapping algorithm Performance question	17	914	April 28, 2020
Making a Function FASTER! New to Julia question	26	2365	November 14, 2020
How to make a function 120 times faster General Usage performance , scope , parameters	15	825	July 7, 2022
Can this loop be optimized further? Performance question	7	744	December 20, 2019
Best way to optimize code Performance question	29	2279	May 21, 2020

Can this code be made faster

Related topics