Avoid recalculating X^q when f(p): X.^1 ... X.^p


#1

Wasn’t sure if this belongs in first-steps or performance - happy to move it…

What is best practice implementation of the following in Julia.
The objective is to minimize run time. There is no requirement that this be thread safe, but if it can be that would be a benefit.

You have a (large) array/vector, X.
The make|break requirement is that X^q (in julian: X.^q) is not allowed to be recalculated, 1 \le q \le p.

A function f(p) calculates: X^1,X^2, X^3...X^p
A user script is: f(4) then later f(7), etc.
When the user resets X the calculated values X^q, q=1..p are also reset.

What is Julia best practice implementation to avoid recalculating X^q, q=1...4, when the user script comes to run f(7)

In an OO language, where f(p)is a method of an instance of a class, you’d possibly do something like store X and the calculated values, X^q, in Class attributes. Managing when those class values should be wiped is delegated to class level methods, e.g. on some file read event when X is created.

I’m trying to work out best practice way of mapping this to Julia.

Appreciate any hints or tips.


#2

You can use https://github.com/JuliaCollections/Memoize.jl, for example. But basically this just amounts to having a global dictionary or array caching previously-computed quantities, which for a simple case like your f(p) is easy to implement yourself.

(You could make it thread-safe just by putting a lock on the cache when updating it, assuming this happens infrequently.)


#3

Below is one suggestion using a simple custom cache. I’ve focused on the design here, I believe we discussed the performance and how to reduce memory allocations in a previous question of yours (Vandermonde).

struct PowerCache{T}
    x::Vector{T}
    powers::Vector{Vector{T}}

    PowerCache(x) = new{eltype(x)}(x, [x])
end

function powers(pc::PowerCache, p::Int)
    for n = length(pc.powers)+1:p
        println("Calculating power $n")
        push!(pc.powers, pc.x .* pc.powers[n-1])
    end
    pc.powers[1:p]
end

Sample usage:

julia> pc = PowerCache([1, 3, 2, 4, 5])
PowerCache{Int64}([1, 3, 2, 4, 5], Array{Int64,1}[[1, 3, 2, 4, 5]])

julia> powers(pc, 3)
Calculating power 2
Calculating power 3
3-element Array{Array{Int64,1},1}:
 [1, 3, 2, 4, 5]
 [1, 9, 4, 16, 25]
 [1, 27, 8, 64, 125]

julia> powers(pc, 5)
Calculating power 4
Calculating power 5
5-element Array{Array{Int64,1},1}:
 [1, 3, 2, 4, 5]
 [1, 9, 4, 16, 25]
 [1, 27, 8, 64, 125]
 [1, 81, 16, 256, 625]
 [1, 243, 32, 1024, 3125]

julia> powers(pc, 4)
4-element Array{Array{Int64,1},1}:
 [1, 3, 2, 4, 5]
 [1, 9, 4, 16, 25]
 [1, 27, 8, 64, 125]
 [1, 81, 16, 256, 625]

julia> f(p) = powers(pc, p)
f (generic function with 1 method)

julia> f(7)
Calculating power 6
Calculating power 7
7-element Array{Array{Int64,1},1}:
 [1, 3, 2, 4, 5]
 [1, 9, 4, 16, 25]
 [1, 27, 8, 64, 125]
 [1, 81, 16, 256, 625]
 [1, 243, 32, 1024, 3125]
 [1, 729, 64, 4096, 15625]
 [1, 2187, 128, 16384, 78125]

#4

If you are re-using this cache lots of times, the performance considerations are different. In particular, if you only get cache misses infrequently, then the performance of the ^n exponentiation is probably irrelevant, and I would tend to just use pc.x .^ n to maximize accuracy.


#5

Thanks for such a comprehensive example.
I’m still internalizing it, but it appears to achieve the same effect as the OO approach would.


#6

Thank you for the heads-up about performance and accuracy, for those following along, see @stevengj’s efficient X^p, p=1...n script here.

Just to be clear, the ‘cache’ misses you mention refer to the PowerCache in Max’s code above?


#7

What he means is that if most of the time spent interacting with this PowerCache is for querying already calculated powers (“cache hit”), and not calculating new powers (“cache miss”), you should optimize for querying, and could perhaps afford to spend more time calculating the powers to begin with. In that case, you can use Julia’s built-in exponentiation which is slower but more accurate than repeated multiplication.

What I meant by that statement was that I focused on a high-level design that should allow high performance, but didn’t make an effort to optimize the code itself. That doesn’t feel useful at this point, since we don’t know the details of your problem. For example we don’t know in which format you expect the result; matrix (if so, which transpose?), or vector of vectors? In your other thread, I think people were also curious about what the application is that would necessitate calculating all these powers so quickly?


#8

Thanks again Max for sharing your insights. Apologies for being vague about the requirements. At the moment I’m not fully sure the desired end result is possible (analytically). The implementation will be MIT licensed package. If the idea works out, hopefully the implementation delivers performance.

I’m using this problem/package as a real world exercise in putting the Julia ecosystem through its paces.
You are correct the performance considerations may change as the implementation progresses.

It does appear that for this project Julia provides a elegant way of expressing ideas without sacrificing human comprehension.