Optimize benchmark-code

Johannes · December 10, 2019, 4:36pm

Hey all, I found this benchmark comparing different languages in economics and I wonder whether the underlying Julia code can be optimized further (full repository is available here):

## Basic RBC model with full depreciation
#
# Jesus Fernandez-Villaverde
# Haverford, July 29, 2013

function main()

    ##  1. Calibration

    α = 1/3     # Elasticity of output w.r.t. capital
    β = 0.95    # Discount factor

    # Productivity values
    vProductivity = [0.9792 0.9896 1.0000 1.0106 1.0212]

    # Transition matrix
    mTransition   = [0.9727 0.0273 0.0000 0.0000 0.0000;
                     0.0041 0.9806 0.0153 0.0000 0.0000;
                     0.0000 0.0082 0.9837 0.0082 0.0000;
                     0.0000 0.0000 0.0153 0.9806 0.0041;
                     0.0000 0.0000 0.0000 0.0273 0.9727]

    # 2. Steady State

    capitalSteadyState = (α*β)^(1/(1-α))
    outputSteadyState = capitalSteadyState^α
    consumptionSteadyState = outputSteadyState-capitalSteadyState

    println("Output = ",outputSteadyState," Capital = ",capitalSteadyState," Consumption = ",consumptionSteadyState)

    # We generate the grid of capital
    vGridCapital = collect(0.5*capitalSteadyState:0.00001:1.5*capitalSteadyState)

    nGridCapital = length(vGridCapital)
    nGridProductivity = length(vProductivity)

    # 3. Required matrices and vectors

    mOutput           = zeros(nGridCapital,nGridProductivity)
    mValueFunction    = zeros(nGridCapital,nGridProductivity)
    mValueFunctionNew = zeros(nGridCapital,nGridProductivity)
    mPolicyFunction   = zeros(nGridCapital,nGridProductivity)
    expectedValueFunction = zeros(nGridCapital,nGridProductivity)

    # 4. We pre-build output for each point in the grid

    mOutput = (vGridCapital.^α)*vProductivity;

    # 5. Main iteration

    maxDifference = 10.0
    tolerance = 0.0000001
    iteration = 0

    while(maxDifference > tolerance)
        expectedValueFunction = mValueFunction*mTransition';

        for nProductivity in 1:nGridProductivity

            # We start from previous choice (monotonicity of policy function)
            gridCapitalNextPeriod = 1

            for nCapital in 1:nGridCapital

                valueHighSoFar = -1000.0
                capitalChoice  = vGridCapital[1]

                for nCapitalNextPeriod in gridCapitalNextPeriod:nGridCapital

                    consumption = mOutput[nCapital,nProductivity]-vGridCapital[nCapitalNextPeriod]
                    valueProvisional = (1-β)*log(consumption)+β*expectedValueFunction[nCapitalNextPeriod,nProductivity]

                    if (valueProvisional>valueHighSoFar)
                	       valueHighSoFar = valueProvisional
                	       capitalChoice = vGridCapital[nCapitalNextPeriod]
                	       gridCapitalNextPeriod = nCapitalNextPeriod
                    else
                	       break # We break when we have achieved the max
                    end

                end

                mValueFunctionNew[nCapital,nProductivity] = valueHighSoFar
                mPolicyFunction[nCapital,nProductivity] = capitalChoice

            end

        end
        
        maxDifference     = maximum(abs.(mValueFunctionNew-mValueFunction))
        mValueFunction, mValueFunctionNew = mValueFunctionNew, mValueFunction
        
        iteration = iteration+1
        if mod(iteration,10)==0 || iteration == 1
            println(" Iteration = ", iteration, " Sup Diff = ", maxDifference)
        end

    end

    println(" Iteration = ", iteration, " Sup Diff = ", maxDifference)
    println(" ")
    println(" My check = ", mPolicyFunction[1000,3])
    println(" My check = ", mValueFunction[1000,3])

end

Right now, Julia 1.0 is around 40%-50% slower than both Fortran and C++ (in the paper, as well as on my machine).

As far as I can see, the code is written properly in column-major-order and I found no obvious type-instabilities. I also tried Traceur and @code_warntype, but that didn’t get me more insights.

I have the highest optimization level enabled and removed some printing, so that Juno’s slow printing doesn’t mess up the results. I know that one could play around with some libraries for the matrix multiplication, but that will affect the results only marginally.

I would appreciate any tips!

kristoffer.carlsson · December 10, 2019, 4:54pm

Seems that this is mostly bottlenecked on the log call so there doesn’t seem to be anything super-simple that can be done to significantly speed things up (without being smart with how often log gets called).

Johannes · December 10, 2019, 6:03pm

But is the log actually slower than the log in C++ and Fortran? It get’s called around 60 Mio. times, so this could be indeed quite significant (the runtime itself is around 1.3s on my machine).

kristoffer.carlsson · December 10, 2019, 6:16pm

You could remove the log call and re-time julia and the rest.

tbeason · December 10, 2019, 6:28pm

This was posted here quite a while back. I think a few people already implemented some optimizations on the code, that’s why you don’t see any low-hanging ones left.

favba · December 10, 2019, 6:44pm

Changing

to

maxDifference = maximum(abs.(mValueFunctionNew .- mValueFunction))

prevents one extra allocation in the outer loop and seems to help a tiny bit in my laptop.

tbeason · December 10, 2019, 6:55pm

It looks like you could just create a log consumption array outside the loops entirely. That should help.

favba · December 10, 2019, 7:11pm

This greatly reduce the number of allocations. In my laptop it is 18% faster than the original code.

using LinearAlgebra
function newmain()

    ##  1. Calibration

    α = 1/3     # Elasticity of output w.r.t. capital
    β = 0.95    # Discount factor

    # Productivity values
    vProductivity = [0.9792 0.9896 1.0000 1.0106 1.0212]

    # Transition matrix
    mTransition   = [0.9727 0.0273 0.0000 0.0000 0.0000;
                     0.0041 0.9806 0.0153 0.0000 0.0000;
                     0.0000 0.0082 0.9837 0.0082 0.0000;
                     0.0000 0.0000 0.0153 0.9806 0.0041;
                     0.0000 0.0000 0.0000 0.0273 0.9727]

    # 2. Steady State

    capitalSteadyState = (α*β)^(1/(1-α))
    outputSteadyState = capitalSteadyState^α
    consumptionSteadyState = outputSteadyState-capitalSteadyState

    println("Output = ",outputSteadyState," Capital = ",capitalSteadyState," Consumption = ",consumptionSteadyState)

    # We generate the grid of capital
    vGridCapital = collect(0.5*capitalSteadyState:0.00001:1.5*capitalSteadyState)

    nGridCapital = length(vGridCapital)
    nGridProductivity = length(vProductivity)

    # 3. Required matrices and vectors

    mOutput           = zeros(nGridCapital,nGridProductivity)
    mValueFunction    = zeros(nGridCapital,nGridProductivity)
    mValueFunctionNew = zeros(nGridCapital,nGridProductivity)
    mPolicyFunction   = zeros(nGridCapital,nGridProductivity)
    expectedValueFunction = zeros(nGridCapital,nGridProductivity)

    # 4. We pre-build output for each point in the grid

    mOutput = (vGridCapital.^α)*vProductivity;

    # 5. Main iteration

    maxDifference = 10.0
    tolerance = 0.0000001
    iteration = 0

    while(maxDifference > tolerance)
        mul!(expectedValueFunction, mValueFunction, mTransition');

        for nProductivity in 1:nGridProductivity

            # We start from previous choice (monotonicity of policy function)
            gridCapitalNextPeriod = 1

            for nCapital in 1:nGridCapital

                valueHighSoFar = -1000.0
                capitalChoice  = vGridCapital[1]

                for nCapitalNextPeriod in gridCapitalNextPeriod:nGridCapital

                    consumption = mOutput[nCapital,nProductivity]-vGridCapital[nCapitalNextPeriod]
                    valueProvisional = (1-β)*log(consumption)+β*expectedValueFunction[nCapitalNextPeriod,nProductivity]

                    if (valueProvisional>valueHighSoFar)
                	       valueHighSoFar = valueProvisional
                	       capitalChoice = vGridCapital[nCapitalNextPeriod]
                	       gridCapitalNextPeriod = nCapitalNextPeriod
                    else
                	       break # We break when we have achieved the max
                    end

                end

                mValueFunctionNew[nCapital,nProductivity] = valueHighSoFar
                mPolicyFunction[nCapital,nProductivity] = capitalChoice

            end

        end
        
        maxDifference     = maximum(x->(abs(x[1]-x[2])), zip(mValueFunctionNew, mValueFunction))
        mValueFunction, mValueFunctionNew = mValueFunctionNew, mValueFunction
        
        iteration = iteration+1
        if mod(iteration,10)==0 || iteration == 1
            println(" Iteration = ", iteration, " Sup Diff = ", maxDifference)
        end

    end

    println(" Iteration = ", iteration, " Sup Diff = ", maxDifference)
    println(" ")
    println(" My check = ", mPolicyFunction[1000,3])
    println(" My check = ", mValueFunction[1000,3])

end

Johannes · December 10, 2019, 7:26pm

That makes sense, I didn’t know that. But it nevertheless useful to repost, since this was pre-1.0…

longemen3000 · December 10, 2019, 7:34pm

maybe threading the code can help a little?

Johannes · December 10, 2019, 7:46pm

Yeah, but the point is the relative speed to C++ or Fortran (which are BTW still quite a bit faster): You could thread them as well.

Oscar_Smith · December 10, 2019, 8:25pm

By patching around julia’s log being slow, and only calculating it when it might be needed, I’ve gotten a 30% speedup. Here’s the inner loop. (I use log(2)*Base.Math.exponent(consumption) as a cheap overestimate for log(x) (floating point tricks). Here’s the new inner loop.

for nCapitalNextPeriod in gridCapitalNextPeriod:nGridCapital
    consumption = mOutput[nCapital,nProductivity]-vGridCapital[nCapitalNextPeriod]
    cheap = β*expectedValueFunction[nCapitalNextPeriod,nProductivity]

     if (1-β)*log(2)*Base.Math.exponent(consumption) +cheap >valueHighSoFar
         valueProvisional = (1-β)*log(consumption)+cheap
         if (valueProvisional>valueHighSoFar)
              valueHighSoFar = valueProvisional
              capitalChoice = vGridCapital[nCapitalNextPeriod]
              gridCapitalNextPeriod = nCapitalNextPeriod
          else
              break # We break when we have achieved the max
         end
     else
         break
     end
 end

Topic		Replies	Views
A Comparison of Programming Languages in Economics Community	62	6490	August 7, 2018
Julia slower than Matlab & Python? No Performance economics , tensorflow , matlab , pytorch	120	17374	March 16, 2021
Nice blogpost comparing Julia w/ 6 other languages to solve a Logistic DE New to Julia	24	1527	May 26, 2022
My julia code is somehow much slower than the matlab code New to Julia question , performance , matlab	55	4589	December 30, 2022
Help me optimize dynamic programming code General Usage	7	1628	October 12, 2019

Optimize benchmark-code

Related topics