Thanks so much. This helps to improve the performance. But I have a related question. Why does the @btime
result change in the following code?
#---------------------------------------
# housekeeping
#---------------------------------------
using FiniteDifferences, Profile, LinearAlgebra, BenchmarkTools
#---------------------------------------
# internal functions
#---------------------------------------
# function about equilibrium condition (excess labor demand)
@views function es_loop(wzet,y,x,kappa,sig,N,tau_mat,cma,cld)
# define the matrix dimension
broadcast!(.^, tau_mat, wzet[N+1], kappa)
# armington condition
# 1. compute the change in the market access (cma)
cma_mat = x .* (tau_mat .* wzet[1:N]) .^ (1-sig)
for i = 1:N
cma .+= cma_mat[i:i,:]
end
# 2. sum over the change in labor demand across destination (cld)
cld_mat = y .* (tau_mat .* wzet[1:N]) .^ (1-sig) .* wzet[1:N]' ./ cma
for j = 1:N
cld .+= cld_mat[:,j:j]
end
excess_loop = wzet[1:N] - cld
return excess_loop
end
#---------------------------------------
# parameters and data
#---------------------------------------
N = 3; # number of countries
sig = 4; # elasticity of substitution
homebias = 0.9; # the size of the domestic consumption tendency
X = homebias * Matrix{Float64}(I,N,N) + (1-homebias) * 1/N * ones(N,N);
y = X./sum(X,dims=2);
x = X./sum(X,dims=1);
# trade cost setting
kappa = vcat(zeros(N-1,N), [ones(N-1); 0]'); # the kernel of the trade cost change
tau_mat = ones(N,N); # pre-allocate the tau matrix
wzet = ones(N+1); # initial point
# pre-allocate
cma = zeros(1,N); # change in the market access
cld = zeros(N,1); # change in labor demand
cma_mat = zeros(N,N);
cld_mat = zeros(N,N);
#---------------------------------------
# check performance
#---------------------------------------
es_loop(wzet,y,x,kappa,sig,N,tau_mat,cma,cld)
Specifically, the last line es_loop(wzet,y,x,kappa,sig,N,tau_mat,cma,cld)
returns a correct answer as follows:
3×1 Matrix{Float64}:
0.0
0.0
0.0
However, when I change the last line into @btime es_loop(wzet,y,x,kappa,sig,N,tau_mat,cma,cld)
, it returns a wrong answer as follows:
240.423 ns (3 allocations: 336 bytes)
3×1 Matrix{Float64}:
-12.700585535625594
-12.700585535625592
-12.700585535625592