I am writing some code that performs a simple time-history integration solver (Newmark method for structural dynamics) and I looking for some suggestions in how I can improve the performance of my code. The below snippet of code is part of a larger function that is called from another script.
Currently, I am calculating a parameter called displacements for many time steps (think on the order of 5,000,000 steps). From observing the code and running ProfileView to profile the code it is apparent that the matrix multiplication seems to be slowing down the script in the for-loop. Some details about the input:
dofs = number of degrees of freedom in the system (at most 1282)
num_time_steps = 5,000,000
yF = vector of size (num_steps, 1)
loading_vec = vector of size (dofs-2, 1)
a1, a2, a3, k_hat_inv = matrices of size (dofs-2, dofs-2) (a1, a2, a3 are sparsely populated matrices, k_hat_inv is a full matrix)
vel_const1, vel_const2, vel_const3 = Float64 constant
acc_const1, acc_const1, acc_const1= Float64 constant
For a simple example when dofs = 12 this code runs very quickly (~3-4s) but as dofs increases to 1282 the run time increases very significantly (still running after 1/2 day) and I would like to find some ways to decrease this run time. Are there any other ways that I can improve the performance of this code?
You can also see in the for-loop some commented out lines, these are all things I have tried but resulted in poorer performance compared to the uncommented lines.
# Initial state, forst time step at t = 0s cdisp = zeros(dofs - 2) cvel = zeros(dofs - 2) #caccel = m_inv * (loading_vec .- (c * cvel) .- k * cdisp) -> Always zero for zero initial conditions caccel = zeros(dofs - 2) u_i_p1 = zeros(dofs-2) udot_i_p1 = zeros(dofs-2) u2dot_i_p1 = zeros(dofs-2) a1_vec = zeros(dofs-2) a2_vec = zeros(dofs-2) a3_vec = zeros(dofs-2) P_i_p1_hat = zeros(dofs-2) # Initialize the displacement vector displacements = zeros(dofs - 2, disp_steps) # Start the time history analysis as per Chopra, 2017 disp_counter = 1 println(" Time history started") for ii = 1:num_time_steps-1 # Obtain the force for the next time step @inbounds loading_vec[impact_dof] = yF[ii+1] # Obtain P_i_p1_hat #P_i_p1_hat = loading_vec + a1 * cdisp + a2 * cvel + a3 * caccel P_i_p1_hat .= loading_vec .+ mul!(a1_vec, a1, cdisp) .+ mul!(a2_vec, a2, cvel) .+ mul!(a3_vec, a3, caccel) # Calculate the displacement due to this force, u_i+1 = k_hat_inv * P_i_p1 #u_i_p1 = k_hat_inv * P_i_p1_hat mul!(u_i_p1, k_hat_inv, P_i_p1_hat) #u_i_p1 = k_hat_inv * (loading_vec + a1 * cdisp + a2 * cvel + a3 * caccel) #mul!(u_i_p1, k_hat_inv, (loading_vec .+ (a1 * cdisp) .+ (a2 * cvel) .+ (a3 * caccel))) #mul!(u_i_p1, k_hat_inv, (loading_vec .+ mul!(a1_vec, a1, cdisp) .+ mul!(a2_vec, a2, cvel) .+ mul!(a3_vec, a3, caccel))) # Calculate the velocity due to this displacement, u_dot_i+1 #udot_i_p1 = vel_const1 * (u_i_p1 - cdisp) + vel_const2 * cvel + vel_const3 * caccel udot_i_p1 .= (vel_const1 .* (u_i_p1 .- cdisp)) .+ (vel_const2 .* cvel) .+ (vel_const3 .* caccel) # Calculate the acceleration due to this displacement, u_2dot_i+1 #u2dot_i_p1 = acc_const1 * (u_i_p1 - cdisp) - acc_const2 * cvel - acc_const3 * caccel u2dot_i_p1 .= (acc_const1 .* (u_i_p1 .- cdisp)) .- (acc_const2 .* cvel) .- (acc_const3 .* caccel) # Get current values for the next step cdisp .= u_i_p1 cvel .= udot_i_p1 caccel .= u2dot_i_p1 # Write the displacement vector if ii % disp_save == 0 disp_counter += 1 displacements[:, disp_counter] .= u_i_p1 end end