Finite element method interpolation performance

I have a question about interpolation performance in the finite element method.
The Official Document suggests using StaticArrays.jl for operations on small arrays.
In my finite element program, the global nodal displacements are computed first and stored in a Vector, and I then want to compute the displacement at the integration points inside each element. For each integration point, I gather the element's nodal displacements from the global displacement vector and store them in an MVector, like this:

@views poidis.disfre_nod .= nodfre_nod[ithdim_poi,poidis.elecon_nod]
@views poidis.noddis_ele .= MVector{numnod_ele}(noddis_nod[poidis.disfre_nod])

The shape function values are also stored in an MVector. I then compute the displacement at the integration point with

poidis.poidis_poi[ithdim_poi,ithpoi_poi] = dot(poidis.shafun_ele,poidis.noddis_ele)
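
For reference, the gather-and-interpolate step can be sketched on its own, outside any struct. This is only an illustration under assumed data: a 4-node element, made-up displacement values, and a hypothetical helper `interpolate_point`. The names `noddis_nod`, `disfre_nod`, and `shafun_ele` mirror the ones above; immutable `SVector`s stand in for the `MVector` buffers.

```julia
using LinearAlgebra
using StaticArrays

# Hypothetical helper: interpolate one displacement component at one
# integration point from the global displacement vector.
function interpolate_point(noddis_nod, disfre_nod, shafun_ele)
    # Gather the element's nodal displacements. `map` over an SVector
    # returns an SVector, so no temporary heap Vector is created here.
    noddis_ele = map(i -> noddis_nod[i], disfre_nod)
    # The interpolated value is the dot product of shape function values
    # and nodal displacements.
    return dot(shafun_ele, noddis_ele)
end

# Made-up global displacement vector (one value per degree of freedom).
noddis_nod = collect(0.1:0.1:1.0)

# Made-up degree-of-freedom indices of one 4-node element.
disfre_nod = SVector{4,Int}(2, 3, 7, 6)

# Shape function values at the centre of a bilinear quadrilateral.
shafun_ele = SVector{4,Float64}(0.25, 0.25, 0.25, 0.25)

poidis = interpolate_point(noddis_nod, disfre_nod, shafun_ele)
```

Because every argument has a concrete type when the function is called, the compiler can specialize the whole computation on the element size.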

I need to loop over all the elements, so I measured the memory allocation of each line. The result is

        0             @views poidis.disfre_nod .= nodfre_nod[ithdim_poi,poidis.elecon_nod]
 13747184             @views poidis.noddis_ele .= MVector{numnod_ele}(noddis_nod[poidis.disfre_nod])
  3436800             poidis.poidis_poi[ithdim_poi,ithpoi_poi] = dot(poidis.shafun_ele,poidis.noddis_ele)

As we can see, the allocations are huge. I thought the conversion from Vector to MVector might be the expensive part, so I made a small change:

        0             @views poidis.disfre_nod .= nodfre_nod[ithdim_poi,poidis.elecon_nod]
        -             # @views poidis.noddis_ele .= MVector{numnod_ele}(noddis_nod[poidis.disfre_nod])
 10310352             @views poidis.noddis_ele .= noddis_nod[poidis.disfre_nod]
  3436800             poidis.poidis_poi[ithdim_poi,ithpoi_poi] = dot(poidis.shafun_ele,poidis.noddis_ele)

The memory allocation is lower, but still large. What could be the reason?
Another question: why does the dot product of two MVectors allocate so much memory?
Thanks for your suggestions! I can post the full code if necessary.

I just solved it! I had preallocated poidis.disfre_nod and poidis.noddis_ele as MVector, but did not declare the dimension and element type, so the compiler could not infer a concrete type and the types had to be resolved at run time. After declaring poidis.disfre_nod as MVector{4,Int64} (it holds indices) and poidis.noddis_ele as MVector{4,Float64} (it holds displacements), the memory allocation becomes very small.
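
The difference can be sketched as follows. The struct names `LooseBuffers` and `TypedBuffers`, the function `interpolate!`, and all values are hypothetical; only the field names follow the post. The sketch assumes the index buffer holds integers and the displacement buffer holds floats.

```julia
using LinearAlgebra
using StaticArrays

# Fields declared as the bare `MVector`: this is not a concrete type, so the
# compiler cannot know the length or element type when the fields are used.
struct LooseBuffers
    disfre_nod::MVector
    noddis_ele::MVector
    shafun_ele::MVector
end

# Fields declared with length and element type: fully concrete.
struct TypedBuffers
    disfre_nod::MVector{4,Int64}
    noddis_ele::MVector{4,Float64}
    shafun_ele::MVector{4,Float64}
end

# Gather nodal displacements into the preallocated buffer, then interpolate.
function interpolate!(buf, noddis_nod)
    for i in eachindex(buf.disfre_nod)
        buf.noddis_ele[i] = noddis_nod[buf.disfre_nod[i]]
    end
    return dot(buf.shafun_ele, buf.noddis_ele)
end

noddis_nod = collect(0.1:0.1:1.0)  # made-up global displacements

loose = LooseBuffers(MVector(2, 3, 7, 6), MVector(0.0, 0.0, 0.0, 0.0),
                     MVector(0.25, 0.25, 0.25, 0.25))
typed = TypedBuffers(MVector(2, 3, 7, 6), MVector(0.0, 0.0, 0.0, 0.0),
                     MVector(0.25, 0.25, 0.25, 0.25))

# Both give the same value; only the typed version is type-stable.
interpolate!(loose, noddis_nod)
interpolate!(typed, noddis_nod)

isconcretetype(fieldtype(LooseBuffers, :noddis_ele))  # false
isconcretetype(fieldtype(TypedBuffers, :noddis_ele))  # true
```

Running `@code_warntype interpolate!(loose, noddis_nod)` is one way to see the abstractly typed field accesses. A parametric struct such as `struct Buffers{N,T}` would be another way to keep the fields concrete without hard-coding the element size.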

What is the “Official Document”?

Just the documentation at Julia Documentation · The Julia Language.