I ran it with 8 samples instead of the 16 from the original MVE and got nearly identical results, at least for those two allocation types. As for the laptop being underpowered or overtaxed, that’s certainly possible, but I have run other models on it that performed reasonably and weren’t too different from what I’m doing here. At least, they didn’t seem all that different…
Flat Flat% Sum% Cum Cum% Name Inlined?
22 12.94% 12.94% 22 12.94% Alloc: Base.IntrusiveLinkedList{Task}
18 10.59% 23.53% 18 10.59% Alloc: Base.Threads.SpinLock
16 9.41% 32.94% 16 9.41% Alloc: Task
16 9.41% 42.35% 16 9.41% Alloc: NNlib.var"#539#540"{NNlib.var"#conv_part#538"{Array{Float32, 3}, Float32, Float32, SubArray{Float32, 5, Array{Float32, 5}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}, Base.Slice{Base.OneTo{Int64}}}, false}, SubArray{Float32, 5, Array{Float32, 5}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}, Base.Slice{Base.OneTo{Int64}}}, false}, SubArray{Float32, 5, Array{Float32, 5}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}}, true}, NNlib.DenseConvDims{3, 3, 3, 6, 3}, Int64, Int64, Int64}, UnitRange{Int64}, Int64}
16 9.41% 51.76% 16 9.41% Alloc: Base.GenericCondition{Base.Threads.SpinLock}
12 7.06% 58.82% 12 7.06% Alloc: Memory{Float32}
10 5.88% 64.71% 10 5.88% Alloc: Matrix{Float32}
10 5.88% 70.59% 10 5.88% Alloc: Array{Float32, 4}
8 4.71% 75.29% 8 4.71% Alloc: Array{Float32, 5}
5 2.94% 78.24% 5 2.94% Alloc: Profile.Allocs.BufferType
4 2.35% 80.59% 4 2.35% Alloc: Vector{UInt64}
4 2.35% 82.94% 4 2.35% Alloc: Memory{UInt64}
4 2.35% 85.29% 4 2.35% Alloc: Memory{Any}
4 2.35% 87.65% 4 2.35% Alloc: BitVector
3 1.76% 89.41% 3 1.76% Alloc: Tuple{Int64, Int64, Int64}
2 1.18% 90.59% 2 1.18% Alloc: Vector{Int64}
2 1.18% 91.76% 2 1.18% Alloc: Vector{Any}
2 1.18% 92.94% 2 1.18% Alloc: ReentrantLock
2 1.18% 94.12% 2 1.18% Alloc: Memory{Int64}
2 1.18% 95.29% 2 1.18% Alloc: InvalidStateException
2 1.18% 96.47% 2 1.18% Alloc: Channel{Any}
2 1.18% 97.65% 2 1.18% Alloc: Array{Float32, 3}
1 0.59% 98.24% 1 0.59% Alloc: NTuple{6, Int64}
1 0.59% 98.82% 1 0.59% Alloc: NNlib.PoolDims{3, 3, 3, 6, 3}
1 0.59% 99.41% 1 0.59% Alloc: Flux.Chain{Tuple{Flux.Chain{Tuple{Flux.Conv{2, 4, typeof(NNlib.leakyrelu), Array{Float32, 4}, Bool}, Flux.MeanPool{2, 4}, Flux.Conv{2, 4, typeof(NNlib.leakyrelu), Array{Float32, 4}, Bool}, Flux.Chain{Tuple{Flux.Dense{typeof(NNlib.leakyrelu), Matrix{Float32}, Vector{Float32}}, Main.var"workspace#22".var"#init_cvn##0#init_cvn##1", Flux.Dense{typeof(NNlib.leakyrelu), Matrix{Float32}, Vector{Float32}}, Main.var"workspace#22".var"#init_cvn##2#init_cvn##3", Flux.Dense{typeof(tanh), Matrix{Float32}, Vector{Float32}}}}}}, Flux.Chain{Tuple{Flux.Dense{typeof(NNlib.leakyrelu), Matrix{Float32}, Vector{Float32}}, Flux.Dense{typeof(NNlib.σ), Matrix{Float32}, Vector{Float32}}}}}}
1 0.59% 100.00% 1 0.59% Alloc: @NamedTuple{alpha::Int64, beta::Int64}
0 0.00% 100.00% 170 100.00% with_logstate
0 0.00% 100.00% 170 100.00% with_logger_and_io_to_logs
0 0.00% 100.00% 170 100.00% with_logger
0 0.00% 100.00% 170 100.00% with_io_to_logs
0 0.00% 100.00% 4 2.35% vect
0 0.00% 100.00% 2 1.18% sync_end(::Channel{Any})
0 0.00% 100.00% 170 100.00% start_task
0 0.00% 100.00% 29 17.06% similar
0 0.00% 100.00% 170 100.00% run_inside_trycatch
0 0.00% 100.00% 170 100.00% run_expression
0 0.00% 100.00% 18 10.59% reshape
0 0.00% 100.00% 4 2.35% put_buffered(::Channel{Any}, ::Task)
0 0.00% 100.00% 4 2.35% put!
0 0.00% 100.00% 4 2.35% push!
0 0.00% 100.00% 6 3.53% permutedims!
0 0.00% 100.00% 16 9.41% permutedims
0 0.00% 100.00% 17 10.00% new_as_memoryref
0 0.00% 100.00% 6 3.53% meanpool_direct!
0 0.00% 100.00% 8 4.71% meanpool!
0 0.00% 100.00% 11 6.47% meanpool
0 0.00% 100.00% 170 100.00% maybe_record_alloc_to_profile
0 0.00% 100.00% 170 100.00% macro expansion
0 0.00% 100.00% 170 100.00% jl_toplevel_eval_flex
0 0.00% 100.00% 170 100.00% jl_interpret_toplevel_thunk
0 0.00% 100.00% 36 21.18% jl_gc_alloc_
0 0.00% 100.00% 170 100.00% jl_f_invokelatest
0 0.00% 100.00% 170 100.00% jl_f__apply_iterate
0 0.00% 100.00% 170 100.00% jl_apply
0 0.00% 100.00% 25 14.71% jl_alloc_genericmemory_unchecked
0 0.00% 100.00% 12 7.06% isperm
0 0.00% 100.00% 8 4.71% insert_singleton_spatial_dimension
0 0.00% 100.00% 170 100.00% ijl_toplevel_eval_in
0 0.00% 100.00% 170 100.00% ijl_toplevel_eval
0 0.00% 100.00% 16 9.41% ijl_new_task
0 0.00% 100.00% 129 75.88% ijl_gc_small_alloc
0 0.00% 100.00% 5 2.94% ijl_gc_managed_malloc
0 0.00% 100.00% 12 7.06% falses
0 0.00% 100.00% 170 100.00% eval_value
0 0.00% 100.00% 170 100.00% eval_stmt_value
0 0.00% 100.00% 170 100.00% eval_body
0 0.00% 100.00% 170 100.00% eval(::Module, ::Any)
0 0.00% 100.00% 170 100.00% do_call
0 0.00% 100.00% 106 62.35% conv_im2col!
0 0.00% 100.00% 106 62.35% conv_group
0 0.00% 100.00% 112 65.88% conv!
0 0.00% 100.00% 118 69.41% conv
0 0.00% 100.00% 170 100.00% compute
0 0.00% 100.00% 2 1.18% close
0 0.00% 100.00% 6 3.53% checkdims_perm
0 0.00% 100.00% 4 2.35% array_new_memory
0 0.00% 100.00% 12 7.06% _isperm
0 0.00% 100.00% 4 2.35% _growend!
0 0.00% 100.00% 169 99.41% _applychain
0 0.00% 100.00% 48 28.24% _Task
0 0.00% 100.00% 170 100.00% [unknown function]
0 0.00% 100.00% 4 2.35% Val
0 0.00% 100.00% 80 47.06% Task
0 0.00% 100.00% 18 10.59% SpinLock
0 0.00% 100.00% 6 3.53% ReentrantLock
0 0.00% 100.00% 11 6.47% MeanPool
0 0.00% 100.00% 22 12.94% IntrusiveLinkedList
0 0.00% 100.00% 27 15.88% GenericMemory
0 0.00% 100.00% 40 23.53% GenericCondition
0 0.00% 100.00% 20 11.76% Dense
0 0.00% 100.00% 118 69.41% Conv
0 0.00% 100.00% 14 8.24% Channel
0 0.00% 100.00% 169 99.41% Chain
0 0.00% 100.00% 12 7.06% BitArray
0 0.00% 100.00% 43 25.29% Array
0 0.00% 100.00% 10 5.88% *
0 0.00% 100.00% 4 2.35% (::Base.var"#_growend!##0#_growend!##1"{Vector{Any}, Int64, Int64, Int64, Int64, Int64, Memory{Any}, MemoryRef{Any}})()
0 0.00% 100.00% 170 100.00% #with_logger_and_io_to_logs#121
0 0.00% 100.00% 170 100.00% #with_io_to_logs#125
0 0.00% 100.00% 170 100.00% #run_expression#28
0 0.00% 100.00% 6 3.53% #meanpool_direct!#564
0 0.00% 100.00% 11 6.47% #meanpool#377
0 0.00% 100.00% 8 4.71% #meanpool!#361
0 0.00% 100.00% 6 3.53% #meanpool!#346
0 0.00% 100.00% 10 5.88% #init_cvn##2
0 0.00% 100.00% 10 5.88% #init_cvn##0
0 0.00% 100.00% 170 100.00% #handle##0
0 0.00% 100.00% 100 58.82% #conv_im2col!#536
0 0.00% 100.00% 118 69.41% #conv#124
0 0.00% 100.00% 106 62.35% #conv!#181
0 0.00% 100.00% 112 65.88% #conv!#143
0 0.00% 100.00% 170 100.00% #36
0 0.00% 100.00% 170 100.00% #32
0 0.00% 100.00% 170 100.00% #123
0 0.00% 100.00% 170 100.00% ##function_wrapped_cell#665