That’s a pretty inelegant failure! I get something similar.
For smaller sizes it does run, but I wouldn’t expect it to be efficient: it’s completely unaware of sparsity and just works through every element. It shouldn’t be hard to write a fast logsumexp(::SparseMatrixCSC) though.
julia> Tullio.storage_type(sprand(10,10,0.1)) # not <: Array{<:BlasFloat}, hence will not use LoopVectorization
SparseMatrixCSC{Float64,Int64}
julia> A[3,4] # but you can index, so fallback seems OK?
0.0
julia> logsumexp(A)
julia(43313,0x700003cdf000) malloc: *** error for object 0x18a8f9000: pointer being freed was not allocated
julia(43313,0x700003cdf000) malloc: *** set a breakpoint in malloc_error_break to debug
signal (6): Abort trap: 6
in expression starting at REPL[73]:1
__pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
Allocations: 1296329819 (Pool: 1296254828; Big: 74991); GC: 596
Abort trap: 6
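
For what it's worth, here is a rough sketch of what a sparsity-aware version might look like. The name `sparse_logsumexp` is made up for illustration and is not part of Tullio or any existing package. The idea is that every structural zero contributes exp(0) = 1 to the sum, so only the stored values need to be visited. It assumes the stored values are finite.

```julia
using SparseArrays

# Hypothetical helper (illustration only): logsumexp over ALL elements of a
# sparse matrix, touching only the stored entries.
function sparse_logsumexp(A::SparseMatrixCSC{T}) where {T<:AbstractFloat}
    isempty(A) && return T(-Inf)           # log of an empty sum
    n = nnz(A)
    vals = view(nonzeros(A), 1:n)          # stored entries only
    nimplicit = length(A) - n              # structural zeros, each adds exp(0) = 1

    # Largest element overall, used to keep exp() from overflowing.
    mx = n > 0 ? maximum(vals) : zero(T)
    if nimplicit > 0
        mx = max(mx, zero(T))              # a structural zero may be the maximum
    end

    # Contribution of the structural zeros: nimplicit * exp(0 - mx).
    s = nimplicit > 0 ? nimplicit * exp(-mx) : zero(T)
    for v in vals
        s += exp(v - mx)
    end
    return mx + log(s)
end
```

On a small matrix this can be checked against the dense computation, e.g. `sparse_logsumexp(A) ≈ log(sum(exp, Matrix(A)))` for `A = sprand(50, 40, 0.1)`. The work is proportional to `nnz(A)` rather than `length(A)`.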