const FileData = AbstractVector{UInt8}

Hi,

I thought const FileData = AbstractVector{UInt8} was just an alias, but apparently it is not.

buf = obj.buf  # the obj.buf field is declared as ::FileData
while start < size
    ...
    fn(@view buf[start:stop])
    ...
end

The version above allocates memory on every @view invocation; in my case, that adds up to several GB of data.
The version below allocates (almost) nothing. I expected the same from the version above. Can anyone explain why that is not the case?

buf::Vector{UInt8} = obj.buf  # the obj.buf field is still declared as ::FileData
while start < size
   ...
   fn(@view buf[start:stop])
   ...
end

thanks, Juergen

AbstractVector{UInt8} and Vector{UInt8} are two very different things. Why did you not compare using const FileData = Vector{UInt8} instead of const FileData = AbstractVector{UInt8}?
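To make the difference concrete, here is a small sketch (the names ConcreteData, AbstractData, and Holder are made up for illustration, not from your code):

const ConcreteData = Vector{UInt8}          # alias for one concrete type
const AbstractData = AbstractVector{UInt8}  # alias for a whole family of types

isconcretetype(ConcreteData)  # true
isconcretetype(AbstractData)  # false: Vector{UInt8}, CodeUnits, SubArray views, ... all match

struct Holder
    buf::AbstractData   # the compiler only knows the abstract supertype here
end

h = Holder(rand(UInt8, 8))
fieldtype(Holder, :buf)  # AbstractVector{UInt8}, no matter what was actually stored

With an abstract field the concrete type of buf is only known at run time, so code reading the field cannot be fully specialized.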


Now I completely removed const FileData = AbstractVector{UInt8} from the source code.

This is still slow:

buf = obj.buf  # buf is defined as FileData
while start < size
   ...
   fn(@view buf[start:stop])
   ...
end

This is 30 times faster (and has almost no allocations):

buf::Vector{UInt8} = obj.buf  # buf is defined as FileData
while start < size
   ...
   fn(@view buf[start:stop])
   ...
end

Did you replace const FileData = AbstractVector{UInt8} with const FileData = Vector{UInt8}? That was the suggestion @Henrique_Becker made. You seem to be doing something different.


Try wrapping your calculation in a function. You are working in global scope, and that is not a good idea for performance.
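For example (a generic sketch, not your actual code):

data = rand(UInt8, 10^7)

# At top level, data is an untyped global, so this loop is slow and allocates:
s = 0
for b in data
    global s += b
end

# The same loop inside a function is specialized for Vector{UInt8}
# and runs with essentially no allocations:
function process(buf)
    s = 0
    for b in buf
        s += b
    end
    return s
end

@time process(data)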


@Skoffer: this is just a code snippet; it is actually taken from a function.

@johnmyleswhite / @Henrique_Becker I have now reverted the latest changes and only replaced const FileData = AbstractVector{UInt8} with const FileData = Vector{UInt8}. I had to adjust our test code, which uses b"…" literals a lot; these create CodeUnits{UInt8,String}, which was my original reason for using AbstractVector. With this change the code now runs much faster and with far fewer allocations.
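For reference, the situation in the test code looks roughly like this (values are made up):

literal = b"abc"                   # byte-string literal, as used in the tests

literal isa AbstractVector{UInt8}  # true
literal isa Vector{UInt8}          # false, it is a Base.CodeUnits{UInt8,String}

Vector{UInt8}(literal)             # explicit copy, if a real Vector{UInt8} is needed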

And even though I understand the performance increase, I still don’t understand why Julia needs to do so many allocations (copies, I assume) when passing abstractly typed values around. I thought variables and function arguments just bind to the same “slot”, no matter whether the type is abstract or not, at least as long as I’m not converting to another concrete type.

-Juergen

Hmmm, I am not sure this has anything to do with copies. The allocations usually come from the fact that the value cannot be stored unboxed (that is, inline as plain bytes); instead it lives behind a pointer, together with a tag for its current concrete type, and that indirection layer has to be allocated again and again.

Also, if you have a struct that is itself concrete but has a boxed/abstract field inside, any function receiving that struct will specialize for the concrete struct type, but whenever it touches the inner field it has to handle every possible type that might be stored there, which is known to generate allocations. One way to avoid that is to not manipulate the boxed field directly in the function that takes your struct, but instead to group all computation over that field inside a separate function and pass just the field to it, so that inner function is specialized for the right type. This technique is called a “function barrier” and is described in the docs.
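A minimal sketch of the function-barrier idea (struct and function names are made up here):

struct FileObj
    buf::AbstractVector{UInt8}   # abstractly typed, i.e. boxed, field
end

# Slow pattern: the loop touches obj.buf directly, so every access is
# dynamically dispatched and may allocate.
function process_slow(obj::FileObj)
    s = 0
    for b in obj.buf
        s += b
    end
    return s
end

# Function barrier: pull the field out once and hand it to an inner function,
# which gets compiled for the concrete type of buf.
process_fast(obj::FileObj) = _process(obj.buf)

function _process(buf)   # specializes on typeof(buf)
    s = 0
    for b in buf
        s += b
    end
    return s
end

obj = FileObj(rand(UInt8, 10^6))
@time process_slow(obj)   # many small allocations from the boxed field
@time process_fast(obj)   # one dynamic dispatch at the barrier, then a tight loop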

Note that it would be much easier to help with a self-contained MWE.
