I defined a composite type Product tho store relevant characteristics of a set of products. I then construct an array of Product and I am ultimately interested in accessing the field values of this array as efficient as possible. I have tried different ways of doing so and I am getting by far the best performance using list comprehension. I was hoping someone could give intuition as to way this is happening and/or provide with the optimal way. Thanks a lot!
using DataFrames
using BenchmarkTools
# define a product structure
struct Product{T}
id::Int64
X0::T
X1::T
X2::T
end
function initialize_products(data::DataFrame, number_products::Int64)
# initialize product array
products = Array{Product}(undef, number_products)
for i in 1:number_products
products[i] = Product(collect(data[i, :])...)
end
return products
end
function comprehension(products)
return [product.X0 for product in products]
end
function get_fields(products)
return getfield.(products, :X0)
end
function using_map(products)
return map(product -> product.X0, products)
end
function main()
df = DataFrame(product_id = [2, 3, 4, 6, 8, 9, 10],
X0 = ones(7),
X1 = rand(7),
X2 = rand(7),
)
products = initialize_products(df, 7)
println(@btime comprehension($products))
println(@btime get_fields($products))
println(@btime using_map($products))
end
main()
Noticing that you manually specified an abstract type for the array’s element type, which will get in the way of inference of the X_::T fields. Comprehensions, broadcasting, and map don’t infer element types and initialize output arrays the same way, but the printouts suggest that they all did some work to infer Vector{Float64} outputs. The more difficult Array{Product} input could be exacerbating the performance discrepancy.
T::Float64 for your example’s inputs, so you could try Array{Product{Float64}}. If you want to make it a bit more flexible, you can first collect the Array{Product,1} first, but check in the loop if all the Product(...) calls result in the same type as the first element’s, let’s call it F, in which case you can copy to a more specific Array{F,1}.
Do this bit of type parameter optimization first and show how the benchmark changes on your system.
Another optimization you could do to get around the automatic output allocation in broadcasting and map is to use in-place broadcasting and map! to mutate an output array you allocate manually.
You code very well for one! I don’t know what you need to know to understand my previous comment, but I’m mainly talking about Julia’s type system, and type inference is an attempt to narrow down types of variables, elements, and expressions, often by the compiler or instantiation of containers. Poor type inference shouldn’t change runtime values, but the compiled code must spend extra time to figure out types and appropriate methods at runtime. @code_warntype and similar reflection methods and macros can help you see type inference, though I’ll warn these assume the runtime types of the input call’s arguments are inferred perfectly (aka specializing), which you may manually opt out of or the compiler intentionally does not do in rare circumstances.
That is quite a bit of reading to do, so if you have any more targeted questions that can clear things up, feel free to ask.
Aside from the advice above, you may want to look into using a structure-of-arrays instead of the array-of-structures form you have here. See Overview · StructArrays for more information, as the StructArrays.jl package has a bunch of convenient features, as well as generally being quite efficient - I’m on my phone so can’t benchmark on your code, but I’ve generally had good results (or at the very least, haven’t noticed a slowdown that would outweigh the convenience).
I think the compiler doesn’t “know” the field name when you do getfield.(products, :X0). Here, :X0 is a regular runtime value, which requires dispatch when actually running the code.
Depending on how you get the original dataset, you may not need dataframes at all: just read your data into a Vector or StructArray in the first place.
I don’t use DataFrames, so running your MWE is a bit difficult. And, really, DataFrames is totally irrelevant to the question, so I suggest you initialize your data like this
products = [Product(i, 1.0, rand(), rand()) for i in [2, 3, 4, 6, 8, 10]]
As for broadcasting, you can make it equally fast as the other methods, just remember to move the broadcasting to the outermost level:
Oh wow, interesting. In terms of dataframes I am constrained by the application but you are right, this is irrelevant. In terms of the broadcasting: do you know why this solution works? Thanks a lot!
I’m not totally sure, but it might perhaps be more difficult to constant propagate through the broadcasting call, while getfield(p, :X0) is sure to be optimized.
To elaborate, with the exception of a few types (Function, Type, Vararg), Julia’s compiler specializes method calls on the argument types by default. That means if you have a very flexible method function foo(x::Any), and you call foo(1), the compiler will compile that foo method specifically for an Int input. Narrowing down types throughout the method is important for efficiency.
Thing is, that’s not good enough for something like getfield(p, :X0). The runtime type of :X0 is Symbol, and that doesn’t narrow down any field or its type. So, the compiler effectively specializes on the constant value of :X01 when it sees the getfield(p, :X0) call in another method it’s compiling, like getX0. BenchmarkTools actually wraps the benchmarked code in a method so there is compilation going on, but it doesn’t help getfield.(products, :X0) because it is transformed to an indirect broadcast(getfield, products, :X0), the getfield(p, :X0) call is gone. It’s not impossible, but it’s more complicated to say that broadcast or whatever related methods will specialize on the constant value of the 3rd argument if the 1st argument is getfield, and the rule is too indirect to identify and work on different methods that internally call getfield.
Thankfully it’s not too hard to put the :X0 into a direct getfield call for broadcasting. As an alternative to defining a named function getX0, you could inline an anonymous one: (p -> getfield(p, :X0)).(products) or the sugar syntax (p -> p.X0).(products), which technically calls getproperty which may fall back to getfield.