This week I gave a tutorial on performance in Julia and helped a few beginners optimize their code. One of the most frequent mistakes was structs with abstractly-typed fields.
As a pedagogical tool, @serenity4 and I wrote a small (unregistered) package to help diagnose and fix these issues. Here it is:
A few examples:
julia> using CheckConcreteStructs
julia> @check_concrete struct A
x
end
ERROR: AbstractFieldError in struct `A`: field `x` with
declared type `Any` is not concretely typed.
julia> @check_concrete struct B
x::AbstractVector{Float64}
end
ERROR: AbstractFieldError in struct `B`: field `x` with
declared type `AbstractVector{Float64}` is not concretely typed.
julia> @check_concrete struct C
x::Vector{<:Real}
end
ERROR: AbstractFieldError in struct `C`: field `x` with
declared type `Vector{<:Real}` is not concretely typed.
julia> @check_concrete struct D{T,V<:AbstractVector}
x::V
end
Feel free to take it out for a spin, test it on your own fancy structs and corner cases, open issues if needed! We’re mostly curious to see:
Oh, I actually wrote something similar recently! Unfortunately it’s on our private repo at work and would take a little bit of effort to get permission to open source. I wish I would have put it in ConcreteStructs instead, now! Maybe this or something like it should live there.
As for the implementation of the one I wrote, I actually went for something that operates on modules and searches through the defined structs for non-concreteness. You have to explicitly opt out like:
struct MyStruct{R<:Real}
an_abstract_field::Vector{Real}
another_abstract_field::AbstractVector{R}
a_concrete_field::Vector{R}
end
# Allow any fields to be abstract...
@allow_abstract_fields MyStruct
# ...or only allow certain fields
@allow_abstract_fields MyStruct: an_abstract_field, another_abstract_field
Since many of the people touching our Julia code are new to the language or only rarely have to interact with it, having this be opt-out made the most sense as we could put checks in CI and people wouldn’t have to know to add checks for their structs. But I think for pedagogical purposes, having an opt-in approach instead makes a lot of sense.
I was thinking the @check_concrete macro could apply both to struct definitions or whole modules (got the idea from @stable in DispatchDoctor.jl). That way you get the fine-grained control if necessary, or you can go brute force.
that eagerly expands any includes and places the check on any structs that are defined? That would have made things a lot easier than what I had to do to check after the module’s already been created.
The following definitions will execute without error:
...
@check_concrete struct GoodType2{T<:Real}
x::Vector{T}
end
@check_concrete struct GoodType3{T<:Real,V<:AbstractVector{T}}
x::V
end
Since these also allow parametric types with abstract type parameters and possibly direct abstract type fields, would it be worth including a function to check fields of particular parametric types as well? Or is the idea that other reflection and type inference tools like @code_warntype would be enough to catch those, and this is intended as training wheels for beginners who want specialized performance but don’t quite grasp what type annotations do where yet?
The per-type macro should exist for per-type designation especially interactive use, but I imagine it’s possible for a beginner to want to do this check across a file. @check_concrete_all begin ... end could be put in each file, but I don’t like scattering boilerplate and I’d be concerned about niche “top-level” stuff cropping up. The per-module macro would be nice, but it doesn’t inherently transform what will be evaluated in include calls and doesn’t address scripts not encapsulated by modules. A function for tagging definitions with the macro can be put into include, and it could look more or less like this not-a-working example:
function checkmaybestruct(e)
if !(e isa Expr && e.head == :struct)
return e
# missing module check
else
return :(@check_concrete $e)
end
end
include(checkmaybestruct, "customtypes.jl")
Note that this doesn’t address module expressions, so the per-module macro would be helpful, gets a bit recursive. The per-module macro finding include calls in top-level control flow would be tougher, but maybe that’s not necessary for training wheels.
That looks like a check after MyModule was already evaluated, so wouldn’t a function be more straightforward check_concrete_structs_in(MyModule)? That would neatly get around dealing with unevaluated includes, but how reliable is it to find every type manually defined in that module, as opposed to those imported or referenced from other modules or automatically defined for closures? The latter falls under names(mod; all=true) and parentmodule:
julia> module A
function foo(x)
()->x
end
end
Main.A
julia> typeof(A.foo(1))
Main.A.var"#1#2"{Int64}
julia> Symbol("#1#2") in names(A;all=true)
true
julia> parentmodule(typeof(A.foo(1)))
Main.A
wouldn’t a function be more straightforward check_concrete_structs_in(MyModule) ?
That’s how I implemented it, but it just took a lot of effort to get it working right and I’m still not 100% sure it’s always correct. Having access to the AST of the struct definition would have been pretty beneficial.
Worth mentioning that on the other hand, some evaluation is useful for figuring out when a type is concrete (isconcretetype), so I don’t expect a macro working on a module to figure everything out itself, rather transform to code that does. I also imagine that checking many type definitions in a module or file should emit warnings or append to a log instead of error at the first type.
Rather the second one. If you construct a GoodType2 with, say, a x::Vector{AbstractFloat}, this is not a problem in the struct definition, and it can be fixed without redefining GoodType2 itself. On the other hand, if the type annotations or parameters are insufficient in struct GoodType2, there is nothing you can do but redefine it, and that is impossible for people who can’t modify the source code.
That would be very handy, but as @jonniedie said, having access to the AST (i.e. doing this when the struct definition is parsed) makes things much easier. You would think that all it takes is checking fieldtypes for concreteness, but that’s not sufficient in the parametric case:
In this example you would have to check fieldtpes(A{Float64}) instead of fieldtypes(A). But in the general case, structs may have many type parameters, and it’s not obvious wihch combination should be checked for field concreteness. It’s not even obvious how to find a combination that is allowed by subtyping constraints (in the general case I suspect it’s NP-hard).
Yeah I still need to look at that to figure out how to handle the includes. It may not be perfect at first.