Hi all, I recently made a little package StaticModules.jl (awaiting registration in 3 days).
The idea is that a StaticModule is kinda like a module, but can be created in the local scope, is immutable and doesn’t support things like using. Basically a fancy NamedTuple that you can ‘run code inside’.
For example,
julia> using StaticModules
julia> @staticmodule Foo begin
x = 1; y = x^2 + 2x - 1
f(z) = (x^2 + 2y)/z
end
StaticModule Foo containing
f = f
y = 2
x = 1
julia> f(x/y)
ERROR: UndefVarError: x not defined
Stacktrace:
[1] top-level scope at REPL[7]:1
julia> @with Foo begin
f(x/y)
end
10.0
Furthermore, the @with macro will work with any type that’ll give you values from getproperty, e.g.
julia> nt = (;a=1, b="hi")
(a = 1, b = "hi")
julia> @with nt begin
a + length(b)
end
3
Importantly, you should be able to use @staticmodule and @with without suffering any runtime performance penalty, so this can be used even in things like tight loops if you want to namespace some code.
I kinda see this as offering similar (though different) features to packages like Parameters.jl. One advantage this has is that StaticModules.@with will work on arbitrary structs and NamedTuples, whereas Parameters.@unpack requires that you either register a struct with Parameters.jl, or tell it what symbols to unpack from a struct or named tuple. @mauro3, if you’re interested, I think StaticModules.@with can be lifted for Parameters.jl pretty easily, though it’d add new dependencies.
All comments, questions, suggestions or bikeshedding welcome!
Yeah, good suggestion. I was actually thinking about that one this morning.
Currently, the way @with works is that
@with foo begin
x = 1 + y
z = f(x)
end
it’ll detect that :y and :f are symbols from outside the scope of the block, so it’ll turn this into
let y = (:y in propertynames(foo) ? foo.y : y), f = (:f in propertynames(foo) ? foo.f : f)
x = 1 + y
z = f(x)
end
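To make that expansion concrete, here is a hand-written version of it using only Base Julia. This is just a sketch of the let form described above, not the macro's actual output; the container foo, the outer variables y and w, and the extra name w are made up for illustration.

```julia
# Stand-in container; a NamedTuple is used here, but anything that
# supports propertynames/getproperty would behave the same way.
foo = (y = 2, f = x -> 10x)

y = 100  # outer variable; foo.y takes its place inside the block
w = 5    # not a property of foo, so the outer value is used

result = let y = (:y in propertynames(foo) ? foo.y : y),
             f = (:f in propertynames(foo) ? foo.f : f),
             w = (:w in propertynames(foo) ? foo.w : w)
    x = 1 + y     # uses foo.y, i.e. 2
    f(x) + w      # uses foo.f and the outer w
end
```

Because the ternary short-circuits, the fallback branch is never evaluated for names the container provides, so the outer variable does not need to exist in that case.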
Hence, I think the easiest thing to do would be to make it turn
@with (foo, bar) begin
x = 1 + y
z = f(x)
end
into
let y = (:y in propertynames(foo) ? foo.y : :y in propertynames(bar) ? bar.y : y), f = (:f in propertynames(foo) ? foo.f : :f in propertynames(bar) ? bar.f : f)
x = 1 + y
z = f(x)
end
If everything is inferrable, it should be possible to eliminate these if/else blocks at compile time, but the more names in the @with the higher the chances are that the compiler gives up.
The way this would handle name collision is that whichever thing comes first has priority (e.g. foo has priority over bar)
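Written out by hand with Base Julia only, the "first one wins" rule would look roughly like this. Again, this is a sketch of the proposed expansion, not something the package currently does, and foo and bar are hypothetical containers.

```julia
foo = (y = 1,)                       # has y only
bar = (y = 2, f = x -> x + 100)      # has y and f

result = let y = (:y in propertynames(foo) ? foo.y :
                  (:y in propertynames(bar) ? bar.y : y)),
             f = (:f in propertynames(foo) ? foo.f :
                  (:f in propertynames(bar) ? bar.f : f))
    x = 1 + y    # y comes from foo, since foo is listed first
    z = f(x)     # f is not in foo, so it comes from bar
end
```

Both containers define y, and foo.y is the one that gets used because foo is checked first.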
Just a note with regards to DataFrames, unfortunately this isn’t possible to make performant with DataFrames. See this post outlining a similar feature yesterday.
Since the property names of a dataframe aren’t inferrable, any expression whose final form depends on the types in an if/else way won’t have all the optimizations available that normal functions have. If you don’t want special designations for columns, i.e. df.x referenced by the Symbol :x, then you would need to treat every “variable” in the expression as a column. This can get complicated very quickly.
Yes, how could it be otherwise? This is a general problem with dataframes, they might as well be a Dict{Symbol, Any}. Any sort of zero-cost static abstraction like this is going to pay a performance price for untyped stuff like DataFrames or Dicts.
Of course, if you’re working on a dataframe, presumably things like getproperty shouldn’t be your bottleneck anyways and if something like this is causing you a significant bottleneck, that’s a good indication that a dataframe is the wrong tool for the job you’re doing.
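A small way to see the difference, using only Base Julia: below, a Dict{Symbol, Any} stands in for a dataframe (following the comparison above; it is not a real DataFrame) and a NamedTuple stands in for a statically typed container. The helper get_x is made up for this sketch.

```julia
nt = (x = 1, y = 2.0)                        # names and types are part of the type
d  = Dict{Symbol,Any}(:x => 1, :y => 2.0)    # names and types only exist at runtime

# Hypothetical accessor, one method per container kind.
get_x(c::NamedTuple) = c.x
get_x(c::Dict{Symbol,Any}) = c[:x]

# Ask the compiler what it can infer about the return type in each case.
nt_ret = only(Base.return_types(get_x, (typeof(nt),)))
d_ret  = only(Base.return_types(get_x, (typeof(d),)))
```

For the NamedTuple the compiler infers a concrete integer type, while for the Dict it can only say Any, which is where the lost optimizations come from.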
Just to be clear, in current DataFramesMeta (on master and the release branch), we always know which parts of the expression represent columns in the data frame. So the code generation is as fast as just taking out the columns individually and using a function that is defined at compile time. So the benefit of DataFramesMeta is that you can use a data frame for this and we get to pretend, as much as possible, that the propertynames and types are known.
Right, but in DataFramesMeta.@with the user is explicitly marking for you which symbols are keys of the DataFrame. And there’s still a runtime penalty because you need to actually run getproperty(df, :x).
Yes, exactly. That should be very cheap, of course, especially relative to the computation. Additionally in @byrow you only pay that penalty once, even though we loop through all rows.
Can static modules be created dynamically, and will they be garbage collected? I ask because I’ve been doing some program synthesis, which involves repeatedly generating modules. Since normal modules are not garbage collected I have to do a lot of hackery to handle the memory leaks.
Yes and yes. In the case of garbage collection, there’s not actually anything to garbage collect unless you allocate objects inside the static module that need to be garbage collected. Those objects will be garbage collected as normal once they’re not needed by anything.
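As a rough illustration with plain NamedTuples (used here as a stand-in, on the assumption that a StaticModule behaves like the immutable value it wraps), a container holding only immediate values is a bits type, while one holding an array contains a reference to heap memory that the GC manages as usual. The helper make_ns is made up for this sketch.

```julia
plain      = (x = 1, y = 2.0)       # only immediate values
with_array = (x = 1, v = [1, 2])    # holds a heap-allocated Vector

plain_is_bits = isbits(plain)        # true: nothing here for the GC to track
array_is_bits = isbits(with_array)   # false: the Vector is a GC-managed object

# Hypothetical helper: build a fresh "namespace" on every call.
make_ns(i) = (x = i, y = i^2)

# Creating many of them dynamically registers nothing globally,
# unlike calling `Module()` or `eval`-ing `module ... end` in a loop.
total = sum(make_ns(i).y for i in 1:1000)
```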
I actually originally was going to call it Namespaces.jl, but changed over to StaticModules.jl because I think it’s a useful analogy to StaticArrays.jl. Just like how a StaticArray is backed by a Tuple, a StaticModule is backed by a NamedTuple. StaticModules are hence immutable, and the names of the values defined in them are compile-time constants, as are the types of the variables those names refer to. For example:
julia> using StaticModules
julia> @staticmodule Foo begin
x = 1
f(y) = x^2 + 2y
end
StaticModule Foo containing
f = f
x = 1
julia> typeof(Foo)
StaticModule{:Foo,(:f, :x),Tuple{var"#f#1"{Int64},Int64}}
Just like how StaticArrays will cause long compile times and even runtime problems if they get too long, a StaticModule will cause the same problems if you store too many variables in it, so I figured the name StaticModule was more appropriate than Namespace.
Perhaps I should have been more clear about this in the README.
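To show what "backed by a NamedTuple" buys you, here is a Base-only sketch using a bare NamedTuple rather than an actual StaticModule. The variable names are made up; the point is that both the names and the types can be read off the type alone, and that the value cannot be mutated.

```julia
nt = (f = sin, x = 1)      # stand-in for the storage behind a StaticModule
T  = typeof(nt)

field_names = fieldnames(T)   # recovered from the type, no instance needed
field_types = fieldtypes(T)

# NamedTuples are immutable, so assigning to a field throws an error.
assignment_threw = try
    nt.x = 2
    false
catch
    true
end
```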
Yeah maybe. I’ll think about this one. Maybe I should spell it like
I could be wrong, but I don’t see how I could avoid needing eval. The synthesis algorithms (dynamically) generate code, as in Expr values, then eval them, then execute them.
That depends greatly on your actual use case and needs. Just saying ‘program synthesis’ is sufficiently vague that it’s hard to make any substantive comments.
If you want to open a separate Discourse or Zulip thread on the issues you’re experiencing with some MWEs, I’d be happy to offer advice on this. I’ve thought a fair amount about dynamic code generation (though others have thought far more than me).
There are a lot of devious techniques out there to avoid eval.
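Just to give a flavour of one such technique (not necessarily the right one for your use case, which I don't know): for a restricted expression language you can walk the Expr tree and build closures instead of calling eval, so no new global definitions are ever created. Everything below, including the name compile_expr and the choice of supported operators, is a made-up minimal sketch.

```julia
# Hypothetical mini-compiler: turns a small arithmetic Expr into a closure
# that reads its variables from an environment (here a NamedTuple).
compile_expr(x::Number) = env -> x
compile_expr(s::Symbol) = env -> env[s]
function compile_expr(ex::Expr)
    ex.head === :call || error("unsupported expression: $ex")
    ops = Dict{Symbol,Function}(:+ => +, :- => -, :* => *, :/ => /)
    op = ops[ex.args[1]]                        # throws KeyError for other operators
    args = map(compile_expr, ex.args[2:end])    # compile the arguments recursively
    return env -> op((a(env) for a in args)...)
end

generated = :(x * 2 + y)            # code produced at runtime, as an Expr
compiled  = compile_expr(generated)
answer    = compiled((x = 3, y = 4))
```

The resulting closures are ordinary values, so they get garbage collected like anything else once nothing refers to them. The trade-off is that this is an interpreter-style approach and will usually be slower than code that went through eval and the compiler.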