Hi! I’m working on a machine learning project that runs in multiple steps. Each step may take the output of some of the previous steps as input. I’m doing everything in a functional, immutable style, using keyword arguments to keep track of how the result of each step is plugged into subsequent steps. I’m wondering whether there is a way to avoid spelling out each keyword argument and instead capture the local variables that have the same name as the keyword argument. Is there maybe a macro package for that?
Something like this:
load_data() = ...
clean_data(; data) = ...  # do something with data
calculate_property_xyz(; clean_data) = ...  # do some calculations
which would be called like this:
let
local data = load_data()
local clean_data = clean_data(data=data)
local property_xyz = calculate_property_xyz(clean_data=clean_data)
property_xyz
end
This results in code that is too repetitive, so I’d like a way to capture the parameter values automatically, let’s say like this:
let
local data = load_data()
local clean_data = @capture clean_data()
local property_xyz = @capture calculate_property_xyz()
property_xyz
end
where @capture is some hypothetical macro that assigns keyword arguments from local variables with the same name.
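One way to approximate this without language support, as a rough sketch only: the hypothetical `@kw` macro below (not taken from any package) rewrites `f(a, b)` into `f(a = a, b = b)`. The names still have to be listed once, because a macro works on the syntax of the call and does not readily know which keywords `f` accepts. The step bodies are made-up stand-ins, and the local is called `cleaned` rather than `clean_data` so that it does not shadow the function of that name.

```julia
# Hypothetical helper: rewrites `@kw f(a, b)` into `f(; a = a, b = b)`.
macro kw(call)
    Meta.isexpr(call, :call) || error("@kw expects a function call")
    f, names = call.args[1], call.args[2:end]
    all(n -> n isa Symbol, names) || error("@kw expects plain variable names")
    # Build one `name = name` keyword pair per listed variable.
    kwargs = [Expr(:kw, n, n) for n in names]
    # Escape the whole call so the names resolve in the caller's scope.
    return esc(Expr(:call, f, Expr(:parameters, kwargs...)))
end

# Made-up stand-ins for the pipeline steps.
load_data() = [3.0, 1.0, 2.0, NaN]
clean_data(; data) = filter(!isnan, data)
calculate_property_xyz(; cleaned) = sum(cleaned)

property_xyz = let
    data = load_data()
    cleaned = @kw clean_data(data)        # same as clean_data(data = data)
    @kw calculate_property_xyz(cleaned)   # same as calculate_property_xyz(cleaned = cleaned)
end
```

Each name is written once per call instead of twice, and a misspelled or not-yet-defined variable still fails with an `UndefVarError` rather than silently passing the wrong value.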
This could be done with positional instead of keyword arguments, which would reduce the verbosity of the original code but not eliminate it, and would also make it possible to pass the arguments in the wrong order.
Is there any nice macro package that can do something like this? Or perhaps I’m looking at the problem from the wrong perspective: how would I best code a multi-step process, where outputs of previous steps need to be correctly plugged in as inputs to later steps, avoiding verbosity or the introduction of a mega-struct that contains everything?
let data = load_data(),
clean_data = clean_data(data),
grid = calculate_grid(clean_data),
property_xyz = calculate_property_xyz(clean_data, grid)
property_xyz
end
while what I would like is something like this:
let data = load_data(),
clean_data = @capture clean_data(), # captures 1 value
grid = @capture calculate_grid(), # captures 1 value
property_xyz = @capture calculate_property_xyz() # captures 2 values
property_xyz
end
where arguments are passed automatically as long as they have the same name. Something like this could reduce verbosity and the possibility of errors without defining a mega-struct or mega-class that contains everything. It would also make sure the steps are called in the right order, since using a value before it is defined would be an error.
Is there any package or language feature I’m missing that could do something like that?
Honestly, I would prefer to write your first example (without @capture), and I would definitely prefer to read code in that style. Your example without @capture is perfectly clear: any Julia user in the world can understand what it’s doing. Your hypothetical example with @capture is completely opaque: there’s no indication of how data moves around and no possible way to understand it without looking up an esoteric macro. Are you sure this is actually a problem you need to solve?
Actually, there is an upcoming language feature in 1.5 that might help. In Julia 1.5, you’ll be able to do:
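For illustration, here is a minimal sketch of the shorthand this presumably refers to: from Julia 1.5 on, `f(; x)` is equivalent to `f(x = x)`, so a local variable can be passed as the keyword argument of the same name without repeating it. The step bodies below are made-up stand-ins, and the local is named `cleaned` rather than `clean_data` so that it does not shadow the function.

```julia
# Requires Julia 1.5 or later. The bodies are placeholders, not the real steps.
load_data() = [3.0, 1.0, 2.0, NaN]
clean_data(; data) = filter(!isnan, data)
calculate_grid(; cleaned) = range(minimum(cleaned), maximum(cleaned); length = 3)
calculate_property_xyz(; cleaned, grid) = sum(cleaned) / length(grid)

property_xyz = let
    data = load_data()
    cleaned = clean_data(; data)              # same as clean_data(data = data)
    grid = calculate_grid(; cleaned)          # same as calculate_grid(cleaned = cleaned)
    calculate_property_xyz(; cleaned, grid)   # passes both locals by name
end
```

The data flow stays visible at each call site, which addresses the readability concern raised above, while each name is typed only once.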
That’s great, there’s a lot of argument passing like this in my code. Funny coincidence that the new feature is coming up soon, that’s really going to make the code cleaner.
I think this new feature is a good solution and something I’ll keep an eye on when it’s released soon. Thanks a lot!