Coming from the R world, I’m wondering how to translate the following task in a more Julian way. I often need to run a function, say my_model(x, a, b, c), for various combinations of the parameters a, b, c, where the output of my_model will typically be a DataFrame with length(x) rows. Here’s an R version of this workflow:
library(purrr)
library(dplyr)
library(tidyr)
my_model <- function(x = seq(0, 10, length = 100), a = 1, b = 1, c = 1, fun = cos) {
  # dummy example here
  data.frame(x = x, y = a*sin(b*x) + c, z = a*fun(b*x) + c)
}
head(my_model())
# x y z
# 1 0.0000000 1.000000 2.000000
# 2 0.1010101 1.100838 1.994903
# 3 0.2020202 1.200649 1.979663
# 4 0.3030303 1.298414 1.954437
# 5 0.4040404 1.393137 1.919480
# 6 0.5050505 1.483852 1.875150
params <- expand.grid(a=c(0.1,0.2,0.3), b = c(1,2,3), c = c(0,0.5))
head(params)
# a b c
# 1 0.1 1 0
# 2 0.2 1 0
# 3 0.3 1 0
# 4 0.1 2 0
# 5 0.2 2 0
# 6 0.3 2 0
all <- pmap_df(params, my_model, fun = tanh, .id = 'id')
str(all)
# 'data.frame': 1800 obs. of 4 variables:
# $ id: chr "1" "1" "1" "1" ...
# $ x: num 0 0.101 0.202 0.303 0.404 ...
# $ y: num 0 0.0101 0.0201 0.0298 0.0393 ...
# $ z: num 0.1 0.111 0.122 0.135 0.15 ...
# join with the 'metadata'
params$id <- as.character(1:nrow(params))
d <- left_join(params, all, by='id')
str(d)
# 'data.frame': 1800 obs. of 7 variables:
# $ a : num 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 ...
# $ b : num 1 1 1 1 1 1 1 1 1 1 ...
# $ c : num 0 0 0 0 0 0 0 0 0 0 ...
# $ id: chr "1" "1" "1" "1" ...
# $ x : num 0 0.101 0.202 0.303 0.404 ...
# $ y : num 0 0.0101 0.0201 0.0298 0.0393 ...
# $ z : num 0.1 0.111 0.122 0.135 0.15 ...
# optional: reshape to long format for visualisation
m <- pivot_longer(d, cols = c('y','z'))
library(ggplot2)
ggplot(m, aes(x, value, colour = a, linetype = factor(c), group = interaction(a, c))) +
  facet_grid(name ~ b, scales = 'free_y', labeller = label_both) +
  geom_line()
I find this workflow very handy and extensible, and because it’s typically for interactive analyses, raw efficiency isn’t much of a concern (a more realistic my_model may be slow for each iteration, so the slight overhead of manipulating the data this way is negligible).
In Julia, I would likely use a comprehension to loop over the combinations of parameters, e.g.
all = [my_model(x, a,b,c) for a=..., b = ..., c = ...]
and then splat the results together and add repeated versions of the parameters a, b, c, but that is less streamlined. Am I missing an equivalent to purrr::pmap_df() and dplyr::left_join()?
I saw some uses of Base.Cartesian, but it doesn’t feel super intuitive to me.
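For concreteness, here is a minimal sketch of the comprehension-based approach I have in mind, assuming DataFrames.jl is available. The keyword-argument signature of my_model and the names results, all_results, d and m are my own choices for illustration, and I am not claiming this is the idiomatic way to do it:

```julia
using DataFrames

# Julia port of the dummy R model above (keyword arguments are an assumption)
function my_model(x; a = 1, b = 1, c = 1, fun = cos)
    DataFrame(x = x, y = a .* sin.(b .* x) .+ c, z = a .* fun.(b .* x) .+ c)
end

x = range(0, 10, length = 100)

# One row per parameter combination, roughly like expand.grid;
# vec() flattens the 3-d comprehension with `a` varying fastest
params = DataFrame(vec([(a = a, b = b, c = c)
                        for a in [0.1, 0.2, 0.3], b in [1, 2, 3], c in [0, 0.5]]))
params.id = collect(1:nrow(params))

# Run the model once per row and tag each result with that row's id
results = map(eachrow(params)) do p
    df = my_model(x; a = p.a, b = p.b, c = p.c, fun = tanh)
    df[!, :id] .= p.id
    df
end

# Stack the pieces, then attach the parameters (roughly like left_join)
all_results = reduce(vcat, results)
d = leftjoin(params, all_results, on = :id)

# Optional: long format, similar in spirit to pivot_longer
m = stack(d, [:y, :z])
```

This works, but the id bookkeeping is manual, which is the part that pmap_df(..., .id = 'id') handles for me in R.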
Many thanks.