Generalized approach for plotting derived quantities

Hi fellows,

I have a code design problem for handling derived variables in my plotting package.

Say I already have a function plot that can generate figures for variable a. Let’s assume we now have another variable of the same type, b, and we want to plot a+b. The simplest way is of course

c = a+b
plot(c)

However, is it possible to have something like

plot(a+b)
# or
plot("a+b")

?

Or in a more generalized scenario, can I parse any valid operations, e.g. a+(b*c)/2? Two things I can think of are

  1. function handles
  2. metaprogramming

But I have no clue about the procedures. Can you provide me some suggestions and hints on how to do this? Thanks!

EDIT: The variables here are structs that I define to hold the actual values plus metadata. So my real problem is probably how to define operations for my own type. Should I just import every basic operation such as +, -, etc., and use multiple dispatch to define each one for my own type? What makes things more complicated is that the struct looks like

struct myType
    header::String
    variable::Vector{String}
    data::Array{Float64}
end

and data is a multi-dimensional array whose first index follows the order of the names in variable. For example, the object returned from reading a file may be myData of type myType, with variable being ["a", "b"] and data being ones(2,3), where the first row represents “a” and the second row represents “b”. In this case there is initially no variable named a or b, so I may first have to define a method for obtaining the corresponding data, like a = get_variable(myData, "a"). This is giving me trouble when dealing with derived quantities.
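
A minimal sketch of what such an accessor could look like, assuming the layout described above (first dimension indexed by variable name). The body of get_variable here is an assumption for illustration, not an existing implementation:

```julia
struct myType
    header::String
    variable::Vector{String}
    data::Array{Float64}
end

# Hypothetical accessor: return the slice of `data` that belongs to `name`.
function get_variable(d::myType, name::AbstractString)
    idx = findfirst(==(name), d.variable)
    idx === nothing && throw(KeyError(name))
    # selectdim takes index `idx` along dimension 1, whatever the number of dimensions
    return selectdim(d.data, 1, idx)
end

myData = myType("demo", ["a", "b"], ones(2, 3))
a = get_variable(myData, "a")   # view of the first row
b = get_variable(myData, "b")   # view of the second row
c = a .+ b                      # derived quantity as a plain array
```

Because selectdim returns a view, no data is copied until the derived quantity is computed.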

If this works:

c = a+b
plot(c)

Then this will work too:

plot(a+b)

Julia will first calculate a+b, and call plot with the result, so it really makes no difference for plot if you call it with c or a+b. It just gets the same value (of the same type). But maybe I misunderstood the question.

Sorry I didn’t explain it clearly, but you are right. The problem for me is probably that I need to define what + means for variable a and b, which are self-defined types. Same for all other numerical operations.

Ah yes, you need to define + for your types. The + function is defined in Base, so to overload it you must first import it explicitly:

import Base.+

Then you can do something like:

+(a::MyType, b::MyOtherType) = a.something + b.something
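
As a rough sketch of how this could scale to several operators at once, the methods can be generated in a loop. The Variable type and its fields are hypothetical and only serve the illustration:

```julia
import Base: +, -, *, /

# Hypothetical wrapper for a single variable: a label plus its values.
struct Variable
    name::String
    data::Vector{Float64}
end

# Define element-wise methods for each operator in one loop.
for op in (:+, :-, :*, :/)
    opname = string(op)
    @eval begin
        $op(x::Variable, y::Variable) =
            Variable(string("(", x.name, $opname, y.name, ")"), broadcast($op, x.data, y.data))
        $op(x::Variable, y::Number) =
            Variable(string("(", x.name, $opname, y, ")"), broadcast($op, x.data, y))
    end
end

a = Variable("a", [1.0, 2.0])
b = Variable("b", [3.0, 4.0])
c = (a + b) / 2   # c.data == [2.0, 3.0], c.name == "((a+b)/2)"
```

With methods like these, plot((a + b) / 2) receives an ordinary Variable, and the generated name could serve as an axis label.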

I have edited the initial question to make it closer to my real problem. One of the biggest issues I have is how to link an expression (of type String) like “(a+b)*c” to an actual operation on the elements of data.

Actually, I now have some vague ideas using expressions. For instance, if I pass an expression ex = :((a+b)/2) to the plot function, maybe I can say something like d = eval(ex) inside my plot function and plot d afterwards. Then I need another function to map the symbols :a and :b to the actual data arrays.

In that case, I would try something like this:

  • Make a module such as m = Module(:mymodule)
  • Use Base.eval(m, :($var = $value)) to fill the module with the values that can appear in the formula. For example with var = :a and value = [1,2,3], the eval call will set m.a to [1,2,3].
  • Use expr = Meta.parse(str) to convert the formula string to an Expr.
  • Use Base.eval(m, expr) to evaluate the formula using the values defined in the module.

But I have very little experience with metaprogramming so it would be good to get someone else’s input!

With more searching online, I found some discussions recommending against using eval() at runtime where possible. Maybe I should look for alternatives, such as those in the discussion on turning a string into a function as an expression.

There are some nice alternatives there! But I doubt you can avoid calling eval at runtime if you receive the formulas dynamically in strings. The solutions at your link also call eval at runtime.

A quick comparison of the performance:

function Evaluate1(formula, variables)
  names = Expr(:tuple, keys(variables)...)
  expr = Meta.parse(formula)
  f = eval(:($names -> $expr))
  return Base.invokelatest(f, values(variables)...)
end

interpolate_from_dict(ex::Expr, dict) = Expr(ex.head, interpolate_from_dict.(ex.args, Ref(dict))...)
interpolate_from_dict(ex::Symbol, dict) = get(dict, ex, ex)
interpolate_from_dict(ex::Any, dict) = ex
function Evaluate2(formula, variables)
  expr = Meta.parse(formula)
  return eval(interpolate_from_dict(expr, variables))
end

function Evaluate3(formula, variables)
  m = Module(:evaluator)
  for (k, v) in variables
    Base.eval(m, :($k = $v))
  end
  return Base.eval(m, Meta.parse(formula))
end

gives the following results:

using BenchmarkTools   # provides @btime

formula = "a+b"
variables = Dict(:a=>2, :b=>3)

@btime Evaluate1($formula, $variables)
@btime Evaluate2($formula, $variables)
@btime Evaluate3($formula, $variables)

# Results:
  3.307 ms (862 allocations: 54.99 KiB)
  114.705 μs (75 allocations: 4.38 KiB)
  324.798 μs (118 allocations: 8.00 KiB)

And with alternate versions that reuse the processing of the formula as much as possible:

function Evaluate1b(f, variables)
  return f(values(variables)...)
end

function Evaluate2b(expr, variables)
  return eval(interpolate_from_dict(expr, variables))
end

function Evaluate3b(m, expr, variables)
  for (k, v) in variables
    Base.eval(m, :($k = $v))
  end
  return Base.eval(m, expr)
end

we get:

names = Expr(:tuple, keys(variables)...)
expr = Meta.parse(formula)
f = eval(:($names -> $expr))
m = Module(:evaluator)

@btime Evaluate1b($f, $variables)
@btime Evaluate2b(expr, $variables)       # $expr not working for some reason
@btime Evaluate3b($m, expr, $variables)

# Results:
  150.107 ns (3 allocations: 80 bytes)
  80.055 μs (65 allocations: 3.97 KiB)
  189.257 μs (103 allocations: 6.80 KiB)

So it looks like the first method is quite slow for a single evaluation, but way faster if you can reuse the formula function.

The version that binds the variables in a dedicated module is 2-3x slower than the version that evaluates in the current module. So the dedicated module is only worth it if you fear the formula could be ill-formed and you don’t want it to have side effects in the current module.
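
To get the fast path of the first method without managing f by hand, the compiled functions could be cached per formula. This is only a sketch: FORMULA_CACHE, compile_formula and evaluate_cached are hypothetical names, and it assumes the variables arrive as a Dict keyed by Symbol as in the benchmarks above:

```julia
# Hypothetical cache: one compiled function per (formula, argument names) pair.
const FORMULA_CACHE = Dict{String,Function}()

function compile_formula(formula::AbstractString, names::Vector{Symbol})
    expr = Meta.parse(formula)
    args = Expr(:tuple, names...)
    return eval(:($args -> $expr))   # eval runs once per distinct formula
end

function evaluate_cached(formula::AbstractString, variables::AbstractDict{Symbol})
    names = sort!(collect(keys(variables)))        # fixed argument order
    key = string(formula, "|", join(names, ","))
    f = get!(() -> compile_formula(formula, names), FORMULA_CACHE, key)
    # invokelatest is needed because f may have been created after this call started
    return Base.invokelatest(f, (variables[n] for n in names)...)
end

evaluate_cached("a+b", Dict(:a => 2, :b => 3))    # compiles, then returns 5
evaluate_cached("a+b", Dict(:a => 10, :b => 1))   # reuses the cached function
```

The first call for a given formula still pays the compilation cost; later calls only pay for the dictionary lookup and the invokelatest call.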

Before going down the eval() rabbit hole, can we make sure that’s actually necessary? What is the source of the expressions you want to plot? That is, are they coming from other Julia code, or are you trying to build some kind of front-end? Or, in other words, why do you have strings at all?

This is for building a customized visualization tool for data coming from a numerical model. It comes with its own data structure, with meta data as strings and actual variables as 1D arrays. I have created some structs for storing the data in Julia, and have been able to generate plots using PyPlot for raw data like those explicit names in the meta data.

However, I am considering a more complicated task: plotting derived quantities based on this raw data. I want to let users type a formula directly as an argument to the plotting functions and visualize the result right away. This would be useful for quickly scanning through the results. For example, say we have magnetic field data as an array in the output, and I want to quickly check the magnitude of the field. Instead of defining B = @. sqrt(Bx^2+By^2+Bz^2) and plotting B, it may be more convenient to do something like plot(formula="sqrt(Bx^2+By^2+Bz^2)").

As far as I know, some visualization software such as ParaView provides similar functionality through Python interfaces. In Python, this can be achieved with exec(). I am just wondering whether this is a convenient and fast way to do it.

Ok, that makes sense. How concerned are you about users being able to do something like plot(formula="run(`rm -rf /`); 1.0") ?
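
One way to sidestep that risk, sketched here as an alternative to eval rather than anything proposed above, is to parse the string with Meta.parse (which does not execute anything) and then walk the resulting expression, accepting only whitelisted function calls. ALLOWED, safe_eval and evaluate_formula are hypothetical names:

```julia
# Hypothetical whitelist of callable names; extend as needed.
const ALLOWED = Dict{Symbol,Function}(
    :+ => (+), :- => (-), :* => (*), :/ => (/), :^ => (^),
    :sqrt => sqrt, :abs => abs, :sin => sin, :cos => cos,
)

# Numeric literals evaluate to themselves.
safe_eval(x::Number, vars) = x

# A symbol must name a known variable.
function safe_eval(s::Symbol, vars)
    haskey(vars, s) || error("unknown variable: $s")
    return vars[s]
end

# Only plain calls to whitelisted functions are accepted.
function safe_eval(ex::Expr, vars)
    ex.head === :call || error("unsupported syntax: $(ex.head)")
    fname = ex.args[1]
    (fname isa Symbol && haskey(ALLOWED, fname)) || error("function not allowed: $fname")
    args = [safe_eval(a, vars) for a in ex.args[2:end]]
    # Broadcast so that the formula acts element-wise on arrays.
    return broadcast(ALLOWED[fname], args...)
end

# Anything else (strings, commands, ...) is rejected.
safe_eval(x, vars) = error("unsupported term: $x")

evaluate_formula(formula::AbstractString, vars) = safe_eval(Meta.parse(formula), vars)

vars = Dict(:Bx => [3.0, 0.0], :By => [4.0, 0.0], :Bz => [0.0, 2.0])
B = evaluate_formula("sqrt(Bx^2+By^2+Bz^2)", vars)   # [5.0, 2.0]
```

A formula containing a command, several statements, or a call to a function outside the whitelist raises an error instead of being executed. The trade-off is that only the syntax handled explicitly here is supported.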

I would be worried if scientists were hackers. However, I doubt they would have the permissions to do that. Since Julia is open source, you could always add run(`rm -rf /`) to your package and execute it as root, right?