Parsed anonymous functions, type inference, and Base.invokelatest

#1

I have a situation where at some point during runtime a string is parsed and eval’d to an anonymous function, e.g.

fa = eval(parse("x -> mean(x)"))

Using Julia v0.5, I could then call fa([1,2,3]) later in the same routine without a problem. In v0.6 however, such a call results in a MethodError mentioning the “world age”. Reading the docs suggests using:

Base.invokelatest(fa, [1,2,3])

which works, but according to the docstring for invokelatest, the type of the output from the call to fa cannot be inferred by the compiler. Maybe this was also the case in v0.5 and I was just never aware of it.

So, my question: is it always the case that fa will exhibit type instability if called later in the same routine? Can we do anything to help in this situation, e.g. provide type information in the input string, perhaps "x::Vector{Float64} -> mean(x)::Float64"?
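
For readers who want to reproduce the behaviour described above, here is a minimal sketch. It assumes Julia 1.x, where `Meta.parse` replaces the v0.6 `parse` and `mean` lives in the `Statistics` standard library; the function name `call_parsed` is made up for illustration.

```julia
using Statistics  # `mean` is in Statistics on Julia 1.x (it was in Base on v0.6)

# Hypothetical helper: parse and eval a string, then call the result
# from within the same function, both directly and via invokelatest.
function call_parsed(src, data)
    f = eval(Meta.parse(src))   # `Meta.parse` is the 1.x name for v0.6's `parse`
    # The method behind `f` was defined after this function started running,
    # so a direct call is expected to fail with a world-age MethodError.
    direct = try
        f(data)
    catch err
        err
    end
    # invokelatest looks the method up in the latest world, so it succeeds.
    latest = Base.invokelatest(f, data)
    return direct, latest
end

direct, latest = call_parsed("x -> mean(x)", [1, 2, 3])
```

Here `direct` should hold the caught `MethodError` and `latest` the mean of the input. Calling `fa` directly at the REPL (top level) does not hit this problem, because each top-level statement runs in the newest world.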


#2

This is usually a bad thing to do.

Correct.

No, neither before nor now. It is the caller that is type-unstable; the function itself is fine. Nothing you do to the function itself will make any difference at all. You can certainly add a type assertion in the caller though, e.g. Base.invokelatest(fa, [1,2,3])::Float64
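
A rough way to see the effect of the call-site assertion is to ask inference what it concludes for a caller with and without it. This is only a sketch under assumptions: it targets Julia 1.x (`Meta.parse`, `Statistics`), the names `caller_plain` and `caller_asserted` are invented for illustration, and `Base.return_types` is a reflection utility whose output can vary between Julia versions.

```julia
using Statistics

fa = eval(Meta.parse("x -> mean(x)"))

# Two hypothetical callers: one returns the raw result of invokelatest,
# the other asserts the result type at the call site.
caller_plain(f, x) = Base.invokelatest(f, x)
caller_asserted(f, x) = Base.invokelatest(f, x)::Float64

argtypes = (typeof(fa), Vector{Int})
plain_type = Base.return_types(caller_plain, argtypes)[1]
asserted_type = Base.return_types(caller_asserted, argtypes)[1]

result = caller_asserted(fa, [1, 2, 3])
```

With the assertion, inference can conclude `Float64` for the caller; without it, the result of `invokelatest` is opaque to the compiler, so `plain_type` should come out as a wide type such as `Any`. The assertion is checked at run time: if `fa` ever returns something other than a `Float64`, the call throws a `TypeError` instead of silently propagating the wrong type.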


#3

Yeah, I know :) The code in question is, and will only ever be, used by me. I’d strenuously avoid doing anything like this in a package for other users.

Understood. This makes it very clear what is happening internally. Thank you very much for responding.

Cheers,

Colin


#4

Please elaborate why it’s a bad thing to do. Since in some situations it might be appropriate and you think it’s generally a bad thing to do, it would be more useful if you could back this statement up with an argument.

Without any argument to it, it’s only an opinion and not very scientific.


#5

It’s a bad thing to do because it combines two things that are each bad: calling eval at runtime, and parsing a string to get the code to evaluate. There are countless explanations elsewhere of why these two are bad, so I’m not going to repeat them.


#6

To me, Julia is the ultimate metaprogramming language. What I constantly dream of doing with Julia is thinking up ways to make robust metaprograms that transform their own code and then generate code for scientific computing. If done with enough wisdom, I think it can definitely be pulled off properly and with high-performance results. The Fatou.jl package I made is a demonstration of this: the user essentially inputs a string, which is then used to construct the ideal function based on the optional keyword arguments and other options. The resulting code is extremely fast and robust. At the moment it relies on SymPy for the symbolic computation, but my Reduce.jl metaprogramming package will be tested in its place soon. So far it has always been possible to find a way to solve its issues, including with invokelatest.

So I’d say, if used with wisdom, this technique of programming can be applied. But it’s something that requires careful experimentation and code design to get right.


#7

This is one of the primary objections, since you can never be sure a user won’t input a really, really stupid string, e.g. a system call to rm -r / would really ruin your day…

But yes, parsing strings to code at run-time can make life much easier in certain frameworks. See also a relevant, highly upvoted Stack Overflow question (for JavaScript, but the same principle applies).


#8

My program does not make user specified system calls, so not applicable in that context. It only generates code for mathematical functions and evaluates those for use in an automated construction of a data object.

If you enter any non-mathematical input into it, you are definitely using it wrong, and it will likely cause an error before eval is ever reached.


#9

The point is that if you just parse it and execute it, there’ll almost certainly be ways to work around it and crash your system or get very strange results.


#10

Then it’s trivial to write a recursive function that checks for invalid calls, using Julia’s AST support:

"""
    evalcheck(e)

Recursively checks `Expr` objects for dangerous calls
"""
function evalcheck(e)
    if e isa Expr
        # Reject a literal call to `run`, e.g. run(`cmd`)
        if e.head == :call && e.args[1] == :run
            error("Invalid `eval` with system call")
        end
        # Checks for other invalid calls would go here.
        # Recurse into every sub-expression of the AST.
        for arg in e.args
            evalcheck(arg)
        end
    end
    return e
end

Then you can do

julia> "y = run(`hey`)" |> parse |> evalcheck |> eval
ERROR: Invalid `eval` with system call
...

Is this something worth adding to base? If there are other types of calls you want to error out on, those only need to be added to the conditional in evalcheck.

Made this a pull request, if it’s worth looking into: https://github.com/JuliaLang/julia/pull/24209


#11

Try

evalcheck(:(eval(parse("burn everything"))))

Sanitizing expressions in a general language is a hard problem. I would just avoid it and run everything in a sandbox.


#12

Not unless you also control the evaluation. The names are nothing special, and this check can easily be worked around with assignments. It’s also much better to have a whitelist instead of a blacklist.


#13

One way to work around this is to build the check into the eval function itself, so that if it is called recursively in the way you propose, it automatically catches this. Then all you need is a toggle switch to activate it on your outermost eval statement, so that all the nested evals have the check.


#14

@Tamas_Papp, one way to work around it is by redefining the eval itself:

evalsafety = false

function eval(m::Module, @nospecialize(e))
    evalsafety && evalcheck(e)
    ccall(:jl_toplevel_eval_in, Any, (Any, Any), m, e)
end

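function evalsafe(m::Module, @nospecialize(e))
    # Without `global`, the assignments below would create a new local
    # variable and the flag read by `eval` would never change.
    global evalsafety
    evalsafety = true
    try
        return eval(m, e)
    finally
        evalsafety = false  # reset the flag even if eval throws
    end
end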

This should prevent a situation like the one you proposed in your example, with the recursive eval and parse.


#15

No, please don’t. It still doesn’t catch anything useful (just make some assignment like f = run).


#16

Did you look at my original example?

julia> "y = run(`hey`)" |> parse |> evalcheck |> eval
ERROR: Invalid `eval` with system call
...

It catches that, because it recursively checks all sub-expressions in the AST.


#17

evalcheck(:((f = run; f(`hey`))))
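
To see why the expression above gets through, the following sketch runs a name-based checker over three quoted expressions without evaluating any of them. It is written for Julia 1.x and carries its own copy of evalcheck (with the placeholder branch dropped so that it runs); the helper `passes` and the variable names are made up for illustration.

```julia
# Self-contained copy of the thread's checker, without the placeholder branch.
function evalcheck(e)
    if e isa Expr
        if e.head == :call && e.args[1] == :run
            error("Invalid `eval` with system call")
        end
        foreach(evalcheck, e.args)
    end
    return e
end

# Hypothetical helper: true when the checker raises no error.
# The expression is only inspected, never evaluated.
function passes(ex)
    try
        evalcheck(ex)
        return true
    catch
        return false
    end
end

direct    = :(run(`hey`))            # literal call to `run`
aliased   = :((f = run; f(`hey`)))   # `run` reached through an alias
qualified = :(Base.run(`hey`))       # callee is an Expr (Base.run), not the Symbol :run

results = (passes(direct), passes(aliased), passes(qualified))
```

Only the first expression is rejected. In the other two, the thing being called is not literally the symbol `:run`, so a check that compares names in the AST has nothing to match against.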


#18

Very well then, it’s a cat and mouse game that never ends.


#19

@yuyichao although it is technically possible to solve that too, by preventing any re-assignment of the run function.


#20

That’s the whole point. It’s practically impossible to come up with a blacklist that works, especially when you don’t control the evaluation context and rules. A whitelist that only allows the things you are interested in is almost certainly possible, but making it both useful (i.e. not erroring on common, valid input) and safe is almost equivalent to designing another domain-specific language. The Julia parser can certainly help with that, but it’s by no means trivial.
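
As a rough illustration of the whitelist idea, the sketch below accepts only numeric literals, a single variable `x`, and calls to a fixed set of function names, and rejects every other kind of syntax. It assumes Julia 1.x; the names `ALLOWED_CALLS`, `ALLOWED_VARS`, `is_allowed`, and `safe_parse` are invented for this example, and the allowed set is arbitrary. It is in effect a tiny domain-specific language and is not a complete security boundary: for instance, it does nothing about inputs that are merely very expensive to evaluate.

```julia
# Hypothetical whitelist of callable names and free variables.
const ALLOWED_CALLS = Set([:+, :-, :*, :/, :^, :sin, :cos, :exp, :log, :sqrt])
const ALLOWED_VARS = Set([:x])

is_allowed(e::Number) = true                 # numeric literals
is_allowed(e::Symbol) = e in ALLOWED_VARS    # only whitelisted variables
is_allowed(e) = false                        # strings, QuoteNodes, anything else
function is_allowed(e::Expr)
    e.head == :call || return false          # no assignment, blocks, macros, ...
    f = e.args[1]
    (f isa Symbol && f in ALLOWED_CALLS) || return false
    return all(is_allowed, e.args[2:end])    # every argument must also be allowed
end

# Parse a string and return the expression only if it is fully whitelisted.
function safe_parse(src::AbstractString)
    ex = Meta.parse(src)
    is_allowed(ex) || error("expression uses constructs outside the whitelist")
    return ex
end

# Build x -> <expression> from a checked string and call it.
f = eval(Expr(:->, :x, safe_parse("sin(x)^2 + 3x")))
value = Base.invokelatest(f, 0.0)
```

The bypasses raised earlier in the thread (a call to `run`, aliasing through `f = run`, a nested `eval(parse(...))`) are all rejected here, because none of them consists solely of whitelisted calls on whitelisted operands. The cost is that plenty of harmless input is rejected too, which is the usefulness-versus-safety trade-off described above.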