Capture the current definition of a method

What are you trying to protect against? The methods you “captured” are useless if users don’t also use invoke_in_world whenever they are not calling your public-facing function. And say they are calling your function: what’s stopping a later package from changing your outermost functions’ methods?

you understand they could just override your whole module right?

use an internal function to do the un-changing part of your function. Don’t dynamically time-travel with invoke, the compiler can’t optimize that.

In the end you can do whatever you want, but know that a package produced with this recipe will be hard to integrate into the ecosystem and likely slow (if performance matters).

hum… thanks for your reply, but I still don’t really get it :sweat_smile:

Don’t dynamically time-travel with invoke, the compiler can’t optimize that.

I get it… written a few compilers in a previous life

use an internal function to do the un-changing part of your function.

it is not unchanging, it is un-changed. Say there is an exported, public API method in Base. Let’s call it Base.foo(…). It is 100 lines long. I want to have, say, a custom REPL experience. The REPL calls Base.foo(…). I can override it to customize. But I only need to change a tiny part of Base.foo(…), say a little case depending on some input value. I think it is still better if my override of Base.foo(…) delegates to the original implementation. (Actually, I don’t see any other solution but to copy/paste the full original code of Base.foo(…).)

if your new foo relies on the old foo, just call the old foo’s _foo. This is a common pattern, where foo is high level / flexible and _foo directly does the fixed set of heavy lifting.
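
A minimal sketch of the foo / _foo split described above. The names `describe` and `_describe` are hypothetical and are not taken from Base; the point is only that the public function stays thin while an internal helper holds the fixed work:

```julia
# Hypothetical names for illustration; nothing here is a Base API.

# Internal helper: the fixed "heavy lifting" that is not meant to be extended.
_describe(x) = "value " * string(x)

# Public entry point: thin and flexible, it delegates to the helper.
describe(x) = _describe(x)

# A later customization adds a more specific public method and still reuses
# the helper, so no old world age has to be captured.
describe(x::Integer) = x == 0 ? "zero" : _describe(x)

println(describe(0))     # special case
println(describe(1.5))   # generic path through _describe
```

This only helps when the original function was written with such a split in the first place.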

you understand they could just override your whole module right?

Yeah… but for scenario 1) I’m in the sysimage. Otherwise I couldn’t do anything meaningful anyway. I don’t care what the code calling my code will do. I just want the strongest guarantee that if my code does run, then it does what it was designed to do. E.g., equivalently, all calls from my code could be inlined, the full transitive closure down to Core! (bad idea, but this would be equivalent to what I tried to describe in point 1)

but this is not gonna happen if you use invoke everywhere?.. I’m probably not understanding your complex use-case

if your new foo relies on the old foo, just call the old foo’s _foo. This is a common pattern, where foo is high level / flexible and _foo directly does the fixed set of heavy lifting.

I’m afraid I’m missing something important here…
The old foo is in Base (the actual Base of the Julia distribution).
How can I do this foo_foo ? Really I don’t get it…

which function do you have in mind in Base and what do you want to do with it that you think you need invoke?

Again, there are two very different cases

  1. Say Base.rm or anything like that that could have catastrophic side effects

  2. Say Base.repl_cmd (to continue the example given above)

# My Code
function Base.repl_cmd(cmd, out)
   if cmd.exec == ["foo-bar"]
       # special handling
   else
       orginal_base_repl_cmd(cmd, out)
   end
end

How can I define orginal_base_repl_cmd without invoke_in_world ?

You cannot.

That in and of itself is not a problem, no? What do you want to protect yourself against by forcing an older world age?

thank you… So I’m not mad :smiley:
invoke_in_world is indeed what I was looking for… and the only possible way given the fully dynamic nature of the overrides (with back-link and code re-generation).
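
A hedged sketch of how such a delegation could look with Base.invoke_in_world. Here `greet` is a hypothetical stand-in for a function like Base.repl_cmd, and the sketch assumes it runs as a plain script (not inside a precompiled package), where overwriting a method with the same signature is allowed:

```julia
# `greet` is a hypothetical stand-in for a function such as Base.repl_cmd.
greet(name) = "hello, " * name

# Remember the world age in which the original method is the active one.
const old_world = Base.get_world_counter()

# Overwrite the method with the same signature. The fallback branch reaches
# the previous definition by running the call in the remembered world.
function greet(name)
    if name == "foo-bar"
        return "special handling"
    else
        return Base.invoke_in_world(old_world, greet, name)
    end
end

println(greet("foo-bar"))
println(greet("world"))
```

The fallback call is dispatched dynamically in the old world, which is the cost the earlier replies warn about.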

Question: why is it not in the docs? I understand that users could abuse it… But the use case defined in 2) above seems… well, useful?

the reason is that usually you want to use the latest methods, everywhere… if somehow X is designed to be customizable but you can’t customize it without doing this hack, then X is poorly designed.

https://github.com/JuliaInterop/Cxx.jl/blob/ce1bbf4a79d348252f07f8a5774e90da75ebbf44/src/CxxREPL/replpane.jl#L243-L252

I mean at least you definitely don’t need invoke for making a new REPL mode…

Because “world age” is an implementation detail to achieve compiled functions in the context of a dynamic programming language with eval.
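
To make the interaction between world age and eval concrete, here is a small sketch. The name `late_bound` is hypothetical; it is declared up front with no methods so that only the method table changes during the call:

```julia
# `late_bound` is a hypothetical name; declare it first with no methods.
function late_bound end

function define_and_call()
    # Adds a method in a newer world than the one this call is running in.
    @eval late_bound() = 42
    # A direct call still dispatches in the caller's (older) world and fails.
    direct = try
        late_bound()
    catch err
        err
    end
    # invokelatest dispatches in the newest world and sees the new method.
    latest = Base.invokelatest(late_bound)
    return direct, latest
end

direct, latest = define_and_call()
println(typeof(direct))
println(latest)
```

The direct call fails with a MethodError because the running function was entered before the new method existed, while Base.invokelatest looks the method up in the latest world.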

You don’t have to use invoke_in_world for that.

@eval Base function foo()
    # ... first part of the old definition of foo ...
    new_var = 5
    do_with_func(new_var)
    # ... rest of the old definition of foo ...
end

Getting the old definition itself can be done by looking up its source code, parsing the code and manipulating the resulting AST (though of course you have no guarantee that the source on disk is the same one that was used to compile the original function in the first place).
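
A rough sketch of the "parse the source and manipulate the AST" route, under the assumption that the source text is already available as a string. The function `scaled` and its body are made up for illustration and stand in for code read from disk:

```julia
# Source text of a hypothetical function, standing in for code read from disk.
src = """
function scaled(x)
    y = x + 1
    return y * 2
end"""

ex = Meta.parse(src)    # Expr with head :function
body = ex.args[2]       # the :block holding the statements

# Splice a new statement in front of the existing body.
pushfirst!(body.args, :(x = x * 10))

eval(ex)                # defines the patched `scaled` in the current module
println(scaled(1))      # prints 22
```

For a real Base function the same idea would be evaluated with `@eval Base`, with the caveat already mentioned that the source on disk may not match what was compiled.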

That aside, Cassette.jl is one tool doing something like this, but it comes with HUGE compiler and performance drawbacks and on top of that it’s a very brittle tool for all the right reasons.

I’ve looked at some of the rewriting tools (Cassette, macro + MacroTools or MLStyle, what was done in the context of the source-to-source autodiff, etc…). It’s very nice.

But it’s heavy for a little thing like…

# My Code
function Base.repl_cmd(cmd, out)
   if cmd.exec == ["foo-bar"]
       # special handling
   else
       orginal_base_repl_cmd(cmd, out)
   end
end

after thinking a tiny bit about it… it is not clear what semantics orginal_base_repl_cmd could have without a global world age. A chain of method re-definition (a “function age” for given fixed symbol) seems tricky with the specialization… Could maybe be defined for monomorphic functions… anyway… thanks for all the info.

Conceptually, what you’re asking about is similar to taking some existing compiled code (which does not have source information anymore, unless you save it next to the compiled code, which is done right now but may not always be done in the future and isn’t something to rely on), cutting it apart, splicing in your own stuff and then pretending you never did anything like that in the first place (when other code looks at the existing function). That… doesn’t mesh well with Julia’s semantics (well, the “cutting it apart and splicing” part at least; overwriting old functions is just eval).

You’re in luck though - your true goal of writing a custom REPL mode does not require this.

I believe ReplMaker.jl is the tool you’d really want to look at, instead of code splicing tools:

https://juliahub.com/ui/Packages/ReplMaker/TRU6e/0.2.5

You’re in luck though - your true goal of writing a custom REPL

Well actually all this came along only because I wanted to override Base.repl_cmd. I’m on Windows, but Julia explicitly ignores JULIA_SHELL with a (documented) @static Sys.iswindows… but there are shells now on Windows (ps, bash, etc…). It was only my little weekend fight against discrimination by Unixians :rofl:

More seriously, I posted because I thought it was a good learning exercise for me. My understanding of Julia semantics led me to seek something like invoke_in_world because my conclusion was that it is the only possible way to do it. I worried I was missing something very important here given the first replies…

writing a custom REPL mode does not require this .

yeah… I will want to do that at some point in this project. Thanks for the link.

Conceptually, what you’re asking about is similar to taking some existing compiled code…

I respectfully disagree. At code generation time of # My Code function Base.repl_cmd... we could have a primitive that builds a closure of the currently existing Base.repl_cmd (kind of invoke_latest), but not for invocation: only to grab a function pointer and all the other metadata needed to issue an LLVM call. This could return a Function (with full internal type information) that could be used later. Of course this Function would need to be opaque to method redefinition, handled as if it had been defined with a different unique name. There are semantic difficulties, IMHO, only when the enclosing context is polymorphic and could be specialized many times. If it is monomorphic, you don’t need more than the super.f(…) of class-based object-oriented languages (such as C++, Java, C#).

Hum… alternatively you could maybe specialize the captured function setting all polymorphic arguments to Any (making it monomorphic by supertype abstraction) and then rely on the dynamic dispatch at call sites inside it. I think it could work :slight_smile:

TLDR: At the end of the day, you just want to do an indirect call to a function pointer! One that you must have, since you would have executed this call anyway.

actually I think opaque closure should do this

Holly fu…

I’ve read about them… and listened to the talk at JuliaCon…
I had a use for them, but I fully missed the:

Except they always run in the world age they were created

So yes! There will be a means to detach a call site from the back-link of method redefinitions. This is much more important (semantically) than what I had understood. Thank you for pointing this out.

Something like:

const prev_repl_cmd = @opaque (cmd,out) -> Base.repl_cmd(cmd, out) # world N
function Base.repl_cmd(cmd, out)   
   if cmd.exec == ["foo-bar"]
       # special handling
   else
       prev_repl_cmd(cmd, out)
   end
end
# world N+1 after redefinition of Base.repl_cmd, without this new Base.repl_cmd itself being evaluated

(edit: pseudo code more likely to be ok… need to think about this more)

If this is where your security perimeter is it’s already way too late.
Even if this is just one of your security layers.
This is less security in depth and more running trip-wires inside your living room.

There are reasons you might want to do this.
But security isn’t one of them.

At the point this could possibly matter that means the attacker has already had a chance to run arbitrary code on your machine.
And as they say at that point it is no longer your machine.

Nothing anywhere near this is security hardened.

But like go ahead and use invoke in world age, or even opaque closures if you are going to do something useful with them.
But definitely don’t think of them as tools for the security toolbox.
They are barely tools for the advanced user toolbox

I fully and completely agree with all you’ve said above :smiley:
(cue the scare quotes on security I’ve used)
again… learning exercise for me.

(although I was more thinking like protecting some sensitive data… in a context where I generate the sysimage and General is not allowed - controlled LAN only)

I’m well aware you cannot do any sandboxing in Julia, or in anything that downloads and runs native artifacts from outside… must… not… think… about this… :japanese_ogre: makes me afraid :rofl:

Right, but all that requires changes to how Julia works internally right now, which I assumed wasn’t subject to change in this discussion. If we’re allowed to make arbitrary changes to how Julia works, yeah, of course anything goes.

As I understand it, Julia’s functions are always polymorphic. A function is just a name. Type restrictions only come into play when talking about a specific method/implementation of that function on a given number of argument types. That’s what makes this tricky semantically, because as I see it, this directly contradicts your desire for monomorphic behavior.

I hear you saying “well what about a subtype relationship between a function and its methods” but that’s not part of Julia’s semantics. Methods don’t have types, functions do.
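
A small sketch of the function/method distinction described above, using a hypothetical function `show_kind`. It shows one function object with its own type, plus a method table whose entries each carry a signature:

```julia
# `show_kind` is a hypothetical function used only for illustration.
show_kind(x::Integer) = "integer"
show_kind(x::AbstractString) = "string"

# One function object, with its own singleton type...
println(typeof(show_kind) <: Function)

# ...and a table of methods; the argument-type restrictions live in each
# method's signature, not in the function.
for m in methods(show_kind)
    println(m.sig)
end

# Dispatch selects a method from the argument types of a given call.
println(which(show_kind, (Int,)).sig)
```

Adding or replacing a method changes the table, while the function's name and type stay the same.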

…Aaaaaand I think you’ve just killed inlining (if you want to have this available generically) :sweat_smile: Having this sort of dynamicness available at all times kills static dispatch, a core reason why Julia has dynamic semantics while still being able to compile & inline called functions instead of having to actually do dynamic dispatch all of the time.

Not if the “call” was inlined without compiling the inlined code separately first, no. It’s not generally useful to think of “function pointers” when thinking about Julia functions and methods. Julia is not C.


Have you checked out the PhD thesis by Jeff Bezanson? It talks in great detail about the type system. There’s also World Age in Julia - Optimizing Method Dispatch in the Presence of Eval, an analysis of how the whole world age mechanism works, what its consequences are with regard to eval, what kind of semantics follow, etc.
