Redefining structs without restart?

I am certainly not the right person to elaborate on that. I just want to point out that the “solution” needs to deal with the names of the structs somehow; otherwise, what would an “old” method be specialized to? It seems that the name of the type is what defines the specialization.

Also, judging from the issue thread that I linked above, it does not seem to be impossible; it just seems that no one with the time and knowledge to do it is available right now.

1 Like

I agree with you that it is not trivial to implement, nor is it completely clear and uncontroversial what the semantics should be. I simply believe that we should discuss what we expect the language to do, and just saying “oh, it’s impossible because of @eval” is not a very constructive way of exploring the landscape, in my opinion.
As a side note, in the REPL we are already allowed to redefine both modules and constants, and the behavior is not that far off from what I would expect from redefining structures:

const c = 0
f() = println(c)
f() # prints 0
c = 1
f() # still prints 0
f() = println(c) # forces an update
f() # prints 1
2 Likes

What would you expect from redefining structures? That means redefining types, and what should happen to functions that already depend on that old type? Should they all be recompiled?

3 Likes

Ok, let me try to be very explicit with the words to avoid ambiguity.

When you say “redefining structures”, I interpret it as executing twice some code like struct S ... end. I do not interpret it as modifying the internal representation in Julia of an already existing type.

One possible way to deal with such an event would be to define a new type called S, independent from the previous type also called S. In the namespace of the current module, the name S would now refer to the new type instead of the old one. If you build an object of this new type and feed it to a function that was already specialized for the old type, then yes, of course the function would have to specialize again! It is a new type, after all.

For instance, in the hypothetical code

abstract type A end
f(::A) = 42
struct S <: A end
s1 = S()
f(s1) # returns 42
struct S <: A; x::Int end
s2 = S(0)
f(s2) # returns 42

s1 and s2 would be objects of different types, the old and the new S, and the function f should specialize for both of them.

Notice that we can already achieve something similar by wrapping the structures in a module and redefining the module:

abstract type A end
f(::A) = 42
module M struct S <: Main.A end end
s1 = M.S()
f(s1) # returns 42
module M struct S <: Main.A; x::Int end end
s2 = M.S(0)
f(s2) # returns 42

If you execute methods(f).ms[1].specializations, you see that f has two apparently identical specializations: one for the old M.S and one for the new M.S.
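For concreteness, the inspection looks roughly like this in the REPL (the exact printing of specializations varies across Julia versions, so treat this output as illustrative):

methods(f).ms[1].specializations
# svec(MethodInstance for f(::Main.M.S), MethodInstance for f(::Main.M.S), ...)
# two entries that print identically but wrap distinct type objects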

My expectation is just that this should work the same without wrapping in a module. I don’t think it’s realistic to expect something much different in this case.


Now, let’s move on to something slightly more controversial. Let’s say that instead of defining f for an abstract type A, we have instead a definition for f(::S):

struct S end
f(::S) = 42

If we redefine S, should this method be applicable for the new S? If we perform the module trick as before

module M struct S end end
f(::M.S) = 42
s1 = M.S()
f(s1) # returns 42
module M struct S x::Int end end
s2 = M.S(0)
f(s2) # fails: no method matching f(::Main.M.S), closest candidate is f(::Main.M.S)...

we see that currently it fails with a cryptic error message. There is nothing inherently wrong with this behavior: the definition of f was intended for the old type M.S, and the new homonymous type M.S is a different type.

However, from a usability perspective, one could argue that it is reasonable to expect the following: when f receives an argument of the new type M.S, it detects that it has a method applicable to the old M.S that was shadowed, and specializes that method again for the new type. I’m not saying that this is the right thing to expect, but there is nothing insane or incoherent about expecting this behavior.


In conclusion, I think the behavior of the following code

struct S end
f(::S) = 42
struct S x::Int end
f(S(0))

is open to debate, but there are at least two reasonable behaviors that one can expect: either a MethodError, because it is a new type, or a fresh specialization, because the new type is a redefinition of an accepted type (a sort of “retroactive dispatch”).

4 Likes

First off, sorry you took my comment this way; I certainly didn’t mean to imply that you’re lazy. I don’t take your comments so far as “just complaining”, rather as a little uninformed about why these long-known issues have not been solved (and likely won’t be in the foreseeable future).

Note that you’ve misunderstood the semantics of const: it doesn’t mean that c will never change value, it means that c will never change type. It’s also not a guarantee the compiler gives you; it’s a promise you make to the compiler that you won’t change the value or type, so that it may optimize code that depends on it:

julia> const c = 1
1

julia> c = 1.0
ERROR: invalid redefinition of constant c
Stacktrace:
 [1] top-level scope at REPL[2]:1

This is consistent with your observation that you have to “redefine” f. Since the binding was marked as const, the compiler inlined the value into the compiled code, because it is a bitstype and immutable.
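For completeness, changing the value while keeping the type is only warned about, not forbidden, which is exactly what allows the stale inlined value you observed (the warning text below is from a 1.x REPL and may differ slightly between versions):

julia> const c = 1
1

julia> c = 2
WARNING: redefinition of constant c. This may fail, cause incorrect answers, or produce other errors.
2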


Since struct literally defines a new type named S, and julia has nominative typing, “redefining S” quite literally means redefining the constant S referring to the type of the same name. Changing the type of a const binding is not allowed, because it would lead to having to recompile every piece of code that depends on S. You wouldn’t be able to do precompilation at all anymore, because any method might do @eval ... to change the structs in the containing module. You can sometimes see this in the wild: a package does something like this during precompilation and you get the infamous Precompilation may be fatally broken for this module message. This is directly contradictory to the idea of making code fast by precompiling and caching, so I really don’t see how the seeming usefulness outweighs the cost of losing precompilation.

The trick with the module only works because the separate modules allow you to distinguish the two S. Modules don’t create a type:

julia> module M end
Main.M

julia> (s::M)() = "test"
ERROR: function type in method definition is not a type
Stacktrace:
 [1] top-level scope at REPL[2]:1

On the other hand, if you want to get rid of nominative typing, that’s an OK approach to solve the issue of redefinition - I just really doubt this is an approach that will be taken in julia because being sure what a name refers to is one of the most important things for any programming language.

I for one think that giving examples and counterpoints to presented theorems is a valid form of conversation, but to each their own.

That said, @eval is the precise mechanism that makes it really hard to judge what a certain piece of code will do. If you allow shadowing of structs in global scope, you also have to handle @eval, since the following is valid in your thought model:

function f()
  @eval struct S end
end

function h()
   @eval struct S end
   @eval g(::S) = "from h"
end

f() # we got S
g(::S) = "example"
g(S()) # "example"
h() # we got S_1 and g(::S_1)
f() # we got S_2
s2 = S() # create an instance of the current S (S_2)
h() # we got S_3 and g(::S_3)
g(s2) # MethodError :(

@eval operates in global scope, so by allowing the shadowing of structs you can very easily create situations where you’d expect a function call to work but it in fact doesn’t because the type is different even though their names are the same. Worse, such a function could come from any module you depend on, since (by your own example above) modules can add new methods to existing functions. This can suddenly and very unexpectedly change the behaviour of your code, simply by loading a dependency. That sounds like a very bad idea.

Now, as mentioned above, if we didn’t have @eval, we would not have this problem at all, which is why I brought it up as a counterpoint in the first place. That’s not a feasible thing to remove from the language though.

The compiler has no way of knowing how it should specialize that function for the new type. From its point of view, there’s a method that takes a certain type X, the type X itself, and a new type Y which may or may not have the same name as X, with no way of knowing how the two relate to each other. The safe way out is to not allow overwriting existing names. The slow way is to insert dynamic lookups everywhere you access a value of that type. Specializing on types is the programmer’s job, so the only thing left to do is throw a MethodError.

2 Likes

Interesting!

$ time julia --startup-file=no -e 'using Plots'   
/usr/local/bin/julia-1.5.2 --startup-file=no -e 'using Plots'  24.06s user 0.90s system 86% cpu 28.871 total
$ time julia -O0 --compile=min --startup-file=no -e 'using Plots'
/usr/local/bin/julia-1.5.2 -O0 --compile=min --startup-file=no -e 'using Plots'  13.71s user 0.90s system 92% cpu 15.720 total

I honestly don’t understand the point of doing that. Loading Plots is faster, ok, but then any actual computationally-heavy operation is slow, so what’s the deal? Also, 24 seconds looks like a lot; it’s 8 seconds for me on Julia v1.5.3 (and I have a 5-year-old laptop)

1 Like

Well, I don’t know. People use Python all the time, and there “ALMOST any actual computationally-heavy operation is slow”, so it must have its uses.

1 Like

Not really, since all computationally-heavy code in Python eventually calls into C/C++/Fortran libraries.

Yes, I know. But sometimes you just want to plot results or move stuff around. It’s good to know that you can say “Julia, relax, I don’t need all your power right now!”

1 Like

Sure, good to know the option exists (and there are good use cases for --compile=min), but with Julia master I save about 0.7 seconds at startup by using -O0 --compile=min

Benchmark #1: julia-master --startup-file=no --project=. -e 'using Plots'
  Time (mean ± σ):      3.921 s ±  0.018 s    [User: 3.995 s, System: 0.625 s]
  Range (min … max):    3.886 s …  3.938 s    10 runs
 
Benchmark #2: julia-master --startup-file=no --project=. -O0 --compile=min -e 'using Plots'
  Time (mean ± σ):      3.209 s ±  0.032 s    [User: 3.322 s, System: 0.594 s]
  Range (min … max):    3.166 s …  3.261 s    10 runs
 
Summary
  'julia-master --startup-file=no --project=. -O0 --compile=min -e 'using Plots'' ran
    1.22 ± 0.01 times faster than 'julia-master --startup-file=no --project=. -e 'using Plots''

But plotting often comes associated with some number crunching, so saving less than one second at startup may not be that useful.

Sure, by the way, how did you do those cool benchmarks?

As I see it, const is a guarantee from the compiler that the type will not change (it throws an error if I try), plus a promise that I will not change the value, or that I’m ok with paying the price of some subtle errors if I try to (since I can).

In this light, I think that

struct S ... end
module M ... end

should be thought of as the analogous of

const S = struct ... end
const M = module ... end

The types cannot change (they must remain DataType and Module respectively), while the “values” can. The latter (the module) already works this way: if you execute module M ... end twice, you simply change the value of the constant M. This is exactly the same as doing const c = 0; c = 1.
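A quick REPL check makes this concrete: redefining a module leaves the old module object alive and simply rebinds the name, exactly like reassigning a constant:

module M end
old = M
module M end   # WARNING: replacing module M.
M === old      # false: the name M now refers to a brand new module object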

Now, let me discuss your example with f, g, h, S and @eval. The functions f and h don’t play a significant role, so let me inline their bodies and get rid of @eval. Executing struct S end twice should be a no-op the second time, because we are not really changing the structure; so, to make the example more significant, let me use two different definitions of the structure. The example becomes

struct S x::Int end         # S_0
g(::S) = "example"          # g(::S_0)
g(S(0)) # "example"

struct S x::Char end        # S_1
g(::S) = "from h"           # g(::S_1)

struct S x::Int end         # S_2
s2 = S(0)                   # s2 isa S_2

struct S x::Char end        # S_3
g(::S) = "from h"           # g(::S_3)

g(s2) # ALERT!!!

As I mentioned at the end of my previous post, I can see at least two distinct ways to handle this situation. The first is lazier and purer on a conceptual level, the second is more convenient for an interactive user. I’ll explore the implications of both.

Option 1: old methods don’t apply to new shadowing types

We accept that the line marked ALERT!!! is a MethodError. I don’t see anything particularly bad about that. Just as changing the value of a const c::Int = 0 can lead to errors, so can wildly redefining structures. At least we have moved the error a bit forward, instead of throwing it on line 4 (at the first redefinition of S). A careful programmer could define a g(::S) at the right spot to deal with an argument of type S_2 and everything would work. Actually, they are still in time to do

g(::typeof(s2)) = "whatever"

and be able to add a method that handles arguments of type S_2.

Option 2: old methods apply to new shadowing types (retroactive dispatch)

The other option is to recognize that there is a chain of overrides S_0 -> S_1 -> S_2 -> S_3. This could be represented by a property such as shadows or overrides, so that

S_3.shadows == S_2
S_2.shadows == S_1
S_1.shadows == S_0

Now, whenever we have a function g with a method g(::S) that accepts an argument of type S_1 and we feed in an argument of type S_2, we recognize that S_2 was shadowing S_1 and decide that the method is applicable. We then have to compile a new specialization specific to S_2 and we are good to go.
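To make this concrete, here is a minimal sketch in ordinary Julia of what the applicability check could look like; the shadows function is entirely hypothetical, nothing like it exists today:

# Hypothetical: shadows(T) returns the type that T shadowed when it was
# (re)defined, or `nothing` if T is the original definition.
function applicable_with_shadows(g, T)
    S = T
    while S !== nothing
        # a method written for S, or for any type S shadows, is applicable
        hasmethod(g, Tuple{S}) && return true
        S = shadows(S)  # walk back through the chain: S_3 -> S_2 -> S_1 -> S_0
    end
    return false
end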

This can be much more convenient for interactive use, but I also recognize that it can turn out to be quite messy.


Conclusion

Both approaches seem reasonable to me, and there may be many others. In both cases no dynamic lookup is required, and the behavior of structs becomes more consistent with that of constants and modules.


Addendum

I feel like an additional clarification is required. The name “nominal/nominative type system” might be a bit misleading. The equivalence and compatibility of types is not determined by the name to which they are bound in the current namespace (S in the example above), but rather by their true identity, as retrieved with objectid. Namely, it is ok to do

struct S end
const T = S
g(::T) = "ok"
s = S()
g(s) # ok

because at the time of definition T was referring to S.

What I’m proposing is not incompatible with a nominal type system. It is basically just a better way to do the following

struct _S1 end; S = _S1
g(::S) = "one"
struct _S2 end; S = _S2
g(::S) = "two"
# now g has two methods handling _S1 and _S2

With hyperfine
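For reference, the invocation is as simple as passing hyperfine the commands to compare; the flags here are illustrative, matching the benchmark above:

hyperfine "julia --startup-file=no --project=. -e 'using Plots'" \
          "julia --startup-file=no --project=. -O0 --compile=min -e 'using Plots'"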

2 Likes

Cool, thanks!

1 Like

Now that I think about it, the following macro almost does the trick:

macro redefinable(struct_def)
    struct_def.head == :struct || error("struct definition expected")
    if struct_def.args[2] isa Symbol
        name = struct_def.args[2]
        real_name = struct_def.args[2] = gensym(name)
    elseif struct_def.args[2].head == :<:
        name = struct_def.args[2].args[1]
        real_name = struct_def.args[2].args[1] = gensym(name)
    end
    esc(:( $struct_def; $name = $real_name ))
end

Now we can do

@redefinable struct S end
g(::S) = "old"
s1 = S()
@redefinable struct S x::Int end
g(s::S) = "new $(s.x)"
s2 = S(42)

g(s1) # "old"
g(s2) # "new 42"

The only thing which is missing is that the actual name of the structure is some garbage such as var"##S#259". I could assign to S.name, but unfortunately it only accepts values of type Core.TypeName, which has 0 constructors… If we added a constructor Core.TypeName(::Symbol), then the slightly modified macro (the one that sets the name too) would implement precisely the semantics of my Option 1 above.


I modified the macro to support inheritance. Now we can even do

abstract type A end

g(a::A) = a.x # I give only a single method definition

@redefinable struct S <: A
    x::Int
end

g(S(42)) # 42

@redefinable struct S <: A
    x::Char
end

g(S('x')) # 'x': ASCII/Unicode U+0078 (category Ll: Letter, lowercase)

g of course has a single method, with two specializations (as can be checked with methods(g).ms[1].specializations).

1 Like

I cannot say any of you are entirely right. const makes the binding constant.

This means: if you do not want to incur undefined behaviour, you should never ever do x = anything if x is a const variable. The consequences of this are that: it will never change type, as you cannot change which object is associated with x; the compiler may just inline the object inside functions that reference it, so if you change it those functions will keep displaying the old value; immutable objects with no mutable fields (basically all isbits objects, like most numbers) cannot be changed in any way; if x is a mutable struct like Vector, then you can change its contents and size, because const is not about the contents of the object but about the binding, that is, the name will forever refer to the same object (i.e., all other scopes/functions that reference it can be sure the “memory address we are speaking of” will never change). For those coming from C, the behavior for mutable structs is similar to a const pointer to an object, and for immutable objects it is like a const variable holding the object itself (without the pointer indirection).
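A short REPL illustration of the binding-versus-contents distinction just described (the last line is the one the docs warn about):

const v = [1, 2, 3]
push!(v, 4)  # fine: const restricts the binding, not the object's contents
v[1] = 10    # also fine, for the same reason
v = [0]      # the forbidden part: rebinding the const name (the REPL warns)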

I cannot say you are entirely correct either, I think.


No it does not:

const v = [1,2,3]
objectid(v) # 0xa41696f90596f3c7
const v = [1,2,3]
objectid(v) # 0xfdb41deeb4b28891

I’m not sure it is undefined behavior in the same sense as in C. It shouldn’t be, and I hope it really isn’t. I guess it may just induce “erroneous or unexpected results”, not really undefined behavior, which is a much worse beast. By that I mean that the compiler is free to inline any use of const values whenever it pleases, so you can happen to witness two different values of the same “constant” simultaneously:

const c = 0
f() = c
f() # 0
c = 1
g() = (c, f())
g() # (1, 0)

Most importantly, you forgot to mention the last paragraph from those very same docs:

In some cases changing the value of a const variable gives a warning instead of an error. However, this can produce unpredictable behavior or corrupt the state of your program, and so should be avoided. This feature is intended only for convenience during interactive use.

This whole “redefining structures” business is not meant to unleash wild programmers who want to build intricate packages around savage use of @eval; it is meant primarily for interactive use. We should treat it as such: it can have some limitations and rough edges, but it should nonetheless be more flexible than the current implementation, which doesn’t allow it at all.

I worked around the limitation of the displayed name with a new macro:

macro redefinable(struct_def)
    struct_def isa Expr && struct_def.head == :struct || error("struct definition expected")
    if struct_def.args[2] isa Symbol
        name = struct_def.args[2]
        real_name = struct_def.args[2] = gensym(name)
    elseif struct_def.args[2].head == :<:
        name = struct_def.args[2].args[1]
        real_name = struct_def.args[2].args[1] = gensym(name)
    end
    esc(quote
        $struct_def
        $name = $real_name # this should be `const $name = $real_name`
        Base.show_datatype(io::Base.IO, ::Base.Type{$real_name}) = Base.print(io, $(QuoteNode(name)))
    end)
end

A small example:

abstract type A end

foo(a::A) = (a, a.x)

@redefinable struct T <: A
    x::Int
end

@show foo(T(42))
# (T(42), 42)

@redefinable struct T <: A
    x::Float64
end

@show foo(T(3.14))
# (T(3.14), 3.14)

Again, this implements the semantics of Option 1: there are two independent types (secretly named ##T#***) that know nothing about each other, and the name T in the current namespace switches from one to the other.

Now, there is still a problem with this macro: the third to last line should be const $name = $real_name, so that usage of $name (T in the example) would be inlined by the compiler.


Edit

In the macro, instead of the Base.show_datatype hack, one can assign to $real_name.name.name, that is, the return value of the macro becomes

    esc(quote
        $struct_def
        $real_name.name.name = $(QuoteNode(name)) # fix the name
        $name = $real_name # this should be `const $name = $real_name`
    end)
1 Like

Undefined behaviour includes you being able to alter the constant. I do not know where you got your definition of undefined behavior, but it means exactly that: the compiler is free to do whatever.

This paragraph makes no sense to me. You say undefined behavior is a much worse beast than “erroneous or unexpected results”, but then, as an example of undefined behavior, you give “the compiler is free to inline whenever it pleases any use of const values”, which is exactly what I said it could do in my reply. At least to me, if the inlining happens and you manage to alter the const variable, then the behavior of the program clearly falls into the category of “erroneous or unexpected results”.

I did not forget; I omitted it on purpose, because it falls under the definition of undefined behavior.