Understanding memoization.jl

question
macros

#1

I am trying to understand memoization.jl by simonster to obtain a good understanding of macros. First I tried to macroexpand the output:

julia> using Memoize

julia> macroexpand(:(@memoize function x(a) a end))
quote  # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 107:
    function ##x_unmemoized(a) # REPL[3], line 1:
        a
    end # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 108:
    empty!(ObjectIdDict()) # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 109:
    x(a)::Any = begin  # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 109:
            if haskey(ObjectIdDict(), (a,))
                (ObjectIdDict())[(a,)]::Core.Inference.return_type(##x_unmemoized, typeof((a,)))
            else 
                (ObjectIdDict())[(a,)] = ##x_unmemoized(a)
            end
        end
end

It is really weird. I think ##x_unmemoized is intended to be a function but since it starts with #, it should be a comment. Then I tried to evaluate it:

julia> eval(macroexpand(:(@memoize function x(a) a end)))
x (generic function with 1 method)

Everything seems to be fine. However, if I copy the output of macroexpand:

julia> quote  # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 107:
           function ##x_unmemoized(a) # REPL[3], line 1:
               a
           end # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 108:
           empty!(ObjectIdDict()) # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 109:
           x(a)::Any = begin  # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 109:
                   if haskey(ObjectIdDict(), (a,))
                       (ObjectIdDict())[(a,)]::Core.Inference.return_type(##x_unmemoized, typeof((a,)))
                   else 
                       (ObjectIdDict())[(a,)] = ##x_unmemoized(a)
                   end
               end
       end

ERROR: syntax: unexpected else
Stacktrace:
 [1] macro expansion at ./REPL.jl:97 [inlined]
 [2] (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:73

I am totally confused about what is happening here. Anyone give me a hint?


#2

These funny variable names are partially gensym’ed, a kind of name mangling to make sure there variable names are unique.

julia> a = gensym()                                                                                                                                   
Symbol("##778")                                                                                                                                       
                                                                                                                                                      
julia> :($a=1)                                                                                                                                        
:(##778 = 1)                                                                                                                                          

julia> eval(:($a=1))                                                                                                                                  
1                                                                                                                                                     
                                                                                                                                                      
julia> eval(:($a))                                                                                                                                    
1                                                                                                                                                     

If you copy past it you have to rename those to something which is not a comment. Either wrap them in $(Symbol("##x_unmemoized")) or just use other names, e.g. see MacroTools.jl: https://github.com/MikeInnes/MacroTools.jl/blob/51391fb1cc93d295a1643be1d577265468428625/src/utils.jl#L128


#3

Memoize is rewriting your function to be something like

function unmemoized_name(...)
    # original function body
end
function original_name(...)
    # look up result from the ObjectIdDict if it exists
end

##x_unmemoized is just the name of the unmemoized function. The two ## just help ensure that the function name doesn’t conflict with anything else you may have defined. These automatically generated names usually come from calling gensym:

help?> gensym
search: gensym @gensym

  gensym([tag])

  Generates a symbol which will not conflict with other variable names.

julia> gensym(:hello)
Symbol("##hello#658")

#4

Thank you @Michael_Eastwood and @mauro3.

I am still confused.

  1. “##x_unmemoized” is actually hard coded, see https://github.com/simonster/Memoize.jl/blob/master/src/Memoize.jl. Then are name conflicts still guaranteed to be avoided?

  2. I understand what is memoization, but the following expression returned by macroexpand seems to create a dictionary every time the function is called:

julia> macroexpand(:(@memoize function x(a) a end))
quote  # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 107:
    function ##x_unmemoized(a) # REPL[3], line 1:
        a
    end # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 108:
    empty!(ObjectIdDict()) # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 109:
    x(a)::Any = begin  # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 109:
            if haskey(ObjectIdDict(), (a,))
                (ObjectIdDict())[(a,)]::Core.Inference.return_type(##x_unmemoized, typeof((a,)))
            else 
                (ObjectIdDict())[(a,)] = ##x_unmemoized(a)
            end
        end
end

If I rename ##x_unmemoized and evaluate it (notice I add an extra line println("running")):

eval(quote  # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 107:
    function x_unmemoized(a) # REPL[3], line 1:
               println("running")
               a
           end # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 108:
           empty!(ObjectIdDict()) # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 109:
           x(a)::Any = begin  # /home/wangc/.julia/v0.6/Memoize/src/Memoize.jl, line 109:
                   if haskey(ObjectIdDict(), (a,))
                       (ObjectIdDict())[(a,)]::Core.Inference.return_type(x_unmemoized, typeof((a,)))
                   else
                       (ObjectIdDict())[(a,)] = x_unmemoized(a)
                   end
               end
       end
)

, the function x is not memoized at all:

julia> x(1)
running
1

julia> x(1)
running
1

.

So it seems that using macro is not equivalent to evaluate the expression the macro returns. How to understand this behaviour?


#5

They’re not, but pragmatically, you won’t get a conflict unless you go looking for one. The point is that this way you can get to the unmemoized function with something like eval(Symbol("##x_unmemoized")). If Memoize.jl used gensyms, then AFAIK you can’t get that variable without hacking your way into Julia internals.

That’s a confusing situation. You can put arbitrary objects inside your macroexpansion. They put an ObjectIdDict(), whose printed representation is:

julia> show(STDOUT, ObjectIdDict())
ObjectIdDict()

So there’s only one dictionary created - at macro expansion time. I think the normal way of writing this would put the dictionary creation expression inside the macro expansion, as const memo = ObjectIdDict(), but perhaps they had good reasons for doing things this way.


#6

Thank you, I have a basic understanding now.