Why don't the free variables in a function body get bound at function definition time?

Yes, it would have to apply to function calls within the function body as well, like this:

g() = 1
f() = g()
f() # 1
g() = 2
f() # 1

# But to achieve different outputs:
f(h) = h()
g() = 1; f(g) # 1
g() = 2; f(g) # 2

The general principle would be that the effect of a function call (be it a return value or some side-effect actions) should depend only on the supplied arguments (if any).
Hence any free variables (thus including function variables) in the body would have to be resolved to their values once and for all – thus it seems natural for this to happen at function definition time.

Ok, let’s step back a bit … making these rules dependent on global context – whether a variable exists already or not – sounds like a recipe for confusion.
Here are some arguments, why I think it might not be a good idea in the first place:

  1. Usually functions are bound globally and (as you acknowledge) your rule would apply to functions too. This would be very annoying for interactive work, i.e., define g, define f using g, test f and see it's wrong (the problem was in g all along), fix g, try f again (success).
    Note that with your rule, the old definition of g would still stick around and f would need to be redefined as well to see the change – as you can imagine this gets messy very quickly with any kind of interactive workflow. If you have ever worked with Python, it is annoying enough that old objects stick around when you redefine a class; languages designed for interactive use, e.g., Smalltalk or Common Lisp, can even handle this case gracefully.
    Some compiled languages could get away with this, e.g., Haskell which only allows a single global definition anyways.

  2. Your rule would also not solve the fundamental problem of “spooky action at a distance” due to global variables but simply shift the problem:

    glob_a = 10
    
    # Lots of other important stuff
    
    f(x, y) = x * glob_a + y
    
    # Some more scrolling here
    
    glob_a = 20
    
    # Further down the lines
    
    f(2, 3)  # What will/should this be? 
    

    Local reasoning does not work in any case …

  3. Finally, it would fundamentally change the semantics of functions, which don’t evaluate their body until called (under your rule, part of it would already be evaluated at definition time). You’re right that closures do keep a reference to their defining environment, but this was actually introduced to restore local reasoning for higher-order functions:

    function adder(x)
        fun(y) = x + y  # x is free here
        fun
    end
    x = 10
    a1 = adder(1)
    a1(x) # 11
    

    You can verify that this breaks in Emacs Lisp, one of the few languages still around with dynamic binding (modern languages do provide dynamic variables as an option though, including Julia):

    (defun adder (x)
      (lambda (y) (+ x y)))
    (setq x 10)
    (setq a1 (adder 1))
    (funcall a1 x)  ;; 20 under dynamic binding (lexical binding would give 11)
    

    Dynamic variables – or simply redefining a global variable – can be very handy for (locally) changing configuration options:

    somefile = open(...)
    redirect_stdout(somefile) do
        print("Hallo")
    end
    

    which would not be possible when following your rule.


Implementing obj.method() by methods capturing values (or variables that don’t get reassigned) is not classical OOP at all. Your methods are encapsulated by an object with no nominal class of its own. In classical OOP, a named class defines and encapsulates the methods, and its instances obj or its fields are strictly passed to the methods at call time, whether by a convention, keyword, or argument. An instance can’t exist while the class and its methods are being defined, so there are no values to capture. Binding values to methods at definition time just does not help OOP languages do what they do, and it’s unrelated to why Julia is not OOP.

Interestingly, for some macros (@eval, @spawn and others), there is an easy way to use the value, not the name, at the point of definition. E.g.

julia> a = 1
1
julia> @eval f()=println($a)
f (generic function with 1 method)
julia> a = 2
2
julia> f()
1

Of course, macros work at the syntactic level, and the $ expansion is typically done with the let construction suggested above.
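For reference, that let construction can be sketched as follows (a minimal example with illustrative names, assuming a non-const global a):

```julia
a = 1
# bind a local `a` to the *current value* of the global `a`;
# the closure then captures the local, not the global
f = let a = a
    () -> a
end
a = 2
f()  # still returns 1
```

This is the non-macro way to get the same definition-time value capture as the $ interpolation in @eval.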

A modification to the julia parser could possibly enable such a thing for use outside macros, not that I would recommend it.


OK, let’s rename that argument.

update(anew, inc) = global a=anew+inc

But that syntax is not necessary in Julia 1.11.5:

julia> a=0;
julia> f() = global a=a+1;
julia> f(); f(); a
2
# Also for local variables:
julia> function f()
         a=0
         f()= a=a+1
         f(); f(); print(a);
       end
...
julia> f()
2 

The transparency I meant is about exactly which variables the method depends on – explicitly as arguments, as opposed to implicitly via some free variables in the function body. If you called the same myf(42) repeatedly, at any time, you should be guaranteed to get the same effect (return value or side effect …).
(Unless myf is getting some sort of external input (file, user…) – though that might be made transparent as well; I haven’t thought enough about that.)

If we’re both talking about the value of a global a at the moment of a method definition that uses that a in the body… then at that moment a has a certain value, regardless of reassignments that happen before or after this function definition time. So I don’t understand the question.
I mean a fundamentally new way that functions/methods would work (not currently in Julia/Python/C…) by automatically binding the free variables (be they global or other outer variables) to their current values at function definition time.

Yes, forcing every dependency to be explicit via arguments will be a bit more verbose; but I thought the transparency (and thus ease of reasoning, and thus probably fewer bugs) would make it worth it.
I think that certain programming patterns related to scope and closures would be made easier – I’ll try to bring examples later.

I think you misunderstood me, I didn’t say that that is classical OOP. It seems to me that you concluded that based on the hacked example with obj returning (; inc,dec,get,A) etc. But that was just a hack, rather for fun; so that, and the specific rule I mentioned at the end to make that hack work, is not so important in the big scheme.
But to the rest of what you say:

Yes, indeed, but this is not contradicted or made harder by binding values to free variables in methods at definition time. Because when defining the methods to act on some future obj, you work with self (or something similar) - a variable to stand for that object that will be instantiated at call time. self is, de facto, a parameter, not a free variable, thus resolved at method call time, as all other parameters.
It’s just that in usual OOP syntax, self is not required to be listed explicitly inside the parameter list.

I think most of those downsides do not apply to the new function definition semantics I explained.
Only dependency on input (information needed to carry out the task) needs to be made explicit via parameters, arguments.
So that whatever the effect will be, it will be exactly the same if the passed argument values are the same.
Hence, the requirement that free vars be determined at fun. def. time.

But not to explicitly track or declare the state that will be changed (like files to be changed, globals to be modified…).

# this would be totally fine;  
# effect at call is always the same:
function f() 
  print(42)  
  global a = 9
end
# to have the effect depend on some outer variables `t` and `anew`, would need:
a = 0
function f(t, anew)
  print(42 + t + a)  # this `a` will just interpolate the current value of `a`, 0 here
  global a = anew + t
end
# call
f(3, a)
# to have the effect depend on external user input, perhaps like this, to be strict:
function f2(STDIN, t)
  userin = read(STDIN)
  print(userin, t)
end

This one could be true.
If the generated numbers depend on some outside-the-program state (a memory box, thus similar to a variable), then in principle yes, the function would need to explicitly declare that as a parameter.
I guess same would be for functions that read user or file input.
Perhaps, as a convenience, these cases – of reading inputs from “outside” the program – might be the only exception to requiring the dependency to be specified as an input parameter.

That particular rule I added after the hack with obj.get() etc. (i.e., what to do if the free variable is itself under definition) is not important nor well thought out… I made it up on the spot just for that hack.
In your arguments, I’m not clear if by “rule” you meant that last little rule or the main rule in this whole topic : free variables’ evaluation at function definition time (what a mouthful… FVEDT). Maybe you could specify there in your original reply? If you’re bored.
So I can’t understand (1). A tiny code example would help.
At (2): simple:

glob_a = 10
...
f(x, y) = x * glob_a + y  
# The above will be same as: 
f(x,y)= x*10 +y

glob_a = 20
...
f(2, 3)  # What will/should this be?
# Answer: 23, always!

# if you wanted to depend on the global, then:
f(x, y, g) = x * g + y
#
f(2, 3, glob_a)  # the return value explicitly depends on glob_a

Here I get that you mean the main rule.
Yes, but it’s not a big conceptual leap, is it?
Free variables would be evaluated/resolved to their current values at definition time, but all function calls in the body are, semantically, delayed until call time.
Basically resulting in a function body that only depends on parameters.

The example with higher-order-function - is interesting and should still work with the new function definition semantics. In:

function adder(x)
    fun(y) = x + y  # x is free here
    fun
end

At the definition of fun(y), x is resolved to the parameter x of the parent function adder(x), thus its evaluation is forced to be delayed until the call time of adder.

Why not? Again, I’ll assume you meant the main rule.
In the functions in the code above, are there any hidden (global) variables that would make the effects change at different calls? If not, there are no free vars there that would need to be made into parameters.
I gave more explanations & simple examples in my reply above this one.

This contradicts:

among other things. I understand you might have wanted to make some analogies, but it’s unclear how far you intended them to go, to the point of being misleading.

When you reference a global variable from a method, there aren’t separate “free variables” in the function body, it’s literally the same global variable. When you bind the value at definition time, the method isn’t depending on a global variable anymore, it’s just directly storing or referencing that value. Like referencing a global variable, you sacrificed the caller’s ability to provide input as arguments, but even worse, you removed the only language-level way to access that method-bound value.

If you just wanted the “same effect” (which isn’t clear because you provided several positive examples of functions like inc() or dec() that do not have the same effect for the same calls), then as several people pointed out, a const global variable would perform exactly like a method-bound value. If you really don’t want that global access, metaprogramming can do exactly what you’re suggesting. Making global variables an exception in the scoping rules won’t add any value, you’d just shuffle the existing options around.
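A minimal sketch of the const point (illustrative names; Julia errors, or at best warns, if you later try to reassign A):

```julia
# a const global binding cannot be reassigned, so the method's
# behavior is fixed from definition onward -- just like a
# definition-time-bound value would be
const A = 10
f(x) = x * A
f(2)  # 20
```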

t and anew are local variables of f, not outer variables. This is generally true in programming languages.

Nope. Saying that classical OOP could be done even though the lang has a few different rules is different from saying that those rules are classical OOP.

The first half is clear; but I don’t see how you conclude the second half.
Please give a small example if you want me to understand you.

Those functions were to be accessed not as inc() and dec() but as obj.inc() and obj.dec(). It matters more than just being able to access them at all: obj there plays the role of the argument encompassing the changing state. The value of obj changes, thus it’s no surprise that obj.get() produces a different effect. However, the example there was just a hack (as a reply to a hack shown by someone else), which matters as well.

I wasn’t after any of these. (Though the new fun. def. semantics might help with reasoning with these or implementing simpler rules).

There can be outer variables of the same name. This is generally true in programming languages.

Of course it could be a call-time argument, but that’s inherently distinct from a definition-time value, and your description and example was clearly the latter. It was impossible for another instance with the same type as obj to use the same methods, which is incredibly limiting and why OOP doesn’t do it that way.

True, but that wasn’t in your example and they wouldn’t be relevant within the scope of f.

Can you quote the exact second half, I’m not sure which sentences or clauses you want the example to show. I think I can make a derivative of the very first example depending on the part.

I don’t think that this is possible. Note that the result of f does not depend only on its input parameters. print implicitly uses stdout, and depending on whether that points to Core.stdout (the real stdout), to devnull, or to something not existing / invalid / not writable, f will do totally different things and might not even return.

You could argue that at least if f returns, it will always return 9, but it’s trivial to change your example with a try/catch so that the return value depends on whether print throws or not, and thus on the implicit stdout.

You could think of a loophole to move this problem into the print method, but as the same rules apply there, this would not be a way out, as print could have been called by f and g which had different values for stdout when first called.

So you would need to define that stdout is fixed as soon as one method call has happened, which would effectively eliminate all global variables, making them const. Most of the time this is the recommended style anyway, but there are the exceptions listed above.

So if you want to have a guaranteed input-output relationship, I think there is no way around making all state explicit. In mathematics, this is part of the definition of a function.
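As a sketch of that fully explicit style (report is a hypothetical function; the point is that the output stream is a parameter, not ambient state):

```julia
# every dependency, including the output stream, is passed explicitly
function report(io::IO, x)
    println(io, "value = ", x)
end

# writing to an in-memory buffer instead of the global stdout
buf = IOBuffer()
report(buf, 9)
String(take!(buf))  # "value = 9\n"
```

With the stream as an argument, calling report with the same arguments always has the same effect on that stream.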


Thanks, those are important points.
Is it possible to change what stdout points to programmatically in Julia? From googling a bit, Core.stdout would be in C (or C++).

No, Core.stdout is const, so you can’t change it (regularly). However, Base.stdout, which is the version which is implicitly used by print, can be changed:

julia> using About

julia> Base.stdout |> about
Base.TTY (mutable) (<: Base.LibuvStream <: IO <: Any), occupies 72B directly (referencing 248B in total)

julia> Base.stdout = devnull
Base.DevNull()

julia> Base.stdout |> about
Base.DevNull (<: IO <: Any), occupies 0B.
singleton

julia> println("Now printing to devnull")

Please note the absence of the output for the last command.

However, println really only is an example. As long as there are global variables (in contrast to global constants), they can be changed and could have different state in called methods depending on how they would be reached with your fix-on-first-call proposal.


But look, from I/O and Network · The Julia Language:

print([io::IO], xs...)

Write to io (or to the default output stream stdout if io is not given) […]

So it’s as if it was defined like so:

function print(io=Base.stdout, xs...)
   # the work to write to io
end

Thus, the dependence on the global Base.stdout is not really implicit, I mean not via a (hidden) free variable in the function body.
So at least at the definition point it is explicit, as asked by “my rules”; it’s just that when calling, writing that argument is optional.
This seems to illustrate that “my proposal” (you said so!) is … workable, and also “saves” the functions that need to do read operations from the exterior.

It’s actually fix-on-definition (fix the vals of free variables at function definition time).

By the way, you could probably try these semantics with Julia, although it will be quite some work:

It should be possible to define an Aqua.jl test to check that no global variables are used. You could use this to guarantee, that all state change is explicit in certain modules.

Alternatively, you could define a macro with which you prefix your module definitions. The macro could change all function definitions in the module to encapsulate them in a let as defined above if they use globals, effectively fixing all function values in that module.

With this, you could effectively use Julia to define a language with these semantics and test how that would work in reality.

In Julia, you can’t define a method with optional parameters. Your code defines a function with two methods (documentation). The method defined without the io::IO argument does not know that there are other methods which have this argument.

Again, print is only an example. It would be possible to define only the print method without io::IO, making the dependency on Base.stdout not only implicit on method level, but also on function level.
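In other words, a default first argument (if it were allowed there) would just be sugar for two separate methods, which you can write by hand (myprint is a hypothetical stand-in for print):

```julia
# explicit-io method: the stream is a parameter
myprint(io::IO, x) = println(io, x)

# convenience method: the dependency on stdout is implicit here,
# hard-coded in this method's own body
myprint(x) = myprint(stdout, x)
```

The one-argument method has no way of knowing about the two-argument one; it simply hard-codes stdout.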


The method straight up does this: print(x) = print(stdout, x).


Yes, I mean the main rule throughout.

Here we go

julia> g() = 17  # Some global function
g (generic function with 1 method)

julia> f() = 2 * g()  # Refers to global g
f (generic function with 1 method)

julia> f()
34  # Hmm, does not look right

julia> g() = 21  # g was the culprit, fix here
g (generic function with 1 method)

julia> f()
42  # Great, fixed

From what I understand, in your semantics just redefining g would not change f until you redefine it as well … I would be very annoyed by that in interactive work.

PS: Seems like global functions are treated different from global variables in that neither f = let g = g; () -> 2 * g() end nor @eval f() = 2 * $(g)() are fixing the definition of g to the one in place when f is defined, i.e., the above example still works. So, it might be hard to implement your semantics for global function references in Julia. Note that both forms, do what you intend when g is defined as a (non-constant) global variable g = () -> 17.
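A sketch of what the PS describes (the key point being that g names a generic function object whose method table is mutated in place, so capturing it does not freeze the old definition):

```julia
g() = 17
f = let g = g        # captures the generic function *object* bound to `g`
    () -> 2 * g()
end
g() = 21             # adds/replaces a method on the *same* object
f()  # 42 -- the capture did not freeze the old definition
```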

Sure, in this small example it’s easy to see. The point I wanted to make is that in a large code base global variables could be defined in a different file etc, breaking local reasoning anyways. Imho, it does not matter much when they are looked up, i.e., definition or call time, as in both cases the global value at the right time matters.

That was precisely the point here: there is no value at the time of definition and thus it must be considered at call time. Thus, your rule needs an exception – it is not universal – whereas the current semantics just work.

Sorry, I was not clear here: print uses stdout implicitly. Thanks to @PatrickHaecker for clarifying this point.

Overall, you seem to not agree with the arguments put forward in this thread. Which is fine and improving language semantics is important. Imho, at this point it does not buy you much though and the right take would be to not use non-constant global variables. Some languages (Haskell) actually do that, but unfortunately globals can be so convenient at times …

You argue for transparency such that

If you called same myf(42) repeatedly, at any time, you should be guaranteed to get the same effect (return value or side effect … ).

But what is the same side-effect? E.g., printing to the screen always depends on the current runtime state of the screen and adds something to the screen. You can obviously say, it still prints the same thing, but nevertheless the context is different and it’s no longer the screen as when the function was defined. In that sense, looking up a value at runtime makes global memory work like any other side-effect … inheriting all benefits and problems of side-effects.

A big part of why this conversation is stuck is the demand for “transparency”, which, contrary to most of our intuition about access, is actually some subset of function purity that has varied wildly and isn’t at all specified by definition-time value binding (I didn’t find FVEDT intuitive, sorry). As long as that remains ill-defined, so will the conversation.

To take the conversation all the way back to the basics, the reason why fun()=a is expected to change with reassignments of a while b=a isn’t is entirely because of semantics typical of programming languages. b=a takes the value assigned to a and assigns the value to b; it doesn’t reference the variable a itself (and Julia doesn’t have language-level pointers or references to do this among variables). That is a one-and-done evaluation. On the other hand, method bodies evaluate every time they are called, not when they’re defined; fun() thus accesses the variable a every run. fun()=a is not an “assignment to a function” as the original post commented, it’s just function definition syntax that happens to use the = character.
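The contrast described above can be shown in a few lines of plain Julia (any language with similar semantics behaves the same way):

```julia
a = 1
b = a        # evaluated once: b holds the value 1 from now on
fun() = a    # the body runs at *each call*, reading the global `a` then
a = 2
b      # 1
fun()  # 2
```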

Since metaprogramming can already bypass variables to bind values to the method body at definition-time @eval fun() = $a, there is zero benefit to making a language where fun()=a contradicts every other programming language in existence to do the same thing.


@PatrickHaecker
Thanks for breaking my illusion that print(1) in Julia (and also many other languages) would always do the same thing. In its definition:

print(x) = print(stdout, x)

– indeed, a free variable. To avoid that, we would need to require users to always use the form print(io, x), but that’s verbose. What to do…

Yes, and taken to the extreme, it would be possible to do all programming with functions that don’t have parameters at all.

For print(x), why can’t it be defined so that it always prints to the same (constant) output stream? If I want in my program to print somewhere else, then, at that moment, I could use the method print(io, x).
Instead of relying on, or accepting that, some gremlins will do the change indirectly via global vars.

Similarly for other IO functions.

The general idea is that of a default value (or set of values). If you want to use a different value, then specify it explicitly via an additional argument.
Can’t this work for the rand family and all the others?

If it can, it would be neither verbose, nor via implicit dependency. It would reduce the so called “spooky action at a distance”.
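For what it’s worth, the rand family in Julia already supports exactly this style: the generator state can be passed as an explicit first argument instead of being hidden global state.

```julia
using Random

rng = MersenneTwister(42)   # explicit, reproducible generator state
x = rand(rng)               # no hidden dependence on the global default RNG

rng2 = MersenneTwister(42)  # same seed, same explicit state...
x == rand(rng2)             # ...same output
```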



Yes, that’s right, thanks for the example.
But as you say, it’s only at the REPL; you only need one extra redefinition (of f) with a few keystrokes, and you do the same thing when you redefine variables that are defined in terms of other variables.

I would not say it’s an exception, but that the rule as initially formulated was incomplete.
It didn’t deal with cases where the free variable cannot be matched with a variable that already has a value – like what happens above.

Or like what happens next: value of obj doesn’t exist yet at the moment of definition of get:

obj = (let a = 1 
    get() = obj.a  # obj is free var
    (; get, a)
end)

A possible general rule that seems natural is to fix the value of a free variable as soon as the matching variable in an outer scope itself gets a value.

One could choose another later moment for this fixing (say, at the first call of the function, as @PatrickHaecker suggested, intentionally or not) – and would still serve the main purpose of disallowing implicit dependence on external variables – but I think the downside is that the later the fixing, the harder to predict what value that free var will take, in general.

EDIT:
Probably the simplest general rule is to just not allow free vars that cannot be resolved to values at definition time. Which obviously kills the use patterns above, but might have other benefits in simplicity/reasoning.

Not an easy question :slight_smile: . My current opinion is that we restrict the meaning of “same effect” to what’s defined, controllable, and observable in the given programming language.

A simpler example: if a function is supposed to only return the integer 14, then the effect is considered “the same” if, at the end of function execution, 14 is on top of the “stack”. Or, simpler, if the function call can be used as an argument for any other function as if it was given 14 directly. But at exactly which address on the stack that happens… and other such low-level details are not within the purview of the given (high-level) language.

Similarly for print(14): the effect is “the same” if a 14 is printed (or at least sent to the OS to be printed, with success reported) at the end of the string already printed on the screen. That the state of the screen may now be a bit different than at the last call time is a detail to disregard, from the perspective of the given language and what’s controllable by the given definition of print(x).

The way I see it is that reading a value from global memory is just getting input information; that input can change the effect (output, in a more general sense than just the return value) if it is used in some function.
But maybe you mean more complex scenarios like when in order to read global value, need to “stop the world”/ lock the access and that changes the (observable, in this language) effect of other parts of the program.

Let’s call it input transparency, or explicit dependency on inputs: no hidden inputs to change the effect/output (output in a more general sense than just return value).

Yes, it’s (only) a subset of restrictions for what is classically meant, in programming, by “function purity” - but that is an advantage.
You’re not forced to always track/declare state (in “out” parameters), or to avoid changing it, or to learn monad theory, etc.

But you still adopt the essence of the mathematical notion of a function: that for the same given inputs, the output should be the same, at any call.
Just generalize “output” to mean not just a return value but any defined “effect”. An “output action”, if you will.