Functions with independent (global) scopes

Hi, recently I was working on a project for which I had to learn a bit of C, and one thing I came to value quite a bit was that main() is itself a function, so its variables are not visible inside an arbitrary function() that I define in the same file. If functions in Julia shared this property, I think it would have saved me from quite a few head-scratching moments, wondering why my function returns something unexpected, only to find out that I had modified some variable it uses while thinking all external variables were passed to it through arguments.

That is not to say I can’t appreciate how scope works in Julia: it’s great for fast development, and if I want to define a separate global scope for my function, I can put it in its own module. I just wish there were a much more concise way to isolate the scope of a function than this:

# some code before the abomination

module F
export f

f(a,x) = a * x

end
using .F

parameter = 3
data = [4.5, 3.2]

y = f.(parameter, data)

# the code continues...

You are not forbidden from defining a main function in Julia; it just isn’t mandatory. In fact, some would consider it good style.

Isn’t the scoping pretty similar to C? In C functions can access variables from the global scope and the local scope they are defined in just fine:

#include <stdio.h>

int main() {
  int i = 1234;
  int f() {
    return(i);
  }
  printf("%d", f());
}

or even

#include <stdio.h>

int i = 1234;

int main() {
  printf("%d", i);
}

will print 1234. It’s even a little bit worse in my opinion, because in C functions can modify global variables

#include <stdio.h>

int i = 1234;

void f() {
  i += 1;
}

int main() {
  f();
  printf("%d", i); // prints 1235
}

while Julia errors here, unless one declares global i in the body of f.
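For comparison, here is a minimal Julia sketch of the same counter, assuming it is run at top level in a fresh session. The names counter, bump_without_global! and bump_with_global! are made up for illustration.

```julia
counter = 1234

# Assigning to `counter` inside the body makes it a new local variable,
# so reading it for `+=` fails before anything has been assigned.
function bump_without_global!()
    counter += 1
end

# Declaring the name `global` rebinds the top-level variable instead.
function bump_with_global!()
    global counter += 1
end

err = try
    bump_without_global!()
catch e
    e
end

println(err isa UndefVarError)  # true
bump_with_global!()
println(counter)                # 1235
```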

But more to the point, you are comparing different coding styles, not languages here.
While in C you tuck all variables away in the body of main, you dump them in the global scope when writing Julia.
One simple solution: Don’t. Write a main function just as you would in C.
Even simpler perhaps: Avoid global(!) variables in functions. Pass everything explicitly.
Global variables easily end up killing performance. Functions relying on global state also make for hard to track down bugs as you have already discovered.
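A minimal sketch of that advice, reusing f, parameter and data from the original post. Wrapping them in main() is the only change, and it is just one possible layout:

```julia
# Helper that only sees what it receives through its arguments.
f(a, x) = a * x

function main()
    # Both variables are local to main, so no other function can
    # pick them up by accident.
    parameter = 3
    data = [4.5, 3.2]
    return f.(parameter, data)
end

y = main()
println(y)
```

After this runs, only f, main and y exist at top level; parameter and data never enter the global scope.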


Maybe I should have described my workflow a bit more. Quite often I use the REPL to develop. Specifically, from my editor, I can execute a command which sends the currently selected text to a running Julia REPL. This lets me build up the whole script a couple of lines at a time without having to run the whole thing, which is really handy when my code contains a time-consuming simulation or optimization. Having a main function would break this workflow…

As for the scoping similarities: while gcc compiles the first example you provided, clang does not, since “ISO C99 and later do not support implicit function declarations”. Your third example is then exactly the reason why needlessly defining global variables is considered bad practice.

So yes, my goal is to generally pass all variables to my functions explicitly, for the reasons you mentioned, which is a matter of coding style in languages like Julia. I just think it would be pretty neat if there were some utilities that would help you do so. Maybe something like a macro that throws an error upon function definition if the function reads variables in the global scope, e.g.:

@explicit function f(x)
	# code
end

One thing to be aware of is: when you write

f(x,y) = cos(x)*sin(y)

f reads cos and sin from the global scope. (BTW, the same is true of C, where most calls refer to functions that have been defined in some outer scope.) But you wouldn’t want that to error.

Reading variables from the global scope is fine, and I don’t think it can be avoided. On the other hand, writing to variables (or more precisely, re-binding names) in the global scope is sometimes convenient in interactive contexts, but causes all sorts of difficult-to-debug issues and performance problems. But Julia will already prevent you from doing this in non-interactive contexts.


That’s a good point, but I have to argue that, to my limited knowledge, Julia already strongly encourages the separation of state (in DataTypes) and its modifiers (Methods), unlike OO languages. Therefore, deciding what the function should and should not be able to read, after enabling this option, could be quite straightforward.

I’m not sure I follow… If you write:

foo(x) = 1
const C = 2

bar(x) = C * foo(x) * cos(pi*x)

how is it straightforward to decide what bar “should and should not be able to read”?

I suspect at least cos (a global name that happens to be bound to a function) and pi (another global name that happens to be bound to a constant) fall into the “should be able to read” category.

I don’t think you’ve got much of a choice for foo either: although it’s a user-defined global binding, bar has to be able to read it, right?

What about C? If bar “should not be able to read” it, why? How is it any different from pi or foo?


To be clear, I do agree that it’s good practice to pass all data to functions via arguments, and one should generally avoid global state. Not just in Julia, but also in other languages. However, I don’t think you can automatically enforce this, because there is always some form of global state that you need to read, even if in an ideal case it would be reduced to the set of functions you want to call in your code.

Rebinding global variables is a different beast altogether, and I’m glad Julia helps detect this (even though only in non-interactive contexts).

Ultimately, it comes down to the developer to ensure that no global variables are used needlessly. And I’d note that, when writing code in modules, the best way to achieve this would simply be to not define globals in the first place…

One last thing: depending on the tools you use, the linter can also help you spot undefined variables used in your functions. (This does not prevent you from using the REPL interactively)


I see where you are coming from, and I also get bitten by this from time to time when incrementally building up code, i.e., starting in global scope to see how things might work and then wrapping it into a function later.
Unfortunately, as others have explained nicely, there might not be a good technical solution. At least for your own code, though, a social one might work, namely a naming convention that makes global variables stand out and easier to spot, e.g., using earmuffs as in *my-global*.

You can spot global variables in functions with reflection, even something as simple as:

julia> foo() = sin(x) # oops I forgot to write `foo(x)`
foo (generic function with 1 method)

julia> @code_warntype foo()
MethodInstance for foo()
  from foo() in Main at REPL[12]:1
Arguments
  #self#::Core.Const(foo)
Body::Any
1 ─ %1 = Main.sin(Main.x)::Any
└──      return %1

But you need to distinguish the intended globals like Main.sin from the unintended globals like Main.x. You don’t have to do this much work if you get into the habit of writing all the necessary arguments into a function header; it’s a good habit in any language.
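The same check can be scripted. The following is only a rough sketch: suspicious_globals and collect_globalrefs! are hypothetical helpers, not an existing API. It assumes that unoptimized typed code represents global reads as GlobalRef nodes (as in the @code_warntype output above), which may vary between Julia versions, and it treats any binding that is undefined or not const as suspicious.

```julia
# Recursively gather every GlobalRef that appears in a statement.
collect_globalrefs!(acc, ex) = acc                       # literals, slots, SSA values
collect_globalrefs!(acc, ex::GlobalRef) = push!(acc, ex)
function collect_globalrefs!(acc, ex::Expr)
    foreach(arg -> collect_globalrefs!(acc, arg), ex.args)
    return acc
end

# Names of globals used by `f` that are undefined or not `const`.
# Functions and constants such as `sin` or `pi` are const bindings,
# so they pass the check.
function suspicious_globals(f, argtypes::Type{<:Tuple} = Tuple{})
    refs = GlobalRef[]
    for (codeinfo, _) in code_typed(f, argtypes; optimize = false)
        foreach(stmt -> collect_globalrefs!(refs, stmt), codeinfo.code)
    end
    names = Symbol[]
    for ref in refs
        ok = isdefined(ref.mod, ref.name) && isconst(ref.mod, ref.name)
        ok || push!(names, ref.name)
    end
    return unique(names)
end

threshold = 0.5                 # non-constant global (illustrative)
above(v) = v > threshold        # reads it without taking it as an argument

println(suspicious_globals(above, Tuple{Float64}))
```

This only inspects the method it is given, so globals read inside callees go unnoticed.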


Regarding a function named main, see (on hold): If you have a function called `main`, you may need to tweak it

You may be interested in the contextual module REPL introduced in Julia 1.9. This lets you change the module from Main to an arbitrary module, effectively changing your global scope.

https://docs.julialang.org/en/v1/stdlib/REPL/#Changing-the-contextual-module-which-is-active-at-the-REPL

Julia modules are a top level encapsulation. I believe you can effectively accomplish your goals by changing the module context.

Here is a verbose demonstration.

julia> x = 5
5

julia> varinfo()
  name                    size summary
  –––––––––––––––– ––––––––––– –––––––
  Base                         Module
  Core                         Module
  InteractiveUtils 529.576 KiB Module
  Main                         Module
  ans                  8 bytes Int64
  x                    8 bytes Int64

julia> using REPL

julia> module Foo end
Main.Foo

julia> REPL.activate(Foo)
                                                        
(Main.Foo) julia> varinfo()
  name size summary
  –––– –––– –––––––
  Foo       Module

(Main.Foo) julia> x
ERROR: UndefVarError: `x` not defined

(Main.Foo) julia> x = 6
6

(Main.Foo) julia> x
6

(Main.Foo) julia> using ..REPL

(Main.Foo) julia> REPL.activate()

julia> x
5

julia> Foo.x
6

I think that your analysis of the example you provided is spot on. Just to expand on functions: bar(x) = 3 * x would certainly also be fine, baz(x) = C * x probably also, but a = 3; qux(x) = a * x would certainly not, as a is mutable. To allow for baz, constants would have to fall into their own category (separate from state and functions), which makes sense but which I hadn’t thought of before.
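To make the difference between baz and qux concrete, here is a small sketch, assuming it runs at top level. C, baz, a and qux are the names from above; before and after are added only for illustration.

```julia
const C = 2
baz(x) = C * x      # depends only on a constant binding

a = 3
qux(x) = a * x      # silently depends on whatever `a` is bound to right now

before = qux(2)
a = 10              # rebinding the global changes what qux returns
after = qux(2)

println((before, after))   # (6, 20)
println(isconst(@__MODULE__, :C), " ", isconst(@__MODULE__, :a))
```

A checker of the kind I have in mind would accept baz and reject qux, for example by using isconst to tell the two bindings apart.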

There might be more kinks like this in my idea, but I still believe that in the end, having a well-defined state, components of which have to be passed as arguments, is achievable. That being said, I now agree that enforcing this in Julia would have to be accompanied by large changes to the code-base.

Speaking of linters, something that could be reasonable is a linter option that warns you any time you break either of two rules

  1. you mutate a name that you imported from a module (a different global scope)
  2. your function reads a mutable variable defined in your current global scope, that has not been passed as an argument.

Just so as not to mix up terms, “mutable” is a technical term that applies to types, not assignments - here, a is assigned to an Int, and Ints are not mutable. The assignment a is not constant, so could be reassigned, which is I think what you meant - this matches a colloquial definition of mutability, but not the technical one.

I bring this up only because I confused these two concepts (mutability and reassignment) for a long time. Here are some examples that helped me get the difference:

a = b = (1,2,3) # not mutable 
x = y = [1,2,3] # mutable

push!(a, 4) # error
a = (4,5,6) # reassignment 

@show a
@show b # b is not changed when reassigning a

push!(x, 4)

@show x
@show y # y refers to the same underlying object, so mutating x mutates y

x = [4,5,6] # reassigned to a different mutable container

@show x
@show y # y was not reassigned

Okay, I’ve since learned that nested functions in C are bad, at least the way GCC implements them. Though clang seems to complain about the wrong thing here. The correct error is IMHO

$ gcc -Wpedantic --pedantic-errors -std=c99 test.c -o test
test.c: In function 'main':
test.c:5:3: error: ISO C forbids nested functions [-Wpedantic]
    5 |   int f() {
      |   ^~~

But that’s just a side note. You certainly wouldn’t want to forbid function nesting in Julia.

Same applies to Julia.

On the rare occasions I do want to keep a global state, I resort to declaring a const mystate = Ref(...).
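A minimal sketch of that pattern. The Int payload and the record! function are illustrative assumptions; any value can go inside the Ref.

```julia
# The binding `mystate` is constant, so its type is fixed;
# only the value stored inside the Ref changes.
const mystate = Ref(0)

function record!(n)
    mystate[] += n      # mutates the Ref's contents, no `global` needed
    return mystate[]
end

record!(2)
record!(3)
println(mystate[])      # 5
```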

A warning from the linter would be great, but it’s not so easy to get right I think, because you can declare new globals from within functions, not just at the top level. ()->global a=1; declares a global when it runs, but whether it runs might be hard to figure out. The linter could err on the side of caution and report a warning regardless, but false positives are a nuisance.

I think it makes sense to have a macro that causes a function to error if it uses any globals that are not callables.
This is pretty easy to work out at the typed IR level.
But practically impossible to work out at the AST level.
So it really couldn’t be a normal macro.
