(on hold): If you have a function called `main`, you may need to tweak it

Huh, seems all Julia-nightly CI is broken now. I suppose this is a good example of the odd issues this will cause.

Run julia --color=yes "$GITHUB_ACTION_PATH"/add_general_registry.buildpkg.jl
[ Info: The General registry already exists locally
ERROR: MethodError: no method matching main(::Vector{String})

Closest candidates are:
  main(; n, max_delay)
   @ Main ~/work/_actions/julia-actions/julia-buildpkg/latest/add_general_registry.buildpkg.jl:53

Why not first check hasmethod(main, Tuple{Vector{String}}) (or applicable(main, ARGS)) before calling it? It would still break cases where there is a main function that expects an input, but it would solve the (I'd guess much more common) no-arg main problem.
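A minimal sketch of that kind of guard, using Base's `hasmethod` (the value-based equivalent is `applicable`). The names `entry` and `maybe_call` are hypothetical and only stand in for whatever the runtime would do; this is not how the actual PR implements it.

```julia
# Stand-in for a pre-existing script function like the one in the log above:
# keyword arguments only, so there is no method accepting a Vector{String}.
main(; n = 1, max_delay = 0) = "ran with n=$n, max_delay=$max_delay"

# A second, hypothetical function that does accept the argument vector.
entry(args) = length(args)

# Hypothetical helper standing in for the automatic call made by the runtime.
function maybe_call(f, args::Vector{String})
    # hasmethod asks about the signature; applicable(f, args) asks about values.
    hasmethod(f, Tuple{Vector{String}}) || return nothing
    return f(args)
end

maybe_call(main, ["--flag"])   # skipped: returns nothing, no MethodError
maybe_call(entry, ["a", "b"])  # called: returns 2
```

As the post says, this only covers the no-arg case: a pre-existing `main(x)` with an untyped argument would still be called.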


I had brought this up on the triage call a while ago: if we use __main__ instead of main for this, then this wouldn't be a breaking change, since __ names are reserved. It also helps to hint that something special and potentially surprising might be going on; otherwise it's a little strange that some regular function is just automatically called. The name __main__ being automatically called is a little less surprising and somewhat analogous to __init__.


I especially agree with using __main__ because it visually signals that there is potential weirdness, as with __init__.

You can guess that from just looking at it - why else would anyone use those underscores. With main there has to be some transmission of knowledge either from docs, error messages or someone else who knows.


I didn't know this. It should be mentioned in the docs (I could not find it).

Otherwise, totally agree.


This type of automatism is not needed in my opinion.

Actually, (first) I would like to see adequate compilation to standalone apps before we think about automating it. Second, I want this feature in Julia itself and not in a package. Given that, I want (third) a reasonably easy workflow to create a not-too-large standalone app.

If this workflow is fine and satisfactory there is no need for automatism (or it is easy to be achieved).

Still, maybe the standardized form of the entry point needs to be defined now (but I doubt it). In that case, do it in a non-breaking way.

(Anyways it will not break my code :wink: )

Couldn't we use a macro for this? Something like:

@entry function main()
   ...
end

Which will expand into:

function main()
   ...
end

const __MAIN__ = main

__MAIN__ here is just an example, it could be anything.
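A rough sketch of how such a macro could be written today, assuming it only needs to handle the plain `function name(...) ... end` form. `@entry` and `__MAIN__` are the hypothetical names from the post, not an existing Julia API.

```julia
# Hypothetical macro: emits the function definition unchanged and additionally
# binds the function to the constant __MAIN__ (an example name only).
macro entry(fdef)
    # Only the long `function name(...) ... end` form is handled in this sketch.
    fdef isa Expr && fdef.head === :function ||
        error("@entry expects a `function ... end` definition")
    sig = fdef.args[1]   # the call signature, e.g. :(main())
    name = sig.args[1]   # the function name, e.g. :main
    return esc(quote
        $fdef
        const __MAIN__ = $name
    end)
end

@entry function main()
    return "hello from main"
end

__MAIN__()  # calls main through the marker constant
```

A runtime could then look for `__MAIN__` instead of guessing from the name `main`, which leaves existing functions called `main` untouched.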

I'm not a fan of __main__ because it reminds me of Python :sleepy:.


That would be my preferred option too from a user's perspective.

Were there any (convincing) counter arguments against the use of __main__ other than (from the PR; emphasis mine)

The existing __ function that we have __init__ is something that ideally users never have to use, whereas this is something that we expect people to use regularly. Of course, there is some possibility of unexpected behavior around the transition if some pre-existing main is called unexpectedly, but presumably that can be addressed by a sufficiently prominent note in the release notes.

I kind of see the point the first part makes, but tend to disagree with the emphasized part, and to be completely honest, find it a bit nonchalant. Why go that route when there are non-disruptive and backwards compatible alternatives?

Big thumbs up for the feature in general though!


But that's not really a valid argument against __main__? I mean, if users already had to use __init__ we would have an argument in favor of __main__. It doesn't follow that we should avoid __main__ because users don't have to use __init__!

On the contrary, I think it's nice that users will know __main__, so when they look at package code and see __init__ they can rightly guess that it's called automatically.


Could this simply be called Main.__init__? Its docstring seems to almost fit this new __main__ feature already, except for the ARGS argument:

help?> __init__

  The __init__() function in a module executes immediately after the
  module is loaded at runtime for the first time. It is called once,
  after all other statements in the module have been executed. Because
  it is called after fully importing the module, __init__ functions of
  submodules will be executed first. 
  [...]

(just an idea, not arguing in favor of it over __main__)


A normal-looking name like main with weird behavior seems very risky (especially a popular name like that). A function with very special behavior ought to have a special-looking name that signals that something out of the ordinary is going on, just like macros are set apart with the @ symbol.

__main__ seems like a serviceable choice, and even though I'm not particularly a Python fan, choosing the same convention right there is probably advantageous.


I'm warming up to __main__. My main thoughts against it are as follows.

  1. It is not clearly documented that __main__ is reserved. We should fix this.
  2. We already have __init__. Why not just use Main.__init__(args)?
  3. Calling julia MyModule.jl already runs MyModule.__init__(). We should just reuse this for the module Main.

My other concern is that we are solving a single entry point problem, whereas we should probably consider the multiple entry point problem.

While Main.__main__ may be the default entry point perhaps MyModule.__main__ could be an alternate entry point. Main should not be very special.

Rather, what I need to know is whether the current module is being executed as the "program module".

macro __ismain__()
   # PROGRAM_MODULE is a proposed global; Julia does not define it today
   return __module__ == PROGRAM_MODULE
end

Now I could just do

module MyModule
    function __init__()
        if @__ismain__
            # Execute program
        end
    end
end
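Since PROGRAM_MODULE is only a proposal, a rough approximation with what exists today is to compare files instead of modules, using the `abspath(PROGRAM_FILE) == @__FILE__` check from the Julia manual. The sketch below assumes that file-level granularity is good enough; `@ismain` and `ran_as_program` are hypothetical names.

```julia
module MyModule
    # Hypothetical macro: true when the file containing the macro call is the
    # script that was passed to `julia` on the command line.
    macro ismain()
        # __source__.file is the call site's file, captured at expansion time
        # (the same information @__FILE__ uses).
        file = string(__source__.file)
        return :(abspath(Base.PROGRAM_FILE) == $file)
    end

    # Records the outcome so it can be inspected after loading.
    const ran_as_program = Ref(false)

    function __init__()
        if @ismain()
            ran_as_program[] = true
            # Execute program
        end
    end
end
```

Unlike the module-based proposal, this cannot tell two modules in the same file apart.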

If a module has both an __init__ and a __main__, in what order would they execute?


I would argue that __init__ is the wrong function name for this. __init__ by name does not specify that it's going to do anything other than initialize some things (which is how it's currently used), whereas __main__ sounds like it's going to run a whole bunch of application-specific stuff (which is how main, run, etc. are currently used).


If the init in __init__ means initialize to you, I can see that. It could also be seen as the "initial" function though.

Moreover, it already acts like an entry point.

$ cat Hello.jl
module Hello
    __init__() = main()
    main(args=ARGS) = println("Hi", args...)
end

$ julia Hello.jl
Hi

$ julia Hello.jl " Julian"
Hi Julian

I would argue that using __init__ is breaking, or at least that it is not obvious how to use it when __init__ already serves another purpose. I am using __init__ in a module for other one-time initialization requirements. It seems to make much more sense to use a specific function name for this specific requirement, so I think __main__ would be more straightforward.

Maybe I am wrong and would just have to add a few lines to the end of my existing __init__ function or add another __init__ function at another location.


I was actually surprised by this comment by Keno:

"As mentioned earlier, the whole idea of this change is that it's the first thing people new to the language will learn, so anything that relies on things that you're not intending to talk about in the first three paragraphs of a 'getting started with julia' tutorial isn't gonna work."

Then I think a better explanation of what this is needs to be provided, at least for users like me. I don't see myself teaching this to any new Julia user, nor to relatively advanced ones. My impression is that this is for quite a niche use: people who need to build compiled binaries for distribution, who will always be advanced users. I don't see making this feature "simple for new users" as very important, and even less so if it is breaking, if that is an issue.


My point was that __init__ can act like a main for a module today, without any changes. I think @rfourquet was also picking up on this same line of thought.

The more complete incantation from the docs is as follows.

module Hello
    function __init__()
        # some common initialization code
        abspath(PROGRAM_FILE) == @__FILE__() && main()
    end
    main(args=ARGS) = println("Hi", args...)
end

Notice that you can change the condition abspath(PROGRAM_FILE) == @__FILE__() to match any file, not just the current one.

$ tree Hello/
Hello/
├── Project.toml
├── scripts
│   └── runme.jl
└── src
    └── Hello.jl

$ cat Hello/src/Hello.jl 
module Hello
    function __init__()
        # some common initialization code
        abspath(PROGRAM_FILE) == joinpath(
            dirname(@__DIR__),
            "scripts",
            "runme.jl"
        ) && main()
    end
    main(args=ARGS) = println("Hi", args...)
end

$ cat Hello/scripts/runme.jl 
using Hello

$ julia --project=Hello Hello/scripts/runme.jl
Hi

$ julia --project=Hello -e "using Hello"
# <no output when used as a library>

Anyways, this is becoming a tangent. The choice seems to be between main and __main__, with main + hacky heuristics apparently prevailing.

I personally would prefer __main__, but I think that it should be clearly stated that this executes after __init__.

What I'm confused about is why we are putting more effort into Main the module to support system images rather than focusing on modules in general and taking advantage of pkgimages.


I honestly don't see any real benefit in "unifying" compiled and script workflows. They're different things serving different purposes. Or if they are to be unified, you should make the compiled case more like the script case, rather than vice versa. The current semantics are nice: it's just like typing the code at the REPL, or include()ing it. Read and execute the code, whether that execution is defining things or running. Changing that to "read and execute, and then execute another magic function" just doesn't seem helpful.

In order of preference, I'd like

  1. Being able to specify entrypoints in Project.toml, which gets us multiple for free, and lets us change exactly how that cashes out in terms of executables, etc.
  2. The pre-merge status-quo until people have actually built out and explored the compilation space more.
  3. __main__, which at least warns that this is something magical.
  4. __init__, which isn't the best name, but is already used for extremely similar purposes.

And finally, about ten further places down, snagging plain old main from user control.

(Note that 2. and 4. are extremely similar; it's just a question of how much boilerplate needs to be included (2) vs. happens automatically (4).)

I might even swap 1 and 2.


I also am curious if this is documented somewhere.


Yeah, I'm not sure if it is. It's a convention from other languages, but it should be documented.