PSA: Use modules for dirty work preparing scripts, debugging, etc

Let’s say I’m preparing a script, e.g. to analyze data in a project. The recommended practice is to put as much code as possible in functions, keeping the global scope clean. But while I’m designing those functions (or debugging them after they have been written) I need to see what’s happening step by step, with some example data. Of course, there are good tools for that - Debugger.jl (or VS Code’s built-in debugger), Infiltrator.jl, etc. But they need some setup, and sometimes it is tempting to do quick-and-dirty copy-pasting in the REPL, messing up the global scope.

Another conflict with Julia’s good practices: “if you need global variables, make them const”. OK, but when I’m in the process of writing the script, I often find out that the values of such constants should be modified, and I have to restart the session to redefine them in a safe manner.

Both issues, and similar ones, can be addressed by doing the prototyping or debugging work in a module other than Main, so that once you have finished the work (or have messed up its global scope too much), you can discard it and start again with another module, without restarting the whole Julia session. This comes for free if you are developing a package, since packages have their own modules (with the bonus provided by Revise.jl, if you use it), but it is also easy to do outside packages: in the REPL since Julia 1.9, and in VS Code for much longer.

I know that this is not news, but this feature is often presented as something to facilitate the development of packages, and I’m sure that there are many users who do not consider themselves “package developers”, who can also take advantage of this, so perhaps they might benefit from this “personal self-advice”.

My workflow for this kind of thing is:

  1. After starting the REPL, create a playground module, e.g. module MyPlayground end.
  2. Move the REPL to that module, typing MyPlayground and hitting Alt+m. (In VS Code, also change the current module in the bottom bar.)
  3. include the script with the code written so far (loading packages, defining functions, constants…). In the first iteration, this might be empty.
  4. Work on the script as usual, messing with the global scope if necessary.
  5. When I feel like cleaning the global scope, return to Main (same procedure as #2).
  6. Go to #1 with a new module, e.g. MyPlayground2, and repeat.
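
The steps above can be sketched as a plain script. This is only an illustration: at the REPL, Alt+m does the module switch for you, whereas here @eval with an explicit module is used so that the sketch runs non-interactively. THRESHOLD, scratch, above and the file name in the comment are hypothetical.

```julia
# Step 1: a throwaway module whose globals live apart from Main.
module MyPlayground end

# Steps 2-4: run "dirty" prototyping code inside the playground.
# Each line is evaluated in MyPlayground, as if typed at a REPL
# that has been switched to that module.
@eval MyPlayground const THRESHOLD = 0.5
@eval MyPlayground scratch = [0.2, 0.7, 0.9]
@eval MyPlayground above = filter(>(THRESHOLD), scratch)

# A script could be loaded into the module in the same way (hypothetical file):
# Base.include(MyPlayground, "analysis.jl")

# Steps 5-6: leave the old module behind and start again in a new one.
# The "constant" can take a different value without restarting Julia.
module MyPlayground2 end
@eval MyPlayground2 const THRESHOLD = 0.8
```

Main stays clean throughout, and the old values remain reachable as MyPlayground.scratch and so on until the session ends.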

From another perspective, this workflow is more or less equivalent to the “cleaning the workspace” or “deleting everything” feature that some people miss in Julia, except that it does not really delete anything: it just puts everything in a module that you can leave behind and ignore later. Therefore, after several iterations the Julia session may be consuming a lot of memory, and restarting may be inevitable, although that depends on the computer and how big the data are.

This seems like a good idea. It also seems like it’d be good to take your old module, iterate through all the globals and set them to nothing and run a GC. This would avoid all the memory accumulation issues. I’m not sure how to do this though.

In simple situations, it might be possible to go through the whole namespace of the module with names, and set the referred objects to nothing, but this should only be done with non-constant objects (i.e. simple variables) that have not been annotated with a type incompatible with Nothing. For instance:

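function wipeout(mod::Module)  # `module` is a reserved word and cannot be an argument name
    for x in names(mod, all=true)
        # skip undefined bindings, constants, and globals whose declared type excludes Nothing
        if isdefined(mod, x) && !isconst(mod, x) && Nothing <: Core.get_binding_type(mod, x)
            setproperty!(mod, x, nothing)
        end
    end
end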

Nevertheless, modules can contain virtually any kind of thing, so I guess that there may be cases where that naïve attempt fails.
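
One way to hedge against such failures is to try each assignment and skip whatever refuses it, then ask for a garbage collection as suggested above. The following is a sketch under that assumption, not a guaranteed cleanup: wipeout_safely and OldPlayground (with its contents) are hypothetical names, and objects that are still referenced from elsewhere will not be freed.

```julia
# Defensive variant: skip constants and anything that cannot hold `nothing`.
function wipeout_safely(mod::Module)
    wiped = Symbol[]
    for name in names(mod, all=true)
        (isdefined(mod, name) && !isconst(mod, name)) || continue
        try
            setproperty!(mod, name, nothing)
            push!(wiped, name)
        catch
            # e.g. a global annotated with a type that excludes Nothing
        end
    end
    GC.gc()   # ask the collector to reclaim the objects that are no longer referenced
    return wiped
end

module OldPlayground        # hypothetical playground left over from earlier work
    big = rand(10^6)        # plain global: can be wiped
    const LIMIT = 10        # constant: left alone
    counter::Int = 0        # typed global: cannot hold `nothing`, left alone
end

wiped = wipeout_safely(OldPlayground)
```

The returned list makes it easy to check what was actually released before trusting that the memory is gone.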

I use a similar workflow, which I think is probably better.

module MyModule

function fun1()
    return 1
end

function main()
    # @eval runs the block in MyModule's global scope
    @eval begin
        x = fun1()
    end
end

end # module MyModule

Run using Revise, then using MyModule, and use the Alt+m method to make the REPL refer to MyModule. This way main is tracked by Revise, and calling main() leaves x as a global of MyModule.
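
To make the role of @eval concrete, here is a small self-contained sketch that does not need Revise or a package folder. It contrasts main with a hypothetical main_local that uses an ordinary assignment: only the @eval version leaves its variable behind in the module, which is what makes it inspectable from the REPL afterwards.

```julia
module MyModule

fun1() = 1

function main()
    # @eval evaluates the block at the module's top level, so `x` becomes
    # a global of MyModule rather than a local of main().
    @eval begin
        x = fun1()
    end
end

# Hypothetical counterpart, for comparison only.
function main_local()
    y = fun1()    # ordinary local: gone once the function returns
    return nothing
end

end # module MyModule

MyModule.main()
MyModule.main_local()
```

After both calls, MyModule.x exists while y was never defined in the module.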

The downside is that the folder structure has to look like that of a real Julia package, e.g. one added via ] develop. But you can have multiple sub-modules that are really just main() ... end and switch the REPL between them if you want.
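
For reference, the minimal layout that using can find is a folder named after the module with a src file of the same name. The sketch below builds such a folder in a temporary directory and makes it loadable through LOAD_PATH. This is an assumption-laden shortcut rather than the ] develop route mentioned above (which also needs a Project.toml), and ScratchPkg is a hypothetical name.

```julia
# Build a throwaway package-like folder: <root>/ScratchPkg/src/ScratchPkg.jl
root = mktempdir()
srcdir = joinpath(root, "ScratchPkg", "src")
mkpath(srcdir)
write(joinpath(srcdir, "ScratchPkg.jl"), """
module ScratchPkg

fun1() = 1

end # module ScratchPkg
""")

# Make the folder findable by `using` for this session only.
push!(LOAD_PATH, root)

# using Revise      # load Revise first (if installed) so later edits are tracked
using ScratchPkg

result = ScratchPkg.fun1()
```

From here, the REPL can be switched to ScratchPkg with Alt+m just as with a hand-made playground module.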

It’s a little messy… but it’s a workflow for experimentation. Having a real folder structure is helpful if you decide you do want to make something a real module as well.