Is "include" safe to use inside a function?

Thanks dlakelan that would work, but I don’t see how it’s better than using an include() statement inside a function. Either way, the user is dynamically defining global variables within My_Module.

I think my confusion all boils down to how the user will interact with a module:

Option a) the user creates the input data structure by defining global variables WITHIN My_Module (e.g., by running My_Module.initialize_data([parameters])), then the user executes the simulation by running something like My_Module.execute_simulation(My_Module.variables)

Option b) the user first creates the input data structure OUTSIDE of My_Module, and My_Module provides no help in performing that initial task, then the user executes the simulation by running something like My_Module.execute_simulation(my_data_structure).

Option b seems to be the Julia way of doing things (I’m guessing) because it doesn’t lead towards using an include() statement inside of a function (because the lonnnng input configuration file exists outside of My_Module and doesn’t require a function at all).

In contrast, Option (a) places the input configuration file INSIDE of My_Module and requires a function wrapped around it to allow the user to dynamically create the input.

So I’ll go with Option (b), and that solves it.

No, there are no “dynamically defined global variables” anywhere when you use a function. There’s also nothing related to a module anywhere. If all you want is to define a global variable when you define the function, then just use include; it’s totally fine and is what it is meant to do, and there’s nothing local about it. But if you want to create the instance at runtime, then you define the function that creates the data by including a file, and you call that function at runtime. The include("thefile.jl") line does not happen in the same scope as the make_instance() call.

Don’t do that.

Errr, no, that’s not true at all. Not using a global variable to pass in the argument does not mean that your module provides no help in creating it. Your module can just define a function that creates whatever state the user needs to pass in, or could even just create the data automatically when the user calls your function.

That depends on your definition of “dynamic”. I meant “dynamic” insofar as the user is changing a subset of the global variables of My_Module at random times (i.e., at initiation AND at random times later).

For real? My_Module is in almost every line… what are you talking about?

No I wanted the user to be able to define a subset of the global variables at random times (aka “dynamically” but I guess I should have said “randomly”).

Yes I understand that you don’t recommend it. I was hoping for a clear logical reason why I shouldn’t do it… haven’t gotten that yet.

lol you said “Don’t do that” two lines above.

Correct, and none of the solutions I was talking about has any of that.

Sure, all code is in a module. But the existence of that module does not matter even a little bit here. There’s no difference between calling from within or outside the module. The same goes for modification of global variables, etc. There are only normal function definitions, and it doesn’t matter what module those functions are defined in as long as you can call them.

No, defining global variables is not a goal; it’s not what you want. You want the user to construct objects, and they need not be global variables.

Having mutable global state like this is bad in general. It’s bad for performance, bad for debugging, bad for concurrency and bad for the user. There’s nothing julia specific about it. It’s just all around bad programming practice.
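To make the performance point concrete, here is a minimal sketch (the names scale, scaled_global and scaled_arg are made up for illustration and are not from this thread). It contrasts a function that reads a non-constant global with one that receives the same value as an argument:

```julia
# Hypothetical example: `scale` is an untyped, non-constant global,
# so its value and type may change at any time.
scale = 2.0

scaled_global(x) = x * scale   # reads the global on every call
scaled_arg(x, s) = x * s       # receives the same value as an argument

y_global = scaled_global(3.0)
y_arg = scaled_arg(3.0, scale)
```

Both calls give the same number, but the compiler cannot assume the type of scale inside scaled_global, whereas scaled_arg is specialized on the argument types at each call. Inspecting the two with @code_warntype (from InteractiveUtils) should show the difference.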

No I did not. I said don’t let the user create global state and make that part of your API. Functions are global objects, and they are immutable and totally fine. The same goes for global constants, no matter how big they are. If what you want has to change at runtime or between runs, however, it must not be global state. What I said you should do here is provide functions that help the user create the state, but those helpers must not store the state in any global variables (caches are fine, since they are transparent). I.e. I said “create whatever state”, never “global”.

Sweet. I think you’re teaching me something good here. Thank you. So the cache that you mentioned – do you mean a hard-disk-saved dictionary, or hard-disk-saved data structure? Or what kind of cache are you referring to?

No, nothing to do with the hard disk. Just anything that caches a result, keyed on whatever unique conditions it was generated under. It doesn’t matter where it is saved. It’s just necessarily global and by definition transparent to the user.
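As a rough sketch of what such a cache could look like (expensive_table, TABLE_CACHE and get_table are hypothetical names invented for this example), a constant Dict keyed on the input can live at module level while callers only ever see an ordinary function:

```julia
# Hypothetical expensive computation; the name and formula are illustrative only.
expensive_table(n::Int) = [sin(i / n) for i in 1:n]

# Module-level cache: global, but only ever touched through `get_table`.
const TABLE_CACHE = Dict{Int, Vector{Float64}}()

function get_table(n::Int)
    # get! runs the do-block only when `n` is not already a key
    return get!(TABLE_CACHE, n) do
        expensive_table(n)
    end
end

t1 = get_table(1000)   # computed and stored
t2 = get_table(1000)   # fetched from the cache
```

The caller gets the same answer with or without the cache, which is what makes it transparent. This sketch is not thread-safe, and it hands out the cached array itself, so a caller that mutates the result would also change the cached copy.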

Do you know of any MWE-style Julia examples of such methods?

Or, more generally, a Julia simulation that follows good programming practices in so far as we’ve been discussing?

All of this would be so much easier to communicate via examples.

No.

But all you need is to pass the state around as argument instead of global variable.

Take a look at Parameters.jl, which provides macros that make it easier to deal with the boilerplate of parameter initialization.

If you want a function that takes a script name as input and returns the object that it creates, then the following should work:

function create_a_structure(include_filename)
    eval(Meta.parse("let\n" * read(include_filename, String) *"\nend"))
end

This takes the user code, wraps it in a let-block, and then evaluates it in the global scope. The let-block is there to make sure that any intermediate variables created by the user (such as m and r in your example) do not interfere with the rest of your code.

You can call the function as:

x = create_a_structure("some_file.jl")::data_type_A

The :: type assertion at the end of the line is there so that if the last statement in the user’s code does not give a structure of the correct type, an error will be thrown.

However, in most cases, I would rather do as @dlakelan suggested, and tell the user to write a function (or just create the object) and pass that function or object instead of a filename. That’s less magic. Also, if something goes wrong, the user will get an error message that refers to the exact line in their code where the error was thrown, rather than to the eval(Meta.parse(...)) statement.
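A minimal sketch of that alternative, assuming a stand-in type DataTypeA and a placeholder run_simulation (neither is defined anywhere in this thread): the package accepts either a finished object or a zero-argument function that builds one, and the user’s intermediate variables stay local to their own function.

```julia
# Hypothetical state type, standing in for data_type_A from the post above.
struct DataTypeA
    a::Vector{Float64}
    b::Vector{Float64}
end

# Placeholder simulation: accepts a ready-made object ...
run_simulation(state::DataTypeA) = sum(state.a) + sum(state.b)

# ... or a zero-argument function that builds one (type-checked on return).
run_simulation(make_state::Function) = run_simulation(make_state()::DataTypeA)

# User code: ordinary Julia, with intermediate variables local to the function.
function my_setup()
    m = 2.0
    r = 4.0
    return DataTypeA([m, m * r], [r / m])
end

result = run_simulation(my_setup)
```

If my_setup throws, the stack trace points into the user’s own function rather than into an eval call.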

Please don’t, that’s just include but worse.

Yeah. That’s what I tried to say in the last paragraph.

Still, I think it’s neat that it’s easy to do this sort of thing if you want to.

… shucks because all I can imagine is my “option a”

and option (b)

… which both “pass the state around as an argument” but both were denounced by you. I don’t know how I would pass the state as argument unless the state is defined OUTSIDE the function (i.e. the state is “global” with respect to the function). So I’m totally confused. And nobody in Julia-land has an example of good practice? Tragic. You suggested using a cache that is

…which sounds exactly like a global variable… so confusing.

I thought the whole point of modules was [quoting from Julia documentation] " to create top-level definitions (aka global variables) without worrying about name conflicts when your code is used together with somebody else’s". Thus I thought it was good practice to store the system state within a module’s global scope, because I can control who sees that module’s scope, and then I can pass the state around as My_Module.state, but you’re saying that’s poor. So I feel totally out of options.

I didn’t say you can’t do option b, just that you don’t have to. You don’t have to push all the work to the user.

And no, your option a doesn’t pass things around as an argument, since you were saying that you save the initialized value in a global variable, which is not an argument.

What I’m saying is simply to replace the global variable with an argument, i.e. when you want to have

function init(...)
    global global_variable = create_complicated_state_based_on_user_input(...)
end

You simply do

function init(...)
    return create_complicated_state_based_on_user_input(...)
end

and let the user pass the return value to you. Or you simply absorb the creation of this state object into your other functions.

  1. You don’t have to pass the state as an argument either. You can simply create it in the function if that works. That was what I thought you initially wanted, but then you started talking about a separate function doing the initialization, different from the function that uses the state, which is why I said that if you want to do this you can pass it in as an argument. Still, I keep saying that you can absorb that initialization function into your main function and not have to pass it in as an argument.
  2. And no, external does not equal global. And in all cases it won’t be a global variable known to your function. It is probably stored in some variable, but it’s not necessarily a global variable, and even if it is a global variable, that won’t be how you access it in your function. It is passed in as an argument and you access it through the argument, i.e. if you have
a = ...
f(a)

if a is a global variable, that still has nothing to do with how f accesses a. The fact that it is in a variable a is completely the caller’s business, not yours; it’s not your state to maintain and must not be in your module.
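A small illustration of that point, using a made-up function body for f: the function behaves identically whether the caller keeps the data in a global or in a local variable, because it only ever sees its argument.

```julia
# Hypothetical body for `f`: it only sees its argument and never
# refers to a global by name.
f(x) = sum(x) / length(x)

a = [1.0, 2.0, 3.0]          # a global variable in the caller's scope
from_global = f(a)

function caller()
    b = [1.0, 2.0, 3.0]      # a local variable in the caller's scope
    return f(b)
end
from_local = caller()
```

Where the caller stores the data has no effect on f; both calls go through the same method and return the same result.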

No, I did not. I said a cache is an exception to the rule of not using global state. I did not say whether it will or will not match your use case.

Correct, and it has nothing to do with what you want. I mentioned it only to make my logic more complete since this is the main exception to global state being bad. If you are confused by it, just ignore it.

No, it just means that if you have global state, the module is where you should put it. But as I’ve said many times, the module is completely irrelevant here. The point is not about where you put the global state; it’s that you must not have such global state.

Every single function that takes input from another function is good practice here.

My impression is that @damonturney is creating something like a “homework grading system”, that is: other people are supposed to write Julia code to create a thing, and then his code evaluates whether the thing in the given file is the right kind of thing. Is that approximately right? It would help a lot in giving examples if we knew the intended usage.

AFAICT, including a file is not a requirement (since option b is possible).

My impression is that he’s trying to create a user-tweakable input deck for a simulation or some other costly computation. The package would come with a few example input decks that users could modify to suit their purposes, but each deck has to return a struct containing the initial condition parameters. Based on the initial example, it feels like he’s coming from Python (or some other object-oriented language where all the behavior has to be shoehorned into the object definition).

Instead, I think you want something like this workflow for your user:

using DamonTurneysModule # make exported functions and structs visible

function initialize(; a = zeros(10), b = rand(10), c = pi/2)
    # user's initialization logic goes here
    a[1] = 2
    if sum(b) > 5
        b[1] = a[1]
    end
    return DataTypeA(a, b, c)
end

results = run_simulation(initialize(c = 3.0))

Oceananigans.jl has a well-developed initial condition definition along these lines.

I’m writing a simulation of the chemistry/physics of a zinc-manganese battery. The battery has an initial state, then the simulation applies a series of operations to that state. I’ve finished writing all the chemistry and physics of the code, so now I finally need to write the user interface (i.e. allow the user to define the initial state; allow the user to define sub-details of the operations; allow the user to load up the results of each operation). Thus I’m in this discussion here. I also want the code to run as fast as possible! So please let me know if I’m doing something that slows down the code.

I’m presently at this MWE code pasted below.


###### Contents of include_file.jl
#This file is a lonnng list of parameters to initialize the system state
#The user will edit this include_file.jl to define the initial state according to their desires
r=80
q=3
a=[80, 0, 160.0]/r
b=[29.3, 31.1, 3.3]*q
###### end of include_file.jl



module manage_system_state
    using FileIO
    using Dates

    #Define a data type to hold the state of the battery
    struct system_state_data_type
        a::Array{Float64,1}
        b::Array{Float64,1}
        c::Float64
    end

    function create_system_state(include_filename)
        include(include_filename)
        system_state = system_state_data_type(
            a, 
            b, 
            789.5)
        timestamp = Dates.format(Dates.now(),"yyyymmddHHMMSS")
        save(timestamp * "_initial_state.jld2", Dict("system_state"=>system_state))
        return(system_state)
    end

    function load_system_state_from_saved_dictionary(dictionary_name)
        system_state=get(load(dictionary_name), "system_state", 0)
        return(system_state)
    end
end


module simulate
    using FileIO
    using Dates
    using Plots

    #Define a data type to hold information about an operator that will manipulate the battery
    struct operator_A_data_type
        a::Float64
        b::Int
        c::Array{Float64,1}
        d::Array{Float64,1}
    end

    function operation_A(system_state, a, b)
        operator = operator_A_data_type(a, b, [99, 0, 22.0], [19.3, 4.5, 0])
        system_state.a[1]=system_state.a[2] * operator.a * operator.c[1]
        system_state.b[2]=system_state.c    * operator.b * operator.d[2]
        ### Next, save information about this operation and its effect on system state
        timestamp = Dates.format(Dates.now(),"yyyymmddHHMMSS")
        save(timestamp * "_operator_A_output.jld2", Dict("system_state"=>system_state, "operator_A_structure"=>operator ))
    end

end



#User Workflow
system_state = manage_system_state.create_system_state("include_file.jl")
simulate.operation_A(system_state,13.5, 5)
resulting_system_state = manage_system_state.load_system_state_from_saved_dictionary("20200802124716_operator_A_output.jld2")
using Plots
display(plot(resulting_system_state.a, resulting_system_state.b))

I use an include() inside a function, but that function is well-contained inside a small module that will never get modified or confused by other operations. I really don’t see the problem of this use of an include() statement because it is so well-contained. It would be nice if Julia offered a variation of include() that runs at local scope, so I can paste in large boring chunks of code from user input files.

You should make file.jl a .txt file or .csv file holding just the data for your model, i.e. the values r = 80, q = 3, etc. but not the division by r.

Read in this file using Julia’s IO functions, then perform the calculations to get the values a and b.

Your function create_system_state should do all this. You should only read in data; don’t perform operations using include.
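One way this could look, as a sketch only: it assumes a plain-text file with one `name = value` pair per line, and read_parameters is a hypothetical helper, not an existing library function. The derived arrays a and b are then computed in ordinary Julia code, and a NamedTuple stands in for system_state_data_type to keep the example short.

```julia
# Parse lines of the form `name = value`, skipping blank lines and `#` comments.
# Assumed file format; a line without `=` will raise an error.
function read_parameters(io::IO)
    params = Dict{String, Float64}()
    for line in eachline(io)
        text = strip(line)
        (isempty(text) || startswith(text, "#")) && continue
        name, value = split(text, "="; limit = 2)
        params[String(strip(name))] = parse(Float64, strip(value))
    end
    return params
end

# Convenience method: open the file at `path` and parse it.
read_parameters(path::AbstractString) = open(read_parameters, path)

# The calculations live in the package, not in the user's data file.
function create_system_state(params)
    r, q = params["r"], params["q"]
    a = [80, 0, 160.0] ./ r
    b = [29.3, 31.1, 3.3] .* q
    return (a = a, b = b, c = 789.5)   # NamedTuple stand-in for the struct
end

# An IOBuffer stands in for the user's file here.
params = read_parameters(IOBuffer("# initial state\nr = 80\nq = 3\n"))
state = create_system_state(params)
```

In real use the call would be read_parameters("some_parameters.txt") with whatever file name the user chooses. No user code is evaluated, so nothing ends up in the module’s global scope.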

Thanks pdeffebach. But I HAVE to make those initial calculations somewhere, so why not combine it all into a .jl file? What’s the harm? It’s way more user friendly this way, and I really don’t think there’s any risk of name collisions, or confusion, because it’s well-contained in a super small module.