Accessing variables constructed within another function

I need to create many variables within a function and pass these data to numerous functions that operate on them. How can I do this without declaring the variables as globals? As I understand it, variables defined within a function are not accessible outside that function unless they are declared as globals. I have 50+ variables which vary in dimensions (i.e., I have scalars, vectors, and matrices).

Here’s a toy example of this issue. I do not know how to access the three variables defined within prepare_data() in the functions that follow it.

function prepare_data()
    # Below are examples of the variables constructed. The actual function creates 50 or so variables which vary in dimension.
    data = readdlm("data.csv", ',', skipstart=1, Float64)
    var1 = data[:, 1]
    var1_index = findall(var1.==1)
    var1_dummy = (var1.==var1')
end

function dosomething1()
    # construct t as a function of the data in prepare_data() and return
    t = var1[var1_index]
    return t
end

function dosomething2()
    # use data from prepare_data() again
    t = dosomething1() - var1[var1_index]
    return t
end

I guess a workaround would be to store (without the function declaration) the code that creates the variables in a different file (say prepare_data.jl) and include it later in the main file.

So save the following in prepare_data.jl

const data = readdlm("data.csv", ',', skipstart=1, Float64)
const var1 = data[:, 1]
const var1_index = findall(var1.==1)
const var1_dummy = (var1.==var1')

Then call it in the master file with include("prepare_data.jl"). I can then go about my calculations, since the variables defined in prepare_data.jl are accessible in the rest of the file.

That is literally declaring them as globals.
include (like eval) always acts at global scope, even when it is called from inside a function.

Including them allows me to declare them as consts, so that seems like an improvement.

Probably just stick them all in a named tuple and return that from the function.
You can do that by writing (; a, b, c) as the return value,
and then access the fields as x.a etc. after you assign the returned tuple to x.
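
Here is a minimal sketch of how that could look for the toy example from the question. To keep it runnable without data.csv, prepare_data takes a small matrix as an argument (that argument and the sample matrix are my assumptions, not part of the original code), and the other functions receive the named tuple explicitly instead of relying on globals.

```julia
# Builds the derived variables and returns them together as a NamedTuple.
# `data` is passed in here as a stand-in for the matrix read from data.csv.
function prepare_data(data::AbstractMatrix)
    var1 = data[:, 1]
    var1_index = findall(var1 .== 1)
    var1_dummy = (var1 .== var1')
    # (; a, b) is shorthand for (a = a, b = b); it needs Julia 1.5 or later.
    return (; data, var1, var1_index, var1_dummy)
end

# Each function takes the named tuple as an argument and reads fields from it.
dosomething1(d) = d.var1[d.var1_index]
dosomething2(d) = dosomething1(d) - d.var1[d.var1_index]

# Hypothetical sample data: the first column plays the role of var1.
d = prepare_data([1.0 2.0; 0.0 3.0; 1.0 4.0])
t1 = dosomething1(d)
t2 = dosomething2(d)
```

With 50+ variables the return line gets long, but every downstream function then needs only one extra argument.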

But also consider restructuring your code to not need to do this.

It's rare that real functions need to deal with more than a handful of variables
once the code is appropriately scoped into nice, distinct functions with a clear purpose.

I don’t understand your example, but this sounds exactly like what dictionaries are for, and definitely not something globals are good for.

How about returning a dictionary from the first function?

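# Inside prepare_data(): store each variable in the dictionary and read it back by key.
xvars = Dict{String,Any}()
xvars["data"] = readdlm("data.csv", ',', skipstart=1, Float64)
xvars["var1"] = xvars["data"][:, 1]
xvars["var1_index"] = findall(xvars["var1"] .== 1)
xvars["var1_dummy"] = (xvars["var1"] .== xvars["var1"]')

return xvars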

Yes I can take a look at dictionaries. Would using dictionaries improve performance compared to declaring every variable as a const?

Would sticking them in tuples improve performance compared to const?

I am not sure how these two compare to const globals, but surely the NamedTuple will have better performance than Dicts unless you need to frequently change the values in the NamedTuple fields.
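
One likely reason for the difference, sketched below with made-up values: a dictionary that holds scalars, vectors, and matrices together ends up with a value type of Any, so the compiler cannot tell what a lookup returns, while a NamedTuple records the concrete type of every field.

```julia
# Hypothetical values standing in for the real data.
d  = Dict{String,Any}("var1" => [1.0, 0.0, 1.0], "n" => 3)
nt = (; var1 = [1.0, 0.0, 1.0], n = 3)

# The dictionary only knows that its values are of type Any.
dict_value_type = valtype(d)

# The named tuple's type carries the concrete type of each field.
nt_field_type = fieldtype(typeof(nt), :var1)
```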

The variables do not change. They are data that I need to reference constantly in each iteration of a maximum likelihood estimation.

I sincerely would go with the NamedTuple solution then.

Not performance, no.
But code readability, yes.

If these variables are going to be consistent over time, then you may even consider creating an immutable struct type rather than a NamedTuple. You can just pass this struct around.

Rather than having one big struct, you could create smaller structs, and then build the larger struct out of those structs. Then you might not need to pass the entire struct. You could just pass the substruct around.
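
A rough sketch of that layout, with hypothetical type and field names (Outcomes, Covariates, and ModelData are illustrations only, not anything from the original code):

```julia
# Small immutable structs, each grouping related variables.
struct Outcomes
    y::Vector{Float64}
    y_index::Vector{Int}
end

struct Covariates
    X::Matrix{Float64}
end

# The larger struct is built out of the smaller ones.
struct ModelData
    outcomes::Outcomes
    covariates::Covariates
end

# This function only needs the outcomes, so it takes just that substruct.
count_selected(o::Outcomes) = length(o.y_index)

# This one needs both parts, so it takes the whole struct.
selected_rows(m::ModelData) = m.covariates.X[m.outcomes.y_index, :]

y = [1.0, 0.0, 1.0]
md = ModelData(Outcomes(y, findall(y .== 1)),
               Covariates([1.0 2.0; 3.0 4.0; 5.0 6.0]))
n = count_selected(md.outcomes)
rows = selected_rows(md)
```

The function signatures then document which part of the data each step actually depends on.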

Thanks for all the replies. I ran into a memory issue and I am not sure if it is due to the use of const. The OS automatically kills the session about 4 hours in citing memory issues. Would the use of NamedTuple, struct, or dictionaries reduce the memory used?

Not directly. But it would make it possible for the data to go out of scope and thus be garbage collected, which never happens to a const global.

Larger-scale restructuring of your code may allow different parts to go out of scope at different times, and thus help more.
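
As a small illustration of the scoping point (make_data, objective, and run_estimation are hypothetical stand-ins, not the original estimation code): when the data is created inside a function and only the result is returned, nothing refers to the arrays after the call, so the garbage collector is free to reclaim them.

```julia
# Hypothetical stand-in for an expensive data-preparation step.
make_data(n) = (; var1 = rand(n), weights = ones(n))

# Hypothetical objective; it only reads the data it is handed.
objective(d, theta) = sum(d.weights .* (d.var1 .- theta) .^ 2)

function run_estimation(n)
    d = make_data(n)           # d is local to this call
    best = objective(d, 0.5)   # stand-in for the real estimation loop
    return best                # only the scalar result escapes
end

result = run_estimation(1_000)
# After the call returns, the arrays held by d are unreachable and can be collected.
# A const global would stay reachable for the whole session.
```

Whether this fixes the out-of-memory kill depends on where the memory is actually going, so it is worth profiling allocations inside the estimation loop as well.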