Perplexed by behavior of module scope changing variables

Hello,
I must be doing something wrong. Can someone explain this behavior to me?

module A 
    using DataFrames
    x = DataFrame(a=[1,3],b=[4,3])
    export x
end


module B
    using ..A
    function add_stuff(df)
        df[:,:summed_cols] .= sum.(eachrow(df[:,1:2]))
        return(df)
    end
    z = x
    z = add_stuff(z)
end


module C
    using ..A
    x
end

running module A:

julia> A.x
2×2 DataFrame
 Row │ a      b     
     │ Int64  Int64 
─────┼──────────────
   1 │     1      4
   2 │     3      3

^As expected

running module B:

julia> B.z
2×3 DataFrame
 Row │ a      b      summed_cols 
     │ Int64  Int64  Int64       
─────┼───────────────────────────
   1 │     1      4            5
   2 │     3      3            6

^ As expected

running module C:

julia> C.x
2×3 DataFrame
 Row │ a      b      summed_cols 
     │ Int64  Int64  Int64
─────┼───────────────────────────
   1 │     1      4            5
   2 │     3      3            6

^^NOT EXPECTED
What is happening here? “x” wasn’t even modified in module B, as “z” was put through the function. Why doesn’t C.x look like A.x?

Also, why can module B modify x:

julia> B.x
2×3 DataFrame
 Row │ a      b      summed_cols 
     │ Int64  Int64  Int64
─────┼───────────────────────────
   1 │     1      4            5
   2 │     3      3            6

But if I try to make a modification explicitly to x in the module, e.g.:

module B
    using ..A
    function add_stuff(df)
        df[:,:summed_cols] .= sum.(eachrow(df[:,1:2]))
        return(df)
    end
    z = x
    z = add_stuff(z)
    x = 3
end

I get the error:

ERROR: cannot assign a value to imported variable A.x from module B

x was modified because df[:,:summed_cols] .= ... changes df in-place.

This has nothing to do with modules or dataframes. A simpler example is:

julia> f(array) = array[1] = 2; # modifies array in-place

julia> x = [1,1,1]
3-element Vector{Int64}:
 1
 1
 1

julia> f(x);

julia> x # modified by f(x)
3-element Vector{Int64}:
 2
 1
 1

I would suggest carefully reading the Julia manual sections "Assignment expressions and assignment versus mutation" and "Argument Passing Behavior". It’s common for people to get confused about this.


To save some clicks: Julia is one of the languages where variables are only references to objects (at the language level, that is; the compiler is free to implement this in a variety of ways). When you reassign a variable, you’re only changing which object it references; mutation of objects is a totally separate action, even when it uses assignment-like syntax for elements and fields. This is very different from languages where variables own objects and variable reassignments change those objects.
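
A minimal sketch of that distinction, using a plain vector rather than a dataframe (the names a and b are only for illustration):

```julia
a = [1, 1, 1]
b = a            # b references the same array as a; nothing is copied

b[1] = 2         # mutation: changes the shared array, visible through both names
@assert a == [2, 1, 1]
@assert a === b  # still one object

b = [0, 0, 0]    # reassignment: b now references a new array; a is untouched
@assert a == [2, 1, 1]
@assert a !== b
```

Only the element assignment changed what a shows; reassigning b just pointed b somewhere else.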

In your specific example, module A assigns A.x. Module B imports x from A, then assigns z = x; that takes the object assigned to B.x and assigns it to B.z, so B.x and B.z reference the same object. add_stuff mutates and returns that same object, so z = add_stuff(z) reassigns z to the object it was already assigned to. Module C imports x from A. By the end, A.x, B.x, B.z, and C.x all reference the same dataframe. You can check that their object identities match: A.x === B.x === B.z === C.x
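
Here is a rough, dependency-free version of the same situation, with a Vector standing in for the DataFrame. The function add_stuff! and the variable w are made up for this sketch; passing copy(x) is one possible way to keep A.x untouched, not the only one:

```julia
module A
    x = [1, 3]
    export x
end

module B
    using ..A
    # Mutates its argument in place and returns that same object.
    add_stuff!(v) = push!(v, sum(v))
    z = add_stuff!(x)          # z and A.x are the same object
    w = add_stuff!(copy(x))    # w is an independent copy; A.x is not touched again
end

@assert A.x === B.z            # same object, so the mutation shows up in A.x
@assert A.x == [1, 3, 4]
@assert B.w == [1, 3, 4, 8]
@assert B.w !== A.x
```

With DataFrames the same idea should apply: calling add_stuff(copy(x)) would add the column to a copy instead of to the dataframe that A.x references.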

C.x does look like A.x; you just didn’t print A.x again after running module B.

We’re not allowed to reassign an imported variable once it has been explicitly used, as in the z = x line. Besides committing modules to sharing variables via imports, this keeps assignments confined to a variable’s home module, which prevents some errors. However, nothing stops one module from mutating an object assigned to an imported variable, because that is sharing as intended.
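
If you really do need to reassign such a variable, one option is to let the home module do it. This is only a sketch: Home, Client, and set_x! are hypothetical names, and the commented-out line marks the kind of assignment that produced the error quoted above.

```julia
module Home
    x = [1, 2]
    export x
    # Hypothetical setter: the reassignment of Home.x happens inside Home.
    set_x!(v) = (global x = v)
end

module Client
    using ..Home
    old = x                # old references the object currently assigned to Home.x
    push!(x, 3)            # allowed: mutates the shared object
    Home.set_x!([10])      # allowed: Home reassigns its own variable
    # x = 3                # not allowed here: assigning to the imported variable
end

@assert Client.old == [1, 2, 3]   # the old object was mutated, not replaced
@assert Home.x == [10]            # Home.x now references a new object
```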