How to correctly change a DataFrame stored in a struct

Hello, I have a struct that contains two other structs:

using DataFrames

# FirstStruct and SecondStruct are defined elsewhere.
# MyInputDataset must be defined before MyDataset, which uses it as a field type.
Base.@kwdef mutable struct MyInputDataset
  first_st::FirstStruct = FirstStruct()
  second_st::SecondStruct = SecondStruct()
  df1::DataFrame = DataFrame()
  df2::DataFrame = DataFrame()
end

Base.@kwdef mutable struct MyDataset
  input::MyInputDataset = MyInputDataset()
  processed::MyInputDataset = MyInputDataset()
end

Now I create the df1 DataFrame inside the input struct and make df1 in the processed struct point to it:

julia> dataset.input.df1 = DataFrame("w"=>2)
1×1 DataFrame
 Row │ w     
     │ Int64
─────┼───────
   1 │     2

julia> dataset.processed.df1 = dataset.input.df1 
1×1 DataFrame
 Row │ w     
     │ Int64
─────┼───────
   1 │     2

After that, if I change the DataFrame of the processed struct, the input struct's DataFrame changes as well:

julia> dataset.processed.df1 = DataFrame("r"=>2)
1×1 DataFrame
 Row │ r     
     │ Int64
─────┼───────
   1 │     2

julia> dataset.input.df1 
1×1 DataFrame
 Row │ r     
     │ Int64
─────┼───────
   1 │     2

Could anyone explain why this happens and/or help me?

Thank you

When you assign dataset.processed.df1 = dataset.input.df1, you’re not creating a new DataFrame with the same contents; you’re making processed.df1 point to the same DataFrame object as input.df1. This is how assignment works in Julia in general: it binds a name or field to an existing object and does not copy it.
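A minimal sketch of that binding behaviour, using two local variables (a and b, chosen here only for illustration) instead of the struct fields. It assumes DataFrames.jl is installed; === tests whether two names refer to the very same object.

```julia
using DataFrames

a = DataFrame("w" => 2)
b = a                       # b is a second name for the same object; nothing is copied
same_before = (a === b)     # true: one DataFrame, two names

b = DataFrame("r" => 2)     # rebinding b points it at a new object and leaves a alone
same_after = (a === b)      # false
names_a = names(a)          # ["w"], a is unchanged by the rebinding
```

So sharing only shows up when the shared object is mutated; assigning a new DataFrame to one of the names simply breaks the link.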

If you want it to be a new DataFrame, create one with copy, e.g. dataset.processed.df1 = copy(dataset.input.df1; copycols=true). Since copycols=true is the default, a plain copy(dataset.input.df1) does the same.
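As a rough illustration of what copycols controls, the sketch below compares the default copy with copycols=false. The variable names are made up for the example, and it assumes DataFrames.jl is installed.

```julia
using DataFrames

original = DataFrame("w" => [2])

independent = copy(original)               # copycols=true by default: column vectors are copied
independent.w[1] = 99
after_full_copy = original.w[1]            # 2, the original is untouched

shared = copy(original; copycols=false)    # new DataFrame that reuses the same column vectors
shared.w[1] = 7
after_shallow_copy = original.w[1]         # 7, the column is shared with the original
```

With copycols=false the two DataFrames are distinct objects (you can add or rename columns independently), but writes into an existing column are visible through both.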

Welcome to the Julia community!

If I understand correctly, you’re showing the behaviour you want here, i.e. you’d like dataset.input.df1 to end up as you printed (which is not what currently happens)? If so, the situation is that you reassign dataset.processed.df1 to point to a new DataFrame (with "r"), while dataset.input.df1 still refers to the old one (with "w"). Instead, you need to modify the original DataFrame in place (using the ! functions). E.g.

julia> dataset.processed.df1 = dataset.input.df1 = DataFrame("w"=>2);

julia> rename!(dataset.processed.df1, :w => :r)  # Change the content of dataset.processed.df1, but don't reassign the field
1×1 DataFrame
 Row │ r
     │ Int64
─────┼───────
   1 │     2

julia> dataset.input.df1
1×1 DataFrame
 Row │ r
     │ Int64
─────┼───────
   1 │     2

Alternatively, and for the same reason, you could work with Refs / Base.RefValues: if both structs hold the same Ref, replacing its contents with ref[] = ... is visible through both.
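A possible sketch of the Ref approach. Holder is a hypothetical stand-in for MyInputDataset, reduced to one field, and is not part of the original code; it assumes DataFrames.jl is installed.

```julia
using DataFrames

# Hypothetical container: the field stores a Ref to a DataFrame, not the DataFrame itself.
struct Holder
    df1::Base.RefValue{DataFrame}
end

shared = Ref(DataFrame("w" => 2))
input = Holder(shared)
processed = Holder(shared)               # both holders store the same Ref

processed.df1[] = DataFrame("r" => 2)    # replace the Ref's contents; the field itself is not reassigned
input_names = names(input.df1[])         # ["r"], the replacement is visible via input too
```

The trade-off is that every access needs the extra [] to dereference, so this is mostly worthwhile when you really do want to swap in whole new DataFrames rather than mutate one in place.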