When you assign dataset.processed.df1 = dataset.input.df1, you’re not creating a new DataFrame with the same contents; you’re making processed.df1 point to the same DataFrame as input.df1. This is generally how Julia works: assignment binds a name to an existing object and does not create a copy.
If you want it to be a new DataFrame, create one with the copy method, e.g. dataset.processed.df1 = copy(dataset.input.df1; copycols=true). (copycols=true is the default, so copy(dataset.input.df1) does the same thing.)
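To make the difference concrete, here is a minimal sketch. The Tables and Dataset types are hypothetical stand-ins for whatever your dataset object really is; only the assignment and copy behaviour is the point.

```julia
using DataFrames

# Hypothetical containers standing in for the original `dataset` object.
mutable struct Tables
    df1::DataFrame
end

struct Dataset
    input::Tables
    processed::Tables
end

original = DataFrame("w" => [2])
dataset = Dataset(Tables(original), Tables(original))

# Plain assignment: both fields refer to the very same object.
dataset.processed.df1 = dataset.input.df1
same_object = dataset.processed.df1 === dataset.input.df1

# copy: a new DataFrame with its own column vectors.
dataset.processed.df1 = copy(dataset.input.df1)
rename!(dataset.processed.df1, :w => :r)

println(same_object)                   # true
println(names(dataset.input.df1))      # ["w"]
println(names(dataset.processed.df1))  # ["r"]
```

After the copy, in-place changes to processed.df1 no longer show up in input.df1.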
If I understand correctly, you’re showing the desired behaviour here, i.e. you’d want dataset.input.df1 to be as you printed (which is not how this currently works)? If so, what happens is that you reassign dataset.processed.df1 to point to a new DataFrame (with "r"), while dataset.input.df1 still refers to the old one (with "w"). Instead, you need to modify the original DataFrame in place (using ! methods such as rename!). E.g.
julia> dataset.processed.df1 = dataset.input.df1 = DataFrame("w"=>2);
julia> rename!(dataset.processed.df1, :w => :r) # Change the content of dataset.processed.df1, but don't reassign the variable
1×1 DataFrame
 Row │ r
     │ Int64
─────┼───────
   1 │     2
julia> dataset.input.df1
1×1 DataFrame
 Row │ r
     │ Int64
─────┼───────
   1 │     2
Alternatively, and for the same reasons, you could work with Refs / Base.RefValues.
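A rough sketch of what the Ref idea could look like, assuming both sides hold the same Ref instead of the DataFrame itself (the variable names here are made up for illustration):

```julia
using DataFrames

# One shared Ref; both names refer to the same Base.RefValue{DataFrame}.
shared = Ref(DataFrame("w" => [2]))
input_df1 = shared
processed_df1 = shared

# Replace the content of the Ref instead of rebinding a variable.
# rename (without !) returns a new DataFrame, yet both sides see it.
processed_df1[] = rename(processed_df1[], :w => :r)

println(names(input_df1[]))               # ["r"]
println(input_df1[] === processed_df1[])  # true
```

The cost is that every access needs the extra [] to dereference.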
My idea is that, at the beginning, processed.df1 points to the same DataFrame as input.df1, but at some point in time I want processed.df1 to point to a new DataFrame, leaving input.df1 pointing to the original DataFrame.
(@digital_carver's reply is for when you want to decouple the DataFrames while keeping the contents identical (for now), while mine was for keeping them coupled while altering the common content.)
No, because what I’m trying to achieve is to change only the dataset.processed.df1 DataFrame without also changing dataset.input.df1.
In other words, at the beginning I want both df1 DataFrames to point to the same memory location, so that if I “query” processed.df1 or input.df1 I get the same result. But at some point I will change only processed.df1 so that it points to a new memory location with a new DataFrame.
I need this because the initial data (dataset.input) must stay “clean” so that the program can perform several runs; every run should modify only dataset.processed, without performing a copy() or a deepcopy(), to keep the RAM footprint as small as possible.
julia> dataset.input.df1 = dataset.processed.df1 = DataFrame("w"=>2);
julia> pointer_from_objref(dataset.input.df1)
Ptr{Nothing} @0x0000025ca5350c50
julia> pointer_from_objref(dataset.processed.df1) # same memory location
Ptr{Nothing} @0x0000025ca5350c50
julia> dataset.processed.df1 = DataFrame("r"=>2); # (or even DataFrame("w"=>2) )
julia> pointer_from_objref(dataset.input.df1) # has not changed
Ptr{Nothing} @0x0000025ca5350c50
julia> pointer_from_objref(dataset.processed.df1) # different
Ptr{Nothing} @0x0000025ca5351b50
I’m not sure I fully understand, but presumably you’d only need a single copy to create dataset.processed.df1 from dataset.input.df1, after which you can just keep modifying dataset.processed.df1 in place? I don’t see how you could get a smaller memory footprint while keeping the df1s separate.
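Something along these lines is what I have in mind: one copy per run, then only in-place changes. The process! function and the factor loop are invented for the example, so treat it as a sketch rather than your actual pipeline.

```julia
using DataFrames

input_df1 = DataFrame("w" => [2, 3])

# Hypothetical per-run processing step; it mutates its argument in place.
function process!(df::DataFrame, factor::Int)
    w = df.w            # the column vector itself, not a copy
    w .*= factor        # in-place update of that vector
    rename!(df, :w => :r)
    return df
end

results = Int[]
for factor in 1:3
    processed_df1 = copy(input_df1)  # the single copy for this run
    process!(processed_df1, factor)
    push!(results, sum(processed_df1.r))
end

println(results)      # [5, 10, 15]
println(input_df1.w)  # [2, 3]
```

The input stays clean across runs, and each run holds only one extra copy, which can be garbage collected once the run is finished.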