Change struct dataframe value correct way

rdl987 · September 5, 2025, 4:45pm

Hello, I have a struct that contains two other structs:

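using DataFrames

# FirstStruct and SecondStruct are defined elsewhere (not shown here).
# MyInputDataset must be defined before MyDataset, which uses it as a field type.
Base.@kwdef mutable struct MyInputDataset
  first_st::FirstStruct = FirstStruct()
  second_st::SecondStruct = SecondStruct()
  df1::DataFrame = DataFrame()
  df2::DataFrame = DataFrame()
end

Base.@kwdef mutable struct MyDataset
  input::MyInputDataset = MyInputDataset()
  processed::MyInputDataset = MyInputDataset()
end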

Now, with dataset = MyDataset(), I instantiate the df1 DataFrame inside the input struct and make df1 of the processed struct point to it:

julia> dataset.input.df1 = DataFrame("w"=>2)
1×1 DataFrame
 Row │ w     
     │ Int64
─────┼───────
   1 │     2

julia> dataset.processed.df1 = dataset.input.df1 
1×1 DataFrame
 Row │ w     
     │ Int64
─────┼───────
   1 │     2

After that, if I change the DataFrame of processed, the DataFrame of input changes as well:

julia> dataset.processed.df1 = DataFrame("r"=>2)
1×1 DataFrame
 Row │ r     
     │ Int64
─────┼───────
   1 │     2

julia> dataset.input.df1 
1×1 DataFrame
 Row │ r     
     │ Int64
─────┼───────
   1 │     2

Could anyone explain why and/or help me?

Thank you

digital_carver · September 5, 2025, 5:30pm

When you assign dataset.processed.df1 = dataset.input.df1 , you’re not creating a new dataframe with the same contents, instead you’re making processed.df1 point to the same DataFrame as input.df1. This is generally how Julia works: assignment does not create a new copy.
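
A minimal sketch of that point, assuming only that DataFrames.jl is loaded. The names a and b are illustrative and not from the original post; === checks whether two names refer to the very same object.

```julia
using DataFrames

a = DataFrame("w" => 2)
b = a                      # binds a second name to the same object; nothing is copied

same_before = a === b      # true: both names refer to one DataFrame

b = DataFrame("r" => 2)    # rebinding b does not touch the object a refers to
same_after = a === b       # false: b now refers to a different DataFrame
a_cols = names(a)          # still ["w"]
```

The same reasoning should apply to mutable struct fields: assigning to dataset.processed.df1 rebinds only that field.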

If you want it to be a new DataFrame, create one with copy, e.g. dataset.processed.df1 = copy(dataset.input.df1; copycols=true). (copycols=true is the default, so the column vectors are copied as well.)
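
As a rough illustration of what copycols controls, here is a small sketch. The names input_df, full and shallow are made up for this example. With copycols=false the new DataFrame shares its column vectors with the original, which saves memory but means in-place changes to a shared column are visible in both.

```julia
using DataFrames

input_df = DataFrame("w" => [1, 2, 3])

full = copy(input_df)                       # copycols=true is the default: columns are duplicated
shallow = copy(input_df; copycols=false)    # new DataFrame object that shares the column vectors

full_shares = full.w === input_df.w         # false
shallow_shares = shallow.w === input_df.w   # true

# Replacing a whole column in the shallow copy swaps the column only in that DataFrame.
# Mutating the shared vector in place (e.g. shallow.w[1] = 0) would also change input_df.
shallow.w = [10, 20, 30]
input_unchanged = input_df.w == [1, 2, 3]   # true
```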

eldee · September 5, 2025, 7:38pm

Welcome to the Julia community!

rdl987:

After that, if I change the DataFrame of processed, the DataFrame of input changes as well:

julia> dataset.processed.df1 = DataFrame("r"=>2)
1×1 DataFrame
 Row │ r     
     │ Int64
─────┼───────
   1 │     2

julia> dataset.input.df1 
1×1 DataFrame
 Row │ r     
     │ Int64
─────┼───────
   1 │     2

If I understand correctly, you’re showing the desired behaviour here, i.e. you’d want dataset.input.df1 to be as you printed (which is not how this currently works)? If so, the situation is that here you reassign dataset.processed.df1 to point to a new DataFrame (with "r"), while dataset.input.df1 still refers to the old one (with "w"). Instead, you need to modify the original DataFrame in place (using ! functions). E.g.

julia> dataset.processed.df1 = dataset.input.df1 = DataFrame("w"=>2);

julia> rename!(dataset.processed.df1, :w => :r)  # Change the content of dataset.processed.df1, but don't reassign the variable
1×1 DataFrame
 Row │ r
     │ Int64
─────┼───────
   1 │     2

julia> dataset.input.df1
1×1 DataFrame
 Row │ r
     │ Int64
─────┼───────
   1 │     2

Alternatively, and for the same reasons, you could work with Refs / Base.RefValues.
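
A possible sketch of the Ref approach, under the assumption that the DataFrame field is replaced by a Base.RefValue{DataFrame}. The Holder type and the names shared, inp and proc are hypothetical and only serve the illustration; they are not the poster's types.

```julia
using DataFrames

# Hypothetical holder type for illustration; stands in for a struct with a DataFrame field
struct Holder
    df::Base.RefValue{DataFrame}
end

shared = Ref(DataFrame("w" => 2))
inp = Holder(shared)
proc = Holder(shared)                 # both holders contain the same Ref

proc.df[] = DataFrame("r" => 2)       # replace what the shared Ref points to

inp_cols = names(inp.df[])            # ["r"]: the replacement is visible through inp too
```

Because both holders store the same Ref, replacing its content keeps them coupled even when a completely new DataFrame is assigned.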

rdl987 · September 8, 2025, 7:39am

Thank you for the responses.

My idea is that, at the beginning, processed.df1 points to the same DataFrame as input.df1, but at a certain point in time I want processed.df1 to point to a new DataFrame, leaving input.df1 pointing to the original DataFrame.

eldee · September 8, 2025, 9:44am

Well, this is already what your original code does?

julia> dataset.input.df1 = DataFrame("w"=>2)
1×1 DataFrame
 Row │ w
     │ Int64
─────┼───────
   1 │     2

julia> dataset.processed.df1 = dataset.input.df1
1×1 DataFrame
 Row │ w
     │ Int64
─────┼───────
   1 │     2

julia> dataset.processed.df1 = DataFrame("r"=>2)
1×1 DataFrame
 Row │ r
     │ Int64
─────┼───────
   1 │     2

julia> dataset.input.df1
1×1 DataFrame
 Row │ w
     │ Int64
─────┼───────
   1 │     2

(@digital_carver's reply is for when you want to decouple the DataFrames while keeping the contents identical (for now); mine was for keeping them coupled while altering the common content.)

rdl987 · September 8, 2025, 2:07pm

No, because what I’m trying to achieve is to change only the dataset.processed.df1 DataFrame without also changing dataset.input.df1.
In other words, at the beginning I want both df1 DataFrames to point to the same memory location, so that if I “query” processed.df1 or input.df1 I get the same result. But at some point I will change only processed.df1 so that it points to a new memory location with a new DataFrame.

I ask because I need to keep the initial data (dataset.input) “clean” so that the program can perform several runs, with every run modifying only dataset.processed, and without performing a copy() or a deepcopy(), to keep the RAM footprint as small as possible.

Thank you again

eldee · September 8, 2025, 3:02pm

It really does sound like it’s already working as intended.

julia> dataset.input.df1 = dataset.processed.df1 = DataFrame("w"=>2);

julia> dataset.input.df1 == dataset.processed.df1
true

julia> dataset.processed.df1 = DataFrame("r"=>2);

julia> dataset.input.df1 == dataset.processed.df1
false

julia> dataset.input.df1 = dataset.processed.df1 = DataFrame("w"=>2);

julia> pointer_from_objref(dataset.input.df1)
Ptr{Nothing} @0x0000025ca5350c50

julia> pointer_from_objref(dataset.processed.df1)  # same memory location
Ptr{Nothing} @0x0000025ca5350c50

julia> dataset.processed.df1 = DataFrame("r"=>2);  # (or even DataFrame("w"=>2) )

julia> pointer_from_objref(dataset.input.df1)  # has not changed
Ptr{Nothing} @0x0000025ca5350c50

julia> pointer_from_objref(dataset.processed.df1)  # different
Ptr{Nothing} @0x0000025ca5351b50

I’m not sure I fully understand, but presumably you’d only need a single copy to create dataset.processed.df1 from dataset.input.df1, after which you can just keep modifying dataset.processed.df1 in place? I don’t see how you could get a smaller memory footprint while keeping the df1s separate.
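
One way to read that suggestion is sketched below. The function run_once and its factor argument are hypothetical stand-ins for whatever a run actually does. Each run makes one copy of the clean input and then mutates only that copy.

```julia
using DataFrames

input_df = DataFrame("w" => [1, 2, 3])     # stays "clean" across runs

# Hypothetical run function: starts from a fresh copy and mutates only that copy
function run_once(input_df::DataFrame, factor::Int)
    processed = copy(input_df)              # the single copy made for this run
    col = processed.w                       # the copy's own column vector (no further copy)
    col .*= factor                          # in-place update of that vector
    return processed
end

results = [run_once(input_df, f) for f in 1:3]
```

After the runs, input_df still holds its original values, while each element of results holds its own modified data.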
