Issue adding a row record of a DataFrame with `String` name to itself

This should not happen normally:

julia> df = DataFrame(a=1:3, b='a':'c')
3×2 DataFrame
 Row │ a      b
     │ Int64  Char
─────┼─────────────
   1 │     1  a
   2 │     2  b
   3 │     3  c

julia> append!(df, df)
6×2 DataFrame
 Row │ a      b
     │ Int64  Char
─────┼─────────────
   1 │     1  a
   2 │     2  b
   3 │     3  c
   4 │     1  a
   5 │     2  b
   6 │     3  c

However, it can happen if you have columns that are aliases:

julia> df = DataFrame(a=1:3, b='a':'c')
3×2 DataFrame
 Row │ a      b
     │ Int64  Char
─────┼─────────────
   1 │     1  a
   2 │     2  b
   3 │     3  c

julia> df.c = df.a
3-element Vector{Int64}:
 1
 2
 3

julia> append!(df, df)
┌ Error: Error adding value to column :a.
└ @ DataFrames ~\.julia\packages\DataFrames\MA4YO\src\dataframe\dataframe.jl:1423
ERROR: AssertionError: length(col) == targetrows

While `DataFrame` allows storing aliased columns, you should avoid doing so, as it can lead to errors (which are caught, as in this case, but are in general hard to diagnose).

In my example above, instead of `df.c = df.a` it would be better to write `df.c = copy(df.a)`, and then everything would work.
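
For reference, here is a minimal sketch of that fix, assuming a recent DataFrames.jl 1.x release. The `@assert` lines are only there to illustrate what is expected to hold; they are not part of the fix itself.

```julia
using DataFrames

df = DataFrame(a=1:3, b='a':'c')

# Store an independent copy so that :a and :c do not share memory.
df.c = copy(df.a)

# The two columns hold equal values but are distinct objects.
@assert df.c == df.a
@assert df.c !== df.a

# With no aliased columns, appending the data frame to itself should succeed.
append!(df, df)
@assert nrow(df) == 6
```

The only change from the failing example is the `copy` call, so each column owns its own vector.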

To be clear about what an alias means: it is a situation in which two columns are identical (i.e. they are the same object at the same memory location), not merely equal in value.
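
The same distinction can be illustrated with plain vectors, without DataFrames. This is only a sketch; the variable names `a`, `c`, and `d` are made up for illustration. `===` tests whether two names refer to the same object, while `==` compares values.

```julia
a = [1, 2, 3]
c = a          # alias: `c` and `a` are the same vector
d = copy(a)    # independent vector with equal contents

@assert c === a             # identical (same memory)
@assert d == a && d !== a   # equal in value, but not identical

# Mutating through one alias is visible through the other...
push!(c, 4)
@assert a == [1, 2, 3, 4]

# ...while the copy is unaffected.
@assert d == [1, 2, 3]
```

In a data frame you can run the same check on columns, e.g. `df.c === df.a`, to find out whether two columns are aliases. In-place operations that resize columns one by one can trip over this kind of shared state.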
