Hello, I need a mutating version of vcat for data frames, because I must do the concatenation inside a mutating function so as not to create new temporary data frames.
Here is an example of how I would like to use it; the data frame I want to mutate is called df_streams.
"""Single stream match with the catalog."""
function match_stream_catalog!(stream::Py, df_cat::DataFrame, df_streams::DataFrame)
    window = [[minimum(stream.track.ra.deg), maximum(stream.track.ra.deg)],
              [minimum(stream.track.dec.deg), maximum(stream.track.dec.deg)]]
    window = pyconvert(Vector{Vector{Float64}}, window)
    stream_name = pyconvert(String, stream.stream_name)
    field = get_field(window, df_cat)
    field.track = Vector{String}(repeat([stream_name], size(field,1)))
    ra = Py(field.RA)
    dec = Py(field.DEC)
    field_coord = ac.SkyCoord(ra=ra*u.deg, dec=dec*u.deg, frame="icrs")
    ontrack = stream.get_mask_in_poly_footprint(field_coord)
    vcat!(df_streams, field[pyconvert(BitVector,ontrack),:])
    return nothing
end
Thank you very much.
Update: is the solution append!? And what about hcat?
To append individual rows to an existing DataFrame, you can use push! (or append! for multiple rows, but that would require allocating temporaries). For hcat, it's exactly that: hcat(df1, df2), or hcat(df1, df2, copycols=false) for low-allocation concatenation.
Thank you! I must concatenate two data frames. I tried hcat(df1, df2, copycols=false), but it mutates neither df1 nor df2.
If I use append!(df1, df2), does it create a temporary data frame, as if I did
df3 = hcat(df1, df2)
or is it something different?
I would like to use as little RAM as possible.
Thanks.
A DataFrame is (more or less) just a list of named vectors, and hcat(..., copycols=false) doesn't copy the underlying vectors:
julia> df1, df2 = DataFrame(a = rand(10^6)), DataFrame(b = 1:10^6);
julia> @time df1 = hcat(df1, df2, copycols=false);
0.000019 seconds (18 allocations: 1.406 KiB)
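To make the "doesn't copy" point concrete, here is a minimal sketch (the small data frames and the name wide are made up for illustration). It checks that the result of hcat(..., copycols=false) shares its column vectors with the inputs, and that df1 itself is left untouched, which is why the result has to be assigned to a variable:

```julia
using DataFrames

# Small illustrative inputs (hypothetical data).
df1 = DataFrame(a = [0.1, 0.2, 0.3])
df2 = DataFrame(b = [1, 2, 3])

# hcat returns a new DataFrame object; with copycols=false its columns
# are the very same vectors that df1 and df2 hold.
wide = hcat(df1, df2, copycols = false)

shares_a = wide.a === df1.a      # identity check on the column vector
shares_b = wide.b === df2.b
df1_unchanged = ncol(df1) == 1   # df1 did not gain a column

# Because the vectors are shared, writing through one is visible in the other.
wide.a[1] = 9.9
write_visible = df1.a[1] == 9.9
```

The new wrapper object is cheap; the flip side of sharing is that later in-place changes to a column affect both data frames.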
If you want to append vertically in place, use append! or prepend! (they are the in-place equivalents of vcat, which allocates a new data frame).
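Applied to the question's use case, this could look roughly like the sketch below. The helper add_matches!, the column values, and the mask are hypothetical stand-ins for the catalog match in the original function; passing a view of the selected rows is one way to avoid materialising them as a separate data frame before appending, assuming the column names of both data frames match:

```julia
using DataFrames

# Hypothetical accumulator and one matched field (stand-ins for df_streams and field).
df_streams = DataFrame(RA = Float64[], DEC = Float64[], track = String[])
field = DataFrame(RA = [10.0, 10.5, 11.0],
                  DEC = [-1.0, -0.5, 0.0],
                  track = fill("stream_a", 3))
ontrack = BitVector([true, false, true])

# Hypothetical helper: grows `acc` in place with the rows of `field` selected by `mask`.
function add_matches!(acc::DataFrame, field::DataFrame, mask::AbstractVector{Bool})
    # view(...) creates a SubDataFrame, so the selected rows are not copied
    # into an intermediate DataFrame before being appended.
    append!(acc, view(field, mask, :))
    return nothing
end

add_matches!(df_streams, field, ontrack)
```

After the call, df_streams is the same object as before, only longer, so any caller holding a reference to it sees the new rows.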
If you want to add columns horizontally in place to a data frame, use insertcols!. Here is an example:
julia> df = DataFrame(a=1:2, b=3:4)
2×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      3
   2 │     2      4

julia> df2 = DataFrame(c=11:12, d=13:14)
2×2 DataFrame
 Row │ c      d
     │ Int64  Int64
─────┼──────────────
   1 │    11     13
   2 │    12     14

julia> insertcols!(df, pairs(eachcol(df2))...)
2×4 DataFrame
 Row │ a      b      c      d
     │ Int64  Int64  Int64  Int64
─────┼────────────────────────────
   1 │     1      3     11     13
   2 │     2      4     12     14
Use the copycols keyword argument to decide whether df should alias or copy the columns from df2. By default it copies.
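As a small sketch of the aliasing variant (same toy data as above, with the variable names aliased and df_has_four_cols added for illustration), passing copycols=false should make df reuse the column vectors of df2 rather than copying them:

```julia
using DataFrames

df = DataFrame(a = 1:2, b = 3:4)
df2 = DataFrame(c = [11, 12], d = [13, 14])

# Add the columns of df2 to df in place without copying them.
insertcols!(df, pairs(eachcol(df2))...; copycols = false)

aliased = df.c === df2.c          # df and df2 now share this vector
df_has_four_cols = ncol(df) == 4  # df itself was mutated
```

This saves memory, but resizing or modifying a shared column through one data frame will also affect the other, so it is safest when df2 is not used afterwards.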
There is also a discussion about exposing hcat! in this issue, but so far no one has needed it often enough, so we have not done it (the hcat! function is available internally, but is not part of the public API).
Thanks, now I understand how to use copycols=false with df1 = hcat(df1, ...).
Thank you very much for this explanation. I will use append!(df1, df2).