Sequentially add data to a DataFrame

I often need to populate a DataFrame sequentially, one iteration at a time. I therefore end up initializing a DataFrame with mock data of the correct types, deleting that first row, and then iterating through the real data I want to populate it with.
It looks like this:

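using DataFrames, Dates

df = DataFrame(place = "", quantity = 0, when = Date(0))  # placeholder row that fixes the column types
deleteat!(df, 1)  # drop the placeholder row (deleterows! in older DataFrames.jl releases)
for i in 1:n
    # <do things>
    push!(df, (p, q, w))
end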

This works, but I wonder if there is a better way to do this…

Thanks!

I use exactly the same pattern except that I create a DataFrame like this:

df = DataFrame(place=String[], quantity=Int[], when=Date[])

to avoid deleting the first row later.

Alternatively, there are constructors that take the column types and the number of rows and create a DataFrame of the appropriate size; you then assign to each row instead of calling push!. But I typically prefer push!, as it is simpler to reason about.
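For completeness, here is a minimal sketch of the push! pattern as a self-contained script. The `records` vector and its values are made up for illustration; in practice each iteration would compute its own `p`, `q` and `w`.

```julia
using DataFrames, Dates

# Empty DataFrame with typed columns, so no placeholder row is needed.
df = DataFrame(place = String[], quantity = Int[], when = Date[])

# Hypothetical sample records standing in for the real per-iteration data.
records = [("Lund", 3, Date(2018, 1, 1)),
           ("Oslo", 5, Date(2018, 1, 2))]

for (p, q, w) in records
    # A tuple is matched to the columns by position.
    push!(df, (p, q, w))
end
```

Because the columns are typed up front, pushing a value of the wrong type (say, a String into `quantity`) should raise an error rather than silently widen the column.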

Of course! Thanks.

Could you give an example, showcasing the appropriate syntax?

julia> DataFrame([Int, Float64, String], [:a, :b, :c], 3)
3×3 DataFrames.DataFrame
│ Row │ a         │ b            │ c      │
├─────┼───────────┼──────────────┼────────┤
│ 1   │ 428567856 │ 2.11741e-315 │ #undef │
│ 2   │ 428567920 │ 5.86393e-316 │ #undef │
│ 3   │ 428567984 │ 5.84443e-316 │ #undef │

julia> DataFrame([Int, Float64, String], [:a, :b, :c], 0)
0×3 DataFrames.DataFrame

The second example creates an empty DataFrame so you can push! into it like above. The first creates an uninitialized DataFrame with three rows.

I’m sorry for necroing a thread that is over 6 years old, but I want to add that the comment marked “solution” will throw an error.

For an empty DataFrame, bkamins’ first comment works, but for an uninitialised DataFrame of length N, as explained in another thread, it’s far better to construct and populate the individual columns as length-N Vectors first, then construct the DataFrame from these vectors.
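A rough sketch of that column-first approach, reusing the column names from the original question. The row count `n` and the values assigned in the loop are placeholders I made up for illustration.

```julia
using DataFrames, Dates

n = 3  # hypothetical number of rows

# Preallocate one vector per column; the contents are uninitialized
# until they are assigned in the loop below.
place    = Vector{String}(undef, n)
quantity = Vector{Int}(undef, n)
when     = Vector{Date}(undef, n)

for i in 1:n
    place[i] = "site $i"
    quantity[i] = 10 * i
    when[i] = Date(2018, 1, i)
end

# Build the DataFrame only once every column is fully populated.
df = DataFrame(place = place, quantity = quantity, when = when)
```

Filling plain vectors first means the DataFrame never holds uninitialized or #undef entries, and the loop works on concretely typed arrays.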
