Assembling data using DataFrame such that rows where a particular array has empty elements are discounted

I am collecting arrays of data using DataFrames:

df = DataFrame(marketIds = x[:,1], stockIds = x[:,2], price = x[:,3])

In the array of prices, some elements are empty. How do I assemble df such that rows where price is empty are not included?

Being lazy, I would just use missing to encode the “empty” elements, and then dropmissing.
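A minimal sketch of that idea, assuming the DataFrames package is installed. The sample values are made up for illustration; only the column names come from the question.

```julia
using DataFrames

# Hypothetical sample data; `missing` marks the empty prices.
df = DataFrame(marketIds = [1, 1, 2],
               stockIds  = [10, 11, 12],
               price     = [2.20, missing, 4.00])

# Keep only the rows where `price` is present.
clean = dropmissing(df, :price)
```

Passing `:price` restricts the check to that column, so rows with missing values in other columns would be kept.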


I think that is a good solution, but I have a new question. How do I take an array with empty elements and replace these with missing?

What do you mean by “empty”? Can you show a subset of your matrix as an example?

typeof(price) is Vector{Any}:

2.20
4.00
3.90
1.74
4.50
3.70
1.93
2.46
3.45
""
""
2.08

So, the “empty” elements are actually empty strings, "".

Okay, my advice is two-fold.

  1. Use
t = replace(x, "" => missing)
t = identity.(t)

to replace your empty strings with missings.

  2. Re-think the way you are importing data. You should really have an Array{Union{Float64, Missing}} from the start instead of an Array{Any}. That will make your code a lot easier to work with. But I’m not sure how to help specifically without knowing how you are importing your data.
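To make both points concrete, here is a rough sketch in plain Julia. The vector `price` is a hypothetical sample mirroring the values posted earlier in the thread, and `typed` is an illustrative name. The last step is only one possible way to get a typed container after the fact; it is not a substitute for fixing the import itself.

```julia
# Hypothetical sample mirroring the prices shown earlier in the thread.
price = Any[2.20, 4.00, "", "", 2.08]

# Swap the empty strings for `missing`; the element type is still Any here.
t = replace(price, "" => missing)

# Broadcasting `identity` rebuilds the vector with the narrowest element type.
t = identity.(t)

# Alternative: build the typed vector directly with a typed comprehension.
typed = Union{Float64, Missing}[p == "" ? missing : Float64(p) for p in price]
```

After the `identity.(t)` step, `eltype(t)` should be a union of `Float64` and `Missing` rather than `Any`, which is what functions like dropmissing and skipmissing work best with.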