Hi. I have a dataframe like this:
I want to be able to get the numerical data for a single row, say, “Algeria”. This is the function that I wrote for that.
data -> filter(:country => val -> val == country, data) |>
data -> data[1, 2:end] |>
data -> convert(Vector, data)
I try to call the function like this:
But this is the error I’m getting:
ArgumentError: column name :country not found in the data frame
top-level scope@Local: 1[inlined]
The column is called Country/Region in your screenshot, not country.
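A quick way to confirm this is to list the column names before filtering. The sketch below uses a small made-up table as a stand-in for the one in the screenshot (the date columns and values are assumptions, not the real data):

```julia
using DataFrames

# Hypothetical stand-in for the table in the screenshot
df = DataFrame("Country/Region" => ["Albania", "Algeria"],
               "1/22/20" => [0, 3],
               "1/23/20" => [1, 5])

# names returns the column names as a vector of strings
cols = names(df)

# "country" is not among them, which is why filter raised the ArgumentError
has_country = "country" in cols

# Filtering on the column name that actually exists
algeria = filter("Country/Region" => ==("Algeria"), df)
```

Because the name contains a slash, it has to be written as a string (or as Symbol("Country/Region")) rather than as a plain :symbol literal.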
I suspect this would help you: Best Julia Data Manipulation packages combo 2020-09 (YouTube)
I don’t think your code works at all looking at the data.
It looks like this post suffers from the same confusion as your other post about converting from CSV to Parquet: what you are working with after reading in from CSV or Parquet is (most likely) a DataFrame object, which is the same irrespective of how it was constructed (e.g. read from a CSV file, Parquet file, Arrow file,… or indeed constructed “manually” like DataFrame(col1 = rand(10), col2 = rand(10))).
So your title “… retrieve a single row of data from Parquet” is a bit misleading - you are probably asking for a way to select a row from a DataFrame object (unless you are reading your parquet data into some other tabular type, in which case please specify this).
Selecting rows works by simple indexing, just as for a standard two-dimensional Julia Array:
df[df."Country/Region" .== "Algeria", :]
will give you all rows for which the Country/Region column has the value Algeria, and all columns in those rows.
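To get from there to the numerical data as a plain vector, which is what the original function was after, something along these lines should work. This is only a sketch: the helper name country_values and the sample table are made up for illustration, and it assumes every column other than Country/Region is numeric.

```julia
using DataFrames

# Hypothetical stand-in for the real data
df = DataFrame("Country/Region" => ["Albania", "Algeria"],
               "1/22/20" => [0, 3],
               "1/23/20" => [1, 5])

# Hypothetical helper: numeric values of the first row matching `country`
function country_values(df, country)
    rows = df[df."Country/Region" .== country, :]
    nrow(rows) == 0 && throw(ArgumentError("no row found for $country"))
    # Drop the name column, then turn the remaining DataFrameRow into a Vector
    return Vector(rows[1, Not("Country/Region")])
end

values_algeria = country_values(df, "Algeria")
```

Using Not("Country/Region") instead of 2:end avoids relying on the name column being first.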
I’d recommend you work through Bogumil’s excellent introduction to DataFrames if you intend to work with them: bkamins/Julia-DataFrames-Tutorial on GitHub (a tutorial on the Julia DataFrames package).