Iterating over a data frame without explicitly stating whether you iterate over rows or columns can cause confusion. Although I think we can agree that iterating over rows is probably more frequent than iterating over columns, I think the decision to require the explicit use of eachrow or eachcol is a sane one.
You can try this on your end (will output the same error as the one you reported):
# fails
for x in df2
    println(x)
end
However, this works (and so does eachcol):
# works
for x in eachrow(df2)
    println(x)
end
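To make the difference concrete, here is a minimal sketch on a toy data frame. The data and column names (`a`, `b`) are made up for illustration; only `eachrow`, `eachcol` and `pairs` come from DataFrames.jl.

```julia
using DataFrames

# Hypothetical toy data; the column names are made up for illustration.
df = DataFrame(a = [1, 2, 3], b = [10.0, 20.0, 30.0])

# Row-wise: each element is a DataFrameRow, with values accessible by column name.
row_sums = [row.a + row.b for row in eachrow(df)]

# Column-wise: each element is a whole column vector.
col_sums = [sum(col) for col in eachcol(df)]

# pairs(eachcol(df)) additionally yields the column name with each column.
for (name, col) in pairs(eachcol(df))
    println(name, " => ", col)
end
```

The point is that the same data frame gives three rows or two columns depending on which wrapper you pick, which is why a bare `for x in df` has no obvious meaning.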
Now, regarding your code snippet: it will fail even if you use eachrow instead of passing the data frame, because iterating that way produces DataFrameRow elements, and Makie will not do the extra work of detecting that you are in fact using a single column and guessing that you intend to plot its values.
To fix this, you’ll need to actually pass the column values:
hist!(ax2, df2.yourcolumnname)
You can use df2[:, 1] if you just want the first column without bothering with its name.
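A minimal sketch of the whole thing, assuming CairoMakie as the backend (GLMakie should behave the same way) and a made-up column named `value` standing in for your real data:

```julia
using DataFrames
using CairoMakie  # assumption: any Makie backend (e.g. GLMakie) should work the same way

# Hypothetical stand-in for the original data frame.
df2 = DataFrame(value = randn(1_000))

fig = Figure()
ax2 = Axis(fig[1, 1])

# Pass the column itself (a plain Vector), not the data frame or its rows.
hist!(ax2, df2.value; bins = 30)

# Other ways of getting the same column:
#   df2[:, 1]        copy of the first column
#   df2[!, :value]   the stored column, no copy

fig  # in a script, use display(fig) or save("hist.png", fig) instead
```

Note that `hist!` only draws into the axis; the figure is shown when `fig` is returned at the REPL or in a notebook, or when it is displayed or saved explicitly.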
Now, related to:
It is not abstract, but it has an abstract supertype.
Try to run this on your end: isa(df2, AbstractDataFrame). You’ll see that it evaluates to true.
Now, there is a Base.iterate method that is implemented to throw a very informative error (pointing towards eachrow and eachcol usage).
Instead of implementing the method for each concrete type that is a subtype of AbstractDataFrame, the method is implemented for the parent abstract type.
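The pattern can be sketched in plain Julia. The types `AbstractTable`, `TableA` and `TableB` below are hypothetical and the error text is made up; this is not the DataFrames.jl source, just the same idea in miniature.

```julia
# Hypothetical types for illustration; this is not the DataFrames.jl source.
abstract type AbstractTable end

struct TableA <: AbstractTable end
struct TableB <: AbstractTable end

# One method on the abstract supertype covers every concrete subtype.
Base.iterate(::AbstractTable) =
    error("iteration is not defined; use eachrow(t) or eachcol(t) instead")

for T in (TableA, TableB)
    try
        for x in T()   # the for loop calls Base.iterate, which throws
            println(x)
        end
    catch err
        println(T, ": ", err.msg)
    end
end
```

Because dispatch falls back to the supertype's method, the error message naturally mentions the abstract type even though the object you passed is concrete.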
Thank you and indeed it solved the problem. I was confused because none of the examples in the Makie tutorial have to do that. That being said, I get a white image. The dataframe is 2 million points. It should be able to handle that. I switched to a kernel density, and same, computation is really slow and results in a blank figure.
Therefore, I subsampled my dataframe to 100 (one hundred) points and tried to plot a histogram. But alas, still just a blank figure. Even if Makie is not the best for millions of points, we agree that 100 points should not be outside of its capabilities, right?
I am really sorry, I feel like a total idiot bringing you question after question… Is there a way to contribute and give back to this community? Like a “donate” or something? I would definitely do it at my discretion.
I am not sure about others but at least in my case, this is my way of giving back to the community (e.g., producing more of the things that I found valuable when starting my Julia journey). So my answer above was actually in the same give-back category.
Now, I understand that there are more ways in which one can give-back. If you are inclined towards a donation, my feeling is that supporting the core language development is a valuable way to spend money. However, I am not knowledgeable enough about the precise ways things work at that level - so maybe somebody with more information can jump in and add more context.
Regarding your additional questions, I think the best way is to create another topic and provide a minimal working example. That will reduce the time needed to solve your problem and be valuable for others who might encounter similar issues (it will also help search, both here on Discourse and from outside search engines).
@algunion is 100% correct. We require an explicit call to eachrow or eachcol because other ecosystems chose one option or the other as their default, and users coming to DataFrames.jl were sometimes confused.
A histogram does not actually plot the points you give it, just a couple bars, so you should be able to throw almost an arbitrary number of points at it. For me, 100 million take about 3 seconds to render. A white image suggests something else is going on.
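To illustrate why the number of points matters so little for what gets drawn, here is a rough sketch of binning in plain Julia. The function `bin_counts` is a hypothetical helper written for this post, not how Makie computes its histograms.

```julia
# Count values into equal-width bins over [lo, hi); a sketch, not Makie's implementation.
function bin_counts(values, lo, hi, nbins)
    counts = zeros(Int, nbins)
    width = (hi - lo) / nbins
    for v in values
        lo <= v < hi || continue               # skip values outside the range
        idx = floor(Int, (v - lo) / width) + 1
        counts[min(idx, nbins)] += 1           # clamp to guard against rounding at the edge
    end
    return counts
end

values = rand(1_000_000)                       # one million points in [0, 1)
counts = bin_counts(values, 0.0, 1.0, 10)

println(length(values), " points -> ", length(counts), " bars")
```

However many points go in, only `nbins` rectangles have to be rendered, so the cost is a single pass over the data plus a handful of bars.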
With GLMakie you should be able to handle a couple million points at interactive speeds.