DataFrame vs. Pandas (vs. Excel...), e.g. to refer to previous row

I think the docs for DataFrames are pretty good, and there’s a wikibook with a more tutorial-like feel for DataFrames that I think stays updated. As for the ecosystem, I assume you’re talking about something like Query.jl, which I don’t use much, but knowing the people involved, I imagine the docs there are pretty good as well. I’m not aware of cheat sheets like the ones you describe, though I always found pandas bewildering and DataFrames much clearer, so I wouldn’t ever have gone searching for such a thing. If it doesn’t exist, someone should do it for sure!

As to (B), it’s quite challenging to know how to help without a better description of the problem, preferably with an MWE (see here for some more tips on how to write up your question in a way that will make it easier for us to help). It’s easy to refer to a previous row if you know the index of the current row (df[i-1, :]), but without knowing whether you’re doing something in a loop, or want a function that operates on a row, or something else, it’s tough to give more guidance.
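Just to make the df[i-1, :] idea concrete, here’s a minimal sketch of two ways it might look. The data and the column names (day, price, change, change2) are made up for illustration, and I’m assuming you want something like “this row minus the previous row”, which may not be what you’re actually after:

```julia
using DataFrames

# Hypothetical example data; the column names are invented for this sketch.
df = DataFrame(day = 1:5, price = [10.0, 12.0, 11.5, 13.0, 12.5])

# Option 1: an explicit loop that indexes the previous row as df[i-1, :price].
# The first row has no previous row, so its value stays `missing`.
df.change = Vector{Union{Missing, Float64}}(missing, nrow(df))
for i in 2:nrow(df)
    df[i, :change] = df[i, :price] - df[i-1, :price]
end

# Option 2: the same result without a loop, using `diff` from Base
# and padding the first entry with `missing`.
df.change2 = [missing; diff(df.price)]
```

The loop version is probably closer to how you’d think about it in Excel, and loops are fast in Julia, so there’s no need to avoid them. The second version only works because this particular operation happens to have a ready-made function.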

For (C), I think DataFrames is the best analogue to pandas. Lots of people like Query (I think that’s more analogous to dplyr, though as I said I don’t use it much). DataFramesMeta was also quite popular, though my impression is that recent API changes to DataFrames itself are making that package increasingly obsolete. And there are a bunch of others.

On a side note - There’s also no harm in using Pandas.jl if that’s what you’re comfortable with! Do what works, I say. When you find something that doesn’t work there, or is clunky, and you want to branch out, definitely come here for help :wink:
