`head` seems to be terminology used everywhere, including R and Pandas, so why deprecate it?
Because Julia uses `first` for this.
Eh… That doesn’t really make sense. I think it is a joke. Why is `first` preferred? It used to use `head`; that’s why it was “deprecated”, otherwise there wouldn’t be anything to deprecate.
It got changed from `head` to `first` because Julia uses `first` for this everywhere else (see the docstring for `first`).
Depends what “this” is. `first` gives the first element everywhere in the Julia ecosystem, and this gives the first row, but `head` gives the top 5 rows. `first(df, 5)` feels like a regression, given that this is constantly used for data analysis.
Roughly how I feel. I feel it’s more of an inconvenience than anything.
Well, just define your own `head(df)` function in `startup.jl`.
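For what it's worth, a minimal sketch of what such a definition could look like. The names `head` and `tail` here are user-defined wrappers, not DataFrames.jl API, and the default of 6 rows is an assumption picked for illustration. It also assumes a DataFrames.jl version that supports `first(df, n)` and `last(df, n)` and no longer exports its own `head`/`tail`.

```julia
using DataFrames

# Hypothetical convenience wrappers (not part of DataFrames.jl).
# The default row count of 6 is an assumption; pick whatever you prefer.
head(df::AbstractDataFrame, n::Integer = 6) = first(df, n)
tail(df::AbstractDataFrame, n::Integer = 6) = last(df, n)

# Small example table to try the wrappers on.
df = DataFrame(a = 1:10, b = 'a':'j')

top = head(df)        # first 6 rows
bottom = tail(df, 3)  # last 3 rows
```

Putting this in `startup.jl` only affects your own sessions, so scripts that rely on it would need to carry the definitions with them.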
The issue is that `first` and `last` take a count argument for `AbstractString`s only, not for general collections. Actually, as noted in https://github.com/JuliaData/DataFrames.jl/pull/1932, personally I would be OK to leave `head` and `tail`, but as noted in the comments above, this is just a convenience issue.
Yes, but convenience issues are nice for those of us who do a lot of data analysis in the console. I sometimes feel that design decisions in Julia de-emphasize console convenience a little (e.g. the global scope debacle) and I do feel it’s a shame, especially in cases where the convenience is free (and non-breakingly consistent with previous use).
I never felt this to be good general advice for these situations, since defining convenience functions in your `startup.jl` file renders your code non-shareable.
This is exactly what I meant (that is why I also advocated to leave `nrow` and `ncol`). If you feel we should revert `head` and `tail`, please comment in Why not let `first(df)` have the same behaviour as `head(df)`? by xiaodaigh · Pull Request #1932 · JuliaData/DataFrames.jl · GitHub and I will reopen it for discussion.
I think the issue here is more profound. Whenever a decision is made by package authors to deprecate a function name, it should be assessed first with the entire community of users, maybe a post on discourse? Without any attempt to consult the users, the community will feel ignored.
This has happened before with the Julia language itself, but happily no change so far has caused major complications downstream.
Yeah. Please don’t deprecate `head`, `tail`, `ncol`, `nrow`.
That may be overdoing it
It’s a tough one. Reminds me of the Python 2 to Python 3 transition. Haha.
Maybe you can lead by example here. For example, this commit: Move Kriging solvers to KrigingEstimators.jl · JuliaEarth/GeoStats.jl@7a064b8 · GitHub might be quite disruptive to the community, forcing them to install another package to use the functionality. Maybe this should have been discussed by the community here on discourse first? Will you start opening discourse post for discussions about any API change you make in the future?
None of those changes had any impact on end users. I just moved things around inside the project packages. And really, you want to compare the size of the GeoStats.jl community to the size of DataFrames.jl? The latter is much larger. GeoStats.jl is an ant compared to DataFrames.jl in number of users.
This does raise a valid question of how package authors can more readily access feedback on proposed changes. 99.9% of users do not just go read open GitHub PRs on the packages they use. Perhaps this is another situation where having beta versions of packages easily made available via `Pkg` could help? That doesn’t seem like the only or best solution though.
Come on, this change has been released months ago without anybody complaining, and suddenly it’s a major blocker?
You’re assuming that package authors feel that they need more feedback. When that is the case, opening a topic on discourse is very simple. If you personally want more of a say in the development of a package, start contributing to it.
Maybe I should have said something then…