I might have missed this but why is `head(df::AbstractDataFrame)` deprecated?

xiaodai · August 26, 2019, 7:42am

head seems to be terminology used everywhere including R and Pandas, so why deprecate it?

nalimilan · August 26, 2019, 9:28am

Because Julia uses `first` for this.

xiaodai · August 26, 2019, 9:45am

Eh… That doesn’t really make sense. I think it is a joke. Why is `first` preferred? It used to be `head`; that’s why it was “deprecated”, otherwise there wouldn’t be anything to deprecate.

kristoffer.carlsson · August 26, 2019, 9:58am

It got changed from `head` to `first` because Julia uses `first` for this everywhere else (see the docstring for `first`).

mkborregaard · August 26, 2019, 10:05am

Depends what “this” is. `first` gives the first element everywhere in the Julia ecosystem, and this gives the first row, but `head` gives the top 5 rows. `first(df, 5)` feels like a regression, given that this is constantly used for data analysis.

xiaodai · August 26, 2019, 10:07am

Roughly how I feel. I feel it’s more of an inconvenience than anything.

ufechner7 · August 26, 2019, 10:33am

Well, just define your own `head(df)` function in `startup.jl`.
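
A minimal sketch of what such a definition could look like, assuming DataFrames.jl is installed and that `head` is no longer exported by the installed version. The `tail` companion and the default of 5 rows (the figure mentioned earlier in the thread) are illustrative assumptions, not part of the package:

```julia
using DataFrames

# Hypothetical convenience wrappers (e.g. placed in ~/.julia/config/startup.jl).
# They only forward to the two-argument `first`/`last` methods that
# DataFrames.jl provides; the default row count of 5 is an assumption.
head(df::AbstractDataFrame, n::Integer = 5) = first(df, n)
tail(df::AbstractDataFrame, n::Integer = 5) = last(df, n)

df = DataFrame(a = 1:10, b = 11:20)

head(df)      # data frame with the first 5 rows
head(df, 3)   # data frame with the first 3 rows
tail(df, 2)   # data frame with the last 2 rows
first(df)     # a single DataFrameRow, not a data frame
```

Code that relies on these wrappers only runs where the same definitions are loaded.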

bkamins · August 26, 2019, 11:03am

The issue is that `first` and `last` take a count argument for `AbstractString`s only, not for general collections. Actually, as noted in https://github.com/JuliaData/DataFrames.jl/pull/1932, personally I would be OK to leave `head` and `tail`, but as noted in the comments above, this is just a convenience issue.
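
For reference, a small Base-only sketch of the string methods referred to here; the example string is arbitrary:

```julia
s = "DataFrames"

first(s)      # 'D', a single character
first(s, 4)   # "Data", the first 4 characters
last(s, 6)    # "Frames", the last 6 characters
```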

mkborregaard · August 26, 2019, 11:14am

Yes, but convenience issues are nice for those of us who do a lot of data analysis in the console. I sometimes feel that design decisions in Julia de-emphasize console convenience a little (e.g. the global scope debacle) and I do feel it’s a shame, especially in cases where the convenience is free (and non-breakingly consistent with previous use).

I never felt that defining your own `head(df)` was good general advice for these situations, since defining convenience functions in your `startup.jl` file renders your code non-shareable.

bkamins · August 26, 2019, 11:22am

This is exactly what I meant (that is why I also advocated to leave `nrow` and `ncol`). If you feel we should revert `head` and `tail`, please comment in pull request #1932 of JuliaData/DataFrames.jl (“Why not let `first(df)` have the same behaviour as `head(df)`?” by xiaodaigh) and I will reopen it for discussion.

juliohm · August 26, 2019, 11:27am

I think the issue here is more profound. Whenever a decision is made by package authors to deprecate a function name, it should be assessed first with the entire community of users, maybe a post on discourse? Without any attempt to consult the users, the community will feel ignored.

This has happened before with the Julia language itself, but happily no change so far has caused major complications downstream.

xiaodai · August 26, 2019, 11:37am

Yeah. Please don’t deprecate `head`, `tail`, `ncol`, or `nrow`.

mkborregaard · August 26, 2019, 11:42am

That may be overdoing it

xiaodai · August 26, 2019, 11:48am

It’s a tough one. Reminds me of the Python 2 to Python 3 transition. Haha.

kristoffer.carlsson · August 26, 2019, 12:47pm

Maybe you can lead by example here. For example, this commit (“Move Kriging solvers to KrigingEstimators.jl”, JuliaEarth/GeoStats.jl@7a064b8) might be quite disruptive to the community, forcing them to install another package to use the functionality. Maybe this should have been discussed by the community here on Discourse first? Will you start opening a Discourse post for discussion of any API change you make in the future?

juliohm · August 26, 2019, 12:56pm

None of those changes had any impact on end users. I just moved things around inside the project packages. And really, you want to compare the size of the GeoStats.jl community to the size of DataFrames.jl? The latter is much larger. GeoStats.jl is an ant compared to DataFrames.jl in number of users.

tbeason · August 26, 2019, 1:06pm

This does raise a valid question of how package authors can more readily get feedback on proposed changes. 99.9% of users do not just go read open GitHub PRs on the packages they use. Perhaps this is another situation where having beta versions of packages easily made available via Pkg could help? That doesn’t seem like the only or best solution though.

nalimilan · August 26, 2019, 1:39pm

Come on, this change was released months ago without anybody complaining, and suddenly it’s a major blocker?

kristoffer.carlsson · August 26, 2019, 1:43pm

You’re assuming that package authors feel that they need more feedback. When that is the case, opening a topic on discourse is very simple. If you personally want more of a say in the development of a package, start contributing to it.

xiaodai · August 26, 2019, 1:47pm

Maybe I should have said something then…
