I might have missed this but why is `head(df::AbstractDataFrame)` deprecated?

That is the point: we do not want `?head` to work. I will expand on it in my blog post.

The issue is that it steals `head` as a variable name in global scope, and `head` is a very common name in users' code.

E.g. we had the H2O benchmarks for DataFrames.jl not running correctly for ~2 weeks due to a similar issue. One of the reasons the benchmarks were failing (there were several) was that they used a variable named `big`, which conflicts with the `big` function in Julia Base (I sent a patch to the H2O benchmarks to fix it, and now all is OK). Such things are hard to avoid, of course, but the point is that we do not want to pollute the global namespace with common names like `head`.

Of course, `big` is a name defined in Julia Base, so ideally it should not be used in user code. However, `big` is used very seldom in practice and most users do not even know about it, so it is very easy to introduce a name clash, especially for a casual user who is not deep into the details of the language.
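To make the clash concrete, here is a minimal sketch of how a user variable can shadow a Base function. The module name `ClashDemo` and the value assigned are made up for illustration; the actual benchmark code is not shown in this thread.

```julia
module ClashDemo

# A user variable that happens to share its name with Base.big.
# Because `big` has not been used in this module yet, this creates a new global.
big = 1_000_000

# Later code that expects the Base function now tries to call an integer.
result = try
    big(2)
catch err
    err   # keep the error so it can be inspected instead of aborting
end

# The Base function is still reachable when fully qualified.
qualified = Base.big(2)

end # module

println(typeof(ClashDemo.result))
```

The same mechanism would apply to an exported `head`: any script that assigns to `head` before calling it would silently stop seeing the package function.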

I was suggesting making ?head direct to first, not making it available as a variable name.

One problem right now is that if you google DataFrames you are taken to R code or Pandas code, and Julia's presence on Stack Overflow is overwhelmed by Python. When you google julia dataframes head, the second link after this thread is Getting Started · DataFrames.jl. And that's not a great place to start.

This is an interesting suggestion conceptually, but the concrete way the help system works links these experiences – you can’t get one without the other.

Oh! I didn’t realize that. In that case we need to improve Google. :wink:

Edit: R has ?? (aka help-search), which searches the help system for documentation matching a given character string. But setting that up would be a big undertaking I imagine.

@nalimilan + @viralbshah (as both of you were updating handling of DataFrames.jl documentation web pages):
Do you know if it is possible to disable links like Getting Started · DataFrames.jl that lead to extremely outdated documentation?

Julia has `apropos`. Trying `apropos("head")` gives too many results; `apropos(r"\bhead\b")` is better but doesn't point to `first` (yet :slight_smile: ).

Maybe the system could be extended, e.g. with docstrings declaring tags, and tag matches could be shown first…
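For comparing the two searches outside the REPL, a rough sketch along these lines should work. It assumes the `REPL` standard library is available and that `apropos` accepts an `IO` as its first argument; the helper name `apropos_lines` is made up here.

```julia
using REPL  # the search behind `apropos` lives in the REPL standard library

# Capture the printed search results as a vector of lines.
function apropos_lines(pattern)
    io = IOBuffer()
    REPL.apropos(io, pattern)
    return split(String(take!(io)), '\n'; keepempty=false)
end

plain_hits = apropos_lines("head")        # plain string: matches anywhere in a docstring
word_hits  = apropos_lines(r"\bhead\b")   # regex: only the whole word "head"

println(length(plain_hits), " hits vs ", length(word_hits), " hits")
```

The exact counts depend on the Julia version and on which packages are loaded, so none are shown here.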

I have a script here that runs over a package’s documentation and inserts a big(ish) red popup that notifies the user that the current docs are outdated (see e.g. this).

Julia also has that: you can do e.g. `?"search"` and you'll get a list of objects whose docstring contains the given string.

please forgive my ignorance, these are some great options!

?"head" is easy enough to remember. The docs don’t say, but it looks equivalent to apropos("head"). :+1:

Is there a feature whereby package creators could make a dictionary of strings they want to prioritize in the `apropos` search? E.g. `Dict("head" => "was deprecated in favor of first as of version x.x.x. See also tail")`. Then the dictionary entry would be listed first when `apropos("head")` is called.
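To illustrate the idea, a package-side sketch might look like the following. Nothing here is an existing Julia or DataFrames.jl feature: `SEARCH_HINTS` and `search_with_hints` are hypothetical names, and the result list passed in stands in for whatever the real search returns.

```julia
# Hypothetical registry: search term => note that should be listed first.
const SEARCH_HINTS = Dict(
    "head" => "deprecated in favor of `first`; see also `tail`",
    "tail" => "deprecated in favor of `last`; see also `head`",
)

# Return the hint (if any) followed by the ordinary search results.
function search_with_hints(term::AbstractString, results::Vector{String})
    hint = get(SEARCH_HINTS, term, nothing)
    return hint === nothing ? results : vcat("$term: $hint", results)
end

# `results` would normally come from the documentation search itself.
hits = search_with_hints("head", ["Base.Threads.threadid"])
foreach(println, hits)
```

A plain `Dict` keeps the lookup exact; terms without an entry fall through to the unmodified results.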

Not that I know of, would be pretty cool. There’s still a lot of open real estate in the not-essential-but-would-be-nice tooling arena. Mostly I suspect stuff like this will need to be in packages, but :man_shrugging:.

Also, to pivot back to this exchange -

I think @xiaodai's point was that if this is something that appeals to you, it wouldn't be that hard to implement.

```julia
module DataFramesPlus

export head, tail

using Reexport
@reexport using DataFrames

head(df::AbstractDataFrame, n=10) = first(df, min(n, nrow(df)))

tail(df::AbstractDataFrame, n=10) = last(df, min(n, nrow(df)))

end
```

… should maybe work? I didn’t test it. It’s technically type piracy, but if you’re cool with that :sunglasses:

More concretely, I personally don’t want @bkamins working on anything else - his efforts on DataFrames itself (not just development, but also blog posts etc) are already beyond what could reasonably be expected.

But Julia makes it absurdly easy to roll your own solution for things like this, and there may be others that are in the same boat and would use it too. My own solution is to just stick with Julia for everything so I don’t get confused :stuck_out_tongue_winking_eye:

Thanks for sharing this, Kevin: I would not have known about Reexport and would have copied entire definitions. Excellent to know. I won't be needing this right away, as our discussion will make me remember `first` and `last` for some time :joy: (a binding for `??` would be extra lazy). In addition to making more string keywords searchable, it would be nice to make them available to autocomplete too (some people hate it, but it's useful for newbies). As I start typing `first` (after `using DataFrames` of course), I get an autocomplete suggestion. Autocomplete is even clever enough to suggest `first` when I incorrectly type `fist`, which is amazing. Now the crazy idea would be to suggest `first` when I type `head` :crazy_face:

To get `apropos` to pick up "head", it looks like it would just be a matter of creating a module for that purpose. Ranking the terms would be a bit more work. Edit: The suggestions use the "Levenshtein distance", no wonder they're so clever. If we could set `distance("first", "head")` to 0 or something small enough…
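As a rough sketch of that last idea: the textbook Levenshtein distance plus an override table could look like this. It is not the REPL's own implementation, and `ALIAS_DISTANCE` and `aliased_distance` are hypothetical names introduced only for illustration.

```julia
# Textbook Levenshtein distance (insertions, deletions, substitutions), row by row.
function levenshtein(a::AbstractString, b::AbstractString)
    s, t = collect(a), collect(b)
    prev = collect(0:length(t))          # distances from the empty prefix of `a`
    for i in eachindex(s)
        curr = similar(prev)
        curr[1] = i                      # delete the first i characters of `a`
        for j in eachindex(t)
            cost = s[i] == t[j] ? 0 : 1
            curr[j + 1] = min(prev[j + 1] + 1,   # deletion
                              curr[j] + 1,       # insertion
                              prev[j] + cost)    # substitution or match
        end
        prev = curr
    end
    return prev[end]
end

# Hypothetical override table: (typed term, suggested name) => forced distance.
const ALIAS_DISTANCE = Dict(("head", "first") => 0, ("tail", "last") => 0)

# Use the override when one exists, otherwise fall back to the real distance.
aliased_distance(query, name) =
    get(ALIAS_DISTANCE, (query, name), levenshtein(query, name))

println(levenshtein("fist", "first"))       # one insertion apart
println(levenshtein("head", "first"))       # no letters in common
println(aliased_distance("head", "first"))  # forced to 0 by the override
```

With the override in place, `first` would rank at the top for `head` even though the two words share no letters.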