I’m not sure where to post this so feel free to move this to a different section of the site if I missed the mark.
I’m learning my way around Julia, and life could be a bit simpler with some minor tweaks. Here’s a wish list.
1. All package documentation should use real examples from default datasets. When I taught myself R, I would always reference the default Iris, Orange, or Titanic data, which I was very familiar with; that made it immediately obvious what a new package or function was doing. Try typing “serialize” into the Documentation panel of Juno. This is part of the basic “show me, don’t tell me” lazy programming that makes Stack Overflow a very profitable business; people really like it.
2. I’m pedantic (or stupid; this is also an irk I have after learning Python), but I don’t understand why we can’t have the option of explicitly naming all arguments in a call. It’s very helpful not to have to remember the argument order for new functions; once I’ve used them enough I naturally drop the keywords, but the option is nice. This is also helpful when teaching others how to transition over (Julia - package.function(file = “path”, kwarg1 = 1) == R - package::function(path = “path”, kwarg26 = 1) and so forth).
I think 1 is something that naturally improves with the maturity of the language, as people leave nuggets of utility all over. 2 is a personal preference but seems reasonable, and it would make my code a lot easier to come back to without referencing documentation over and over as I’m getting acclimated.
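For reference, Julia does allow call-by-name, but only for arguments that the function author declares after a semicolon. A minimal sketch of what that looks like; `load_table` and its arguments are made-up names for illustration, not any real package's API:

```julia
# Hypothetical function: `path` is positional, everything after `;` is keyword-only.
function load_table(path; delim = ',', header = true)
    return (path = path, delim = delim, header = header)
end

# Keyword arguments can be passed by name, in any order.
opts = load_table("data.csv"; header = false, delim = ';')

# A positional argument cannot be passed by name: this call throws,
# because no method of `load_table` takes zero positional arguments.
named_positional_ok = try
    load_table(path = "data.csv")
    true
catch
    false
end
```

So the explicitness in item 2 is available today wherever the function was written with keywords; the wish is really about positional arguments.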
Regarding the learning of arguments, if you use an IDE, and know the function you want, you can just type it and open the parens and e.g. in Juno this panel appears:
Welcome to Julia, and thanks for taking the time to make suggestions.
I am not sure if you realize that the development of Julia packages is very decentralized: individuals collaborate loosely, and there are no central standards or quality control. Because of this, suggestions like this are read by a few people, but may not be seen by a lot of package maintainers.
It is better to open an issue for a particular package where you think that the documentation can be improved — or even better, make a pull request (search for the term on this forum to learn how). As a newcomer, you are in the unique position to see gaps in the docs that long-time users may miss.
Using “default datasets” is not something that is common in Julia at this point. When comparing with R, it is good to keep in mind that while R focuses on data analysis and statistics, Julia is a much more general programming language, so there are no built-in datasets comparable to e.g. mtcars and iris in R, just like there is no “built-in” example data for other applications.
Now that artifacts are supported, I think it would be nice to have a minimal-dependency package that makes such datasets available, e.g. as vectors of NamedTuples, for use by all packages that support the Tables.jl interface. Tracking down some datasets, making sure that they are in the public domain or have a nice license, and packaging them would be a nice contribution. There is
but something with public domain, CC or MIT license and no dependencies beyond Tables.jl might see more adoption for the use case you suggest.
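As a rough sketch of the shape such a package could expose: a plain `Vector` of `NamedTuple`s needs no dependencies at all, and Tables.jl treats it as a row table. The variable name, column names, and values below are an invented toy sample, not a real dataset.

```julia
# Toy rows for illustration only; the values are invented, not real measurements.
flowers = [
    (species = "a", petal_length = 1.5, petal_width = 0.2),
    (species = "a", petal_length = 1.3, petal_width = 0.3),
    (species = "b", petal_length = 4.5, petal_width = 1.5),
]

# Each row is a NamedTuple, so fields are reachable by name with Base alone.
petal_lengths = [row.petal_length for row in flowers]

# Simple filtering and aggregation, still without extra packages.
b_rows = filter(row -> row.species == "b", flowers)
mean_length = sum(petal_lengths) / length(petal_lengths)
```

A documentation example could then pass `flowers` straight to any function that accepts a Tables.jl-compatible source.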
@DNF
Perusing that long thread, it seems that I’m in the minority of people who can imagine that there’s a solution worth finding for stylistic accommodation. I have a strong opinion and a weak ability to produce a compelling argument in favor of it that hasn’t already been shut down. Hopefully someone smarter than me comes along who is more convincing and able to also provide solutions.
If nothing else, you’ve revealed kindred spirits in @akdor1154 and @nickeubank. Thank you.
[wondering now how to re-purpose all the functions I use to take kwargs instead of positional args, just for fun]
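One low-tech way to do that is a thin keyword-only wrapper around an existing positional function. A sketch using `clamp(x, lo, hi)` from Base; the wrapper name `kwclamp` and its keyword names are made up for illustration:

```julia
# Keyword-only wrapper around Base.clamp(x, lo, hi).
# Keywords declared without defaults are required, so omitting one raises an error.
kwclamp(; value, lo, hi) = clamp(value, lo, hi)

a = kwclamp(value = 12, lo = 0, hi = 10)   # clamped down to the upper bound
b = kwclamp(hi = 10, value = -3, lo = 0)   # argument order no longer matters
```

The cost is that the wrapper's keyword names become part of its interface, which is exactly the trade-off being argued about in this thread.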
@Tamas_Papp
Thanks for the welcome, I like this community.
Good idea, I’ll chime in on the documentation of some packages that I use and maybe try and even contribute to some of the changes.
For free data, I think we could use the documented, publicly available datasets at openml.org, e.g., here is the iris dataset. I just don’t know how to implement an artifact or publish a package yet.
But in this case, in addition to the arguments, I have a visceral opposition to the named argument suggestion. It feels like an intrusion to have my variable names dragged out into the light of day. And I want to be able to rename them frivolously. I don’t see any way to compromise there, frankly.
And while I think there is a pretty compelling argument for it based in the idea that to err is human, and so explicitness is a good defensive programming strategy, I recognize the challenges to implementation pointed out by @StefanKarpinski in a multiple dispatch world, and more generally to the fact that there doesn’t seem to be a critical mass that agrees.
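To make the dispatch point concrete: as far as I understand it, keyword arguments do not take part in method dispatch, so two definitions that differ only in their keywords count as the same method and the later one replaces the earlier one. A small sketch with made-up `area` and `shape` functions:

```julia
# Positional arguments select methods by number and type.
area(r::Real) = pi * r^2           # circle from a radius
area(w::Real, h::Real) = w * h     # rectangle from width and height

# Keywords do not: both definitions below have the same (empty) positional
# signature, so the second definition overwrites the first.
shape(; radius = 1.0) = "circle"
shape(; side = 1.0) = "square"

n_area_methods = length(methods(area))
n_shape_methods = length(methods(shape))
result = shape()   # only the later definition is left to call
```

Letting every positional argument be passed by name would have to fit alongside this rule, which is one reason it is harder here than in R or Python.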