Features most wanted in Julia

We had an excellent discussion on Slack the other day where many people gave input on what they considered the “biggest things missing in Julia”. I thought it would be a shame if all the feedback just disappeared into the Slack memory hole, so here I’m copy-pasting some answers from that thread. Please do respond with your own!

Note that this pertains to missing features that could plausibly be implemented, not existing bad design decisions. The top point from many people (including myself) was “Lack of third party packages”, but that’s hardly useful from a Julia dev perspective, so I’ve not included it here.

Static analysis tools

Interfaces:

Scott Paul Jones:

[We need] a way of explicitly indicating names which are meant to be public even if they are not exported. Many times you don’t want to pollute the namespace when people do using, but then the ecosystem becomes fragile when things aren’t exported, since there’s no consistent way of discovering whether something is part of the module / package API or not otherwise.
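To make the problem concrete, here is a minimal sketch. The module `Shapes` and its functions are made up for illustration; the only point is that reflection alone cannot tell a public-by-intent name from an internal helper when neither is exported.

```julia
module Shapes

export area

# Exported: clearly part of the API.
area(r) = pi * r^2

# Meant to be public, but deliberately not exported to keep namespaces clean.
circumference(r) = 2 * pi * r

# Internal helper that users should not rely on.
_check_radius(r) = r >= 0

end # module

# Only the exported name (plus the module itself) is listed by default...
exported = names(Shapes)

# ...while `all=true` lumps the public-by-intent function together with the helper.
everything = names(Shapes; all=true)
```

With only this information, a user has to fall back on documentation or naming conventions (such as a leading underscore) to guess what is safe to call.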

Jakob Nissen:

There is currently no way of knowing how to implement a subtype of e.g. Number, Char or AbstractDict. Our current system of subtyping has us rely on comprehensive, up-to-date documentation on abstract types to know what methods to implement, but this is unrealistic: Even in Base, most abstract types are not documented.
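A small sketch of what this looks like in practice. It uses `AbstractVector` because its required methods are among the few that are written down; `Countdown` is a hypothetical type invented for the example.

```julia
# Hypothetical type: declaring the supertype is all Julia asks for up front.
struct Countdown <: AbstractVector{Int}
    n::Int
end

# Nothing complained at definition time; the gap only shows up on use.
err = try
    length(Countdown(3))
    nothing
catch e
    e
end

# After adding the two methods the array interface requires, generic code works.
Base.size(c::Countdown) = (c.n,)
Base.getindex(c::Countdown, i::Int) = c.n - i + 1

items = collect(Countdown(3))
```

For abstract types without documented requirements, the same trial-and-error loop is the only way to find out which methods are missing.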

Static compilation

Michael Savastio:

Making it easy to produce a self-contained, reasonably sized binary executable is probably still the biggest limitation of the language […] being easy to get reasonable static binaries would greatly expand the number of reasonable use cases for the language, so I see that as being a necessity for going much beyond a numerical and scientific computing focused bunch

Compile time latency:

Ari Huttunen:

For me the biggest pain point has been the constant precompiling when I run notebooks, as I often re-start the kernel.

Jakob Nissen:

A multithreaded compiler would be nice

Documentation:

Ari Huttunen:

The second [biggest pain point] is quality of documentation and/or examples

James Doss-Gollin:

Documentation is a general problem. There are lots of stable packages that other developers rely on that only have a GitHub README for documentation.

Data types:

Scott Paul Jones:

[We should] be able to have a type that is a union between several pointer / bits types, that can be determined by the lower bits (this is a frequent low-level trick, even used by Julia itself internally).
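For readers who have not seen the trick, here is a hand-rolled sketch of low-bit tagging. All names (`Tagged`, `tag_int`, and so on) are hypothetical, and the sketch assumes pointers are at least 2-byte aligned and that 63 bits are enough for the integer payload. It illustrates the idea only; it is not how Julia does it internally, and not the language feature being requested.

```julia
# One machine word whose lowest bit says what the remaining bits mean.
struct Tagged
    bits::UInt64
end

const INT_TAG = UInt64(1)

# Small integers: shift left by one and set the tag bit (63 bits of payload survive).
tag_int(x::Int64) = Tagged((reinterpret(UInt64, x) << 1) | INT_TAG)

# Pointers: assumed to be at least 2-byte aligned, so the low bit is already 0.
tag_ptr(p::Ptr{Cvoid}) = Tagged(UInt64(UInt(p)))

is_int(t::Tagged) = (t.bits & INT_TAG) == INT_TAG

# The arithmetic right shift on Int64 restores the sign of the original integer.
untag_int(t::Tagged) = reinterpret(Int64, t.bits) >> 1
untag_ptr(t::Tagged) = Ptr{Cvoid}(UInt(t.bits))

a = tag_int(-5)
b = tag_ptr(Ptr{Cvoid}(UInt(0x1000)))
```

The request is for the language to understand such a layout directly, so that the variants could be real types rather than manual bit twiddling.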

[We need] “Memory buffers” (which could be used to implement arrays and strings 100% in Julia).

Being able to specify a finalizer for a type, instead of having to use a 16-byte (on 64-bit machines) entry for every object.
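For context, this is roughly how finalizers are attached today: per object, typically in the constructor. `Handle` and `released` are made up for the sketch, which only shows the registration pattern and says nothing about the per-object cost mentioned above.

```julia
# Records which handles have been cleaned up (stand-in for freeing a real resource).
const released = Int[]

mutable struct Handle
    id::Int
    function Handle(id::Int)
        h = new(id)
        # The finalizer is registered on this one instance, not on the type.
        finalizer(obj -> push!(released, obj.id), h)
        return h
    end
end

h = Handle(7)

# `finalize` runs the registered finalizers right away instead of waiting for the GC.
finalize(h)
```

A per-type finalizer would let the runtime look the cleanup function up from the type instead of storing an entry for every instance.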

Tagged unions where the type could be determined by the instance itself (instead of the current implementation for small unions in Julia, which has to allocate a tag byte per element at the end of the array).

Jakob Nissen, on having Rust-like Enums in Julia:

I see two advantages […] it’s easier to make nested types […] but whatever, it’s just syntactic sugar. I think the real killer is the exhaustive matching. Consider Julia’s findfirst function. It infers to Union{Int, Nothing}. But it actually returns either/or. If that instead returned an enum of these two types, it would not be possible to ignore the fact that it can actually return nothing. […] It’s almost like a trap in Julia, that the type is inferred to be possibly a Nothing, but when testing it appears exactly as if it returns an Int, until it suddenly fails.
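The trap is easy to reproduce. In this sketch the helper names are hypothetical; only `findfirst` and `iseven` come from Base.

```julia
xs_hit  = [1, 4, 6]
xs_miss = [1, 3, 5]

# Hypothetical helper that ignores the `nothing` case.
value_at_first_even(xs) = xs[findfirst(iseven, xs)]

ok = value_at_first_even(xs_hit)    # fine whenever a match exists

# With no match, `findfirst` returns `nothing` and the indexing call throws.
failed = try
    value_at_first_even(xs_miss)
    false
catch
    true
end

# Handling both cases explicitly is possible, but nothing forces you to.
function first_even_or_nothing(xs)
    i = findfirst(iseven, xs)
    return i === nothing ? nothing : xs[i]
end
```

With exhaustive matching, the first helper would be rejected up front instead of failing on the first input without a match.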


There isn’t always a 16-byte overhead(?). See the Essentials page of the Julia documentation, and on that page the entry for isbitstype:

Return true if type T is a “plain data” type, meaning it is immutable and contains no references to other values, only primitive types and other isbitstype types.

My understanding is that there’s no overhead when such types are stack allocated, i.e. (almost) always. And even for an array of such values, a 16-byte overhead seems rather low for the whole array, if that’s what it really is (there’s also StaticArrays.jl for no overhead where it applies, and tuples too?).
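A quick way to see which types qualify is to ask `isbitstype` directly. The three struct definitions below are hypothetical examples; the sketch only checks the property and the size, and does not measure any allocation overhead.

```julia
# Immutable, and every field is itself plain data.
struct Point
    x::Float64
    y::Float64
end

# Same fields, but mutable, so it is not a plain-data type.
mutable struct MutablePoint
    x::Float64
    y::Float64
end

# Immutable, but holds a reference to a heap-allocated String.
struct Labeled
    name::String
    p::Point
end

plain       = isbitstype(Point)
mutable_one = isbitstype(MutablePoint)
labeled     = isbitstype(Labeled)
point_bytes = sizeof(Point)   # two Float64 fields, no extra header in this number
```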

A destructor as in C++ might change things (I’m not sure about finalizers), but that would make Julia a very different language, one without garbage collection (it’s still possible to do your own memory management). For e.g. strings you actually want GC (for long strings), rather than finalizers/destructors and (naively) copying.

I would be rather happy with only a 16-byte overhead (memory is cheap) for, say, strings, as the (64-bit) pointer is 8 bytes and you need to store the length somehow. I’m more worried about the pointer indirection, which I believe ShortString.jl (and FixedSizeString.jl) solve. I actually had the same idea as others, described here (and there is nothing in the way of doing it, if it’s not already done), to make this more general.

Most of these feature requests have been brought up before, some of them on a regular basis, and the main reason for not having them yet is that (1) they require quite a bit of work, (2) which no one has gotten around to yet.

Pretty much everything not in the issue tracker can expect the same fate. Most of the above already have an open issue, and for the rest you should consider writing up one.


A note on docs: speaking for myself, I find it sometimes easier to just spend time answering questions on Slack rather than spend time writing extensive docs, because writing extensive docs is hard work… I imagine I’m not the only one.

While pkg devs can do better, I think as users we could also improve our reflex to open docs PRs or issues suggesting doc additions once we’ve « figured something out » (even if it was obvious in retrospect).

People often end up getting help (quickly!) on Slack or Discourse but the insight does not end up in the docs; imo that should be on the users’ shoulders as well as on the devs’… that’s kind of a way to “pay” for cool free tools :grin:


Is the total time spent answering on Slack < the time to write docs? :smiley:


For me it isn’t about efficiency. I genuinely enjoy answering people’s questions on Zulip, Discourse or Slack. I don’t enjoy writing documentation, so for the most part I’ll just spend my free time answering questions.

I often learn new things from answering questions and get opportunities to connect with and welcome new community members. For me, this is a better use of my time, even if it’s less efficient for growing the community.

Free software communities are mostly only effective at getting volunteers to do things they care about or enjoy doing.

Given that most people don’t seem to enjoy writing documentation, it’d probably just be more effective to raise money to pay someone to write docs than to guilt users into doing it (assuming we actually agree that these docs are so desperately needed).


I’ll add a corollary to your thought, Mason. I enjoy writing longer-form stuff that looks like documentation (as evidenced by the average length of my posts on GH/Discourse :stuck_out_tongue:), but not submitting it upstream because of friction and polish.

What do I mean by “friction and polish”? Well, most docs PRs that aren’t quick typo fixes face multiple speedbumps:

  1. Finding the correct place to insert changes. This often includes deciding how/what to move around when making structural changes. On Discourse/Zulip/Slack I can write an answer that references multiple parts of disparate projects, but docs are expected to be coherent. The result is mild-to-moderate analysis paralysis while trying to chop up and rearrange said answers into a PR-ready form.
  2. Taking the time to discuss back and forth with maintainers if the tone, structure, headings, links etc. mesh with their style. This may include arguing (or rather, negotiating) about one or more of the above.
  3. Checking the rendered result and dealing with Documenter quirks. This iteration cycle can be a pain on larger projects.

Perhaps there is a need for more “community wiki” types of documentation that carry no expectation of being complete, coherent or self-consistent. Community members could append answers and discussions, while contributors could selectively pull out nuggets for inclusion in “proper” docs. Do any of the existing communication platforms support this kind of functionality?

Edit: perhaps a good parallel is the Discussions feature on GitHub. Not all discussions will result in issues being created, but it (in theory) cuts down on superfluous issues while still keeping an easily-searchable record of project-related discourse.


Agreed. Often with semi-cohesive things like docs pages, less can be more. Piling on more documentation can just make it harder / less likely to be read. It requires a lot of thought and editorializing to make good documentation.

Yeah, this’d be great. There’s a wikibook: https://en.wikibooks.org/wiki/Introducing_Julia but I think a community wiki that can just sprawl would be really nice.


I wanted to add my two cents as a newcomer who made extensive use of the Julia wikibook.

It provided an amazing middle ground between the “Intro to programming” articles and the Julia documentation. The secret sauce, that made it so useful for me, was providing a glimpse into tricks or best practice from the very beginning. The wiki has a tendency to drop more sophisticated techniques or demonstrate the “Julian” style after introducing the raw basics. These brief nuggets always relate to the same problem or a logically related task. They offer a clear sense of direction for people just starting out.

I started lurking here to feed off the ‘impermanent’ knowledge that tends to surface. I definitely support the addition of deeper, targeted, or stand-alone articles to the wiki, particularly those that expose the robustness of the language. Thanks to any who contributed to the wikibook.


Interfaces à la Scala, please. Julia APIs tend to be hard to use because there is no way to declare or check the requirements of the arguments. Passing the wrong thing tends to break deep inside some dependent package, throwing an error completely unrelated to the bad argument. The best we can do at the moment is define an interface as the existence of a bunch of methods, and check them at runtime, e.g.

"""
    implements_learner(T::DataType)
Test if a type implements the [`AbstractLearner`](@ref) interface.
"""
function implements_learner(T::DataType)
    return hasmethod(model,(T,)) &&
        hasmethod(model!,(T,Any)) &&
        hasmethod(loss,(T,)) &&
        hasmethod(loss!,(T,Any)) &&
        hasmethod(opt,(T,)) &&
        hasmethod(opt!,(T,Any)) &&
        hasmethod(cbs,(T,)) &&
        hasmethod(add_cb!,(T,AbstractCallback))
end

@assert implements_learner(Learner)

And as far as I can tell, there is no way to check whether a method returns the correct thing: you can check that type T has a getfoo(T) method, but you cannot check that it returns a Foo.
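The closest approximation I am aware of is to ask the compiler what it infers. The sketch below does that with `Base.return_types`, which is unexported reflection rather than a supported interface mechanism; its answer depends on type inference, so treat it as a heuristic and not a contract. `Foo`, `Good`, `Bad`, `getfoo` and `returns_foo` are all hypothetical names.

```julia
struct Foo end
struct Good end
struct Bad end

getfoo(::Good) = Foo()
getfoo(::Bad)  = "not a Foo"

# Heuristic check: a method must exist, and every inferred return type must be a Foo.
function returns_foo(T::Type)
    hasmethod(getfoo, Tuple{T}) || return false
    inferred = Base.return_types(getfoo, Tuple{T})
    return !isempty(inferred) && all(R -> R <: Foo, inferred)
end

good_ok = returns_foo(Good)
bad_ok  = returns_foo(Bad)
int_ok  = returns_foo(Int)   # no getfoo method at all
```

If inference gives up and reports `Any`, this check rejects a method that may well be correct, which is part of why a declared interface would be preferable.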


Just want to leave my +1 for Static Compilation. It would be a game changer to my projects.
