Examples of Well-Written Julia packages to emulate--in terms of generic functions, design, etc

00krishna · July 28, 2019, 7:02am

I am trying to understand how to write programs in a more Julian, less object-oriented way. I was reading the lectures by the QuantEcon folks on how to write generic functions and more generic code. The lectures are quite good, but they present a lot of rules of thumb, such as: use x = similar(y) instead of x = zeros(length(y)). I understand the intuition behind using more generic code, but I would really like to see this principle in practice. Can anyone suggest some well-written Julia packages that I can study and emulate? I am thinking in particular of generic code and, if available, good use of multiple dispatch.
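To make the rule of thumb concrete, here is a minimal sketch using only Base. The function names double_rigid and double_generic are made up for illustration; the point is only what each allocation strategy does to the result type.

```julia
# Non-generic: always returns a Vector{Float64}, whatever `y` is.
function double_rigid(y)
    x = zeros(length(y))
    for i in 1:length(y)
        x[i] = 2 * y[i]
    end
    return x
end

# Generic: `similar(y)` keeps the array type, element type, and shape of `y`,
# and `eachindex` avoids hard-coding the index range.
function double_generic(y)
    x = similar(y)
    for i in eachindex(y)
        x[i] = 2 * y[i]
    end
    return x
end

y = Rational{Int}[1//2 3//4; 5//6 7//8]   # a 2x2 matrix of rationals

a = double_rigid(y)    # flattened to a Vector{Float64}; exact rationals are lost
b = double_generic(y)  # still a 2x2 Matrix{Rational{Int}}
```

The second version contains no extra logic; it simply asks the input what kind of output to build instead of deciding that up front.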

ChrisRackauckas · July 28, 2019, 7:25am

I think there’s somewhat of a divide between fully generic and efficient package code vs well-written clean code. You sometimes have to sacrifice readability in order to get the last bits of genericness and efficiency. That said, it’s usually due to a missing abstraction which will get fixed over time.

Nonetheless, if you want to know how to make something as generic as possible, take a look at OrdinaryDiffEq.jl. That uses every trick under the sun to have as many types work as possible, and it is highly efficient.

- Dense Jacobians are allocated with @. false * u * u' (well, it has its own @.. in DiffEqBase because Julia’s native broadcast has a lot of slow cases, but that’s a detail). This is for the case where u isa CuArray, so zeros or similar won’t work.
- All cached variables are unit-aligned, so rate_prototype = u0 ./ t is the initialization instead of similar(u0) because of units, and similar(u0,rateType) is not generic enough (it has weird counterexamples).
- There is some purposeful densification via similar(u,rateType,axes(u)) to get the Array-like type which matches an AbstractArray in order to perform linear algebra.
- ArrayInterface.jl and traits from DiffEqBase are used for things like is_structural since isa SparseMatrixCSC is not a large enough class to specialize on.
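The first two tricks can be tried on plain arrays with Base alone. The sketch below is not OrdinaryDiffEq.jl's code: the helper names jac_prototype and rate_prototype_for are hypothetical, and a BigFloat vector stands in for the GPU and unitful arrays mentioned above, which would need extra packages.

```julia
# Jacobian prototype: multiplying by `false` zeroes every (finite) entry while
# letting broadcasting pick the element type and array type from `u` itself.
jac_prototype(u) = false .* u .* u'

# Rate prototype: dividing the state by a time value gives entries of type
# "state per time". With a units package this would also carry the units.
rate_prototype_for(u0, t) = u0 ./ t

u = BigFloat[1, 2, 3]
J = jac_prototype(u)                        # 3x3 Matrix{BigFloat} of zeros

du = rate_prototype_for([1, 2, 3], 0.5)     # Vector{Float64}, although u0 holds Ints
```

In both cases nothing about the output type is written down explicitly; it falls out of the arithmetic on the inputs, which is why the same line can serve many array types.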

And so on. Again, if I wasn’t writing library code that had to cover every case a user has asked for, I likely wouldn’t do half of those things and would spend time building a proper abstraction… but sometimes you need to get stuff done.

If you want a more sane example, LightGraphs.jl is a good one. It has well-defined interfaces on AbstractGraphs that it sticks to in its algorithm implementations. The graph implementations are parameterized, so most of the normal stuff will work. However, you can’t expect it to work on GPUs or anything fancy like that through compiler tricks.
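As a rough illustration of what "well-defined interfaces that the algorithms stick to" can look like, here is a toy sketch. Every name in it (AbstractToyGraph, nvertices, outneighbors, bfs_distances) is hypothetical and is not LightGraphs.jl's actual API.

```julia
# Hypothetical interface: every subtype must implement
# `nvertices(g)` and `outneighbors(g, v)`.
abstract type AbstractToyGraph end

# One implementation: adjacency lists, parameterized on the vertex integer type.
struct ToyAdjGraph{T<:Integer} <: AbstractToyGraph
    adj::Vector{Vector{T}}
end
nvertices(g::ToyAdjGraph) = length(g.adj)
outneighbors(g::ToyAdjGraph, v::Integer) = g.adj[v]

# A second implementation that stores no edges: the directed cycle 1 -> 2 -> ... -> n -> 1.
struct ToyCycleGraph <: AbstractToyGraph
    n::Int
end
nvertices(g::ToyCycleGraph) = g.n
outneighbors(g::ToyCycleGraph, v::Integer) = (mod1(v + 1, g.n),)

# The algorithm is written once, against the interface only.
function bfs_distances(g::AbstractToyGraph, source::Integer)
    dist = fill(-1, nvertices(g))      # -1 marks "not reached"
    dist[source] = 0
    queue = [source]
    while !isempty(queue)
        v = popfirst!(queue)
        for w in outneighbors(g, v)
            if dist[w] == -1
                dist[w] = dist[v] + 1
                push!(queue, w)
            end
        end
    end
    return dist
end

d_adj = bfs_distances(ToyAdjGraph([[2, 3], [4], [4], Int[]]), 1)
d_cyc = bfs_distances(ToyCycleGraph(4), 1)
```

bfs_distances never touches a field of either struct, so a third graph type only has to supply the two interface functions to reuse it.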

chakravala · July 28, 2019, 8:02am

AbstractTensors.jl is a very interesting abstraction layer for my TensorAlgebra abstract type, used in the Grassmann.jl package to define some generic methods on the variety of subtypes in there.

https://github.com/chakravala/AbstractTensors.jl

It’s a rather short package, but it vastly expands the functionality of the algebras I am working with.

rveltz · July 28, 2019, 8:22am

Why not point out packages with design weaknesses in a constructive way?

Per · July 28, 2019, 8:28am

Personally, I feel I’ve learned a lot by reading the code of the packages in the standard library. For example: LinearAlgebra.

Per · July 28, 2019, 10:11am

For constructive criticism of Julia code, there’s a wealth of examples in the comments to pull requests.

(But I wouldn’t give unsolicited criticism of packages on discourse.)

Tamas_Papp · July 28, 2019, 11:00am

Disregarding issues with criticism being unsolicited, I don’t even know how one would critique nontrivial code on a forum like Discourse. “Replace this by that” snippets? A pull request is just the right way to do it.

Reading code written by others is useful, but it takes a lot of effort to tell well-reasoned design decisions apart from merely accidental ones, and to understand the reasoning behind the former.

What I would suggest instead is that you just start working on a package you find useful, then once you run into issues and want to make your code more generic, solve particular problems (with help from this forum).

Writing generic numerical code in Julia is both easy and difficult at the same time. Easy because the language supports this very well, difficult because most interfaces are not formally defined and it is easy to run into neglected corners.

00krishna · July 28, 2019, 3:31pm

This is very helpful @ChrisRackauckas. Yeah, I can see what you mean by this tension between abstraction, genericness, and readability. It sounds like a new Julia package developer starts by laying down the interfaces and high-level concepts/types. Then, once the algorithms are set and you work towards the implementation, you really start to focus on genericness. I suppose I should structure code so that I can swap out less generic implementations for more generic ones, which will probably happen as people file issues and tell me, “Hey, this is not generic enough.” You are right, sometimes you have to just get the basic package working and then go back and make adjustments.

00krishna · July 28, 2019, 3:34pm

@chakravala Thanks for the suggestion. I will definitely take a look at the package.

@Per Interesting, it totally makes sense to look at some of the core workhorse packages like LinearAlgebra. I remember looking at some of the implementation of C++ core linear algebra libraries before, and could not even read them because of all the optimizations, but perhaps Julia is different in that way.

00krishna · July 28, 2019, 3:42pm

@Tamas_Papp I think your suggestions are good here. I guess I alternate between writing some code, getting stuck, and then looking at how someone else wrote their code, and so on. When working in OOP code, I found a lot of benefit from studying the different design patterns, simply because they showed how to implement SOLID principles. But of course in most cases I would not write full design patterns because that just creates a lot more complexity and is overkill. In writing Julia code the design process is very different, but I am excited to give it a shot.

It is definitely a challenge to read through someone’s code, but hopefully just 1 or 2 good examples are enough to help understand how to make better design choices.

ChrisRackauckas · July 28, 2019, 3:43pm

Exactly. Get it working for what you need, and slowly generalize out to more stuff you need. Without tests you won’t be able to get it right, and the tests will come naturally from the use cases. LightGraphs is a great example for how it solidified the AbstractGraph format long after the SimpleGraph stabilized and it was clear what was necessary for its interface.

giordano · July 28, 2019, 4:32pm

Perhaps we might think about having something similar to GNU Hello: a simple package that shows best coding practices in Julia.

Per · July 28, 2019, 5:35pm

Yes it is! And this is precisely what you can learn from looking at the standard libraries: How to write performant code in a way that it is still easy to read.

00krishna · July 28, 2019, 5:50pm

@giordano this is a good idea. I watched the video of the very nice JuliaCon talk by Scott Haines, “Writing Maintainable Julia Code.” He laid out a nice workflow for developing a simple set of abstractions/types and the interfaces around those types. You can work up and down that tree, from the main algorithm down to the implementations. That was pretty nice.

Personally, when I saw all the documentation on implementing genericness it was a bit overwhelming. There is a lot of syntactic nuance in setting up generic types, and you don’t want to start your package out on the wrong foot. But again, worrying about how generic your types are before you even know your overall algorithm and the concepts that go into the algorithm is like putting the cart before the horse.

So it would be great to lay out a thought process on how to implement generic code, or best coding practices. Like once you have your basic concepts and interfaces designed, now you can think about creating a simple working implementation and 1. getting some tests to pass and 2. benchmarking the speed of your code. Once you have those tests passing, you can think about applying a few best practices to improve genericness. Once you have those practices implemented, you can check your tests and also your benchmarks–to make sure you did not critically impair the compiler’s ability to optimize your code. Once you get that measured, then you can think of a few use cases for genericness, write some more tests, and then run the benchmarks to make sure you did not compromise anything.
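A small sketch of that loop, using only Base and the Test standard library. The functions mean_v1 and mean_v2 are made-up examples, and @elapsed is only a crude stand-in for a proper benchmarking package.

```julia
using Test

# Step 1: a first working version, deliberately restricted to Vector{Float64}.
function mean_v1(x::Vector{Float64})
    s = 0.0
    for xi in x
        s += xi
    end
    return s / length(x)
end

# Step 2: a generalized version for any non-empty array of numbers.
function mean_v2(x::AbstractArray{<:Number})
    s = zero(eltype(x))
    for xi in x
        s += xi
    end
    return s / length(x)
end

data = [1.0, 2.0, 3.0, 4.0]

# Step 3: rerun the old test, then add tests for the new use cases.
@testset "mean_v2" begin
    @test mean_v2(data) == mean_v1(data)          # old behaviour is preserved
    @test mean_v2([1//2, 3//2]) == 1//1           # new use case: exact rationals
    @test mean_v2(view(data, 2:3)) == 2.5         # new use case: views
    @test @inferred(mean_v2(data)) isa Float64    # return type is still inferable
end

# Step 4: a rough timing comparison. Each function is called once first so
# that compilation time is not measured.
xs = rand(10^6)
mean_v1(xs); mean_v2(xs)
t_v1 = @elapsed mean_v1(xs)
t_v2 = @elapsed mean_v2(xs)
```

The @inferred check is one cheap way to notice when a generalization has made the return type unpredictable for the compiler, before looking at timings at all.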

This is just off the top of my head, but if anyone wants to improve it, have at it :).

chakravala · July 28, 2019, 8:00pm

I spent about 1-2 years designing the Grassmann package completely inside my head (I am a type of person who does not need to write ideas down, since I can keep track of very complicated things without ever writing them down), then in December 2018, I decided now I have the time to start implementing it. It was about 2 months into implementing it when I realized I need an AbstractTensors abstraction layer in order to make the interoperability work between multiple packages.

So my point is: start out with building up your ideas first and then introduce the abstractions as you go along and add more generalization on top of each other. As you create more functionality, the abstractions will illuminate themselves to you… although, it is better if you can anticipate it ahead of time also.

I’m not sure I know of any best practices in this regard, but I personally don’t try to make my code very readable by other people because nobody paid me to make readable code. I’m happy maintaining that code base on my own without any help; I do share my code, but not all the thoughts in my head. (Yes, I know that this will probably be a very unpopular post, but if you want readable code, please pay me.)

It would probably make sense to open an Issue or PR if you have design suggestions.

Tamas_Papp · July 29, 2019, 5:31am

I don’t think this happens in real life

Personally, I find that designing good generic interfaces is a really hard problem, and is best informed by actual need. If I design something a priori, I almost always overengineer things, so I try to just keep my code flexible and refactor if necessary. This is rather easy in Julia.

colintbowers · July 29, 2019, 6:07am

This question has come up a few times before. I always recommend Distances.jl because it really isn’t that complicated a package. It does one (reasonably) simple thing, and it does it really well. But the design of the package can be generalized to much more complicated problems.
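For readers who want a feel for this kind of design before opening the package, here is a toy sketch of the general shape: one abstract type, small concrete types, and a single loop that dispatches on them. All names (ToyMetric, accumulate_term, finish, toy_evaluate, toy_pairwise) are hypothetical and this is not Distances.jl's actual source.

```julia
abstract type ToyMetric end

struct ToyEuclidean <: ToyMetric end
struct ToyCityblock <: ToyMetric end
struct ToyMinkowski{T<:Real} <: ToyMetric
    p::T
end

# Each metric only says how to treat one pair of entries and how to finish.
accumulate_term(::ToyEuclidean, a, b) = abs2(a - b)
accumulate_term(::ToyCityblock, a, b) = abs(a - b)
accumulate_term(d::ToyMinkowski, a, b) = abs(a - b)^d.p

finish(::ToyEuclidean, s) = sqrt(s)
finish(::ToyCityblock, s) = s
finish(d::ToyMinkowski, s) = s^(1 / d.p)

# The loop is written once and dispatches on the metric type.
# Assumes non-empty vectors of equal length.
function toy_evaluate(d::ToyMetric, a::AbstractVector, b::AbstractVector)
    length(a) == length(b) || throw(DimensionMismatch("lengths differ"))
    s = zero(accumulate_term(d, first(a), first(b)))
    for (x, y) in zip(a, b)
        s += accumulate_term(d, x, y)
    end
    return finish(d, s)
end

# Code built on top works for every metric, including ones added later.
toy_pairwise(d::ToyMetric, pts) = [toy_evaluate(d, p, q) for p in pts, q in pts]

e = toy_evaluate(ToyEuclidean(), [0, 0], [3, 4])
c = toy_evaluate(ToyCityblock(), [0, 0], [3, 4])
P = toy_pairwise(ToyCityblock(), [[0, 0], [1, 1], [3, 4]])
```

Adding a new metric means defining one struct and two one-line methods; toy_evaluate and toy_pairwise need no changes.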