A kind of "two language problem" in the Julia Ecosystem

I am learning a lot these days by reading the source code of various Julia packages. I often see techniques used in packages that don’t seem to be documented anywhere.

A knowledge gap is always to be expected between those who are familiar with Julia’s internals and those who are learning the language “from outside” - using the documentation (like myself): but I feel this is especially true for Julia.

When reading some source code and comparing it with the code I write, I feel like we are using two different languages (and I understand that the “type” of code written for a package can differ greatly from the regular “application” code - but still).

One thing I find exciting about reading source code is the discovery of “unknown unknowns”: those nice surprises where you learn something you didn’t even think was possible (and that you were never looking for in the first place). However, I would really like to have these “unknown unknowns” documented instead of having to stumble upon them by accident (it makes me wonder how many hidden things are still out there that I will never find).

Is this only my subjective experience? Or is there a real gap between the documentation and frequently used idioms in various packages?

I think my subjective experience is warranted if the following is true: “There is code written in such a way that somebody relying on the documentation alone would not be able to write it.”

How do other people feel about this? Are there certain tips that you can offer so I can narrow this perceived gap?

15 Likes

Can you give some examples of package code that could not be written based on just the manual?

There are some packages that rely on Base internals, but that should generally be discouraged, since the package code can be broken by a minor version or patch version update to Julia.

3 Likes

That’s a great way to learn Julia.

Nevertheless, keep in mind that if you are new to a language, then learning the best way(s) to use it takes time. You may be experiencing that.

It is hard to give a more specific response without concrete examples.

8 Likes

One example that comes to mind is index-access behavior without a numbered index, such as the way Ref uses getindex and setindex!:

# refvalue.jl
getindex(b::RefValue) = b.x
setindex!(b::RefValue, x) = (b.x = x; b)

Which allows the following:

julia> a = Ref{Int}(10)
Base.RefValue{Int64}(10)

julia> show(a[])
10

Seeing a[] = ... is somewhat surprising as a new user, and it isn’t explained in the Interfaces/Indexing section of the manual.
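To make the mechanism concrete: as far as I understand it, a[] is ordinary indexing syntax with zero indices, so it reaches the same getindex and setindex! functions as a[i] does. A minimal check using only Base (the variable names are just for illustration):

```julia
a = Ref{Int}(10)

# `a[]` is indexing with zero indices, so it calls `getindex(a)`.
same_read = a[] == getindex(a)

# `a[] = 11` calls `setindex!(a, 11)`.
a[] = 11
after_write = getindex(a)

# Zero-dimensional arrays accept the same empty-index syntax.
z = fill(3)
z_value = z[]
```

So the empty brackets are not a separate piece of syntax, just the zero-argument case of indexing.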

I really like how index-style access is used to hack additional functionality into assignment, as in Observables.jl:

julia> using Observables

julia> observable = Observable(0)
Observable(0)

julia> observable[]
0

julia> obs_func = on(observable) do val
           println("Got an update: ", val)
       end
ObserverFunction defined at none:2 operating on Observable(0)

julia> observable[] = 42
Got an update: 42
42

But it took me quite a while to discover the actual mechanism used here; it wasn’t clear to me if it was the same syntax as indexing, or if it was another unrelated piece of syntax.
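For anyone else puzzled by this, here is a rough sketch of how assignment through [] can run extra code. MiniObservable and listen are made-up names for illustration, and this is not how Observables.jl is actually implemented; it only assumes that obs[] = v ends up calling setindex!(obs, v):

```julia
# Hypothetical minimal observable; NOT the Observables.jl implementation.
mutable struct MiniObservable{T}
    value::T
    listeners::Vector{Any}
end
MiniObservable(value) = MiniObservable(value, Any[])

# Register a callback; taking the function first lets `do` blocks work.
listen(f, o::MiniObservable) = (push!(o.listeners, f); f)

# `o[]` reads the stored value.
Base.getindex(o::MiniObservable) = o.value

# `o[] = v` calls this method, so assignment can notify the listeners.
function Base.setindex!(o::MiniObservable, v)
    o.value = v
    foreach(f -> f(v), o.listeners)
    return o
end

seen = Int[]
counter = MiniObservable(0)
listen(counter) do val
    push!(seen, val)   # record every update we are notified about
end
counter[] = 42
```

After the last line, seen holds the single update and counter[] returns the new value.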

I should say that this is a pretty niche problem, and I don’t particularly think the manual is lacking for such a specific issue - I did in fact learn the mechanism by reading through source. But I thought this might be the type of example that @algunion is talking about?

16 Likes

I think that’s fairly true. There is code written in packages and in base Julia which is hard to understand (sometimes even if the syntax itself is familiar). Many times this results from the fact that the library code has to deal with a lot of possibilities (error handling, generic types, exceptions, etc) that obfuscate the simple rationale of the function.

Personally, I find it hard to learn these things without having a problem of my own to solve, and if one wants to actually get things done, I think the best approach is to just start implementing things and progressively explore the possibilities that the language offers.

I can help by giving one example:

If I had to implement a sum of two vectors (assuming that x + y didn’t work already, of course), I would do:

function sum_vecs(x::AbstractVector,y::AbstractVector)
    z = similar(x)
    for i in eachindex(x,y)
        z[i] = x[i] + y[i]
    end
    return z
end

Yet, we have this when we @edit rand(10) + rand(10):

function +(A::Array, Bs::Array...)
    for B in Bs
        promote_shape(A, B) # check size compatibility
    end
    broadcast_preserving_zero_d(+, A, Bs...)
end

This is of course more complicated from the start: it deals with more cases, checks shapes, and then calls a function that appears to handle a special case related to zero-dimensional arrays. Yes, that’s more complicated, but the simple code is not wrong or particularly bad. It would be correct and performant if I had to use it in a package.
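As a small illustration of the extra cases the Base method covers, the sketch below compares + with dotted broadcasting on zero-dimensional arrays and triggers the shape check. It uses only exported Base functionality; the variable names are arbitrary:

```julia
x = fill(1)          # zero-dimensional Array{Int,0}
y = fill(2)

plain = x + y        # goes through the `+` method for arrays
dotted = x .+ y      # plain broadcasting

plain_is_array = plain isa Array{Int,0}   # `+` keeps the zero-dimensional container
dotted_is_scalar = dotted isa Int         # broadcasting unwraps it to a scalar

# Shape checking: mismatched lengths are rejected with an error.
mismatch = try
    [1, 2] + [1, 2, 3]
    nothing
catch err
    err
end
```

Here mismatch ends up holding a DimensionMismatch, which is the kind of bookkeeping the simple loop version leaves to eachindex.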

23 Likes

It is, however, used 3 times in the documentation for Ref, and also indicated in the docstring for getindex (since inds... means zero or more things):

  getindex(A, inds...)

  Return a subset of array A as specified by inds, where each ind may be, for example, an Int, an AbstractRange, or a Vector. See the manual section on array indexing for details.

Having said that, this sounds like a good opportunity for someone who finds this part of the documentation lacking to open a PR :slight_smile:

12 Likes

Thanks a lot. This way of viewing things helps.

Recently I delved into Cobweb.jl - where they have a function h that can be used with appended dot notation - with out-of-the-box intellisense (e.g. h.div, h.p - etc.).

I am not claiming to have reviewed ALL of Julia’s documentation (I certainly haven’t).

But from what I read, I never encountered the following possibility (and I am very pleased by this finding).

Now this might be something naive and not a big deal for most of the experienced Julia programmers - but for me, it was an aha moment, especially because I could do this on a function.

function myh(a)
	"you called myh using property with value $a"
end

Base.getproperty(::typeof(myh), tag::Symbol) = myh(string(tag))
Base.propertynames(::typeof(myh)) = [:x, :y, :z]

myh.z # outputs: "you called myh using property with value z"

And now I can have intellisense for my myh function (e.g. myh.x, myh.y).

While we have this documentation for propertynames, it wasn’t clear to me that the above feat was even possible - it is something that doesn’t follow from the documentation (or maybe there is other relevant material that I missed).

Now, when the “unknown unknown” became “known known,” it is somehow easy to look back and give the explanation: this is the power of multiple dispatch at work. But my brain wasn’t able to travel the other way around: and this confirms what you said (these kinds of solutions have a better chance to present themselves when actually trying to achieve/solve something).

Anyway - thanks for your time and explanations.

I’ll keep up the exciting activity of reading source code - I never thought it might be so rewarding.

9 Likes

The thing with the documentation is that once you get familiar enough to get programming, you only use it to fill some gaps (or when you encounter new challenges that require acquiring some new info from docs).

Please take a look at my example from here. I can look at the documentation, and it is still opaque that in Julia I can have “a property of a function call the same function with a certain input.” And even more: you can call the property of the function as a function that will, in turn, call the original function (see Cobweb.jl main functionality).
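To show what I mean by “calling the property of the function as a function”, here is a minimal sketch. The tag function is a hypothetical helper loosely inspired by the h.div(...) style; it is not Cobweb.jl’s actual code:

```julia
# Hypothetical helper: wraps children in a named tag and returns a string.
tag(name, children...) = "<$name>" * join(children) * "</$name>"

# Property access on the function returns a closure that calls `tag` back
# with the property name as its first argument.
Base.getproperty(::typeof(tag), name::Symbol) =
    (children...) -> tag(name, children...)

html = tag.div(tag.p("hello"), tag.p("world"))
```

Here tag.div is just getproperty(tag, :div), which returns a closure, and the parentheses then call that closure.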

Again, this might be something pretty lame for the veterans around here - but I have written code in Julia for a few years now (and full-time in the last two years). The good part is that you can do a lot with what is transparent in the language without ever encountering things like “property calling” on a function - I get that it is just syntactic sugar, and you don’t actually need it to get things done.

Anyway - I am intrigued by these findings in Julia packages source code - but mostly excited and motivated to dig deeper.

4 Likes

@CameronBieganek, I would differentiate between two kinds of levels of “could not be written.”

  1. Factually impossible to be written by someone having full knowledge of the documentation.
  2. Practically impossible / very improbable (as in “they would not be able to pull it off”). That is, relying on the documentation alone, the possibility might never have occurred to the language user.

I was referring to the latter. Also, please check the example from here.

I think of syntax as a set of Lego pieces. Give a bucket of pieces to different people, and they’ll build different things. Some will build things (combining the pieces in a certain way) that you could never have imagined possible.

As you’ve found out, a good way to learn new piece combinations is to look at what other people have done. But there’s also some value in keeping things simple.

5 Likes

This notion of “unknown unknowns” resonates strongly with me, and not just on the level of syntax or optimization tricks. My main unknown unknown was the set of all tools that make package writing easier, which (for the most part) you can’t even discover by reading the code because they were used locally by the developers.
I’m trying to address this here:

By the way, reading this Discourse is a great way to keep discovering: it is full of extremely knowledgeable people giving answers to questions you never even knew you had. So keep up the good work!

13 Likes

I was not doubting your premise, I was just asking for specific examples. :slight_smile:

The getproperty overloading for functions is a good example. I was aware of getproperty overloading for structs, but it never occurred to me to overload property access for functions.

3 Likes

But I don’t think stuff like that belongs in the documentation. Because it’s not actually a feature of the language, but an application of other, more generic features.

Julia (like all languages) has a limited and enumerable number of features. But those features can be (like in other languages) used and combined in an unlimited number of ways (like the Lego bricks mentioned by @mbaz). It is impossible and not really desirable to list all conceivable uses of those features (in a way, that is what GitHub is for: showing as many different use cases as possible).

It is a sign that you have a powerful set of features when people are able to use and combine them in ways that were not considered or predicted at all by those who designed those features.

9 Likes

I really like the Lego pieces analogy.

Others commented that I should not expect the documentation to present every possible combination of things that can be achieved with the language features - and I could not agree more, since such an unthinkable request would imply something similar to the Library of Babel (but in valid Julia code).

To clarify - I do not expect the documentation to contain actual programs. To keep the discourse in the Lego analogy space, it seems to me that some of us know more about singular Lego blocks than others (while having access to the same documentation).

Take, for example, the propertynames function: because the wording in the documentation differentiates between the type and the instance of that type, it suggested to me that functions are excluded. To be fair, the documentation mentions “the properties of an object” (and functions are also objects) - but the later remark about instances somehow steers the meaning towards structs. At least this is how I read it.

I cannot underline enough that my initial post and everything that followed is not meant to critique - I am just amazed at the work (and productivity) of some Julia developers/builders, and sometimes I feel like we have two parallel worlds: the gods and the mere mortals (and for some reason, I see Tim Holy as “the” god).

Also, I added my post after doing some work in the area of “source code reading” and arrived at some conclusions (or impressions, at least). I hope there is no room to interpret this in any other way than a constructive approach: take this as a call for help from somebody who wants to get more intimate with the language and wants to pick the brain of more experienced people.

4 Likes

https://documentation.divio.com/ has been shared numerous times in this community; I think you may find it interesting. I think what you’re getting at is that the core documentation is lacking in one or more of those quadrants on some pretty fundamental functionality. A unified “Explanation” or “How-to guide” covering the *property and *field methods could have helped address your example. It might not be sufficient to get to that aha moment right away (defining getproperty on a function type is pretty obscure as Julia code goes), but it would make that moment come faster.
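As a rough idea of what such a how-to might contain, here is a minimal sketch of how the *property and *field functions relate on an ordinary struct. Angle and its radians property are hypothetical names chosen for illustration:

```julia
# Hypothetical type: stores degrees, exposes a computed `radians` property.
struct Angle
    degrees::Float64
end

# `a.name` calls `getproperty(a, :name)`; `getfield` reads the real storage.
function Base.getproperty(a::Angle, name::Symbol)
    if name === :radians
        return deg2rad(getfield(a, :degrees))
    else
        return getfield(a, name)   # fall back to the actual fields
    end
end

# Advertise the extra property to anything that asks via `propertynames`.
Base.propertynames(::Angle) = (:degrees, :radians)

a = Angle(180.0)
fields = fieldnames(Angle)      # only what is physically stored
props = propertynames(a)        # what the dot syntax is meant to accept
```

The fields describe the storage and the properties describe the interface; the getproperty-on-a-function trick is the same idea applied to a type with no fields at all.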

3 Likes

I had a similar thought recently. This same thing happened to me while skimming through the source code of DataInterpolations, where I encountered this

(interp::AbstractInterpolation)(t::Number) = _interpolate(interp, t)
(interp::AbstractInterpolation)(t::Number, i::Integer) = _interpolate(interp, t, i)
(interp::AbstractInterpolation)(t::AbstractVector) = interp(similar(t, eltype(interp)), t)

So I learned by chance that it’s possible to define () call methods for custom types and thus assign function-like behavior to any object. It’s a neat feature that I was surprised I hadn’t come across before in the manual or other resources.
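A stripped-down sketch of the same pattern, for anyone who wants to try it. Line is a made-up type for illustration and has nothing to do with DataInterpolations:

```julia
# Hypothetical type for illustration only.
struct Line
    slope::Float64
    intercept::Float64
end

# Adding a method to the *type* makes every instance callable.
(l::Line)(t::Number) = l.slope * t + l.intercept

# A second method for vectors, reusing the scalar one.
(l::Line)(ts::AbstractVector) = [l(t) for t in ts]

f = Line(2.0, 1.0)
at_three = f(3)               # scalar method
at_many = f([0, 1, 2])        # vector method
mapped = map(f, 1:3)          # the object can be passed wherever a function is expected
```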

3 Likes

In that case, function-like objects are described in Methods · The Julia Language

3 Likes

I think the documentation can be more transparent about a generic function being an instance of a singleton Function subtype, just with special syntax that makes the function name const in the global scope and printed in the REPL. Look what happens when I don’t use function syntax:

julia> struct Foo end; Foo()
Foo()

julia> struct Bar<:Function end; Bar()
(::Bar) (generic function with 0 methods)

This is somewhat alluded to in the sections on functors, but since functors are used far less often across the documentation than regular functions, people take some time to grasp the fundamental point that a function is a value with a type that can be dispatched on, and that a call f(x) dispatches on the type of f (with some common sense restrictions) in addition to the arguments. The docs do say that functions are first-class objects that can be inputs or outputs, but because there’s special syntax for them, many people don’t really get exactly how functions are like any other instance.
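A small sketch of what “a value with a type that can be dispatched on” means in practice. The names double and describe are made up for this example:

```julia
double(x) = 2x

is_function = double isa Function              # a function is a value of some Function subtype
no_fields = fieldcount(typeof(double)) == 0    # its type carries no data

# Because `double` has its own type, other functions can dispatch on it.
describe(::typeof(double)) = "the doubling function"
describe(::Function) = "some other function"

about_double = describe(double)
about_sin = describe(sin)
```

The first describe method is selected only for double itself, while every other function falls through to the generic one.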

Here’s something cool the functor docs don’t explain. When your type is singleton, you can define the methods with a const name of the singleton instance instead. Works for generic functions too.

julia> struct Foo end; const foo = Foo() # singleton, const needed for this
Foo()

julia> (::Foo)() = 0

julia> foo(x) = 1
Foo()

julia> methods(foo)
# 2 methods for callable object:
[1] (::Foo)() in Main at REPL[2]:1
[2] (::Foo)(x) in Main at REPL[4]:1

julia> function f end; const g = f
f (generic function with 0 methods)

julia> g() = 0
f (generic function with 1 method)

julia> methods(g)
# 1 method for generic function "f":
[1] f() in Main at REPL[8]:1

5 Likes

Another “unknown unknown” was that the property of an object could even be a string.

An expression like o."world"("!")() is actually standard Julia code (no macros involved here). The same for o."world"(".")."How"."are"."you"("doing")(). Let’s ignore the readability of the examples I provided (e.g. these are to exemplify “what is possible” and nothing more).

struct CrazyObject
	x::String
end

(c::CrazyObject)() = getfield(c, :x)
(c::CrazyObject)(x::String) = CrazyObject(c.x * " " * x)
Base.getproperty(c::CrazyObject, x::String) = c(x)

o = CrazyObject("Hello")
o."world"("!")() # output: "Hello world !"
o."world"(".")."How"."are"."you"("doing")() # output: "Hello world . How are you doing"

However, trying to use an integer as a property causes a “cannot document the following expression” error (I can define the method, but I get the error when using the property - not sure what that is about yet).

The next snippet is unrelated to the one above:

append(x::String, y::String) = x * " " * y
Base.getproperty(::typeof(append), x::String) = y -> append(x, y)

append."Hello"("world!") # output: "Hello world!"

Ignore for a moment that this is totally useless, and everybody would try to murder the author if such code were deployed in a serious production environment. The point is that this is standard Julia code (and yes, I can think about more complex situations where this kind of dot-driven chaining can make more sense).

I am like a small kid getting new toys here.

7 Likes

Is it? getproperty(::T, s::String) is not documented anywhere so it just seems to be a lapse in limiting dot syntax to symbols, and there’s no promise future versions would still allow this. If they wanted to allow properties to be more things then they wouldn’t have made f.(x+y, z) a broadcasting call syntax.