Hints for an old programmer new to Julia


#1

First, let me say I love Julia! It is a terrific new tool in my box for Machine Learning :smiley:

I am an old-time programmer who is used to type safety and static type checking, and with Julia I am experiencing many of the same frustrations I have with Python: namely, when I try to use someone else’s code, it can be a bear.

For example, I am currently using Flux to do some NLP. I want to make a text classifier using a Conv layer. I feed it my word embeddings and get a mystery crash deep inside NNlib. Obviously I am not passing it the kind of input it expects. The Flux code has no argument types, and the comments do not mention types. I waste huge amounts of time trying to reverse engineer the stack dump and guess what I did wrong.

For reference, here is the code

Data should be stored in WHCN order. In other words, a 100×100 RGB image would
be a `100×100×3` array, and a batch of 50 would be a `100×100×3×50` array.

function (c::Conv)(x)
  # TODO: breaks gpu broadcast :(
  # ndims(x) == ndims(c.weight)-1 && return squeezebatch(c(reshape(x, size(x)..., 1)))
  σ, b = c.σ, reshape(c.bias, map(_->1, c.stride)..., :, 1)
  σ.(conv(x, c.weight, stride = c.stride, pad = c.pad, dilation = c.dilation) .+ b)
end

This is not a complaint about Flux (which I love). I have the same problems with most libraries in Julia and Python:

  • Without something like Traits or Interfaces it is hard to know what one should pass as arguments
  • You don’t find out your mistake until runtime, often after things have been running for a while
  • The error is usually some stack dump that starts way down in a library called by a library called by… your code

I end up in the REPL trying different things to see what works

How can I be more productive?


#2

There is definitely a tension here between flexibility (often obtained by not specifying the types of arguments of functions), and type-checking and legibility of the interface.


#3

It looks to me like x should be typed as AbstractArray at least, since it’s using size (which is part of the AbstractArray interface).
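
As a rough illustration of that suggestion, a thin wrapper with a dimension-constrained signature can reject bad input at the call site instead of deep inside the library. The name `check_whcn` is hypothetical and not part of Flux; this is only a sketch of the idea.

```julia
# Hypothetical helper (not part of Flux): accept only 4-dimensional arrays,
# i.e. data laid out as width × height × channels × batch.
check_whcn(x::AbstractArray{T,4}) where {T} = x

# Less specific method: any other array produces a readable error.
function check_whcn(x::AbstractArray)
    throw(ArgumentError("expected a 4-d WHCN array, got $(ndims(x)) dimensions with size $(size(x))"))
end

x_ok  = rand(Float32, 100, 100, 3, 50)
x_bad = rand(Float32, 100, 100)

check_whcn(x_ok)       # passes the array through unchanged
# check_whcn(x_bad)    # would throw an ArgumentError right here
```

The trade-off is the one discussed in this thread: the tighter signature gives an earlier, clearer error, but it also rules out any non-array type that might otherwise have worked.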


#4

This, to a certain extent, is a side-effect of Julia being so generic. With multiple dispatch, you usually don’t restrict to a specific set of types, as anything that respects the interface (more on this below) will work. E.g., this can be used to extend functions written in pure Julia with AD or interval arithmetic, even when these functions were not written with this in mind.

The downside is that interfaces are not formally defined, and even though some follow recognized conventions, these are not explicitly enforced by the language. A non-working program could then be a mistake on the programmer’s part, a bug in e.g. Flux, or some corner case of the interface which no one has explored so far.
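
A minimal sketch of what this looks like in practice, assuming a toy `Dual` number type invented here for illustration (real AD packages are far more complete): the untyped function `poly` knows nothing about derivatives, yet it works on `Dual` because the arithmetic it uses is defined for that type.

```julia
# A toy dual number for forward-mode differentiation (illustrative only).
struct Dual <: Number
    val::Float64   # function value
    der::Float64   # derivative carried alongside the value
end

# Define just the arithmetic that the function below needs.
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.val * b.der + a.der * b.val)
Base.:+(a::Dual, b::Real) = Dual(a.val + b, a.der)
Base.:*(a::Real, b::Dual) = Dual(a * b.val, a * b.der)

# A generic function with no type annotations, written without AD in mind.
poly(x) = 3 * x * x + x + 1

y = poly(2.0)               # ordinary Float64 evaluation
d = poly(Dual(2.0, 1.0))    # same code; d.der now holds the derivative at 2.0
```

Had `poly` been declared as `poly(x::Float64)`, the second call would fail with a `MethodError`, which is why library authors often leave arguments untyped.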

My usual strategy is to make a self-contained minimal example of the bug (which you should do), and ask here, then open an issue.


#5

I’ve been playing with Flux recently and it’s been a lot of fun. But also slightly frustrating because being a rather experimental package it’s in a state of flux itself, and the internals are only very lightly documented at the moment. (Not to complain; my own packages suffer from many documentation problems of their own.)

For Flux specifically, I found it very helpful to build up my Flux models interactively in the REPL with a “toy sized” batch of data. This can be done incrementally, adding one layer at a time until the shape of the data matches correctly between layers. The loss function and a single step of an optimizer can be tested in a similar way. Once all this is working I would try a training run on some medium sized example data.
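
The workflow might look roughly like the sketch below. To keep it self-contained, the "layers" are plain Julia stand-ins with made-up names (`pool2`, `flatten_features`, `dense`), not Flux's API; with Flux one would substitute real layers and inspect `size` in the same way.

```julia
# Stand-in "layers" in plain Julia (hypothetical, not Flux's API).
# Each takes an array and returns an array, like a real layer would.
pool2(x)            = x[1:2:end, 1:2:end, :, :]      # crude 2× downsampling
flatten_features(x) = reshape(x, :, size(x, 4))      # features × batch
dense(W)            = x -> W * x                     # closure over a weight matrix

# A toy batch: 8×8 single-channel inputs, batch of 2 (WHCN order).
x = rand(Float32, 8, 8, 1, 2)

# Add one stage at a time and look at the shape before adding the next.
h1 = pool2(x)
@show size(h1)

h2 = flatten_features(h1)
@show size(h2)

W  = rand(Float32, 3, size(h2, 1))   # size the weights from the observed shape
h3 = dense(W)(h2)
@show size(h3)
```

Because the batch is tiny, each step returns instantly, so a shape mismatch shows up at the stage that caused it rather than at the end of a long run.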

On a meta level, I think it’s useful to pursue the tightest feedback loop possible for experimentation and discovery, and this is where the interactive environments which can be created using Python and Julia really shine. It’s that qualitative difference in creativity which comes with having the result right now, rather than in five minutes after recompiling and reloading the data. But there’s a cost for production code bases compared to static languages, which seems to come in the form of more tests. Things which the compiler doesn’t test for you inevitably need to be tested manually.
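
As a small sketch of what "tested manually" can mean, the standard library's `Test` module can pin down the shape, element type, and rejected inputs of a function. The function `add_batch_dim` is a hypothetical example written for this illustration.

```julia
using Test

# Hypothetical function under test: add a batch dimension to a single image.
add_batch_dim(x::AbstractArray{T,3}) where {T} = reshape(x, size(x)..., 1)

@testset "add_batch_dim" begin
    img = rand(Float32, 28, 28, 1)
    out = add_batch_dim(img)

    # Checks a static type system would otherwise have made for us.
    @test size(out) == (28, 28, 1, 1)
    @test eltype(out) == Float32

    # A 2-d input does not match the signature, so the call must be rejected.
    @test_throws MethodError add_batch_dim(rand(Float32, 28, 28))

    # @inferred fails if the compiler cannot infer a concrete return type.
    @test (@inferred add_batch_dim(img)) isa AbstractArray{Float32,4}
end
```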

Coming back to your original problem, yes we don’t have interface definitions in the language and this leads to a large amount of duck typing. Right now, I don’t think there’s a better solution than clear API documentation. Type constraints on functions are a non-solution in general because the type tree (being a tree) often can’t capture the full set of types a function will work on (*). Amusingly this is exactly the same problem faced by C++ template libraries which use trait-based dispatch rather than normal class inheritance. These C++ libraries fail in exactly the same way you’re experiencing: deep within the implementation in a place the user should never see (witness, e.g., something like boost::iostreams)! The problem isn’t solved there either as C++ concepts have been rejected from the last several standards.

(*) Yes, Unions can help out here, but are inherently non-extensible. Manually defined traits help and are extensible, but are often clumsy to use.
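
For readers who have not seen manually defined traits, here is a minimal sketch of the usual pattern (often called "Holy traits"). All names (`BatchLayout`, `batchlayout`, `nbatch`, `ImageStack`) are made up for this illustration.

```julia
# Trait types (hypothetical names used only for this illustration).
abstract type BatchLayout end
struct HasBatchDim <: BatchLayout end
struct NoBatchDim  <: BatchLayout end

# Trait function: types opt in by adding a method; the default is NoBatchDim.
batchlayout(::Type) = NoBatchDim()
batchlayout(::Type{<:AbstractArray{<:Any,4}}) = HasBatchDim()

# The public entry point dispatches on the trait, not on the type tree.
nbatch(x) = nbatch(batchlayout(typeof(x)), x)
nbatch(::HasBatchDim, x) = size(x, 4)
nbatch(::NoBatchDim,  x) = throw(ArgumentError("$(typeof(x)) has no batch dimension"))

# An unrelated type can join later without changing its supertype.
struct ImageStack
    frames::Vector{Matrix{Float32}}
end
batchlayout(::Type{ImageStack}) = HasBatchDim()
nbatch(::HasBatchDim, s::ImageStack) = length(s.frames)

n_array = nbatch(rand(Float32, 2, 2, 3, 5))
n_stack = nbatch(ImageStack([zeros(Float32, 2, 2), zeros(Float32, 2, 2)]))
```

This is extensible in a way a `Union` is not, but the extra indirection through `batchlayout` is the clumsiness mentioned above.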


#6

Thank you Chris, Tamas, Jonathan and Petr. That is exactly the kind of advice I was looking for. I will concentrate on building up small things in the REPL.

I like the flexibility of Julia, and use it in my own code. But it sure would be nice if there was some optional way to declare ‘contracts’

For example, in the case of Flux.Conv it seems to only work for 2d convolutions and AbstractArray data that is of dimensions [w,h,d,n], so if you have a single grey scale image, you have to reshape it as [x,y,1,1]. Furthermore, while Conv works on TrackedArrays, operations like transpose on a TrackedArray cause a stack dump. I guess the return of transpose(TrackedArray) is no longer an AbstractArray. It sure would be nice if the compiler could optionally catch that sort of thing.
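
For anyone landing here with the same problem, the reshape described above can be sketched as follows; the 28×28 image size is just an example value.

```julia
# A single greyscale image, width × height (28×28 is an arbitrary example size).
img = rand(Float32, 28, 28)

# Append singleton channel and batch dimensions: 28×28 becomes 28×28×1×1.
x = reshape(img, size(img)..., 1, 1)

@show size(x)
@show ndims(x)
```

`reshape` does not copy the data here, so `x` and `img` share the same memory.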

In any case, this sure beats working in C++ :smiley:


#7

This may eventually emerge, but perhaps not in the short run because of other priorities. In the meantime you can keep an eye on

and related issues.


#8

Perfect! I see that this issue has been discussed from the beginning, and bigger minds than mine have pondered it.

I will +1

What we do care about is allowing people to document the expectations of a 
protocol in a structured way which the language can then dynamically verify
(in advance, when possible).

and now I will go watch the Guy Steele talk