Julia solves a two-language problem, but the real problem is a 3-wants problem, which Julia does not necessarily serve well at the moment

Julia solves the two-language problem: write the algo in a slow language and then reimplement it in a fast language.

However, that is not the biggest two-language problem.

The real problem is this

  • Modelers write code for their models
  • Software engineers put code into production

It’s a pain to productionise code, not in the sense that you can’t run an R glm model or a Python PyTorch model in a production environment, but in the sense that those productionised systems are brittle. They fail if the wrong type of data is passed.

The way to solve it is to recode everything in a language with static analysis and proper types, like C++ or Rust.
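To make the brittleness point concrete, here is a minimal Rust sketch of checking data once at the system boundary. The `Observation` struct, its `age` and `income` fields, the `InputError` enum and `parse_observation` are all made up for illustration; nothing like them comes from the R or Python setups described above. The only idea is that the wrong kind of input becomes an explicit error value the caller has to deal with, instead of a failure somewhere deep inside the model.

```rust
/// Hypothetical typed record that the model code would consume.
#[derive(Debug, PartialEq)]
struct Observation {
    age: u32,
    income: f64,
}

/// Hypothetical error type listing the ways raw input can be rejected.
#[derive(Debug, PartialEq)]
enum InputError {
    MissingField(&'static str),
    BadNumber { field: &'static str, value: String },
}

/// Take the next comma-separated field, rejecting absent or empty ones.
fn next_field<'a>(
    fields: &mut impl Iterator<Item = &'a str>,
    name: &'static str,
) -> Result<&'a str, InputError> {
    fields
        .next()
        .filter(|s| !s.is_empty())
        .ok_or(InputError::MissingField(name))
}

/// Parse one "age,income" line into a typed record, or say why it failed.
fn parse_observation(line: &str) -> Result<Observation, InputError> {
    let mut fields = line.split(',').map(str::trim);
    let age_raw = next_field(&mut fields, "age")?;
    let income_raw = next_field(&mut fields, "income")?;

    let age = age_raw.parse::<u32>().map_err(|_| InputError::BadNumber {
        field: "age",
        value: age_raw.to_string(),
    })?;
    let income = income_raw.parse::<f64>().map_err(|_| InputError::BadNumber {
        field: "income",
        value: income_raw.to_string(),
    })?;

    Ok(Observation { age, income })
}
```

Everything downstream of `parse_observation` can then assume an `Observation` is well-typed. This does not validate the values themselves: for example, `"NaN"` parses successfully as an `f64`, so real code would still need range checks.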

Recently, one complaint at work was that a Monte Carlo simulation model in R was too hard to put into production, since no one in the implementation team has the skills to tackle it!

The dev-production divide is a two-language problem. It’s arguably more important than the two-language problem solved by Julia.

In fact, it’s a 3-wants problem. We want to

  • Easily dev the algo
  • Make the algo fast
  • Make the algo deployable in a non-brittle fashion

It’s not a 3-language problem, as one language can solve the latter 2 (C++/Rust, heck, even Java in some cases).

Unfortunately, it’s harder for Julia to solve the 3rd want, and I think Static.jl is not going to be able to solve it, since Julia’s ecosystem is not friendly to strict typing.

Alas, we are back to the drawing board. Hopefully Julia 2.0 can solve the 3rd want with some Trait system and some redesigns that can improve static analysis.

25 Likes

You may have a point here. That is the “it runs on my laptop” syndrome.
As someone who configures HPC systems I am interested in this.

However, I will counter by asking: why does containerisation not help here?

9 Likes

Yeah. Typed this while drunk on a beer. Also, English is not my first language and I didn’t bother using Grammarly. Lol

5 Likes

The typing discipline and static analysis issue isn’t solved by containerisation though. You can deploy via Docker, but is the deployment brittle?

1 Like

Not sure about that endless debate on statically vs dynamically typed languages. While there are some type systems that guarantee correctness to a good extent (Haskell, Scala and Rust), I would trust Julia code, given that it is often shorter, more general and more readable, more than C/C++ any time. There are also examples of large code bases in dynamically typed languages such as Ruby or JavaScript, and whole operating systems have been written in Lisp or Smalltalk.
At least in ML, I get the feeling that Julia has another (bigger?) political problem. As the language is extremely composable, there is a tendency towards small libraries that can be combined at will instead of big frameworks (a similar structure has emerged in the Clojure community). In the same way, the language prevents vendor lock-in into a single monolithic framework, i.e., just use library A together with autodiff from library B if you want. To me, this seems to be a major reason why large tech companies would rather reinvent the wheel in their own closed ecosystems, forcing users into incompatible subsets of Python, i.e., PyTorch, TensorFlow or JAX, instead of investing in Julia.

18 Likes

For my understanding, let’s take a concrete example: say I have a system that takes a user-supplied CSV file as input and produces some output file from it. What risks does the Julia code have that the statically typed code can’t have (and maybe contrast that with the risks both code bases share)?

1 Like

I don’t know about CSV input, but here is one I just ran into:

In some code I was writing, I had a Vector of Vectors. I wanted to find the least common multiple of all the Vectors’ lengths. Then I wanted to repeat the Vectors so that they all ended up the same length. I had some testing and everything checked out.

Then I was running this code over some production input and got the strangest InexactError… Digging in, it said my longest vector was Inf long??? Oh, one of my Vectors had 0 elements, which was causing a divide by zero.

Rust’s static analyzer would have forced me to handle the edge case of zero length.

Edit: @mstewart caught that I was wrong here. However, I think it is this class of problem that static analyzers help with – edge cases that our monkey brains didn’t consider
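As the edit says, Rust does not force this handling by itself, but you can opt into it. Below is a minimal sketch of the same task written so that the zero-length case has to be dealt with by the caller. The function names `lcm_of_lengths` and `repeat_to_common_length` are made up for this example, and treating an empty input list as `None` is my own assumption about what the original code should do.

```rust
/// Greatest common divisor via Euclid's algorithm.
fn gcd(a: usize, b: usize) -> usize {
    if b == 0 { a } else { gcd(b, a % b) }
}

/// Least common multiple of all the vectors' lengths.
/// Returns `None` if there are no vectors, if any vector is empty,
/// or if the result would overflow `usize`.
fn lcm_of_lengths<T>(vectors: &[Vec<T>]) -> Option<usize> {
    if vectors.is_empty() {
        return None;
    }
    let mut acc: usize = 1;
    for v in vectors {
        let len = v.len();
        if len == 0 {
            return None; // the edge case from the story above
        }
        // `acc` and `len` are both >= 1 here, so the gcd is never zero.
        acc = (acc / gcd(acc, len)).checked_mul(len)?;
    }
    Some(acc)
}

/// Repeat every vector cyclically until all share the common length.
fn repeat_to_common_length<T: Clone>(vectors: &[Vec<T>]) -> Option<Vec<Vec<T>>> {
    let target = lcm_of_lengths(vectors)?;
    Some(
        vectors
            .iter()
            .map(|v| v.iter().cloned().cycle().take(target).collect::<Vec<T>>())
            .collect(),
    )
}
```

Because the return type is an `Option`, a caller cannot get at the repeated vectors without saying what happens in the `None` case. The catch is that someone still has to think of writing the function this way in the first place.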

4 Likes

I have exactly the same pain of making my Julia functions accessible to programmers using a different language such as Python. To solve the bigger two-language problem, you proposed the static compiling approach. As an alternative to static compiling, I have been thinking about a web-based solution:

  1. define a protocol for web-based function calls, usually in the form of a JSON schema,
  2. set up a remote host backed by computing resources (EC2 or a cluster), so that developers can upload their code to that host,
  3. provide multi-language clients to access the web service.

So a developer just uploads their code to that server (e.g. using the Git protocol), and their function will be accessible to multi-language clients.
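To sketch what the server side of such a protocol might look like, here is a small example with the JSON and HTTP parts left out. `CallRequest`, `Handler`, `registry`, `dispatch` and the `mean` function are all hypothetical names; a real service would decode the request from JSON against the agreed schema and would need authentication, sandboxing and so on.

```rust
use std::collections::HashMap;

/// Hypothetical shape of a decoded remote call: a function name plus arguments.
struct CallRequest {
    function: String,
    args: Vec<f64>,
}

/// Every exposed function shares one signature so clients in any language
/// only need to know the name and the argument list.
type Handler = fn(&[f64]) -> Result<f64, String>;

/// Example of a function a developer might upload.
fn mean(args: &[f64]) -> Result<f64, String> {
    if args.is_empty() {
        return Err("mean needs at least one argument".to_string());
    }
    Ok(args.iter().sum::<f64>() / args.len() as f64)
}

/// Table of functions the service exposes, keyed by name.
fn registry() -> HashMap<&'static str, Handler> {
    let mut table: HashMap<&'static str, Handler> = HashMap::new();
    table.insert("mean", mean);
    table
}

/// Look up the requested function and run it, or report an unknown name.
fn dispatch(table: &HashMap<&'static str, Handler>, request: &CallRequest) -> Result<f64, String> {
    match table.get(request.function.as_str()) {
        Some(handler) => handler(&request.args),
        None => Err(format!("unknown function: {}", request.function)),
    }
}
```

The point of the sketch is that errors cross the language boundary as data (an `Err` that would be serialised back to the client), which helps with interoperability but says nothing about whether the uploaded function itself is correct.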

Do you think this web-based approach can solve most of the bigger two-language problem, or do we have to go to static compiling?

2 Likes

That’s one way to help with deployment; there are many ways to do things like that. I think TypeScript is skyrocketing in popularity, even as a backend tool, because of types and how they help with static analysis.

That sounds like an interesting approach to interlanguage communication. But it doesn’t address OP’s problem: needing provably correct production code.

I think it’s worth being clear that we have no evidence this is a language problem. It’s a “laws of the universe” problem – there’s no reason to believe any tools that optimize for one set of incentives will be optimal for a different set of incentives. Unless there’s evidence you can jointly optimize all incentives simultaneously, the issue isn’t technical, but social – how do you balance the competing incentives and decide on defensible tradeoffs?

It is surely the case that existing languages aren’t on the optimal tradeoff frontier, but even an ideal language may not solve the problem you’re saying most interests you.

16 Likes

I’m not sure I have a lot to add to the discussion generally. I have used Haskell and do like these sorts of features. But there are a lot of tradeoffs in languages. Julia has different strengths. I haven’t tried Rust, mostly because giving up garbage collection doesn’t seem essential for performance in the sort of numerical code I want to write. That is a tradeoff that didn’t sound appealing.

However, I am wondering how this is enforced in Rust. I know Rust has algebraic data types and pattern matching, with which you can certainly enforce handling edge cases. Looking at the Rust docs, it looks like vectors provide a length function that returns an integer and can be used without a pattern match. I would have thought your bug would still result in a divide by zero at run time, even in Rust. Are you referring to something that is standard and idiomatic in Rust or something that is possible but not commonly done? Even in Haskell, which has a culture strongly in favor of static type safety, they don’t force special handling of empty lists or vectors in their standard library. (Although I think some in that community consider that a mistake.)

2 Likes

JavaScript is literally the worst language, yet people make large products out of it; so if it’s possible to make production code with JavaScript, then it must be possible to make production code with Julia. There might be lessons to be learned—especially w.r.t. tooling, e.g. linting.

I feel this. It seems related to the Lisp curse.

Ecosystem fragmentation, outdated and incomplete package documentation, and poor method discoverability can make it pretty rough when you’re trying to explore the design space and learn new APIs.

:100:. Not much point in proving correctness of code that doesn’t correctly encode the author’s intent, which is hard to get right if it’s an illegible mess.

Even in the strictest language you can make bugs. Question is, what features will minimize the total errors, not just the errors that can be caught by the compiler? IMO Julia has found a pretty sweet spot here. (although maybe it can get sweeter?)

15 Likes

Regarding safety, I wonder if it would be worth it to add ErrorTypes.jl to Base in Julia 2.0. I haven’t played around with ErrorTypes.jl, so I’m not sure how ergonomic it would be if Base used it across the board. Even simple functions like first and last would have to return an Option. It could be taken to extremes… Should sqrt return a Result? Right now it throws on negative real number inputs.
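For a feel of what that might look like, here is a rough Rust analogue (not ErrorTypes.jl’s API, which I am not reproducing here). Rust’s slices already return an `Option` from `first`, while `f64::sqrt` returns NaN for negative inputs rather than an error, so `checked_sqrt`, `MathError` and `sqrt_of_first` are hypothetical names invented for this sketch.

```rust
/// Hypothetical error type for the checked square root.
#[derive(Debug, PartialEq)]
enum MathError {
    NegativeInput(f64),
}

/// A square root that reports negative inputs as an error value
/// instead of throwing or returning NaN.
fn checked_sqrt(x: f64) -> Result<f64, MathError> {
    if x < 0.0 {
        Err(MathError::NegativeInput(x))
    } else {
        Ok(x.sqrt())
    }
}

/// Combining two "safe" operations: `first` may find nothing,
/// and the square root may fail, so the result is doubly wrapped.
fn sqrt_of_first(values: &[f64]) -> Option<Result<f64, MathError>> {
    values.first().map(|&x| checked_sqrt(x))
}
```

The return type of `sqrt_of_first` gives a hint of the ergonomic cost: two trivial operations already produce an `Option<Result<...>>` that the caller has to unpack.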

Ah! Yeah, you’re right! I thought the compiler would catch it, but this does panic:

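fn main() {
    // An empty vector, so its length is 0.
    let v: Vec<i32> = Vec::new();
    div_len(v);
}

// Compiles without complaint, but panics at run time with
// "attempt to divide by zero" when `v` is empty.
fn div_len(v: Vec<i32>) {
    let l = v.len();
    let r = 9 / l;
    println!("{}", r);
}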

I had thought that since dividing integers ~can~ yield an error, it would need to be handled.

I’m only a Rust dabbler, so I shouldn’t have spoken so authoritatively.

2 Likes

I have tried ErrorTypes.jl in a few small places. One nice thing is that JET.@report_call from JET.jl can then catch when you have failed to handle an option type correctly. It seemed to work well. I first tried it a few weeks ago and will probably use it more in the future. I think JET.jl can also find uncaught exceptions using mode=:sound, but when I’ve tried it, it reports a lot of things from Base. Using JET.jl and ErrorTypes.jl together seemed more helpful.

However, although I liked it where I used it, the ergonomics seem like they would be horrible for interactive use if it were applied broadly.

1 Like

Is there a language that can discover this type of underlying assumption?

As a mathematician, I know that sometimes we think we have found the proof of something, only to discover later that the whole thing stands on an assumption that itself needs to be proven.

A compiler that can discover missed assumptions seems to be a dream for me. What is it?

3 Likes

You’re asking a question a little out of my depth there! Rust’s compiler provides an enormous number of protections with its enum types, of which Option is the best known. An enum is a sum type of every possible state for a value. For example, an HTTP response might have a 200 variant, a 404 variant, a 500 variant, etc. Rust’s compiler requires you to handle all states (a match has to cover every variant) to “unwrap” the value and get the actual response out. If you correctly model the states of your problem space, the Rust compiler and type system will do a ton of correctness checks.
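A minimal sketch of that idea, with a made-up `Response` enum and `describe` function (real HTTP libraries model responses differently):

```rust
/// Hypothetical model of the outcomes of an HTTP request.
#[derive(Debug, PartialEq)]
enum Response {
    Success(String),  // e.g. 200, carrying the body
    NotFound,         // e.g. 404
    ServerError(u16), // e.g. 500, carrying the status code
}

/// The `match` must cover every variant of `Response`.
fn describe(response: &Response) -> String {
    match response {
        Response::Success(body) => format!("success: {}", body),
        Response::NotFound => "not found".to_string(),
        Response::ServerError(code) => format!("server error {}", code),
    }
}
```

If one of the arms is deleted, or a new variant is added to `Response` without updating `describe`, the program no longer compiles, because the match is then non-exhaustive.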

That said, I unintentionally pointed out a place where this breaks down in this thread :sweat_smile:

I’m not sure if there is an even stricter language out there.

I think Haskell is probably stricter than Rust on some things in that it aims to be purely functional and side effects happen only in ways that are tracked by the type system using monads. But the standard library is not completely consistent in using option types everywhere there might be an error. Division is a tough case for any language. You could have it return an option type, but you would annoy a lot of people with the ergonomics of that. There are also more exotic dependently typed languages that incorporate theorem proving and can require a compile time verifiable proof that a number is nonzero before dividing by it. Idris would be an example. Theorem proving systems like Coq are super-strict dependently typed languages. Typically, they even check for termination. I think they’re all pretty tough to use, but it is interesting how far you can go with that approach.
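Rust is not dependently typed, but a much weaker version of the "prove it is nonzero first" idea can be sketched with the standard library’s `NonZeroU32` type. The function names `divide` and `safe_divide` are made up for illustration, and unlike Idris or Coq the check here happens once at run time, when the value is constructed, rather than as a compile-time proof.

```rust
use std::num::NonZeroU32;

/// Division that cannot hit a zero divisor: the type of `divisor`
/// guarantees the value is nonzero.
fn divide(numerator: u32, divisor: NonZeroU32) -> u32 {
    numerator / divisor.get()
}

/// The only way to obtain a `NonZeroU32` from a plain `u32` in safe code
/// is `NonZeroU32::new`, which returns `None` for zero.
fn safe_divide(numerator: u32, raw_divisor: u32) -> Option<u32> {
    let divisor = NonZeroU32::new(raw_divisor)?;
    Some(divide(numerator, divisor))
}
```

The zero check is pushed to the place where the `NonZeroU32` is created, and everything after that point can rely on it. That is roughly the ergonomic trade-off discussed above: the caller of `safe_divide` has to handle an `Option` on every division.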

5 Likes