Julia solves the two-language problem: write the algo in a slow language and then reimplement it in a fast language.
However, that is not the biggest two-language problem.
The real problem is this:
Modelers write code for their models
Software engineers put code into production
It’s a pain to productionise code, not in the sense that you can’t run an R glm model or a Python PyTorch model in a production environment, but in that those productionised systems are brittle. They fail if the wrong type of data is passed.
The way to solve it is to recode everything in a language with static analysis and proper types, like C++ or Rust.
Recently, one of the complaints at work was that a Monte Carlo simulation model in R was too hard to put into production, since no one on the implementation team had the skills to tackle it!
The dev-production divide is a two-language problem. It’s arguably more important than the two-language problem solved by Julia.
In fact, it’s a 3 wants problem. We want to:
Easily dev the algo
Make the algo fast
Make the algo deployable in a non-brittle fashion
It’s not a 3 language problem, as one language can solve the latter 2 (C++/Rust, heck, even Java in some cases).
Unfortunately, it’s harder for Julia to solve the 3rd want, and I think Static.jl is not going to be able to solve it, since Julia’s ecosystem is not friendly to strict typing.
Alas, we are back to the drawing board. Hopefully Julia 2.0 can solve the 3rd want with some Trait system and some redesigns that can improve static analysis.
At least in ML, I get the feeling that Julia has another (bigger?) political problem. As the language is extremely composable, there is a tendency towards small libraries that can be combined at will instead of big frameworks (a similar structure has emerged in the Clojure community). In the same way, the language prevents vendor lock-in into a single monolithic framework, i.e., just use library A together with autodiff from library B if you want. To me, this seems to be a major reason why large tech companies would rather reinvent the wheel in their own closed ecosystems, forcing users into incompatible subsets of Python, i.e., PyTorch, TensorFlow or JAX, instead of investing in Julia.
For my understanding, let’s take a concrete example: say I have a system that takes a user-supplied CSV file as input and produces some output file from it. What risks does the Julia code have that the statically typed code can’t have (and maybe contrast that with risks both code bases share)?
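To make the question concrete, here is a minimal sketch of what the statically typed side of such a system might look like in Rust. The row layout (`name,age,score`), the `Row` struct and the `parse_row` function are all hypothetical, and the naive `split(',')` does not handle quoted fields. The point is only that a malformed field surfaces as an `Err` at the parsing boundary rather than deeper in the computation.

```rust
/// Hypothetical row layout for illustration: "name,age,score".
#[derive(Debug, PartialEq)]
struct Row {
    name: String,
    age: u32,
    score: f64,
}

/// Parse one CSV line into a typed Row, or report which field was bad.
/// Assumption: fields contain no quoted commas.
fn parse_row(line: &str) -> Result<Row, String> {
    let fields: Vec<&str> = line.split(',').map(|f| f.trim()).collect();
    if fields.len() != 3 {
        return Err(format!("expected 3 fields, got {}", fields.len()));
    }
    let age = fields[1]
        .parse::<u32>()
        .map_err(|e| format!("bad age {:?}: {}", fields[1], e))?;
    let score = fields[2]
        .parse::<f64>()
        .map_err(|e| format!("bad score {:?}: {}", fields[2], e))?;
    Ok(Row {
        name: fields[0].to_string(),
        age,
        score,
    })
}
```

Once a `Row` exists, downstream code can rely on `age` being a `u32` and `score` being an `f64`. What the types do not cover, in either language, is data that is well-typed but wrong, such as two numeric columns supplied in swapped order.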
I don’t know about CSV input, but here is one I just ran into:
In some code I was writing, I had a vector of Vectors. I wanted to find the least common multiple of all the Vectors’ lengths. Then I wanted to repeat the Vectors so that they all ended up the same length. I had some testing and everything checked out.
Then I was running this code over some production input and got the strangest Inexact Int error… Digging in, it said my longest vector was Inf long??? Oh, one of my Vectors had 0 elements, which was causing a divide by zero.
Rust’s static analyzer would have forced me to handle the edge case of zero length.
Edit: @mstewart caught that I was wrong here. However, I think it is this class of problem that static analyzers help with – edge cases that our monkey brains didn’t consider.
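For illustration, here is a minimal Rust sketch of the computation described above. The names `lcm_of_lengths` and `repeat_to_common_length` are hypothetical. Note that nothing in the compiler forces the zero-length check: integer division by zero in Rust panics at run time. Returning `Option` is a choice the author has to make, after which callers are required to handle the `None` case.

```rust
/// Greatest common divisor via Euclid's algorithm.
fn gcd(a: usize, b: usize) -> usize {
    if b == 0 { a } else { gcd(b, a % b) }
}

/// Least common multiple of the lengths of all inner vectors.
/// Returns None if there are no vectors, if any vector is empty,
/// or if the result would overflow usize.
fn lcm_of_lengths<T>(vs: &[Vec<T>]) -> Option<usize> {
    if vs.is_empty() {
        return None;
    }
    let mut acc: usize = 1;
    for v in vs {
        let n = v.len();
        if n == 0 {
            // The edge case from the post: an empty vector has no
            // meaningful repeat count.
            return None;
        }
        acc = (acc / gcd(acc, n)).checked_mul(n)?;
    }
    Some(acc)
}

/// Repeat every inner vector cyclically until all share the LCM length.
fn repeat_to_common_length<T: Clone>(vs: &[Vec<T>]) -> Option<Vec<Vec<T>>> {
    let target = lcm_of_lengths(vs)?;
    Some(
        vs.iter()
            .map(|v| v.iter().cloned().cycle().take(target).collect())
            .collect(),
    )
}
```

A caller cannot use the repeated vectors without first deciding what to do when the result is `None`, for example with `match` or `?`.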
I have exactly the same pain of making my Julia functions accessible to programmers using a different language such as Python. To solve the bigger two-language problem, you proposed the static compiling approach. Different from the static compiling approach, I have been thinking about a web-based solution:
define a protocol for web-based function calls, usually in the form of a JSON schema,
set up a remote host backed by computing resources (EC2 or a cluster), so that developers can upload their code to that host,
provide multi-language clients to access the web service.
So a developer just uploads his code to that server (e.g. using the Git protocol), and his function will be accessible to multi-language clients.
Do you think this web-based approach can solve most of the bigger two-language problem or do we have to go to static compiling?
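As a rough sketch of the server-side dispatch step in such a protocol: a request names a function and carries its arguments, and the host looks the function up in a registry and validates the call before running it. Everything here is an assumption for illustration (the `Call` struct, the `Handler` signature, the `mean` example function), and JSON decoding and the network layer are left out so that the example needs only the standard library.

```rust
use std::collections::HashMap;

/// Hypothetical decoded request: a function name plus numeric arguments.
struct Call {
    function: String,
    args: Vec<f64>,
}

/// Every exposed function shares one signature so it can sit in a registry.
type Handler = fn(&[f64]) -> Result<f64, String>;

/// Example exposed function; rejects an empty argument list.
fn mean(args: &[f64]) -> Result<f64, String> {
    if args.is_empty() {
        return Err("mean needs at least one argument".to_string());
    }
    Ok(args.iter().sum::<f64>() / args.len() as f64)
}

/// Build the table of functions that clients are allowed to call.
fn registry() -> HashMap<&'static str, Handler> {
    let mut m: HashMap<&'static str, Handler> = HashMap::new();
    m.insert("mean", mean);
    m
}

/// Look up the requested function and run it, returning an error
/// (instead of crashing) for unknown names or invalid arguments.
fn dispatch(reg: &HashMap<&'static str, Handler>, call: &Call) -> Result<f64, String> {
    let handler = reg
        .get(call.function.as_str())
        .ok_or_else(|| format!("unknown function: {}", call.function))?;
    handler(&call.args)
}
```

The schema plays the role that a type signature plays inside a single language: clients in any language only need to produce a request that matches it.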
That’s one way to help with deployment; there are many ways to do things like that. I think the reason TypeScript is skyrocketing in popularity, even as a backend tool, is types and how they can help with static analysis.
I think it’s worth being clear that we have no evidence this is a language problem. It’s a “laws of the universe” problem – there’s no reason to believe any tools that optimize for one set of incentives will be optimal for a different set of incentives. Unless there’s evidence you can jointly optimize all incentives simultaneously, the issue isn’t technical, but social – how do you balance the competing incentives and decide on defensible tradeoffs?
It is surely the case that existing languages aren’t on the optimal tradeoff frontier, but even an ideal language may not solve the problem you’re saying most interests you.
I’m not sure I have a lot to add to the discussion generally. I have used Haskell and do like these sorts of features. But there are a lot of tradeoffs in languages. Julia has different strengths. I haven’t tried Rust, mostly because giving up garbage collection doesn’t seem essential for performance in the sort of numerical code I want to write. That is a tradeoff that didn’t sound appealing.
However, I am wondering how this is enforced in Rust. I know Rust has algebraic data types and pattern matching, with which you can certainly enforce handling edge cases. Looking at the Rust docs, it looks like vectors provide a length function that returns an integer and can be used without a pattern match. I would have thought your bug would still result in a divide by zero at run time, even in Rust. Are you referring to something that is standard and idiomatic in Rust or something that is possible but not commonly done? Even in Haskell, which has a culture strongly in favor of static type safety, they don’t force special handling of empty lists or vectors in their standard library. (Although I think some in that community consider that a mistake.)
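As a point of reference for the question above: `Vec::len` does return a plain `usize`, and dividing by it when it is zero panics at run time. The opt-in route is a type such as `std::num::NonZeroUsize` from the standard library, sketched below. The function names `copies_needed` and `copies_for` are hypothetical.

```rust
use std::num::NonZeroUsize;

/// How many copies of a vector of length `len` fit into `target`.
/// Because `len` is a NonZeroUsize, this division cannot be by zero.
fn copies_needed(target: usize, len: NonZeroUsize) -> usize {
    target / len.get()
}

/// Boundary function: converts a possibly-empty slice into the
/// non-zero length the inner function demands.
fn copies_for<T>(target: usize, v: &[T]) -> Option<usize> {
    // NonZeroUsize::new returns None when given 0, so the empty case
    // has to be handled here before copies_needed can be called.
    let len = NonZeroUsize::new(v.len())?;
    Some(copies_needed(target, len))
}
```

This pushes the empty case to the type boundary, but only for code that chooses to accept `NonZeroUsize` in the first place.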
Ecosystem fragmentation, outdated and incomplete package documentation, and poor method discoverability can make it pretty rough when you’re trying to explore the design space and learn new APIs.
Not much point in proving correctness of code that doesn’t correctly encode the author’s intent, which is hard to get right if it’s an illegible mess.
Even in the strictest language you can make bugs. Question is, what features will minimize the total errors, not just the errors that can be caught by the compiler? IMO Julia has found a pretty sweet spot here. (although maybe it can get sweeter?)
Regarding safety, I wonder if it would be worth it to add ErrorTypes.jl to Base in Julia 2.0. I haven’t played around with ErrorTypes.jl, so I’m not sure how ergonomic it would be if Base used it across the board. Even simple functions like first and last would have to return an Option. It could be taken to extremes… Should sqrt return a Result? Right now it throws on negative real number inputs.
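For comparison, this is roughly what that style looks like in Rust, where the standard library's slice `first` already returns an `Option`. The `checked_sqrt` and `sqrt_of_first` functions are hypothetical and written only for this sketch; Rust's own `f64::sqrt` returns NaN for negative input rather than a `Result`.

```rust
/// Hypothetical square root that reports negative or NaN input as an
/// error instead of returning NaN.
fn checked_sqrt(x: f64) -> Result<f64, String> {
    if x.is_nan() || x < 0.0 {
        Err(format!("sqrt is not defined for {}", x))
    } else {
        Ok(x.sqrt())
    }
}

/// Square root of the first element, threading both failure modes
/// (empty input, negative value) through the return type.
fn sqrt_of_first(xs: &[f64]) -> Result<f64, String> {
    // `first` returns Option<&f64>; convert None into an error message.
    let first = xs.first().ok_or_else(|| "empty input".to_string())?;
    checked_sqrt(*first)
}
```

Even this two-step function needs a conversion and a `?`, which gives some feel for what applying the pattern across all of Base would involve.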
I have tried ErrorTypes.jl in a few small places. One nice thing is that JET.@report_call from JET.jl can then catch when you have failed to handle an option type correctly. It seemed to work well. I first tried it a few weeks ago and will probably use it more in the future. I think JET.jl can also find uncaught exceptions using mode=:sound, but when I’ve tried it, there are a lot of things from Base. Using JET.jl and ErrorTypes.jl together seemed more helpful.
However, although I liked it where I used it, the ergonomics seem like they would be horrible for interactive use if it were applied broadly.
You’re asking a question a little out of my depth there! Rust’s compiler provides an enormous number of protections with its enum types, of which Option is the best-known example. An enum is a sum type that lists every possible state for a value. For example, an HTTP request might have a 200 variant, a 404 variant, a 500 variant, etc. Rust’s compiler requires you to handle all states to “unwrap” the value and get the actual response out. If you correctly model the states of your problem space, the Rust compiler and type system will do a ton of correctness checks.
That said, I unintentionally pointed out a place where this breaks down in this thread.
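A small sketch of the idea, using a made-up `Response` enum for the HTTP example (the variant names and the `describe` and `body_len` functions are hypothetical):

```rust
/// Hypothetical model of the possible outcomes of an HTTP request.
enum Response {
    Ok(String),       // 200, carrying the response body
    NotFound,         // 404
    ServerError(u16), // a 5xx status code
}

/// Every variant must be covered; deleting any arm below is a
/// compile-time error (non-exhaustive patterns).
fn describe(r: Response) -> String {
    match r {
        Response::Ok(body) => format!("got {} bytes", body.len()),
        Response::NotFound => String::from("not found"),
        Response::ServerError(code) => format!("server error {}", code),
    }
}

/// Option is the two-variant special case: Some(value) or None.
fn body_len(maybe_body: Option<&str>) -> usize {
    match maybe_body {
        Some(b) => b.len(),
        None => 0,
    }
}
```

The check only covers the states that were modeled: if the enum omits a case that can occur in practice, the compiler has nothing to enforce.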
I’m not sure if there is an even stricter language out there.
I think Haskell is probably stricter than Rust on some things in that it aims to be purely functional and side effects happen only in ways that are tracked by the type system using monads. But the standard library is not completely consistent in using option types everywhere there might be an error. Division is a tough case for any language. You could have it return an option type, but you would annoy a lot of people with the ergonomics of that. There are also more exotic dependently typed languages that incorporate theorem proving and can require a compile time verifiable proof that a number is nonzero before dividing by it. Idris would be an example. Theorem proving systems like Coq are super-strict dependently typed languages. Typically, they even check for termination. I think they’re all pretty tough to use, but it is interesting how far you can go with that approach.
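To give a feel for the ergonomics of option-returning division, here is a short sketch using Rust's standard `checked_div`, which returns `None` on division by zero or overflow. The wrapper functions `ratio` and `ratio_of_ratios` are hypothetical.

```rust
/// Integer division that returns None instead of panicking when
/// `count` is zero (or when the result would overflow).
fn ratio(total: i64, count: i64) -> Option<i64> {
    total.checked_div(count)
}

/// Two chained divisions: each one needs its own `?` (or a match),
/// which is the ergonomic cost of making division fallible.
fn ratio_of_ratios(a: i64, b: i64, c: i64) -> Option<i64> {
    let ab = a.checked_div(b)?;
    ab.checked_div(c)
}
```

Rust keeps the plain `/` operator as the default and offers `checked_div` as an opt-in, which is one way of handling the tradeoff described above.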