Not only for technical computing: changing the narrative around the use case for Julia


#1

Hi. I’m a relatively new member of the community, and I really only started to dig into Julia in the past few months. I had poked at it a few years ago because I found the initial announcement so intriguing. Here’s a list of things I like in programming:

  • Lisp and metaprogramming
  • Haskell and awesome type systems
  • The write- and readability of Python (and that ecosystem, of course)
  • String processing in Perl
  • Interfaces for working with process calls (I’ve written a couple myself)
  • Making stuff go fast with C
  • Unix-style programming

Sound familiar? So, as you might imagine, the initial announcement hit all the right points for me. I messed with it a little and found some things that were weird to me, but I couldn’t be bothered to figure them out. Plus, Julia was really just a language for technical computing, and I’m a professional string mangler, so what’s the point, right?

At some point last year, I sort of started to look into it again because I was fairly fascinated by how well thought-out the command literals in the language are, and I thought it might be a more elegant way to replace Bash scripts than doing Python rewrites. (It was, by far.)

My experiences inspired me to begin working on a tutorial for administrative scripting with Julia, based loosely on a similar one I wrote for Python.

Since then, I’ve also used it for small string processing and parsing jobs at work, where the thing I needed to do was very simple, but I needed to do it about 800,000,000 times. I’ve used it to solve Advent of Code puzzles (very helpful for getting a feel for the language) and I’m working on using Julia to wrap Raptor, an RDF parser in C.

The thing I notably have NOT used Julia for at all is numeric computing, because that isn’t really in my field of interest or work. (I use some statistics for the string stuff I do, but it’s the kind of thing that can be done in AWK as well as any other language. It’s not fancy.)

I’m consistently amazed that Julia is not only good at general programming tasks, but that it frequently is better for general problems than most of the other languages I like. I mentioned that I’m interested in Unix programming, and Julia is so perfect for that stuff. Standard streams and command-line arguments are in the default namespace, file read functions default to stdin, newlines are omitted by default when iterating over lines (why are Julia and Bash the only languages that get this?). It also has filesystem functions in the default namespace, and those delicious regex and command literals. If you didn’t know anything else about Julia, you’d think it was a language designed to compete directly with Perl.
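To make the Unix-flavored features above concrete, here is a minimal sketch of a grep-like filter. The function name `grep_lines` and the sample input are hypothetical, made up for illustration; `stdin`, `eachline`, `occursin`, and the `r"..."` regex literal are all available without any imports.

```julia
# Hypothetical helper: return the lines of `io` that match `pattern`,
# each prefixed with its line number. Reads from stdin by default.
function grep_lines(pattern::Regex, io::IO=stdin)
    hits = String[]
    for (n, line) in enumerate(eachline(io))   # trailing newlines are already stripped
        occursin(pattern, line) && push!(hits, "$n: $line")
    end
    return hits
end

# Demo on an in-memory stream so the sketch runs without an input file.
demo = IOBuffer("alpha\nbeta 42\ngamma\ndelta 7\n")
foreach(println, grep_lines(r"\d+", demo))
```

In a real script one might build the pattern from the command line with something like `Regex(ARGS[1])` and let the function read from `stdin`.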

Then you go to the macros. Macros for days. With the absolute power they give you to manipulate semantics and define DSLs, you wouldn’t blink if someone told you Julia was designed as a Racket competitor.

Eventually, you hit the point where you want to optimize something and you want speed and concurrency. You can allocate all your buffers and mess around with pointers and flipping bits if it helps. Plus, amazing multiprocessing, fantastic green threads, and pretty OK-ish OS threads (haven’t used them much). You might think it was meant to compete with Go on networking and scalability.

Is Julia great for technical computing? I personally don’t know, but I’m led to believe it is. However, it is also great for all kinds of other jobs.

I really want to get this message out that Julia isn’t only for scientists and economists, but that it’s also amazing for “normal programmers” like me! The more programmers get into Julia, the better the ecosystem shall be for all of us.

What are some things we might do as a community to promote Julia as a general-purpose language which is great in many domains including but not limited to technical computing?


#2

You could make this a blog post, as a good first step!


#3

I don’t have a blog at the moment, but I’ve been thinking of starting one. Maybe I should get to that. It’s just… Bleh. I guess I should just bite the bullet and learn Hugo properly.


#4

You should — you write really well and have an interesting vantage point!


#5

Thanks, I appreciate that!


#6

These terms are always fuzzy, but I would argue that

(\text{numerical computing} \cup \text{string mangling}) \subset \text{technical computing}

Also, I am not sure there needs to be a single narrative, I think that various ones can coexist. There is nothing preventing you from starting your own for the string mangler community, and I would say a writeup like the above is a good start.


#7

See that’s the other thing about me: don’t know math very well. I had to go on Wikipedia to discover that this means the union of numeric computing and string wrangling is a subset of technical computing. Cultures in conflict! :rofl:

In my particular case, I’m sort of fixing broken metadata in a library catalog. I’m just some kind of digital humanities janitor. It doesn’t feel like technical computing to me, but maybe I need to broaden my definition.


#8

You could blog at Nextjournal! You can publish your articles there, with the added benefit that you can import IJulia notebooks and Markdown and directly inline executable Julia code in your article :wink:


#9

Can Nextjournal attach tags to documents and generate RSS feeds for that, so that the finished article can be listed with the JuliaBloggers aggregator?


#10

I couldn’t agree more. In general I view numerical computing as a strictly harder problem than what people call “general purpose computing”. If you’re doing numerical computing, of course you need to be able to do shell-style programming and string manipulation and I/O and threading. But you also need to be able to crunch numbers quickly and accurately. So Julia is designed to be good at all of those things and great at numerical stuff. Just because I want to multiply some matrices doesn’t mean I don’t want to open files.

I’ve been thinking about a blog post entitled “Julia is a General Purpose Language”. But if you want to write such a blog post first, please do so! It’s more convincing coming from someone less biased.

I confess, I did a lot of Perl and Ruby programming earlier in my life. It was a fairly significant influence on that part of Julia’s standard library design. Some of the earliest Julia blog posts [1] [2] (badly in need of an update, but you can still get what they’re talking about) were about how Julia does shelling out the right way compared to Perl, Ruby, Python, et al. (yes, Python has subprocess but man, that is not a pleasant API compared to Julia’s command objects).

While I would agree with your assessment (in your administrative scripting post) that Julia may not be ideal for administrative scripting because of the heftiness of the runtime, the command API was expressly designed for that kind of thing and I think it’s actually better than that of any other language, possibly including shells. I had observed that the backticks for shelling out in languages like Perl and Ruby are so convenient and seductive that I would see my colleagues using them all the time. I would always cringe a bit, though, because every time someone splices a file name into backticks in Perl or Ruby, they’re creating a bug/trap that’s just waiting to get sprung by a file name with a space or some other metacharacter in it. I thought, instead of lecturing people on not doing that, how would you design something where the convenient, seductive thing was actually the correct, safe way to call an external process? And thus Julia’s backtick syntax was born. Instead of shelling out and causing all sorts of problems because of leaning on the shell, it implements shell-like semantics itself and avoids all of those problems.
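A small sketch of the interpolation behavior described here. The file names are invented for illustration, and `exec` is a field of `Cmd` used only to inspect the argument vector; nothing is actually run.

```julia
# A file name containing a space and a shell metacharacter.
file = "my file; echo oops.txt"

# Interpolating a string yields exactly one argument; no shell ever parses it.
cmd = `cat $file`
println(cmd.exec)     # the argument vector the OS would receive

# Interpolating a collection expands to one argument per element.
files = ["a.txt", "b c.txt"]
multi = `wc -l $files`
println(multi.exec)
```

Because the command is built as an argument vector rather than a string handed to a shell, the space and the semicolon stay inside a single argument.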

Another benefit of not shelling out to call commands is that Julia’s command API is actually fully portable: it works exactly the same on Linux, Mac, FreeBSD and Windows because it doesn’t rely on an external shell, a shell that will be GNU bash on Linux and macOS but tcsh on FreeBSD and won’t exist at all on Windows. Shelling out is one of the largest headaches when people try to port tools between operating systems. Julia doesn’t have that problem. And all of Julia’s I/O is fully portable because we use libuv.

Btw, I’m currently working on a design for adding more shell-like features inside the backtick syntax which will ultimately bring that up to the level of being nearly as capable as a shell. Julia’s backticks will essentially be a fully portable mini shell language, without the control flow and evaluation constructs since you’ve got a full programming language for that outside of the backticks.

This will especially be true once the ongoing threading work is done. After that Julia’s threading model will be essentially the same as Go’s: Julia’s tasks will be the equivalent of Go’s “goroutines”—coroutines that may be run concurrently on different hardware threads. We also try to take it a step further by having a distributed programming model that matches the multicore programming model as closely as possible: Channel versus RemoteChannel, etc.
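As a rough illustration of tasks communicating over a `Channel`, here is a minimal producer/consumer sketch. It assumes Julia 1.3 or later for the `Channel{T}(f, size)` constructor; the function name `squares` is hypothetical.

```julia
# Hypothetical producer: yields the first n square numbers through a channel.
function squares(n::Int)
    # Channel{Int}(f, 8) runs f in a new task bound to a buffered channel
    # and closes the channel when f returns.
    return Channel{Int}(8) do ch
        for i in 1:n
            put!(ch, i^2)
        end
    end
end

# Iterating (here via collect) consumes values until the channel is closed.
results = collect(squares(5))
println(results)
```

The distributed counterpart mentioned above, `RemoteChannel`, lives in the `Distributed` standard library and is meant to mirror this interface across processes.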


#11

YESYESYES!

All of this exactly. I wrote a Python preprocessor, eggshell, some years ago that allowed inlining shell commands, and it basically wanted to address exactly the same problems that you mention with Ruby and Perl backticks. Imagine my shock when I discovered Julia Cmd literals only to find that you had created a nearly identical API. I used different syntactic conventions, but exactly the same logic: commands never get a shell, strings and scalars are treated as single arguments while other iterables are expanded, lines are chomped when you iterate, a non-zero exit code raises an exception where possible, and commands are just objects that don’t do anything until you try to do something with them (well, mostly. They are more stateful than Julia’s, but I think your design is better on that count).

My discovery of Julia essentially killed that project. No need to keep going when you’d already done it perfectly. Every time I use Julia for any of that scripting kind of stuff, I always feel like I’m communicating with “someone who gets it”. I guess that was you!

And it’s definitely better than any shell except maybe Fish, which operates under similar logic. POSIX shell is a great user interface and a horrible programming language. It might be just a little more handy in Julia if you could do pipes and some redirection inside the backticks, but I assume those are the kind of extensions you’re currently talking about adding, since it’s the obvious thing.
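For reference, pipes and redirection are currently written outside the backticks with `pipeline`. A minimal sketch, assuming a Unix-like system where `echo` and `tr` exist if the commands are ever run; the output file name is made up, and nothing is executed here:

```julia
# Shell equivalent: echo hello | tr a-z A-Z
upper = pipeline(`echo hello`, `tr a-z A-Z`)

# Shell equivalent: echo hello > out.txt   (out.txt is a hypothetical path)
to_file = pipeline(`echo hello`, stdout="out.txt")

# Both are inert command objects until handed to run, read, open, etc.
println(upper)
println(to_file)
```

On a machine with those programs installed, `read(upper, String)` would run the pipeline and return its output.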


#12

Python subprocess? I still have the scars. And IT CHANGED during the Python 2.x series.

Anyway, I completely agree with @ninjaaron and his Harry Met Sally ‘crisis’.


#13

A blogpost on livejournal would be awesome!


#14

Here are some practical things that I think would help with Julia being accepted as a general-purpose language rather than being purely for technical computing. This isn’t a list of complaints but is meant to be constructive. Most of this has to do with the common theme of packaging software for end-users (not for other developers, which I think is already quite nice in the Julia ecosystem, at least once https://github.com/JuliaLang/julia/issues/27418 is fixed), and much of this is well-known and actively being worked on.

The first big thing is that unlike Python or Perl, Julia isn’t packaged by most distributions and must be installed separately. Part of this obviously has to do with lower adoption of Julia compared to other languages, but part of it is self-inflicted. Specifically, because Julia carries so many patches of its dependencies, it doesn’t play nice with linking against dependencies provided by a distribution, especially when it comes to LLVM. This means that on my distribution, where I have packaged Julia, the package weighs in at a hefty 200 MiB. Of this, 45.4 MiB is LLVM. Julia devs have done a good job in the past of pushing LLVM patches upstream, but it would be wonderful if sometime in the next few releases, Julia could depend on an unpatched LLVM. That would help make the case for more distributions to carry it in their package repos.

Along that note, the package manager and packaging ecosystem isn’t currently very friendly to system Julia installations. My ~/.julia folder is currently 2.8 GiB. If another user on my computer also wanted to use Julia, they may end up with a ~/.julia of a similar size. As far as I know there’s no good way currently to provide system installation of common packages that can be shared between users. And I’m not even going to get into the mess that we’ll be in once such a means is provided and somebody wants to package something that uses BinaryBuilder.jl but wants to use the system libraries instead.

Because Julia doesn’t come packaged on most distributions, any software written in Julia must be distributed alongside the Julia runtime. PackageCompiler is great for this if you can get it to work with your software, but it still needs polish, and the binaries it produces are still rather heavy. Other solutions involving bringing along a whole Julia installation are of course even more heavyweight, and then you’re also stuck with the often very large pause of compilation time at invocation.

Basically, for actually distributing software to end-users, Julia, its implementation, and its ecosystem just get in the way right now, which is a shame given that I completely agree with you on the suitability of the language itself across a wide variety of domains. I think solving these packaging issues would go a long way towards changing the narrative you’re talking about.


#15

@jameson or @Keno: Does Julia’s static compilation size reflect a fundamental limitation of Julia’s design or current compiler (i.e. GC and allocating code style in Base or MD with generic code)? Or can it be improved without an extensive overhaul of the language or code? Perhaps by restricting input types?


#16

I think this is considered insignificant these days on modern desktops/laptops.

Effectively, you are suggesting substituting hardware (storage space), which is super-cheap, with developer effort, which is very expensive and the bottleneck for many things.


#17

Multiply 45.4 MiB (for LLVM) by the number of Julia users, and that’s closer to the number you can compare to the cost of developer effort if you want to go down that road. The reality is that most of the patches Julia carries for LLVM are fixes for bugs in LLVM, so the real total utility of upstreaming them isn’t even localized to only Julia users but to all users of projects that depend on LLVM. In any case, a good deal of the effort involved is on the part of the LLVM team to review said patches. When I last checked several months ago, there were several patches Julia had sent upstream that hadn’t even been looked at yet.


#18

“I think this is considered insignificant these days on modern desktops/laptops.”

Strong statement. I have to say that @non-Jedi brings up some important points about why Julia is currently not yet a drop-in replacement for a variety of tasks. This does not mean that Julia is not a very nice general purpose programming language, but if I have to recommend Julia to other users (which the OP is kind of about) I would not just bring up the positive things but also the negative things. We have a 30 user setup with 30*3GB (.julia folders) = 90 GB storage just for Julia packages on a single computer.

My personal reason to not recommend Julia for non-numeric applications is primarily the compile time. Furthermore, GUI programming does not seem to be a prime focus of the community, which means that Gtk.jl is nowhere near where the Python bindings are yet. For numerical computing Julia is much better than its competitors.


#19

“Julia devs have done a good job in the past of pushing LLVM patches upstream, but it would be wonderful if sometime in the next few releases, Julia could depend on an unpatched LLVM.”

Sure, that would be great. Let’s hope the next LLVM version doesn’t have any bugs. Seriously, though, Julia uses the LLVM toolchain much harder than most projects. And it’s this intense level of compiler tech that makes Julia so fast and capable. I don’t anticipate Julia being able to use a vanilla LLVM version until we stop pushing so hard on compiler tech, which I don’t really see happening anytime soon. Most of the projects in the compiler work thread (that everyone is so excited about) are likely to uncover LLVM bugs that will require patches until proper fixes can be upstreamed. By comparison, the languages you’re comparing Julia to are extremely conservative with the language tech they use. Guido van Rossum has repeatedly rejected proposals to add “fancy language techniques” to CPython, even if they provided significant speedups.

“Along that note, the package manager and packaging ecosystem isn’t currently very friendly to system Julia installations. My ~/.julia folder is currently 2.8 GiB.”

You must have a lot of stuff installed! Mine is only 60 MB. Have you tried doing pkg> gc? That will clean up any package versions that are no longer referenced by any manifest file.

“As far as I know there’s no good way currently to provide system installation of common packages that can be shared between users.”

There absolutely is. Your default DEPOT_PATH will include something like /usr/share/julia and /usr/local/share/julia. These are where shared installations should go. Any packages installed there with the right permissions will be usable by any user with those depots in their depot path.
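To see which depots a given installation actually searches, something like the following sketch can be run. The exact entries vary by platform and installation, so none are assumed here.

```julia
# List every depot Julia will search, noting whether it exists on this machine.
for depot in DEPOT_PATH
    status = isdir(depot) ? "(exists)" : "(missing)"
    println(rpad(depot, 40), " ", status)
end
```

The first entry is typically the per-user depot, and the list can be overridden with the `JULIA_DEPOT_PATH` environment variable.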

“Because Julia doesn’t come packaged on most distributions, any software written in Julia must be distributed alongside the Julia runtime. PackageCompiler is great for this if you can get it to work with your software, but it still needs polish, and the binaries it produces are still rather heavy. Other solutions involving bringing along a whole Julia installation are of course even more heavyweight, and then you’re also stuck with the often very large pause of compilation time at invocation.”

This is all quite true. The way forward is to improve PackageCompiler and to be able to generate leaner standalone binaries for Julia programs. Of course, that will probably require carrying a bunch of LLVM patches around :man_shrugging:


split this topic #20

8 posts were split to a new topic: Side discussion on LLVM C backend