I'm trying to write something with the best possible performance (for a library). I keep wishing I was writing C! Is that normal? Should I just write C?


#1

I’m trying to write something with string parsing (something that may eventually turn into a JSON parser) with really good performance. I’m working on something that needs to parse JSON-esque string literals like 800 million times. The current JSON library in Julia is fine for little things, but it’s not as fast as (for example) Python’s JSON module. (It’s just that everything else I’m using it for is faster in Julia.)

At the moment, I’m kind of trying to write C in Julia: Turn off bounds checking, pre-allocate lots of buffers, deal with Base.CodeUnits instead of strings, use views for everything. I’m doing bit shifting. I’m doing all the things.

My current problem is that I’m trying to send a part of a string to parse that I’ve already turned into CodeUnits, but I don’t want to pay for it being copied to a proper string.

Anyway, I know how to do it in C. char is just another number. There is no conflict. In Julia, I have to try really hard to work with buffers and raw arrays. Just let me send a $*^@#% pointer to some random point on an array to a string function!!

It seems that writing things in pure Julia is considered a virtue in the community. I kind of want to use Glib instead of parse. I kind of want a lot of things. Is this normal? Should I just do it in C? I dunno. I just want to make something useful and reusable and as effing fast as possible.


#2

Are you talking about JSON.jl? If there’s a problem with the performance of that package, why not try and improve it over there instead of reinventing the wheel?

Aside from that, do you have a specific question about the performance of some piece of code? If so, please post it here so we can help you.


#3

You could look at using https://github.com/JuliaData/Parsers.jl. It provides a number of different parsing abilities on any IO object, including IOBuffer(raw_byte_vector). It’s used to power the parsing functionality in CSV.jl and JSON2.jl. Happy to answer any questions or help give pointers however I can.


#4

Sorry! I was writing my post and somehow did a typo that resulted in my message being posted before it was finished, and I think you started answering before that!

The question is, should I just write C if I’m trying to write something very fast with manual memory stuff, or should I stick it out and try to do really fast pure-Julia?

I will make a pull request to JSON.jl if I come up with something faster, but that actually involves writing something first. I’m just working on string literals right now (because I need that), and I’m sure it could be patched in. The current tagline for JSON.jl is that it’s “in pure Julia”. That’s fine and all, but I want it to be fast more than I want it to be Julia.

The question is, should I keep trying to write optimized Julia, or should I just write C?

I’m also not entirely sure which cases Julia will optimize, so I don’t know how much time I’m wasting.


#5

The thing about Julia is that you don’t need to get nitty gritty down into memory shuffling to get really good performance. If you’re trying to do that in Julia, you’re probably doing it wrong.

It seems like you’re unhappy in general with some performance you’re not getting - what have you tried so far? Please share the code, so we can take a look and propose changes to make it faster or share some specific part you’re unhappy with.


#6

What made you choose Julia in the first place?


#7

I would argue that you should just write pure Julia. At worst, the innermost loops will be no worse than low-level C code, though often there are tricks you can do in Julia (e.g. via metaprogramming or higher-order abstractions) that are not easily available in C even when you are not writing type-generic code, though it takes time to learn how to use these effectively. Moreover, all of the surrounding non-critical code is likely to be much nicer in a high-level language. And it will be much easier to distribute your code as a Julia package (since you won’t have to worry about the availability of a compiler on the user’s machine).


#8

I’m normally a Python programmer, but I can also write some C (and some other things). I became interested in Julia because it is many times faster than any other dynamic language for processes that run longer than JIT warmup. I also like it because of AST macros, multiple dispatch, and because it is surprisingly excellent for administrative scripting. Most of the filesystem functions are in Base and have surprisingly UNIX-y names, and the Cmd literal is the best abstraction I’ve seen for running processes in any language, bar none. (I started programming with BASH and learned other things when I wanted to write programs that were actually hard, so I still have a soft spot for administrative/automation scripting.)

Professionally, I’m now doing some very specialized string manipulation (specifically, reconstructing Hebrew script from Romanized Hebrew in metadata; I’m a Hebraist by education, but my stuff aims to be applied to all kinds of script-conversion problems). I’m not a mathematician or computer scientist, I’m just a humanist-turned-programmer who wants to do things with strings really fast. Python is mostly fine, but sometimes it’s not, and when it’s not, you desperately try to extend it with C, but all the APIs suck. In Julia, the language itself is way faster, and the interop with C is phenomenal, and going back and forth between the two is no problem.

So, normally, I want to write in a Python-ish, high-level language (and Julia is great for that), but I’m also aware of performance, and I know how to write fast-ish C.

So far, Julia has been really good for me at being faster than Python, but now I’m trying to write something that’s comparable in speed to C.

The Julia dream is that it should be able to facilitate this. The tagline is that it solves the two-language problem. I’m sure it does, somehow. I’ve had a lot of success writing Julia that is faster than Python. I’ve had minimal success in writing Julia that is faster than libraries implemented in C.

It’s not that I don’t love Julia. I do. If I write this thing in C, I’m still going to call it from Julia, and it will be so easy. That’s the beautiful thing. It’s that, when I want to deal with pre-allocated buffers, pointers and raw arrays, it all comes much more naturally in C, because C is designed for horrible things like pre-allocation and mutating objects in place and all those things that are really fast for the cache and the CPU.


#9

It takes time to learn to do performance optimization in a new language, especially a new high-level language where you aren’t sure at first what the effects of different abstractions are. But I’ve never found a case where an experienced Julia programmer, given enough time, could not essentially match C performance, usually with cleaner and more versatile code. (You should almost never have to deal with raw pointer arithmetic.)

(The cases where Julia does not match C performance are mainly large, highly optimized C libraries that represent an enormous amount of engineering effort … it simply takes a lot of time to replicate that effort in a new language.)

If you have a small working snippet that you are having trouble optimizing, feel free to post it on discourse — a favorite pastime of many Julians is seeing who can squeeze the most performance out of a simple problem. Seeing what people come up with will help you to learn how to optimize your code in Julia.


#10

If you’re interested in parsing, you might find some of the work of the BioJulia group useful, in particular Automa.jl, which they’ve been using for automatically generating fast parsers for various input formats.

See also Ben’s JuliaCon talk:


#11

I wonder if Cassette.jl could provide the general solution you need. i.e. context specific optimizations.

Something I’ve wondered about (maybe it exists already?) is whether it would be hard to do the equivalent of C’s inline assembly, like @__llvm__(some llvm code).
I would use __asm__ occasionally in C to, for example, get an as-fast-as-possible specific geometry operation.
You’d probably lose much of the context between llvm calls that you needed for your specific problem, though.


#12

Really good question. The tricky thing is that I don’t know exactly what Julia optimizes, but I do know what works well in C. I’m sure Julia is fast for a lot of things I don’t know about… but I don’t know about them. At the moment, I’m trying to convert strings to code units and do all non-mutating slices with @view, and trying to avoid converting back into String or Char until the last possible moment. I would desperately prefer to write the whole thing with Char instead of UInt8 (bytes), but my intuition is that converting between Char and String does a lot of stuff: namely, it converts a UInt32 to several UTF-8 UInt8s every time, with several memory allocations. I want to keep everything in buffers and avoid allocations as much as possible. I’m just not sure how much the Julia compiler does this for me, and how much I need to worry about. This code doesn’t even run yet, but as you requested, here is what I’m doing so far: https://github.com/ninjaaron/LibAaron.jl/blob/master/jsondec.jl
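For the Char-versus-UInt8 worry, here is a minimal sketch of scanning at the byte level while still writing the delimiters as character literals. It relies on one fact about UTF-8: every byte of a multi-byte character is 0x80 or above, so comparing raw bytes against ASCII constants such as `UInt8('"')` cannot produce a false match. The function name `closing_quote` is made up for illustration and is not taken from the linked file.

```julia
# Find the index of the closing double quote of a string literal,
# starting the scan at `start` (the first byte after the opening quote).
# Returns `nothing` if the literal is not terminated.
function closing_quote(bytes::AbstractVector{UInt8}, start::Integer)
    i = start
    while i <= lastindex(bytes)
        b = bytes[i]
        if b == UInt8('\\')
            i += 2            # skip the backslash and the byte it escapes
        elseif b == UInt8('"')
            return i
        else
            i += 1
        end
    end
    return nothing
end

s = "\"ab\\\"c\" rest"            # bytes: " a b \ " c " followed by " rest"
closing_quote(codeunits(s), 2)    # 7 for this input
```

No Char or String is created during the scan; a conversion is only needed once the final range is known.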

The problem isn’t that it’s impossible to optimize Julia–the problem, for me, is that it takes a lot more thinking to work on raw arrays and do in-place mutation in Julia than in C.


#13

So yeah, funny thing. As I mentioned, I’m not a computer scientist, and I don’t know anything about llvm or asm or anything. I just happen to be able to write pretty fast C, because I figured out that everything works much faster with pre-allocated buffers and pointer arithmetic than constant copying, bounds-checking, etc. I’m not an expert in C, and certainly not in algorithms, but I can usually make things fast. At present, learning LLVM or asm is outside of scope. Not that I’d never try it (LLVM is a very interesting technology), but I only know how to write C and Julia at the moment (well, and Python and four or five other high-level-languages that are irrelevant)


#14

From the way you describe your problem and motivation, I think you’re ideally suited to get a lot out of Julia’s mixture of performance and convenience. And you’ll certainly find the community is full of like-minded people who care about performance and just love a good optimization challenge :)

I know well that “fish out of water” feeling, the frustration of going slow while knowing just how to do the task in another language / system. Of course, if you already know how to do it you can just call C and I don’t think you should feel bad about it. Especially if you’re in a hurry and if it lets you get some “real work” done! But I’d encourage you to try optimizing it in Julia because good performance should be achievable, you’ll learn a lot which will be useful in the long run, and the library will be easier to distribute without binary dependencies.

Try the code introspection macros to quickly build a good intuition about this: @code_warntype, @code_lowered, @code_typed, @code_llvm and @code_native. The last of these remains one of my favorite things about julia and it still blows my mind a little bit that I can get the native assembly in the REPL, and change the implementation interactively to see how the optimizer does.
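As a small sketch of what that looks like in practice (the function `count_byte` is a made-up example, not code from this thread):

```julia
using InteractiveUtils   # provides the @code_* macros outside the REPL

# Count how often one byte value occurs in a byte vector.
function count_byte(bytes::AbstractVector{UInt8}, target::UInt8)
    n = 0
    for b in bytes
        n += (b == target)   # Bool promotes to Int, so n stays an Int
    end
    return n
end

bytes = codeunits("a,b,c")

# Prints the inferred type of every variable in the method body;
# non-concrete types are flagged, which is the usual sign of slow code.
@code_warntype count_byte(bytes, UInt8(','))

# The same call shape works for @code_llvm and @code_native.
```

If everything in the `@code_warntype` output has a concrete type, the remaining performance questions are usually about allocations and memory access rather than about the compiler.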


#15

I’m kind of trying to write C in Julia: Turn off bounds checking, pre-allocate lots of buffers, deal with Base.CodeUnits instead of strings, use views for everything.

All of these sound reasonable and unavoidable in any language. You can’t make the compiler create buffers for you (at least not in the general case), so you have to modify function signatures. You either do bounds checking or you don’t; Julia and C provide both of these options, though with different defaults. You either copy arrays or use views/pointers; again, both options are available in Julia and C with different default behavior. So I wouldn’t call it “writing C in Julia”, I’d call it “thinking of low-level details”, which are actually language-agnostic.
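A rough sketch of what those three choices look like together in Julia: a caller-supplied output buffer, `@inbounds` to skip bounds checks, and byte vectors instead of strings. The function `unescape!` is hypothetical and deliberately incomplete. It only handles `\n`, `\t`, `\r` and escapes that map to themselves (such as `\"` and `\\`); it does not handle `\uXXXX`, `\b` or `\f`.

```julia
# Copy `src` into `out`, resolving simple backslash escapes on the way.
# Returns the number of bytes written. `out` is never resized.
function unescape!(out::Vector{UInt8}, src::AbstractVector{UInt8})
    # One explicit check up front makes the unchecked writes below safe:
    # the output is never longer than the input.
    length(out) >= length(src) || throw(ArgumentError("output buffer too small"))
    n = 0
    i = firstindex(src)
    stop = lastindex(src)
    @inbounds while i <= stop
        b = src[i]
        if b == UInt8('\\') && i < stop
            i += 1
            c = src[i]
            b = c == UInt8('n') ? UInt8('\n') :
                c == UInt8('t') ? UInt8('\t') :
                c == UInt8('r') ? UInt8('\r') : c
        end
        n += 1
        out[n] = b
        i += 1
    end
    return n
end

buf = Vector{UInt8}(undef, 64)           # allocated once, reused for every call
n = unescape!(buf, codeunits("a\\nb"))   # input bytes: a, backslash, n, b
result = String(@view buf[1:n])          # the only copy, made at the very end
```

The trailing `!` is the Julia naming convention for functions that mutate an argument, which is how the changed signature gets communicated to callers.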

Anyway, I know how to do it in C. char is just another number. There is no conflict. In Julia, I have to try really hard to work with buffers and raw arrays. Just let me send a $*^@#% pointer to some random point on an array to a string function!!

You might just need another kind of strings instead of default Julia String type. Indeed, depending on use case, strings may have very different implementation with very different pros and cons. For example, in C++ there are 2 “standard” string types (char* and std::string), but almost every large library implements its own string type. Python 2 vs Python 3 strings (and bytes) have been a topic for many holy wars. UTF8 vs UTF16 vs UTF32. Null-terminated vs. length-keeping. Mutable vs. immutable. Strings are really hard!

Fortunately, Julia supports alternative implementations of AbstractString which should work with most string functions without conversion. You might find something more suitable for your needs in Strs.jl or even implement your own string type.
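A minimal sketch of that idea using only Base, with no extra package: `SubString` is an `AbstractString` that refers to a range of its parent string instead of copying it, and `@view` does the same for the code units. The sample input and variable names below are made up for illustration.

```julia
s = "{\"id\": 12345, \"name\": \"x\"}"   # made-up sample input

bytes = codeunits(s)               # byte-level wrapper around s, no copy
r = findfirst("12345", s)          # byte range of the digits within s

# SubString refers to the parent string; it does not copy the characters.
numstr = SubString(s, first(r), last(r))
n = parse(Int, numstr)             # parse accepts any AbstractString

field = @view bytes[r]             # the same range as raw bytes, also no copy
```

Whether this covers the original use case depends on which string function needs to be called, since the function has to accept `AbstractString` rather than only `String`.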


#16

It takes time to learn to do performance optimization in a new language, especially a new high-level language where you aren’t sure at first what the effects of different abstractions are. But I’ve never found a case where an experienced Julia programmer, given enough time, could not essentially match C performance, usually with cleaner and more versatile code. (You should almost never have to deal with raw pointer arithmetic.)

true dat! I don’t want to do pointer arithmetic, it’s just that I’m doing something where I want to mutate buffers in-place and copy them at the last possible moment, and pointers are actually pretty good for that kind of thing. Not that I’m using pointers in Julia (or even know how), but I’m doing some loops with multiple counter variables, and a lot of inline functions that pass counters and buffers all over the place and so forth, and it’s all uint8_buffer[i] rather than for char in my string. I actually found myself wishing I could do for (int i=0; i < buffer_len; i++) because it would have been easier than trying to write it with a while loop, in this case.

(The cases where Julia does not match C performance are mainly large, highly optimized C libraries that represent an enormous amount of engineering effort … it simply takes a lot of time to replicate that effort in a new language.)

Definitely! That’s why I’m tempted just to use ccall and link against a fast C library. I was trying to escape 15 million strings in (potentially) malformed URLs the other day. There is a URIParser.jl in JuliaWeb that has an undocumented escape function, and it did what I wanted, but the API was weird (probably why it’s undocumented), and it just happened to be 7x slower than the analogous function in glib, so I just used ccall and glib (which had a better API, to my taste). That script is more of a one-off and isn’t something I care if other people can use, but in this case (the JSON string parsing), I realize I’m working on something that could potentially have wider applicability, so I’m trying to write it in a reusable way, rather than rely on third-party C libraries.
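For reference, a minimal sketch of the ccall pattern, including handing C a pointer into the middle of a Julia array. This is not the glib call from that script. It uses libc’s `strlen` as a stand-in, on the assumption that libc is already loaded into the Julia process (the usual case on Linux and macOS).

```julia
buf = Vector{UInt8}("hello")   # copy of the string's bytes
push!(buf, 0x00)               # C string functions expect a NUL terminator

# pointer(buf, 3) is the address of the third element. GC.@preserve keeps
# buf alive for as long as C is holding the raw pointer.
len = GC.@preserve buf ccall(:strlen, Csize_t, (Ptr{UInt8},), pointer(buf, 3))

println(len)   # length of "llo", counted by C
```

The pointer is only valid inside the `GC.@preserve` expression, which is the main thing that differs from doing the same in C.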

If you have a small working snippet that you are having trouble optimizing, feel free to post it on discourse — a favorite pastime of many Julians is seeing who can squeeze the most performance out of a simple problem. Seeing what people come up with will help you to learn how to optimize your code in Julia.

I posted one in a previous comment, but it’s not even in a runnable state yet! I’m not having trouble optimizing it per se. I’m trying to write it with the CPU in mind now so I don’t have to optimize it later. This tiny little code snippet is short, but it took a lot of thinking (I normally write Python, but, for this particular snippet, I was thinking in C because I know exactly what I want to happen with the memory). I had to look up a lot of things in the docs to get this far. I’m not stuck yet, exactly, I’m just doing something I know I could do more easily and with more predictable behavior in C.

I would even say that I know Julia better than C. It’s just that when you call a function a billion times in a row, you care more about allocations, and controlling allocation is just a teeny-tiny bit more intuitive in C.
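One way to make allocation visible rather than guessed at is the `@allocated` macro from Base. The sketch below compares a copying slice against a view; the helper names and sizes are made up, and the exact byte counts depend on the Julia version, so none are quoted here.

```julia
sum_copy(bytes, r) = sum(bytes[r])         # slicing copies the range first
sum_view(bytes, r) = sum(@view bytes[r])   # a view reads the original memory

data = rand(UInt8, 100_000)
r = 1:50_000

# Call each function once so compilation is not part of the measurement.
sum_copy(data, r); sum_view(data, r)

copied = @allocated sum_copy(data, r)
viewed = @allocated sum_view(data, r)
println("copy: ", copied, " bytes, view: ", viewed, " bytes")
```

Running this on the hot function of a parser, one call at a time, shows which operations allocate on every call.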


#17

That’s why I specified working snippet. People aren’t going to volunteer to optimize non-functional code.


#18

Maybe that’s the problem. I can write C that goes fast and does the right things with memory, but I don’t know a lick of assembly. Maybe the next step for me (in general, not necessarily as part of this project) is getting more comfortable with asm so I can understand more about how Julia optimizes.


#19

for i = 0:buffer_len-1?
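Spelled out as a sketch (both function names are made up): Julia’s `Vector` is 1-based, so the loop either iterates over the array’s own indices with `eachindex`, or keeps a C-style 0-based counter and shifts it by one when indexing.

```julia
# Idiomatic form: iterate over whatever indices the buffer has.
function parse_digits(buf::AbstractVector{UInt8})
    n = 0
    for i in eachindex(buf)
        d = buf[i] - UInt8('0')   # UInt8 arithmetic wraps for bytes below '0'
        d <= 0x09 || break        # so one comparison rejects every non-digit
        n = 10n + d
    end
    return n
end

# C-style form: 0-based counter, shifted by one for the 1-based Vector.
function parse_digits0(buf::Vector{UInt8})
    n = 0
    for i = 0:length(buf)-1
        d = buf[i+1] - UInt8('0')
        d <= 0x09 || break
        n = 10n + d
    end
    return n
end

parse_digits(codeunits("1234,"))        # stops at the comma
parse_digits0(Vector{UInt8}("1234,"))   # same result
```

A range like `0:length(buf)-1` is not materialized as an array, so iterating over it is just a counter.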


#20

Yes, definitely! If something was running, it would have been in the original post. I myself don’t even know if it’s slow. I just know that JSON.jl is 20x slower at decoding JSON strings than the Python JSON library, and I want to do better. That’s why I want to use a C library!

The above statement was in error. I was thinking nanoseconds were a millionth of a second, but apparently I was off by a thousand.

I promise I will post a new thread as soon as I have something working and need further advice on specifics. I more meant this question in the general sense of “is it worthwhile to try to write something fast in Julia, or should I just call C?”