Hello, I am new to Julia. Before that, I used MATLAB, NumPy, and Fortran quite a lot, and I find Julia has everything I need for computational work, especially its automatic differentiation ecosystem.
After reading some official documentation, I find the biggest difference in coding paradigm is writing devectorized code, according to: Performance Tips · The Julia Language
I am wondering, since devectorized code is so highly preferred in Julia, why doesn't Julia make devectorized operation its default behavior, instead of requiring boilerplate like f(x) = 3x.^2 + 4x + 7x.^3 and fdot(x) = @. 3x^2 + 4x + 7x^3, or the additional Devectorize.jl package?
Is it possible to make devectorized or dotified operation default behavior for Julia in future versions?
Also, the println and print intrinsic functions are mighty confusing for new users. Can you do something about that?
Writing unvectorized code is orthogonal to automatic broadcasting. What Julia and the unvectorized paradigm say is “don’t treat vectors specially”, which is a much more tenable and extensible way to program. Want to take the length of each vector in a vector of vectors? Just broadcast it. I would recommend trying this out for a while and then trying to go back to MATLAB; you’ll notice and be annoyed by how arbitrary its broadcasting rules are.
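As a small illustration of the "just broadcast it" point, here is a minimal sketch. The variable names and sample data are made up for the example.

```julia
# A vector of vectors; `length` is an ordinary function with no special
# knowledge of nested containers.
vs = [[1, 2, 3], [4, 5], Int[]]

# The dot applies `length` to each element of `vs`.
lens = length.(vs)

# Without the dot, `length` is applied to the outer vector itself.
outer = length(vs)

println(lens)
println(outer)
```

The same dot syntax works for any function, including ones you define yourself, which is what "don't treat vectors specially" amounts to in practice.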
Maybe you can explain what you mean by ‘devectorized’? I would normally take it to mean ‘not vectorized’, as in using explicit loops instead of broadcasting, but the above paragraph is a bit confusing on that point.
Anyway, neither ‘vectorized’ nor ‘de-vectorized’ is preferred in Julia; both styles are equally valid, and you are encouraged to use the one that best suits your tastes and the problem at hand. If it is easy to vectorize a problem, and it makes the code more readable, consider broadcasting/vectorization. If it is simpler and/or clearer to use a loop, consider a loop. Performance is often identical, but loops are always fast.
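To show that the two styles express the same computation, here is a minimal sketch. The function names `add_broadcast!` and `add_loop!` and the array size are hypothetical, chosen only for this example.

```julia
# Broadcast style: the dotted operations fuse and write into the
# preallocated output A without creating a temporary array.
function add_broadcast!(A, B, C)
    A .= B .+ C
    return A
end

# Loop style: the same elementwise computation written explicitly.
function add_loop!(A, B, C)
    for i in eachindex(A, B, C)
        A[i] = B[i] + C[i]
    end
    return A
end

B = rand(1000)
C = rand(1000)
A1 = similar(B)
A2 = similar(B)

add_broadcast!(A1, B, C)
add_loop!(A2, B, C)

println(A1 == A2)
```

Which one to use is mostly a matter of readability for the problem at hand.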
By ‘devectorized’ I did mean ‘unvectorized’, and maybe I made the mistake of thinking that unvectorized code is preferred over vectorized code, since I am new to Julia.
But from many comments on the forum, including the one about the performance tips in the official manual, there seems to be a huge performance boost, both in memory usage and computation time, from using unvectorized code.
By ‘making unvectorized operation default’ I mean: would it be possible to make operations like A = B + C unvectorized by default behind the curtain, instead of writing A .= B .+ C?
The performance tips from the official manual I was talking about:
This example from the performance tips is just showing poorly vectorized vs correctly vectorized code. In the first example some of the operators are not ‘dotted’, and therefore there will be intermediate array allocations, while in the second example all the operations are dotted (by using the @. macro), and therefore all the broadcasting operations are fused into a single broadcast.
Proper use of dotted broadcasting should be very performant, and normally similar to an ordinary loop.
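To make the difference concrete, here is a minimal sketch using the two functions from the question. The input size and the use of `@allocated` are my own choices for illustration. Exact byte counts depend on the Julia version, so only the relative comparison is meaningful.

```julia
# Partially dotted: the non-dotted `*` and `+` calls break fusion, so
# intermediate arrays are allocated along the way.
f(x) = 3x.^2 + 4x + 7x.^3

# Fully dotted via @.: all operations fuse into a single broadcast,
# so only the result array is allocated.
fdot(x) = @. 3x^2 + 4x + 7x^3

x = rand(10^6)  # arbitrary test input (assumption)

# Warm up so that compilation is not counted in the measurements.
f(x)
fdot(x)

bytes_f = @allocated f(x)
bytes_fdot = @allocated fdot(x)

println("f:    ", bytes_f, " bytes")
println("fdot: ", bytes_fdot, " bytes")
```

Both functions compute the same values. The difference is only in how many temporary arrays are created.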
I struggle a bit to follow what you mean, but in the end, ‘behind the curtain’, operations like these are always finally turned into loops (devectorized). But I think this will all be clearer to you if you read this blog post, introducing the dot broadcasting: More Dots: Syntactic Loop Fusion in Julia
If what you are actually getting at is that you would like the dotting to happen automatically, then that is not going to happen, and the current behaviour is, in my opinion, one of the very nicest things about Julia. Read the blog post, and perhaps you will see.
I am new to Julia. From reading comments on the forum and the official manual, I found that writing loops is almost always faster (even by orders of magnitude) than writing vectorized code.
I am a geoscience and remote sensing software engineer. I work with satellite data, most of which is raster, so writing vectorized code is the most intuitive way for my application.
My question was: since Julia is most performant with unvectorized code, would it be possible to unvectorize operations such as A = B + C by default, without decorating them with @., in a future release?
About ‘println’ and ‘print’: they might be the most commonly used functions in any language. Would it be possible to add a default argument for ‘print’, something like print(…, newline=true)? It might be a welcome change for OCD people~
As I said, this is not correct. Dot-vectorized code can be equally fast as loopy code, as long as it is done correctly, and used in situations where the vectorization can be ‘fused’ (as described in the blog post.)
No, this will not happen, and is not a good idea. Also, it would not improve performance at all to allow vectorization without the @. It would only be a change to the surface syntax.
BTW, it is confusing that you are using the words ‘devectorize’ and ‘unvectorize’, when it seems that you mean ‘vectorize’. What you are describing is automatic vectorization, not devectorization.
So you want all calls to automatically be vectorized, i.e. an implicit @. in front of every expression?
That would create quite a few problems. One simple example is that A * B and A .* B should mean completely different things if A and B are, for example, matrices, and now we want to always force the second one?
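A minimal sketch of that distinction, with two small matrices made up for the example:

```julia
using LinearAlgebra  # standard library; loaded explicitly to be safe

A = [1 2; 3 4]
B = [5 6; 7 8]

matprod = A * B    # matrix product (rows times columns)
elemprod = A .* B  # elementwise product (broadcast)

println(matprod)
println(elemprod)
```

An implicit `@.` on every expression would turn the first into the second, and there would be no way left to write the matrix product with `*`.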
Not sure I’m following, but are you saying you would rather write print("...", newline=true) than print("...\n") or println("...")?
To me the current methods are clear enough, and more convenient, but that is subjective I guess.
I think the OP wants newline=true to be the default, so that print("Hello World!") would automatically add \n to the end. This is the behaviour of the print function in Python. Then you would have to write print("Hello World!"; newline=false), to not get a newline.
This would be a breaking change, @Haibo_Zhao, so it could not happen until version 2.0, and I don’t think it would be very popular, so I doubt this will ever happen.
Instead, I suggest that you get used to the println function, it does what you want, and is widely used.
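For reference, here is a minimal sketch of the current behaviour, plus a user-defined wrapper for anyone who really wants a Python-style keyword. The name `pyprint` and its `newline` keyword are hypothetical and not part of Base.

```julia
# `print` writes its arguments as-is; `println` appends a newline.
# `sprint` captures the output in a String so it can be inspected.
s1 = sprint(print, "Hello World!")
s2 = sprint(println, "Hello World!")

# Hypothetical helper (not in Base): prints with a trailing newline
# unless `newline=false` is passed.
function pyprint(io::IO, args...; newline::Bool=true)
    print(io, args...)
    newline && print(io, '\n')
    return nothing
end

# Convenience method that writes to standard output.
pyprint(args...; newline::Bool=true) = pyprint(stdout, args...; newline=newline)

pyprint("Hello World!")
pyprint("no newline here"; newline=false)
println()
```

Defining such a helper in your own code is not a breaking change for anyone else, although `println` remains the idiomatic choice.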