How does Julia compare to shell scripting using bash and its "friends" like bc on a Linux system?

Shell scripting is excellent for many system operations. Many of the utilities are, er…, like 40-50 years old, and were designed for Unix for such purposes. Utilities like ls, find, test, grep, uniq, comm, cut, join, rev, sort, head, tail, awk, sed, and dc form a fairly good set of tools for e.g. working with various csv-style files. Prior to Perl, most system operations were done by combinations of such things, in various shells like sh or csh. When clock speeds crept upwards some 30 years ago, Perl took over as the universal tool for such purposes.

With even faster computers, Python was developed. The extra cycles (8 MHz became 4 GHz) were mostly spent on running interpreted languages like Python, Java, and R. This made various things easier, at the cost of performance. Performance isn’t that important for system operations.

Those who required speed for technical and/or scientific computing used compiled languages, mostly C, C++, and Fortran, and gradually got several orders of magnitude speedup. Or they used the, er…, Cobol magic available in Python: special arrays like numpy for computation, together with a set of numpy tools written in C. Or they used R with embedded C++ code (it’s really easy in R). Some even used assembly for critical parts of the code. This has gotten harder and harder with all the parallelism in modern CPUs, so most such development is still done in C, C++, or Fortran, though with the newcomer Julia as an option.

There are many ways to do things. Why Python has become so popular I do not fully understand. Perhaps because universities found it suitable as an easy introduction to programming. My university (University of Oslo) switched from Simula to Java to Python as the introductory language over the last 40 years.

7 Likes

One of the often overlooked reasons may be that popularity follows the money. This is what made MS Windows dominate the market regardless of its quality and usefulness. An alliance of big players like IBM, Google, Microsoft, Apple, … investing money in popularity, expecting a high rate of return on the investment, could explain what otherwise seems to be a mystery. Notice that Guido van Rossum was hired by Google and is now working for Microsoft. Money from the big players is probably also the reason LLVM is gaining momentum, and maybe even the reason behind Julia?

There are many factors. Market trends have more to do with which libraries happened to catch on, or which communities picked something up, than with the base language itself. However, the base language does have an effect, and all these factors amplify each other.

The big one is that Python is one of the user-friendly glue languages. You don’t have to think as much to get something done, and it didn’t matter that the base language was slow because the libraries ran optimized code compiled from other languages. Note that this also applies to bash “orchestrating executable files”, though bash intentionally has few features and is thus more limited as glue. Among the languages that gained traction early (e.g. R, MATLAB, Python), my peers and I found Python’s base language to be more straightforward and organized (modules made a big difference), so this might have contributed to its dominance.

hmm… this topic doesn’t seem to go anywhere useful…

6 Likes

Agreed. I set a timer for two more days, after which the topic will be closed automatically.

4 Likes

Is this a way to prevent getting interesting answers which might show a clear advantage of using command line tools that arrive at results faster than Julia?

~ $ time maxima --very-quiet --batch-string=''

real	0m0.089s
user	0m0.079s
sys	0m0.017s
~ $ time julia -e ''

real	0m0.438s
user	0m0.421s
sys	0m0.140s
~ $ time maxima --very-quiet --batch-string='block([modulus:102], print(rat(3)^10000000000000), exit());'

block([modulus:102],print(rat(3)^10000000000000),exit())
warning: assigning 102, a non-prime, to 'modulus'
69 
                                    exit()

real	0m0.092s
user	0m0.077s
sys	0m0.022s
~ $ time julia -e 'println( powermod(3, 10000000000000, 102) )'
69

real	0m0.545s
user	0m0.552s
sys	0m0.116s

Your example here, which is mostly measuring the overhead of spawning julia for individual small calculations, has already been extensively discussed in another thread: Python big integer vs. Julia big integer arithmetic - #24 by stevengj
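To make the spawn overhead concrete, here is a minimal sketch in Python rather than Julia or Maxima (an assumption made only so that it runs with nothing but a standard interpreter). It times the same modular exponentiation once inside an already running process and once by launching a fresh interpreter for it. The helper names `time_in_process` and `time_fresh_process` are hypothetical, and absolute numbers will vary by machine.

```python
import subprocess
import sys
import time


def time_in_process():
    """Time one modular exponentiation inside the running interpreter."""
    start = time.perf_counter()
    result = pow(3, 10_000_000_000_000, 102)
    return result, time.perf_counter() - start


def time_fresh_process():
    """Time the same calculation, including the cost of starting a new interpreter."""
    code = "print(pow(3, 10_000_000_000_000, 102))"
    start = time.perf_counter()
    out = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    elapsed = time.perf_counter() - start
    return int(out.stdout), elapsed


if __name__ == "__main__":
    value, t_in = time_in_process()
    _, t_spawn = time_fresh_process()
    print(f"result: {value}")
    print(f"in-process:    {t_in * 1e6:.1f} µs")
    print(f"fresh process: {t_spawn * 1e3:.1f} ms")
```

The second figure should come out far larger than the first, which is the same effect the `time julia -e '...'` transcripts show: the wall-clock time is dominated by starting the process, not by the arithmetic.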

Thread timers are there to avoid wasting everyone else’s time on repetitive discussions (such as you repeatedly posting similar benchmarks and forcing people to give the same answers over and over) or meandering unfocused threads (e.g. generalized discussions about reasons for relative popularity of programming languages, which we’ve had over and over on this forum).

6 Likes

From my perspective, the post did not deserve this conclusion, because it clearly shows the overhead, which makes it possible to see the time required by the calculation itself. Have you overlooked this, or is calculating the difference of two simple timings too difficult to do in one’s head so that it needs to be pointed out separately, or is there another reason for your conclusion?

I surely would not repeat the posting from another thread here … but it is a nice example of how further contributions to this topic can enrich the subject over time if it stays open for answers.

This kind of dismissive sarcasm is not welcome here.

9 Likes

Nope. Subtracting the times isn’t sufficient here, for one thing because there is a variable startup cost depending on what functions are being invoked. People have devoted a lot of effort to figuring out how to accurately benchmark marginal compute costs in Julia and other languages, and your approach isn’t how to do it.

2 Likes

So why not provide the right approach and show the facts, instead of criticizing a newbie for not being able to come up with what you consider more representative for timing comparisons between Julia and Maxima?

The very first answer to your question proposed a simple approach, which doesn’t include startup time but still includes first compilation time:

For more accurate benchmarking with repeated function runs, you should check out either BenchmarkTools.jl or Chairmarks.jl.

3 Likes

This user already mentioned this quite early in your thread: Python big integer vs. Julia big integer arithmetic - #6 by sgaure

1 Like

bc isn’t faster on such big numbers … but seems to be faster on smaller ones. A direct comparison without startup times is hard to impossible to do 100% correctly … Picking one extreme example does not tell much about the overall performance … bc is only an example of a command line tool available on a Linux system. Maxima is another one, better suited for this comparison, and it wins against Julia by some orders of magnitude.

You asked about benchmarking methodology, we gave you the answers. Repeatedly.
In a nutshell: startup and compilation time only matter if you’re gonna turn on the computer, perform one computation, turn off the computer and walk away. In nearly every other situation, you want to exclude these latencies from your benchmarking procedure. That’s what BenchmarkTools.jl and Chairmarks.jl do.
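As a rough illustration of what excluding these latencies means, the sketch below uses Python's standard `timeit` module as a stand-in for BenchmarkTools.jl (whose own API is not shown here; in Julia one would typically reach for its `@btime` macro). The names `powmod_task` and `bench`, and the default values of `number` and `repeat`, are hypothetical choices for this example. It runs the calculation many times in an already warm process and reports the best time per call, so interpreter startup never enters the measurement.

```python
import timeit


def powmod_task():
    """The calculation from the transcripts above: 3^10000000000000 mod 102."""
    return pow(3, 10_000_000_000_000, 102)


def bench(func, number=10_000, repeat=5):
    """Return the best observed time per call, in seconds.

    Each of the `repeat` samples times `number` back-to-back calls;
    taking the minimum filters out noise from other processes.
    """
    samples = timeit.repeat(func, number=number, repeat=repeat)
    return min(samples) / number


if __name__ == "__main__":
    print(f"result: {powmod_task()}")
    print(f"best time per call: {bench(powmod_task) * 1e9:.0f} ns")
```

Taking the minimum over several samples, rather than one wall-clock reading, is a common choice because background noise can only make a run slower, never faster.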

For any given problem, you can find another tool that does it better or faster than Julia. That is especially true if you include startup time in your measurement, and no one here is disputing that. Like every other programming language, Julia offers a compromise of doing some things well and some things poorly. If startup time is your main worry, then indeed Julia may not be the best tool for you, and shell scripting seems like an interesting alternative. To me that’s a good summary of the discussion.

3 Likes

A good summary of the discussion would show how expertise in using Julia compares to expertise in using shell scripting and the command line tools that come with a Linux system. Generalizing that for any tool there might be a better and faster one is not a summary and not on topic; it is obvious and generally true. The question is about the comparison of shell scripting and Julia, and so far there seem to be only a few answers to the actual question, which need more clarification to support the claims they make.

Programming languages are tools. They have tradeoffs, as @gdalle mentioned. Expertise in Julia will allow you to accomplish tasks for which Julia is an appropriate tool – differential equations, simulations, data science, to name a few. Expertise in shell scripting will allow you to accomplish tasks for which shell scripting is an appropriate tool – system administration, task automation, basic CLI utilities.

Because Julia is in a totally different class of tooling, we don’t offer a comparison because they are apples-to-oranges. A clear example of this is the best approaches to benchmarking. For a shell script, it makes sense to measure startup because they are often used repeatedly from a cold start. It does not make sense to measure Julia operations from a cold start because it ought not be used that way – it’s the wrong tool for the job.

This is like having a comparison page for hammers and jackhammers. You can drive a nail with a jackhammer… I suppose. But I don’t need a comparison page for them 🙂

3 Likes

“Comparing expertise” is a very vague prompt, and the experience of the moderators here shows that such threads are seldom productive, which justifies my timer:

My advice to you: if you have a specific task you want to try in Julia, ask a precise question about it with a clear benchmark and a performance to beat. That’s how you will get the best answers from the incredible concentration of smart and kind people on this forum.
Vague considerations about “shell vs Julia” will just end up wasting your time, and the time of the people trying to help you.

5 Likes

You don’t answer the question that was asked … you run an “attack” on me personally instead. I have trouble seeing the value in doing this … The question asked for the reasons for your conclusions and still awaits a reply.

If you are not interested in such topics, why do you engage in the discussion at all, especially if your experience tells you that it would be a waste of your time? Wouldn’t it be much easier to stay away from such threads in the first place instead of trying to stop others from contributing?

If shell vs Julia is “vague”, how come the Julia documentation covers Python vs. Julia and other comparisons?