Feedback on Julia 1.3 RC5

I have started using Julia 1.3 RC5 and would like to provide some feedback.

First, multithreading makes things easy to parallelize because, unlike Distributed, I don’t have to put the @everywhere macro everywhere, and I don’t have to keep track of which workers have access to which functions.

In multithreading, every thread has access to everything in the environment, and all I have to do is ensure that each thread only uses its own part of the data structure and does not overwrite another thread’s area.
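As a minimal sketch of that pattern (the function name `square_all` and the data are made up for illustration), each loop iteration below writes only to its own index of a preallocated output array, so no two threads touch the same slot:

```julia
using Base.Threads

# Hypothetical example: square every element of xs in parallel.
function square_all(xs::Vector{Float64})
    out = similar(xs)              # shared output, one slot per input element
    @threads for i in eachindex(xs)
        out[i] = xs[i]^2           # iteration i writes only to out[i]
    end
    return out
end

xs = collect(1.0:8.0)
ys = square_all(xs)
```

Because every slot has exactly one writer, the result should not depend on how the iterations are scheduled across threads.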

But the part that drives me insane is the unit testing.

In a single-threaded environment, the outcome is deterministic and I can easily follow the program execution.

In multithreading, each additional thread splits the universe into a parallel universe. Let me explain with an example.

Suppose you have a race (like a 100m sprint). With only 1 runner (like Usain Bolt), there is only 1! = 1 way the race can finish.

But with multithreading, say 8 threads, there are 8! ways the race can finish. Furthermore, a complex program can consist of 7 subraces.

Therefore the number of ways the combination of subraces can finish is

(8!)^7, which my scientific calculator gives as 1.73E32, a number much bigger than a million.
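The arithmetic can be checked in Julia itself. This is only a sketch with made-up variable names; `big` is used because the result is far too large for a 64-bit integer:

```julia
finishes_per_race = factorial(8)        # orderings of 8 finishers in one subrace
total = big(finishes_per_race)^7        # 7 independent subraces, computed as BigInt
println(Float64(total))                 # roughly 1.73e32
```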

It is hard to keep all 1.73E32 universes in my head to figure out all the ways things can go wrong, much less write unit tests that cover every one of those possibilities.

Steven Siew

Hi Steven.

Can you provide an example of the kind of test you are thinking of?


I’m not sure what your explicit problem is here, but you can test properties that should hold for all results. You can also test the maximum and minimum of stochastic results, etc.


What I meant is that in a multithreaded program the outcome is non-deterministic, and thus it is hard to get the same output every time I run the program.

So I am at a loss as to how to write a unit test that can ensure that my program ran properly, unless of course the program produces the same outcome despite being non-deterministic, like finding the global minimum of a problem.

In my opinion, a deterministic algorithm should produce the same output whether it’s running single-threaded or multi-threaded. Otherwise, there’s a problem with the algorithm or with the implementation.
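One way to turn that into a unit test is to run the same computation serially and with threads and compare the results. The sketch below uses hypothetical names (`f`, `serial_map`, `threaded_map`) and integer data so that the comparison is exact:

```julia
using Base.Threads, Test

f(x) = x^2 + 1                       # stand-in for the real per-element work

# Reference implementation: plain single-threaded loop.
serial_map(xs) = [f(x) for x in xs]

# Threaded implementation: each iteration writes only to its own slot.
function threaded_map(xs)
    out = Vector{Int}(undef, length(xs))
    @threads for i in eachindex(xs)
        out[i] = f(xs[i])
    end
    return out
end

xs = collect(1:1000)
@test threaded_map(xs) == serial_map(xs)
```

Running the test suite with different thread counts (for example by setting the `JULIA_NUM_THREADS` environment variable before starting Julia) exercises the same assertion under different schedules.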


You must be more specific if you want guidance. I have many multithreaded applications and they are all easy to test. Do you get the desired result or not? That should be fairly easy to test for. Even if results are reordered etc., there must be something you as a programmer deem to be a correct result. Maybe you have to use something like sort/all/in/any/minimum/maximum etc., but if you can stare at the result and determine it’s correct, surely you can put that into code as well.
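A small sketch of such order-insensitive checks, assuming a made-up function `collect_squares` whose threads push results into a shared vector under a lock, so the order of the results can change from run to run:

```julia
using Base.Threads, Test

# Hypothetical example: results arrive in whatever order the threads finish.
function collect_squares(n)
    results = Int[]
    lk = ReentrantLock()
    @threads for i in 1:n
        lock(lk) do
            push!(results, i^2)     # guarded, but arrival order is not fixed
        end
    end
    return results
end

r = collect_squares(100)
@test length(r) == 100                        # nothing lost or duplicated
@test sort(r) == [i^2 for i in 1:100]         # same contents, order ignored
@test all(x -> x in r, (1, 4, 10_000))        # specific values are present
@test minimum(r) == 1 && maximum(r) == 10_000 # bounds of the result
```

None of these assertions depends on which thread finished first, only on what a correct result must contain.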


R’s future package solves that by automatically detecting what needs to be passed for you. Julia just needs the same.

A Google search turns up other people’s experiences with testing multithreaded programs:

https://stackoverflow.com/questions/12159/how-should-i-unit-test-threaded-code

Here is a quote from them

The first thing to do is to separate your production thread handling code from all the code that does actual data processing. That way, the data processing can be tested as singly threaded code, and the only thing the multithreaded code does is to coordinate threads.

The second thing to remember is that bugs in multithreaded code are probabilistic; the bugs that manifest themselves least frequently are the bugs that will sneak through into production, will be difficult to reproduce even in production, and will thus cause the biggest problems. For this reason, the standard coding approach of writing the code quickly and then debugging it until it works is a bad idea for multithreaded code; it will result in code where the easy bugs are fixed and the dangerous bugs are still there.

Instead, when writing multithreaded code, you must write the code with the attitude that you are going to avoid writing the bugs in the first place. If you have properly removed the data processing code, the thread handling code should be small enough - preferably a few lines, at worst a few dozen lines - that you have a chance of writing it without writing a bug, and certainly without writing many bugs, if you understand threading, take your time, and are careful.
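Applied to Julia, the separation described in the quote might look like the sketch below. The names `to_fractions` and `to_fractions_all` are invented for illustration: the first is plain data processing that can be tested without any threads, and the second is a thin layer that only hands out the work.

```julia
using Base.Threads, Test

# Data processing: no threading involved, fully deterministic.
to_fractions(v) = v ./ sum(v)

# Thread handling: a few lines that only distribute the work.
function to_fractions_all(vs)
    out = Vector{Vector{Float64}}(undef, length(vs))
    @threads for i in eachindex(vs)
        out[i] = to_fractions(vs[i])   # each iteration owns slot i
    end
    return out
end

# The processing code is tested as ordinary single-threaded code.
@test to_fractions([1.0, 3.0]) == [0.25, 0.75]

# The coordination layer is checked against a serial map over the same inputs.
vs = [rand(4) .+ 0.1 for _ in 1:50]
@test to_fractions_all(vs) == map(to_fractions, vs)
```

With this split, most tests never involve threads at all, and the threaded part is small enough to review by eye.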


Is this in any way specific to

  1. Julia,

  2. 1.3,

  3. RC5?

It looks like you had a difficult time writing unit tests for your multithreaded code. This is not surprising per se, as testing parallel code can be quite tricky. But I am not sure if this is a bug report (for that, open an issue, otherwise it will get lost), feedback on the interface (again, open an issue or make a PR with a suggestion, which could improve the next release), or something else.
