Julia with respect to reliability, sustainability, critical application, dynamic/static typing, big data, HPC?

When I think about it, I am not sure this is something that is either feasible or desirable.

Many of the implicit interfaces in Base are currently not even documented, or even generally recognized as interfaces despite the fact that people treat them as such. Very few package authors document interfaces, which suggests that doing so is burdensome. So what we have is “duck interfacing”, gradually becoming documented on demand (but not formally described and verified using a DSL) and I have come to recognize this as a fine solution to a difficult trade-off.

(And this topic is turning into a somewhat off-topic discussion. But as the original question was very hazy and the OP did not show up again, I think this is fine too.)

Should we move this to another thread? I’ll let someone else decide.

I suspect that if there were a way to document interfaces in the Julia language then we’d see more documented interfaces. The evidence I have for that assertion is very roughly provided by Python and JavaScript/TypeScript, where we see a subset (certainly not yet the majority!) of those communities opting in to providing explicit annotations.

I would prefer to have this in the language, so that tooling to provide more checking would naturally arise, and perhaps even some social pressure to write explicit interfaces would come about too if there were a language provided approach.

Some interfaces are already documented; there is nothing preventing anyone from following this practice.
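
One documented example is the iteration interface described in the Julia manual, which asks a type to implement `Base.iterate`. A minimal sketch of opting in to it follows; the `Countdown` type is hypothetical and exists only for illustration:

```julia
# Hypothetical type used only to illustrate the documented iteration interface.
struct Countdown
    start::Int
end

# First call: no state yet. Return (item, state), or nothing when exhausted.
Base.iterate(c::Countdown) = c.start > 0 ? (c.start, c.start - 1) : nothing

# Later calls receive the state returned by the previous call.
Base.iterate(::Countdown, state::Int) = state > 0 ? (state, state - 1) : nothing

# Optional parts of the interface; they let `collect` build a typed result.
Base.length(c::Countdown) = max(c.start, 0)
Base.eltype(::Type{Countdown}) = Int

countdown_values = collect(Countdown(3))
```

Nothing in the language verifies that these methods exist until something tries to iterate a `Countdown`, which is the "duck interfacing" discussed in this thread.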

I suspect that the real reason for the absence of many more examples is that good interfaces are rather difficult to design, and evolve organically. At some point people realize that what they implemented can be abstracted as an interface; the threshold is rather fuzzy.

I was unclear. When I said “document interfaces in the Julia language”, I meant having some actual annotation, like the aforementioned interfaces, in the Julia language itself, as was mentioned in the Julia issue I linked to. Any language of course allows external documentation, but that’s as helpful as comments. A language feature would facilitate external tooling, and at least according to the issue mentioned, would not hinder your preferred method of external documentation.

I think it’s worth considering that designers of other dynamically typed languages find the use of types to be useful in very large code bases.

It is not clear to me what the tooling would be in this case.

A static type checker like mypy/pyright/pyre would be great. But I know that “type checking” is actually listed in Compiler work priorities so I suppose the core team is thinking about this as well.

Sorry, I don’t understand how this came up in this context. AFAICT that article is talking about declaring types for variables, eg

In statically typed languages, developers typically specify the type of a variable or a function parameter when they’re declared, for example, using the keyword int to specify an integer, or str to specify a string of characters, to use two simple examples.

Julia already has this: when you specify types for a method, they are effectively checked, and for all other values you have type assertions.
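
A minimal sketch of both mechanisms, using hypothetical functions `half` and `first_as_int` that are not from any package:

```julia
# Hypothetical function: the argument annotation restricts which values it accepts.
half(x::Integer) = div(x, 2)

# No method accepts a String, so the call fails at dispatch with a MethodError.
dispatch_error = try
    half("four")
catch err
    err
end

# A type assertion on a value: it returns the value if the type matches
# and throws a TypeError otherwise.
function first_as_int(v)
    return v[1]::Int
end

assertion_error = try
    first_as_int([1.5, 2.5])
catch err
    err
end
```

Both checks happen at run time, when the call or assertion is actually executed, not ahead of time as in a statically typed language.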

I am not super-familiar with Python and Typescript, but I would suspect that Julia’s problems are somewhat different. With its rich parametric type system, the main problem is not verifying/asserting the type of something (that is trivial), but figuring out what that type should be.

This is a hard problem, and it is not even clear that exposing/specifying this information should be part of an API. Eg if I do

julia> StaticArrays.SVector(1,2,3)
3-element SArray{Tuple{3},Int64,1,3} with indices SOneTo(3):
 1
 2
 3

should the user care about all the type parameters of the result, or are they just implementation details the maintainers of StaticArrays can change at any time? I am inclined towards the latter.

The user should not care, but developers of packages that use StaticArrays need to know the parameters, and if they change later on, that is what semantic versioning is for.

I disagree — I would propose that even the packages that use a package should not care about the fine details of its concrete types per se, unless they are part of the API, but then it is probably not the right API.

Abstract types are a different matter. But in most cases those are best left as an internal detail, too, and the relevant functionality should be exposed via traits. My 2¢.
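
To make the trait idea concrete, here is a minimal sketch of the usual dispatch-on-a-trait pattern. The names `SizeStyle`, `FixedSize`, `DynamicSize` and `describe` are all hypothetical; this is not the API of StaticArrays or of any other package:

```julia
# Hypothetical trait: is the size of a container known from its type alone?
abstract type SizeStyle end
struct FixedSize <: SizeStyle end
struct DynamicSize <: SizeStyle end

# Default: assume nothing about the size.
SizeStyle(::Type) = DynamicSize()
# Opt in: the length of a tuple is part of its type.
SizeStyle(::Type{<:Tuple}) = FixedSize()

# The public entry point dispatches on the trait, not on concrete type parameters.
describe(x) = describe(SizeStyle(typeof(x)), x)
describe(::FixedSize, x) = "fixed size, $(length(x)) elements"
describe(::DynamicSize, x) = "dynamic size, $(length(x)) elements"

tuple_description = describe((1, 2, 3))
vector_description = describe([1, 2])
```

Downstream code written this way asks the trait a question and never inspects the type parameters, so the maintainers remain free to change them.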

Well, in Grassmann.jl I actually need to use the implementation specific parameters of StaticArrays, otherwise I would not be able to write the @computed struct for the MultiVector type. So, in reality, the details do make a difference, and a change in the parameters would require compatibility changes. In this case, there was no way around it.

Hello Tamas (and other contributors),

Sorry to answer only now, but I have been really busy these last weeks. I am following this forum more closely now, don’t worry.

Thank you very much for your contributions, and for those of the others! :slight_smile: That is a lot of information, thanks.

Tamas, I agree, my initial question is large and could be split.

Nevertheless, I am indeed interested in several questions concerning Julia, and in hearing different opinions. These questions include:

  • reliability for critical apps with minimum response time and high availability,
  • reliability for long-lived processes (memory recycling management?),
  • reliability with intensive computation and the capacity to manage large amounts of data (which explains my question about K. Bouman) compared to classical languages in applied mathematics (e.g. Fortran/C/C++),
  • reliability of bug-free development when using a dynamically typed language compared to a statically typed language.

At the very least, are there benchmarks for the first three points?

Thanks again to all contributors! :slight_smile:

I wondered where the words came from: fiabilité (fr) is reliability, pérennité (fr) is sustainability.

I believe the word @step_de_paris intends is “viability”.

Or, perhaps “reliability”?

I have updated. Thanks.

How would you benchmark “reliability”?

Your questions are still very vague, and consequently hard to answer meaningfully. What you call “reliability” is a relative concept: various trade-offs exist, and the best choice depends on your exact requirements, which you have not yet spelled out.

I wonder if you actually have experience programming for the requirements you list, because then you would be aware that they (especially low response time, high availability, and long-running processes) are usually served by very specialized environments.

Julia’s comparative advantage (at the moment, and it is likely to stay this way) is not 24/7, low response time, long-running processes, but fast prototyping, with performant, generic code. This does not mean that Julia cannot be useful for your purposes, just that it would require that your programmers become familiar with, and contribute to, solutions that are work in progress (e.g. precompiled images).

This, again, is a very vague and general question, and not even specific to Julia. Statically typed languages can catch some obvious bugs, but in a high-uptime environment with nontrivial stakes, you would do much more extensive testing, CI, and QA anyway.

Finally, I suspect that this conversation will yield very little useful information for you unless you are much more specific about what you want to do with Julia.

I guess I can take a shot at addressing some of these.

reliability for critical apps with minimum response time and high availability,

Minimum response time is generally dependent on two things:

  1. language speed and
  2. whether there is a garbage collector with long stop-the-world pauses.

Julia is fast when used right and presumably for something so critical, you’ll do some benchmarking and use the other tooling the language has for performance analysis to make sure the code itself is fast.

Julia does use a stop-the-world mark-and-sweep garbage collector. However, Julia’s support for and widespread use of immutable data and APIs for mutation of preallocated memory (f! name convention for mutating functions) makes it fairly easy to write zero- or low-allocation code. This means that pressure on the garbage collector is much lower than in many other garbage-collected languages, where memory pressure tends to be quite intense since the design of such languages usually makes it hard to avoid allocating objects.
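
As a minimal sketch of what the f! convention looks like in practice (the functions `scaled_add!` and `allocated_bytes` are hypothetical, written only for this illustration):

```julia
# Hypothetical mutating function following the `f!` naming convention:
# it overwrites `y` with a*x + y instead of allocating a new array.
function scaled_add!(y::Vector{Float64}, a::Float64, x::Vector{Float64})
    for i in eachindex(y, x)
        y[i] = a * x[i] + y[i]
    end
    return y
end

# Measure allocations inside a function, after a warm-up call, so that
# compilation and global-variable overhead are not counted.
function allocated_bytes(y, a, x)
    scaled_add!(y, a, x)                     # warm-up call (compiles the method)
    return @allocated scaled_add!(y, a, x)   # bytes allocated by the second call
end

x = ones(1_000)
y = zeros(1_000)
bytes = allocated_bytes(y, 2.0, x)
println("bytes allocated by the in-place update: ", bytes)
```

The loop only writes into memory that already exists, so the reported number should be zero or very close to it; the exact figure can depend on the Julia version and on how the measurement is set up.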

That said, Julia is probably not the right choice for writing real-time systems.

reliability for long-lived processes (memory recycling management?),

I guess you’re asking about whether the garbage collector works? It does. There have been a few memory leaks over the years, but they get fixed and there haven’t been any major ones in quite a while (the most recent one seems to have been due to an OS resource leak, if I recall correctly). In any case, writing and running long-running, reliable programs in Julia is as doable as it is in any other language. You, as the programmer, need to write correct code and make sure you handle error conditions, but you have to do that in any language.

reliability with intensive computation and the capacity to manage large amounts of data (which explains my question about K. Bouman) compared to classical languages in applied mathematics (e.g. Fortran/C/C++),

Yes, you can load huge amounts of data and do compute-intensive work on it. These have been design goals of the language from the very outset. This is one of the reasons, for example, that Julia does not use copy-on-write arrays. Matlab and R both use them, which effectively forces you to have several times the memory of your largest data, because it’s often impossible to avoid implicitly copying. There are many other similar considerations that have been made along the way. In Java, for example, array indices are signed 32-bit integers, which means you can’t have more than about 2 billion elements in an array. Usually that’s not a problem, but sometimes you have a data set with more than 4 billion elements. In Java, you cannot load such a data set into a single array. Julia uses 64-bit indices on 64-bit systems and has configured all the libraries it ships with to also use 64-bit indices (this is often why Julia cannot use system copies of these libraries, which have usually not been configured to use 64-bit indices). This means that the only limit on the size of the data you can work with is how much memory you have.
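
The index width is easy to inspect from Julia itself. A minimal sketch (the variable names are only for illustration):

```julia
# Int is the native integer type used for array indices and lengths.
index_bits = Sys.WORD_SIZE        # 64 on a 64-bit system, 32 on a 32-bit one
largest_index = typemax(Int)      # largest representable index or length
largest_int32 = typemax(Int32)    # 2_147_483_647, the ceiling for signed 32-bit indices

# An element count beyond the 32-bit range, as in the data sets mentioned above.
five_billion = 5_000_000_000
fits_in_int32 = five_billion <= typemax(Int32)

println("index width: ", index_bits, " bits, largest index: ", largest_index)
println("5 billion fits in an Int32 index: ", fits_in_int32)
```

Whether an array of that size can actually be created then depends only on available memory.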

reliability of bug-free development when using a dynamically typed language compared to a statically typed language.

If you want a static language with type checking, then Julia is not such a language. On the other hand, it’s way safer than C, C++ and Fortran, which don’t even do array bounds checking or protect you from memory errors, so segfaults and accidental memory corruption are a standard part of daily life in these languages. Even Java and C# still leave you open to null pointer exceptions and other runtime errors that the type checker can’t prevent. So if you consider C, C++ and Fortran safe enough for your purposes then Julia is definitely safe enough.

The academic research is not really conclusive about whether static or dynamic languages are more reliable (this is a good survey with a slight self-acknowledged bias towards preferring static languages). The research itself is fairly split at a high level, with half of the papers leaning one way, and half the other way. Moreover, the conclusions tend to reflect researcher bias in a fairly obvious way, and the results can generally be interpreted to support whatever position you wanted to come to in the first place. If there was slam-dunk evidence that using a static language was, say, 10x more reliable than using a dynamic language, then I think we would know it by now. And that doesn’t seem to be the case.

My personal interpretation of the research on static versus dynamic reliability is that static type systems do probably catch a decent number of bugs that would go unfound in duck-typed dynamic languages, but that it’s not a huge portion of bugs—probably only some 10-20% of bugs and 20% is probably too high. I also think that these are the easiest bugs to catch by testing; moreover this class of bugs is considerably less likely to go undetected in Julia since most methods have type annotations and throw method errors if called with unexpected types. That said, I do think there is some reliability advantage to static type checking in safe static languages, but I also think that the cost in productivity and expressiveness is quite high.

A little more opinion and interpretation… It seems fairly widely accepted that unsafe static languages like C, C++ and Fortran are the least reliable and the least productive—they win on performance and maturity, however, which many people prioritize. Safe static languages like Rust, Scala or ML seem to be the most reliable languages and more productive than the unsafe static languages—the improved compiler support helps both safety and productivity. Duck-typed dynamic languages like Lisp, Perl, Python, Ruby, Matlab and R seem to be more productive than static languages by a significant margin but are somewhat less reliable than safe static languages but still much safer than unsafe static languages—they catch bounds errors and don’t allow unsafe memory access. Julia falls into a relatively new category of language: typed dynamic languages. TypeScript and Dart are also in this category. Since these languages are pretty new, there hasn’t been that much research on them, but they seem to retain the productivity of the dynamic languages while approaching the reliability of safe static languages.

Static language advocates seem to me to significantly overstate how much static type checking helps with catching serious bugs. Statements like “if it compiles, it probably just works” seem very foreign. What kind of programming are people who say things like that doing? How often do I make a mistake where I’m passing the entirely wrong type of value to some function or returning the wrong type of value? Sure, it happens, but not that often and it’s usually trivial to find such a mistake given even cursory testing during development. (Again, especially in Julia where functions can and do declare argument types and typed data structures are used pervasively.)

What is much more important in my opinion for the reliability of software in both dynamic and static languages is having a strong culture of testing. Yes, there are type bugs that type systems can catch before you even run tests, but testing can also catch all the other kinds of bugs (80-90% of them by my previous guess). Julia does very much have a strong culture of testing. Julia’s own test suite is massive: 136,985 lines of test code, with 38 million individual tests for Base alone. Many of the widely used packages are also very thoroughly tested. Tooling for testing ships with Julia in its standard library. Test stubs are automatically generated when you create a new Julia package; just add the tests. If you host your package code on GitHub, all that’s required to turn on continuous integration to test every commit that’s made is to turn on the Travis/AppVeyor apps; the configuration files for these are also autogenerated for you. And doing continuous integration is very much the norm.
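
For readers who have not seen it, a minimal sketch of the `Test` standard library in use; the function `clamp01` is hypothetical and stands in for whatever a package exports:

```julia
using Test

# Hypothetical function under test: clamp a number to the interval [0, 1].
clamp01(x::Real) = min(max(x, zero(x)), one(x))

@testset "clamp01" begin
    @test clamp01(0.5) == 0.5
    @test clamp01(-2.0) == 0.0
    @test clamp01(7) == 1
    # The argument annotation rejects non-numeric input at dispatch.
    @test_throws MethodError clamp01("0.5")
end
```

In a package, tests like these conventionally live in `test/runtests.jl`, which is the file that continuous integration runs.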

Bottom line: Julia is a good language for reliable programming. It is much safer than unsafe static languages like C, C++ and Fortran. It is also safer than most dynamic languages, which are duck-typed, whereas Julia can and does have explicit type annotations on function arguments. Finally, testing is the most important aspect of making any language reliable, and Julia has extensive tooling support for testing and a widespread culture of testing.

Usually that’s not a problem, but sometimes you have a data set with more than 4 billion elements.

Current world population is estimated at 7.7 billion. I guess that means when Facebook finally sign us all up they will be changing from PHP to Julia.
Is that my coat? You are so kind.

Types only catch bugs when types are incorrect. Tests catch bugs when the code is computing the wrong value. In the majority of scientific algorithms, it’s easy to get a vector of Float64s out, but it’s hard to get the right vector of Float64s out.
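
A minimal sketch of that distinction, using a deliberately buggy and entirely hypothetical pair of functions:

```julia
# Deliberately buggy, hypothetical implementation: divides by n - 1 instead of n.
buggy_mean(v::Vector{Float64}) = sum(v) / (length(v) - 1)

# Subtract the (wrong) mean from every element.
buggy_center(v::Vector{Float64}) = v .- buggy_mean(v)

data = [1.0, 2.0, 3.0]
centered = buggy_center(data)

# A type check is satisfied: the result is a Vector{Float64}.
type_ok = centered isa Vector{Float64}

# A value check against the known answer exposes the bug.
value_ok = isapprox(centered, [-1.0, 0.0, 1.0])

println("type check passes: ", type_ok, ", value check passes: ", value_ok)
```

Any type checker would accept `buggy_center`; only a test that compares against a known answer notices that the numbers are wrong.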

Refactoring in large code bases is easier for me in a statically typed Go project than in a dynamically typed Rails project…

But really, even productive refactoring is more about good design than static types.

I’ve yet to build large projects in Julia but I’ve found the tooling and support made possible by the type system much more productive and reliable than my work in Ruby.

I have no reservations about building critical software in Julia. The type system, tooling, and macros like @code_warntype make development very productive.
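
For anyone unfamiliar with it, a minimal sketch of what `@code_warntype` reports. The two functions are hypothetical, and `Base.return_types` is an internal helper used here only to show the same information programmatically:

```julia
using InteractiveUtils  # provides @code_warntype outside the REPL

# Hypothetical type-unstable function: returns an Int or a Float64
# depending on the value (not the type) of x.
unstable(x) = x > 0 ? 1 : 1.0

# Stable variant: always returns a Float64.
stable(x) = x > 0 ? 1.0 : -1.0

# Prints the inferred types; the Union return type of `unstable` is flagged.
@code_warntype unstable(2)

# The inferred return types for an Int argument.
unstable_type = first(Base.return_types(unstable, (Int,)))
stable_type = first(Base.return_types(stable, (Int,)))
println("unstable: ", unstable_type, ", stable: ", stable_type)
```

A `Union` or `Any` in the output is usually the first thing to look for when a function is slower than expected.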

I also use Go a lot and have started to use Rust heavily.

The biggest benefit I get from statically typed languages (besides knowing my programs are type stable) is readability and maintenance, in that functions explicitly have their types, which also makes things like code formatting, code completion, and text editor linting pretty good. For example, the Rust language server for VS Code can lint my program and show me where an error exists before I compile, saving precious time and energy otherwise spent reading compiler error messages.

Many of these tooling benefits that make me productive in Go and Rust I’ve also found in Julia, especially with Juno.

At this point I choose Julia, Go, or Rust based on needs related to packages, deployment, and time to deliver. Speed, memory, and reliability requirements are usually not so well defined that one language is the clear choice, and honestly this is the case for most projects.

My take on the original post is that for the criteria mentioned you are likely to be most successful with good developers. And if those developers are highly skilled in Julia then you’ll get what you need.
