Is Julia's way of OOP superior to C++/Python? Why doesn't Julia use class-based OOP?

I’m coming late to this discussion and I can’t claim to have read all the posts carefully so forgive me if I reiterate something already discussed. One of the central differences between the “methods defined within classes” style of OOP and the Julia approach of generic functions with multiple dispatch is, obviously, the ability to dispatch on the types of more than one argument. The “poster boy” application for why this is a good idea is numerical linear algebra. To achieve flexibility and efficiency in numerical linear algebra you need to be able to examine and exploit the types of all the arguments.

Over the past 30 years I have worked with many different linear algebra packages in Fortran, C, C++, and Python. I even had a hand in designing and writing the Matrix package for R. None of them come close to the coverage, flexibility, and efficiency of the LinearAlgebra standard library for Julia.

Of course, this isn’t an accident. Julia was designed for technical computing and the choice (and hideously efficient implementation) of multiple dispatch was intended for exactly these cases.

31 Likes

To be clear @Botant, type piracy is not something we encourage or condone in serious packages. It’s something that should be avoided through coordination between package authors and the creation of packages that facilitate this coordination.

However, type piracy is a fantastic tool for users or library writers who just need to get something done fast. Later on, once you’ve had time to talk to other package writers and discuss a solution, then the piracy should be removed.

I agree with @Tamas_Papp though that there’s a risk here that we’ve derailed a great discussion with the type piracy talk. Perhaps a moderator can split the piracy stuff into its own thread?

6 Likes

Would you say that multiple dispatch is particularly suited for more numerical code? Or put differently, is multiple dispatch less of an advantage when used for programming less “science-y” tasks, like business logic in a web server, system administration tools or (weird thing to come to mind :slight_smile:) implementation of a cryptocurrency?

I’m probably not the best person to answer that question because almost all the programming I do is “science-y” and, for me, a system based on generic functions and multiple dispatch, combined with Julia’s type system, works well. I haven’t had situations where I thought, “Gee, I wish I could use a class-based OOP system instead of defining types, generics and methods”.

6 Likes

I would say that the OOP paradigm can be quite useful when doing simulations; particularly if you’re having to simulate different types of “things”. For example, one might simulate a generic pump, and then later, decide that pump is a multi-stage centrifugal pump. Most methods that work on a generic pump will apply to multi-stage centrifugal pumps, so you’ll want to inherit the measurements, specifications and prediction routines of a generic pump, and add a few specific measurements, specs, and prediction routines for specialization.

However, not all things are cut-and-dried inheritable, and it’s nice to be able to pick and choose what fields to inherit. In fact, from what I understand, inheritance is falling out of favor against interfaces/decorations for reasons like this. Things like “calculate pressure drop” might be quite different for a shell-and-tube heat exchanger vs a plate heat exchanger. So while inheritance can be useful, it can quickly turn into a square peg that you try to hammer into a pentagonal hole. Julia doesn’t have an “inherit” functionality out of the box, but it only took me (more of an engineer than a programmer) a couple of days to figure out Julia’s metaprogramming well enough to build my own inheritance functionality (one that allows me to pick and choose what to inherit).
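To make the “pick and choose” idea concrete, here is a minimal sketch of one way it could be done. It is not the poster’s actual implementation, which isn’t shown in the thread. It assumes composition plus a small `@eval` loop that generates forwarding methods, and all type, field and function names are hypothetical.

```julia
# Hypothetical generic pump; field names and units are made up for illustration.
struct Pump
    flow_rate::Float64   # m^3/s
    head::Float64        # m per stage
end

flow_rate(p::Pump) = p.flow_rate
head(p::Pump) = p.head
description(p::Pump) = "generic pump"

# A more specific pump that wraps a generic one instead of subclassing it.
struct MultiStagePump
    base::Pump
    stages::Int
end

# "Inherit" only the chosen functions by generating forwarding methods.
for f in (:flow_rate, :head)
    @eval $f(p::MultiStagePump) = $f(p.base)
end

# Not inherited: specialised behaviour written by hand.
description(p::MultiStagePump) = "$(p.stages)-stage centrifugal pump"
total_head(p::MultiStagePump) = p.stages * head(p)

msp = MultiStagePump(Pump(0.05, 12.0), 3)
flow_rate(msp)     # forwarded to the wrapped Pump
total_head(msp)    # uses the forwarded head(...)
description(msp)   # hand-written, not forwarded
```

Anything left out of the tuple in the `for` loop is simply not forwarded, which is the “choose what to inherit” part.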

Not only that, I’m able to dispatch different “sim_predict!” functions so I can apply a “generic heat exchanger” prediction with the added operations I would need for a shell-and-tube predictor. It also allows me to “reach into” the “sim_predict!” for generic heat exchangers and swap in methods like “get_flow_resistance” for different kinds of heat exchanger so that the generic method is still applicable. It’s only when you try to build out simulations with swappable components that you realize how quickly multiple dispatch can make your life simpler. It allows you to say “do_operation_x” and lets the “how” depend on the “what” in great detail, and more simply than OOP (which would require me to assign all the methods as interfaces).
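A rough sketch of what that can look like follows. Only the names `sim_predict!` and `get_flow_resistance` come from the post; the types, fields and coefficients are assumptions for illustration, not real engineering correlations.

```julia
abstract type HeatExchanger end

struct ShellAndTube <: HeatExchanger
    tube_count::Int
end

struct Plate <: HeatExchanger
    plate_count::Int
end

# Simulation state updated in place (hence the "!" in sim_predict!).
mutable struct SimState
    flow::Float64
    pressure_drop::Float64
end

# Swappable piece: each kind of exchanger supplies its own resistance.
# The coefficients are placeholders only.
get_flow_resistance(hx::ShellAndTube) = 2.0 / hx.tube_count
get_flow_resistance(hx::Plate) = 5.0 / hx.plate_count

# Generic prediction: works for any HeatExchanger with get_flow_resistance.
function sim_predict!(state::SimState, hx::HeatExchanger)
    state.pressure_drop = get_flow_resistance(hx) * state.flow^2
    return state
end

# Specialised prediction: reuse the generic method, then add extra operations.
function sim_predict!(state::SimState, hx::ShellAndTube)
    invoke(sim_predict!, Tuple{SimState,HeatExchanger}, state, hx)
    state.pressure_drop += 0.1   # placeholder for extra shell-side losses
    return state
end

plate = sim_predict!(SimState(2.0, 0.0), Plate(10))
shell = sim_predict!(SimState(2.0, 0.0), ShellAndTube(4))
```

`invoke` calls the less specific method explicitly, which plays roughly the role that a `super` call plays in class-based code.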

5 Likes

I try to keep a list of small, clear cases where multiple dispatch is an indisputably better solution.

1. Structured Matrix Multiplication.

I have this written up in full here, so you can read that blog post (or watch the talk version, which is at the top of the page).
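As a small taste, using only types from the LinearAlgebra standard library (this is not the example from the blog post): the method chosen for `*` depends on the types of both operands, so structure is preserved where it can be.

```julia
using LinearAlgebra

D = Diagonal([1.0, 2.0, 3.0])
U = UpperTriangular([1.0 2.0 3.0; 0.0 4.0 5.0; 0.0 0.0 6.0])
M = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0]

DD = D * D   # Diagonal * Diagonal stays Diagonal (only the diagonal is multiplied)
UU = U * U   # UpperTriangular * UpperTriangular stays UpperTriangular
DM = D * M   # Diagonal * dense gives a dense Matrix (a row scaling)

(DD isa Diagonal, UU isa UpperTriangular, DM isa Matrix)
```

Running `@which D * D` and `@which D * M` at the REPL shows that different methods are selected for the two products.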

2. Arithmetic Type Promotion

Turns out this is really hard.
A few years ago a large proposal to overhaul Swift’s Integer type was written.
Even in this large document, they explicitly say that it:

DOES NOT solve the integer promotion problem, which would allow mixed-type arithmetic. However, we believe that it is an important step in the right direction.

To understand type promotion and why it needs multiple dispatch, consider:

  • +(::Int32, ::Float64)::Float64
  • +(::Float64, ::Int32)::Float64
  • +(::Complex{Int32}, ::Float64)::Complex{Float64}

The resulting type depends on both input types, and not in a trivial way either.
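The three signatures above can be checked directly at the REPL. This sketch uses only Base functions; the values are arbitrary.

```julia
a = Int32(1) + 2.5                        # Int32 + Float64
b = 2.5 + Int32(1)                        # Float64 + Int32
c = Complex(Int32(1), Int32(2)) + 2.5     # Complex{Int32} + Float64

typeof(a)   # Float64
typeof(b)   # Float64
typeof(c)   # Complex{Float64}

# The same rules can be queried without doing any arithmetic:
promote_type(Int32, Float64)            # Float64
promote_type(Complex{Int32}, Float64)   # Complex{Float64}
```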

3. Contextual Formatting of Objects

IIRC @thautwarm said on Twitter something like “The quality of a language can be determined by how hard it is to access polymorphic printing methods”,
which I took to mean this.

This applies to Julia’s whole show(::IO, ::MIME, ::Object) machinery, but let’s step away from that to think about generating some document.

Let’s consider that we have various objects that need to be displayed:

  • Number
  • Text
  • Equation

And various contexts within the document in which to display them:

  • FlowingContent
  • Diagram
  • Table

So consider the options:

  • Number in general should be shown using proportional figures
  • Numbers inside Tables should be shown with tabular figures
  • Anything in FlowingContent should be shown in a serif font
  • Anything in Diagrams should be shown in a sans-serif font
  • Except Equations in Diagrams, which should be shown in a large serif font
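
A minimal sketch of how those rules might be written with multiple dispatch. The function names `figures` and `typeface`, the `Item`/`Context` supertypes and the symbols returned are all made up for illustration; `NumberItem` stands in for “Number” to avoid clashing with `Base.Number`.

```julia
abstract type Item end
struct NumberItem <: Item
    value::Float64
end
struct TextItem <: Item
    value::String
end
struct EquationItem <: Item
    source::String
end

abstract type Context end
struct FlowingContent <: Context end
struct Diagram <: Context end
struct Table <: Context end

# Numbers use proportional figures in general, tabular figures inside tables.
figures(::NumberItem, ::Context) = :proportional
figures(::NumberItem, ::Table) = :tabular

# Typeface depends on the context, with one exception that depends on both.
typeface(::Item, ::FlowingContent) = :serif
typeface(::Item, ::Diagram) = :sans_serif
typeface(::EquationItem, ::Diagram) = :large_serif

figures(NumberItem(3.14), Table())
typeface(TextItem("label"), Diagram())
typeface(EquationItem("E = mc^2"), Diagram())
```

Each rule in the list is one method, and the exception is just a more specific method on both arguments. The list says nothing about typefaces inside tables, so `typeface` is left undefined for `Table` here.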
28 Likes

This is a great collection. We should really collaborate on a blog post (or a series thereof). I feel that the definitive “why multiple dispatch” case has yet to be made, although I’ve been working towards it for some years now. My case is a bit more abstract; you’ve got the concrete examples. The hardness of the arithmetic promotion problem, and the completeness with which multiple dispatch solves it, is astounding.

28 Likes

Sure, let’s make that happen. I will direct message you tomorrow.
A big chunk of “JuliaLang: The Ingredients for a Composable Programming Language” is unabashedly my take on your JuliaCon 2019 talk, “The Unreasonable Effectiveness of Multiple Dispatch”.
I am briefly speaking on an RSE podcast in mid-January about this, so any drafts I get together before then will be useful for preparing for that.

11 Likes

Plotting is another good example where you want to define new plot operations for existing types, but you also want to add new types to plot. (Matplotlib does a lot of manual type introspection, and it isn’t very extensible.)

9 Likes

The common OOP textbook examples of mapping objects (in the model or the real world) to objects (instances of classes) usually involve mutable state, which brings in its own host of complications.

Mutable state is sometimes worth it, usually as an optimization, but is not in itself a desirable design feature. Yet a lot of common C++ design patterns use it extensively.

6 Likes

Difficulty with mutable state: I’ve been there, even WITHIN the area of optimization where mutability is an advantage (such as optimizing a bunch of interconnected models’ predictions in parallel; this basically requires some sort of decoupling). On that note, I’m pretty sure increased reliance on parallel operation is one of the biggest reasons why there is renewed interest in functional programming styles. It’s nice that Julia’s paradigm supports the merits of both OOP and functional programming in this respect (and that the style guide explicitly denotes mutation with “!”; it’s SO hard to tell whether mutation is actually being applied in Python).
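The “!” convention is easy to see with a pair of Base functions, `sort` and `sort!`:

```julia
v = [3, 1, 2]

w = sort(v)              # returns a new sorted vector
v_after_sort = copy(v)   # v itself has not been touched yet

sort!(v)                 # the "!" signals that v is modified in place
```

The “!” is only a naming convention and is not enforced by the compiler.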

2 Likes

In general, this is something I think would be really valuable. I continually run into people who say “Julia is just a fast Python, but who cares, because I can make Python fast anyway”. Besides the issues involved in making Python universally fast (which are not always well articulated by Julia proponents), I feel this misses the point.

The value proposition of Julia isn’t just speed. There is tremendous value in how you approach software architecture and reusable / interoperable / composable package ecosystems when multiple dispatch is available, and this message definitely isn’t well appreciated by those who only see Julia as a way to possibly speed up some numerics. And while Julia advertises itself as solving the two-language problem, it’s still not really clear to many people, who rely on extensive C-backed Python libraries with dedicated dev teams, what kind of advantage that can be in lowering the bar to more user-level contributions, among other virtues.

For that matter, it’s not well appreciated how powerful other Julia language features can be, like metaprogramming, the ability to take advantage of a compiler to do compiler-like things at a higher level of abstraction, etc., and how they all work together in a “whole is greater than the sum” way. The “walks like Python, runs like C” performance idea seems to have taken over general perception of what Julia is about, to the exclusion of all else.

What I feel would be useful is a series of blog posts, as you say, that dive into some of these issues in a more concrete way, and not contrived hypothetical scenarios, but with examples from real Julia packages or use cases. (Not necessarily just scientific/numerical use cases, either.)

And I think it would also be helpful to make some more direct comparisons to other languages, like Python, in terms of “why it would be hard to do this without language features X, Y, and Z”. This can be a little dangerous, because it’s not great to appear to be picking a fight with another language or set of libraries. But on the other hand, I feel that lacking more direct comparisons, users of other languages are having a hard time concretely understanding what it is we’re going on about. There are some examples along these lines, like @ChrisRackauckas’s “Why Numba and Cython are not substitutes for Julia” or @tamasgal’s post in another recent thread, that provide some real examples of how Julia development can be different from Python. I think there’s just a question of how to make some of these points more widely, yet respectfully.

I’m not sure what the best solution here is, but I can say, based on my experience trying to describe Julia to non-Julia users, that there needs to be more basic messaging we can point to about how multiple dispatch and other Julia features can fundamentally improve software development beyond just “making some things faster”.

35 Likes

@StefanKarpinski

“This is what makes code reuse so straightforward: there is no wrong choice that can be made that prevents it.”

Not clear where the code reuse is actually occurring. I write a new method–just a function with the same name, which dispatches on my argument type (new kind of object), and do whatever I want in my function. It seems I didn’t reuse anything except the original method’s name.

Is this how re-use might occur: in my new method, I first cast my object into the original method’s object (this may or may not be possible or easy, but it might be quite easy…). Then I use the converted object to call the original method. Then I do “additional stuff” on the returned value and return that result. This seems like I can use the code of the original method and do some more/different stuff. The original method might do some tricky stuff that I dare not mess with–don’t have to as long as I understand the argument type to the original method. Now, I really am re-using the original code, without even ever seeing it–as long as I like what it seems to do, given what it returns.

Is that what I’d have to do to really re-use the code of the original method?

One of my personal favorite ways to demonstrate how something conceptually straightforward is difficult in Python is to compare

julia> 1//10 * im + 2//10 * im
0//1 + 3//10*im

and

>>> from fractions import Fraction
>>> Fraction(10, 100) + Fraction(20, 100)
Fraction(3, 10)
>>> Fraction(1, 10) * 1j + Fraction(2, 10) * 1j
0.30000000000000004j

The problem happens when you do Fraction(1, 10) * 1j , but I like to show the floating point “error” for dramatic effect, since that’s the real-world consequence of trying to do rational and complex arithmetic with <class 'fractions.Fraction'> and <class 'complex'>. As far as I know, you cannot. They are even both in Python’s standard library, and designed with some awareness of each other, but the result is still subpar.

I aimed to go into the details of the complex-rational example with this presentation I gave to my lab.

Specifically, I call out this code comment:

This isn’t even linear algebra. It’s just complex numbers and rational numbers, and already you see it’s necessary for Python to “Greenspun” something that amounts to multiple dispatch. But not really, because it’s closed off in at least two important ways:

  • it only works for two arguments and for specific dunder procedures
  • the switching logic over types is not extensible without modifying the source (one direction of the already-mentioned expression problem). Take stock of all the isinstance conditions…
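
For contrast, a quick check of what Julia does with the same expression, using nothing beyond Base. The rational parts are never converted to floating point, so the result stays exact.

```julia
z = 1//10 * im + 2//10 * im

typeof(z)          # a Complex whose parts are Rational
imag(z)            # exactly 3//10
z == 3//10 * im    # exact comparison, no rounding involved
```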

By the way, I just tried the notebook in mybinder.org, and it seems to work just fine.

17 Likes

The reuse doesn’t occur in your own code that you control. There you can just add new operations and types as you wish, so the expression problem doesn’t come up. This may be why so few programmers seem to recognize it as a significant issue — it doesn’t become a major impediment to code reuse until you look at an entire software ecosystem with various independent actors. The issue occurs when one person wants to reuse parts of someone else’s code without them collaborating closely.

Suppose, for example, that you want to add new operations to types that I have defined in one of my packages. OOP makes this sufficiently awkward that it’s often easier to just forgo sharing of types and have many separate and independent implementations of the same concepts. Which is indeed something that we see again and again in OOP ecosystems: duplication of similar types instead of sharing of common types. In Julia, on the other hand, we see collections of common types like Distributions, ColorTypes, StaticArrays, various geometric primitives, etc., and they are reused by anyone who needs such a type anywhere in the ecosystem. Different use cases call for different sets of methods operating on those types, but this is no problem with external functions. The code reuse is the reuse of shared types across otherwise unrelated and uncoordinated packages.

On the flip side, in functional languages, when someone wants to extend existing functions to new types, they cannot generally do so without editing the definition of the original function. It may happen that by some form of polymorphism (including duck typing) the original definition works; but if not, then there is no recourse but to copy-and-modify the function definition and give it a different name or, if the language allows it, “monkey patch” the original function. Again we lose out on code reuse, this time from not getting to share the logic for generic algorithms. OOP, on the other hand, excels here: it allows writing generic code that can apply to new externally defined types automatically. If, however, you need to specialize some generic operation on more than just the receiver of a method then you are still stuck in OOP, whereas multiple dispatch handles this fine.

In summary, OOP allows sharing generic implementations of algorithms but discourages sharing and reusing common data types. Functional programming allows sharing types easily but makes it hard to write generic code that can be applied to entirely new types unless the existing implementation happens to work without any specialization required. Multiple dispatch supports both kinds of code reuse straightforwardly.
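A toy sketch of both directions in one place. Everything here is hypothetical: the first half stands in for someone else’s package, the second half for your own code, and in a real setting they would live in separate modules (so you would write `TheirPackage.area(s::Square) = ...` or import `area` first).

```julia
# --- Imagine this part lives in someone else's package ---
abstract type Shape end

struct Circle <: Shape
    r::Float64
end

area(c::Circle) = pi * c.r^2

# Generic algorithm written only in terms of `area`.
total_area(shapes) = sum(area, shapes)

# --- Your code, written later without touching the above ---
struct Square <: Shape
    side::Float64
end

# New type plugged into an existing function: total_area now works on Squares.
area(s::Square) = s.side^2

# New operation added to an existing type: Circle did not need to change.
perimeter(c::Circle) = 2 * pi * c.r

mixed_total = total_area([Circle(1.0), Square(2.0)])
circle_perimeter = perimeter(Circle(1.0))
```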

13 Likes

A simple example based on your scenario: suppose the function you extended is Base.+. Now every function ever written in Julia that only calls Base.+ on its generic arguments will work for your types. The code reuse in that situation is the immediate ability to call, on your type, many functions you did not even know existed, just because both you and third-party libraries know of the same package: you extend a function from that package, and the third-party libraries write code that calls that function on generic types.
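For instance, with a made-up `Money` type (hypothetical, for illustration only), defining `+` is enough for existing generic code such as `sum` and `reduce` to work on it:

```julia
struct Money
    cents::Int
end

# Extend the existing generic function Base.+ with a method for the new type.
Base.:+(a::Money, b::Money) = Money(a.cents + b.cents)
# Lets sum() handle an empty Vector{Money} as well.
Base.zero(::Type{Money}) = Money(0)

wallet = [Money(150), Money(250), Money(100)]

total = sum(wallet)           # sum was written long before Money existed
folded = reduce(+, wallet)    # so was reduce

# Stand-in for a third-party function that only relies on +.
running_total(xs) = foldl(+, xs)
third_party = running_total(wallet)
```

None of `sum`, `reduce` or `foldl` needed to be edited or even inspected.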

10 Likes

For content management systems, the Zope community discovered that Python’s object-oriented model was inadequate, so they created the “Zope Component Architecture”. The same sort of phenomenon occurs with the Java community’s “Design Patterns”: even walking a tree can be complex in Java, requiring double dispatch and giving rise to the “Visitor” pattern. In Julia, programming in this way is just the normal way of doing things; it doesn’t require any special user-land interface definition libraries or design patterns, it just works. Moreover, it is fast because it is supported directly by the compiler rather than being added on. It’s for this reason that I think Julia will eventually break into and start dominating in all sorts of business applications. The next popular content management system (the core of any enterprise application) will have an intuitive and performant mechanism for negotiating the cross product of content producers, observers, and consumers based upon multiple dispatch.
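As a rough illustration of the tree-walking case, here is what the “Visitor” shape looks like when the language dispatches on both the visitor and the node. All names are hypothetical.

```julia
abstract type Node end

struct Leaf <: Node
    value::Int
end

struct Branch <: Node
    left::Node
    right::Node
end

# Two different "visitors", represented as empty types.
struct SumValues end
struct MaxDepth end

# One method per (visitor, node) combination; no accept/visit double call needed.
visit(::SumValues, n::Leaf) = n.value
visit(v::SumValues, n::Branch) = visit(v, n.left) + visit(v, n.right)

visit(::MaxDepth, ::Leaf) = 1
visit(v::MaxDepth, n::Branch) = 1 + max(visit(v, n.left), visit(v, n.right))

tree = Branch(Leaf(1), Branch(Leaf(2), Leaf(3)))

tree_sum = visit(SumValues(), tree)
tree_depth = visit(MaxDepth(), tree)
```

Adding a third visitor or a third node type means adding methods, not editing the existing ones.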

14 Likes

One thing that I will note is that julia-land is not free of these clunky design patterns either. We just need them less often.

Probably the most famous example of such a ‘design pattern’ in julia is the Holy Trait pattern, which can become important in many circumstances where subtyping is not feasible.

Julia’s multiple dispatch and type system are fantastic, but I do think the presence of this trait pattern points to a deficiency in the language semantics, just like how patterns like ‘double dispatch’ and the ‘visitor pattern’ point to deficiencies in Class-Based-OOP languages.
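For readers who have not met it, a bare-bones sketch of the Holy Trait pattern looks roughly like this. The types and function names are invented for the example.

```julia
# Trait values: empty types used only for dispatch.
struct CanFly end
struct CannotFly end

# Unrelated types that share no common supertype beyond Any.
struct Duck end
struct Penguin end
struct Plane end

# Trait function: maps a type to a trait value, with a default.
flight_trait(::Type) = CannotFly()
flight_trait(::Type{Duck}) = CanFly()
flight_trait(::Type{Plane}) = CanFly()

# The public function looks up the trait and dispatches again on it.
travel(x::T) where {T} = travel(flight_trait(T), x)
travel(::CanFly, x) = "flies"
travel(::CannotFly, x) = "walks"

(travel(Duck()), travel(Plane()), travel(Penguin()))
```

The extra `flight_trait` hop is the boilerplate being criticised: behaviour is grouped by a property that the type hierarchy cannot express directly.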

18 Likes

I’ve used the visitor pattern a few times in production code, and my thoughts on it were not that OOP is deficient somehow. Similarly I would expect that other people who come to Julia from OOP will have the visitor pattern in their toolbox. It would help newcomers who know such patterns if better alternatives in Julia were easily available (documented).
Multiple dispatch can be better, but bashing OOP does not help people migrating from OOP.

There’s a brief window in the early morning after a small piece of chocolate where I think I understand the Holy Trait pattern. However, by afternoon, I’m back to puzzling… I know the problem exists. I’m not even sure I could describe the problem. That said, I did have some unexpected success lately with package interoperability. Both HypertextLiteral and Hyperscript produce objects that are showable to "text/html"; this is what lets them interoperate seamlessly. Both of them, as a fallback, use show(io, MIME"text/html"(), obj) to print subordinates, so it just works. My code was originally doing a look-before-you-leap test via showable, but it seems the Julia way is just to leap, and let a MethodError appear when something can’t integrate.
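A stripped-down sketch of that mechanism, using two made-up wrapper types rather than the real HypertextLiteral or Hyperscript objects. Only `show`, `showable`, `sprint` and the `MIME"text/html"` type are real Base API.

```julia
# Two independent toy types, standing in for objects from two different packages.
struct Bold
    content   # anything that knows how to show itself as text/html
end

struct Emph
    text::String
end

Base.show(io::IO, ::MIME"text/html", e::Emph) = print(io, "<em>", e.text, "</em>")

# Bold does not know about Emph; it just asks its content to show itself.
function Base.show(io::IO, m::MIME"text/html", b::Bold)
    print(io, "<b>")
    show(io, m, b.content)   # "just leap": a MethodError if content can't do this
    print(io, "</b>")
end

html = sprint(show, MIME"text/html"(), Bold(Emph("hi")))
can_show = showable(MIME"text/html"(), Emph("hi"))   # the look-before-you-leap test
```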

I don’t see pointing out deficiencies of older technologies as bashing. We’ve learned quite a bit in 30 years. The earliest that I remember being introduced to object oriented programming is with the “cfront” pre-processor in the late 80s, the precursor to C++. It was, at that time, magic. It solved so many problems. I remember having lots of meetings with procedural programmers on my team telling them why we should use data structures where the first element in the structure was a pointer to a function table of methods, instead of having an enum where each function uses switch to pick the correct method implementation. This was a revolutionary idea that solved so many problems. It was the cat’s meow!

But technology moves on. Using a pointer to a virtual function table as the first element in the data structure was a huge improvement over the existing techniques of the time. Certainly those who stuck with older patterns had programs that worked, but they spent much more time building their programs than those who used “cfront”. It’s the productivity that came with the change that convinced people. The same will happen here.

Let’s get to your specific point. The reason why the “Visitor Pattern” needed to be documented is that, to someone unaccustomed to OOP with a statically typed system, it’s really unobvious how to implement the traversal of a tree with multiple kinds of visitors. Now, with dynamic languages, like Python (that came along about a decade later), and duck typing, it’s less of a challenge. However, the Design Patterns book was really essential reading when it came out.

It’s not that Julia is perfect. It’s just that a working implementation of multiple dispatch is a huge step forward. That said, we really do need to figure out traits… it is critical to having interoperable libraries that don’t know about each other in advance. Julia seems just so close here… but I can’t even articulate what the challenge is. So, in this regard, a bunch of Julia Design Patterns will emerge, but I think these will be focused on different problems.

10 Likes