Is julia a good choice for this?

Hello! I’m a complete julia novice, but I’ve been hearing about it for years from friends and colleagues. I’ve approached it for a project that I want to start, but I need some advice on what direction to take.

I’m looking to start a relatively long-term project of converting a large codebase (at the moment written in Fortran) to a more modern language. Part of this is in order to make it more accessible to a wide community, and part is due to the nature of the application, which could greatly benefit from more modularity.

I’ve tried to work out an example. Imagine an application that goes through the life of an animal. The data would be what the animal is like (weight, size, number of limbs…) while the methods would be the actions it performs (eat, sleep, move…). The data will look different depending on the animal, as some will have not only various dimensions and number of legs, but also completely unique characteristics. Similarly, the “eat” function would look very different for a domesticated dog, which is given food at home, and a wild dog, that has to hunt down its prey.

This seemed well-suited to an object-oriented implementation, having a main class with the basic version of the data and process, and then different classes with the appropriate data and updated methods; what class to use can be decided depending on the choices made by the user when starting the program. For this reason I approached python, using numba to achieve similar efficiency to the fortran code. The problem I encountered is that support for classes doesn’t go all the way yet, or at least the functionality I was looking for required some counterintuitive hoops. This is why I’ve decided to finally try julia instead.

Now, I’ve quickly come to understand that julia doesn’t really support classes in the way that python does, and relies on multiple dispatch instead. In my understanding, this means that instead of having class Dog, class Cat, etc., each with its own version of eat, sleep, etc., in julia I would define different data types Dog, Cat, etc., and then I would define an eat(Dog), sleep(Dog), etc., eat(Cat), sleep(Cat), etc…
I see how this could be done, but it doesn’t strike me as much different or better than having a single eat function with a bunch of if(Dog) …, if(Cat)…, or a switch(animal). In fact, considering that I would give up on using classes either way, I’m not so sure what the advantage would be over going back to python/numba, which is more familiar to me. My worry is mainly that switching to julia will not particularly improve readability or efficiency.

I have essentially two questions: first, do I have the right idea of what julia could do in this case or have I misunderstood something crucial? Second, are there other and better ways to approach this sort of situation?

Thank you!

7 Likes

Welcome! Your impression is correct on the fundamentals — that is, that Julia completely separates data (the nouns) from functions (the verbs), and that this stands in contrast to the typical “bundling” of the two in standard OOP classes. Personally I find this to be much more like human languages, but it does mean you lose the dot-oriented programming that many love (like dog.sleep()).

A key question underpinning all of this, however, is whether you use a single struct Animal (whose particular species is a field therein, using branches) or lots of types, each of which would be a subtype of some AbstractAnimal (using dispatch).

Both are perfectly fine design patterns in Julia, but this is where your cartoon example might not have enough detail to say one way or the other. I do often see folks new to the language come up with far too many types.

Whether Julia ends up as more readable or more efficient will really depend upon you and what you’re doing and what you’re comparing it against.

16 Likes

I would instead say that

struct Dog <: Animal
...
end

function sleep(dog::Dog)
...
end

is not very different from

class Dog(Animal):
    def sleep(self):
        ...
    ...

Functionally, it’s basically equivalent. In both cases, you can have separate code bases (or separate packages/libraries) for dogs, cats, etc.

With multiple dispatch, if you have animal interaction, defining attack(::Dog, ::Cat) is far nicer than the Python alternative (where you’d be stuck with ifs) but I don’t know if that’s relevant to your problem.
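As a minimal sketch of what that could look like (all type names and return strings here are hypothetical, chosen only to illustrate the idea), a method can be selected from the types of both arguments, with a generic fallback for every other pair:

```julia
# Hypothetical types for illustration; none of these names come from a real package.
abstract type Animal end
struct Dog <: Animal end
struct Cat <: Animal end
struct Mouse <: Animal end

# Generic fallback for any pair of animals.
attack(a::Animal, b::Animal) = "nothing happens"

# Specialised methods, selected from the types of *both* arguments.
attack(::Dog, ::Cat) = "the dog chases the cat"
attack(::Cat, ::Mouse) = "the cat pounces on the mouse"

attack(Dog(), Cat())   # uses the (Dog, Cat) method
attack(Cat(), Dog())   # no specialised method, so the fallback is used
```

Adding a new interaction later is just one more method; no existing method has to be edited.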

10 Likes

dog.sleep() vs. sleep(dog) is really just syntax. The real difference is how you add new functions that work with existing types, or new types that work with existing functions. This is known as “the expression problem”, and only a handful of languages are considered to solve it elegantly. One of the prominent examples is Julia with its multiple dispatch ;) (a very nice talk on this topic by Stefan Karpinski is available here: The Unreasonable Effectiveness of Multiple Dispatch)

Given that your code is written in Fortran, I assume that performance was one of the key decisions (I might be wrong, but it’s usually the case). I am absolutely sure that you will be able to port that code nicely to Julia with comparable performance but you need to bend your mind a little bit.

Btw. Julia is pretty good at interoperating with Fortran so you could even start with some bindings to call Fortran routines from Julia.
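A rough sketch of such a binding is below. Everything specific in it is an assumption for illustration: the Fortran subroutine `scale_weights`, the library name `libanimals`, and the symbol name `scale_weights_` (lowercase with a trailing underscore is gfortran’s usual convention for plain subroutines, but name mangling depends on the compiler).

```julia
# Assumed Fortran source, compiled separately into a shared library, e.g.
#   gfortran -shared -fPIC animals.f90 -o libanimals.so
#
#   subroutine scale_weights(n, w, factor)
#     integer, intent(in) :: n
#     double precision, intent(inout) :: w(n)
#     double precision, intent(in) :: factor
#     w = w * factor
#   end subroutine

const LIBANIMALS = "libanimals"   # hypothetical library name

function scale_weights!(w::Vector{Float64}, factor::Float64)
    n = Cint(length(w))
    # Fortran passes arguments by reference, hence Ref/Ptr for every argument.
    ccall((:scale_weights_, LIBANIMALS), Cvoid,
          (Ref{Cint}, Ptr{Float64}, Ref{Float64}),
          n, w, factor)
    return w
end
```

The wrapper only works once the library has actually been built and can be found by the loader; the call would then be `scale_weights!(weights, 2.0)`. Wrapping routines one at a time like this lets the port proceed incrementally while the tested Fortran code keeps doing the work.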

7 Likes

Welcome @wlosh!

Here is some related advice:

If there is a Julia package covering most of the features of the Fortran project, I would contribute to it instead. If the Fortran project is open source, I would share the name here so that others can jump in and share their ongoing efforts.

Don’t know how to search Julia packages? Use the search bar in JuliaHub:

6 Likes

A couple important questions that actually have very little to do with the language:

  1. can you accomplish your goals by using a popular language (Julia isn’t that popular) to wrap the Fortran code? Enforcing equivalence of rewrites isn’t trivial, and there’s no inherent need to throw away code you know to work.

  2. can you accomplish your goals with Julia’s current package ecosystem? Even if a language were perfect, there’s no point in using it if you can’t substitute key libraries, unless you plan on implementing them, too.

You don’t have to answer them in this thread, these are just big picture things you should figure out before you take first steps.

The good news is that Julia is very modular in nature. I would say that it stands out as a dynamic language used to compile ad hoc mixtures of generic algorithms and types from different packages, and multiple dispatch helps a lot.

  1. Generic functions are nice to avoid rewriting the same algorithms across different types. In some statically typed languages, there are restrictions on the return types of generic functions, i.e. the return type must be the same as one of the arguments’ types. Julia is dynamic, so the return type can be whatever the compiler infers (though there is an art to writing the method to help the compiler). You can also write a highly reusable function or method in Python; it just won’t be compiled per call signature or do type inference, nor can you add more methods to the same function name to dispatch on the arguments.

  2. Use informal interfaces, a fixed set of methods, to extend a larger unbound set of generic algorithms to a new type. This happens in Python too with instances of subclasses falling back to methods in parent classes*, though you can probably tell at first glance that encapsulating methods in singly dispatched classes makes it awkward and dangerous to add more generic algorithms afterward, especially if the class was imported from a different package. In Julia, methods belong to functions, and it’s common to define a new function or extend an imported function on a mixture of new or imported argument types (but avoid extending imported functions on only imported argument types, that’s type piracy and belongs in the home package and its development).

* Python example
class AbstractA:
    def generic(self):
        return self.interface()

class A1(AbstractA):
    def interface(self):
        return 1
## different package ##
class A2(AbstractA):
    def interface(self):
        return "one"
  3. You mentioned Numba, an LLVM-based JIT compiler. But it is restricted to compiling valid Python and NumPy code, and thus lacks features like type parameters (great for compile-time data) and named composite types. (The closest thing to named composites in Python would be classes, but the instances are not supportable as scalars by Numba’s efficient nopython mode or as inlined elements in NumPy arrays. Unnamed structures are supported but infeasible to instantiate as scalars because the array’s specific dtype does not match a scalar’s vague Python type numpy.void or numpy.rec, depending on NumPy or Numba.) It is very awkward and sometimes infeasible to rewrite Julia code in Numba, in my experience.
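
For comparison, here is a rough Julia counterpart of the Python footnote, as a sketch only. The names mirror the Python example; `describe` and `Tagged` are hypothetical additions, meant to show a generic algorithm added after the fact and a type parameter carrying compile-time data.

```julia
abstract type AbstractA end

# Generic algorithm, written once against the informal interface `interface`.
generic(a::AbstractA) = interface(a)

struct A1 <: AbstractA end
interface(::A1) = 1

## could live in a different package (which would import `interface` to extend it) ##
struct A2 <: AbstractA end
interface(::A2) = "one"

# A type parameter carries data that is known at compile time.
struct Tagged{N} <: AbstractA end
interface(::Tagged{N}) where {N} = N

# A new generic algorithm can be added later without touching any type definition.
describe(a::AbstractA) = string(typeof(a), " -> ", generic(a))

generic(A1())          # Int return value
generic(A2())          # String return value
generic(Tagged{3}())   # the value stored in the type parameter
```

Because `generic` and `describe` are plain functions rather than members of a class, either package can add more of them without subclassing or modifying the other.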
2 Likes

From a design perspective, the difference between dispatching on a type and branching on a value is extensibility. In short, if I want to build on your library and support a species that you didn’t consider, it’s easier and cleaner if you use dispatch rather than branches, because then I can extend your functions with new methods for my species. If your design is based on branching, your functions are final and I have to write wrappers to get the behavior I want. Here’s a toy example:

Branching
### Your code
module Zoo

struct Animal
    species::String
end

function eat(animal)
    if animal.species == "cat"
        carefully_consider_each_bite()
    elseif animal.species == "dog"
        scarf_it_down()
    else
        error("species not supported")
    end
end

end  # module Zoo


### My code
import Zoo: Animal, eat

function my_eat(animal)  # Wrapper to "extend" Zoo.eat
    if animal.species == "snake"
        detach_jaw_and_swallow()
    else
        eat(animal)  # Fallback to Zoo.eat
    end
end

snake = Animal("snake")
my_eat(snake)  # calling eat(snake) directly would raise "species not supported"
Dispatch
### Your code
module Zoo

abstract type Animal end

struct Cat <: Animal end
struct Dog <: Animal end

function eat(animal::Cat)
    carefully_consider_each_bite()
end

function eat(animal::Dog)
    scarf_it_down()
end

end  # module Zoo


### My code
import Zoo: Animal, eat

struct Snake <: Animal end

function eat(animal::Snake)  # Adding a method to Zoo.eat
    detach_jaw_and_swallow()
end

snake = Snake()
eat(snake)

Note that this example does not demonstrate multiple dispatch, only single dispatch, which accomplishes the same as OOP with classes and methods. Multiple dispatch is when you define specialized methods based on the types of more than one function argument; this enables designs that are awkward to express with single-dispatch OOP.

7 Likes

Thank you for the comments, this definitely got me thinking! I will look into the type features you mentioned as it sounds like type parameters in particular could be very useful.

1 Like

Thank you especially for the warning against type proliferation, that’s definitely something I should be wary of.

Pointing out the distinction between a single struct with an extra field and multiple types, you made me realize another doubt I have (I hope it is okay to write it here).

In OOP, it is possible for two classes to have exactly the same fields and only different methods. If I wanted to deal with the same case in julia with dispatch, it would lead to having two different types which not only share the same abstract type, but are also identical in terms of their composition; the only difference between them is that some methods act differently on them. Is this something that is accepted/encouraged in julia programming, or something that I should actively avoid?

1 Like

I’m not sure what “best practice” is, but this can be handled with composition.

abstract type AbstractSpecies end

struct Cat <: AbstractSpecies 
    is_tabby::Bool
end

struct Dog <: AbstractSpecies 
    some_dog_prop::Symbol 
end

mutable struct Animal{S}
    species::S
    height::Float64
    num_limbs::UInt8
end


cat(is_tabby, height, num_limbs) = Animal(Cat(is_tabby), height, num_limbs)

function grow(a::Animal, i) 
    a.height+=i 
end
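
Building on the composition idea, species-specific behaviour can then dispatch on the type parameter of `Animal`. The sketch below restates the types so that it runs on its own; `eat`, the return strings, and the `grow!` spelling (following the Julia convention of a trailing `!` for mutating functions) are illustrative assumptions.

```julia
abstract type AbstractSpecies end

struct Cat <: AbstractSpecies
    is_tabby::Bool
end

struct Dog <: AbstractSpecies
    some_dog_prop::Symbol
end

mutable struct Animal{S<:AbstractSpecies}
    species::S
    height::Float64
    num_limbs::UInt8
end

# Shared behaviour: one method covers every species.
grow!(a::Animal, i) = (a.height += i; a)

# Species-specific behaviour: dispatch on the type parameter.
eat(::Animal{Cat}) = "carefully considers each bite"
eat(::Animal{Dog}) = "scarfs it down"

# The default constructor expects the declared field types (Float64, UInt8).
tabby = Animal(Cat(true), 0.25, 0x04)
grow!(tabby, 0.25)
eat(tabby)
```

The common fields live in one place, while each species struct only holds what is unique to it.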

If you don’t want the explosion of types (if you’re dealing with hundreds of species) then it might make more sense to use an enum and have nullable fields in the animal type.
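
A minimal sketch of that enum alternative, with hypothetical species and field names; fields that only apply to some species are typed as `Union{Nothing,T}` and default to `nothing`:

```julia
# Hypothetical species list; new species extend the enum instead of adding types.
@enum Species CAT DOG SNAKE

mutable struct Animal
    species::Species
    height::Float64
    num_limbs::UInt8
    is_tabby::Union{Nothing,Bool}   # only meaningful for cats
end

# Convenience constructor: species-specific fields default to `nothing`.
Animal(species::Species, height, num_limbs; is_tabby=nothing) =
    Animal(species, height, num_limbs, is_tabby)

# Behaviour branches on the enum value instead of dispatching on a type.
function eat(a::Animal)
    if a.species == CAT
        "carefully considers each bite"
    elseif a.species == DOG
        "scarfs it down"
    else
        error("species $(a.species) not supported")
    end
end

tom = Animal(CAT, 0.25, 4; is_tabby=true)
eat(tom)
```

This keeps a single concrete type, at the cost of functions that must be edited whenever a species is added.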

1 Like

@wlosh Welcome to Julia! Please stick with it - everyone here is helpful and will answer questions you have. I can see sometimes here people get disheartened early on - no need. Just yell for help.

Not adding anything to the debate…
Animals are generally 5 sided - four limbs and a head. Any expert able to tell us when that evolutionary branch happened in the animal kingdom? As of course jellyfish and octopuses exist, but there is definitely a five sided branch.

P.S. Have you found Agents.jl? See Introduction · Agents.jl (juliadynamics.github.io)

1 Like

I don’t see any specific drawbacks with it: Generally it should be just OK for both readability and performance.

Here’s what the manual has to say about this:

One particularly distinctive feature of Julia’s type system is that concrete types may not subtype each other: all concrete types are final and may only have abstract types as their supertypes. While this might at first seem unduly restrictive, it has many beneficial consequences with surprisingly few drawbacks. It turns out that being able to inherit behavior is much more important than being able to inherit structure, and inheriting both causes significant difficulties in traditional object-oriented languages.

https://docs.julialang.org/en/v1/manual/types/
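
To make the “inherit behavior, not structure” point concrete, here is a small sketch using hypothetical types based on the domesticated/wild dog example from the original question: the two concrete types have identical fields, share one method through the abstract type, and differ only in `eat`.

```julia
abstract type AbstractDog end

# Two concrete types with identical fields; only their methods differ.
struct DomesticDog <: AbstractDog
    weight::Float64
end

struct WildDog <: AbstractDog
    weight::Float64
end

# Behaviour inherited through the abstract type: written once for both.
is_heavy(d::AbstractDog) = d.weight > 30.0

# Behaviour that differs per concrete type.
eat(::DomesticDog) = "eats from the bowl at home"
eat(::WildDog) = "hunts down its prey"

eat(DomesticDog(12.0))
eat(WildDog(35.0))
```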

I’m not generally an expert, but specifically for cats I can tell you they also have a tail, in most cases.

2 Likes

This recent paper is quite fascinating and proves the number of limbs can be 0:

For how to proceed with the code port: See the following post by @tamasgal

One important consideration: When proceeding with the port bottom-up, as suggested above, you will acquire expertise as you proceed. For the top-down way (project architecture first) you need to be an expert at the very beginning.

I must also warn you: It looks like right now you see OOP as the most natural way of representation. After some time of using Julia it would appear rather artificial, with no way back.

In the terms of dog-oriented :slightly_smiling_face: programming:

play(dog, dog)
play(dog, cat)
play(cat, cat)
play(cat, mouse)

- the behavior is not primarily a property of just one object.

2 Likes

Yes and no: multiple dispatch is the main native paradigm and classes are not supported natively, but class-based OOP is available through packages.

Such (traditional) OOP is, however, a known performance trap. It can be an order of magnitude slower even in C++, at least if your methods are trivial (as mainstream advice recommends).

That is not the OOP way, nor the Julia way. It means that whenever you add a class you must change your existing code, which is why it is bad.

It helps to learn the idiomatic Julia way: the code will be as fast as anything, readable, and extensible.

If it is not, you are doing something wrong and need to read the Performance Tips section of the manual.

You can reuse all the Fortran code (at least non-OOP Fortran; it supports OOP by now, though I think that is rarely used, and I still wouldn’t rule out reusing such code).

In C++ the OOP way is slow, and certainly in Python too, though it’s less clear to me whether it is slower there than non-OOP code; either way it will be very slow.

You CAN do repeated ifs instead of a switch in Julia, but usually it would be a code smell, though maybe not slow.

Julia doesn’t have a switch keyword, by design, since it’s redundant with (OOP and) Julia’s multiple dispatch.

Julia doesn’t have built-in pattern matching, which some other languages offer as a replacement for switch, but packages provide it. Some people are fans of it; others are detractors of the concept in general.

Here’s at least one article that shows how you would do “OOP” natively in Julia:

I.e. not class-based.

FYI: You CAN do traditional class-based OOP in Julia. There are packages for it: a) one based on Python’s OOP (and the same as, or similar enough to, C++), and b) one based on Lisp/CLOS. I’m just letting you know they exist in case you feel they’re needed:

ObjectOriented.jl is a mechanical OOP programming library for Julia. The design is mainly based on CPython OOP but adapted for Julia.

The supported features:

  • multiple inheritances
  • using dot operators to access members
  • default field values
  • overloaded constructors and methods
  • Python-style properties (getters and setters)
  • generics for the OOP system
  • interfaces

Check out our documentation.

We recommend you to read How to Translate OOP into Idiomatic Julia before using this package. If you understand your complaints about Julia’s lack of OOP come from your refutation of the Julia coding style, feel free to use this.

Here’s a newer package, a different flavor (pun intended) of an OOP system (sadly the designer didn’t know of the one above):

[If you find JuliaObjectSystem.jl, it’s the same package; it was just renamed before registering.]

1 Like

I had a similar frustration: not being able to extend non-abstract types.
The best, most-julian way I found was:
@quasiabstract struct from:

It does not support multiple inheritance though.

(I used and contributed to OOPMacro.jl in the past, but I think ReusePatterns.jl plays better with multiple dispatch)

2 Likes