Some clarifications (and best practices) on the OO model in Julia?

question

#1

Hello,
I am trying to understand how the Object Oriented paradigm should be implemented in Julia, coming from a C++ approach. I bought the excellent new ebook “Julia: High Performance Programming” from Packt Publishing, but only a few pages are allocated to explaining custom types.

So, I have the following example (that doesn’t work) and I would like some help on some specific questions and how to frame it for best practices in Julia:

File Shoes.jl:

type Shoes
   shoesType::String
   colour::String
end

In C++ I normally place each class in its own file. I am a bit lost about how the type hierarchy interacts with the module/package hierarchy.
Should I wrap all my types in a module, in a module MyType type MyType [...] end end fashion?

File Person.jl

abstract Person
  myname::String
  age::Int8
end

I know this is wrong, because abstract types in Julia do not hold attributes, but what is the alternative? Create a normal type? What I would like to achieve here is to declare a type that has some attributes shared by all its subtypes but cannot be instantiated itself, that is, any object must be instantiated through a subtype instead (this is the meaning of abstract classes in C++).

File Student.jl:

include("Person.jl")
include("Shoes.jl")

type Student <: Person
   school::String
   shoes::Shoes
   function printMyActivity()
     println("I study at $school school")
   end
end

File Employee.jl:

include("Person.jl")
include("Shoes.jl")

type Employee <: Person
   monthlyIncomes::Float32
   company::String
   shoes::Shoes
  function printMyActivity()
    println("I work at $company company")
  end
end

Here I have two concerns:
a) How can I avoid the same file being included several times (that is, what is the equivalent of #ifndef / #define / #endif)?
b) How do I bind a function to a specific type (that is, create a method in the C++ sense)? Also, in this example the same function name is associated with two different implementations based on the type of the object. C++ (using pointers) can use this to implement run-time polymorphism. I understand Julia obtains the same result (multiple dispatch?) using the argument signature of a function, but is it possible to get it also using the type of the calling object, or does this concept not exist in Julia?

File Main.jl

include("Person.jl")
include("Shoes.jl")
include("Employee.jl")
include("Student.jl")

gymShoes = Shoes("gym","white")
proShoes = Shoes("classical","brown")

Marc = Student("Divine School",gymShoes)
MrBrown = Employee("ABC Corporation Inc.", proShoes)

Marc.printMyActivity()
MrBrown.printMyActivity()

If I run the above file I get a lot of "not defined" errors. When I include a file, am I not already using the global namespace?

In general, I would like to understand how this simple example should be reframed to match Julia (and not C++) OO paradigm, thank you.


#2

On wrapping each type in a module of the same name: no, your type cannot have the same name as the module.

On an alternative to abstract types with fields: none at the moment. Ref https://github.com/JuliaLang/julia/issues/4935

On including the same file several times: include everything only once. There’s no header in Julia, so the #ifndef trick is not necessary.

On "How do I bind a function to a specific type": you can’t.

On the errors from Main.jl: what error do you get? The member functions you tried to define kill the default constructor.
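To make the constructor remark concrete, here is a minimal sketch in current Julia syntax (struct instead of the type keyword used in this thread). The Counter type is hypothetical and only for illustration. It shows the documented rule that defining an inner constructor removes the default positional constructor; the functions placed inside the Student and Employee blocks above presumably trigger the same rule.

```julia
# Hypothetical type, for illustration only.
struct Counter
    count::Int
    # An inner constructor: once one is defined, Julia no longer
    # generates the default Counter(count) constructor.
    Counter() = new(0)
end

c = Counter()   # works, uses the inner constructor

# The default positional constructor is gone, so Counter(5) raises a MethodError.
default_ctor_missing = try
    Counter(5)
    false
catch err
    err isa MethodError
end

println(c.count)
println(default_ctor_missing)
```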


#3

Hi Sylvaticus,

Here are some of my thoughts on these as an intermediate user of julia for numerical modelling and data analysis - I am sure others can fill in the gaps.

The separation of types (data) and methods (behaviour) in Julia means that it is not necessarily practical to make one file that contains all the information for a type. That’s because you might have behaviour defined in several different contexts. For example, take add(1, 2.5). Would you put this method with the definition of an integer, or with the definition of a floating point number? Better to define what “integer” and “floating point” mean elsewhere and then have a separate section in something like arithmetic_operations.jl where you deal with what adding actually means.
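A small sketch of the add(1, 2.5) point. The add function here is hypothetical (Julia's real addition is +); the methods are only meant to show that a mixed-type method belongs to neither argument type, so it sits naturally with the other arithmetic definitions.

```julia
# Hypothetical generic function `add`, defined apart from the types it works on.
add(x::Int, y::Int) = x + y
add(x::Float64, y::Float64) = x + y

# Mixed-type methods: these "belong" to neither Int nor Float64.
add(x::Int, y::Float64) = add(Float64(x), y)
add(x::Float64, y::Int) = add(x, Float64(y))

println(add(1, 2.5))
println(add(1, 2))
```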

I tend to make a new file for each related chunk of behaviour. Then I have one main file that includes them all. If I had one type with a lot of associated behaviour, I would definitely consider splitting it out into its own file.

I would not wrap everything in modules. I hardly use them at all. Like @yuyichao says they can’t have the same name anyway. Modules are useful if you want a separate namespace for something. For example, you might have modules that define different sets of constants.
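As an illustration of the constants case, a sketch with two hypothetical modules (the names SI and Imperial and the values are assumptions chosen for the example). Each module gives the same name its own namespace:

```julia
# Two hypothetical modules holding different sets of constants.
module SI
const g = 9.81     # gravitational acceleration in m/s^2
end

module Imperial
const g = 32.17    # gravitational acceleration in ft/s^2
end

# The same name is resolved through the module it lives in.
println(SI.g)
println(Imperial.g)
```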

Yes, abstract types don’t hold fields. You are right: you would need to make an abstract type first and then manually include those fields in each subtype.

abstract Person

type Student <: Person
  myname::String
  age::Int8
  school::String
  [...]
end

type Employee <: Person
  myname::String
  age::Int8
  monthlyIncome::Float32
  [...]
end

This has the behaviour you want (Person can’t be instantiated). But obviously this is unnecessary duplication and prone to break if you change one subtype without the others. So this may change in future (http://github.com/JuliaLang/julia/issues/4935). I think I have seen tricks with macros before to emulate this behaviour as well.

Not sure on this but I don’t think you would have a problem if you included a file twice. See what others say?

You don’t bind functions to types; instead you define multiple methods for a generic function:

printActivity(e::Employee) = println("I work at $(e.company)")
printActivity(s::Student) = println("I study at $(s.school)")

#4

To follow up, here’s how I would structure your example if I were writing it:

shoes.jl

type Shoes
  shoeType::String
  colour::String
end

people.jl

abstract Person

type Student <: Person
  name::String
  age::Int8
  school::String
  shoes::Shoes
end

type Employee <: Person
  name::String
  age::Int8
  company::String
  shoes::Shoes
  monthlyIncome::Float32
end

printActivity(e::Employee) = println("I work at $(e.company)")
printActivity(s::Student) = println("I study at $(s.school)")

main.jl

include("shoes.jl")
include("people.jl")

gymShoes = Shoes("gym", "white")
proShoes = Shoes("classical", "brown")

Marc = Student("Marc", 19, "Divine School", gymShoes)
MrBrown = Employee("Mr. Brown", 45, "ABC Corporation Inc.", proShoes, 5000)

printActivity(Marc)
printActivity(MrBrown)
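For readers on a current Julia version, here is a sketch of the same design in 1.x syntax (abstract is now abstract type ... end, and type is now mutable struct; plain struct is used here since nothing is mutated). The activity and describe functions are hypothetical additions that return strings, so the run-time polymorphism asked about in the first post can be seen when looping over a Vector of Person.

```julia
abstract type Person end

struct Shoes
    shoeType::String
    colour::String
end

struct Student <: Person
    name::String
    age::Int
    school::String
    shoes::Shoes
end

struct Employee <: Person
    name::String
    age::Int
    company::String
    shoes::Shoes
    monthlyIncome::Float64
end

# One generic function, one method per concrete type.
activity(s::Student) = "I study at $(s.school)"
activity(e::Employee) = "I work at $(e.company)"

# A method on the abstract type; it relies on every subtype having a `name` field.
describe(p::Person) = "$(p.name): $(activity(p))"

people = Person[
    Student("Marc", 19, "Divine School", Shoes("gym", "white")),
    Employee("Mr. Brown", 45, "ABC Corporation Inc.", Shoes("classical", "brown"), 5000),
]

# The method of `activity` is chosen from the run-time type of each element.
lines = [describe(p) for p in people]
foreach(println, lines)
```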

#5

Yep, thanks. I think I got the concept (this post on composition vs inheritance is what helped me mostly).

This is the revised example written the Julia way (in a single file for convenience):

type Person
  myname::String
  age::Int64
end

type Shoes
   shoesType::String
   colour::String
end

type Student
   s::Person
   school::String
   shoes::Shoes
end

function printMyActivity(self::Student)
   println("I study at $(self.school) school")
end

type Employee
   s::Person
   monthlyIncomes::Float64
   company::String
   shoes::Shoes
end

function printMyActivity(self::Employee)
  println("I work at $(self.company) company")
end

gymShoes = Shoes("gym","white")
proShoes = Shoes("classical","brown")

Marc = Student(Person("Marc",15),"Divine School",gymShoes)
MrBrown = Employee(Person("Brown",45),1200.0,"ABC Corporation Inc.", proShoes)

printMyActivity(Marc)
printMyActivity(MrBrown)

That said, I still “miss” the opportunity to organise my code across a diverse range of relations between objects, rather than using only composition.
After all, the expressivity of a language is how directly the code can mirror the problem under investigation. Traditional OO design can recognise the different concepts of relations between objects: specification (e.g. Person->Student), composition (Person->Arm), and weak relation (Person->Shoes).
Julia, from what I understood in a few hours, implements only composition. Yes, you can tweak the code to get the other two forms of relations implemented, but to me that comes at the cost of losing expressivity.
Cheers!

EDIT: Sorry, I had just finished typing the message when I saw you had also sent me a redesigned implementation of the example. Thank you.


#6

Best you forget about OO whilst learning Julia, at least for me that helped a lot. OO-patterns are certainly good at times, but often not the most Julian design.


#7

I agree with you: for me, at least, composition is not an obvious approach to express a specification such as Student <: Person, as you put it. That’s why I think a way to copy fields between types will be useful. See also https://github.com/JuliaLang/julia/issues/19383 for more discussion on this. But I also agree that there is a definite degree of throwing away learned OO patterns. I never got as far as doing much OO stuff anyway so it was a very easy step for me :wink:


#8

I kind of like the way Optim.jl deals with multiple types that all have the same fields:

macro add_generic_fields()
    quote
        method_string::String
        n::Int64
        x::Array{T}
        f_x::T
        f_calls::Int64
        g_calls::Int64
        h_calls::Int64
    end
end

type LBFGSState{T}
    @add_generic_fields()
    x_previous::Array{T}
    g::Array{T}
    g_previous::Array{T}
    rho::Array{T}
    # ... more fields ... 
end

#9

First let me say that when you first start using Julia, you think you’ll miss OO programming and its principles, but you really won’t. Once you begin to actually design differently, the code is so much cleaner that you’ll realize why no one has really been working on inheritance and things like that: it’s just not needed in the vast majority of cases. Proper use of dispatch will do that with cleaner and faster code.

But to build off of what @cortner said, the beefier version of what he’s doing there is the @def macro. You can always use the @def macro to copy code around:

  macro def(name, definition)
      return quote
          macro $name()
              esc($(Expr(:quote, definition)))
          end
      end
  end

Its usage is very simple:

@def give_it_a_name begin
  a = 2
  println(a)
end

and now anywhere in your code you can plop @give_it_a_name and it will paste in that code at compile time. To show another use, it can be used for the Optim.jl fields:

@def add_generic_fields begin
        method_string::String
        n::Int64
        x::Array{T}
        f_x::T
        f_calls::Int64
        g_calls::Int64
        h_calls::Int64
end

and now

type LBFGSState{T}
    @add_generic_fields
    x_previous::Array{T}
    g::Array{T}
    g_previous::Array{T}
    rho::Array{T}
    # ... more fields ... 
end

Since it’s at compile time, it’s zero runtime cost for it to be from this macro, and it even gets line numbers correct for error messages and debugging. And since it is an easy way to enforce code re-usability, I find that it leads to very clean code.

I tend to use it when I know that I would want a function call, but that function would have like 10+ parameters, and I’d have that function in many different places which all have “the same setup” (I know by design the same variables will be accessible with the same names). There are multiple reasons:

  1. Functions with lots of parameters don’t inline. In some cases this can incur a larger cost than I’d like, and I write libraries that I want to be super optimized (I removed a function call yesterday and replaced it with this kind of macro usage and got a 2x speedup in something that already runs faster than Fortran codes…)
  2. I don’t want to deal with “managing changes to huge function call signatures”. These huge parameter lists may be necessary if I made them functions. Sure, I can wrap everything in types and pass the types, but this is the easy way out that leads to clean code on both sides.
  3. If you copy/paste code into a function, it may not always act the same. One big difference: if you change immutables in a function, they only change outside the function if you return them. With the @def macros, you can plop in code that changes immutables without having to return anything, which can be very helpful in many cases I work with. If you prototyped with some copy/pasting, this macro will simply do that copy/pasting so you know it will work the first time, yet still get rid of duplicated code.

This kind of setup shows up a lot when dealing with inheritance-like ideas, so this is a quick and easy way to just do it.

In the end, if you want to have any generic piece of code re-used, this is a very simple way to do it. I know not everyone will be on-board with it because it violates locality (it doesn’t specify what variables will be used in there), but for prototyping and when you know you have repeated designs, this thing is a productivity and performance beast.


#10

That is much nicer - thanks for pointing that out.


#11

The short answer to the topic’s question is found in Julia itself:

julia> type MyType end

julia> typeof(MyType)
DataType

julia> typeof(DataType)
DataType

Therefore, Julia doesn’t have an Object model, but a DataType model (and a good one).

If people need real OO, an Object model will have to get implemented on top of the existing DataType model, probably as an external package. Of course Julia could take steps to make that easier (e.g. to allow overloading of the dot operator etc.) Meanwhile, the best practice is to use the existing model as it is intended to be used, instead of testing its expressiveness’ limits. Still Julia is expressive enough to make possible various workarounds, for those who may find them useful.
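For what it's worth, overloading of the dot operator did arrive in later Julia versions (0.7 and 1.x) as Base.getproperty. A minimal sketch, assuming current Julia and the composition layout from the revised example earlier in the thread (the field name person and the forwarding rule are choices made for this illustration, not something proposed by the posters):

```julia
struct Person
    name::String
    age::Int
end

struct Student
    person::Person   # composition: a Student holds a Person
    school::String
end

# Forward any property that Student does not have itself to the wrapped Person.
function Base.getproperty(s::Student, name::Symbol)
    if name in fieldnames(Student)
        return getfield(s, name)
    else
        return getproperty(getfield(s, :person), name)
    end
end

marc = Student(Person("Marc", 15), "Divine School")
println(marc.school)   # own field
println(marc.name)     # forwarded to marc.person
```

This keeps composition as the underlying model while letting s.name read as if the field were inherited.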


#12

In case you’re curious for how to do this:


#13

Although I wouldn’t call that real OO, I did play with closures in the past, when I made a quick translation of existing OO server code into Julia. While it did work and was fun for a while, I soon had enough and opted for a complete rewrite, which I didn’t regret. Because when the code turned into proper Julia, I experienced a revelation of ways to improve the whole design, which were well hidden by the previous versions. Of course, in later translations of other OO code I opted directly for proper Julia.
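For readers wondering what the closure approach looks like, here is one common pattern, sketched in current Julia (named tuples need 0.7 or later). It is only an assumption about the general idea, not the code from the post referred to above; make_counter and its functions are hypothetical names.

```julia
# Emulate an "object" with private state using closures.
function make_counter(start::Int)
    count = start                   # captured, private state
    increment() = (count += 1)      # "method" that mutates the state
    current() = count               # "method" that reads the state
    return (increment = increment, current = current)
end

c = make_counter(10)
c.increment()
c.increment()
println(c.current())
```

The c.increment() call syntax looks object-oriented, but there is no dispatch on the "object" and no type hierarchy, which fits the comment above that this is not real OO.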


#14

As a general comment, relying on a well documented set of methods that has to be defined for your subtypes of an abstract type is cleaner than relying on subtypes to have certain fields. For instance, all valid subtypes of AbstractArray specialize the size() method but don’t have to have a size field.

Sometimes this will feel like standard old (boring) C++ getters and setters, but the approach allows for better abstractions (e.g. size might not exist as data, but be a property of the type).

I tend to organise my code around generic methods of abstract types in appropriate files, and separate files containing concrete types and the small number of methods they need to conform to the interface. E.g. making a new abstract array only needs 4 or 5 methods (including constructors) defined.
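A minimal sketch of that interface idea in current Julia syntax. OneHot3 is a hypothetical read-only vector type made up for this example: its size is a property of the type, not stored data, and defining size and getindex is enough for generic AbstractArray code to work on it.

```julia
# Hypothetical type: a length-3 vector with a single 1 at position `hot`.
struct OneHot3 <: AbstractVector{Int}
    hot::Int   # index of the single nonzero entry
end

# The size is not stored anywhere; it is a property of the type.
Base.size(::OneHot3) = (3,)

function Base.getindex(v::OneHot3, i::Int)
    @boundscheck checkbounds(v, i)
    return i == v.hot ? 1 : 0
end

v = OneHot3(2)

# Generic functions written for AbstractArray now work without further code.
println(length(v))
println(sum(v))
println(collect(v))
```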