Did Modular just reinvent Julia?

And at the moment, Mojo doesn’t even support Unicode strings (except for the filename extension…!).

3 Likes

I’d like to add a note that as I’ve learned more about writing code with borrow checking and static typing, and tried out Mojo itself, I don’t think this makes the language harder; if anything, it makes programming much easier by eliminating a lot of bugs. Borrow checking is also very easy to learn; it just says “You can’t have multiple variables pointing to the same mutable object.” The complexity of Rust really comes from features of the language other than the borrow checker, setting aside a few pathological cases like linked lists, where it’s easy to opt out of it. Reference counting being slow doesn’t really matter if you hardly ever have to opt into it in practice.
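
To make the “multiple variables pointing to the same mutable object” rule concrete, here is a minimal Python sketch of the kind of aliasing it targets. Python has no borrow checker, so this only shows the situation such a rule is aimed at; the `normalize` helper is made up for illustration.

```python
def normalize(values):
    """Scale the list in place so its entries sum to 1, then return it."""
    total = sum(values)
    for i in range(len(values)):
        values[i] /= total
    return values


raw = [1.0, 1.0, 2.0]
weights = normalize(raw)  # `weights` and `raw` are two names for one list

# The caller's "raw" data was silently overwritten through the alias.
assert weights is raw
assert raw == [0.25, 0.25, 0.5]

# Making the copy explicit removes the aliasing. A borrow checker would force
# this sharing to be spelled out (a copy, or an explicit hand-over of the list).
raw2 = [1.0, 1.0, 2.0]
weights2 = normalize(list(raw2))
assert raw2 == [1.0, 1.0, 2.0]
```

Most array-style code already looks like the second half, which is why the rule rarely gets in the way there.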

I’m starting to see the major advantage of Julia over Mojo as being metaprogramming that lets you use simpler syntax. Mojo’s metaprogramming system is extremely simplistic and almost a one-for-one copy of Zig’s; it accomplishes roughly the equivalent of using constants like Val, Static.jl, or StaticArrays.jl (along with tools like LoopVectorization.jl). And also, of course, Mojo doesn’t exist yet and doesn’t have any kind of ecosystem, so I’m forced to evaluate Mojo more as a proposed language than an actual one. There’s also somewhat more boilerplate code than in Julia, and substantially more than in Python.

The main advantage of Mojo is it encourages static programming by making it “opt-out.” The equivalent to a Julia “type instability” in Mojo is declaring a dynamic Python class or def–you need to explicitly request dynamism to get it, instead of having it handed to you without a peep. The Mojo devs seem to think Julia is too dynamic to be a good language for writing libraries, pretty much the same way Python is. Besides that, they seem to be under the impression that contributing features to Julia would have been a major slog and would have taken forever (slow reviews, bikeshedding, etc.).

6 Likes

Further updates–I think I understand what Modular is doing now, and why they think they can make Python fast.

To oversimplify, there are 4 big things that make Python slow, and made people think it was impossible to fix Python performance:

  1. Reference counting–Python uses reference counting (RC) for its garbage collection, which is extremely slow. This can’t be replaced, because Python’s C interface requires objects to be destroyed as soon as they’re no longer needed.
    Solution: use a borrow checker for functions marked fn, destroying Python objects at no runtime cost. Surprisingly, this isn’t a major barrier to beginners. Borrow checking just says you can’t have multiple names for the same piece of mutable data. Most scientific Python code satisfies this already, because it’s simple array or dataframe manipulation; borrow checking is only really a pain if you’re creating complex data structures like trees, linked lists, or graphs.
  2. Dynamic typing–Python functions are always dynamically typed.
    Solution: Use Python’s built-in type declarations and type inference to identify types for most functions (already adopted in many projects). Let people opt out of static typing by using def instead of fn to define a function.
  3. Runtime class modifications–Python lets you add or delete methods or fields on a class at runtime. In practice, about 0% of Python classes use this “feature” (honestly, it’s a pure footgun), so you can just slap a label like struct on your class promising not to do that.
  4. GIL–Python somehow thought it was a good idea to make it impossible to run Python on multiple threads. Luckily, they’re in the process of removing that.
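
For point 3, a small Python sketch of what “runtime class modifications” means in practice, next to the closest thing plain Python already has to a fixed-layout promise, `__slots__`. This is only an analogy: it is not a claim about how Mojo’s struct is implemented, and the class names are made up.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
p.label = "corner"  # a new field, added after construction
Point.norm1 = lambda self: abs(self.x) + abs(self.y)  # a new method on the class
assert p.norm1() == 3  # instances created earlier see it too


class FixedPoint:
    __slots__ = ("x", "y")  # promise: instances have exactly these fields

    def __init__(self, x, y):
        self.x = x
        self.y = y


q = FixedPoint(1, 2)
try:
    q.label = "corner"
    added = True
except AttributeError:
    added = False  # no per-instance dict, so the new field is rejected
```

Note that `__slots__` only fixes the instance fields; methods can still be attached to `FixedPoint` at runtime, so it is a weaker promise than the one described above.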

It’s a pretty damn clever design. I like it, but the devil’s in the details.

7 Likes

This is literally the only way that you can add a field to a Python object. Some fields are added in __init__ (which happens at runtime), but many fields are added in other methods.
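
A quick way to see this in plain Python: an assignment in `__init__` is an ordinary runtime write into the instance’s dictionary, no different from one in any other method. The `Account` class below is a made-up example.

```python
class Account:
    def __init__(self, owner):
        self.owner = owner  # an ordinary runtime assignment

    def deposit(self, amount):
        self.last_deposit = amount  # this field first appears here
        return self


acct = Account("ada")
fields_after_init = sorted(vars(acct))

acct.deposit(10)
fields_after_deposit = sorted(vars(acct))

assert fields_after_init == ["owner"]
assert fields_after_deposit == ["last_deposit", "owner"]
```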

7 Likes

Right. What I mean is that almost all fields are added in __init__ or some other constructor function, and then no new fields are added after initialization. There are very few situations where the fields of a Python class actually need to be modified at runtime (i.e. couldn’t be declared statically instead).

1 Like

While I agree that dynamically adding fields or methods is certainly less common than being present in the constructor

this is completely wrong and I don’t understand why you are claiming it as fact

even in my own code base I use this “feature” handily once, and in (IMO) a controlled and clear way, so I do not consider it a hack or a footgun. and my code is not that large, so if I have found a use I am sure many others (aka >0%) have as well

3 Likes

Using it exactly once in your codebase sounds like “about 0%” to me :smile: Are you defining “about” as isapprox here? :wink:

If you want, that one class can be declared with class instead of struct, which will treat it like a dictionary (the way base Python handles classes). That has a performance cost but gives you that flexibility.


but it’s not

this package has only 5 types total, and one of them adds methods dynamically

feels a little bit more like 20% to me

2 Likes

It is pretty common to define fields in other methods. For example, a lot of fields in scikit-learn models get set in the fit method.

6 Likes

I’ve always thought the mutating fit was a terrible design. Just return a different type!

1 Like

Yes, but those fields could very easily be defined upfront. To convert from Python, you just need a tool that looks at all the methods, checks which fields get used, and then declares those fields on the class itself, so that self.field no longer first appears inside fit.
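
As a rough idea of how simple such a tool could be, here is a sketch using Python’s standard `ast` module to list which `self.<name>` fields each method assigns. This is my own illustration, not Modular’s tooling; the `Estimator` class and `assigned_fields` helper are hypothetical, and the sketch ignores things like `setattr`, inheritance, and aliases of `self`.

```python
import ast

SOURCE = '''
class Estimator:
    def __init__(self, alpha):
        self.alpha = alpha

    def fit(self, data):
        self.mean_ = sum(data) / len(data)
        self.n_ = len(data)
        return self
'''


def assigned_fields(class_source):
    """Map each method name to the set of `self.<name>` attributes it assigns."""
    result = {}
    for cls in ast.walk(ast.parse(class_source)):
        if not isinstance(cls, ast.ClassDef):
            continue
        for method in cls.body:
            if not isinstance(method, ast.FunctionDef):
                continue
            fields = set()
            for node in ast.walk(method):
                # An Attribute node in Store context on the name `self`
                # is the target of an assignment like `self.x = ...`.
                if (isinstance(node, ast.Attribute)
                        and isinstance(node.ctx, ast.Store)
                        and isinstance(node.value, ast.Name)
                        and node.value.id == "self"):
                    fields.add(node.attr)
            result[method.name] = fields
    return result


fields = assigned_fields(SOURCE)
# Fields that only appear outside __init__ are the ones to declare upfront.
late_fields = set().union(*fields.values()) - fields.get("__init__", set())
```

Here `late_fields` comes out as `mean_` and `n_`, i.e. the fields a converter would need to hoist into the class declaration.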

This is basically the Modular team’s plan–they think the vast majority of classes can be rewritten automatically (basically transpiling to Mojo) using very simple static analysis tools, and then the handful left behind can just stick to CPython.

2 Likes

in my situation they cannot

I am using a subclass to implement a method for the base class

yes, I am sure there is some HN-style blog post out there moaning about how this is Terrible and Ugly and I must refactor it at once, but tbh I suspect any other approach in this situation would be more verbose and harder to follow, so I’m going to carry on

It’s not a HN blog post so much as textbook programming principles that say parents shouldn’t depend on the behavior of their subtypes. If f(::AbstractArray) had code calling @invoke f(::Vector), that would indicate a problem with your design. (Part of why multidispatch is so great is because it makes it difficult to violate this rule without weird constructs like @invoke, so it automatically enforces dependency inversion.)

If your package has 5 classes and is just for personal use (not meant to be extensible), then sure, that’s fine. That being said, this would probably get a PR rejected in Python (probably caught and blocked by a linter).

I wouldn’t be surprised if hacked-together scripts use dynamic classes closer to 10-20% of the time, but those can just keep using class since they don’t really need performance.

2 Likes

textbook programming principles

the thing about textbooks is they don’t anticipate every real-world situation

I wouldn’t be surprised if hacked-together scripts

:man_shrugging: to be honest, I find this somewhat insulting. I don’t consider my code hacked-together, it was just the most straightforward and least verbose (and imo most understandable to a reader) approach even though it violates “basic principles.” when a hammer does the job, why use another tool?

please stop projecting your own style preferences onto “all code”

5 Likes

I’m sorry if calling it “hacked-together” came across as derogatory; that wasn’t my intent. When I said “hacked-together,” I meant it in the sense of e.g. Lisp hackers; a hacked-together project is just one that favors straightforward and less verbose code (for quick prototyping) over future extensibility (being able to subclass without breaking anything).

I think it’s a tradeoff rather than a style preference, and my point is that even if most code starts out like this, most high-performance code doesn’t end up looking like this after optimization. (That’s why Julia works!)

I don’t really care about the aesthetics; my point is that even if some Python code out there relies on this ability, you’re not going to find many uses of it in big Python libraries, which are the main draw for Python.

Also, decorators often depend on the ability to introspect and sometimes change the internals of a passed-in function or class. They have also become quite widespread in modern Python.

2 Likes

Which decorators do that?

@latexify extracts information from the AST and numba.@jit seems to work at the byte-code level. I’m sure there are more examples (had seen something recently, but cannot recall what it was at the moment) …
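
To illustrate the general mechanism (not how latexify or numba are actually implemented), here is a minimal sketch of a decorator that inspects the compiled code object of the function it wraps. The `describe` decorator and the attributes it attaches are made up for this example.

```python
import dis
import functools
import math


def describe(func):
    """Wrap `func` and attach a summary of its compiled code to the wrapper."""
    code = func.__code__  # the compiled code object behind the function

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    # Positional parameter names come first in co_varnames.
    wrapper.arg_names = code.co_varnames[:code.co_argcount]
    # Global and attribute names referenced by the bytecode.
    wrapper.names_used = code.co_names
    # Number of bytecode instructions (varies between Python versions).
    wrapper.n_instructions = sum(1 for _ in dis.get_instructions(func))
    return wrapper


@describe
def hypot(a, b):
    return math.sqrt(a * a + b * b)


assert hypot(3, 4) == 5.0
assert hypot.arg_names == ("a", "b")
assert "sqrt" in hypot.names_used
```

Anything that compiles Python ahead of time has to either keep this kind of introspection working or fall back to the regular interpreter for decorated functions.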

2 Likes

Assuming one out of four of all classes or functions ever written in Python falls into the “special case” category (which I think is a veeery generous assumption), the point that @ParadaCarleton is trying to make still stands. Namely, that (I quote) “they think the vast majority of classes can be rewritten automatically”.
With this pessimistic assumption from above, you still optimize 75% of all classes & functions. Not bad!

As for numba.@jit, my understanding is that code would either get faster once the decorator is removed, or be no slower if it is kept. That’s still a win (in the sense that the work done to accelerate Python code is not lost, and it’s easy to go further).

Side note: ParadaCarleton was making a general point (never intended to cover all special cases). In that frame, I don’t see the added value in bringing a special case if it doesn’t invalidate the logic.
(conversely, if the very purpose of the thread were to get a precise, thorough, and comprehensive understanding of the situation, then I understand)

Anyway, @ParadaCarleton, thanks for bringing that up. This suggests that Mojo may challenge Julia more directly than expected (at least to me), if they manage to keep the simplicity of Python syntax (compared to the extension they’ve defined).

2 Likes

I think that’s right, but these feel very tame compared with what people do in Julia. Python’s metaprogramming is clunky and pretty far behind Julia’s (thus PEP 638’s proposal for true syntactic macros, but it seems dead for now).