Is PyJulia org, e.g. juliacall (or pyjulia) Python packages maintained?

I mostly think of JuliaHub as a services company. I would expect to pay JuliaHub for help to accomplish a task as opposed to asking them to fund a specific project.

4 Likes

I see, thanks. Then I’m not sure what needs to be reallocated… surely there are some people being paid to develop Julia (core/standard lib) rather than as a volunteer? JuliaPy is in support of Julia itself, so I really think it should receive more support (whether in dev time or other) from whatever entity is supporting dev of Julia…

Julia is under the NumFocus nonprofit umbrella, so someone (with involvement from the package developers) could write a proposal to NumFocus to request some funded development of JuliaCall etc.

6 Likes

SGTM. If someone can lead the proposal (maybe Chris Doris if he is interested) I can do my best to help.

3 Likes

I’m not sure if it is “reallocation” that is needed here. We just need a way to allocate resources for this specific purpose.

In particular, using Julia from Python is perhaps of particular utility to Python users. I note that all current solutions to access Julia from Python do not seem to take advantage of the most efficient path, using Julia’s C API via a C extension or Cython.

It’s possible to create a subproject. For example, there is one for Pluto.jl:

I view Python-Julia interop as such a high return-on-investment project for Julia itself that it is almost analogous to the introductory documentation. In other words, I think Julia people may want to view this interop as more a duty to work on rather than one of many packages solving a particular problem.

Just like introductory documentation, interop is not useful for experienced Julia purists – but it is extremely important to the longterm success of Julia and its wider adoption.

I just want to nitpick a bit here. While the immediate benefit is to Python users, the long term and much more significant benefit is to the Julia community. Good and reliable Python-Julia interop means:

  1. Organizations with large Python codebases will be able to gradually replace parts of them with Julia.
    • This is happening right now with Rust, with companies gradually replacing large parts of C++ legacy codebases. They don’t need to start from scratch, they can just do it gradually. Until, at the very end, their codebase is pure Rust.
  2. Increasing adoption by industry would bring many benefits to the Julia community, including increased core language support, more open-source releases in Julia, and probably even funding.
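
To make the "gradual replacement" idea concrete, here is a minimal sketch of a Python function that keeps its pure-Python implementation but can optionally delegate to Julia. The function names, the `backend` parameter, and the inline Julia kernel are all hypothetical, chosen only for illustration. The Julia path assumes the `juliacall` package is installed and that a Python list arrives in Julia as an iterable of numbers.

```python
from statistics import fmean


def mean_square_python(xs):
    """Pure-Python reference implementation (the existing code path)."""
    return fmean(x * x for x in xs)


def mean_square_julia(xs):
    """Same computation delegated to Julia via juliacall (hypothetical kernel)."""
    # Importing juliacall starts a Julia runtime, so the import is deferred
    # until this backend is actually requested.
    from juliacall import Main as jl

    # Assumption: the list is seen by Julia as an iterable of numbers.
    kernel = jl.seval("xs -> sum(abs2, xs) / length(xs)")
    return float(kernel(xs))


BACKENDS = {"python": mean_square_python, "julia": mean_square_julia}


def mean_square(xs, backend="python"):
    """Public API: callers do not need to know which language does the work."""
    return BACKENDS[backend](list(xs))
```

Callers keep using `mean_square(data)`, and a team could flip the default backend one function at a time once the Julia version is trusted.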

I did a brief stint in big tech a while ago and probed a bit whether Julia could be used internally. It was basically dead on arrival due to lack of seamless interop with their existing codebase (mostly Python). Hence my motivation.


Aside: I think this is actually why Chris Lattner et al. have put such a large emphasis on the design of Mojo being easy to embed in Python scripts. From his experience in industry he likely knows this is the only way to get your foot in the door in a large legacy codebase – even if the final goal is to have people write entire codebases in pure Mojo.

16 Likes

I strongly second the view of @MilesCranmer here. Even for small companies and teams, this can be a huge hurdle for Julia adoption even though it is often quite easy to argue that Julia would be technically superior for a given task. From the POV of a team using Python libraries it should not make any difference whether they use Julia or C++ under the hood. If we can get to this level, with easy to make, efficient, and well-maintained ways to wrap Julia code from Python being unambiguously available, it would make a huge difference in my view.

6 Likes

Last time I checked, there was no way to call Mojo from Python.

As shown above, you can call out to Python modules from Mojo. However, there’s currently no way to do the reverse—import Mojo modules from Python or call Mojo functions from Python.

What Modular wants you to do is embed your Python code into Mojo.

1 Like

To be honest I don’t know much about Mojo – I just wanted to point out they have clearly made Python interop a priority, and have articulated the reason for this as making it easier to transition from Python.

Nobody disagrees that we need to continue making the interoperability tooling the best that it can be. It’s more of a question of who or how. Let me first dispel a few myths to put us on the right path and then describe what are the concrete steps that can be made:

JuliaHub is a cloud computing company which focuses on building and supporting software for modeling and simulation, such as Pumas (pharmacometrics, developed by a separate company Pumas-AI), JuliaSim (general industrial), and Cedar (EDA, circuit simulation). While many major contributors to many open source projects have worked at or are affiliated with JuliaHub, as a company it has to earn revenue, and open source contributions are not the business model. There are some teams focused more on open source than others, and those require grant funds to be sustained (to be described below).

Another source of a lot of contributors is the MIT Julia Lab, which for example is what funded Takafumi during his creation and maintenance of PyJulia. Most of the grants are focused on scientific projects, these days many are scientific machine learning or HPC related, or larger scientific collaborations like CESMIX. In that sense, there really isn’t anything special about the Julia Lab other than the fact that it has cultivated a culture where open source contributions are valued, and those who are very active in the open source communities are prioritized over “traditional academics”.

I am adamant that this is the right thing for academia in general. I would argue that open source scientific software developers are some of the most impactful and influential scientists in our current system. However, this bent of the lab is simply a cultural choice generated by Alan’s leadership and any other academic lab could also hire contributors. Indeed, I think there’s a good case to be made for doing so: instead of hiring a good neuroscientist, hire a neuroscientist who also happens to be working on the compiler for the language you’re using and can fix any bugs your research program is running into. For reference, Taka (pyjulia’s maintainer, developer of Transducers.jl and many threading capabilities in Julia) was a neuroscientist, and so was Valentin (who builds/maintains CUDA.jl, Enzyme.jl, and a lot of the Julia compiler), so this is only half a hypothetical. It’s not the average way to do science, many labs build one-off tools with workarounds rather than contributing to widely used ecosystems to make their experiments work, but I think there’s a good history to show that the latter has a more cumulative effect and in the end is a better choice!

So what can you do? A lot, since all grants are “open source” grants!

So okay, we’ve established that there’s nothing special. There’s JuliaHub and the Julia Lab, which do have a good number of contributors, but they are a company and a scientific lab like any other that just happen to also put the work in to value the time and have some (not all, and not the majority) individuals do open source contributions.

With that in mind, how could you do the same thing if you wanted to? Well for starters, there is nothing special about most of the funds to do this. Most of the grants are not “open source development” grants, those are very few and far between and could never sustain a community. Most of the funds are simply normal grants, climate models, pharmacology, etc. whatever the lab is doing. People who are “funded for open source” are actually funded for projects like “improved performance of climate models”, where you hire (or have a PhD student) an HPC/GPU specialist who happens to be a person who maintains CUDA.jl or part of the LLVM stack. This means that every grant is an “open source development” grant if you look at it the right way, and with every grant you can think of adding “Aim 3: Make it Scale” and have someone dedicated to the software sustainability and scaling aspect.

But if I don’t have “extra” cap space and want to start writing, what are good directions?

So if you don’t have an established lab with enough people that some hires could be understood to maintain libraries like pyjulia part time, then you need to find grant money. The good thing is, it’s not impossible and pretty much anyone can start doing this. I started writing grant after grant in my first year after PhD, so basically anyone with a PhD and an academic position that grants PI privileges can do it (that may seem to exclude postdocs by default, but in most institutions you can petition for PI privileges on a per-grant basis, so you can use this option and it can be used to bolster your academic CV).

So then you want to start writing, what grant? Here’s a few that I think have been the most helpful.

  1. NSF CSSI. https://new.nsf.gov/funding/opportunities/cyberinfrastructure-sustained-scientific . NSF grants are quite a PITA but these last a while (3 years IIRC) and can have a focus on cyberinfrastructure. These grants sustained many projects like JuMP, automatic differentiation work, and more.
  2. NumFOCUS periodically has the Small Development Grants (SDG): Small Development Grants - NumFOCUS. These are usually on the order of $10k, so it is usually only a side project unless you’re using it to help give someone early career or in a developing country some time for independent focus work.
  3. The Chan Zuckerberg Initiative has grants for software development (Science, Education, Community, and Other Grants - CZI). These are much smaller, basically 1 person max for 2 years. Notably, a lot of the precompilation improvements were funded in part through a CZI grant helping fund a lab manager for Tim Holy’s lab to free up his hands (SciML Receives Chan Zuckerberg Institute Funding: Spatial SSAs, Identifiability, and Compile Times).
  4. NASA has the ROSES program that has some open source funding. For example, if someone has a current ROSES grant, you can submit a supplement by March 29th as part of Amendment 33: Supplement for Open-Source Science Final Text - NASA Science for open source development! Larger opportunities are every 3 years NASA Open Science Funding Opportunities - NASA Science.
  5. The Simons Foundation periodically has grants for open source Funding Opportunities.

And there’s many more I haven’t listed here. But again, usually domain-based funding sources (climate modeling, ecology, pharmacology, etc.) have much more funding, and so finding a way to integrate open source work into a “normal” grant is usually much more productive than looking for “open source grants”.

Conclusion: There is no Silver Bullet

If there was someone who was “allocating resources” of the Julia community, then it would probably be me. But as I have hopefully shown, there’s no silver bullet or wand of magic that can be waved to get even two developers to suddenly show up. Open source is actually built by volunteers, and any funds that exist in the system are also generated by efforts on volunteer time!

But, I hope that is actually empowering. While it may seem completely daunting, with internal thoughts of “how do we get a better Python to Julia bridge, what company can get $10mil to make this work?”, hopefully this shows that the community has never had any resources remotely close to this. Everything is built by having small contributions add up. It’s hiring a postdoc who is active in the community and having them spend 1/3 of their time maintaining the compiler, it’s putting the time in to get a $30k grant and fully funding someone in a developing country for a year, it’s collaborating with another lab on a grant to get them a lab manager so that they can do more contributions over the next year. This community wasn’t made on multi-million dollar contributions but the collective work of many individuals helping with whatever they could.

Call To Action

If you want to get something in motion, do feel free to reach out. Through the Julia Lab or JuliaHub, I do a ton of these (for reference, I probably write a letter of collaboration at least every other week), but each takes a lot of time and effort and so it would be great to have more people in the community take on leadership roles. In the immediate term, anyone with a NASA ROSES grant would be in a great position to lead an open source supplement as mentioned above. That’s only $50k ($25k if the academic institution takes half as overhead :sweat_smile:) but it can get someone started and is due end of March. Then the next NumFOCUS SDG CfP should open in about a week, and if someone is interested I could help you get this written (SciML’s SDGs are already known, so this would likely need to be something JuliaLang, but that of course makes more sense).

And if you have funds but don’t know who to hire, or want to add this information to the grant but don’t know who you can put as a potential hire, please get in touch. I could name more than one person off the top of my head who might be interested in such a role but don’t want to commit them publicly (especially without knowing caveats like if they’d have to move :sweat_smile:)

Meme-based tl;dr:

(now seriously though, off to finishing the next grant :sweat_smile:)

18 Likes

Julia’s easy interoperability with other programming languages (R, MATLAB, Python,
C/C++, …) is very attractive to me as a Julia user. It gives convenient “training wheels”
for me as I come up the Julia learning curve(s) and allows me to maintain collaboration
with colleagues who are doing their work in another language and its environment.

However, for any sustained development effort you need to have a demand in the
form of customers who can provide needed resources such as developers and/or
funding…

For sustainable open source projects you need developers, who typically come
from the users of the project or capability.

For context, Python has almost 50X more developers than Julia, and the GitHub repositories’
Languages tables show:

PythonCall & JuliaCall      94% Julia     6% Python
PyJulia                      9% Julia    89% Python     2% Other

I would say that PythonCall & JuliaCall are well suited for support and maintenance
by the Julia Community because it is basically a Julia package.

On the other hand, almost all the development in PyJulia is in Python and needs
support and developers that I believe must come from the Python Community if it
is to be viable. However, it might be difficult to recruit Python users for a project that
has a goal of enabling them to not use Python.

I prefer to see Julia development resources going toward things that benefit Julia
users (i.e. PythonCall & JuliaCall) and not towards things that are tilted to the benefit
of Python users while giving little or no direct benefit to Julia users.

2 Likes

I really don’t think it “should”!

I.e. I think juliacall is the future. What I do know is that PythonCall.jl just works, better than PyCall.jl, and is the future. Its juliacall half is a small addition, and I think it also works.

What I do not know is this: you can use PythonCall and PyCall in the same process, say through two of your dependencies, so if your main language is Python then the same should apply:

But I’m not sure, can you use juliacall and pyjulia together? Relatively few wrapper packages use either; I have no exact count, but likely more of them (for now) use PyJulia, though still few I think. If using them together is a problem, then one way is to make it work, or just abandon PyJulia for the better juliacall; people are migrating to it, and I don’t know of anyone migrating in the legacy direction, to PyJulia. Does PyJulia in its (main) docs link to juliacall? If not, it should!
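
As a small aside, a wrapper package could at least detect which of the two bridges is installed before deciding what to load. The sketch below is only an illustration (the helper name is made up). It relies on the import names `juliacall` for JuliaCall and `julia` for PyJulia, and it does not answer whether both can be loaded into the same process.

```python
from importlib.util import find_spec


def installed_bridges():
    """Report which Julia bridges are importable, without importing them."""
    # "juliacall" is JuliaCall's import name; "julia" is PyJulia's import name.
    return {name: find_spec(name) is not None for name in ("juliacall", "julia")}


if __name__ == "__main__":
    for name, present in installed_bridges().items():
        print(f"{name}: {'installed' if present else 'not installed'}")
```

Because `find_spec` only looks the module up, no Julia runtime is started by this check.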

All software has bugs, and I’m not ruling that out for juliacall (or any julia-python interop package), but if they need (financial) support it should go to juliacall.

It benefits Julia users, and juliacall benefits Python users, and indirectly Julia users.

Python users are used to relying on other languages, knowingly or not. Julia just needs to be as easy as using C or C++ from Python (or R). I think that’s actually already the case, it is easy, just not a small or ideal dependency for Python packages.

For R, C++ is used for the heavy lifting, and I think it requires users to have a C++ compiler installed. The compilation is still automatic for those R dependencies. We want the slower languages to use the much easier language Julia, and the same code works for Python and R. Already we have such examples. So we are not in competition, it’s their benefit, and helps Julia package development too. The option they have is to use e.g. Rust for speed, also an understandable choice… [and if they do then we can also use that code, but it would be less generic, and for R users it would also require a Rust compiler…(?)]

1 Like

We (and we all?) are very much in agreement, it’s just a question of how.

Isn’t that what juliacall already does, use Julia’s C API (or at least could do)? Or maybe not, i.e. what PythonCall.jl does is use Python’s (stable) C API (I suppose) to interact with it, and juliacall, its addition, is some small extra code. However it’s done, which language’s C API is used may not matter, but note that Julia’s C API isn’t fully stable. I.e. it keeps changing, and people complain. Though I guess a subset of it is very stable.

If you fully compile your Julia package, or at least for the machine types, integers, and Float64, they should be usable from Python without the full Julia runtime. Right now, you get the full runtime, whether you need it or not, also in case not all the code is fully compiled.

The runtime code itself (e.g. the GC) isn’t even that large; it’s LLVM and the other dependencies that are large, and those you CAN already do without.

I think it might actually be worse to use Python’s C API (unless abstracted away with juliacall), in case you want your code to also work from R. The goal should be to just make a Julia package, for just Julia, or Julia + Python, but also R, and whoever else wants to use it.

You mean because of PyJulia, which you’re (more?) familiar with from before juliacall, or is juliacall also not good enough?

@mkitti, Pluto.jl is a Julia-only project, AFAIK, the notebook files are pure Julia, so what do you mean? You can use Python and other languages from it, e.g. with PythonCall, i.e. with Julia still the main language, but do you mean Python can be the main language (and use juliacall possibly)? I don’t think it’s possible unless I missed something, and I don’t see you supporting it. Did you mean to point out that people who want it could donate to get such work implemented?

1 Like

Mojo can call Python (I don’t know about possible limitations, maybe for Dicts?).

It can’t “currently” (likely will change, on some roadmap) call in the other direction, so we are ahead, and should explain that. They also support Linux and macOS, so implied not Windows, and we can call to and from on Windows, and all Julia supported platforms.

They have some syntax similarity, but Mojo is a much more complex language than Python. They are catching up to Julia, and a bit far ahead in at least one area (along with Rust), ownership model.

Maybe Julia should add such, but also maybe should have an official subset, like D has one, “better C”; we already have such informal subset, with StaticCompiler, where GC not allowed (and Dict not working, or all relying on allocations/GC, or exceptions…), and Windows didn’t work, but solution found.

We are behind Python in some areas, classic ML, but I see at Mojo, also:

MAX Engine incorporates best-in-class compiler and runtime technologies to create the world’s fastest and most extensible unified inference engine. It supercharges models in any format (including TensorFlow, PyTorch, and ONNX), runs on any hardware backend, and includes a Mojo graph extensibility API.

Maybe we should start thinking of interop with Mojo…, rather than (or more likely in addition) to Python :slight_smile: and it will take care of going to Python…

I understand why people adopt Rust, or static languages in general, for use from e.g. Python, but Julia is simpler than Rust or Mojo, so people should rather consider it. It’s not bad for Python users, as is.

1 Like

Quick comment:

@Palli I think you are getting confused by the names — “JuliaPy” is the host GitHub organisation of both PyJulia / PyCall.jl and JuliaCall / PythonCall.jl. When I say “JuliaPy should get more support”, that refers to all projects under that umbrella, not PyJulia (I can totally understand the confusion!). I don’t care how interop gets done; whatever works best.

1 Like

I don’t have much use for python integration, but I feel that the lack of integration for Arrow objects between Julia, and the underlying libraries that are used by python/r, is a major issue in Julia. Ideally it should be easy to move an arrow object back and forth between Julia and rust as zero copy, but Arrow.jl lacks the c interface for arrow objects, and jlrs can’t implement IntoJulia for arrow-rs objects. The closest I’ve seen is Polars.jl. That’s one area where I feel R and python are a bit better integrated. It would open up opportunities to implement algorithms on data in rust and benefit from its static analysis and concurrency, but I recognize that it makes the code more difficult to distribute to different platforms.

3 Likes

Yes, I always need to do a google search, I can’t get these names in my brain, it’s so frustrating…