How can we create a leaner ecosystem for Julia?

I am also a fan of modular, nicely interoperating packages that do one thing well (forming a flexible “toolbox”), as opposed to a single umbrella library (“suite” or “toolkit”) that tries to do everything. People coming from other languages where such packages are the norm often find the Julia ecosystem too confusingly diverse, but it works rather well. Cf


Would it make sense to use the global registry to appoint “coordinators” per big domain area (broadly corresponding to the subcategories on this forum)? People submitting new packages would be expected to select keywords (possibly even in the manifest?). The coordinators would be people who agree to look at the package registrations in their area, to ensure things run smoothly, that conflicts between packages get resolved amicably, to orient new contributors, to make the usual “hey, this package looks great! Can you clarify what it adds wrt X, Y and Z? Perhaps this feature could be merged into that package?” comments, etc. These people would not necessarily be core contributors to key packages, but rather people who have some time to devote to this and know the community. To sweeten the deal, these people could get free admission to JuliaCon or something.


I don’t understand what kind of conflicts you are talking about here, can you please clarify?

I am not sure we should demand this as part of registration. People should feel free to experiment with new approaches without worrying about a justification.

Of course, if people feel like they have the time and resources to start some kind of curated registry, they should go ahead, it is pretty easy these days. Personally I think that what you are proposing is a lot of thankless work and it may not be possible to incentivize it easily, especially for people who have the necessary skills.

I think that we should just improve discoverability instead. FWIW, I think that asking on this forum is generally the best approach for selecting a package.


Conflicts in terms of different method names for similar things, which usually get resolved with DomainBase-type packages.
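To illustrate what I mean by a “DomainBase type” package, here is a minimal sketch. All the names (`ShapesBase`, `SquaresPkg`, `CirclesPkg`, `area`) are hypothetical and only for illustration; in practice these would be three separately registered packages rather than modules in one file.

```julia
# Hypothetical interface package: it owns the function name but defines no methods.
module ShapesBase
    export area
    # Generic function stub; downstream packages add methods for their own types.
    function area end
end

# Hypothetical package A extends the shared function instead of inventing its own name.
module SquaresPkg
    import ..ShapesBase
    struct Square
        side::Float64
    end
    ShapesBase.area(s::Square) = s.side^2
end

# Hypothetical package B does the same, so both packages agree on `area`.
module CirclesPkg
    import ..ShapesBase
    struct Circle
        radius::Float64
    end
    ShapesBase.area(c::Circle) = pi * c.radius^2
end

using .ShapesBase

# User code can mix types from both packages through the one shared name.
shapes = [SquaresPkg.Square(2.0), CirclesPkg.Circle(1.0)]
total = sum(area, shapes)
```

Because both packages add methods to the same generic function, there is no name clash and no need for either package to depend on the other.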

Of course we should not demand it, but we should certainly suggest it. It’s very often the case that people are not aware of what others are working on (perhaps because it’s WIP, perhaps because it’s niche, etc.) and such comments are welcome.

RE thankless work: it’s work some people are already doing voluntarily right now. Being “blessed” in some organizational way provides a sense of ownership that incentivizes this further.


For example, I’d like my Grassmann.jl package to replace various obsolete functionality from the GeometryTypes package, such as the Point and Simplex types. However, I know that if I started proposing such things, it probably wouldn’t gain much approval from other developers because it would require a bunch of work to change things. Overall, it would help with generalizing much functionality in the Julia ecosystem to transition to Grassmann, which will provide differential geometric algebra functionality. It would make me very sad though if I am expected to live off only my $17 a month, while I am laboriously making contributions that would enable the work of lots of people who are paid full salaries.

This is why I am going to continue working solo and making my own ecosystem. However, I would be open to start the discussion for replacing GeometryTypes with the more versatile Grassmann algebra. There is much else that could benefit from geometric algebra, which is a unification of various ideas. However, I will probably just continue working on my designs alone and build up my own personal ecosystem around it, since working alone allows me to think of much more advanced concepts for generalization without needing to explain and justify myself to other people (for free).

It would be nice to have my programming work integrated into the Julia ecosystem, but I have hesitated to propose such things because it is not worth the effort to explain my ideas to people. When I am working for free and living off of $17/month, it is better for me to just use my time to design my own ecosystem instead of getting into huge discussions about design with people who are paid full salaries but don’t understand my design choices.

GitHub organizations (or GitLab groups, Bitbucket teams, etc.) are a nice tool to attract individual projects, and drive their development under a common umbrella to reduce redundancies and improve consistency between them.

I want to share my positive experience in this regard: I started to develop RecurrenceAnalysis as one of those “hobby projects”. Some time later I was invited to join the JuliaDynamics organization and move my package there. This started a fruitful collaboration that helped to improve the package, providing it with much better documentation (consistent with all the other packages of the organization), a narrower focus (moving some features to other packages, e.g. DelayEmbeddings), and better performance.

Regarding the question of “more packages” vs. “better packages”, I think that JuliaDynamics is again an example of this not being a real issue: that organization has a set of packages dedicated to particular tasks, and also the DynamicalSystems library that installs them together.
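To make the “dedicated packages plus one library that installs them together” idea concrete, here is a rough sketch. The module and function names are hypothetical stand-ins, not actual JuliaDynamics code, and real packages would live in separate repositories instead of one file.

```julia
# Hypothetical stand-ins for two small, single-purpose packages.
module EmbeddingTools
    export embed
    # Split a series into overlapping windows of length d.
    embed(x::AbstractVector, d::Integer) = [x[i:i+d-1] for i in 1:length(x)-d+1]
end

module RecurrenceTools
    export count_recurrences
    # Count ordered pairs (i, j), including i == j, whose values differ by at most tol.
    count_recurrences(x::AbstractVector, tol::Real) =
        count(abs(a - b) <= tol for a in x, b in x)
end

# Hypothetical umbrella module: no functionality of its own, it only re-exports.
module DynamicsSuite
    using ..EmbeddingTools: embed
    using ..RecurrenceTools: count_recurrences
    export embed, count_recurrences
end

# Users who want everything load the umbrella; others load a single package.
using .DynamicsSuite

windows = embed([1, 2, 3, 4], 2)
close_pairs = count_recurrences([0.0, 0.1, 1.0], 0.2)
```

The small packages stay independently usable, and the umbrella adds convenience without duplicating any code.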

As others have already said, this does not happen magically; it takes one person (or a few) to lead and coordinate the collaboration. In this case it is @Datseris, whose excellent work I want to praise. In my opinion this is a success story that may be used as an example of how to move towards such a leaner ecosystem. Perhaps George might share his experience, and tell us what resources he has needed to achieve this, good practices, etc.


Thanks a lot for pointing this out @heliosdrm. I’ll try to add my side, although there isn’t much to add beyond what you said. I’ll try to outline how the JuliaDynamics story became a success story. I think I can summarize my thought process in trying to make JuliaDynamics a “successful” org (even though this is totally subjective) in the following points:

  1. As you pointed out, someone (or a group of people) should have the drive (and the willingness to spend some extra time) to bring a high-quality organization for a specific “genre” into Julia. This group will help the org by coordinating things. My “genre” was dynamical systems and nonlinear dynamics, and for JuliaDynamics I do most of the coordination.
  2. Documentation is extremely important: it attracts users as well as developers to join the org. Everyone involved should be spending time building quality documentation: for the community, well-documented functionality is better than just having one more function implemented.
  3. Avoid redundancy and duplication: as pointed out in the very first post of this thread, there are dozens of packages that do similar things. To this, a community should collectively say “NO”. There should be as few packages as possible, the best ones, that do all (or most) things the best way and just work flawlessly with each other (this is exactly what happens in JuliaDynamics, for example). As @heliosdrm pointed out, this was yet one more benefit of joining the org: we compared the same features we had in two different packages, and kept the best implementations. I am very aware of how “wrong” this statement sounds to many here, as the claim is that “this is not how open source works, everyone makes what they want”, etc. Well, in my eyes the best thing for a language is to have the MINIMUM AMOUNT of packages that are the BEST in what they do, while REUSING as much code as possible. It’s up to the community of the genre under discussion to think about what they want…
  4. The members of this org should actively “scout” relevant packages, and invite people to join. Having things together in an org is better for the community. And when you do find and invite some package, be good about it: help them join, help them get better documentation, help them get better code, help them get hooked up into the ecosystem. I think this worked really well with RecurrenceAnalysis.jl and Agents.jl, where I believe both packages saw real code improvements as well as an increase in popularity after they joined. But @heliosdrm and @Ali_Vahdati should speak for themselves, as they were the original owners.
  5. An important thing that I’ve only recently come to realize, and that is necessary for packages to join an organization and improve both themselves and the organization, is the following: developers should care more about getting a better ecosystem for Julia than about having their “name as the name of the owner of a package”. (You might think that this point is “ridiculous”, yet I claim that it is one of the driving reasons that there are 2 dozen packages that do the same thing.)

Ultimately, and this has been discussed in this post many times, this takes not only time, but also willingness to collaborate and also drop parts of your “status” regarding a repo. It is up to the community at hand whether there is someone that is willing to spend some extra time or not.

As far as my life goes, I can say it is totally worth it to spend extra time trying to coordinate JuliaDynamics: everyone involved, including myself, just becomes happier because things improve collectively and collaboratively, and that is just cool!


Perhaps we could eventually reach a state like the Task Views in the R ecosystem? Julia is clearly still a rather young programming language, and the Task Views did not fall from the sky for R either.

But there the wheat is separated from the chaff and the user can rely on the quality of the packages. Task Views are not expensive, but they cost effort and time.

Perhaps invest a few million dollars and hire some devs to make your dream come true?

What I don’t want is for an organization with a lot of money to take care of this problem and then let its own structure dominate everything (R users know what I mean!)


The closest thing to Task Views in Julia, I think, is https://pkg.julialang.org/docs/ and its tags. In its current state, the descriptions are not good enough to tell packages apart, and some tags are not clear enough. But it is very helpful for discovering useful and interesting packages.


I already said it, but want to confirm that I totally agree with this.

As a side note, regarding the concern of “dropping the ownership” when a package is transferred to an organization, I think that it is pointless to worry about that. Authorship should be acknowledged in the documentation and in the Project.toml file, since the name of the repository is very weak credit for a project in the case of free software. (Some projects have very good stuff but the original owners may abandon them, or for some reason the community of users may not be happy about the route taken by their development, and eventually they are forked to somewhere else.)
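For reference, here is a small sketch of how authorship is recorded in the `authors` field of a Project.toml, and how it can be read back with Julia’s `TOML` standard library. The package name, UUID, and author entries are made-up placeholders.

```julia
using TOML

# Made-up Project.toml contents; name, uuid and authors are placeholders.
project_toml = """
name = "MyPackage"
uuid = "00000000-0000-0000-0000-000000000000"
authors = ["Original Author <author@example.com>", "Later Contributor <contributor@example.com>"]
version = "0.1.0"
"""

# Parse the TOML text into a dictionary and list the recorded authors.
project = TOML.parse(project_toml)
for author in project["authors"]
    println(author)
end
```

The `authors` list travels with the package wherever the repository lives, so moving the repo to an organization does not remove it.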

Actually, I think that being part of a team may be a motivation to help authors keep interest in their own developments for a longer time.


You made me seriously think about moving https://github.com/cesaraustralia/DynamicGrids.jl to JuliaDynamics. But I don’t think it’s always as simple as you suggest. In this case it’s important that cesar gets some credit for paying for all the work I’ve done. Moving something to an org also complicates who eventually publishes the methods papers - that’s where having my name on it actually does matter to me.

Edit: Additionally, as we are still in development of Dispersal.jl and other packages that have specific goals and reports they need to produce for our contracts, I need to be able to change how things work frequently without consulting anyone! Organizations add some friction to that.


I think you bring up a very important thing that I haven’t seen discussed. Recent years have seen a big push in community standards to create a more friendly environment, but a lot of us need citations to keep a job. Organizations should seriously consider an official standard for citation and respecting contributions.


In most cases I’ve seen, moving a package to a JuliaX organization doesn’t affect credit and attribution much. (And in the long term, it might help because the package will be used more widely and cited more widely.) You can still write papers. You can plaster the citations at the top of your docs and/or README.md. I don’t see a problem with acknowledging funding or other support.

If you have a sudden influx of contributions, I think the expanded functionality would be worth some dilution in credit.


Does that happen? It feels like a lot of packages under orgs are “collectively unmaintained”.


I think it’s rare to have a sudden influx. I wrote that because I interpreted @Raf’s post as being concerned with losing recognition, and getting a sudden influx of contributors is one way that could happen.


Thanks for sharing this! I’m trying to kick off a project in the community for developing a tool many of us need. The first step was making a git org and inviting everyone who contributed to the RFC thread.

I think credit hogging is ridiculous… What I wouldn’t give to dilute credit from my only other real package… I’ve offered people to contribute and we’d write a paper, etc, and nothing. Tried to give my code away to similar efforts, nothing. Ego is a very real thing…

I have a day job - I’m not going to become a millionaire making open source software in a fringe language, so I have nothing to lose. As long as people give me a shout out when I do a lot of work, that’s fine by me. Just want to help people, learn things, and make some friends!


I like this idea @Albert_Zevelev, each domain on Discourse could have a set of coordinators organizing the existing features, packages, what is missing, etc.

I also like this idea very much @antoine-levitt, a set of coordinators that could help organize the plenty of packages out there with overlapping functionality. I volunteer for spatial statistics for example or anything spatial for that matter.

@Tamas_Papp I don’t think there is any demand here in @antoine-levitt’s comment. He is saying that packages with tags could be reviewed in the General registry. At the moment the bot allows some time before merging new packages, but someone from the domain rarely reviews the work or adds it to a pool of packages in a coherent way. We can have guidelines for coordinators or reviewers.

What is being discussed here is not curation. It is organization of research and integration across different packages in a given domain.

@chakravala I think you are disregarding the fact that very few people know Grassmann geometry and the related theories. You have to keep in mind that people from other domains do not have time to dive into theories when they have other priorities to address. There are so many theories that I myself would like to learn more about at some point, but I can’t find the time. I can tell you for sure that you won’t succeed in convincing anyone to join if you can’t speak their language.

From what I have seen on this forum and from many colleagues I’ve made here, you are again taking a direction that is not aligned with open source and collaborative communities. People seeking collaboration do not pretend to be superior nor diminish the work of others. Show that you are smart by teaching as opposed to showing off. I am certain that your work would be more valued.

Also, let’s not deviate too much from the original goal of this thread, which is to improve the organization of work done by the community. We have enough gripes for the start of the year. Let’s delay them to some later time and in the #gripes channel on slack.

That is a great story @Datseris, thanks for sharing. I fully agree with points 2 and 3 about good documentation and about avoiding duplication/redundancy. The other points related to organizations on GitHub are also interesting, but I am not sure I fully agree with them. As I mentioned in another post, organizations are great when there are multiple people with a similar vision of the domain working on various related packages. When the domain is too niche, and contributors haven’t shown up yet, moving the package to an organization can lead to friction in the development cycle that actually makes things move slower than when the authors have full control over the commits that get merged, reviews, etc.

On the other hand I understand that by replacing the name of the package creator on GitHub with the name of an organization we can perhaps motivate more people from the domain to contribute. Perhaps the potential contributors were always there but didn’t want to contribute because of the name.

Another point that is tricky about organizations is that they may only reflect part of the story. Take my GeoStats.jl package as an example. It could possibly live in JuliaGeo or JuliaStats. Which one is more appropriate? Which community does it belong to? I thought about it multiple times, and my conclusion is that perhaps a new organization like JuliaGeoStats is needed, sitting in the intersection of these communities. However, creating a new organization kind of defeats the point in this case because people from both JuliaGeo and JuliaStats are potential contributors. Whenever I think about this organization issue, I come back to the status quo, which is that no organization is needed at the moment, but perhaps one will be in the future, when things settle down a bit more and people with a similar vision and skills have joined the story.


I think it’s great that you are all doing such a good job of collaborating!

I did not diminish the work of others. The reason why I say GeometryTypes is partially obsolete is mathematical: it has nothing to do with the quality of the work, only with the design intentions.

Changing people’s intentions and desires is a lot of work, which is why I have held back from collaborating.

If you’re confident that you could help improve the situation there, now would probably be the time to speak up: recently there’s been an effort to simplify things using GeometryBasics.


@Zach_Christensen that’s nice, but the Grassmann package has the geometry features I want. The starting point for me would be the Grassmann algebra: the algebra is at the root of everything and is a more fundamental and unified description of geometry, which would also make GeometryBasics obsolete from an algebra perspective. It’s better for me to start with the algebra and build up from there.