A serious data start-up structured around a Julia data manipulation framework for larger-than-RAM data

Yeah, I think a Spark replacement probably has more commercial opportunity than a Pandas/Polars replacement. As you mentioned, adding in distributed ML algorithms can help. Spark ships libraries for both distributed ML (MLlib) and distributed graph algorithms (GraphX).

That being said, I want the library to be free and open-source, so I’m not sure exactly how commercialization would work. Some kind of cloud-computing service? Integration with JuliaHub? :joy:
