I think I understand where this is coming from but I don’t see a problem.
Firstly, I don’t think DataFrames and JuliaDB have “the same goal”.
DataFrames is a lightweight package for in-memory operations, perhaps akin to Python’s pandas and R’s data.frame (although there’s nothing lightweight about […]). JuliaDB is a more ambitious project for persistent datasets and parallel out-of-core processing, perhaps akin to Python’s Dask and R’s (Microsoft’s) RevoScaleR. So they serve different audiences.
Secondly, I think this kind of thing is natural for open source - people can and will develop different things based on their needs and interests. This might indeed sometimes be frustrating and suboptimal for end users, but it is also good for innovation (and seems to be encouraged by the Julia community). The impression that Python and R have less of it is, in my opinion, rather deceptive. I think Python has less of it because the whole data stack in Python is kind of “bolted on”, and the initial investment is so high that it makes sense to gravitate around the initial/large projects like Pandas regardless of their shortcomings. R has less of it because most of the fundamental data structures (like data.frames) are built into the language, but there is little consistency there - compare, for example, the base R and data.table ways of doing things. Or think about the numerous plotting libraries available in both Python and R. On top of that, there is just the factor of time - R and Python have been around long enough that some packages / approaches have started to dominate and become de facto standards, although there were/are alternatives. That Julia has many packages for the same thing just reflects the youth and quality of the language (quality because one can actually write these kinds of fundamental packages with relative ease in Julia itself instead of in some lower-level language).
As for what would be the best future, I think it is a shared API and/or query language, both of which are being worked on - there are query frameworks (e.g. JuliaDBMeta) as well as active work on what a “table” interface should look like in general. Both of these efforts should provide a consistent API for end users regardless of the “backend” storage/manipulation format.
In sum, I think the “unsettling” feeling is just a normal reaction to learning, and things are actually going quite well in Julia in this regard.