KuzuDB or general GraphDBs

So a few days ago KuzuDB, which is basically the SQLite of graph databases, got archived.

That is unfortunate, since I was thinking about building a pipeline with tree-sitter, CocoIndex and Kuzu that builds a GraphRAG and feeds the Julia repositories and their documentation directly into a chatbot (I use Julia about 60% of the time).

Kuzu is still usable for now. Maybe someone should write a Kuzu.jl package, or look into building a similar graph database in Julia.

  • Is Julia a suitable programming language for such a project?
  • What are Julia’s options for serializing its structs and functions? Can one store Julia’s data structures and functions on disk and load them when the program needs to execute them?
  • Now that PackageCompiler.jl and JuliaC.jl exist, is it realistic to compile for Android and WASM? More generally, is it realistic to compile the database so that other languages can use it without shipping Julia?
  • Would someone be interested in that?

They are a YC-backed startup, though. I wonder why they ditched the project, as their database isn’t useless at all and quite a few projects already depend on it. Maybe they don’t have a good way of monetizing it.

2 Likes

I believe it’s at least possible, and Julia might be a very nice language to implement such a database in. One database has already been written in Julia, JuliaDB.jl, though it is now unmaintained and it wasn’t a graph database. But I think Julia has great graph capabilities in its packages, so I wouldn’t rule such a DB out.

I see:

As of version 0.11.0, a Kuzu database is now just a file on disk instead of a directory. This single-file design makes your Kuzu databases much more portable and easier to share or archive.

It’s unclear to me how large this file can get, i.e. is this an “in-memory database” (a sort of oxymoron)?

What does KuzuDB give you over e.g. Neo4j that you can use with:

[or Neo4j.jl, though the other driver seems better, is maintained, and is a replacement.]

Note that this one is written in Julia (at one point I thought it might be a database, but I didn’t look into it since it is proprietary, and I believe it also uses Snowflake as its database):

https://x.com/relationalai?lang=en

RelationalAI is the industry’s first relational knowledge graph coprocessor for your data clouds, with a mission to empower every decision with intelligence.

While JuliaDB is archived, it depends on MemPool.jl, which is not archived and might be useful. Blobs.jl might be useful too.

I see also:

Kuzu is working on something new! .. For those using Kuzu currently, prior Kuzu releases will continue to be usable in the same way without modifications to your code.

So it seems premature to drop KuzuDB. Are they planning a replacement? I’m not sure people would want to write a KuzuDB.jl client for an archived product that may not have a future, unless the replacement is compatible and uses the same client access protocol.

There’s a lot to think about when making a database, for example which query language to support: SQL, a graph-oriented one like Cypher, or something else. I see KuzuDB uses Cypher, same as Neo4j, so maybe something could be reused:

2 Likes

The number one reason for KuzuDB over Neo4j is its speed and latency, as well as its ease of use. In Rust you would just add it to Cargo.toml, import it and be done: no dealing with HTTP, Docker or anything else that requires you to run a database instance. It is like SQLite, but with all the features one needs for, let’s say, a GraphRAG behind a decent chatbot, so one can store vector embeddings, run Cypher queries, etc. For bigger data and documents, one would store only a UUID in Kuzu and load the payload from other databases like Redis or MongoDB, thus reducing the amount of data actually stored in Kuzu.
For something like a chatbot this latency is quite an advantage. If one builds something like a knowledge graph of an Obsidian vault, or actually needs the code stored in a code graph for computation, then Kuzu shines over all other graph DB options.
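
The UUID indirection described above could be sketched roughly as follows. This is only an illustration under loud assumptions: Python’s built-in `sqlite3` stands in for an embedded, in-process graph database (it is not Kuzu and does not speak Cypher), a plain dict stands in for an external store such as Redis or MongoDB, and the helper names `put_document`, `add_node`, `add_edge` and `neighbour_documents` are made up for this sketch.

```python
import sqlite3
import uuid

# Stand-in for an external document store such as Redis or MongoDB (assumption).
blob_store = {}

def put_document(text):
    """Store a large payload externally and return the UUID that references it."""
    key = str(uuid.uuid4())
    blob_store[key] = text
    return key

# sqlite3 stands in for an embedded graph database: in-process, no server to run.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, label TEXT, name TEXT, doc_uuid TEXT)"
)
conn.execute("CREATE TABLE edge (src INTEGER, dst INTEGER, rel TEXT)")

def add_node(label, name, document=None):
    """Insert a node; the graph keeps only a UUID pointing at the large document."""
    doc_uuid = put_document(document) if document is not None else None
    cur = conn.execute(
        "INSERT INTO node (label, name, doc_uuid) VALUES (?, ?, ?)",
        (label, name, doc_uuid),
    )
    return cur.lastrowid

def add_edge(src, dst, rel):
    """Insert a directed, typed edge between two node ids."""
    conn.execute("INSERT INTO edge (src, dst, rel) VALUES (?, ?, ?)", (src, dst, rel))

def neighbour_documents(name, rel):
    """Follow `rel` edges from the named node and resolve UUIDs to documents."""
    rows = conn.execute(
        "SELECT n2.name, n2.doc_uuid FROM node n1 "
        "JOIN edge e ON e.src = n1.id "
        "JOIN node n2 ON n2.id = e.dst "
        "WHERE n1.name = ? AND e.rel = ? ORDER BY n2.id",
        (name, rel),
    ).fetchall()
    return [(n, blob_store[u]) for n, u in rows if u is not None]

# Hypothetical sample data: a package node that exports one documented symbol.
pkg = add_node("Package", "Lux.jl")
fn = add_node("Type", "Dense", document="placeholder documentation text for Dense")
add_edge(pkg, fn, "EXPORTS")

docs = neighbour_documents("Lux.jl", "EXPORTS")
```

The graph side stays small because each node carries only a short key, and the lookup into the external store happens after the traversal has already narrowed down which documents are needed.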
There are faster options than Neo4j by now, like FalkorDB, which claims to work from Julia as well, apparently via xyxel/RedisGraph.jl (the RedisGraph Julia client on GitHub).
Other options are HelixDB, NebulaGraph and CozoDB.
They are all faster than Neo4j.
The two things I had in mind for something like Kuzu were:

  • Build a CodeRAG that feeds exactly the context of what one is doing (and nothing else) into the LLM. Let’s say I use some things from Lux.jl: it would then look up exactly the types I’m using right now, exactly the functions I call, the current state of the documentation and the piece of code I’m writing, instead of being distracted like a chatbot by training data from some years ago.
  • Some cases require fast retrieval and execution of code, like certain proofs in Coq used by formal verification engineers, which have lots of intermediate steps that can be reused. That might even apply to Julia, since Julia relies on precompilation more than other languages do, and storing intermediates in a graph DB might benefit a project.
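
The first idea, collecting only the context that the current snippet actually touches, could look roughly like the sketch below. Everything in it is an assumption made for illustration: `CODE_GRAPH` is a hand-written toy graph with placeholder documentation strings, a regular expression stands in for a real parser such as tree-sitter, and `symbols_in` and `collect_context` are made-up names rather than part of any existing tool.

```python
import re
from collections import deque

# Hypothetical code graph: symbol -> kind, documentation text, symbols it uses.
CODE_GRAPH = {
    "Chain": {"kind": "type", "text": "docs for Chain (placeholder)", "uses": ["Dense"]},
    "Dense": {"kind": "type", "text": "docs for Dense (placeholder)", "uses": []},
    "setup": {"kind": "function", "text": "docs for setup (placeholder)", "uses": []},
    "Conv": {"kind": "type", "text": "docs for Conv (placeholder)", "uses": []},
}

def symbols_in(snippet):
    """Return known symbols mentioned in the snippet, in order of first appearance.

    A regex tokenizer is a crude stand-in for a real parser (assumption).
    """
    found = []
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_!]*", snippet):
        if token in CODE_GRAPH and token not in found:
            found.append(token)
    return found

def collect_context(snippet, max_hops=1):
    """Breadth-first walk from the symbols used in the snippet, up to max_hops edges."""
    queue = deque((sym, 0) for sym in symbols_in(snippet))
    visited = set()
    context = []
    while queue:
        sym, depth = queue.popleft()
        if sym in visited:
            continue
        visited.add(sym)
        context.append((sym, CODE_GRAPH[sym]["text"]))
        if depth < max_hops:
            for dep in CODE_GRAPH[sym]["uses"]:
                queue.append((dep, depth + 1))
    return context

# Hypothetical snippet the user is currently editing.
snippet = "model = Chain(layers)\nps, st = setup(rng, model)"
context = collect_context(snippet)
names = [name for name, _ in context]
```

Here `Conv` never reaches the prompt because nothing in the snippet refers to it, directly or within one hop. In a real pipeline the toy dict would be replaced by queries against the graph database, and `max_hops` would control how much surrounding material gets pulled in.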

I really hope they don’t give up on such a project. Maybe I’ll wait a little, but it should be beneficial to eventually make it usable from Julia or to parse Julia with it.

1 Like

Seems possible: Graph Database in Julia - #6 by jeffleesyn