```
pip install kuzu
```

Then head to the Kuzu documentation. The v0 release is stable enough to build on.
```python
import kuzu

db = kuzu.Database("./my_graph")
conn = kuzu.Connection(db)

# Create schema (optional - you can also create on the fly)
conn.execute("CREATE NODE TABLE Person(id INT64, name STRING, PRIMARY KEY(id))")
conn.execute("CREATE REL TABLE Knows(FROM Person TO Person, since INT64)")

# Insert data
conn.execute("CREATE (p:Person {id: 1, name: 'Alice'})")
conn.execute("CREATE (p:Person {id: 2, name: 'Bob'})")
conn.execute(
    "MATCH (a:Person), (b:Person) WHERE a.id = 1 AND b.id = 2 "
    "CREATE (a)-[:Knows {since: 2020}]->(b)"
)

# Query: 2-hop neighbors
results = conn.execute("""
    MATCH (p:Person)-[:Knows*1..2]->(friend:Person)
    WHERE p.name = 'Alice'
    RETURN friend.name, COUNT(*)
""")
```
Kuzu stores properties column by column. Want the average age of all Person nodes? That's a sequential scan of one integer column. Want to count how many Knows relationships each person has? That's a column scan of `src` and `dst`.
Kuzu arrives to fix that friction. It is not a wrapper. It is not a key-value store pretending to be a graph. It is a purpose-built, embedded, columnar graph database written in C++.
If you have ever tried to run graph algorithms on a dataset with millions of relationships, you know the drill: spin up a Neo4j instance, manage Docker, worry about memory, or hack something together with NetworkX and watch it crash.
What graph workload would you run embedded? Let me know in the comments. Disclaimer: I am not affiliated with the Kuzu team — just an engineer who appreciates well-designed data infrastructure.
# Zero-Setup Graph Analytics: Diving into Kuzu v0

*An embedded columnar graph database that actually feels like a library.*