During the second edition of DuckCon in Brussels, MotherDuck blogger Mehdi Ouazza interviewed DuckDB co-creator Hannes Mühleisen. Hannes has worked in the Database Architectures group for ten years, researching how data systems should be built.
How it started
In his work, Hannes discovered that some data practitioners, particularly in the R community, were not using databases at all. Instead, they relied on hand-rolled dataframe engines and in-memory dataframes. These dataframes were slow and limited by the way their engines were structured, and that observation was the first spark that led to DuckDB.
Databases are cumbersome for local development
Data practitioners were not excited about traditional databases because they are difficult to install and configure; running a database locally is simply not a smooth experience. On top of that, client protocols such as JDBC, designed in the 1990s, have not seen significant upgrades since. Hannes wanted to research how to build a database for these users while removing the hassle of managing one.
The SQLite database engine was a big inspiration for DuckDB. SQLite has no server: it runs in-process as a simple library. However, SQLite was designed for transactional workloads, with row-based storage, which limits its performance on analytical queries and left an opening. In-process analytical databases were a brand-new class of systems, which was exciting for Hannes as a researcher.
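To make the "in-process, no server" idea concrete, here is a minimal sketch using DuckDB's Python API. The database lives inside the host process, so nothing needs to be installed, configured, or started beyond importing the library. The table and values are hypothetical, purely for illustration.

```python
import duckdb

# Opening a connection creates an in-memory database inside this process;
# there is no server to install, configure, or start.
con = duckdb.connect()

# Hypothetical example data, for illustration only.
con.execute("CREATE TABLE trips (city VARCHAR, distance_km DOUBLE)")
con.execute(
    "INSERT INTO trips VALUES "
    "('Brussels', 12.5), ('Amsterdam', 3.2), ('Brussels', 7.8)"
)

# An analytical (aggregation) query, the kind of workload an
# analytics-oriented engine like DuckDB is built for.
print(
    con.execute(
        "SELECT city, AVG(distance_km) FROM trips GROUP BY city"
    ).fetchall()
)
```

The same pattern works from R and other host languages: the query engine is just a library linked into the analysis process, rather than a separate service to connect to.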
This was just the beginning of the story, still far from what we know today as DuckDB. And Hannes isn't done with the project: “My definition of success as a researcher is not to write papers but to have an impact. In the area of data systems, it is required to make something that will see widespread use in order to achieve impact.”
Check out the full interview below.
Header photo: Shutterstock