Building a SQLite extension with Rust
Table of contents
This note captures my experience building a SQLite extension in Rust.
SQLite has the ability to load extensions at runtime. Some examples are application-defined SQL functions, collating sequences, virtual tables and virtual file systems. The SQLite project offers a few examples of extensions, for example the FTS5 virtual table to enable full-text search.
In Rust, the main SQLite library is Rusqlite. However, as per v0.29, it only allows building extensions that are loadable by a Rust application using Rusqlite. This means that Rusqlite is not able to generate an artefact that can be dynamically loaded by, for example, the SQLite CLI or the Python SQLite module.
There is a long standing proposal (circa 2019) that suggests that eventually the project will allow for this feature but there is no expected delivery time. In the meantime we need an alternative.
sqlite-loadable is a young (v0.0.5) attempt offering such feature using an interface very similar to the one offered by Rusqlite for building non-dynamic extensions. The author is using it to build a set of extensions already. Worth checking out sqlite-xsv and sqlite-regex as well as the article Introducing sqlite-loadable-rs: A framework for building SQLite Extensions in Rust.
So I built a SQLite extension that lets you query against a collection of TOML files using sqlite-loadable.
# A virtual table for collections of TOML files
The goal was to build a SQLite extension offering virtual table that would allow a user to query against a set of TOML files doing something like:
CREATE VIRTUAL TABLE temp.recipe USING toml(dirname="recipes");
SELECT DISTINCT
json_extract(ingredient.value, '$.name') AS ingredient_name
FROM
temp.recipe, json_each(json_extract(temp.recipe.value, '$.ingredients')) AS ingredient
ORDER BY ingredient_name
In other words, the extension should surface a virtual table constructor toml()
with a parameter dirname
to provide the root directory for locating TOML files.
And for each TOML file, create a row in the virtual table with its contents serialised as JSON.
A future iteration could allow passing the list of expected columns instead but this approach keeps things simple with a fixed set of columns: filename
and value
.
With that and the JSON functions and operators it is possible to query, normalise and manipulate the data freely.
# Building blocks
Any extension requires an entry point with a shape as follows:
The sqlite_entrypoint
takes care of the boilerplate required by SQLite.
A virtual table needs two structs, one implementing the VTab
trait and another one implementing the VTabCursor
trait.
VTab
takes care of processing the given parameters, creating the virtual table and handling the cursor for a given query.
The main focus is on the connect
method, where the work of checking the incoming parameters are valid should happen. It expects returning the schema as a CREATE TABLE
statement. In this case a static:
"filename" TEXT,
"value" TEXT
);
(
Indexes are to be built using the best_index
method. I opted for a non-optimal but easy no-index situation so every query runs a full scan.
Finally, the open
method takes care of initialising the cursor. In my implementation I read from disk and load into memory all data at this point.
VTabCursor
takes care of a row from the virtual table, a-la Iterator
. It requires implementing the methods filter
, next
, column
, eof
and rowid
.
For this implementation filter
is a no-op method given that we are not using indexes.
next
checks that the current rowid is in bounds of the data stored in memory and if so, increments rowid.
column
is where the magic happens. It reads from the data in memory and returns (via a shared Context object) the value for the current {row, column}. In this naive implementation all values are TEXT
but in a fancier version with a custom set of columns this would be the point to determine the right column affinity.
And that's it. sqlite-loadable takes care of the rest.
# Closing thoughts
It took me a while to realise that Rusqlite was not able to handle loadable extensions which was a bit frustrating. sqlite-loadable was great though. It is designed after Rusqlite which means that pretty much all the work I did initally was portable within an hour.
That said, the fact that sqlite-loadable is an unstable large set of unsafe
code makes me not recommend it for general use. It looks highly promising though.