DuckDB
sq’s DuckDB driver implements connectivity for
DuckDB. It makes use of the backing
duckdb/duckdb-go/v2 library,
which statically links libduckdb. The driver implements all optional
sq driver features.
Add source
Use sq add to add a source. The location argument is the
filepath to the DuckDB file. For example:
# Relative path
$ sq add ./sakila.duckdb
# Absolute path
$ sq add /Users/neilotoole/sakila.duckdb
sq can detect that a file is a DuckDB datafile
via the .duckdb or .ddb extension, or by the DUCK magic header at byte
offset 8. If auto-detection doesn’t trigger, specify the driver explicitly:
$ sq add --driver=duckdb ./sakila.duckdb
Use the duckdb:// scheme to pass connection parameters or to open an
in-memory database:
# Persistent file via URI
$ sq add 'duckdb:///abs/path/sakila.duckdb'
# In-memory database
$ sq add 'duckdb://:memory:'
# File with connection parameters
$ sq add 'duckdb:///path/sakila.duckdb?memory_limit=4GB&threads=4'
Bundled extensions
The driver statically links the standard set of in-tree DuckDB extensions.
They are available immediately — no INSTALL or LOAD required:
| Extension | Purpose |
|---|---|
json | JSON read/write functions |
parquet | Parquet read/write (read_parquet()) |
icu | ICU collations and Unicode functions |
fts | Full-text search |
httpfs | HTTP(S) and S3 file access |
excel | Excel read (excel_open()) |
inet | IP address types and functions |
autocomplete | SQL auto-completion helpers |
tpch | TPC-H benchmark tables |
tpcds | TPC-DS benchmark tables |
Because the extensions are statically bundled, queries like the following work without any setup:
-- Query a Parquet file directly
SELECT * FROM read_parquet('file.parquet');
-- Query a remote CSV via HTTPS
SELECT * FROM read_csv_auto('https://example.com/data.csv');
-- Query an S3 object (set AWS credentials first)
SELECT * FROM read_parquet('s3://bucket/key.parquet');
Connection parameters
Pass parameters as URL query strings after the file path:
$ sq add 'duckdb:///path/db.duckdb?memory_limit=4GB&threads=4'
| Parameter | Values | Description |
|---|---|---|
access_mode | READ_WRITE, READ_ONLY | Open the database read-only |
memory_limit | e.g. 4GB | Maximum memory DuckDB may use |
threads | integer | Number of threads |
default_order | ASC, DESC | Default sort order |
default_null_order | NULLS_FIRST, NULLS_LAST | Default NULL sort position |
enable_external_access | true, false | Allow reading from external files/URLs |
enable_object_cache | true, false | Cache metadata for remote objects |
temp_directory | path | Directory for temporary files |
wal_autocheckpoint | e.g. 1000 | WAL autocheckpoint threshold (pages) |
DuckDB supports many more settings; the list above covers the parameters most commonly used from the CLI.
Read-only access by default
For commands that don’t write to the source, sq opens DuckDB databases with
access_mode=READ_ONLY. This applies to sq inspect,
sq, sq diff, and sq ping.
The benefits: sq does not touch the file’s WAL or modification time, multiple sq
processes can read the same file concurrently, and sq inspect works on files you have
read-only access to.
Commands that write (sq tbl copy,
sq tbl drop,
sq tbl truncate) continue to open READ_WRITE.
For sq sql, the default remains READ_WRITE because the SQL
statement is opaque to sq. Use sq sql --readonly (or --ro) to opt in to
read-only mode.
To override the default for any command, set access_mode explicitly in the source
location:
$ sq add 'duckdb:///data/inventory.duckdb?access_mode=READ_WRITE' --handle @inv
An explicit URL access_mode always wins over the implicit defaults for sq inspect,
sq slq, sq diff, and sq ping. For sq sql --readonly (or --ro), the explicit
flag and an explicit URL access_mode=READ_WRITE are treated as a contradiction and
sq returns a conflict error: the flag says “do not write”, the URL says “open for
writes”, and the user has to resolve the ambiguity before any open happens.
Type mapping
| DuckDB type | sq kind | Notes |
|---|---|---|
BOOLEAN | bool | |
TINYINT … BIGINT, UTINYINT, USMALLINT, UINTEGER | int | |
HUGEINT, UHUGEINT, UBIGINT, INT128 | decimal | Promoted to decimal because values can exceed int64. |
FLOAT, REAL, DOUBLE | float | |
DECIMAL(p,s) | decimal | |
VARCHAR, TEXT, STRING | text | |
BLOB, BYTEA | bytes | |
DATE | date | |
TIME, TIME WITH TIME ZONE | time | |
TIMESTAMP, TIMESTAMP_S/MS/NS, TIMESTAMPTZ | datetime | |
INTERVAL | text | DuckDB’s native text form; typed kinds tracked in #681. |
UUID | text | Hex-dash form. |
JSON | text | Already JSON. |
LIST / ARRAY | text | Go-style stringification; JSON projection planned (#609). |
STRUCT | text | Go-style stringification; JSON projection planned (#609). |
MAP | text | Go-style stringification; JSON projection planned (#609). |
ENUM | text | Underlying value. |
BIT | text | Bit-string. |
Composite types (LIST, STRUCT, MAP) are currently stringified via Go’s
default formatting (fmt.Sprintf("%v", v)). First-class JSON projection via
DuckDB’s to_json(col) is planned as a follow-up — see #609.
Limitations
- DuckDB enforces a single writer per database file. If two
sqprocesses open the same.duckdbfile simultaneously, the second will receive a lock error. Forsq’s typical single-shot CLI usage this is rarely a problem, but scripts that parallelizesqagainst the same file should serialize writes. Read-only access (access_mode=READ_ONLY) from multiple processes is safe.