# DuckDB
sq's DuckDB driver implements connectivity for DuckDB. It makes use of the backing `duckdb/duckdb-go/v2` library, which statically links `libduckdb`. The driver implements all optional sq driver features.
## Add source
Use `sq add` to add a source. The location argument is the filepath to the DuckDB file. For example:

```shell
# Relative path
$ sq add ./sakila.duckdb

# Absolute path
$ sq add /Users/neilotoole/sakila.duckdb
```
sq can detect that a file is a DuckDB datafile via the `.duckdb` or `.ddb` extension, or by the `DUCK` magic header at byte offset 8. If auto-detection doesn't trigger, specify the driver explicitly:

```shell
$ sq add --driver=duckdb ./sakila.duckdb
```
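The magic-header check is easy to reproduce by hand. A minimal sketch of the same detection logic (the probe file below is fabricated for illustration, not a real DuckDB database):

```shell
# Write 12 bytes: 8 filler bytes, then the DUCK magic at byte offset 8
printf '01234567DUCK' > /tmp/probe.duckdb

# Read 4 bytes at offset 8 and compare against the magic
magic=$(dd if=/tmp/probe.duckdb bs=1 skip=8 count=4 2>/dev/null)
[ "$magic" = "DUCK" ] && echo "DuckDB magic found"
```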
Use the `duckdb://` scheme to pass connection parameters or to open an in-memory database:

```shell
# Persistent file via URI
$ sq add 'duckdb:///abs/path/sakila.duckdb'

# In-memory database
$ sq add 'duckdb://:memory:'

# File with connection parameters
$ sq add 'duckdb:///path/sakila.duckdb?memory_limit=4GB&threads=4'
```
## Bundled extensions
The driver statically links the standard set of in-tree DuckDB extensions. They are available immediately; no `INSTALL` or `LOAD` is required:

| Extension | Purpose |
|---|---|
| `json` | JSON read/write functions |
| `parquet` | Parquet read/write (`read_parquet()`) |
| `icu` | ICU collations and Unicode functions |
| `fts` | Full-text search |
| `httpfs` | HTTP(S) and S3 file access |
| `excel` | Excel read (`excel_open()`) |
| `inet` | IP address types and functions |
| `autocomplete` | SQL auto-completion helpers |
| `tpch` | TPC-H benchmark tables |
| `tpcds` | TPC-DS benchmark tables |
Because the extensions are statically bundled, queries like the following work without any setup:

```sql
-- Query a Parquet file directly
SELECT * FROM read_parquet('file.parquet');

-- Query a remote CSV via HTTPS
SELECT * FROM read_csv_auto('https://example.com/data.csv');

-- Query an S3 object (set AWS credentials first)
SELECT * FROM read_parquet('s3://bucket/key.parquet');
```
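These queries can also be run through sq's native SQL mode. A sketch, assuming the DuckDB source has already been added and is the active source (the Parquet filename is illustrative):

```shell
$ sq sql "SELECT count(*) FROM read_parquet('file.parquet')"
```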
## Connection parameters
Pass parameters as URL query strings after the file path:

```shell
$ sq add 'duckdb:///path/db.duckdb?memory_limit=4GB&threads=4'
```
| Parameter | Values | Description |
|---|---|---|
| `access_mode` | `READ_WRITE`, `READ_ONLY` | Database access mode (default `READ_WRITE`) |
| `memory_limit` | e.g. `4GB` | Maximum memory DuckDB may use |
| `threads` | integer | Number of threads |
| `default_order` | `ASC`, `DESC` | Default sort order |
| `default_null_order` | `NULLS_FIRST`, `NULLS_LAST` | Default NULL sort position |
| `enable_external_access` | `true`, `false` | Allow reading from external files/URLs |
| `enable_object_cache` | `true`, `false` | Cache metadata for remote objects |
| `temp_directory` | path | Directory for temporary files |
| `wal_autocheckpoint` | e.g. `1000` | WAL autocheckpoint threshold (pages) |
DuckDB supports many more settings; the list above covers the parameters most commonly used from the CLI.
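Parameters combine as ordinary URL query strings. For instance, to open an existing database read-only with a memory cap (the path is illustrative):

```shell
$ sq add 'duckdb:///path/db.duckdb?access_mode=READ_ONLY&memory_limit=2GB'
```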
## Type mapping
| DuckDB type | sq kind | Notes |
|---|---|---|
| `BOOLEAN` | `bool` | |
| `TINYINT` … `BIGINT`, `UTINYINT`, `USMALLINT`, `UINTEGER` | `int` | |
| `HUGEINT`, `UHUGEINT`, `UBIGINT`, `INT128` | `decimal` | Promoted to `decimal` because values can exceed `int64`. |
| `FLOAT`, `REAL`, `DOUBLE` | `float` | |
| `DECIMAL(p,s)` | `decimal` | |
| `VARCHAR`, `TEXT`, `STRING` | `text` | |
| `BLOB`, `BYTEA` | `bytes` | |
| `DATE` | `date` | |
| `TIME`, `TIME WITH TIME ZONE` | `time` | |
| `TIMESTAMP`, `TIMESTAMP_S/MS/NS`, `TIMESTAMPTZ` | `datetime` | |
| `INTERVAL` | `text` | `"<months> months <days> days <micros> μs"` (see #612). |
| `UUID` | `text` | Hex-dash form. |
| `JSON` | `text` | Already JSON. |
| `LIST` / `ARRAY` | `text` | Go-style stringification; JSON projection planned (#609). |
| `STRUCT` | `text` | Go-style stringification; JSON projection planned (#609). |
| `MAP` | `text` | Go-style stringification; JSON projection planned (#609). |
| `ENUM` | `text` | Underlying value. |
| `BIT` | `text` | Bit-string. |
Composite types (`LIST`, `STRUCT`, `MAP`) are currently stringified via Go's default formatting (`fmt.Sprintf("%v", v)`). First-class JSON projection via DuckDB's `to_json(col)` is planned as a follow-up; see #609.
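Until that lands, composites can be projected to JSON explicitly in the SQL itself, since the `json` extension is bundled. A sketch, assuming a hypothetical table `t` with a `LIST` column `tags`:

```sql
-- to_json() renders the composite value as JSON text
SELECT to_json(tags) AS tags_json FROM t;
```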
## Limitations
- DuckDB enforces a single writer per database file. If two sq processes open the same `.duckdb` file simultaneously, the second will receive a lock error. For sq's typical single-shot CLI usage this is rarely a problem, but scripts that parallelize sq against the same file should serialize writes. Read-only access (`access_mode=READ_ONLY`) from multiple processes is safe.
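One way to serialize writers from a shell script is an advisory lock via `flock(1)`. A sketch, where the lock-file path and the query are illustrative and `@sakila` is an assumed source handle:

```shell
# Only one flock holder runs at a time; concurrent invocations block until the lock frees
$ flock /tmp/sakila.lock sq sql --src @sakila "INSERT INTO actor (first_name, last_name) VALUES ('A', 'B')"
```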