sq: swiss-army knife for data

sq is a free/libre open-source swiss-army knife to inspect, query, join, import, and export data. You can think of it as jq for relational data, whether that data is in a document or a database. sq has a rudimentary jq-style query language (formally known as SLQ). This query language is limited in scope; fortunately, for complex queries you can always fall back to database-native SQL.
$ brew install neilotoole/sq/sq

$ scoop bucket add sq https://github.com/neilotoole/sq
$ scoop install sq

$ /bin/sh -c "$(curl -fsSL https://sq.io/install.sh)"
There are other installation options available. Be sure to install shell completion.

For help, $ sq help is your starting point. And then see the docs.
sq is pronounced like seek. Its query language, SLQ, is pronounced like sleek.

Some feature highlights are shown below. For more, see the docs, including the tutorial and cookbook.
Insert the contents of an Excel XLSX worksheet (from a sheet named actor) into a new Postgres table named xl_actor. Note that the import mechanism is reasonably sophisticated: it tries to preserve data types.
$ cat ./sakila.xlsx | sq .actor --opts header=true --insert @sakila_pg9.xl_actor
$ sq add mysql://sakila@192.168.50.136/sakila -p <<< $PASSWD
$ sq inspect @sakila_my -j | jq -r '.tables[] | select(.name == "actor") | .columns[] | .name'
Even easier, just get the metadata for the table you want:
$ sq inspect @sakila_my.actor -j | jq -r '.columns[] | .name'
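The metadata that sq inspect emits is ordinary JSON, so even without jq you can take a crude pass at it with standard tools. The document below is a simplified sketch of the column-metadata shape (sq's real output carries many more fields), and the grep/sed pipeline only works for this flat layout:

```shell
# Simplified sample of table metadata; extract each "name" value.
# jq handles the real, nested output robustly -- this is just a sketch.
cat <<'EOF' | grep -o '"name":"[^"]*"' | sed 's/"name":"\(.*\)"/\1/'
{"columns":[{"name":"actor_id"},{"name":"first_name"},{"name":"last_name"}]}
EOF
```

This prints one column name per line: actor_id, first_name, last_name.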
The --json flag to sq inspect outputs schema and other metadata in JSON. Note that the output exposes connection string details. Typically the output is piped to jq to select the interesting elements.
$ sq inspect @sakila_my --json
$ sq add "sqlserver://sakila:$PASSWD@192.168.50.130?database=sakila"
$ sq add ./sakila.db --handle @sakila_sl3
$ sq sql --src=@sakila_mssql "SELECT * FROM actor WHERE first_name = 'PENELOPE'" --insert @sakila_sl3.actor_copy
Get the (JSON) metadata for the active source; pipe that JSON to jq and extract the table names; pipe the table names to xargs, invoking sq once for each table, outputting a CSV file per table.
$ sq inspect -j | jq -r '.tables[] | .name' | xargs -I % sq .% --csv --output %.csv
If you instead wanted to use sql mode:
$ sq inspect -j | jq -r '.tables[] | .name' | xargs -I % sq sql 'SELECT * FROM %' --csv --output %.csv
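Neither sq nor a live database is needed to see how the xargs fan-out works. Substituting echo for sq prints the command that would run for each table (the table names here are just examples standing in for the jq output):

```shell
# Each input line becomes one command; xargs replaces % with the table name.
printf 'actor\nfilm\ncity\n' | xargs -I % echo sq .% --csv --output %.csv
```

This prints three lines, one per table, e.g. "sq .actor --csv --output actor.csv", confirming that each table gets its own invocation and its own output file.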
Commands to add, set active, list, ping, or remove sources.
$ sq src # show active source
$ sq add ./actor.tsv # add a source
$ sq src @actor_tsv # set active source
$ sq ls # list sources
$ sq ls -v # list sources (verbose)
$ sq ping --all # ping all sources
$ sq rm @actor_tsv # remove a source
Convenient commands that act on database tables. Note that sq tbl copy only applies within a single database. That is, to copy a table from one database to another, use a query with --insert @target.tbl.
$ sq tbl copy .actor .actor2
$ sq tbl truncate .actor2
$ sq tbl drop .actor2
JSONL output is a row of JSON per line (hence "JSON Lines"). Lots of log output is in this format. We can use sq's own log output (typically in ~/.config/sq/sq.log) as an example:
{"level":"debug","time":"06:49:13.751707","caller":"cli/cli.go:100:ExecuteWith","msg":"EXECUTE: ls"}
{"level":"debug","time":"06:49:13.751838","caller":"cli/cli.go:101:ExecuteWith","msg":"Build: v0.15.0-wip cd6d90a8 2021-02-27T06:49:07-0700"}
{"level":"debug","time":"06:49:13.751851","caller":"cli/cli.go:102:ExecuteWith","msg":"Config (cfg version \"v0.14.9-wip\") from: /Users/neilotoole/.config/sq/sq.yml"}
Pluck out the level and msg fields from the first three lines of that input:
$ cat ~/.config/sq/sq.log | sq '.data | .[0:3] | .level, .msg'
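Because each JSONL record sits on its own line, even line-oriented tools can take a crude pass at it before reaching for sq or jq. A rough sed sketch over sample records shaped like the log above (sq and jq parse the JSON properly; this regex only holds for this simple field layout, with no escaped quotes):

```shell
# Capture the "level" and "msg" values from each line with extended-regex sed.
cat <<'EOF' | sed -E 's/.*"level":"([^"]+)".*"msg":"([^"]+)".*/\1 \2/'
{"level":"debug","time":"06:49:13.751707","msg":"EXECUTE: ls"}
{"level":"debug","time":"06:49:13.751838","msg":"Build: v0.15.0-wip"}
EOF
```

This prints "debug EXECUTE: ls" and "debug Build: v0.15.0-wip" -- the same pluck the sq query performs, minus the slicing and the real JSON parsing.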
sq supports input from and output to common databases. In addition to SQL sources, sq can import data from these formats:

sq can output to a plethora of file formats, or insert results directly into a database table:

INSERT to Postgres, SQLite, SQL Server / Azure SQL Edge, MySQL