sq add
Use sq add
to add a data source. The source can be a SQL database, or a document
such as a CSV or Excel file. This action will add an entry to sq
’s
config file.
If you later want to change the source, generally the easiest solution is to
sq rm @handle
and then sq add
again. However, you can also directly edit
the config file (e.g. vi ~/.config/sq/sq.yml
).
A data source has three main elements:
driver type
: such aspostgres
, orcsv
. You may also see this referred to as the source type or simply type.handle
: such as@sakila_pg
. A handle always starts with@
. The handle is used to refer to the data source.location
: such aspostgres://user:p_ssW0rd@@localhost/sakila
. For a document source, location may just be a file path, e.g./Users/neilotoole/sakila.csv
.
The format of the command is:
sq add [--handle HANDLE] [--driver DRIVER] [--active] LOCATION
For example, to add a postgres data source:
$ sq add postgres://sakila:p_ssW0rd@localhost/sakila
@sakila_pg postgres sakila@localhost/sakila
Note that flags can generally be omitted. If --handle
is omitted,
sq
will generate a handle. In the example above, the generated handle
is @sakila_pg
. Usually --driver
can also be omitted, and sq
will determine the driver type. The --active
flag immediately sets
the newly-added source as the active source (this also happens regardless if there is
not currently an active source).
To add a document source, you can generally just add the file path:
sq add ~/customers.csv
Password visibility
In the Postgres example above, the location string includes the database password. This is a
security hazard, as the password value is visible on the command line, and in
shell history etc. You can use the --password
/ -p
flag to be prompted
for the password.
$ sq add 'postgres://user@localhost/sakila' -p
Password: ****
You can also read the password from a file or a shell variable. For example:
# Add a source, but read password from an environment variable
$ export PASSWORD='open:;"_Ses@me'
$ sq add 'postgres://user@localhost/sakila' -p <<< $PASSWORD
# Same as above, but instead read password from file
$ echo 'open:;"_Ses@me' > password.txt
$ sq add 'postgres://user@localhost/sakila' -p < password.txt
Header row
File formats like CSV/TSV often have a header row. sq
can usually auto-detect
if a header row is present. But depending on the nature of the data file,
it may be necessary to explicitly tell sq
to use a header row (or not).
$ sq add ./actor.csv --ingest.header
Reference
Add data source specified by LOCATION, optionally identified by @HANDLE.
Usage:
sq add [--handle @HANDLE] LOCATION
Examples:
When adding a data source, LOCATION is the only required arg.
$ sq add ./actor.csv
@actor csv actor.csv
Note that sq generated the handle "@actor". But you can explicitly specify
a handle.
# Add a postgres source with handle "@sakila/pg"
$ sq add --handle @sakila/pg 'postgres://user:pass@localhost/sakila'
This handle format "@sakila/pg" includes a group, "sakila". Using a group
is entirely optional: it is a way to organize sources. For example:
$ sq add --handle @dev/pg 'postgres://user:pass@dev.db.example.com/sakila'
$ sq add --handle @prod/pg 'postgres://user:pass@prod.db.acme.com/sakila'
The format of LOCATION is driver-specific, but is generally a DB connection
string, a file path, or a URL.
DRIVER://USER:PASS@HOST:PORT/DBNAME
/path/to/local/file.ext
https://sq.io/data/test1.xlsx
If flag --handle is omitted, sq will generate a handle based
on LOCATION and the source driver type.
It's a security hazard to expose the data source password via
the LOCATION string. If flag --password (-p) is set, sq prompt the
user for the password:
$ sq add 'postgres://user@localhost/sakila' -p
Password: ****
However, if there's input on stdin, sq will read the password from
there instead of prompting the user:
# Add a source, but read password from an environment variable
$ export PASSWD='open:;"_Ses@me'
$ sq add 'postgres://user@localhost/sakila' -p <<< $PASSWD
# Same as above, but instead read password from file
$ echo 'open:;"_Ses@me' > password.txt
$ sq add 'postgres://user@localhost/sakila' -p < password.txt
There are various driver-specific options available. For example:
$ sq add actor.csv --ingest.header=false --driver.csv.delim=colon
If flag --driver is omitted, sq will attempt to determine the
type from LOCATION via file suffix, content type, etc.. If the result
is ambiguous, explicitly specify the driver type.
$ sq add --driver=tsv ./mystery.data
Available source driver types can be listed via "sq driver ls". At a
minimum, the following drivers are bundled:
sqlite3 SQLite
postgres PostgreSQL
sqlserver Microsoft SQL Server / Azure SQL Edge
mysql MySQL
csv Comma-Separated Values
tsv Tab-Separated Values
json JSON
jsona JSON Array: LF-delimited JSON arrays
jsonl JSON Lines: LF-delimited JSON objects
xlsx Microsoft Excel XLSX
If there isn't already an active source, the newly added source becomes the
active source (but the active group does not change). Otherwise you can
use flag --active to make the new source active.
More examples:
# Add a source, but prompt user for password
$ sq add 'postgres://user@localhost/sakila' -p
Password: ****
# Explicitly set flags
$ sq add --handle @sakila_pg --driver postgres 'postgres://user:pass@localhost/sakila'
# Same as above, but with short flags
$ sq add -n @sakila_pg -d postgres 'postgres://user:pass@localhost/sakila'
# Add a SQL Server source; will have generated handle @sakila
$ sq add 'sqlserver://user:pass@localhost?database=sakila'
# Add a sqlite db, and immediately make it the active source
$ sq add ./testdata/sqlite1.db --active
# Add an Excel spreadsheet, with options
$ sq add ./testdata/test1.xlsx --ingest.header=true
# Add a CSV source, with options
$ sq add ./testdata/person.csv --ingest.header=true
# Add a CSV source from a URL (will be downloaded)
$ sq add https://sq.io/testdata/actor.csv
# Add a source, and make it the active source (and group)
$ sq add ./actor.csv --handle @csv/actor
# Add a currently unreachable source
$ sq add 'postgres://user:pass@db.offline.com/sakila' --skip-verify
Flags:
-t, --text Output text
-h, --header Print header row (default true)
-H, --no-header Don't print header row
-j, --json Output JSON
-c, --compact Compact instead of pretty-printed output
-y, --yaml Output YAML
-d, --driver string Explicitly specify the driver to use
-n, --handle string Handle for the source
-p, --password Read password from stdin or prompt
--skip-verify Don't ping source before adding it
-a, --active Make this the active source
--ingest.header Treat first row of ingest data as header
--driver.csv.empty-as-null Treat empty CSV fields as null (default true)
--driver.csv.delim string CSV delimiter: one of comma, space, pipe, tab, colon, semi, period (default "comma")
--help help for add
Global Flags:
--config string Load config from here
--log Enable logging
--log.file string Path to log file; empty disables logging
--log.level string Log level: one of DEBUG, INFO, WARN, ERROR
-M, --monochrome Don't colorize output
-v, --verbose Verbose output