sq add
Use sq add
to add a data source. The source can be a SQL database, or a document
such as a CSV or Excel file. This action will add an entry to sq
’s
config file.
If you later want to change the source, generally the easiest solution is to
sq rm @handle
and then sq add
again. However, you can also directly edit
the config file (e.g. vi ~/.config/sq/sq.yml
).
A data source has three main elements:
driver type
: such aspostgres
, orcsv
. You may also see this referred to as the source type or simply type.handle
: such as@sakila_pg
. A handle always starts with@
. The handle is used to refer to the data source.location
: such aspostgres://user:p_ssW0rd@localhost/sakila
. For a document source, location may just be a file path, e.g./Users/neilotoole/sakila.csv
.
The format of the command is:
sq add [--handle HANDLE] [--driver DRIVER] [--active] LOCATION
For example, to add a postgres data source:
$ sq add postgres://sakila:p_ssW0rd@localhost/sakila
@sakila_pg postgres sakila@localhost/sakila
Note that flags can generally be omitted. If --handle
is omitted,
sq
will generate a handle. In the example above, the generated handle
is @sakila_pg
. Usually --driver
can also be omitted, and sq
will determine the driver type. The --active
flag immediately sets
the newly-added source as the active source (this also happens regardless if there is
not currently an active source).
To add a document source, you can generally just add the file path:
sq add ~/customers.csv
Password visibility
In the Postgres example above, the location string includes the database password. This is a
security hazard, as the password value is visible on the command line, and in
shell history etc. You can use the --password
/ -p
flag to be prompted
for the password.
$ sq add 'postgres://user@localhost/sakila' -p
Password: ****
You can also read the password from a file or a shell variable. For example:
# Add a source, but read password from an environment variable
$ export PASSWORD='open:;"_Ses@me'
$ sq add 'postgres://user@localhost/sakila' -p <<< $PASSWORD
# Same as above, but instead read password from file
$ echo 'open:;"_Ses@me' > password.txt
$ sq add 'postgres://user@localhost/sakila' -p < password.txt
Location completion
It can be difficult to remember the format of database URLs (i.e. the source location).
To make life easier, sq
provides shell completion for the sq add LOCATION
field. To
use it, just press TAB
after $ sq add
.
For location completion to work, do not enclose the location in single quotes. However,
this does mean that the inputted location string must escape special shell characters
such as ?
and &
.
# Location completion not available, because location is in quotes.
$ sq add 'postgres://sakila@192.168.50.132/sakila?sslmode=disable'
# Location completion available: note the escaped ?.
$ sq add postgres://sakila@192.168.50.132/sakila\?sslmode=disable
The location completion mechanism suggests usernames, hostnames (from history),
database names, and even values for query params (e.g. ?sslmode=disable
) for
each supported database. It never suggests passwords.
Header row
File formats like CSV/TSV or Excel often have a header row. sq
can usually auto-detect
if a header row is present. But depending on the nature of the data file,
it may be necessary to explicitly tell sq
to use a header row (or not).
$ sq add ./actor.csv --ingest.header
Reference
Add data source specified by LOCATION, optionally identified by @HANDLE.
Usage:
sq add [--handle @HANDLE] LOCATION
Examples:
When adding a data source, LOCATION is the only required arg.
$ sq add ./actor.csv
@actor csv actor.csv
Note that sq generated the handle "@actor". But you can explicitly specify
a handle.
# Add a postgres source with handle "@sakila/pg"
$ sq add --handle @sakila/pg postgres://user:pass@localhost/sakila
This handle format "@sakila/pg" includes a group, "sakila". Using a group
is entirely optional: it is a way to organize sources. For example:
$ sq add --handle @dev/pg postgres://user:pass@dev.db.acme.com/sakila
$ sq add --handle @prod/pg postgres://user:pass@prod.db.acme.com/sakila
The format of LOCATION is driver-specific, but is generally a DB connection
string, a file path, or a URL.
DRIVER://USER:PASS@HOST:PORT/DBNAME?PARAM=VAL
/path/to/local/file.ext
https://sq.io/data/test1.xlsx
If LOCATION contains special shell characters, it's necessary to enclose
it in single quotes, or to escape the special character. For example,
note the "\?" in the unquoted location below.
$ sq add postgres://user:pass@localhost/sakila\?sslmode=disable
A significant advantage of not quoting LOCATION is that sq provides extensive
shell completion when inputting the location value.
If flag --handle is omitted, sq will generate a handle based
on LOCATION and the source driver type.
It's a security hazard to expose the data source password via
the LOCATION string. If flag --password (-p) is set, sq prompt the
user for the password:
$ sq add postgres://user@localhost/sakila -p
Password: ****
However, if there's input on stdin, sq will read the password from
there instead of prompting the user:
# Add a source, but read password from an environment variable
$ export PASSWD='open:;"_Ses@me'
$ sq add postgres://user@localhost/sakila -p <<< $PASSWD
# Same as above, but instead read password from file
$ echo 'open:;"_Ses@me' > password.txt
$ sq add postgres://user@localhost/sakila -p < password.txt
There are various driver-specific options available. For example:
$ sq add actor.csv --ingest.header=false --driver.csv.delim=colon
If flag --driver is omitted, sq will attempt to determine the
type from LOCATION via file suffix, content type, etc. If the result
is ambiguous, explicitly specify the driver type.
$ sq add --driver=tsv ./mystery.data
Available source driver types can be listed via "sq driver ls". At a
minimum, the following drivers are bundled:
sqlite3 SQLite
postgres PostgreSQL
sqlserver Microsoft SQL Server / Azure SQL Edge
mysql MySQL
csv Comma-Separated Values
tsv Tab-Separated Values
json JSON
jsona JSON Array: LF-delimited JSON arrays
jsonl JSON Lines: LF-delimited JSON objects
xlsx Microsoft Excel XLSX
If there isn't already an active source, the newly added source becomes the
active source (but the active group does not change). Otherwise you can
use flag --active to make the new source active.
More examples:
# Add a source, but prompt user for password
$ sq add postgres://user@localhost/sakila -p
Password: ****
# Explicitly set flags
$ sq add --handle @sakila_pg --driver postgres postgres://user:pass@localhost/sakila
# Same as above, but with short flags
$ sq add -n @sakila_pg -d postgres postgres://user:pass@localhost/sakila
# Specify some params (note escaped chars)
$ sq add postgres://user:pass@localhost/sakila\?sslmode=disable\&application_name=sq
# Specify some params, but use quoted string (no shell completion)
$ sq add 'postgres://user:pass@localhost/sakila?sslmode=disable&application_name=sq''
# Add a SQL Server source; will have generated handle @sakila
$ sq add 'sqlserver://user:pass@localhost?database=sakila'
# Add a SQLite DB, and immediately make it the active source
$ sq add ./testdata/sqlite1.db --active
# Add an Excel spreadsheet, with options
$ sq add ./testdata/test1.xlsx --ingest.header=true
# Add a CSV source, with options
$ sq add ./testdata/person.csv --ingest.header=true
# Add a CSV source from a URL (will be downloaded)
$ sq add https://sq.io/testdata/actor.csv
# Add a source, and make it the active source (and group)
$ sq add ./actor.csv --handle @csv/actor
# Add a currently unreachable source
$ sq add postgres://user:pass@db.offline.com/sakila --skip-verify
Flags:
-t, --text Output text
-h, --header Print header row (default true)
-H, --no-header Don't print header row
-j, --json Output JSON
-c, --compact Compact instead of pretty-printed output
-y, --yaml Output YAML
-d, --driver string Explicitly specify driver to use
-n, --handle string Handle for the source
-p, --password Read password from stdin or prompt
--skip-verify Don't ping source before adding it
-a, --active Make this the active source
--ingest.header Ingest data has a header row
--driver.csv.empty-as-null Treat ingest empty CSV fields as NULL (default true)
--driver.csv.delim string Delimiter for ingest CSV data (default "comma")
--help help for add
Global Flags:
--config string Load config from here
--debug.pprof string pprof profiling mode (default "off")
--error.format string Error output format (default "text")
-E, --error.stack Print error stack trace to stderr
--log Enable logging
--log.file string Log file path (default "/Users/neilotoole/Library/Logs/sq/sq.log")
--log.format string Log output format (text or json) (default "text")
--log.level string Log level, one of: DEBUG, INFO, WARN, ERROR (default "DEBUG")
-M, --monochrome Don't print color output
--no-progress Don't show progress bar
--no-redact Don't redact passwords in output
-v, --verbose Print verbose output