sq add

Use sq add to add a data source. The source can be a SQL database, or a document such as a CSV or Excel file. This action will add an entry to sq’s config file.

If you later want to change the source, generally the easiest solution is to sq rm @handle and then sq add again. However, you can also directly edit the config file (e.g. vi ~/.config/sq/sq.yml).

A data source has three main elements:

  • driver type: such as postgres, or csv. You may also see this referred to as the source type or simply type.
  • handle: such as @sakila_pg. A handle always starts with @. The handle is used to refer to the data source.
  • location: such as postgres://user:p_ssW0rd@localhost/sakila. For a document source, location may just be a file path, e.g. /Users/neilotoole/sakila.csv.

The format of the command is:

sq add [--handle HANDLE] [--driver DRIVER] [--active] LOCATION

For example, to add a postgres data source:

$ sq add postgres://sakila:p_ssW0rd@localhost/sakila
@sakila_pg  postgres  sakila@localhost/sakila

Note that flags can generally be omitted. If --handle is omitted, sq will generate a handle. In the example above, the generated handle is @sakila_pg. Usually --driver can also be omitted, and sq will determine the driver type. The --active flag immediately sets the newly-added source as the active source (this also happens regardless if there is not currently an active source).

To add a document source, you can generally just add the file path:

sq add ~/customers.csv

Password visibility

In the Postgres example above, the location string includes the database password. This is a security hazard, as the password value is visible on the command line, and in shell history etc. You can use the --password / -p flag to be prompted for the password.

$ sq add 'postgres://user@localhost/sakila' -p
Password: ****

You can also read the password from a file or a shell variable. For example:

# Add a source, but read password from an environment variable
$ export PASSWORD='open:;"_Ses@me'
$ sq add 'postgres://user@localhost/sakila' -p <<< $PASSWORD

# Same as above, but instead read password from file
$ echo 'open:;"_Ses@me' > password.txt
$ sq add 'postgres://user@localhost/sakila' -p < password.txt

Location completion

It can be difficult to remember the format of database URLs (i.e. the source location). To make life easier, sq provides shell completion for the sq add LOCATION field. To use it, just press TAB after $ sq add.

For location completion to work, do not enclose the location in single quotes. However, this does mean that the inputted location string must escape special shell characters such as ? and &.

# Location completion not available, because location is in quotes.
$ sq add 'postgres://sakila@192.168.50.132/sakila?sslmode=disable'

# Location completion available: note the escaped ?.
$ sq add postgres://sakila@192.168.50.132/sakila\?sslmode=disable

The location completion mechanism suggests usernames, hostnames (from history), database names, and even values for query params (e.g. ?sslmode=disable) for each supported database. It never suggests passwords.

Header row

File formats like CSV/TSV or Excel often have a header row. sq can usually auto-detect if a header row is present. But depending on the nature of the data file, it may be necessary to explicitly tell sq to use a header row (or not).

$ sq add ./actor.csv --ingest.header

Reference

Add data source specified by LOCATION, optionally identified by @HANDLE.

Usage:
  sq add [--handle @HANDLE] LOCATION

Examples:

When adding a data source, LOCATION is the only required arg.

  $ sq add ./actor.csv
  @actor  csv  actor.csv

Note that sq generated the handle "@actor". But you can explicitly specify
a handle.

  # Add a postgres source with handle "@sakila/pg"
  $ sq add --handle @sakila/pg postgres://user:pass@localhost/sakila

This handle format "@sakila/pg" includes a group, "sakila". Using a group
is entirely optional: it is a way to organize sources. For example:

  $ sq add --handle @dev/pg postgres://user:pass@dev.db.acme.com/sakila
  $ sq add --handle @prod/pg postgres://user:pass@prod.db.acme.com/sakila

The format of LOCATION is driver-specific, but is generally a DB connection
string, a file path, or a URL.

  DRIVER://USER:PASS@HOST:PORT/DBNAME?PARAM=VAL
  /path/to/local/file.ext
  https://sq.io/data/test1.xlsx

If LOCATION contains special shell characters, it's necessary to enclose
it in single quotes, or to escape the special character. For example,
note the "\?" in the unquoted location below.

  $ sq add postgres://user:pass@localhost/sakila\?sslmode=disable

A significant advantage of not quoting LOCATION is that sq provides extensive
shell completion when inputting the location value.

If flag --handle is omitted, sq will generate a handle based
on LOCATION and the source driver type.

It's a security hazard to expose the data source password via
the LOCATION string. If flag --password (-p) is set, sq prompt the
user for the password:

  $ sq add postgres://user@localhost/sakila -p
  Password: ****

However, if there's input on stdin, sq will read the password from
there instead of prompting the user:

  # Add a source, but read password from an environment variable
  $ export PASSWD='open:;"_Ses@me'
  $ sq add postgres://user@localhost/sakila -p <<< $PASSWD

  # Same as above, but instead read password from file
  $ echo 'open:;"_Ses@me' > password.txt
  $ sq add postgres://user@localhost/sakila -p < password.txt

There are various driver-specific options available. For example:

  $ sq add actor.csv --ingest.header=false --driver.csv.delim=colon

If flag --driver is omitted, sq will attempt to determine the
type from LOCATION via file suffix, content type, etc. If the result
is ambiguous, explicitly specify the driver type.

  $ sq add --driver=tsv ./mystery.data

Available source driver types can be listed via "sq driver ls". At a
minimum, the following drivers are bundled:

  sqlite3    SQLite
  postgres   PostgreSQL
  sqlserver  Microsoft SQL Server / Azure SQL Edge
  mysql      MySQL
  csv        Comma-Separated Values
  tsv        Tab-Separated Values
  json       JSON
  jsona      JSON Array: LF-delimited JSON arrays
  jsonl      JSON Lines: LF-delimited JSON objects
  xlsx       Microsoft Excel XLSX

If there isn't already an active source, the newly added source becomes the
active source (but the active group does not change). Otherwise you can
use flag --active to make the new source active.

More examples:

  # Add a source, but prompt user for password
  $ sq add postgres://user@localhost/sakila -p
  Password: ****

  # Explicitly set flags
  $ sq add --handle @sakila_pg --driver postgres postgres://user:pass@localhost/sakila

  # Same as above, but with short flags
  $ sq add -n @sakila_pg -d postgres postgres://user:pass@localhost/sakila

  # Specify some params (note escaped chars)
  $ sq add postgres://user:pass@localhost/sakila\?sslmode=disable\&application_name=sq

  # Specify some params, but use quoted string (no shell completion)
  $ sq add 'postgres://user:pass@localhost/sakila?sslmode=disable&application_name=sq''

  # Add a SQL Server source; will have generated handle @sakila
  $ sq add 'sqlserver://user:pass@localhost?database=sakila'

  # Add a SQLite DB, and immediately make it the active source
  $ sq add ./testdata/sqlite1.db --active

  # Add an Excel spreadsheet, with options
  $ sq add ./testdata/test1.xlsx --ingest.header=true

  # Add a CSV source, with options
  $ sq add ./testdata/person.csv --ingest.header=true

  # Add a CSV source from a URL (will be downloaded)
  $ sq add https://sq.io/testdata/actor.csv

  # Add a source, and make it the active source (and group)
  $ sq add ./actor.csv --handle @csv/actor

  # Add a currently unreachable source
  $ sq add postgres://user:pass@db.offline.com/sakila --skip-verify

Flags:
  -t, --text                       Output text
  -h, --header                     Print header row (default true)
  -H, --no-header                  Don't print header row
  -j, --json                       Output JSON
  -c, --compact                    Compact instead of pretty-printed output
  -y, --yaml                       Output YAML
  -d, --driver string              Explicitly specify driver to use
  -n, --handle string              Handle for the source
  -p, --password                   Read password from stdin or prompt
      --skip-verify                Don't ping source before adding it
  -a, --active                     Make this the active source
      --ingest.header              Ingest data has a header row
      --driver.csv.empty-as-null   Treat ingest empty CSV fields as NULL (default true)
      --driver.csv.delim string    Delimiter for ingest CSV data (default "comma")
      --help                       help for add

Global Flags:
      --config string         Load config from here
      --debug.pprof string    pprof profiling mode (default "off")
      --error.format string   Error output format (default "text")
  -E, --error.stack           Print error stack trace to stderr
      --log                   Enable logging
      --log.file string       Log file path (default "/Users/neilotoole/Library/Logs/sq/sq.log")
      --log.format string     Log output format (text or json) (default "text")
      --log.level string      Log level, one of: DEBUG, INFO, WARN, ERROR (default "DEBUG")
  -M, --monochrome            Don't print color output
      --no-progress           Don't show progress bar
      --no-redact             Don't redact passwords in output
  -v, --verbose               Print verbose output