Concepts

sq

sq is the command-line utility itself. It is free/libre open-source software, available under the MIT License. The code is available on GitHub. sq was created by Neil O’Toole.

SLQ

SLQ is the formal name of sq’s query language, similar to jq’s syntax. The Antlr grammar is available on GitHub.

Source

A source is a data source such as a database instance (SQL source), or an Excel or CSV file (document source). A source has a driver type, location and handle.

Learn more in the sources section.

Driver type

This is the driver type used to connect to the source, e.g. postgres, sqlserver, csv, etc. You can specify the type explicitly when invoking sq add, but usually sq can detect the driver automatically.

Location

The source location is the URI or file path of the source, such as postgres://sakila:****@localhost/sakila or /Users/neilotoole/sq/xl_demo.xlsx. You specify the source location when invoking sq add.

Handle

The handle is how sq refers to a data source, such as @sakila or @customer_csv. The handle must begin with @. You specify the handle when adding a source with sq add. The handle can also be used to specify a source group, e.g. @prod/sales, @dev/sales.

Active source

The active source is the source upon which sq acts if no other source is specified.

By default, sq requires that the first element of a query is the source handle:

$ sq '@sakila | .actor | .first_name, last_name'

But if an active source is set, you can omit the handle:

$ sq '.actor | .first_name, .last_name'

You can use sq src to get or set the active source.

SQL source

A SQL source is a source backed by a “real” DB, such as Postgres. Contrast with document source.

Document source

A document source is a source backed by a document or file such as CSV or XLSX. Some functionality is not available for document sources. For example, sq doesn’t provide a mechanism to insert query results into a CSV file. Contrast with SQL source. A document source’s data is automatically ingested by sq. The document source’s location can be a local filepath or an HTTP URL.

Group

The group mechanism organizes sources into groups, based on path-like names. Given handles @prod/sales, @dev/sales and @dev/test, we have three sources, but two groups, prod and dev. See the groups docs.

Active group

Like active source, there is an active group. Use the sq group command to get or set the active group.

Driver

A driver is a software component implemented by sq for each data source type. For example, Postgres or CSV.

Use sq driver ls to view the available drivers.

Monotable

If a source is monotable, it means that the source type is really only a single table, such as a CSV file. sq always names that single table data. You access that table like this: @actor_csv | .data.

Note that not all document sources are monotable. For example, XLSX sources have multiple tables, where each worksheet is effectively equivalent to a DB table.

Metadata

sq inspect returns metadata about a source. At a minimum, sq inspect is useful for a quick reminder of table and column names:

sq inspect source text

sq inspect comes into its own when used with the --json flag, which outputs voluminous info on the data source. It is a frequent practice to combine sq inspect with jq . For example, to list the tables of the active source:

$ sq inspect -j | jq -r '.tables[] | .name'
actor
address
category
[...]

See more examples in the cookbook.

Ingest DB

Ingest DB refers to the temporary ("ingest") database that sq uses for under-the-hood activity such as converting a document source like CSV to relational format. By default, sq uses an embedded SQLite instance for the ingest DB.

Join DB

Join DB is similar to Ingest DB, but is used for cross-source joins. By default, sq uses an embedded SQLite instance for the Join DB.

Schema & catalog

Database implementations have the concept of a schema, which you can think of as a namespace for tables. Some databases go further and support the concept of a catalog, which is a collection of schemas. Often the terms catalog and database are used interchangeably, and, in practice, the various terms are used inconsistently and confusedly.

Here’s what the hierarchy looks like for Postgres (credit):

Hierarchy

Each of the sq DB driver implementations supports the concept of a schema in some way, but some drivers don’t support the catalog mechanism. Here’s a summary:

DriverDefault schemaCatalog support?
PostgrespublicYes
SQLitemainNo
MySQLConnection-dependentNo
SQL serverdboYes

The SLQ functions schema() and catalog() return the schema and catalog of the active source. See the docs for details of how each driver implements these functions.