XLSX (Excel)

The sq Excel driver implements connectivity for Microsoft XLSX files and variants.

Supported file formats

The driver supports .xlsx, .xlam, .xlsm,.xltm and .xltx. Note that even if the file format is, say, .xlam, the driver type is still xlsx. The driver does not support the older .xls and .xlsb formats.

Add source

When adding an XLSX source via sq add, the location string is simply the filepath. For example:

$ sq add ./sakila.xlsx
@sakila_xlsx  xlsx  sakila.xlsx

You can also pass an absolute filepath (and, in fact, any relative path is expanded to an absolute path when saved to sq’s config).

Worksheets

When an XLSX source is added, sq treats each sheet as a separate database table. Thus a sheet named actor can be queried as @sakila_xlsx.actor.

Empty sheets are ignored, and can’t be queried.

Header row

Excel sheets will often have a header row containing column names. If the sheet doesn’t have a header row, by default sq will name the columns A, B, C, etc. (Note that the column naming behavior is configurable.

Generally, sq will automatically detect, for each sheet, whether or not the first row is a header row. If the header row detection is having difficulty with your workbook, you can explicitly specify that a header row is present (or not) via --ingest.header.

# Explicitly specify that a header row exists (in each sheet)
$ sq add --ingest.header ./sakila.xlsx

# Explicitly specify no header row
$ sq add --ingest.header=false ./sakila-no-header.xlsx

Duplicate columns

If the header row has duplicate column names, the later columns are renamed. For example, these columns:

actor_id, first_name, actor_id

become:

actor_id, first_name, actor_id_1

The renaming behavior is configurable via ingest.column.rename.

Column kind

When ingesting an Excel workbook, sq attempts to detect the data “kind” of each column (int, float, text, etc.). Thus an Excel date becomes a date in the backing DB, an Excel number becomes an int or a float, and various date & time values are parsed into an appropriate DB type. See the column kind section for more on this mechanism.