Skip to main content

dbt-synapse

custom dbt adapter for Azure Synapse. Major credit due to @mikaelene and his dbt-sqlserver custom adapter.

related packages#

To get additional functionality, check out:

major differences b/w dbt-synapse and dbt-sqlserver#

  • macros use only Azure Synapse T-SQL. Relevant GitHub issue
  • use of Create Table as Select (CTAS) means you don't need post-hooks to create indices (see Table Materializations section below for more info)
  • rewrite of snapshots because Synapse doesn't support MERGE.

status & support#

as of now, only support for dbt 0.18.0

Passing all tests in dbt-adapter-tests, except test_dbt_ephemeral_data_tests

outstanding work:#

  • ephemeral materializations (workaround for non-recursive CTEs) see #25
  • officially rename the adapter from sqlserver to synapse see #40
  • Make seed creation more fault-tolerant #36

Installation#

Easiest install is to use pip (not yet registered on PyPI).

First install ODBC Driver version 17.

pip install dbt-synapse

Authentication#

Please see the Authentication section of dbt-sqlserver's README.md

Table Materializations#

CTAS allows you to materialize tables with indices and distributions at creation time, which obviates the need for post-hooks to set indices.

Example#

You can also configure index and dist in dbt_project.yml.

`models/stage/absence.sql#

{{    config(        index='HEAP',        dist='ROUND_ROBIN'        )}}
select *from ...

is turned into the relative form (minus __dbt's _backup and _tmp tables)

  CREATE TABLE ajs_stg.absence_hours    WITH(      DISTRIBUTION = ROUND_ROBIN,      HEAP      )    AS (SELECT * FROM ajs_stg.absence_hours__dbt_tmp_temp_view)

Indices#

  • CLUSTERED COLUMNSTORE INDEX (default)
  • HEAP
  • CLUSTERED INDEX ({COLUMN})
  • CLUSTERED COLUMNSTORE INDEX ORDER({{COLUMN}}) # see docs for performance suggestions

Distributions#

  • ROUND_ROBIN (default)
  • HASH({COLUMN})
  • REPLICATE

Changelog

See CHANGELOG.md