dbt-synapse
custom dbt adapter for Azure Synapse. Major credit due to @mikaelene and his dbt-sqlserver custom adapter.
related packages#
To get additional functionality, check out:
- fishtown-analytics/dbt-external-tables which allows for easy staging of blob sources defined in YAML, and
- dbt-msft/tsql-utils enables dbt-synapseto use dbt-utils: the much-loved, extremely-useful collection of dbt macros.
major differences b/w dbt-synapse and dbt-sqlserver#
- macros use only Azure Synapse T-SQL. Relevant GitHub issue
- use of Create Table as Select (CTAS) means you don't need post-hooks to create indices (see Table Materializations section below for more info)
- rewrite of snapshots because Synapse doesn't support MERGE.
status & support#
as of now, only support for dbt 0.18.0
Passing all tests in dbt-adapter-tests, except test_dbt_ephemeral_data_tests
outstanding work:#
- ephemeralmaterializations (workaround for non-recursive CTEs) see #25
- officially rename the adapter from sqlservertosynapsesee #40
- Make seed creation more fault-tolerant #36
Installation#
Easiest install is to use pip (not yet registered on PyPI).
First install ODBC Driver version 17.
pip install dbt-synapseAuthentication#
Please see the Authentication section of dbt-sqlserver's README.md
Table Materializations#
CTAS allows you to materialize tables with indices and distributions at creation time, which obviates the need for post-hooks to set indices.
Example#
You can also configure index and dist in dbt_project.yml.
`models/stage/absence.sql#
{{    config(        index='HEAP',        dist='ROUND_ROBIN'        )}}
select *from ...is turned into the relative form (minus __dbt's _backup and _tmp tables)
  CREATE TABLE ajs_stg.absence_hours    WITH(      DISTRIBUTION = ROUND_ROBIN,      HEAP      )    AS (SELECT * FROM ajs_stg.absence_hours__dbt_tmp_temp_view)Indices#
- CLUSTERED COLUMNSTORE INDEX(default)
- HEAP
- CLUSTERED INDEX ({COLUMN})
- CLUSTERED COLUMNSTORE INDEX ORDER({{COLUMN}})# see docs for performance suggestions
Distributions#
- ROUND_ROBIN(default)
- HASH({COLUMN})
- REPLICATE
Changelog
See CHANGELOG.md