lesson

dbt Fundamentals

Models, materializations, tests, sources, and the dbt project structure.

dbt Fundamentals

What is dbt?

dbt (data build tool) is the T in ELT — it transforms data already in your warehouse using SQL SELECT statements. It adds:

  • Version control for SQL transforms
  • Automated testing (not null, unique, referential integrity, custom)
  • Lineage documentation (DAG of model dependencies)
  • Modular, reusable SQL via refs and macros

  • Project Structure

    my_project/
    ├── dbt_project.yml        # project config
    ├── profiles.yml           # connection config (usually ~/.dbt/profiles.yml)
    ├── models/
    │   ├── staging/           # raw → cleaned, 1-1 with source tables
    │   │   └── stg_orders.sql
    │   ├── intermediate/      # business logic, joins
    │   │   └── int_order_items.sql
    │   └── marts/             # final, consumer-facing models
    │       └── fct_orders.sql
    ├── tests/
    │   └── assert_order_total_positive.sql
    ├── macros/
    │   └── generate_surrogate_key.sql
    └── seeds/
        └── payment_methods.csv


    Models and Materializations

    Every dbt model is a .sql file with a SELECT:

    -- models/staging/stg_orders.sql
    SELECT
      id          AS order_id,
      customer_id,
      created_at  AS order_date,
      total_cents / 100.0 AS total_usd
    FROM {{ source('raw', 'orders') }}
    WHERE created_at >= '2020-01-01'

    Materializations (how dbt writes results):

    MaterializationWhen to use
    view (default)Fast iteration; small models
    tableSlow queries; infrequently changed
    incrementalLarge tables; append or upsert new rows only
    ephemeralReusable CTE — not persisted in warehouse
    Incremental pattern:
    {{ config(materialized='incremental', unique_key='order_id') }}

    SELECT ... FROM {{ source('raw', 'orders') }} {% if is_incremental() %} WHERE created_at > (SELECT MAX(order_date) FROM {{ this }}) {% endif %}


    Testing

    Schema tests (in schema.yml):

    models:
      - name: stg_orders
        columns:
          - name: order_id
            tests:
              - unique
              - not_null
          - name: customer_id
            tests:
              - not_null
              - relationships:
                  to: ref('stg_customers')
                  field: customer_id

    Run tests: dbt test


    Refs and Sources

  • {{ ref('model_name') }} — references another dbt model; builds the DAG
  • {{ source('schema', 'table') }} — references a raw source table; enables source freshness tests
  • SELECT o.*, c.country
    FROM {{ ref('stg_orders') }} o
    JOIN {{ ref('stg_customers') }} c ON o.customer_id = c.customer_id


    Macros (Jinja)

    -- macros/cents_to_dollars.sql
    {% macro cents_to_dollars(col) %}
      ({{ col }} / 100.0)
    {% endmacro %}

    -- usage in a model: SELECT {{ cents_to_dollars('total_cents') }} AS total_usd

    Sign in to use the AI study buddy on this lesson.

    Resources