lesson
Data Modelling
Star schema, snowflake schema, normalisation, slowly changing dimensions, and wide tables.
Data Modelling
Dimensional Modelling
Kimball-style dimensional modelling organises data for analytics — optimised for reads, not writes.
Fact Tables
Store measurable events or metrics. Rows are thin — mostly foreign keys and measures.fact_orders (order_id, date_key, customer_key, product_key, quantity, revenue, cost)Dimension Tables
Descriptive context for facts. Wide and denormalised for easy filtering.dim_customer (customer_key, name, email, city, country, segment, created_at)Star Schema vs Snowflake Schema
| Star | Snowflake | |
|---|---|---|
| Dimensions | Denormalised (flat) | Normalised (split into sub-dims) |
| Query complexity | Simple JOINs | More JOINs |
| Storage | More redundancy | Less redundancy |
| BI tool performance | Faster (fewer joins) | Slower |
| Recommended | Most analytics cases | When dimension tables are very large |
Slowly Changing Dimensions (SCD)
How do you handle dimension changes over time? (e.g. customer changes city)
| Type | Strategy | Use when |
|---|---|---|
| SCD Type 1 | Overwrite old value | History not needed |
| SCD Type 2 | Add new row with valid_from/valid_to | Full history needed |
| SCD Type 3 | Add "previous_value" column | Only current + one prior value needed |
| SCD Type 4 | Separate history table | High-change dimensions |
dim_customer (
customer_key SERIAL PRIMARY KEY, -- surrogate key
customer_id INT, -- natural/business key
name TEXT,
city TEXT,
valid_from DATE NOT NULL,
valid_to DATE, -- NULL = current record
is_current BOOLEAN DEFAULT TRUE
)Wide Tables (One Big Table)
Modern cloud warehouses (Snowflake, BigQuery, Redshift) handle wide tables well. Flattening joins into one wide table:
final or mart layerTradeoff: harder to maintain, larger storage, schema changes require full refresh.
Normalisation
| Normal Form | Rule |
|---|---|
| 1NF | Atomic values, no repeating groups |
| 2NF | 1NF + no partial dependency on composite PK |
| 3NF | 2NF + no transitive dependency |
| BCNF | Stricter 3NF — every determinant is a candidate key |
Sign in to use the AI study buddy on this lesson.