dbt Analytics Engineering Certification Course

About This Course

dbt Analytics Engineering · 20 modules

This course covers every domain tested on the dbt Analytics Engineering exam. It is based on our 1200+ practice questions and was prepared by certification experts.

What you'll learn:

  • Every exam domain with detailed explanations
  • Common exam traps that catch unprepared candidates
  • Key concepts, syntax, and configurations
  • Real-world scenarios from actual exam questions
  • Quick-reference cheat sheets for last-minute review

1. Exam Overview

What the exam is testing

The dbt Analytics Engineering Certification validates whether you can use dbt in a production analytics workflow, not just whether you can remember commands. The exam expects you to reason through realistic project situations such as:

  • building a clean model DAG from raw sources to marts;
  • choosing the right materialization for performance and maintainability;
  • using ref() and source() correctly instead of hard-coded object names;
  • testing assumptions about source data and transformed models;
  • debugging model, YAML, SQL, dependency, and pipeline failures;
  • applying model governance features such as access, groups, versions, contracts, and grants;
  • using state-aware workflows for CI, slim CI, retries, and production-safe deployments;
  • managing packages, source freshness, exposures, and documentation so downstream users trust the data.

The real exam is scenario-heavy. A typical question gives you a project problem, then asks for the dbt-native fix. The best answer is usually the option that improves lineage, repeatability, maintainability, and production safety without over-engineering.

Current official exam structure to know

The official dbt Analytics Engineering Certification page currently lists these logistics and domains:

Item | Current detail
Duration | 2 hours
Questions | 65
Passing score | 65%
Supported version | dbt 1.11
Expected background | SQL proficiency and practical dbt experience
Question style | Practical and scenario-based; expect multiple-choice and interactive-style reasoning

Older official guides and older question banks often mention dbt Core 1.7 and include a separate documentation domain. The current public exam page lists 7 domains and no separate documentation domain. Documentation still matters because it appears across sources, model properties, lineage, exposures, and governance scenarios.

How to think like the exam

Think like a production analytics engineer:

  1. Use dbt abstractions before warehouse-specific shortcuts. Prefer ref(), source(), model configs, selectors, tests, contracts, and exposures over manual warehouse object names.
  2. Preserve the DAG. Anything that hides lineage, bypasses dependencies, or relies on manually ordered SQL is suspicious.
  3. Choose the simplest materialization that satisfies usage. Do not make everything incremental or table. Do not make a dashboard-facing mart ephemeral.
  4. Test assumptions, not implementation trivia. Use tests where they express business rules or data quality guarantees.
  5. Debug from dbt outward. Check logs, compiled SQL, YAML validity, dependencies, target/profile config, and then warehouse-specific SQL/errors.
  6. Separate development from production. Use state, defer, clone, CI jobs, and PR review to avoid rebuilding or querying expensive production tables unnecessarily.
  7. Govern public interfaces. If other teams depend on a model, use model access, versioning, contracts, docs, exposures, and deprecation instead of silently changing schemas.

How to use this course

Read Sections 1–3 once to orient yourself. Then study each domain in Section 4 and use Sections 5–10 as quick revision material. The course intentionally merges recurring themes from the question bank into decision frameworks so you can answer new scenarios rather than memorize answers.


2. Exam Domains

Current domain list

Priority from source bank | Official/current-style domain | Approx. share in analyzed source | What to master
1 | Developing and optimizing dbt models | ~26% | materializations, ref, source, sources, modular SQL, Jinja/macros, seeds, snapshots, configs, DAG, git workflow
2 | Implementing dbt tests | ~17% | generic/singular/custom tests, source tests, test config, severity, filtering, assumptions, CI testing
3 | Debugging data modeling errors | ~14% | compiled SQL, logs, YAML errors, SQL vs dbt issues, profiles, dependencies, model fixes
4 | Troubleshooting and optimizing dbt pipelines | ~13% | DAG failures, selectors, retries, clone, scheduling, CI, orchestration boundaries, production failure handling
5 | Managing dbt models governance | ~12% | access, groups, contracts, versions, deprecation, grants, stable public interfaces
6 | Leveraging the dbt state | ~10% | state selectors, result selectors, defer, slim CI, manifest/run_results, modified nodes, retry
7 | Implementing and Maintaining External Dependencies | ~9% | packages, dbt deps, package compatibility, exposures, source freshness, downstream dependency awareness

Priority notes

The largest share of the analyzed bank is model development and optimization. This makes sense because almost every other topic depends on a correct mental model of dbt resources, DAG lineage, materializations, and configuration precedence.

High-yield cross-domain concepts:

  • ref() vs source();
  • model materializations: view, table, incremental, ephemeral;
  • tests: generic vs singular vs custom generic;
  • source freshness and source testing;
  • compiled SQL for debugging;
  • YAML indentation and resource properties;
  • dbt build vs dbt run + dbt test;
  • model contracts, versions, access, and groups;
  • state selection, defer, and CI efficiency;
  • packages and macro compatibility.

What matters most

If the question says... | The exam usually wants you to think about...
Hard-coded schema/table names inside models | Replace with ref() for models or source() for raw tables
Model builds in wrong order | Missing ref() dependencies
Raw table dependency is not documented/tested | Define a source in YAML and use source()
Large append-only table | Incremental model with correct is_incremental() logic
Small stable mapping file | Seed
Point-in-time history of slowly changing source data | Snapshot
Business users query it often | Table or incremental, not ephemeral
Reusable CTE not directly queried | Ephemeral or macro depending on purpose
Downstream teams depend on the model | Public/protected access, versions, contracts, docs
Only changed models should run in CI | State selectors and defer
Failed job should resume safely | dbt retry / result selectors, not manual partial guessing
Source is late or stale | Source freshness, not a generic model test
BI dashboard depends on model | Exposure

3. Start-to-Finish Study Path

Foundation phase: build the dbt mental model

Learn first:

  • What a dbt project is: dbt_project.yml, models, macros, seeds, snapshots, tests, sources, packages.
  • How dbt builds a DAG from ref() and source().
  • Difference between development, CI, staging, and production environments.
  • Basic commands: dbt debug, dbt compile, dbt run, dbt test, dbt build, dbt docs generate, dbt deps, dbt seed, dbt snapshot, dbt source freshness.

Hands-on checklist:

  • Create a source YAML file with at least one source and table.
  • Create staging models using source().
  • Create intermediate/mart models using ref().
  • Run dbt compile and inspect target/compiled.
  • Generate docs and inspect lineage.

Intermediate phase: learn production patterns

Focus on:

  • materializations and their tradeoffs;
  • incremental models and is_incremental();
  • test types and test configuration;
  • model property YAML;
  • packages and macros;
  • git workflow and PR review;
  • docs and exposures.

Hands-on checklist:

  • Build one view, one table, one incremental model, one ephemeral model.
  • Add generic tests: not_null, unique, relationships, accepted_values.
  • Add a singular test for a business rule.
  • Add source freshness to a source.
  • Add an exposure for a dashboard.
  • Install a package and run dbt deps.

Advanced phase: governance, state, and debugging

Focus on:

  • model contracts and column-level constraints;
  • model versions and deprecation;
  • access levels and groups;
  • grants for warehouse permissions;
  • slim CI using state:modified+ and --defer;
  • result selectors and retries;
  • debugging YAML, SQL, package, and pipeline failures.

Hands-on checklist:

  • Add a contract to a model and intentionally break it.
  • Add a v2 model and deprecate v1.
  • Make a protected/public model and test dependency behavior.
  • Run state selection against a previous manifest.
  • Debug a failing test from the failure output and compiled SQL.

Final review phase

In the final review, do not reread everything equally. Focus on scenario triggers:

  • If the model is downstream-facing, think governance.
  • If the model is expensive and append-only, think incremental.
  • If dependencies are invisible, think ref()/source().
  • If only changed work should run, think state/defer.
  • If source data timeliness is the issue, think freshness.
  • If a dashboard breaks, think exposure, model contract/versioning, docs, and lineage.
  • If the error is unclear, inspect logs and compiled SQL before changing dbt configs.

4. Core Concepts by Domain

Domain 1: Developing and Optimizing dbt Models

Concepts

This is the highest-yield domain. You must know how dbt turns modular SQL files into a dependency graph and production data objects.

Key resource types:

Resource | What it represents | Common exam signal
Model | A SQL or Python transformation managed by dbt | Build clean staging/intermediate/mart layers
Source | Raw data object loaded outside dbt | Use when referring to raw tables
Seed | Static CSV version-controlled in the project | Small lookup/mapping/reference data
Snapshot | Point-in-time history of mutable source records | SCD-style history where source overwrites changes
Macro | Reusable Jinja logic | Repeated SQL pattern or generated logic
Test | Data assertion | Validate uniqueness, not null, relationships, accepted values, business rules
Exposure | Downstream asset such as BI dashboard, notebook, ML job | Show downstream dependency and ownership
Package | External reusable dbt project | Shared macros/models/tests from dbt Hub or Git

ref() and source()

ref() is for dbt models. source() is for raw objects loaded outside dbt.

Need | Use | Why
A mart depends on stg_orders | {{ ref('stg_orders') }} | Creates DAG dependency and environment-aware relation name
A staging model reads raw stripe.payments | {{ source('stripe', 'payments') }} | Documents raw dependency and supports source tests/freshness
A model directly queries analytics_prod.stg_orders | Replace with ref() | Hard-coding breaks lineage and environment portability
A model directly queries raw.shopify.orders | Replace with source() | Raw dependencies belong in source YAML

Exam trap: If the question says models build in the wrong order, do not choose a scheduler workaround. dbt order comes from ref() dependencies.
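
To make the ordering rule concrete, here is a toy sketch in Python of how ref() calls could be turned into a build order. It is not dbt's parser: the model names, the SQL strings, and the helper functions `parents` and `build_order` are all hypothetical, and a simple regular expression stands in for real Jinja rendering.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical model files (name -> SQL text); not taken from a real project.
MODELS = {
    "fct_orders": (
        "select * from {{ ref('stg_orders') }} o "
        "join {{ ref('stg_payments') }} p on o.order_id = p.order_id"
    ),
    "stg_orders": "select * from {{ source('shopify', 'orders') }}",
    "stg_payments": "select * from {{ source('stripe', 'payments') }}",
    # Hard-coded relation name: nothing here tells the graph about stg_orders.
    "legacy_report": "select * from analytics_prod.stg_orders",
}

# Simplified pattern for {{ ref('model_name') }}; real Jinja parsing is richer.
REF_PATTERN = re.compile(r"""\{\{\s*ref\(\s*['"]([^'"]+)['"]\s*\)\s*\}\}""")


def parents(sql):
    """Return the set of model names this SQL depends on through ref()."""
    return set(REF_PATTERN.findall(sql))


def build_order(models):
    """Order models so that every ref() parent is built before its child."""
    graph = {name: parents(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(build_order(MODELS))
    print(parents(MODELS["legacy_report"]))  # empty: the dependency is invisible
```

In this sketch `legacy_report` has no recorded parent, so nothing forces it to run after `stg_orders`. That is the failure mode the exam trap describes: the fix is to add the ref(), not to reorder jobs in a scheduler.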

Materializations

Materialization | Use when | Avoid when | Exam trap
View | Logic should stay lightweight and always query fresh upstream data | Heavy repeated dashboard queries need fast performance | View does not store results; it can push cost to query time
Table | Model is expensive to compute and queried often | Data changes frequently and full rebuild is too expensive | Table rebuilds entire relation each run
Incremental | Large table, small new/changed subset per run | Large percentage updates each run or logic cannot isolate changes | Requires correct filter and unique key/strategy where needed
Ephemeral | Reusable intermediate logic not queried directly | Many downstream refs create repeated SQL; business users need to query it | Ephemeral is inlined as CTE, not created as a database object
Seed | Small static CSV controlled in git | Large dynamic data or frequently updated operational data | Seeds are not an ingestion system
Snapshot | Track historical changes in mutable source records | You only need latest state or immutable event data | Snapshot is not a materialization for performance

Incremental models

Use incremental when a model has many rows and only a small subset is added or changed each run.

Core reasoning:

  • is_incremental() gates logic that should only run on incremental runs.
  • The SQL must be valid for both full-refresh and incremental runs.
  • Use a reliable event/update timestamp or high-water mark.
  • Use unique_key and an incremental strategy when records can update.
  • Use --full-refresh when logic changes require rebuilding historical rows.
  • Schema changes may require on_schema_change handling or full refresh depending on the warehouse and change type.
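
The reasoning above can be simulated in a few lines of Python. This is only a sketch of the idea (high-water-mark filter plus upsert on a unique key), not how dbt or any warehouse implements an incremental strategy. The function `run_incremental`, the column names `id` and `updated_at`, and the sample rows are assumptions made for illustration.

```python
def run_incremental(target, source_rows, unique_key="id", ts="updated_at",
                    full_refresh=False):
    """Return the target table's rows after one simulated run.

    target is None when the table does not exist yet (first run).
    """
    # Rough analogue of is_incremental(): the table exists and no full refresh.
    incremental = target is not None and not full_refresh

    if incremental:
        # High-water mark: the latest timestamp already loaded.
        high_water = max((row[ts] for row in target), default=None)
        batch = [r for r in source_rows
                 if high_water is None or r[ts] > high_water]
        merged = {row[unique_key]: row for row in target}
    else:
        # First run or full refresh: rebuild from all source rows.
        batch = list(source_rows)
        merged = {}

    # Upsert on the unique key so updated records replace their old version.
    for row in batch:
        merged[row[unique_key]] = row
    return list(merged.values())


if __name__ == "__main__":
    day1 = [{"id": 1, "updated_at": 1, "status": "new"},
            {"id": 2, "updated_at": 1, "status": "new"}]
    day2 = [{"id": 1, "updated_at": 2, "status": "shipped"},
            {"id": 2, "updated_at": 1, "status": "new"},
            {"id": 3, "updated_at": 2, "status": "new"}]
    table = run_incremental(None, day1)   # first run loads everything
    table = run_incremental(table, day2)  # second run touches only ids 1 and 3
    print(table)
```

Without the unique key, the second run would append a second row for id 1 instead of replacing it, which is why the key matters whenever records can update.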

Common bad answers:

  • "Incremental models are always rebuilt." False.
  • "Incremental models are always best for large tables." Not if most rows change each run.
  • "You never need is_incremental()." Usually false for selective processing.
  • "Use ephemeral to make a large dashboard model faster." Usually wrong; ephemeral can duplicate heavy SQL.

Sources

A source maps to a raw data location, commonly database + schema, with tables underneath. Use sources to centralize raw object naming and document external dependencies.

Good source YAML includes:

  • source name;
  • database/schema where needed;
  • tables;
  • source and column descriptions;
  • tests on raw data assumptions;
  • freshness where timeliness matters;
  • loaded timestamp field for freshness checks.

Exam trap: If multiple raw tables are in the same database/schema, they are usually one source with multiple tables, not multiple sources.
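
As a rough illustration, the Python sketch below mirrors the shape of a source definition as a dictionary (one source, several tables) and shows the kind of comparison a freshness check performs. Everything in it is assumed for the example: the `stripe` source layout, the `_loaded_at` column, the thresholds, and the helpers `resolve_source` and `freshness_status`. The exact boundary behaviour of dbt's own freshness check is not modelled.

```python
from datetime import datetime, timedelta

# Hypothetical source definition: one source, multiple tables beneath it.
SOURCE = {
    "name": "stripe",
    "database": "raw",
    "schema": "stripe",
    "loaded_at_field": "_loaded_at",  # assumed column name
    "freshness": {
        "warn_after": timedelta(hours=12),
        "error_after": timedelta(hours=24),
    },
    "tables": ["payments", "refunds"],
}


def resolve_source(source, table):
    """Map a (source, table) pair to a fully qualified relation name."""
    if table not in source["tables"]:
        raise KeyError(f"{table!r} is not declared in source {source['name']!r}")
    return f"{source['database']}.{source['schema']}.{table}"


def freshness_status(max_loaded_at, now, warn_after, error_after):
    """Classify a source table by the age of its most recent load."""
    age = now - max_loaded_at
    if age > error_after:
        return "error"
    if age > warn_after:
        return "warn"
    return "pass"


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0)
    last_load = now - timedelta(hours=13)
    print(resolve_source(SOURCE, "payments"))
    print(freshness_status(last_load, now, **SOURCE["freshness"]))
```

Both tables resolve through the same source entry, so a change of database or schema is made in one place rather than in every staging model.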

Modularity and DRY SQL

Good dbt modeling decomposes SQL into layers:

Layer | Typical purpose | Typical materialization
Staging | Clean, rename, cast, standardize one source | View, sometimes ephemeral/table
Intermediate | Reusable transformations and joins | Ephemeral, view, table depending on cost
Mart | Business-facing facts/dimensions | Table/incremental for performance

Use macros for reusable logic patterns, not for hiding business-critical model lineage. Use models when you need DAG visibility and testable transformation steps.

Jinja, variables, and environment config

Feature | Use case | Trap
var() | Project variables provided in dbt_project.yml or CLI | Do not store secrets in vars
env_var() | Environment-specific values and secrets | Must be available in runtime environment
target | Branch logic by target/profile | Avoid excessive target-specific model logic that makes behavior hard to test
config() | Model-level configs in SQL | Remember config precedence
dbt_project.yml | Default project/folder configs | Bad indentation or wrong resource path breaks expectations
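
Config precedence is easier to remember as "most specific wins": a config() block in the model file overrides the model's properties YAML, which overrides folder-level defaults in dbt_project.yml. The Python sketch below models that as a layered dictionary merge. It is a simplification under stated assumptions: the function `resolve_config` and the sample values are hypothetical, and it ignores configs that dbt combines across levels instead of overriding.

```python
def resolve_config(project_defaults, properties_yaml, in_file_config):
    """Merge three config layers; later (more specific) layers win."""
    resolved = dict(project_defaults)  # dbt_project.yml defaults
    resolved.update(properties_yaml)   # model properties in a .yml file
    resolved.update(in_file_config)    # config() block in the model SQL
    return resolved


if __name__ == "__main__":
    print(resolve_config(
        {"materialized": "view", "schema": "staging"},
        {"materialized": "table"},
        {"materialized": "incremental"},
    ))
```

Here the model ends up incremental because the in-file setting is the most specific, while `schema` falls through from the project default because nothing more specific overrides it.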

Python models

Python models are used for transformations that are easier to express in Python than in SQL, such as advanced data science-style transformations. Exam reasoning remains dbt-native:

  • They are still models in the DAG.
  • They can use dbt.ref() and dbt.source().
  • They are not a replacement for simple SQL transformations.
  • Support depends on the adapter/platform.
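
A minimal sketch of a Python model follows. The `model(dbt, session)` function uses only `dbt.config()` and `dbt.ref()`, so it keeps the model in the DAG like any SQL model. The `FakeDbt` class is a hypothetical stand-in written for this example so the function can be called locally; on a real platform dbt supplies its own object and `dbt.ref()` returns a platform DataFrame, not a list.

```python
def model(dbt, session):
    """A pass-through Python model: declare config, read a parent, return it."""
    dbt.config(materialized="table")
    orders = dbt.ref("stg_orders")  # registers stg_orders as a DAG parent
    return orders


class FakeDbt:
    """Hypothetical stand-in for the object dbt passes to model()."""

    def __init__(self, relations):
        self.relations = relations  # model name -> rows (illustrative only)
        self.configs = {}
        self.dependencies = []

    def config(self, **kwargs):
        self.configs.update(kwargs)

    def ref(self, name):
        self.dependencies.append(("ref", name))
        return self.relations[name]


if __name__ == "__main__":
    fake = FakeDbt({"stg_orders": [{"order_id": 1, "status": "completed"}]})
    print(model(fake, session=None))
    print(fake.dependencies)
```

A transformation this simple would normally stay in SQL; the point is only that the dependency is declared through `dbt.ref()` instead of a hard-coded table name.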

Git workflow

Expected git skills:

  • create feature branches;
  • commit changes;
  • pull from main/head branch to stay updated;
  • resolve conflicts;
  • open pull requests;
  • use CI before merging.

Exam trap: If the question says your branch is behind main, the correct general action is to pull/reconcile with the head branch, not manually copy files or merge straight into production.

Patterns

  • Raw table reference → define source + use source().
  • Model-to-model dependency → use ref().
  • Repeated business logic → refactor into staging/intermediate models or macros.
  • Expensive dashboard model → table or incremental.
  • Append-only high-volume data → incremental.
  • Static mapping table → seed.
  • Source overwrites values but history needed → snapshot.
  • Direct warehouse object permissions required → grants.

Traps

  • Choosing a materialization only because it is "faster" without considering freshness, cost, and query pattern.
  • Using hard-coded schema names in dbt models.
  • Making every model a table, which increases rebuild time/storage.
  • Making every staging model ephemeral, which can duplicate heavy SQL downstream.
  • Treating seeds as ingestion for operational data.
  • Using macros where a model would provide better lineage and testing.
  • Forgetting that a table model fully rebuilds by default.
