4. Core Concepts by Domain
Domain 1: Snowflake AI Data Cloud Features and Architecture
Concepts
This domain is the backbone of the exam. It tests whether you understand how Snowflake is structured and how its platform features fit together.
Key concepts:
- Three-layer architecture: storage, compute, cloud services.
- Independent scaling of compute and storage.
- Virtual warehouses and multi-cluster warehouses.
- Object hierarchy and context: database, schema, role, warehouse.
- Table types: permanent, transient, temporary, external, Iceberg, hybrid.
- View types: standard, secure, materialized.
- Semi-structured data using VARIANT, OBJECT, ARRAY, and functions such as
FLATTEN.
- Snowflake interfaces: Snowsight, SnowSQL, Snowflake CLI, connectors, drivers, APIs.
- Snowpark for programmatic development using languages such as Python, Java, and Scala.
- Snowflake Cortex and AI features for LLM and ML use cases.
- Snowflake Notebooks for interactive development and analysis.
Services and features
| Feature |
Use when |
Do not choose when |
| Virtual warehouse |
You need compute for queries, loads, transformations, or Snowpark |
You need to store files or govern access |
| Multi-cluster warehouse |
Many concurrent queries queue on the same warehouse |
One single complex query is slow |
| Auto-suspend |
Warehouse is idle intermittently and cost matters |
You need guaranteed always-warm low latency |
| Snowpark |
You need programmatic processing inside Snowflake |
You only need simple SQL transformations |
| Snowflake Cortex |
You need built-in AI/LLM/ML functions in Snowflake |
The issue is RBAC, loading, or sharing |
| Snowflake Notebooks |
You need interactive analysis, Python, or exploratory workflows |
You need a scheduled production ingestion service |
| Secure view |
You need to hide view definition or protect exposed logic |
You need persisted precomputed results |
| Materialized view |
Repeated query pattern benefits from precomputed maintenance |
Highly volatile data or arbitrary ad hoc queries dominate |
| External table |
Query files in external cloud storage using metadata |
You need to physically load data into Snowflake tables |
| Iceberg table |
Open table format / interoperability with external engines matters |
You only need normal managed Snowflake storage |
Patterns
- Workload isolation: use separate warehouses for ELT, BI, data science, and loading. This avoids one workload starving another.
- Compute cost control: use auto-suspend, auto-resume, right-sized warehouses, resource monitors, and workload-specific warehouses.
- Concurrency handling: multi-cluster warehouses add clusters for concurrent queries. Scaling up a warehouse helps individual query resources but does not always solve queuing.
- Semi-structured ingestion: store flexible JSON in VARIANT, then query paths or flatten arrays as needed.
- Platform-native AI: Cortex is the exam-safe answer when the scenario asks for Snowflake-native AI/LLM capabilities.
Traps
- Warehouse size versus multi-cluster: size helps one query; multi-cluster helps many concurrent queries.
- Storage versus compute: warehouses execute work; they do not own table storage.
- Stages versus tables: stages hold files; tables hold structured data.
- Secure view versus materialized view: secure view protects definition/exposure; materialized view precomputes results.
- Temporary versus transient: temporary is session-scoped; transient persists until dropped but has no Fail-safe.
- SnowSQL versus Snowsight: SnowSQL is CLI; Snowsight is browser UI.
Domain 2: Account Management and Data Governance
Concepts
This domain tests secure administration, access control, monitoring, and governance. The most important theme is least privilege through roles.
Key concepts:
- Role-based access control.
- System-defined roles such as
ORGADMIN, ACCOUNTADMIN, SECURITYADMIN, SYSADMIN, USERADMIN, and PUBLIC.
- Custom functional roles and access roles.
- Granting privileges to roles, not directly to users.
- Role hierarchy and inherited privileges.
- Managed access schemas for centralized object grant control.
- Network policies, authentication, MFA/SSO concepts.
- Masking policies, row access policies, projection policies, aggregation policies, and join policies.
- Tags and object classification.
- Resource monitors and budgets.
ACCOUNT_USAGE, INFORMATION_SCHEMA, Access History, Query History, and Warehouse Metering.
- Trust Center and security posture monitoring.
Services and features
| Feature |
Best use |
Common wrong use |
| RBAC roles |
Least privilege, privilege inheritance, user access |
Direct grants to users for long-term access |
SECURITYADMIN |
Manage grants and roles |
Daily data engineering or BI querying |
ACCOUNTADMIN |
Top-level account administration |
Routine development work |
| Managed access schema |
Centralized grant management by schema owner |
Temporary personal sandbox grants |
| Masking policy |
Show different column values based on role/context |
Filter rows from a table |
| Row access policy |
Filter visible rows based on policy logic |
Hide only part of a column value |
| Tags |
Classify objects, track ownership, support governance |
Enforce query performance directly |
| Resource monitor |
Limit/alert warehouse credit consumption |
Improve query pruning or join design |
ACCOUNT_USAGE |
Historical account-level metadata with latency |
Real-time object metadata only |
INFORMATION_SCHEMA |
Current database-level metadata |
Long historical metering trend analysis |
Patterns
- Least privilege pattern: grant object privileges to access roles, assign access roles to functional roles, and assign functional roles to users.
- Admin separation pattern: avoid using
ACCOUNTADMIN for routine tasks. Use lower administrative roles for security, users, warehouses, and objects.
- Sensitive column pattern: use masking policies for PII columns and row access policies for row-level restrictions.
- Cost-control pattern: combine auto-suspend, warehouse sizing, multi-cluster where needed, resource monitors, query history, and warehouse metering.
- Governance metadata pattern: use tags for classification, ownership, data domains, and cost attribution.
Traps
- Masking versus row access: masking changes displayed values; row access filters rows.
- Resource monitor versus performance feature: resource monitors control credits, not query speed.
- ACCOUNT_USAGE versus INFORMATION_SCHEMA: account-level historical views often have latency; information schema is narrower and more current.
- More privileges do not fix performance: granting admin roles does not improve pruning or warehouse capacity.
- Direct grants are usually not exam-safe: the exam favors scalable role-based privilege models.
Domain 3: Data Loading, Unloading, and Connectivity
Concepts
This domain tests how data enters and leaves Snowflake and how clients connect to it.
Key concepts:
- Internal stages: Snowflake-managed file storage.
- External stages: cloud storage references such as S3, Azure Blob/ADLS, or Google Cloud Storage.
- Named stages, table stages, and user stages.
- File formats for CSV, JSON, Avro, ORC, Parquet, XML.
COPY INTO <table> for loading.
COPY INTO <location> for unloading.
VALIDATION_MODE and load history for troubleshooting.
- Snowpipe for continuous file ingestion.
- Snowpipe Streaming for low-latency row ingestion.
- Streams for change tracking and tasks for scheduling/orchestration.
- Dynamic tables for declarative incremental transformations.
- Connectors, drivers, SnowSQL, Snowflake CLI, Python connector, JDBC/ODBC, Spark connector.
- Semi-structured data using VARIANT and
FLATTEN.
Services and features
| Requirement |
Best answer |
Why |
| Load files from a stage into a table |
COPY INTO <table> |
Standard bulk load mechanism |
| Export table/query results to files |
COPY INTO <location> |
Unloads data to stage/cloud storage |
| Store temporary files in Snowflake-managed storage |
Internal stage |
Snowflake manages storage for staged files |
| Reference cloud object storage files |
External stage |
Points Snowflake to external cloud storage |
| Query files without loading into Snowflake table |
External table |
Uses external file metadata |
| Continuous file ingestion |
Snowpipe |
Loads new staged files as they arrive |
| Low-latency streaming rows |
Snowpipe Streaming |
Avoids file-based ingestion latency |
| Track changed rows |
Stream |
Captures inserts/updates/deletes for consumption |
| Run scheduled SQL |
Task |
Executes SQL or procedures on schedule/dependency |
| Declarative incremental pipeline |
Dynamic table |
Snowflake maintains refresh based on target lag |
Patterns
- Bulk batch load: files land in internal/external stage → file format parses data →
COPY INTO table loads rows → errors are validated or inspected.
- Continuous file load: cloud events notify Snowpipe → staged files are automatically ingested.
- CDC-style transformation: stream tracks changes → task processes stream rows → target table is updated.
- Managed incremental transformation: dynamic table defines target query → Snowflake refreshes according to target lag.
- Semi-structured landing: ingest JSON to VARIANT → flatten or project columns when needed.
Traps
- Snowpipe is not for arbitrary app compute: it loads data, usually from files.
- COPY direction matters:
COPY INTO table loads; COPY INTO location unloads.
- Stage is not a table: a stage stores files, not relational rows.
- External stage versus external table: external stage references storage; external table lets Snowflake query external files using metadata.
- Streams do not schedule work: streams track changes; tasks run work.
- Dynamic tables are not temporary tables: dynamic tables maintain transformation results; temporary tables disappear with the session.
- File format choices matter: CSV options do not solve Parquet logical type behavior, and JSON should usually land in VARIANT.
Domain 4: Performance Optimization, Querying, and Transformation
Concepts
This domain is heavily scenario-based. The exam wants you to identify the bottleneck correctly before choosing a feature.
Key concepts:
- Query profile interpretation.
- Warehouse sizing and scaling.
- Queuing, concurrency, and multi-cluster warehouses.
- Micro-partitions, metadata pruning, clustering keys, clustering depth.
- Search Optimization Service for selective point lookups and specific access patterns.
- Query Acceleration Service for eligible scan-heavy queries.
- Result cache, local disk cache, and metadata cache concepts.
- Materialized views and precomputation.
- Join design, exploding joins, selective filters, predicate pushdown.
- Window functions, aggregations, CTEs, and repeated subqueries.
- Streams/tasks/dynamic tables for transformations.
- Transaction behavior and query history.
Performance decision table
| Symptom in question |
Think first |
Do not jump to |
| Many queries are queued |
Increase concurrency / multi-cluster warehouse |
Clustering keys |
| One large query is slow due to insufficient resources |
Scale warehouse size |
Multi-cluster only |
| Date-filtered large table scans too many partitions |
Clustering / query filters / pruning |
More privileges |
| Repeated point lookups on selective columns |
Search Optimization Service |
Bigger warehouse as the only fix |
| Repeated expensive aggregate query |
Materialized view or refactor |
Secure share |
| Same query rerun with unchanged data |
Result cache |
Permanent indexing |
| Massive intermediate rows after join |
Fix join predicate/cardinality |
Fail-safe or Time Travel |
| Expensive repeated subquery |
Refactor, temp/intermediate table, CTE/materialization where appropriate |
Authentication change |
| Need scheduled transformation |
Task or dynamic table |
Snowpipe by itself |
Patterns
- Start with bottleneck identification: queuing, scanning, spilling, joins, remote I/O, or repeated computation.
- Use the Query Profile: it tells you where time and data movement occur.
- Design for pruning: selective predicates on columns aligned with micro-partitions reduce scanned data.
- Separate workloads: BI dashboards, ELT, and data science should not always share one warehouse.
- Avoid over-tuning early: Snowflake automatically maintains micro-partition metadata; manual clustering is useful only when natural clustering is insufficient.
- Precompute stable repeated results: materialized views can help when query patterns repeat and maintenance cost is justified.
Traps
- Traditional indexes are not the primary Snowflake pattern: Snowflake relies on micro-partitions, pruning, search optimization, and automatic optimization services.
- Scaling up versus scaling out: up helps a single query; out helps concurrency.
- Result cache is not permanent storage: it is invalidated when underlying data or conditions change.
- Materialized view versus secure view: one optimizes repeated computation; the other protects view details.
- Fail-safe does not fix query performance: it is a recovery mechanism.
- Clustering is not free: it may improve pruning but adds maintenance cost.
Domain 5: Data Collaboration
Concepts
This domain tests data sharing, Marketplace/listings, reader accounts, cloning, replication, Time Travel, and Fail-safe. It is smaller but full of easy-to-confuse recovery and sharing concepts.
Key concepts:
- Secure Data Sharing for live access to governed data without copying files.
- Direct shares and listings.
- Private listings for selected consumers.
- Public listings/Marketplace for discoverable data products.
- Reader accounts for consumers without Snowflake accounts.
- Provider and consumer roles in sharing.
- Secure views and policies in shared data products.
- Zero-copy cloning for fast environment copies.
- Time Travel for querying/restoring historical data within retention.
- Fail-safe for Snowflake-managed emergency recovery after Time Travel.
- Replication and failover for business continuity across accounts/regions/clouds where supported.
Services and features
| Requirement |
Best answer |
Why |
| Share live governed data with another Snowflake account |
Secure Data Sharing |
No file copies; consumer queries shared data |
| Publish data product to selected partners |
Private listing |
Controlled distribution |
| Make data product discoverable broadly |
Public listing / Marketplace |
Discoverability and distribution |
| Consumer has no Snowflake account |
Reader account |
Provider enables access for non-Snowflake consumer |
| Fast test/dev copy of production table/schema/database |
Zero-copy clone |
Metadata-based copy, storage-efficient initially |
| Recover dropped table inside retention |
Time Travel |
User-accessible historical recovery |
| Recovery after Time Travel retention ends |
Fail-safe |
Snowflake-managed, not normal user restore |
| Business continuity across regions/accounts |
Replication/failover |
Copies account/database objects as designed |
Patterns
- Data product distribution: build secure views/policies → create share/listing → grant to consumers → consumers query live data.
- Non-Snowflake consumer pattern: create reader account and expose shared data through provider-managed account.
- Recovery pattern: use Time Travel first; Fail-safe is last-resort Snowflake recovery.
- Environment cloning pattern: clone prod to dev/test for fast experimentation without full copy at creation time.
- DR pattern: replication/failover, not Time Travel alone.
Traps
- Secure sharing is not file export: it provides live access, not copied files.
- Reader account is not needed for every share: it is mainly for consumers without Snowflake accounts.
- Time Travel versus Fail-safe: Time Travel is user-accessible; Fail-safe is Snowflake-managed emergency recovery.
- Clone versus replication: clone is same-account/object copy pattern; replication/failover handles continuity across accounts/regions.
- Private listing versus public listing: private is selected consumers; public is discoverable.