Data Transformation Overview
dbt (Data Build Tool) is the backbone of SyncSage's data transformation layer. It enables you to:
- Transform raw data into analysis-ready datasets using SQL
- Automate workflows with version-controlled, modular code
- Collaborate across teams with built-in testing and documentation
With dbt, raw data from Snowflake is transformed into business-ready models for analytics, reporting, and machine learning.
2.1 Workflows
Workflows define end-to-end data transformation pipelines.

Example: Daily Customer Metrics
- Input: Raw customer data from Snowflake
- Steps:
  - Clean data
  - Calculate lifetime value (LTV)
  - Aggregate daily metrics
- Output: customer_metrics table for dashboards (sketched below)
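As a rough sketch of the model behind this workflow, assuming a staging model stg_customers that carries one row per order with customer_id, order_total, and ordered_at columns (all names illustrative):

```sql
-- customer_metrics.sql: a sketch of the daily metrics model.
-- Table and column names are assumptions for illustration.
with cleaned as (
    select
        customer_id,
        order_total,
        cast(ordered_at as date) as order_date
    from {{ ref('stg_customers') }}
    where customer_id is not null  -- Clean data
),

ltv as (
    select
        customer_id,
        sum(order_total) as lifetime_value  -- Calculate LTV
    from cleaned
    group by customer_id
)

-- Aggregate daily metrics
select
    c.order_date,
    count(distinct c.customer_id) as daily_customers,
    sum(c.order_total) as daily_revenue,
    avg(l.lifetime_value) as avg_lifetime_value
from cleaned c
join ltv l on c.customer_id = l.customer_id
group by c.order_date
```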
How to Create a Workflow
- Go to dbt Workflows > New Workflow
- Define dependencies (e.g., stg_customers → customer_360)
- Schedule execution (e.g., daily at 2:00 AM)
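Note that dbt infers dependencies such as stg_customers → customer_360 from ref() calls in the model SQL itself, so the dependency graph follows from how your models are written. A minimal sketch (the joined model stg_subscriptions and the columns are hypothetical):

```sql
-- customer_360.sql: depends on stg_customers via ref(), so dbt
-- builds stg_customers before this model runs.
select
    c.customer_id,
    c.email,
    s.plan_name
from {{ ref('stg_customers') }} c
left join {{ ref('stg_subscriptions') }} s
    on c.customer_id = s.customer_id
```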
2.2 Transformation Jobs
Jobs execute specific transformation tasks.

Job Name | Description | Schedule |
---|---|---|
Daily Customer Aggregation | Aggregates customer data daily | 2:00 AM UTC |
Product Inventory Update | Updates inventory hourly | Every 60 mins |
2.3 Templates
Templates standardize common transformations.

Template Name | Use Case |
---|---|
Customer Data Cleansing | Deduplicates and standardizes data |
Sales Data Aggregation | Aggregates sales by region/product |
How to Use a Template
- Go to Transformation Templates > Use Template
- Customize SQL logic for your data
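As a rough sketch of what that customization might look like for the Customer Data Cleansing template, deduplication usually reduces to a window function. The source name raw.customers and the columns customer_id, email, and updated_at are assumptions for illustration:

```sql
-- A minimal deduplication pattern: keep the most recent row per customer.
with ranked as (
    select
        *,
        row_number() over (
            partition by customer_id
            order by updated_at desc
        ) as row_num
    from {{ source('raw', 'customers') }}
)

select
    customer_id,
    lower(trim(email)) as email,  -- standardize casing and whitespace
    updated_at
from ranked
where row_num = 1  -- drop older duplicate rows
```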
For more examples and guidelines, visit our examples page.
SyncSage supports two environments for dbt: Cloud IDE (for beginners) and Cloud CLI (for advanced users).
3.1 dbt Cloud IDE
A browser-based interface for building and testing transformations without local setup.

Key Features
- Visual Editor: Write SQL/Python in a code-friendly interface
- Syntax Highlighting: Distinguish SQL elements for readability
- AI Copilot: Generate code, tests, and documentation via natural language prompts
- Auto-Completion: Suggest table/column names and ref()/source() macros
- Code Formatting & Linting: Standardize SQL with SQLFluff and Prettier
- Version Control: Commit changes, resolve conflicts, and prune branches directly in the IDE
- Real-Time Documentation: Preview project docs during development
3.2 dbt Cloud CLI
A command-line interface for advanced users to run dbt locally or in CI/CD pipelines.
Key Features
- Local Execution: Develop and test transformations offline
- CI/CD Integration: Automate deployments with GitHub Actions
- Advanced Config: Customize profiles and environments
For detailed installation instructions, visit the dbt Cloud CLI documentation.
Models are the core building blocks of your dbt project. Each model is a single SQL file that transforms your data and creates a new table or view in your data warehouse.
Component | Description |
---|---|
Models | Each model lives in a single file and contains logic that either transforms raw data into a dataset ready for analytics or serves as an intermediate step in such a transformation. |
Snapshots | A way to capture the state of your mutable tables so you can refer to it later. |
Seeds | CSV files with static data that you can load into your data platform with dbt. |
Data Tests | SQL queries that you can write to test the models and resources in your project. |
Macros | Blocks of code that you can reuse multiple times. |
Docs | Documentation for your project that you can build. |
Sources | A way to name and describe the data loaded into your warehouse by your Extract and Load tools. |
Exposures | A way to define and describe a downstream use of your project. |
Metrics | A way for you to define metrics for your project. |
Groups | Groups enable collaborative node organization in restricted collections. |
Analysis | A way to organize analytical SQL queries in your project, such as the general ledger from QuickBooks. |
Semantic Models | Define foundational data relationships in MetricFlow and the dbt Semantic Layer, enabling you to query metrics using a semantic graph. |
Saved Queries | Organize reusable queries by grouping metrics, dimensions, and filters into nodes visible in the dbt DAG. |
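To make the Models and Sources rows above concrete, here is a minimal model sketch. A model is just a select statement saved as a .sql file; the source and column names below are invented for illustration:

```sql
-- models/staging/stg_orders.sql: dbt materializes the result of this
-- select as a table or view in your warehouse.
select
    id as order_id,
    customer_id,
    status,
    ordered_at
from {{ source('raw', 'orders') }}  -- source() names EL-loaded data
```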
Creating a Model in SyncSage
SyncSage provides a visual editor for creating dbt models, making it easy to build complex transformations without writing SQL from scratch.
Step-by-Step Guide
1. Start a New Model: Click "Create a new model" in the dbt interface.
2. Add Operators: Drag operators from the toolbar and drop them onto the canvas.
   - Click an operator to configure it.
   - Connect operators by dragging between their connection points.
3. Configure Operators: Each operator has specific configuration options:
Input
- Input model: Select the model and columns you want to use.
Transform
- Join: Define the join conditions and choose columns from both tables.
- Select: Pick the columns you need from the model.
- Aggregate: Specify the aggregation functions and the columns they apply to.
- Formula: Add the formula to create a new column. Use the built-in AI code generator to help generate SQL code by clicking on the question mark (?) icon.
- Filter: Set the conditions to filter data.
- Order: Select the columns to sort by and the sort order.
- Limit: Set the maximum number of rows you want to return.
Output model
- Output model: The final transformed dataset generated by a dbt model.
- Currently, you can only have one output model in the visual editor.
4. Preview and Validate: Use the Output tab to preview the data from each configured node.
5. Review SQL Code: The Code tab displays the SQL code generated by your visual configuration, along the lines of the sketch below.
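For instance, a canvas wired as Input → Filter → Aggregate might generate SQL along these lines (model and column names are illustrative, not the editor's exact output):

```sql
-- Illustrative Code-tab output for a three-operator canvas:
-- Input (stg_orders) -> Filter (completed only) -> Aggregate (by customer).
with input_model as (
    select order_id, customer_id, status, order_total
    from {{ ref('stg_orders') }}
),

filtered as (
    select * from input_model
    where status = 'completed'  -- Filter operator
),

aggregated as (
    select
        customer_id,
        count(order_id) as completed_orders,  -- Aggregate operator
        sum(order_total) as total_spent
    from filtered
    group by customer_id
)

select * from aggregated  -- Output model
```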
Model Best Practices
Naming Conventions
- Use snake_case for model names (e.g., customer_orders)
- Use a consistent prefix for staging models (e.g., stg_)
- Use a consistent prefix for intermediate models (e.g., int_)
- Use a consistent prefix for dimension/fact models (e.g., dim_, fct_)
Model Organization
- Staging models: Clean and standardize source data
- Intermediate models: Transform and join staging models
- Dimension models: Represent business entities (e.g., customers, products)
- Fact models: Represent business processes (e.g., orders, transactions)
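A hedged sketch of how these layers and prefixes fit together, using hypothetical model names:

```sql
-- models/marts/fct_orders.sql: a fact model built from upstream layers.
-- int_orders_enriched and dim_customers are illustrative names that
-- follow the prefix conventions above.
select
    o.order_id,
    o.customer_id,
    c.customer_segment,
    o.order_total
from {{ ref('int_orders_enriched') }} o
left join {{ ref('dim_customers') }} c
    on o.customer_id = c.customer_id
```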
Testing
- Add tests to ensure data quality (e.g., uniqueness, not_null)
- Test relationships between models
- Create custom tests for complex business logic
Documentation
- Document models and columns
- Include business definitions
- Explain complex transformations
For more detailed information, visit the Model Creation page.
Data tests are assertions you make about your models and other resources in your dbt project (e.g. sources, seeds and snapshots). When you run dbt test, dbt will tell you if each test in your project passes or fails.
You can use data tests to improve the integrity of the SQL in each model by making assertions about the results generated. Out of the box, you can test whether a specified column in a model contains only non-null values, only unique values, only values that have a corresponding value in another model (for example, a customer_id for an order corresponds to an id in the customers model), or only values from a specified list. You can extend data tests to suit business logic specific to your organization: any assertion that you can make about your model in the form of a select query can be turned into a data test.
Data tests return a set of failing records. Generic data tests (formerly known as schema tests) are defined using test blocks.
Like almost everything in dbt, data tests are SQL queries. In particular, they are select statements that seek to grab "failing" records, ones that disprove your assertion. If you assert that a column is unique in a model, the test query selects for duplicates; if you assert that a column is never null, the test looks for nulls. If the data test returns zero failing rows, it passes, and your assertion has been validated.
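For instance, a uniqueness assertion on an order_id column compiles to a query roughly like the following; any row returned is a duplicated key, so zero rows means the test passes:

```sql
-- Roughly what a unique test on orders.order_id boils down to:
-- each returned row is a duplicated key, i.e. a failure.
select
    order_id,
    count(*) as occurrences
from {{ ref('orders') }}
group by order_id
having count(*) > 1
```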
Types of Data Tests
There are two ways of defining data tests in dbt:
- Singular data test: If you can write a SQL query that returns failing rows, you can save that query in a .sql file within your test directory. It's now a data test, and it will be executed by the dbt test command (see the sketch after this list).
- Generic data test: A parameterized query that accepts arguments. The test query is defined in a special test block (like a macro). Once defined, you can reference the generic test by name throughout your .yml files, defining it on models, columns, sources, snapshots, and seeds. dbt ships with four generic data tests built in.
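A minimal singular test sketch, assuming an orders model with an order_total column that should never be negative (names are illustrative):

```sql
-- tests/assert_no_negative_order_totals.sql: a singular data test.
-- Any rows returned are failures; zero rows means the test passes.
select
    order_id,
    order_total
from {{ ref('orders') }}
where order_total < 0
```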
Built-in Generic Tests
- unique: tests if column values are unique
- not_null: tests if column values are not null
- accepted_values: tests if column values are in a given set
- relationships: tests referential integrity between tables
Example
Here's an example of using generic tests in a YAML file:
```yaml
version: 2

models:
  - name: orders
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'returned']
      - name: customer_id
        tests:
          - relationships:
              to: ref('customers')
              field: id
```
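Beyond the four built-ins, you can define your own generic test in a test block, as noted above. A minimal sketch with a hypothetical is_positive test:

```sql
-- macros/test_is_positive.sql: a custom generic data test.
-- Once defined, reference it in YAML like the built-ins: `- is_positive`.
{% test is_positive(model, column_name) %}

select {{ column_name }}
from {{ model }}
where {{ column_name }} <= 0  -- failing rows: non-positive values

{% endtest %}
```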
For more detailed information, visit the Testing & Documentation page.