Refresh Your Pipeline
Refresh operations update your data by executing the transformations defined in your pipeline. A refresh can update your entire pipeline or specific Jobs that you've created to manage subsets of your data.
What Is Refresh?
Refresh runs the data transformations defined in your data warehouse metadata. This typically involves Data Manipulation Language (DML) SQL statements such as MERGE, INSERT, UPDATE, and TRUNCATE, which transform the actual data. Use refresh when you want to update your pipeline with any new changes from your data warehouse.
What Are Jobs?
A Job is a subset of Nodes, defined by a selector query, that runs during a refresh. To refresh only specific parts of your pipeline, create and use Jobs.
Jobs can only be run if they have been deployed first. Review our Deployment Overview to learn different ways to deploy your pipeline.
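As a conceptual sketch of how a Job narrows a refresh to a subset of Nodes, the example below filters a small pipeline with include and exclude conditions. The `Node` class, its `location` attribute, and the `select_job_nodes` helper are hypothetical names used only for illustration. This is not Coalesce's selector query syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    location: str  # hypothetical attribute a selector might filter on


def select_job_nodes(nodes, include, exclude=lambda node: False):
    """Return the nodes matched by `include` but not by `exclude`."""
    return [n for n in nodes if include(n) and not exclude(n)]


# A toy pipeline with four Nodes in two locations.
pipeline = [
    Node("CUSTOMER", "SRC"),
    Node("ORDERS", "SRC"),
    Node("DIM_CUSTOMER", "DWH"),
    Node("FCT_ORDERS", "DWH"),
]

# A "Job" that covers the DWH Nodes except FCT_ORDERS.
job_nodes = select_job_nodes(
    pipeline,
    include=lambda n: n.location == "DWH",
    exclude=lambda n: n.name == "FCT_ORDERS",
)
print([n.name for n in job_nodes])  # prints ['DIM_CUSTOMER']
```

Only the selected Nodes would run when such a Job is refreshed. The rest of the pipeline is left untouched.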
Refresh Methods
The Coalesce Scheduler lets you automate refresh operations directly in the application, making it easier to maintain regular data updates without external tools. You can refresh your pipeline using any of the following:
- The Coalesce Scheduler
- The Coalesce App. Only existing, deployed Jobs can be run from the Coalesce App.
- CLI
- Jobs API
- Third-Party Scheduling Tools
Types of Jobs
Coalesce offers three ways to refresh your data pipeline: pre-configured Jobs with specific IDs, ad-hoc Jobs for manual execution, and full pipeline refreshes.
Name | Job ID | Method | Description |
---|---|---|---|
Jobs | Yes | API, CLI, Coalesce Scheduler, Coalesce App | Any Jobs you created in the Coalesce app on the Build page. They have a Job ID and are started using the Coalesce Scheduler, API, Coalesce App, or CLI. |
Ad-Hoc | None | API or CLI | Jobs that run manually using the API or CLI. They use include and exclude syntax. They aren't created in the app and can be run in addition to existing Jobs. These are standard within Coalesce and can't be removed from the Deploy page. |
Refresh All Jobs | None | API or CLI | Refreshes all the Nodes in your pipeline. They don't use include or exclude syntax. They aren't created in the app and can be run in addition to existing Jobs. These are standard within Coalesce and can't be removed from the Deploy page. |
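The three rows above differ only in what the refresh request carries: a Job ID, include and exclude selectors, or neither. A minimal sketch of that distinction follows. The field names (`runDetails`, `environmentID`, `jobID`, `includeNodesSelector`, `excludeNodesSelector`) are assumptions, so check them against the Jobs API reference. The selector strings are placeholders, not verified selector syntax. Following the table, the sketch treats a Job ID and selectors as mutually exclusive, which is a simplification.

```python
import json


def build_refresh_request(environment_id, job_id=None, include=None, exclude=None):
    """Build an illustrative refresh payload and name the Job type it represents.

    All field names are assumed for illustration, not confirmed API fields.
    """
    if job_id is not None and (include or exclude):
        raise ValueError("Use either a Job ID or include/exclude selectors, not both")

    details = {"environmentID": environment_id}  # assumed field name
    if job_id is not None:
        kind = "job"  # a Job created in the app, identified by its ID
        details["jobID"] = job_id  # assumed field name
    elif include or exclude:
        kind = "ad-hoc"  # no Job ID, only include/exclude selectors
        if include:
            details["includeNodesSelector"] = include  # assumed field name
        if exclude:
            details["excludeNodesSelector"] = exclude  # assumed field name
    else:
        kind = "refresh-all"  # no Job ID and no selectors: every Node runs
    return kind, {"runDetails": details}


# One request per row of the table; IDs and selectors are placeholders.
for args in (
    {"job_id": "12"},
    {"include": "{ location: DWH }", "exclude": "{ name: FCT_ORDERS }"},
    {},
):
    kind, payload = build_refresh_request("3", **args)
    print(kind, json.dumps(payload))
```

The builder makes no network call, so you can inspect the payload before sending it with whichever HTTP client or CLI wrapper you use.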
Steps to Refresh or Run Jobs
Deploy and refresh jobs are triggered for individual environments. Once an environment has been deployed, it can be refreshed using the Scheduler, API, or CLI.
- Create your Jobs
- Configure your Environment
- Configure your Git Integration
- Set your Parameters (optional). You can set them at the environment level or during the deploy process.
- Refresh your pipeline.
You can only refresh if you've deployed your pipeline.
📄️ Creating and Running Jobs
Comprehensive guide on executing data pipeline jobs in Coalesce using the API, CLI, or web interface. Learn how to monitor active jobs, understand job states, and manage running jobs effectively.
📄️ Refreshing Your Pipeline Using the Coalesce Scheduler
The Coalesce Scheduler lets you schedule data pipeline refreshes directly in the Coalesce app. You no longer need a third-party scheduling solution, which reduces your IT dependencies and helps you generate data insights faster.
📄️ Managing Jobs
Learn how to edit, monitor, and rerun data pipeline jobs in Coalesce. Master job scheduling, status tracking, and failure recovery to maintain efficient data transformations.