Dagster & Azure Data Lake Storage Gen 2
Dagster helps you use Azure Storage Accounts as part of your data pipeline. Azure Data Lake Storage Gen 2 (ADLS2) is our primary focus, but we also provide utilities for Azure Blob Storage.
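For example, here's a minimal sketch of reading from ADLS2 inside an asset using the ADLS2Resource from dagster-azure; the storage account, filesystem, and path are hypothetical:

```python
from dagster import Definitions, EnvVar, asset
from dagster_azure.adls2 import ADLS2Resource, ADLS2SASToken

@asset
def adls2_paths(adls2: ADLS2Resource) -> list[str]:
    # adls2_client wraps azure-storage-file-datalake's DataLakeServiceClient.
    fs_client = adls2.adls2_client.get_file_system_client("my-filesystem")  # hypothetical filesystem
    return [p.name for p in fs_client.get_paths(path="raw")]  # hypothetical path

defs = Definitions(
    assets=[adls2_paths],
    resources={
        "adls2": ADLS2Resource(
            storage_account="my_storage_account",  # hypothetical account
            credential=ADLS2SASToken(token=EnvVar("AZURE_SAS_TOKEN")),
        ),
    },
)
```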
Dagster-supported integrations.
Using this integration, you can trigger Airbyte syncs and orchestrate your Airbyte connections from within Dagster, making it easy to chain an Airbyte sync with upstream or downstream steps in your workflow and to schedule syncs alongside those dependencies.
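For example, here's a minimal sketch that models the tables produced by one Airbyte connection as Dagster assets; the host, credentials, connection ID, and table names are hypothetical:

```python
from dagster import Definitions, EnvVar
from dagster_airbyte import AirbyteResource, build_airbyte_assets

# Point Dagster at a self-hosted Airbyte instance (hypothetical host/credentials).
airbyte = AirbyteResource(
    host="localhost",
    port="8000",
    username="airbyte",
    password=EnvVar("AIRBYTE_PASSWORD"),
)

# Represent the connection's destination tables as assets Dagster can sync.
airbyte_assets = build_airbyte_assets(
    connection_id="87b7fe85-a22c-420e-8d74-b30e7ede77df",  # hypothetical connection ID
    destination_tables=["users", "orders"],
)

defs = Definitions(assets=airbyte_assets, resources={"airbyte": airbyte})
```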
Airlift is a toolkit for integrating Dagster and Airflow.
This integration allows you to connect to AWS Athena, a serverless interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Using this integration, you can issue queries to Athena, fetch results, and handle query execution states within your Dagster pipelines.
This integration allows you to send Dagster logs to AWS CloudWatch, enabling centralized logging and monitoring of your Dagster jobs. By using AWS CloudWatch, you can take advantage of its powerful log management features, such as real-time log monitoring, log retention policies, and alerting capabilities.
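A minimal sketch of attaching the cloudwatch_logger from dagster-aws to a job follows; the log group, stream, and region in the run-config comment are hypothetical, and the exact config field names are assumed from dagster-aws:

```python
from dagster import job, op
from dagster_aws.cloudwatch import cloudwatch_logger

@op
def say_hello(context):
    # This log line is forwarded to CloudWatch when the logger is configured.
    context.log.info("Hello, CloudWatch!")

@job(logger_defs={"cloudwatch": cloudwatch_logger})
def hello_job():
    say_hello()

# Run config supplying the logger's settings (field names assumed):
# loggers:
#   cloudwatch:
#     config:
#       log_group_name: my-log-group      # hypothetical
#       log_stream_name: my-log-stream    # hypothetical
#       aws_region: us-east-1
```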
This integration allows you to connect to AWS Elastic Container Registry (ECR). It provides resources to interact with AWS ECR, enabling you to manage your container images.
The AWS integration provides ways of orchestrating data pipelines that leverage AWS services, including AWS EMR (Elastic MapReduce). This integration allows you to run and scale big data workloads using open source tools such as Apache Spark, Hive, Presto, and more.
The AWS integration library provides the PipesGlueClient resource, enabling you to launch AWS Glue jobs directly from Dagster assets and ops. This integration allows you to pass parameters to Glue code while Dagster receives real-time events, such as logs, asset checks, and asset materializations, from the initiated jobs. With minimal code changes required on the job side, this integration is both efficient and easy to implement.
Using this integration, you can leverage AWS Lambda to execute external code as part of your Dagster pipelines. This is particularly useful for running serverless functions that can scale automatically and handle various workloads without the need for managing infrastructure. The PipesLambdaClient class allows you to invoke AWS Lambda functions and stream logs and structured metadata back to Dagster's UI and tools.
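For example, a minimal sketch of invoking a Lambda function through Pipes; the function name and event payload are hypothetical, and the function itself is assumed to use dagster-pipes to report back:

```python
import boto3
from dagster import AssetExecutionContext, Definitions, asset
from dagster_aws.pipes import PipesLambdaClient

@asset
def lambda_pipes_asset(context: AssetExecutionContext, lambda_pipes_client: PipesLambdaClient):
    # Invokes the function and streams logs and metadata back to Dagster.
    return lambda_pipes_client.run(
        context=context,
        function_name="dagster_pipes_function",  # hypothetical function name
        event={"some_parameter_value": 1},
    ).get_materialize_result()

defs = Definitions(
    assets=[lambda_pipes_asset],
    resources={"lambda_pipes_client": PipesLambdaClient(client=boto3.client("lambda"))},
)
```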
Using this integration, you can connect to an AWS Redshift cluster and issue queries against it directly from your Dagster assets. This allows you to seamlessly integrate Redshift into your data pipelines, leveraging the power of Redshift's data warehousing capabilities within your Dagster workflows.
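A minimal sketch of issuing a query from an asset, assuming the RedshiftClientResource from dagster-aws; the cluster endpoint, credentials, and table are hypothetical:

```python
from dagster import Definitions, EnvVar, asset
from dagster_aws.redshift import RedshiftClientResource

@asset
def my_table_count(redshift: RedshiftClientResource) -> int:
    # execute_query with fetch_results=True returns the result rows.
    rows = redshift.get_client().execute_query(
        "SELECT COUNT(*) FROM my_table",  # hypothetical table
        fetch_results=True,
    )
    return rows[0][0]

defs = Definitions(
    assets=[my_table_count],
    resources={
        "redshift": RedshiftClientResource(
            host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",  # hypothetical endpoint
            user="dagster",
            password=EnvVar("REDSHIFT_PASSWORD"),
            database="dev",
        ),
    },
)
```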
The AWS S3 integration allows data engineers to easily read and write objects to durable AWS S3 storage, giving them a resilient storage layer when constructing their pipelines.
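For example, a minimal sketch that lists objects from an asset using S3Resource; the bucket, prefix, and region are hypothetical:

```python
from dagster import Definitions, asset
from dagster_aws.s3 import S3Resource

@asset
def s3_keys(s3: S3Resource) -> list[str]:
    # get_client() returns a standard boto3 S3 client.
    response = s3.get_client().list_objects_v2(Bucket="my-bucket", Prefix="raw/")  # hypothetical bucket
    return [obj["Key"] for obj in response.get("Contents", [])]

defs = Definitions(
    assets=[s3_keys],
    resources={"s3": S3Resource(region_name="us-east-1")},
)
```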
This integration allows you to manage, retrieve, and rotate credentials, API keys, and other secrets using AWS Secrets Manager.
The Dagster AWS Systems Manager (SSM) Parameter Store integration allows you to manage and retrieve parameters stored in AWS SSM Parameter Store directly within your Dagster pipelines. This integration provides resources to fetch parameters by name, tags, or paths, and optionally set them as environment variables for your operations.
The Databricks integration library provides the `PipesDatabricksClient` resource, enabling you to launch Databricks jobs directly from Dagster assets and ops. This integration allows you to pass parameters to Databricks code while Dagster receives real-time events, such as logs, asset checks, and asset materializations, from the initiated jobs. With minimal code changes required on the job side, this integration is both efficient and easy to implement.
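For example, a minimal sketch of launching a Databricks job through Pipes; the cluster spec and script location are hypothetical, and the script is assumed to use dagster-pipes to report back:

```python
import os

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs
from dagster import AssetExecutionContext, Definitions, asset
from dagster_databricks import PipesDatabricksClient

@asset
def databricks_asset(context: AssetExecutionContext, pipes_databricks: PipesDatabricksClient):
    task = jobs.SubmitTask.from_dict(
        {
            # Hypothetical single-node cluster and script location.
            "new_cluster": {"spark_version": "12.2.x-scala2.12", "node_type_id": "i3.xlarge", "num_workers": 0},
            "task_key": "dagster-launched",
            "spark_python_task": {"python_file": "dbfs:/my_python_script.py"},
        }
    )
    return pipes_databricks.run(task=task, context=context).get_materialize_result()

defs = Definitions(
    assets=[databricks_asset],
    resources={
        "pipes_databricks": PipesDatabricksClient(
            client=WorkspaceClient(
                host=os.environ["DATABRICKS_HOST"],
                token=os.environ["DATABRICKS_TOKEN"],
            )
        )
    },
)
```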
While Dagster provides comprehensive monitoring and observability of the pipelines it orchestrates, many teams look to centralize all their monitoring across apps, processes and infrastructure using Datadog's 'Cloud Monitoring as a Service'. The Datadog integration allows you to publish metrics to Datadog from within Dagster ops.
Dagster orchestrates dbt alongside other technologies, so you can schedule dbt with Spark, Python, and more in a single data pipeline, putting your dbt transformations to work directly from within Dagster.
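For example, a minimal sketch that loads a dbt project's models as Dagster assets; the project path is hypothetical, and the manifest is assumed to be pre-compiled:

```python
from pathlib import Path

from dagster import AssetExecutionContext, Definitions
from dagster_dbt import DbtCliResource, dbt_assets

# Hypothetical location of the dbt project's compiled manifest.
MANIFEST_PATH = Path("my_dbt_project/target/manifest.json")

@dbt_assets(manifest=MANIFEST_PATH)
def my_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    # Runs `dbt build` and streams each model's results back as materializations.
    yield from dbt.cli(["build"], context=context).stream()

defs = Definitions(
    assets=[my_dbt_assets],
    resources={"dbt": DbtCliResource(project_dir="my_dbt_project")},
)
```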
Dagster allows you to run dbt Cloud jobs alongside other technologies. You can schedule them to run as a step in a larger pipeline and manage them as a data asset.
Delta Lake is a great storage format for Dagster workflows. With this integration, you can use the Delta Lake I/O Manager to read and write your Dagster assets.
The dltHub open-source library defines a standardized approach for creating data pipelines that load often messy data sources into well-structured data sets.
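As a rough sketch, here's how a dlt source and pipeline can be surfaced as Dagster assets via the dagster-embedded-elt package; the source, pipeline, and destination shown are hypothetical:

```python
import dlt
from dagster import AssetExecutionContext, Definitions
from dagster_embedded_elt.dlt import DagsterDltResource, dlt_assets

# Hypothetical dlt source yielding a small in-memory dataset.
@dlt.source
def example_source():
    @dlt.resource
    def example_data():
        yield [{"id": 1}, {"id": 2}]

    return example_data

@dlt_assets(
    dlt_source=example_source(),
    dlt_pipeline=dlt.pipeline(
        pipeline_name="example",
        destination="duckdb",
        dataset_name="example_data",
    ),
)
def example_dlt_assets(context: AssetExecutionContext, dlt: DagsterDltResource):
    yield from dlt.run(context=context)

defs = Definitions(assets=[example_dlt_assets], resources={"dlt": DagsterDltResource()})
```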
The Docker integration library provides the PipesDockerClient resource, enabling you to launch Docker containers and execute external code directly from Dagster assets and ops. This integration allows you to pass parameters to Docker containers while Dagster receives real-time events, such as logs, asset checks, and asset materializations, from the initiated jobs. With minimal code changes required on the job side, this integration is both efficient and easy to implement.
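A rough sketch of launching a container through Pipes follows; the image and command are hypothetical, and the container is assumed to have dagster-pipes installed so it can report back:

```python
from dagster import AssetExecutionContext, Definitions, asset
from dagster_docker import PipesDockerClient

@asset
def docker_pipes_asset(context: AssetExecutionContext, docker_pipes_client: PipesDockerClient):
    # Runs the container and streams logs and materializations back to Dagster.
    return docker_pipes_client.run(
        context=context,
        image="my-pipes-image:latest",  # hypothetical image
        command=["python", "/app/script.py"],  # hypothetical entry point
    ).get_materialize_result()

defs = Definitions(
    assets=[docker_pipes_asset],
    resources={"docker_pipes_client": PipesDockerClient()},
)
```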
This library provides an integration with the DuckDB database and offers an out-of-the-box I/O Manager so that you can make DuckDB your storage of choice.
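For example, a minimal sketch using the pandas-flavored DuckDB I/O manager; the database file and schema names are hypothetical:

```python
import pandas as pd
from dagster import Definitions, asset
from dagster_duckdb_pandas import DuckDBPandasIOManager

@asset
def example_table() -> pd.DataFrame:
    # The I/O manager persists this DataFrame as a DuckDB table named after the asset.
    return pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})

defs = Definitions(
    assets=[example_table],
    resources={"io_manager": DuckDBPandasIOManager(database="example.duckdb", schema="main")},
)
```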
The Embedded ELT package provides a framework for building ELT pipelines with Dagster through helpful asset decorators and resources. It includes the dagster-dlt and dagster-sling packages, which you can also use on their own.
Integrate with GCP BigQuery, using the provided I/O manager to store and load Dagster assets as BigQuery tables.
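For example, a minimal sketch with the pandas-flavored BigQuery I/O manager; the project and dataset names are hypothetical:

```python
import pandas as pd
from dagster import Definitions, asset
from dagster_gcp_pandas import BigQueryPandasIOManager

@asset
def events() -> pd.DataFrame:
    # Stored as the `events` table in the configured BigQuery dataset.
    return pd.DataFrame({"event_id": [1, 2], "kind": ["click", "view"]})

defs = Definitions(
    assets=[events],
    resources={
        "io_manager": BigQueryPandasIOManager(
            project="my-gcp-project",  # hypothetical project
            dataset="analytics",  # hypothetical dataset
        ),
    },
)
```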
Using this integration, you can manage and interact with Google Cloud Platform's Dataproc service directly from Dagster. This integration allows you to create, manage, and delete Dataproc clusters, and submit and monitor jobs on these clusters.
This integration allows you to interact with Google Cloud Storage (GCS) using Dagster. It provides resources, I/O Managers, and utilities to manage and store data in GCS, making it easier to integrate GCS into your data pipelines.
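For example, a minimal sketch that stores asset outputs as pickles in GCS; the project, bucket, and prefix are hypothetical:

```python
from dagster import Definitions, asset
from dagster_gcp.gcs import GCSPickleIOManager, GCSResource

@asset
def numbers() -> list[int]:
    # Pickled and written under the configured bucket/prefix by the I/O manager.
    return [1, 2, 3]

defs = Definitions(
    assets=[numbers],
    resources={
        "io_manager": GCSPickleIOManager(
            gcs=GCSResource(project="my-gcp-project"),  # hypothetical project
            gcs_bucket="my-bucket",  # hypothetical bucket
            gcs_prefix="dagster-storage",
        ),
    },
)
```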
This library provides an integration with GitHub Apps by providing a thin wrapper on the GitHub v4 GraphQL API. This allows you to automate operations within your GitHub repositories, with the tighter permission scopes that GitHub Apps allow compared to personal access tokens.
Dagstermill eliminates the tedious "productionization" of Jupyter notebooks.
The Kubernetes integration library provides the PipesK8sClient resource, enabling you to launch Kubernetes pods and execute external code directly from Dagster assets and ops. This integration allows you to pass parameters to Kubernetes pods while Dagster receives real-time events, such as logs, asset checks, and asset materializations, from the initiated jobs. With minimal code changes required on the job side, this integration is both efficient and easy to implement.
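For example, a minimal sketch of launching a pod through Pipes; the image and command are hypothetical, and the image is assumed to have dagster-pipes installed:

```python
from dagster import AssetExecutionContext, Definitions, asset
from dagster_k8s import PipesK8sClient

@asset
def k8s_pipes_asset(context: AssetExecutionContext, k8s_pipes_client: PipesK8sClient):
    # Launches a pod running the image and streams Pipes events back to Dagster.
    return k8s_pipes_client.run(
        context=context,
        image="my-registry/my-pipes-image:latest",  # hypothetical image
        command=["python", "/app/script.py"],  # hypothetical entry point
    ).get_materialize_result()

defs = Definitions(
    assets=[k8s_pipes_asset],
    resources={"k8s_pipes_client": PipesK8sClient()},
)
```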
The Looker integration allows you to monitor your Looker project as assets in Dagster, along with other data assets.
The OpenAI library allows you to easily interact with the OpenAI REST API using the OpenAI Python API to build AI steps into your Dagster pipelines. You can also log OpenAI API usage metadata in Dagster Insights, giving you detailed observability on API call credit consumption.
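For example, a minimal sketch of calling the chat completions API from an asset; the model choice and prompt are hypothetical:

```python
from dagster import AssetExecutionContext, Definitions, EnvVar, asset
from dagster_openai import OpenAIResource

@asset
def haiku(context: AssetExecutionContext, openai: OpenAIResource) -> str:
    # get_client(context) also records usage metadata for Dagster Insights.
    with openai.get_client(context) as client:
        completion = client.chat.completions.create(
            model="gpt-4o-mini",  # hypothetical model choice
            messages=[{"role": "user", "content": "Write a haiku about data pipelines."}],
        )
    return completion.choices[0].message.content

defs = Definitions(
    assets=[haiku],
    resources={"openai": OpenAIResource(api_key=EnvVar("OPENAI_API_KEY"))},
)
```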
This library provides an integration between Dagster and PagerDuty to support creating alerts from your Dagster code.
Implement validation on pandas DataFrames.
The Pandera integration library provides an API for generating Dagster Types from Pandera dataframe schemas. Like all Dagster types, Pandera-generated types can be used to annotate op inputs and outputs.
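For example, a minimal sketch that validates an op output against a Pandera schema; the schema and data are hypothetical:

```python
import pandas as pd
import pandera as pa
from dagster import Out, op
from dagster_pandera import pandera_schema_to_dagster_type

# Hypothetical schema requiring a string ticker and a non-negative price.
prices_schema = pa.DataFrameSchema(
    {
        "ticker": pa.Column(str),
        "price": pa.Column(float, pa.Check.ge(0)),
    }
)

PricesDagsterType = pandera_schema_to_dagster_type(prices_schema)

@op(out=Out(dagster_type=PricesDagsterType))
def load_prices() -> pd.DataFrame:
    # The output is validated against prices_schema at runtime.
    return pd.DataFrame({"ticker": ["AAPL"], "price": [190.0]})
```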
Your Power BI assets, such as semantic models, data sources, reports, and dashboards, can be represented in the Dagster asset graph, allowing you to track lineage and dependencies between Power BI assets and upstream data assets you are already modeling in Dagster. You can also use Dagster to orchestrate Power BI semantic models, allowing you to trigger refreshes of these models on a cadence or based on upstream data changes.
This integration allows you to push metrics to the Prometheus gateway from within a Dagster pipeline.
Your Sigma assets, including datasets and workbooks, can be represented in the Dagster asset graph, allowing you to track lineage and dependencies between Sigma assets and upstream data assets you are already modeling in Dagster.
This library provides an integration with Slack to support posting messages in your company's Slack workspace.
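For example, a minimal sketch of posting a message from an op; the channel name is hypothetical:

```python
from dagster import Definitions, EnvVar, job, op
from dagster_slack import SlackResource

@op
def notify(slack: SlackResource):
    # get_client() returns a slack_sdk WebClient.
    slack.get_client().chat_postMessage(channel="#data-alerts", text="Pipeline finished!")  # hypothetical channel

@job
def notify_job():
    notify()

defs = Definitions(
    jobs=[notify_job],
    resources={"slack": SlackResource(token=EnvVar("SLACK_BOT_TOKEN"))},
)
```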
Sling provides an easy-to-use YAML configuration layer for loading data from files, replicating data between databases, exporting custom SQL queries to cloud storage, and much more.
This library provides an integration with the Snowflake data warehouse. Connect to Snowflake as a resource and execute queries from your ops, or read and write natively to Snowflake from Dagster assets.
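For example, a minimal sketch of querying Snowflake from an asset; the table, account, and credentials are hypothetical:

```python
from dagster import Definitions, EnvVar, asset
from dagster_snowflake import SnowflakeResource

@asset
def row_count(snowflake: SnowflakeResource) -> int:
    # get_connection() yields a snowflake-connector-python connection.
    with snowflake.get_connection() as conn:
        (count,) = conn.cursor().execute("SELECT COUNT(*) FROM my_table").fetchone()  # hypothetical table
    return count

defs = Definitions(
    assets=[row_count],
    resources={
        "snowflake": SnowflakeResource(
            account=EnvVar("SNOWFLAKE_ACCOUNT"),
            user=EnvVar("SNOWFLAKE_USER"),
            password=EnvVar("SNOWFLAKE_PASSWORD"),
            database="MY_DB",  # hypothetical database
            warehouse="MY_WH",  # hypothetical warehouse
        ),
    },
)
```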
Running Spark code often requires submitting code to a Databricks or EMR cluster. The Pyspark integration provides a Spark class with methods for configuration and constructing the spark-submit command for a Spark job.
This integration provides a resource for SSH remote execution using Paramiko. It allows you to establish secure connections to networked resources and execute commands remotely. The integration also provides an SFTP client for secure file transfers between the local and remote systems.
Your Tableau assets, such as data sources, sheets, and dashboards, can be represented in the Dagster asset graph, allowing you to track lineage and dependencies between Tableau assets and upstream data assets you are already modeling in Dagster.
Use your Twilio Account SID and Auth Token to build Twilio tasks right into your Dagster pipeline.
The dagster-pipes-typescript npm package is a Dagster Pipes implementation for the TypeScript programming language that allows integration between any TypeScript process and the Dagster orchestrator.
Orchestrate Airbyte Cloud connections and schedule syncs alongside upstream or downstream dependencies.
Orchestrate Fivetran connector syncs with upstream or downstream dependencies.