Converting an existing project to use dg
dg and Dagster Components are under active development. You may encounter feature gaps, and the APIs may change. To report issues or give feedback, please join the #dg-components channel in the Dagster Community Slack.
Suppose we have an existing Dagster project. Our project defines a Python package with a single Dagster asset. The asset is exposed in a top-level Definitions object in my_existing_project/definitions.py. We'll consider both a case where we have been using uv with pyproject.toml and a case where we have been using pip with setup.py.
With uv:
tree
.
├── my_existing_project
│   ├── __init__.py
│   ├── assets.py
│   ├── definitions.py
│   └── py.typed
├── pyproject.toml
└── uv.lock
2 directories, 6 files
With pip:
tree
.
├── my_existing_project
│   ├── __init__.py
│   ├── assets.py
│   ├── definitions.py
│   └── py.typed
└── setup.py
2 directories, 5 files
dg needs to be able to resolve a Python environment for your project. This environment must include an installation of your project package. By default, a project's environment will resolve to whatever virtual environment is currently activated in the shell, or system Python if no virtual environment is activated.
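As a rough way to see which environment a given interpreter belongs to, the sketch below reports whether Python is running from a virtual environment and whether a package can be imported from it. It uses only the standard library. The function name describe_environment is hypothetical, and this is not how dg itself resolves environments.

```python
import importlib.util
import sys


def describe_environment(package_name: str) -> dict:
    """Report basic facts about the interpreter running this code.

    `package_name` should be a top-level package name, for example
    "my_existing_project".
    """
    # Inside a virtual environment, sys.prefix points at the venv while
    # sys.base_prefix still points at the base Python installation.
    in_venv = sys.prefix != sys.base_prefix
    # find_spec returns None when a top-level package cannot be located.
    spec = importlib.util.find_spec(package_name)
    return {
        "python": sys.executable,
        "virtualenv_active": in_venv,
        "package_importable": spec is not None,
    }
```

Calling describe_environment("my_existing_project") with the interpreter from your virtual environment should report package_importable as True once the project is installed. A package that merely sits in the current working directory can also be importable without being installed, so treat this as a quick sanity check rather than proof of installation.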
Before proceeding, we'll make sure we have an activated and up-to-date virtual environment in the project root. Having the virtual environment located in the project root is recommended (particularly when using uv) but not required.
With uv:
If you don't have a virtual environment yet, run:
uv sync
Then activate it:
source .venv/bin/activate
With pip:
If you don't have a virtual environment yet, run:
python -m venv .venv
Now activate it:
source .venv/bin/activate
And install the project package as an editable install:
pip install --editable .
Install dependencies
Install the dg command line tool
With uv:
We'll install dg globally as a uv tool:
uv tool install dagster-dg
This installs dg into a hidden, isolated Python environment separate from your project virtual environment. The dg executable is always available in the user's $PATH, regardless of any virtual environment activation in the shell. This is the recommended way to work with dg if you are using uv.
With pip:
Let's install dg into your project virtual environment. This is the recommended way to work with dg if you are using pip.
pip install dagster-dg
Update project structure
Add dg configuration
The dg command recognizes Dagster projects through the presence of TOML configuration. This may be either a pyproject.toml file with a tool.dg section or a dg.toml file. Let's add this configuration:
With uv:
Since our project already has a pyproject.toml file, we can just add the requisite tool.dg section to the file:
...
[tool.dg]
directory_type = "project"

[tool.dg.project]
root_module = "my_existing_project"
code_location_target_module = "my_existing_project.definitions"
With pip:
Since our sample project has a setup.py and no pyproject.toml, we'll create a dg.toml file:
directory_type = "project"

[project]
root_module = "my_existing_project"
code_location_target_module = "my_existing_project.definitions"
There are three settings:
- directory_type = "project": This is how dg identifies your package as a Dagster project. This is required.
- project.root_module = "my_existing_project": This points to the root module of your project. This is also required.
- project.code_location_target_module = "my_existing_project.definitions": This tells dg where to find the top-level Definitions object in your project. It defaults to [root_module].definitions, so it is not strictly necessary to set it here, but we include it to be explicit. Existing projects might define the top-level Definitions object in a different module, in which case this setting is required.
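The defaulting rule for code_location_target_module can be sketched in a few lines of Python. This is an illustration of the rule as described above, not dg's actual configuration loader. The function name resolve_project_settings and its is_pyproject flag are hypothetical, and the tomli fallback assumes that backport is installed on interpreters older than Python 3.11.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the tomli backport is installed


def resolve_project_settings(toml_text: str, is_pyproject: bool) -> dict:
    """Return the two project settings, applying the documented default."""
    data = tomllib.loads(toml_text)
    # pyproject.toml nests the settings under [tool.dg];
    # dg.toml keeps them at the top level.
    section = data.get("tool", {}).get("dg", {}) if is_pyproject else data
    if section.get("directory_type") != "project":
        raise ValueError('directory_type must be "project"')
    project = section.get("project", {})
    root_module = project["root_module"]  # required, so a KeyError is intended
    # Falls back to "<root_module>.definitions" when the key is omitted.
    target = project.get(
        "code_location_target_module", f"{root_module}.definitions"
    )
    return {
        "root_module": root_module,
        "code_location_target_module": target,
    }
```

Passing the contents of a dg.toml that omits code_location_target_module yields my_existing_project.definitions for that key, which is why setting it explicitly in our example is optional.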
Now that these settings are in place, you can interact with your project using dg. If we run dg list defs we can see the sole existing asset in our project:
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets │ ┏━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┓ │
│ │ ┃ Key ┃ Group ┃ Deps ┃ Kinds ┃ Description ┃ │
│ │ ┡━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━┩ │
│ │ │ my_asset │ default │ │ │ │ │
│ │ └──────────┴─────────┴──────┴───────┴─────────────┘ │
└─────────┴─────────────────────────────────────────────────────┘
Add a dagster_dg.plugin entry point
We're not quite done adding configuration. dg uses the Python entry point API to expose custom component types and other scaffoldable objects from user projects. Our entry point declaration will specify a submodule as the location where our project exposes plugin objects. By convention, this submodule is named <root_module>.lib. In our case, it will be my_existing_project.lib.
Let's create this submodule now:
mkdir my_existing_project/lib && touch my_existing_project/lib/__init__.py
See the plugin guide for more on dg plugins.
We'll need to add a dagster_dg.plugin entry point to our project and then reinstall the project package into our virtual environment. The reinstallation step is crucial. Python entry points are registered at package installation time, so if you simply add a new entry point to an existing editable-installed package, it won't be picked up.
Entry points can be declared in either pyproject.toml or setup.py:
With uv:
Since our package metadata is in pyproject.toml, we'll add the entry point declaration there:
...
[project.entry-points]
"dagster_dg.plugin" = { my_existing_project = "my_existing_project.lib"}
...
Then we'll reinstall the package. Note that uv sync will not reinstall our package, so we'll use uv pip install instead:
uv pip install --editable .
With pip:
Our package metadata is in setup.py. While it is possible to add entry point declarations to setup.py directly, we want to be able to read the entry point declaration from dg, and there is no reliable way to read setup.py (since it is arbitrary Python code). So we'll instead add the entry point to a new setup.cfg, which can be used alongside setup.py. Create setup.cfg with the following contents (if your package has existing entry points declared in setup.py, you'll want to move their definitions to setup.cfg as well):
[options.entry_points]
dagster_dg.plugin =
    my_existing_project = my_existing_project.lib
Then we'll reinstall the package:
pip install --editable .
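The contrast between setup.py and setup.cfg drawn above comes down to this: setup.cfg is declarative, so a tool can parse it without executing any project code. The sketch below shows the idea with the standard library's configparser. It is an illustration only and makes no claim about how dg reads the file. The names SETUP_CFG and read_entry_points are hypothetical.

```python
import configparser

# Same contents as the setup.cfg created above; the indented line is a
# continuation of the dagster_dg.plugin value.
SETUP_CFG = """\
[options.entry_points]
dagster_dg.plugin =
    my_existing_project = my_existing_project.lib
"""


def read_entry_points(cfg_text: str) -> dict:
    """Parse the [options.entry_points] section into {group: {name: target}}."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    result = {}
    for group, raw_value in parser["options.entry_points"].items():
        entries = {}
        # Each non-empty line of the multi-line value is "name = target".
        for line in raw_value.splitlines():
            line = line.strip()
            if not line:
                continue
            name, _, target = line.partition("=")
            entries[name.strip()] = target.strip()
        result[group] = entries
    return result
```

Calling read_entry_points(SETUP_CFG) gives a dictionary with a single dagster_dg.plugin group that maps my_existing_project to my_existing_project.lib.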
To make sure our plugin is working, let's scaffold a new component type and then make sure it's available to dg commands. First create the component type:
dg scaffold component-type Foo
Creating a Dagster component type at /.../my-existing-project/my_existing_project/lib/foo.py.
Scaffolded files for Dagster component type at /.../my-existing-project/my_existing_project/lib/foo.py.
Then run dg list plugins to confirm that the new component type is available:
dg list plugins
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Plugin ┃ Objects ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ dagster │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ │
│ │ ┃ Symbol ┃ Summary ┃ Features ┃ │
│ │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │
│ │ │ dagster.asset │ Create a │ [scaffold-ta… │ │
│ │ │ │ definition │ │ │
│ │ │ │ for how to │ │ │
│ │ │ │ compute an │ │ │
│ │ │ │ asset. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.asset_check │ Create a │ [scaffold-ta… │ │
│ │ │ │ definition │ │ │
│ │ │ │ for how to │ │ │
│ │ │ │ execute an │ │ │
│ │ │ │ asset check. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.components.DefinitionsComponent │ An arbitrary │ [component, │ │
│ │ │ │ set of │ scaffold-tar… │ │
│ │ │ │ dagster │ │ │
│ │ │ │ definitions. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.components.DefsFolderComponent │ A folder │ [component, │ │
│ │ │ │ which may │ scaffold-tar… │ │
│ │ │ │ contain │ │ │
│ │ │ │ multiple │ │ │
│ │ │ │ submodules, │ │ │
│ │ │ │ each │ │ │
│ │ │ │ which define │ │ │
│ │ │ │ components. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.components.PipesSubprocessScriptCollectionComponent │ Assets that │ [component, │ │
│ │ │ │ wrap Python │ scaffold-tar… │ │
│ │ │ │ scripts │ │ │
│ │ │ │ executed │ │ │
│ │ │ │ with │ │ │
│ │ │ │ Dagster's │ │ │
│ │ │ │ PipesSubpro… │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.multi_asset │ Create a │ [scaffold-ta… │ │
│ │ │ │ combined │ │ │
│ │ │ │ definition │ │ │
│ │ │ │ of multiple │ │ │
│ │ │ │ assets that │ │ │
│ │ │ │ are computed │ │ │
│ │ │ │ using the │ │ │
│ │ │ │ same op and │ │ │
│ │ │ │ same │ │ │
│ │ │ │ upstream │ │ │
│ │ │ │ assets. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.schedule │ Creates a │ [scaffold-ta… │ │
│ │ │ │ schedule │ │ │
│ │ │ │ following │ │ │
│ │ │ │ the provided │ │ │
│ │ │ │ cron │ │ │
│ │ │ │ schedule and │ │ │
│ │ │ │ requests │ │ │
│ │ │ │ runs for the │ │ │
│ │ │ │ provided │ │ │
│ │ │ │ job. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.sensor │ Creates a │ [scaffold-ta… │ │
│ │ │ │ sensor where │ │ │
│ │ │ │ the │ │ │
│ │ │ │ decorated │ │ │
│ │ │ │ function is │ │ │
│ │ │ │ used as the │ │ │
│ │ │ │ sensor's │ │ │
│ │ │ │ evaluation │ │ │
│ │ │ │ function. │ │ │
│ │ └─────────────────────────────────────────────────────────────┴──────────────┴───────────────┘ │
│ my_existing_project │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │
│ │ ┃ Symbol ┃ Summary ┃ Features ┃ │
│ │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │
│ │ │ my_existing_project.lib.Foo │ COMPONENT SUMMARY HERE. │ [component, scaffold-target] │ │
│ │ └─────────────────────────────┴─────────────────────────┴──────────────────────────────┘ │
└─────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────┘
You should see my_existing_project.lib.Foo listed in the output. This means our plugin entry point is working.
Create a defs directory
Part of the dg experience is autoloading definitions. This means automatically picking up any definitions that exist in a particular module. We are going to create a new submodule named my_existing_project.defs (defs is the conventional name in dg for the module where definitions live) from which we will autoload definitions.
mkdir my_existing_project/defs
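To make the idea of autoloading concrete, here is a dependency-free sketch that imports every submodule of a package and collects objects carrying a marker attribute. The function collect_marked_objects and the is_definition marker are hypothetical stand-ins invented for this illustration. Dagster's own autoloading is implemented differently and recognizes real definition types rather than a marker.

```python
import importlib
import pkgutil


def collect_marked_objects(package_name: str, marker: str = "is_definition") -> dict:
    """Import each submodule of a package and gather objects that carry `marker`.

    Illustration only: the marker attribute is a hypothetical stand-in for
    "this object is a definition" and is not part of dagster.
    """
    package = importlib.import_module(package_name)
    found = {}
    # walk_packages yields every module below the package, including nested ones.
    for info in pkgutil.walk_packages(package.__path__, prefix=f"{package_name}."):
        module = importlib.import_module(info.name)
        # Copy the items so the module namespace is not iterated while live.
        for name, obj in list(vars(module).items()):
            if getattr(obj, marker, False) is True:
                found[f"{info.name}.{name}"] = obj
    return found
```

The point of the sketch is the workflow it enables: adding a new file under the package is enough for its contents to be discovered, with no edits to a central list of imports.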
Modify top-level definitions
Autoloading is provided by a function that returns a Definitions object. Because we already have some other definitions in our project, we'll combine those with the autoloaded ones from my_existing_project.defs.
To do so, you'll need to modify your definitions.py file, or whichever file contains your top-level Definitions object.
You'll autoload definitions using load_defs, then merge them with your existing definitions using Definitions.merge. You pass load_defs the defs module you just created:
Before:
import dagster as dg
from my_existing_project.assets import my_asset

defs = dg.Definitions(
    assets=[my_asset],
)
After:
import dagster as dg
import my_existing_project.defs
from my_existing_project.assets import my_asset

defs = dg.Definitions.merge(
    dg.Definitions(assets=[my_asset]),
    dg.components.load_defs(my_existing_project.defs),
)
Now let's add an asset to the new defs module. Create my_existing_project/defs/autoloaded_asset.py with the following contents:
import dagster as dg
@dg.asset
def autoloaded_asset(): ...
Finally, let's confirm the new asset is being autoloaded. Run dg list defs again and you should see both the new autoloaded_asset and the old my_asset:
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets │ ┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┓ │
│ │ ┃ Key ┃ Group ┃ Deps ┃ Kinds ┃ Description ┃ │
│ │ ┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━┩ │
│ │ │ autoloaded_asset │ default │ │ │ │ │
│ │ ├──────────────────┼─────────┼──────┼───────┼─────────────┤ │
│ │ │ my_asset │ default │ │ │ │ │
│ │ └──────────────────┴─────────┴──────┴───────┴─────────────┘ │
└─────────┴─────────────────────────────────────────────────────────────┘
Now your project is fully compatible with dg!