Model Selection For dbt CLI

Author:Murphy  |  View: 23897  |  Time: 2025-03-23 20:02:24

When working on dbt projects, you need to ensure that the CLI commands used to run or test models, seeds and snapshots cover only the resources of interest. In other words, you need to be able to target specific models, tests, seeds or snapshots in order to avoid wasting resources and money. This is even more important when you work with fairly big models that process large volumes of data.

By default, dbt run|test|seed|snapshot will execute all corresponding nodes in the dependency graph (i.e. dbt run will run all models, dbt test will run all tests, and so on). In this article, we will present the model selection shorthands you can take advantage of when running or testing models, seeds or snapshots via the dbt Command Line Interface (CLI).

If you are looking into experimenting with the commands that we will present in the next few sections, feel free to create an example dbt project locally. You can do so (in probably less than two minutes) by following this step-by-step guide where you can also find a containerised dbt environment.


Run all resources in a dbt project

In order to select all resources within a dbt project, all you need to do is select the project name:

# Runs all models in project my_dbt_project
dbt run --select my_dbt_project

# Runs all tests in project my_dbt_project
dbt test --select my_dbt_project

# Runs all snapshots in project my_dbt_project
dbt snapshot --select my_dbt_project

# Runs all seeds in project my_dbt_project
dbt seed --select my_dbt_project

Select a specific resource

In order to execute the run, test, snapshot or seed command for a specific resource, all you need to do is specify its name in the --select option:

# Run model with name `my_model`
dbt run --select my_model

# Run test with id `not_null_orders_order_id`
dbt test --select not_null_orders_order_id

# Run snapshot with name `my_snapshot`
dbt snapshot --select my_snapshot

# Run seed with name `my_seed`
dbt seed --select my_seed

You can also select a specific model, seed or snapshot by the path of the file that defines it:

# Run model my_model
dbt run --select path/to/my_model.sql

# Run snapshot my_snapshot
dbt snapshot --select path/to/my_snapshot.sql

# Run seed my_seed
dbt seed --select path/to/my_seed.sql

Select multiple models

--select accepts multiple arguments, which means that it can run multiple models (or tests, snapshots and seeds) in a single command. To do so, simply provide all model, test, snapshot or seed names, separated by spaces, when running the command:

# Run multiple models
dbt run --select my_model another_model

# Run multiple tests
dbt test --select not_null_orders_order_id unique_orders_order_id

# Run multiple snapshots
dbt snapshot --select my_snapshot another_snapshot

# Run multiple seeds
dbt seed --select my_seed another_seed

Select node and downstream dependencies

In order to run a dbt node as well as its downstream dependencies, you will need to specify the + operator after the resource name.

# Run the model with name `my_model` as well as its downstream dependencies
dbt run --select my_model+

# Run my_model tests and the tests of its downstream dependencies 
dbt test --select my_model+

# Run seed my_seed and its downstream dependencies
dbt seed --select my_seed+
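
To make the trailing + more concrete, here is a minimal Python sketch that mimics it on a toy dependency graph. The node names and the `select_with_children` helper are hypothetical, and this is not how dbt itself is implemented; it only illustrates that `my_model+` resolves to the node plus everything reachable downstream from it.

```python
from collections import deque

# Hypothetical toy DAG: each key maps to its direct children (downstream nodes).
CHILDREN = {
    "stg_orders": ["my_model"],
    "stg_customers": ["my_model"],
    "my_model": ["orders_summary", "finance_report"],
    "orders_summary": ["dashboard"],
    "finance_report": [],
    "dashboard": [],
}


def select_with_children(node):
    """Mimic `node+`: the node itself plus all of its downstream nodes."""
    selected = {node}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for child in CHILDREN.get(current, []):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return selected


print(sorted(select_with_children("my_model")))
```

In this toy graph the parents of my_model (stg_orders and stg_customers) are not part of the result, since the traversal only follows edges downstream.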

Select model and upstream dependencies

Likewise, to select a node and its upstream dependencies, the + operator needs to be specified prior to the node name:

# Run the upstream dependencies of model `my_model` and the model itself
dbt run --select +my_model

# Run the tests of my_model and the tests of its upstream dependencies
dbt test --select +my_model

# Run the upstream dependencies of snapshot my_snapshot and the snapshot itself
dbt snapshot --select +my_snapshot

# Run the upstream dependencies of seed my_seed and the seed itself
dbt seed --select +my_seed

Select model with downstream and upstream dependencies

Now in order to run a model as well as all of its downstream and upstream dependencies, you just need to specify the model name in-between two + operators:

# Run the model `my_model` including its parents and children nodes
dbt run --select +my_model+

# Run the tests for model `my_model` including the tests of its parents and children
dbt test --select +my_model+

# Run the snapshot `my_snapshot` and all downstream and upstream dependencies
dbt snapshot --select +my_snapshot+

# Run the seed `my_seed` and all of the downstream and upstream dependencies
dbt seed --select +my_seed+
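
One detail that is easy to miss is that `+my_model+` covers the ancestors and descendants of my_model only, not every node connected to them. The following Python sketch illustrates this on a toy graph; the edge list and helper names are hypothetical and only approximate the behaviour described above.

```python
# Hypothetical toy DAG as (parent, child) edges; not taken from a real project.
EDGES = [
    ("stg_orders", "my_model"),
    ("stg_orders", "sibling_model"),
    ("my_model", "finance_report"),
    ("sibling_model", "other_report"),
]


def walk(node, direction):
    """Collect every node reachable from `node` going 'up' or 'down'."""
    found = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent, child in EDGES:
            if direction == "down" and parent == current:
                neighbour = child
            elif direction == "up" and child == current:
                neighbour = parent
            else:
                continue
            if neighbour not in found:
                found.add(neighbour)
                stack.append(neighbour)
    return found


def select_both_directions(node):
    """Mimic `+node+`: ancestors, the node itself, and descendants."""
    return walk(node, "up") | {node} | walk(node, "down")


print(sorted(select_both_directions("my_model")))
```

Here sibling_model shares the parent stg_orders with my_model, but it is neither upstream nor downstream of my_model, so it stays out of the selection.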

Select model and N downstream dependencies

Instead of running all the downstream (children) dependencies of a model, you may want to limit the number of edges to step through. This can be achieved once again using the + operator, but this time followed by the number of levels of child models to include.

# Run model my_model and its first-degree children
dbt run --select my_model+1

# Run tests for `my_model` model and the tests of its first-degree children
dbt test --select my_model+1

# Run `my_snapshot` snapshot and its first-degree children
dbt snapshot --select my_snapshot+1

# Run seed `my_seed` and its first-degree children
dbt seed --select my_seed+1

Select model and N upstream dependencies

In the same way, you can specify the number of edges to step through when it comes to upstream (or parent) dependencies:

# Run my_model and its first and second degree parent nodes
dbt run --select 2+my_model

# Run tests of my_model and the tests of its first and second degree parents
dbt test --select 2+my_model

# Run snapshot my_snapshot and its first and second degree parent nodes
dbt snapshot --select 2+my_snapshot

# Run seed my_seed and its first and second degree parent nodes
dbt seed --select 2+my_seed

Select model and N upstream and M downstream dependencies

Finally, to select a model as well as N parent and M children nodes, you can specify the model in between the number of edges to step through for both upstream and downstream dependencies:

# Run model `my_model`, its parents up to the 4th level and its downstreams up to the 5th level
dbt run --select 4+my_model+5

# Run tests of model `my_model` and the tests of its parents up to the 4th level and its downstreams up to the 5th level
dbt test --select 4+my_model+5

# Run snapshot `my_snapshot`, its parents up to the 4th level and its downstreams up to the 5th level
dbt snapshot --select 4+my_snapshot+5

# Run seed `my_seed`, its parents up to the 4th level and its downstreams up to the 5th level
dbt seed --select 4+my_seed+5
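
The N+model+M notation can be thought of as a traversal that stops after a fixed number of edges in each direction. Below is a minimal Python sketch of that idea, assuming a hypothetical linear toy graph; the `walk` and `select` helpers are illustrative names and not part of dbt.

```python
# Hypothetical toy DAG as (parent, child) edges.
EDGES = [
    ("raw_orders", "stg_orders"),
    ("stg_orders", "my_model"),
    ("my_model", "orders_summary"),
    ("orders_summary", "dashboard"),
]


def walk(node, direction, max_depth):
    """Nodes within `max_depth` edges of `node`, going 'up' or 'down'."""
    found = set()
    frontier = {node}
    for _ in range(max_depth):
        next_frontier = set()
        for parent, child in EDGES:
            if direction == "down" and parent in frontier:
                next_frontier.add(child)
            elif direction == "up" and child in frontier:
                next_frontier.add(parent)
        next_frontier -= found
        if not next_frontier:
            break
        found |= next_frontier
        frontier = next_frontier
    return found


def select(node, up=0, down=0):
    """Mimic `<up>+node+<down>` with explicit edge limits."""
    return walk(node, "up", up) | {node} | walk(node, "down", down)


print(sorted(select("my_model", down=1)))        # like my_model+1
print(sorted(select("my_model", up=2)))          # like 2+my_model
print(sorted(select("my_model", up=1, down=2)))  # like 1+my_model+2
```

With `down=1` the sketch stops at orders_summary and leaves out dashboard, which is two edges away from my_model.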

Exclude a model

Apart from --select, the dbt CLI also offers the --exclude flag (with the same semantics as --select). Any model specified in the --exclude argument will be removed from the set of models selected with --select.

The following command will run all models except the one called my_model:

dbt run --exclude my_model

The --exclude argument is also applicable to other dbt commands:

# Run all tests except the one with id `not_null_orders_order_id`
dbt test --exclude not_null_orders_order_id

# Run all tests except the tests of customers model
dbt test --exclude customers

# Run all snapshots except `my_snapshot`
dbt snapshot --exclude my_snapshot

# Run all seeds except `my_seed`
dbt seed --exclude my_seed

Note that both --select and --exclude arguments can be combined in a single dbt command.

For example, the following command will run all models in package my_package except the model user_base_model and its downstream dependencies.

dbt run --select my_package --exclude my_package.user_base_model+
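
Conceptually, combining --select with --exclude behaves like a set difference: the excluded nodes are removed from whatever --select resolved to. The Python sketch below mirrors the command above on a toy package; the model names other than user_base_model are hypothetical and the code only approximates the behaviour.

```python
# Hypothetical models in package `my_package`, mapped to their direct children.
CHILDREN = {
    "user_base_model": ["user_metrics"],
    "user_metrics": ["user_dashboard"],
    "user_dashboard": [],
    "orders": ["orders_summary"],
    "orders_summary": [],
}


def with_descendants(node):
    """Mimic `node+`: the node plus all of its downstream nodes."""
    selected = {node}
    stack = [node]
    while stack:
        for child in CHILDREN[stack.pop()]:
            if child not in selected:
                selected.add(child)
                stack.append(child)
    return selected


selected = set(CHILDREN)                         # --select my_package
excluded = with_descendants("user_base_model")   # --exclude my_package.user_base_model+
to_run = selected - excluded

print(sorted(to_run))
```

In this toy package only orders and orders_summary remain, because user_base_model and both of its downstream models are removed.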

Run a model in a specific package

To run a model, test, snapshot or seed that belongs to a specific dbt package, you need to follow the dot notation as illustrated in the following command:

# Runs model my_model in package mypackage
dbt run --select mypackage.my_model

# Runs tests of my_model model in package mypackage
dbt test --select mypackage.my_model

# Runs snapshot my_snapshot in package mypackage
dbt snapshot --select mypackage.my_snapshot

# Runs seed my_seed in package mypackage
dbt seed --select mypackage.my_seed

Run all models in a specific path

In order to run models, tests, snapshots or seeds placed under a specific directory, you can use the following selector notation:

# Runs all models under path.to.my.models directory
dbt run --select path.to.my.models

# Runs all tests under path.to.my.models directory
dbt test --select path.to.my.models

# Runs all snapshots under path.to.my.snapshots directory
dbt snapshot --select path.to.my.snapshots

# Runs all seeds under path.to.my.seeds directory
dbt seed --select path.to.my.seeds

In addition to the dot notation, you can also run models in a specific path as illustrated below:

# Runs all models under path/to/my/models directory
dbt run --select path/to/my/models

# Runs all tests under path/to/my/models directory
dbt test --select path/to/my/models

# Runs all snapshots under path/to/my/snapshots directory
dbt snapshot --select path/to/my/snapshots

# Runs all seeds under path/to/my/seeds directory
dbt seed --select path/to/my/seeds

Select model with a specific tag

If you have tagged resources and you would like to execute all of them, you can provide the tag selector followed by the tag name, as illustrated in the following command:

# Run all models with "finance" tag
dbt run --select tag:finance

Combining multiple selectors

Note that you can actually combine pretty much any selector described in this tutorial in a single command. For example, the following command will run every resource tagged with the finance tag, the individual model my_model as well as all the models in the path path.to.my.marketing.models:

dbt run --select tag:finance my_model path.to.my.marketing.models

And as usual, this can be applied to pretty much every resource, including tests, seeds and snapshots:

# Tests
dbt test --select tag:finance not_null_orders_order_id path.to.my.marketing.models

# Seeds
dbt seed --select tag:finance my_seed path.to.my.marketing.seeds

# Snapshots
dbt snapshot --select tag:finance my_snapshot path.to.my.marketing.snapshots
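
Space-separated selectors are combined as a union, so a resource is picked up if it matches any one of them. The following Python sketch shows this with a deliberately simplified resolver. The resource catalogue, the directory paths and the `resolve` and `select` helpers are all hypothetical, and only three selector forms are handled (tag, directory path with slashes, and plain name).

```python
# Hypothetical resources: name -> (tags, directory). Purely illustrative.
RESOURCES = {
    "my_model": ({"core"}, "models/core"),
    "revenue": ({"finance"}, "models/finance"),
    "invoices": ({"finance"}, "models/finance"),
    "campaigns": (set(), "models/marketing"),
    "customers": (set(), "models/core"),
}


def resolve(selector):
    """Resolve one simplified selector: `tag:<name>`, a directory, or a name."""
    if selector.startswith("tag:"):
        tag = selector[len("tag:"):]
        return {name for name, (tags, _) in RESOURCES.items() if tag in tags}
    if "/" in selector:
        return {
            name
            for name, (_, path) in RESOURCES.items()
            if path == selector or path.startswith(selector + "/")
        }
    return {selector} if selector in RESOURCES else set()


def select(*selectors):
    """Combine several selectors as a union, like space-separated arguments."""
    result = set()
    for selector in selectors:
        result |= resolve(selector)
    return result


print(sorted(select("tag:finance", "my_model", "models/marketing")))
```

The customers model is left out in this example because it carries no finance tag, is not named explicitly and does not live under models/marketing.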

Final Thoughts

In conclusion, when working with dbt projects, it is important to be able to target specific models, tests, seeds, or snapshots in order to avoid wasting resources and money. The dbt Command Line Interface (CLI) offers a variety of shorthands that allow you to select specific resources to run, test, seed, or snapshot. These include selecting resources by name, path, package or tag, walking their upstream and downstream dependencies with the + operator, and excluding resources with --exclude.

