Skip to main content

Updates

UPDATEs start to increase the complexity of your transformations, but fortunately, they’re pretty darn simple to migrate, as well. The thought process that you go through when translating an UPDATE is quite similar to how an INSERT works, but the logic for the SELECT list in the dbt model is primarily sourced from the content in the SET section of the UPDATE statement. Let’s look at a simple example:

UPDATE orders

SET type = 'return'

WHERE total < 0

The way to look at this is similar to an INSERT-SELECT statement. The table being updated is the model you want to modify, and since this is an UPDATE, that model has likely already been created, and you can either:

  • add to it with subsequent transformations
  • create an intermediate model that builds off of the original model – perhaps naming it something like int_[entity]_[verb].sql.

The SELECT list should contain all of the columns for the table, but for the specific columns being updated by the DML, you’ll use the computation on the right side of the equals sign as the SELECTed value. Then, you can use the target column name on the left of the equals sign as the column alias.

If I were building an intermediate transformation from the above query would translate to something along the lines of:

SELECT
CASE
WHEN total < 0 THEN 'return'
ELSE type
END AS type,

order_id,
order_date

FROM {{ ref('stg_orders') }}

Since the UPDATE statement doesn’t modify every value of the type column, we use a CASE statement to apply the contents’ WHERE clause. We still want to select all of the columns that should end up in the target table. If we left one of the columns out, it wouldn’t be passed through to the target table at all due to dbt’s declarative approach.

Sometimes, you may not be sure what all the columns are in a table, or in the situation as above, you’re only modifying a small number of columns relative to the total number of columns in the table. It can be cumbersome to list out every column in the table, but fortunately dbt contains some useful utility macros that can help list out the full column list of a table.

Another way I could have written the model a bit more dynamically might be:

SELECT
{{ dbt_utils.star(from=ref('stg_orders'), except=['type']) }},
CASE
WHEN total < 0 THEN 'return'
ELSE type
END AS type,

FROM {{ ref('stg_orders') }}

The dbt_utils.star() macro will print out the full list of columns in the table, but skip the ones I’ve listed in the except list, which allows me to perform the same logic while writing fewer lines of code. This is a simple example of using dbt macros to simplify and shorten your code, and dbt can get a lot more sophisticated as you learn more techniques. Read more about the dbt_utils package and the star macro.

0