What even is dbt? An Analytics engineer explains | Laurie Merrell & Michael Chow | Data Science Lab
Author: Posit PBC
Uploaded: 2026-01-30
Views: 1089
Description:
The Data Science Lab is a live weekly call. Register at pos.it/dslab! Discord invites go out each week on live calls. We'd love to have you!
The Lab is an open, messy space for learning and asking questions. Think of it like pair coding with a friend or two. Learn something new, and share what you know to help others grow.
On this call, Libby Heeren is joined by Jarvis Innovations Lead Analytics Engineer Laurie Merrell and Posit Principal Software Engineer Michael Chow as they walk us through a beginner dbt project and let us ask as many questions as we like (and we do, we ask all the questions, including, WHAT EVEN IS dbt??). This is a super friendly, MESSY, collaborative, and curious peek at dbt. It's a tool that's often mysterious to data scientists, and it's a big enough framework that it can feel tough to get started with. Walking through the basics makes it way easier to get into!
Hosting crew from Posit: Libby Heeren, Isabella Velasquez
Laurie's LinkedIn: / laurie-merrell
Michael's socials and urls:
LinkedIn: / michael-a-chow
Bluesky: https://bsky.app/profile/mchow.com
GitHub: https://github.com/machow
Resources from the hosts and chat:
🔗 Michael Chow's talk about dbt at the Coalesce Conference in 2022: • The accidental analytics engineer
🔗 Beginner dbt project Michael is using: https://github.com/dbt-labs/jaffle_sh...
🔗 Laurie's Coalesce talk with Ian and Jenna: • From coast to coast: Implementing dbt in t...
🔗 Link to installation page for the DuckDB CLI: https://duckdb.org/install/?platform=...
🔗 "Why is dbt so important" shared by Jenna in the chat: https://highgrowthengineering.substac...
🔗 dbtplyr: https://hub.getdbt.com/emilyriederer/...
🔗 Parquet: https://parquet.apache.org/
🔗 From stored procedures to dbt: A modern migration playbook: https://www.getdbt.com/blog/stored-pr...
🔗 How to structure our dbt projects: https://docs.getdbt.com/best-practice...
🔗 Jenna Jordan's blog on dbt mesh: https://jennajordan.me/blog/data-mesh...
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here:
Website: https://www.posit.co
The Lab: https://pos.it/dslab
Hangout: https://pos.it/dsh
LinkedIn: / posit-software
Bluesky: https://bsky.app/profile/posit.co
Thanks for learning with us! 💛
Timestamps
00:00 Introduction
01:09 Guest introductions: Michael Chow and Laurie Merrell
04:15 Overview of today's session
05:51 Setting up the GitHub Codespace
07:00 The data science workflow vs. organizational needs
10:06 Why dbt is hard to learn in the abstract
13:34 "Could we back up and explain what dbt is again?"
19:12 Running 'dbt build'
20:00 Inspecting the database with DuckDB CLI
26:21 "Does dbt have concurrency or dependency capabilities?"
27:37 Understanding the 'ref' macro
29:52 "Is dbt an orchestrator?"
31:14 "Starting a project from scratch with just SQL?"
32:04 "How is this better than writing Python scripts?"
35:46 "Is data source detection dynamic with dbt?"
38:36 Generating and serving dbt docs
46:51 "Is dbt an IDE like RStudio, but for SQL?"
52:32 Branching and development environments
53:57 "Where would you begin on a brand new project?"
56:38 "How would you validate dependencies and downstream impacts?"
57:48 Defining a view versus a table