dbt AI Features: What Data Engineers Actually Found

A working data engineer's take on dbt's AI features after weeks in production, including what delivered, what broke, and where the costs surprised us.

Sam McKay

The Hype vs the Pull Request

When dbt Labs started shipping AI features into the core product, the r/dataengineering threads filled up fast. The pitch was familiar. Generate SQL from natural language, get model documentation written for you, surface lineage explanations on demand. Practitioners on the subreddit had a fairly consistent reaction in the first weeks. The demos looked slick. The reality, as several senior analytics engineers put it, was “a useful intern who sometimes makes things up.”

That framing kept coming up. The tool is not a replacement for someone who understands your warehouse, your grain, or your business keys. It is a productivity layer on top of a workflow you already trust. Teams who approached it that way reported a noticeably different experience than teams who expected it to write their entire transformation layer from a Slack message.

What I want to do here is walk through what the practitioner community has actually found, including the numbers, the failure modes, and the workflow patterns that seem to hold up in production.

What Genuinely Delivered

The strongest signal across Hacker News (HN) threads and the dbt Slack community is around three specific jobs.

First, model documentation. If you have ever inherited a 400-model project with README files last touched in 2022, you know the pain. The AI doc generation feature takes a compiled SQL model plus its config and produces a description block, column-level descriptions, and a short summary of what the model represents. A data engineer at a mid-size fintech posted their numbers on a practitioner blog last quarter. They ran it across 180 models. Roughly 70 percent of the output was usable as-is or needed only minor edits. The remaining 30 percent needed rewriting because the AI misread the business context.

Second, YAML and schema file generation. This is the unglamorous work that nobody wants to do. The tool reads the compiled query, infers column types and tests, and writes a starter schema.yml. Practitioners reported saving 20 to 40 minutes per model on initial setup. For a project with dozens of new models per sprint, that compounds.

Third, on-demand SQL explanation. This is where the latency story is interesting. Practitioners in YouTube demos and on the dbt Discourse forum consistently reported response times between 2 and 6 seconds for a typical model explanation, depending on model size and warehouse connection. That is fast enough to feel conversational when you are debugging a CTE chain you did not write.

The cost side is harder to pin down because dbt’s AI features are bundled into the paid tier rather than metered per token. But the consensus from teams who dug into their bills was that the AI usage added roughly 8 to 15 percent to their dbt Cloud spend, with the heavy users at the higher end. One analytics lead at a Series B SaaS company said in a LinkedIn post that their monthly bill went from $1,200 to $1,340 after a team of six started using the features daily.
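As a quick sanity check, the LinkedIn example can be run against the 8 to 15 percent range directly. This is a minimal sketch; the function name is mine, and the only inputs are the two bill figures quoted above.

```python
def percent_increase(before: float, after: float) -> float:
    """Return the percentage increase from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after - before) / before * 100


# Monthly bill figures from the LinkedIn post quoted above.
increase = percent_increase(1200, 1340)
print(f"{increase:.1f}% increase")  # 11.7% increase
```

That lands in the middle of the reported range, which fits a six-person team using the features daily.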

Where It Falls Apart

The failure modes are where the practitioner conversation gets really useful.

The most common complaint, by a wide margin, is hallucinated business logic. The tool will confidently write a description for a model called fct_user_lifetime_value that has nothing to do with your actual LTV calculation. It infers from the name, not from the SQL. A data engineer on r/analytics posted a screenshot of generated docs that described a revenue model as “tracks daily active users” because the model had a user_id column somewhere in the lineage. That kind of mistake is fine if a human reviews it. It is dangerous if the docs get auto-published to a data catalog.

The second issue is around complex SQL. Practitioners consistently reported that the tool performs well on straightforward SELECT statements, aggregations, and basic joins. It struggles with window functions, recursive CTEs, and anything involving dbt macros with custom arguments. A senior engineer on the dbt discourse forum ran a benchmark across 50 queries. The tool got 42 of them right on the first try. The remaining 8 either returned syntactically valid SQL that produced wrong results, or returned SQL that referenced columns that did not exist in the source tables.

Third, onboarding friction is real for teams on the open-source dbt-core. The AI features are gated to dbt Cloud. If you are running dbt-core on your own infrastructure, you do not get them. Several HN commenters pointed out that this creates an awkward situation where the AI features are pitched in product announcements but only available to paying customers. For a team that has standardized on dbt-core for cost or compliance reasons, this is a hard blocker.

Fourth, the cost surprise. While most teams reported manageable increases, a few outliers posted about bills that jumped 40 to 60 percent in a single month. The pattern was always the same. Someone on the team set the AI to auto-generate docs on every model build, including in CI runs that fired hundreds of times per day. The feature does not have aggressive rate limiting by default. If you turn it on without thinking about invocation frequency, you will pay for it.
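One way to avoid that pattern is to put a guard in front of whatever triggers generation. The sketch below is hypothetical: it assumes your team calls doc generation from its own wrapper script, that your CI system sets a `CI` environment variable (GitHub Actions and GitLab CI do), and that you track a daily call count yourself. None of these names are dbt settings.

```python
from typing import Mapping


def should_generate_docs(
    env: Mapping[str, str],
    calls_today: int,
    daily_cap: int = 50,  # hypothetical budget; pick one that matches your bill
) -> bool:
    """Decide whether an AI doc-generation call should run.

    Skips CI runs entirely and enforces a simple daily cap elsewhere.
    """
    in_ci = env.get("CI", "").lower() in {"1", "true"}
    if in_ci:
        return False
    return calls_today < daily_cap
```

In a wrapper script this would be called as `should_generate_docs(os.environ, calls_today=...)` before any generation request goes out.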

Who It Actually Fits

The community signal points to a fairly specific profile.

The teams getting the most value have 5 to 20 data engineers or analytics engineers, already pay for dbt Cloud, and have a model layer that is mostly correct but under-documented. For them, the AI fills in the documentation debt, accelerates new model setup, and helps junior engineers understand existing models faster.

Teams that should probably skip it for now are the ones with highly bespoke transformation logic, heavy macro usage, or strict data governance requirements where hallucinated descriptions cannot be tolerated. A regulated fintech with manual review requirements on every column description will spend more time validating AI output than they save generating it.

Solo practitioners and very small teams (1 to 3 people) get less value because the documentation debt is usually smaller and the time savings do not compound as fast. A freelance analytics engineer commented on a YouTube walkthrough that they tried the features for a month and ended up turning them off because the manual review overhead exceeded the generation speedup.

The stack context matters too. If your team is already heavy on AI-assisted coding tools like Cursor or Copilot, the marginal value of dbt’s AI features is lower. You can get similar SQL generation from those tools with prompts that include your project context. The dbt-native features win on integration. They read your compiled artifacts, your project structure, and your existing YAML. A general-purpose coding assistant does not have that context unless you feed it manually.

What Teams Pair It With

The most common pairing pattern in the practitioner conversations is dbt AI features plus a strict human review layer. Teams that got the best results treated the AI output as a first draft, not a final answer. They added a docs review step to their PR template. They required a human to check every generated description before merge. Several teams built lightweight internal scripts that flagged any model description that contained certain trigger phrases (like “likely” or “approximately”) because those words often signaled the AI was guessing.
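A trigger-phrase check like the one those teams describe can be very small. This is a sketch under assumptions, not anyone's actual script: the phrase list contains only the two examples mentioned above, and the input is a plain mapping of model name to description that you would extract from your own YAML or manifest.

```python
import re

# The two example phrases from the text; extend with your own.
TRIGGER_PHRASES = ("likely", "approximately")

_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, TRIGGER_PHRASES)) + r")\b",
    re.IGNORECASE,
)


def flag_descriptions(descriptions: dict[str, str]) -> dict[str, list[str]]:
    """Map each model name to the trigger phrases found in its description."""
    flagged = {}
    for model, text in descriptions.items():
        hits = sorted({m.group(1).lower() for m in _PATTERN.finditer(text)})
        if hits:
            flagged[model] = hits
    return flagged


docs = {
    "fct_orders": "One row per order.",
    "fct_user_lifetime_value": "Likely tracks revenue per user, approximately monthly.",
}
print(flag_descriptions(docs))
# {'fct_user_lifetime_value': ['approximately', 'likely']}
```

The word-boundary match means "unlikely" does not trip the check. A flagged model is a prompt for a human to look, not proof that the description is wrong.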

The second common pairing is with a data observability tool like Monte Carlo, Bigeye, or Elementary. The AI features help you write and document models. The observability tools tell you whether those models are actually producing correct results. Practitioners on the r/dataengineering subreddit consistently said that AI-assisted SQL generation without observability is a recipe for silent data quality issues.

For teams on dbt-core who want similar functionality, the alternatives being discussed in 2026 are mostly custom. Some teams are using Cursor with project-level context files that include their dbt manifest. Others are using open-source tools like SQLFluff for formatting plus a separate LLM API call for documentation. The cost is lower but the integration is rougher. One engineer on HN described their setup as “three scripts and a Makefile that I am slightly afraid to touch.”
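The first half of a do-it-yourself setup like that is finding which models need docs at all. Here is a minimal sketch, assuming the standard `target/manifest.json` artifact with a top-level `nodes` mapping whose entries carry `resource_type`, `name`, `description`, and `raw_code` (`raw_sql` in older dbt versions). The prompt wording is illustrative, and the LLM call itself is deliberately left out.

```python
import json
from pathlib import Path


def load_manifest(path: str = "target/manifest.json") -> dict:
    """Load a dbt manifest from disk (path is the usual default, adjust as needed)."""
    return json.loads(Path(path).read_text())


def undocumented_models(manifest: dict) -> list[dict]:
    """Return name and SQL for every model node with an empty description."""
    results = []
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue
        if (node.get("description") or "").strip():
            continue  # already documented
        sql = node.get("raw_code") or node.get("raw_sql") or ""
        results.append({"name": node["name"], "sql": sql})
    return results


def build_prompt(model: dict) -> str:
    """Build a plain-text documentation prompt for one model."""
    return (
        f"Describe the dbt model `{model['name']}` based only on its SQL. "
        "Say so explicitly if the business meaning is unclear.\n\n"
        f"{model['sql']}"
    )
```

Running `undocumented_models(load_manifest())` after a `dbt compile` gives a work list. Each prompt can then go to whichever LLM API your team uses, with the output landing in a PR for human review.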

The other alternative worth noting is doing nothing. Several senior practitioners argued in HN threads that the right answer for many teams is to invest the time in writing documentation properly, once, rather than generating it with AI and reviewing it forever. The math depends on how fast your model layer changes. If you are adding 5 models a quarter, manual docs are fine. If you are adding 50, the AI starts to pay off.
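That break-even logic can be written down as a back-of-envelope calculation. Every input below is an assumption for illustration: 30 minutes saved per model is the midpoint of the 20 to 40 minute range quoted earlier, while the 10 minutes of review per model and the 480 minutes of fixed quarterly overhead (setup, cost monitoring, prompt tuning) are made-up placeholders you should replace with your own measurements.

```python
def net_minutes_saved(
    models_per_quarter: int,
    saved_per_model: float = 30.0,   # midpoint of the 20-40 min range above
    review_per_model: float = 10.0,  # assumed human review time
    fixed_overhead: float = 480.0,   # assumed quarterly setup and monitoring
) -> float:
    """Net minutes saved per quarter; negative means manual docs are cheaper."""
    per_model_gain = saved_per_model - review_per_model
    return models_per_quarter * per_model_gain - fixed_overhead


for n in (5, 50):
    print(n, net_minutes_saved(n))
# 5 -380.0
# 50 520.0
```

With these placeholder numbers, a team adding 5 models a quarter loses time and a team adding 50 comes out ahead, which matches the shape of the argument. The crossover point moves a lot with the review time, so measure that first.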

The Honest Take

After reading through dozens of practitioner posts, watching the YouTube deep dives, and talking to a few teams running this in production, the picture is fairly clear. dbt’s AI features are a useful productivity layer for teams who already have their transformation patterns nailed down and need help with the surrounding work. Documentation, schema files, and on-demand explanations are the wins. The risks are hallucinated context, complex SQL edge cases, and cost surprises from unmonitored usage.

The tool is not going to replace your senior analytics engineer. It is not going to write your most important revenue model from a one-line prompt. What it will do is take 30 to 50 percent off the time you spend on documentation and boilerplate, if you build the review discipline to catch its mistakes.

For a team deciding whether to enable these features, the questions to ask are simple. Do you already pay for dbt Cloud? Is your documentation debt large enough that AI-generated first drafts would save real time? Can you enforce a human review step on every AI output? If the answer to all three is yes, turn it on, measure the cost for a month, and adjust.
