AI in Power BI Development: What Works, What Doesn't, and What's Next
April 27, 2026
By Tony Thomas
TL;DR: AI is already useful in Power BI development, but it is also deeply limited in ways that matter. Understanding exactly where that line falls is the difference between shipping better reports faster and debugging AI-generated nonsense for hours. This is a practical, category-level assessment: where AI helps today, where it fails, and where purpose-built tools are filling gaps that general-purpose AI cannot.
Microsoft Copilot in Power BI: The Official Story
Microsoft has invested heavily in Copilot integration across the Power BI service. Credit where it is due: the vision is ambitious and some of the execution is genuinely good.
What Copilot does well:
- Natural language Q&A over datasets. Asking plain-English questions about data and getting visuals back is real and works. For ad-hoc exploration by business users who will never learn DAX, this is a meaningful capability.
- Narrative summaries. Copilot can generate text descriptions of what a visual shows. For executive reports where someone needs a written summary alongside a chart, this saves time.
- Report page generation. You can describe what you want on a page, and Copilot will create a first draft with visuals. The results vary, but the starting point is often better than a blank canvas for simple layouts.
These are real features solving real problems. Dismissing them outright is intellectually lazy.
Where Copilot falls short:
- DAX generation is surface-level. Copilot can write basic measures: SUM, AVERAGE, COUNTROWS with simple filters. Once you need CALCULATE with complex filter context, iterator functions, or measures that reference other measures in a specific evaluation order, the output becomes unreliable. You spend more time validating and fixing than you would have spent writing the measure yourself.
- No awareness of your semantic model's intent. Copilot sees table and column names. It does not understand that your model uses a specific star schema pattern, that certain columns are surrogate keys not meant for aggregation, or that your business defines "active customer" with a particular set of criteria. It generates technically valid DAX that is semantically wrong for your model.
- Report design is functional, not good. Generated pages put visuals on a canvas. They do not follow layout principles, visual hierarchy, or consistent spacing. The output is a starting point that needs significant rework for anything client-facing.
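To make the DAX complexity threshold concrete, here is a sketch of the tier where Copilot's output starts costing more in validation than hand-writing would. Every name here (FactSales, DimDate, OrderCount, [Total Revenue], the "three or more orders" definition of active) is hypothetical, not from any real model:

```dax
-- Hypothetical measure combining a business-rule filter, a measure
-- reference, and an inactive role-playing relationship. Each piece is
-- simple; it is the combination that Copilot handles unreliably.
Active Customer Revenue % :=
VAR ActiveRevenue =
    CALCULATE (
        [Total Revenue],
        KEEPFILTERS ( FactSales[OrderCount] >= 3 ),          -- assumed "active" rule
        USERELATIONSHIP ( FactSales[ShipDateKey], DimDate[DateKey] )  -- inactive relationship
    )
RETURN
    DIVIDE ( ActiveRevenue, [Total Revenue] )
```

Getting this right requires knowing that the ship-date relationship is inactive and that "active" is a business definition, which is exactly the model-intent context Copilot does not have.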
The licensing constraint nobody talks about enough:
Copilot for Power BI requires Premium or Fabric capacity. That means either a P SKU or an F SKU, starting around EUR 50 per month at the lowest tier. Roughly 80% of Power BI users are on Pro licenses. For them, Copilot does not exist. Every "AI in Power BI" article that fails to mention this is doing readers a disservice.
This is not a minor detail. It means the majority of the Power BI ecosystem has no access to Microsoft's own AI features and needs to look elsewhere for AI-assisted development.
Generic AI for DAX: The Copy-Paste Problem
Outside of Copilot, the most common AI workflow in Power BI development is straightforward: copy a requirement into a general-purpose LLM, ask for a DAX measure, paste the result into Power BI Desktop.
This works surprisingly well for isolated, well-defined calculations. If you need a year-over-year growth measure and you describe your date table structure, a capable LLM will produce correct DAX most of the time.
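For reference, this is the complexity tier that generic LLMs handle well: an isolated time-intelligence calculation. The sketch below assumes a marked date table called DimDate and an existing [Total Sales] measure; both names are illustrative:

```dax
-- A typical well-defined, isolated calculation. Assumes DimDate is a
-- proper date table with a contiguous DimDate[Date] column.
YoY Growth % :=
VAR CurrentSales = [Total Sales]
VAR PriorSales =
    CALCULATE ( [Total Sales], DATEADD ( DimDate[Date], -1, YEAR ) )
RETURN
    DIVIDE ( CurrentSales - PriorSales, PriorSales )
```

The pattern is self-contained: one measure reference, one standard time-intelligence function, no dependency on relationships the LLM cannot see.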
The problems start when complexity increases.
Context collapse is the core issue. A general-purpose LLM does not have your semantic model. It does not know your table names, your relationships, your existing measure library. You can paste schema information into the prompt, but this approach has hard limits:
- Token windows fill up fast. A moderately complex model with 20 tables, 200 columns, and 50 existing measures already pushes the practical limits of what you can paste into a prompt while leaving room for the actual question and a useful response.
- Relationship awareness is fragile. You can describe that DimCustomer relates to FactSales on CustomerKey, but once you have 15 relationships including inactive ones and role-playing dimensions, the LLM loses track. The DAX it generates may assume a filter propagates through a path that does not exist in your model.
- Measure dependencies break. If your new measure needs to reference [Total Revenue], which itself references [Net Price], which uses SELECTEDVALUE on a slicer table, the LLM needs all of that context. Miss one dependency and the output is wrong in ways that are not obvious until you test edge cases.
The hallucination problem is real but misunderstood. LLMs rarely invent DAX functions that do not exist. The more common failure is combining real functions in ways that are syntactically valid but logically incorrect: a CALCULATE with a filter that contradicts the measure's intended behavior, or a SUMX that iterates over the wrong table. These errors pass a syntax check and look reasonable. They just produce wrong numbers.
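A sketch of what "syntactically valid but logically incorrect" looks like in practice. Names (FactSales, Amount) are hypothetical; the request is "revenue from orders over 1,000":

```dax
-- Correct intent: the boolean filter adjusts only FactSales[Amount],
-- so slicers and row labels still apply.
Large Order Revenue :=
CALCULATE ( SUM ( FactSales[Amount] ), FactSales[Amount] > 1000 )

-- Plausible LLM output for the same request: compiles cleanly, but
-- ALL strips the report's entire filter context from FactSales, so
-- every cell of a matrix returns the same grand total.
Large Order Revenue Alt :=
SUMX (
    FILTER ( ALL ( FactSales ), FactSales[Amount] > 1000 ),
    FactSales[Amount]
)
```

Both versions pass a syntax check. Only testing the measure in a filtered visual reveals that the second one ignores slicers entirely.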
This is why the schema-aware approach matters. When an AI tool has access to your actual TMDL schema, including table definitions, relationships, existing measures, and column metadata, the generation quality improves dramatically. Not because the underlying model is smarter, but because it has the right context to avoid the most common errors.
Draft BI's Model Studio generates DAX from your actual TMDL schema, sending only structural metadata to the LLM. No row-level data leaves your environment. This is a meaningful distinction from pasting your schema into a chat window, both for accuracy and for data governance.
AI-Assisted Report Design: More Than Pretty Layouts
Report design is where AI's current limitations are most visible and where purpose-built tooling adds the most value.
The layout problem is harder than it looks. Generating a "good" dashboard layout is not a matter of placing visuals in a grid. It requires understanding:
- Visual hierarchy (what should the reader see first)
- Information density appropriate to the audience
- Consistent spacing and alignment across all elements
- Responsive considerations for different screen sizes
- Logical grouping of related metrics
General-purpose AI can talk about these principles. It cannot reliably execute them in a Power BI report specification.
What actually works today:
AI can generate wireframe-level layouts that serve as starting points. Given a description of the metrics and KPIs needed, an AI tool can produce a spatial arrangement that follows basic design principles. The key word is "starting point." A developer still needs to refine spacing, adjust visual types, and ensure the layout communicates the data story correctly.
This is still valuable. Starting from a structured wireframe instead of a blank canvas saves 20 to 30 minutes per report page, and the results are more consistent than what most developers produce freehand.
Where AI design assistance needs to go next:
- Design scoring. Evaluating an existing layout against established principles (alignment, spacing ratios, visual balance) and providing specific, actionable feedback. Not "your report could be improved" but "the gap between your KPI cards and the chart below is 24px while all other gaps are 16px."
- Accessibility checking at the design stage. Catching contrast issues, missing alt text patterns, and color-only encoding before the report is built, not after.
- Template intelligence. Learning from an organization's existing reports to generate layouts that match established patterns rather than generic defaults.
Draft BI's Wireframe Studio takes this approach. You build layouts with drag-and-drop or generate them from descriptions, then export to .pbir format for direct import into Power BI Desktop. The design happens in a purpose-built environment with layout guides and spacing tools, not in a chat window.
AI-Assisted Model Assessment: The Overlooked Category
This is the area where AI provides the highest return with the least risk, and it gets almost no attention in the "AI in Power BI" conversation.
Evaluating a semantic model's quality, identifying structural issues, suggesting relationship improvements, and flagging common anti-patterns: these are tasks where AI excels because the input is well-structured (TMDL is essentially a configuration language) and the evaluation criteria are well-defined.
What model assessment can catch:
- Missing or incorrect relationships. Tables that should be related but are not, or relationships with wrong cardinality.
- Star schema violations. Fact tables relating directly to other fact tables, dimension tables with measures, bridge tables that could be eliminated.
- Column-level issues. Columns with names that suggest they are keys but are not used in relationships. Calculated columns that should be measures. Data types that do not match their apparent purpose.
- Measure organization. Measures scattered across tables instead of collected in a dedicated measure table. Circular or unnecessarily complex dependency chains.
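The "calculated columns that should be measures" finding is worth illustrating, since it is one of the most common. A hedged sketch with assumed names (FactSales, Amount, Cost):

```dax
-- Anti-pattern: a calculated column computes margin per row at refresh
-- time and persists the result for every row of the fact table.
Margin = FactSales[Amount] - FactSales[Cost]

-- Measure form: same arithmetic, evaluated at query time, with no
-- persisted column inflating the model.
Total Margin :=
SUMX ( FactSales, FactSales[Amount] - FactSales[Cost] )
```

This is exactly the kind of structural check that is tedious for a human across a 30-table model but trivial for automated analysis of the TMDL schema.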
This kind of analysis is tedious for a human to do comprehensively. A senior developer might catch the obvious issues in a model review, but systematically checking every relationship, every column, every measure dependency across a 30-table model takes hours. AI does it in seconds and catches patterns a human reviewer might overlook because they are focused on the business logic, not the structural hygiene.
Draft BI's Model Studio includes a Model Assessment mode that runs this kind of analysis against your TMDL schema. It categorizes findings by severity and provides specific remediation guidance, not generic advice.
Works with any Power BI license. No Premium or Fabric capacity required.
What Doesn't Work (Yet)
Honesty about current limitations is more useful than hype about future possibilities.
End-to-end report generation from requirements. The dream of describing a business requirement and getting a production-ready Power BI report is years away, if it arrives at all. The gap between "generate a sales dashboard" and a report that handles the specific edge cases of your business, your data quality issues, and your stakeholder preferences is enormous.
Automated data modeling. AI can assess an existing model and suggest improvements. It cannot reliably design a model from scratch given raw source tables. The decisions involved in dimensional modeling, grain definition, handling slowly changing dimensions, choosing between snowflake and star patterns for specific use cases, require domain knowledge and business context that AI does not have.
Self-correcting DAX. Current AI cannot reliably detect when its own DAX output produces incorrect results. It can check syntax. It cannot verify that the business logic is correct because it does not know what "correct" means for your specific calculation. A year-over-year measure that returns a number is not necessarily returning the right number.
Natural language reporting for complex analysis. Simple questions ("what were total sales last quarter") work. Nuanced questions ("which customer segments showed declining engagement before churning, controlling for seasonal patterns") do not. The gap between natural language understanding and analytical rigor is real.
What's Actually Next
The trajectory is clear even if the timeline is not.
Schema-aware AI becomes the baseline. The copy-paste-into-a-chat-window workflow will be replaced by tools that integrate directly with your semantic model. This is already happening. Within two years, any serious AI-assisted Power BI development will start with the model schema, not a blank prompt.
Design systems for BI. The web development world solved the "every page looks different" problem with design systems: defined spacing scales, color tokens, component libraries. Power BI is behind on this. AI-assisted tooling that enforces design consistency across reports is coming, and it will be more impactful than AI that generates individual visuals.
Model-level intelligence. Instead of asking AI to write one measure at a time, the next step is AI that understands your entire measure library and can suggest additions, identify redundancies, and flag measures that are defined inconsistently with related measures. Think of it as a code review for your semantic model, running continuously.
The unbundling of Copilot's gaps. Microsoft will continue improving Copilot, and it will get meaningfully better. But the Premium licensing requirement creates a structural gap that third-party tools will fill for the Pro-license majority. Purpose-built tools that do one thing well (layout design, DAX generation with full schema context, model assessment) will outperform general-purpose AI for those specific tasks, the same way specialized developer tools outperform general-purpose code assistants for specific languages and frameworks.
Where This Leaves You
If you are a Power BI developer evaluating AI tools today, the practical advice is straightforward:
Use Copilot if you have Premium capacity. It is genuinely useful for exploration and first-draft generation. Do not expect production-ready output.
Stop pasting schemas into generic chat tools. The results are mediocre and the workflow is slow. Use tools that integrate with your model schema natively.
Invest time in model quality. AI-assisted DAX generation is only as good as the model it reads. A well-structured star schema with clear naming conventions produces dramatically better AI output than a messy model with ambiguous column names. This is true regardless of which AI tool you use.
Treat AI output as a first draft, always. Validate every measure. Test edge cases. Check filter context behavior. The time savings from AI come from faster first drafts, not from eliminating review.
The developers who will benefit most from AI in Power BI are not the ones who adopt every new feature uncritically. They are the ones who understand exactly what each tool does well, use it for that purpose, and maintain their own expertise for everything else.
That expertise is not going anywhere. The tools just got better.
Founder of Draft BI, building the design-first companion for Power BI report development. Writing about PBIR, WCAG accessibility, DAX measures, and the workflows that help Power BI developers and analysts deliver better reports faster.