Power BI Dataflows in Catalog
You'll learn how Catalog uses Power BI Dataflows when it builds lineage from reports and data sets back to warehouse tables. You'll also see what Catalog stores for each flow, how to open and read that lineage in Catalog, how to find a Dataflow in search or navigation, which M patterns Catalog recognizes, what to expect for freshness and schedules, and what to check when lineage looks incomplete or stale.
What Is a Power BI Dataflow?
A Power BI Dataflow is an optional data preparation layer in Power BI. Dataflows use Power Query M to connect to sources, shape data, and publish entities that data sets and other Dataflows can reuse. In many architectures, data moves from a warehouse into a Dataflow, then from the Dataflow into one or more data sets, then into reports and dashboards.
Microsoft's overview explains concepts and authoring: Introduction to Dataflows.
How Dataflows Relate to Lineage in Catalog
Catalog maintains a metadata graph that links BI assets to warehouse objects your integrations have already synced into Catalog. When a data set table loads from a Dataflow, that Dataflow is an extra step between the data set and the warehouse. If that step isn't represented in the metadata Catalog ingests from Power BI, lineage can stop at the data set even when warehouse sync is healthy.
Catalog processes Dataflows as part of Power BI lineage so asset-level and column-level lineage can run through the Dataflow layer when the underlying metadata is available and upstream objects exist in Catalog. You don't turn on a separate Dataflows option in Catalog. The same Power BI credentials, Admin API Settings, and extraction runs cover Dataflow metadata when your tenant and models expose it.
Lineage quality still depends on both sides of the graph:
- Power BI - Admin settings, refresh or republish behavior, and what the Power BI APIs return for mashup and related metadata. See Power BI for details.
- Warehouse and other sources - Tables and columns must be present in Catalog through your warehouse integration or another integration. If Catalog doesn't know about a database or connection your Dataflow uses, lineage can be incomplete even when Power BI extraction succeeds.
The following sections explain how that Dataflow layer is recorded, how to open it in Catalog, how to find Dataflows in search, what to expect for sync cadence, and how to interpret supported M patterns next to reports and warehouse tables.
What Catalog Stores for Power BI Dataflows
Catalog ingests each Power BI Dataflow as a Power BI visualization model, in the same family as a semantic data set. Catalog distinguishes a Dataflow from a semantic data set for lineage and storage. You see both types in the same Power BI areas of Catalog, for example Dashboards and search results for visualization models, using the names Power BI assigns in your tenant.
For each Dataflow, Catalog stores entities as tables on that model. For each entity, Catalog keeps the entity name, the M query text, resolved paths to warehouse tables when the M reads the warehouse, and links between entities when one entity's M references another Power BI table in the same flow, a pattern typical for linked entities. Catalog uses those internal entity links when it resolves lineage all the way to the warehouse, including for column-level lineage when M and metadata are clear enough.
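The way entity links lead back to the warehouse can be pictured as a small graph walk. The following is a minimal sketch, not Catalog's implementation: the `ENTITIES` structure, the entity names, and the table paths are all hypothetical and exist only to illustrate how a linked entity inherits the warehouse tables of the entities it references.

```python
from typing import Dict, List, Set

# Hypothetical in-memory model of one Dataflow: for each entity, the warehouse
# tables its M reads directly and the sibling entities its M references.
ENTITIES: Dict[str, Dict[str, List[str]]] = {
    "RawOrders": {"warehouse": ["ANALYTICS.SALES.ORDERS"], "entities": []},
    "RawCustomers": {"warehouse": ["ANALYTICS.SALES.CUSTOMERS"], "entities": []},
    "OrdersEnriched": {"warehouse": [], "entities": ["RawOrders", "RawCustomers"]},
}


def resolve_warehouse_tables(
    entity: str, entities: Dict[str, Dict[str, List[str]]]
) -> Set[str]:
    """Return every warehouse table reachable from an entity via entity links."""
    resolved: Set[str] = set()
    seen: Set[str] = set()
    stack = [entity]
    while stack:
        current = stack.pop()
        if current in seen or current not in entities:
            continue  # skip cycles and references to unknown entities
        seen.add(current)
        resolved.update(entities[current]["warehouse"])
        stack.extend(entities[current]["entities"])
    return resolved
```

In this sketch, `resolve_warehouse_tables("OrdersEnriched", ENTITIES)` returns both warehouse tables even though the linked entity reads neither of them directly.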
When a semantic data set loads from a Dataflow, its M can use the Power BI Dataflows or Power Platform Dataflows connectors. Catalog parses the workspace, Dataflow ID, and entity name from that M so the data set connects to the right flow and entity. Both the Dataflow ID and the entity name need to be present in a supported shape, as described in Supported Patterns and Limits; otherwise the Dataflow reference does not resolve from that M alone.
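To make the idea of a recognizable reference concrete, here is an illustrative sketch of pulling the workspace, Dataflow ID, and entity name out of a data set's M. It is an assumption-laden example, not Catalog's parser: it assumes the navigation steps use `workspaceId`, `dataflowId`, and `entity` record keys, and the GUIDs and entity name in `SAMPLE_M` are placeholders.

```python
import re
from typing import Dict, Optional

# Either connector entry point counts as a Dataflow reference.
CONNECTOR_RE = re.compile(r"\b(PowerBI|PowerPlatform)\.Dataflows\s*\(")

# Assumed navigation keys; real M can differ from this shape.
FIELD_RES = {
    "workspace_id": re.compile(r'\[\s*workspaceId\s*=\s*"([^"]+)"'),
    "dataflow_id": re.compile(r'\[\s*dataflowId\s*=\s*"([^"]+)"'),
    "entity": re.compile(r'\[\s*entity\s*=\s*"([^"]+)"'),
}

SAMPLE_M = '''
let
    Source = PowerBI.Dataflows(null),
    Workspace = Source{[workspaceId="11111111-aaaa-bbbb-cccc-222222222222"]}[Data],
    Flow = Workspace{[dataflowId="33333333-dddd-eeee-ffff-444444444444"]}[Data],
    Customers = Flow{[entity="Customers"]}[Data]
in
    Customers
'''


def parse_dataflow_reference(m_text: str) -> Optional[Dict[str, str]]:
    """Return the parsed reference, or None when the M is not a usable one."""
    if not CONNECTOR_RE.search(m_text):
        return None
    found: Dict[str, str] = {}
    for key, pattern in FIELD_RES.items():
        match = pattern.search(m_text)
        if match:
            found[key] = match.group(1)
    # Both the Dataflow ID and the entity are required to identify a target.
    if "dataflow_id" not in found or "entity" not in found:
        return None
    return found
```

Calling `parse_dataflow_reference(SAMPLE_M)` yields all three values, while M that names a Dataflow ID without an entity returns `None`, mirroring the rule that both identifiers must be present.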
Column-level lineage builds on the same graph. It can be weaker than table-level lineage when an entity has several warehouse sources in one logical table, when M is heavily parameterized or indirect, or when field mappings are ambiguous. Catalog can use explicit rename mappings in M, for example Table.RenameColumns, for some column-level paths; other patterns can still be incomplete.
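As an illustration of why explicit rename mappings help, the sketch below reads the old and new column names out of `Table.RenameColumns` calls and maps each final column name back to its upstream name. This is a simplified example and not how Catalog processes M: it assumes literal string pairs, ignores escaped quotes and parentheses inside strings, and uses made-up column names in `SAMPLE_RENAME_M`.

```python
import re
from typing import Dict

PAIR_RE = re.compile(r'\{\s*"([^"]*)"\s*,\s*"([^"]*)"\s*\}')
CALL = "Table.RenameColumns("

SAMPLE_RENAME_M = (
    'let Source = Raw, '
    'Renamed = Table.RenameColumns(Source, {{"CUST_ID", "Customer ID"}, {"CUST_NM", "Name"}}), '
    'Again = Table.RenameColumns(Renamed, {{"Name", "Customer Name"}}) '
    'in Again'
)


def call_arguments(m_text: str, open_paren: int) -> str:
    """Return the text between a call's opening parenthesis and its match."""
    depth = 0
    for i in range(open_paren, len(m_text)):
        if m_text[i] == "(":
            depth += 1
        elif m_text[i] == ")":
            depth -= 1
            if depth == 0:
                return m_text[open_paren + 1 : i]
    return m_text[open_paren + 1 :]  # unbalanced text: fall back to the rest


def rename_map(m_text: str) -> Dict[str, str]:
    """Map each renamed (downstream) column to its original upstream name."""
    mapping: Dict[str, str] = {}
    pos = m_text.find(CALL)
    while pos != -1:
        args = call_arguments(m_text, pos + len(CALL) - 1)
        for old, new in PAIR_RE.findall(args):
            # Follow chained renames back to the earliest known name.
            mapping[new] = mapping.pop(old, old)
        pos = m_text.find(CALL, pos + len(CALL))
    return mapping
```

For `SAMPLE_RENAME_M`, the result maps `Customer ID` to `CUST_ID` and `Customer Name` to `CUST_NM`, following the second rename through the first. Renames built from variables or computed lists would not be captured by a sketch like this, which is one way column-level paths end up incomplete.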
Lineage in the Catalog UI
Catalog shows lineage between reports, visualization models, and warehouse tables in the same lineage graph as other assets. Dataflows and semantic data sets both count as visualization models. The graph can include intermediate steps, for example warehouse to Dataflow entity to semantic data set, even when you focus on an end-to-end question.
Use this sequence when you want to inspect Dataflow-backed lineage:
- Open the Power BI asset in Catalog from Dashboards in the left navigation or from Catalog search. Quick and advanced search both include visualization models; Power BI Dataflows appear alongside Power BI data sets in that visualization model family. Open the visualization model whose name matches the Dataflow or semantic data set in Power BI, or open a report or dashboard that depends on that model. Then select the Lineage tab on that asset.
Figure: On a report or dashboard, Lineage shows preview cards for Dashboard Lineage and Field Lineage before you open the full graph.
- On a report or dashboard, choose Open Dashboard Lineage to inspect object-level dependencies in the lineage canvas. On a visualization model, the canvas opens from the same Lineage tab without those preview cards. Use + on the left to expand upstream sources and on the right to expand downstream usage, as described in Lineage. When a data set loads from a Dataflow, that Dataflow appears as another visualization model in the graph once metadata resolves.
Figure: Dashboard lineage with the report focused; upstream nodes include the semantic data set and warehouse objects.
- Select a warehouse table or view in the graph, or open that warehouse asset and its lineage, when you want to confirm how Catalog links warehouse objects into Power BI. Downstream nodes show which visualization models and reports consume the warehouse object.
Figure: Warehouse view focused with downstream semantic data set and report.
- For column-level paths, return to the asset Lineage tab and choose Open Field Lineage, or from a warehouse column use Go To Column Lineage when that control is available. Field lineage depends on how clearly Power Query maps fields. Dataflow-backed paths show here when table-level links support column resolution.
Figure: Field lineage for one column from the warehouse through the semantic data set into the report.
Every time your sources sync into Catalog, lineage is recomputed. The lineage graph uses data from the last 30 days. If links look stale after a recent change, adjust the time range under the graph.
Supported Patterns and Limits
Catalog anchors Dataflow support on what the Power BI admin APIs return and on M patterns the product parses and tests. Treat the following as the supported surface for data set to Dataflow references in M:
- Recognized connector entry points - M that navigates through PowerBI.Dataflows or PowerPlatform.Dataflows toward a workspace, Dataflow ID, and entity name.
- Composite key - Catalog ties a reference to a flow and entity when both the Dataflow identifier and the entity appear in the expected structure. If either is missing or expressed in a way the parser does not recognize, lineage through that Dataflow does not resolve until the model's M matches a supported shape and Power BI exposes matching admin metadata.
Linked entities inside a Dataflow, where one entity is built on another, are part of supported modeling: Catalog follows those internal links when resolving paths to the warehouse.
Plan for the following limits:
- Ambiguous or multi-source entities - When one entity resolves to multiple warehouse table paths in a way Catalog cannot reduce to a single path, column-level lineage can be incomplete even when table-level links exist.
- Dynamic or complex M - Parameters, indirection, or unusual connector shapes can weaken lineage until metadata and M align with what extraction returns.
- Warehouse gaps - If the warehouse tables your Dataflow reads are not in Catalog, the graph stops where Catalog has no upstream object, regardless of Power BI extraction.
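When you audit entities against these limits, it can help to sort each one into a rough outcome. The sketch below is a hypothetical helper, not a Catalog feature: the outcome labels, the table paths, and the idea of comparing resolved paths with a set of tables you know are in Catalog are all assumptions made for illustration.

```python
from typing import Iterable, Set


def classify_entity(resolved_paths: Iterable[str], catalog_tables: Set[str]) -> str:
    """Classify one entity by the warehouse paths its M resolves to."""
    paths = set(resolved_paths)
    if not paths:
        # Nothing could be resolved, e.g. dynamic or heavily parameterized M.
        return "unresolved"
    if not paths <= catalog_tables:
        # At least one upstream table is unknown to Catalog.
        return "warehouse-gap"
    if len(paths) > 1:
        # Table-level links exist; column-level lineage may be incomplete.
        return "multi-source"
    return "single-path"
```

An entity classified as `multi-source` here can still show table-level lineage; the label only flags that column-level paths deserve a closer look.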
Fabric Dataflow behavior follows the same Power BI integration, admin API output, and M patterns described in this section. Use the patterns and limits above when you design validations for Fabric-backed flows.
Performance and Freshness
Use this section to set expectations for how often Dataflow-backed metadata and lineage update in Catalog. Treat extraction duration, Microsoft API behavior, and lineage recomputation timing as driven by your integration schedules and successful sync outcomes rather than by fixed timing promises in this documentation.
For Catalog-managed Power BI ingestion, the first sync can take up to 48 hours, as described in Power BI. Treat any Dataflow or data set as potentially absent from search and lineage until that first pass completes successfully.
For ongoing Power BI metadata, after the first sync, Catalog-managed environments follow the schedule you coordinate with Catalog operations. Client-managed environments follow the schedule you configure for castor-extract-powerbi and upload, as in Power BI. After your trial, you schedule extraction at your desired frequency, up to once per day, so Dashboard sections stay current, as described in Data visualization integrations.
For warehouse metadata, warehouse integrations typically sync once per day after the first sync, as described in the Catalog onboarding guide. Lineage through a Dataflow still requires warehouse tables and columns to exist in Catalog, so warehouse freshness and Power BI freshness both matter.
When models or admin settings change, use Power BI to refresh or republish affected data sets, then allow a full Catalog extraction cycle before you judge search results or lineage. Power BI sometimes serves updated mashup and admin metadata shortly after your change, but Catalog only reflects it after the next successful extraction.
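The timing rule above reduces to one comparison: a change counts as reflected only once a successful extraction has finished after it. A minimal sketch follows, assuming you track successful run timestamps yourself; the function name and the sample schedule are hypothetical.

```python
from datetime import datetime
from typing import Iterable, Optional

# Hypothetical history of successful extraction runs (one per day at 02:00).
SUCCESSFUL_RUNS = [
    datetime(2024, 5, 1, 2, 0),
    datetime(2024, 5, 2, 2, 0),
    datetime(2024, 5, 3, 2, 0),
]


def first_extraction_after(
    change_time: datetime, successful_runs: Iterable[datetime]
) -> Optional[datetime]:
    """Return the earliest successful run after the change, or None if none yet."""
    later = [run for run in successful_runs if run > change_time]
    return min(later) if later else None
```

If the function returns `None`, no extraction has picked up the change yet, so search results and lineage in Catalog are expected to lag behind Power BI.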
How to Validate Dataflow Lineage
Follow these steps when you want to confirm that lineage crosses a Dataflow path.
- Confirm Power BI admin settings - In the Power BI Admin portal, verify the same Admin API Settings called out in Power BI remain enabled for your Catalog service principal's security group.
- Refresh or republish affected data sets - Follow refresh and republish guidance for data sets in Power BI so lineage-related metadata is current in Power BI before the next Catalog extraction.
- Confirm the warehouse side is in Catalog - Open the warehouse integration documentation for your platform, such as Snowflake, and ensure the databases and objects your Dataflow reads are in scope for sync. If a data set still shows no upstream tables, add or extend the warehouse source that backs those tables, then let Catalog run another sync.
- Wait for the next scheduled extraction - Catalog-managed environments run on a schedule you coordinate with Catalog operations. Client-managed environments use your own schedule for castor-extract-powerbi and upload. See Power BI.
If lineage for Dataflow-backed tables was empty in the past and prerequisites are fixed now, run through the same steps again and allow a full extraction cycle before you open a Support ticket.
Troubleshooting
Use the following situations to narrow down common issues.
Lineage or Metadata Still Looks Wrong After Admin API Changes
When Admin API Settings are correct but graphs or upstream paths still look empty or old, the gap is usually timing or scope rather than the toggle itself.
- Refresh or republish the affected semantic data sets in Power BI so mashup and related metadata match what you expect in the service.
- Wait for Power BI to show consistent metadata for those assets in the admin experience you use outside Catalog, then plan for one full Catalog extraction afterward.
- Confirm warehouse objects for the same logical environment, for example dev compared to prod, are actually in Catalog and match what the Dataflow queries.
- Re-run validation only after both warehouse sync and Power BI extraction have completed successfully for that cycle.
No Upstream Lineage From a Data Set That Uses Dataflows
- Admin API metadata is off or scoped wrong - If DAX and mashup expressions or detailed metadata are disabled or the security group for your Catalog service principal is missing, Catalog doesn't receive enough information to resolve Dataflow-backed tables. Fix the settings, then refresh data sets and wait for the next extraction.
- Warehouse not in Catalog - Lineage only connects assets Catalog already knows about. Add or expand the warehouse integration for the database your Dataflow queries, then sync again.
- Stale Power BI metadata - Refresh or republish the data set in Power BI, then run another Catalog extraction cycle.
- M shape not recognized - If the data set's M does not expose both Dataflow ID and entity in a supported Power BI Dataflows or Power Platform Dataflows pattern, Catalog does not connect the data set to the flow. Compare the model's M to Supported Patterns and Limits, republish after changes, and wait for extraction.
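The four causes above can be worked through as a simple checklist. The sketch below is a hypothetical helper for your own runbooks, not something Catalog provides: the `DatasetChecks` fields are assumptions about facts you would gather manually from Power BI and Catalog.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetChecks:
    """Facts gathered by hand for one data set (all fields are hypothetical)."""
    admin_metadata_enabled: bool
    warehouse_in_catalog: bool
    refreshed_since_change: bool
    m_has_dataflow_id_and_entity: bool


def likely_causes(checks: DatasetChecks) -> List[str]:
    """List the causes that still apply, in the order they are described above."""
    causes: List[str] = []
    if not checks.admin_metadata_enabled:
        causes.append("Admin API metadata is off or scoped wrong")
    if not checks.warehouse_in_catalog:
        causes.append("Warehouse not in Catalog")
    if not checks.refreshed_since_change:
        causes.append("Stale Power BI metadata")
    if not checks.m_has_dataflow_id_and_entity:
        causes.append("M shape not recognized")
    return causes
```

An empty list means none of these four causes applies, which is a reasonable point to wait for the next extraction cycle or collect details for Support.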
Lineage Is Partial or Stops at Some Columns
Column-level lineage depends on how clearly Power Query steps map fields to upstream columns. Heavy renaming, merging, or indirection inside the Dataflow or data set can limit column-level links even when table-level lineage appears. Catalog also improves column-level lineage for some Power BI column rename patterns, for example explicit Table.RenameColumns mappings in M. Patterns outside those cases can still be incomplete. When one entity effectively draws from multiple warehouse sources in parallel, column-level lineage can stop where Catalog cannot pick a single resolved path.
Parameterized or Dynamic M Logic
Organizations often use M parameters for environment or database names. Resolution depends on what Catalog can infer from exported metadata. If you use complex parameter wiring or highly dynamic connection logic and lineage stays incomplete after you try the other guidance in this Troubleshooting section, contact Support with a short description of the pattern. Do not include credentials.
Client-Managed Extraction
If you run castor-extract-powerbi yourself, compare your log output and output files with a recent successful run on the current extractor. If extraction completes but lineage doesn't improve, gather the run timestamp and open a Support ticket so the team can trace ingestion with you.
When You Cannot Tell Which Workspace an Asset Belongs To
Depending on how names are shown in lineage, you might not see the Power BI workspace at every step in the graph. When many workspaces share similar content, use consistent data set and report naming that includes the environment so you can tell assets apart.
Cannot Find a Dataflow in Search or Dashboards
You find Power BI Dataflows in the same places as other Power BI visualization models: under Dashboards in the left navigation or through Catalog search. There is no separate left navigation entry only for Dataflows. Quick search and advanced search both surface visualization models; Dataflows use the name and description text Catalog receives from Power BI, with the same optional tag and field search behavior described in Catalog search. If you still cannot locate a flow, work through this list:
- Match the Power BI name - Search using the Dataflow name as it appears in Power BI, not an informal label, folder title, or workspace nickname. Catalog indexes the visualization model name and description from the tenant.
- Include visualization models in advanced search - In advanced search, confirm Visualization models is included in the result types you filter on. If that type is unchecked or your filters are very narrow, Dataflows disappear from the result set even when they are synced.
- Confirm the Power BI integration is healthy - In Settings > Integrations, check that the Power BI integration is active, not paused, and that credentials are valid. A paused integration or an expired secret stops new extraction, so search and lineage stop updating until you fix the connection.
- Allow first sync or a full cycle after big changes - For Catalog-managed Power BI, the first ingestion can take up to 48 hours. After you change admin settings, workspaces, or major model definitions, wait for at least one successful extraction before you assume a Dataflow is missing.
- Check workspace and access in Catalog - If your Catalog user or team does not have access to the workspace or asset scope where the Dataflow lives, you might not see it in lists or search even when extraction succeeded for the tenant.
For a concise summary of how search applies to visualization models, see the Advanced Search Results and How Is Search Performed sections in Catalog search.
What's Next?
- Complete or review Power BI credentials and admin steps in Power BI.
- Review quick and advanced search behavior for visualization models in Catalog search.
- Connect and sync the warehouses your Dataflows read in Data warehouse integrations.
- Practice expanding upstream and downstream lineage in Lineage.