Power BI
Requirements
1. IP Whitelisting
This only applies if you need a VPN to connect to Power BI.
If applicable, whitelist the following Catalog IP address: 35.246.176.138
2. Create a Microsoft Entra app for Catalog in the Azure portal
Log in to the Azure portal and search for Microsoft Entra ID. In it, create a new app with the following parameters, then click Register:
Name: Catalog
Supported account types: Accounts in this organizational directory only
On the homepage of your newly created application, from the Overview screen, copy the values for the following fields and store them in a secure location for later:
Application (client) ID
Directory (tenant) ID
From the left menu of your newly created application page, click Certificates & secrets and create a new client secret with the description and expiration date of your choosing.
Then, for the newly created client secret, click the clipboard icon to copy the Value and store it in a secure location for later.
For more details check here
Make sure the app doesn't have any admin-consent required permissions for Power BI set on it in the Azure portal. They're never used and can cause errors that are hard to troubleshoot. See how to check whether your app has any such permissions.
3. Create a Microsoft Entra security group
In the left menu of the Microsoft Entra ID page, under the Manage section, click Groups. Then create a new group with the following configuration:
Set the Group type to Security.
Enter "API AD" as the Group name and (optionally) a Group description.
Under Members, search for the application registration created above and add it to the list.
For more details check here
4. Enable the Power BI Service admin settings
Go to the Power BI Admin portal tenant settings (how to get to the Admin portal). For more details, check here
Then, in the Developer Settings section:
Enable Service principals can use Fabric APIs
Add the group API AD to it
For more details check here
In the Admin API Settings section:
Enable Service principals can access read-only admin APIs
Enable Enhance admin APIs responses with detailed metadata
Enable Enhance admin APIs responses with DAX and mashup expressions
Add the group API AD to all of them
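Once the settings above are enabled, you may want to confirm that the service principal can actually reach the admin APIs. The following is a minimal sketch, not part of the Catalog tooling. It assumes the standard Microsoft identity platform client-credentials flow, the Power BI `.default` scope, and the admin `groups` endpoint as a read-only smoke test; none of these values come from this guide, so adjust them to your tenant.

```python
import json
import urllib.parse
import urllib.request

# Assumptions (not from this guide): the v2.0 token endpoint, the Power BI
# ".default" scope, and the admin "groups" endpoint used as a smoke test.
LOGIN_URL = "https://login.microsoftonline.com"
SCOPE = "https://analysis.windows.net/powerbi/api/.default"
API_BASE = "https://api.powerbi.com/v1.0/myorg"


def token_request(tenant_id, client_id, secret):
    """Build the client-credentials token request (URL and form body)."""
    url = f"{LOGIN_URL}/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": secret,
        "scope": SCOPE,
    }).encode()
    return url, body


def check_admin_access(tenant_id, client_id, secret):
    """Return the HTTP status of one read-only admin API call (needs network)."""
    url, body = token_request(tenant_id, client_id, secret)
    # Passing data makes this a POST with a form-encoded body.
    with urllib.request.urlopen(urllib.request.Request(url, data=body)) as resp:
        token = json.load(resp)["access_token"]
    req = urllib.request.Request(
        f"{API_BASE}/admin/groups?$top=1",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Calling `check_admin_access(tenant_id, client_id, secret)` with the three values saved in step 2 should return 200 if the group membership and tenant settings are in effect. An HTTP error at this point usually suggests a setting has not been applied yet or the app is missing from the API AD group.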
4b. Refresh and Republish datasets (recommended)
In order to get information on lineage from the Power BI API, make sure to:
Refresh or republish datasets (especially those that are not scheduled)
Republish datasets containing only DirectQuery tables
Catalog Managed
Once we receive your credentials, Catalog will be able to directly pull the data from Power BI.
Please send us the following:
Tenant (Directory) ID: your Power BI instance tenant identifier
Client (Application) ID: the ID of the Catalog application for Power BI
Secret Value: the value of the secret associated with the Catalog app
Enter your credentials directly in the Catalog App here, using the following format:
{
"clientId": "****",
"secret": "****",
"tenantId": "****"
}
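Before submitting, it can help to check that the JSON is well formed and carries the three expected keys. The sketch below is only an illustration: the function name `validate_credentials` is hypothetical, and the check reflects the format shown above rather than any validation Catalog itself performs.

```python
import json

# Key names taken from the credentials format shown above.
REQUIRED_KEYS = {"clientId", "secret", "tenantId"}


def validate_credentials(raw):
    """Parse the credentials JSON and check the required keys are present."""
    creds = json.loads(raw)
    if not isinstance(creds, dict):
        raise ValueError("credentials must be a JSON object")
    missing = sorted(REQUIRED_KEYS - set(creds))
    if missing:
        raise ValueError(f"missing keys: {missing}")
    empty = sorted(k for k in REQUIRED_KEYS if not str(creds[k]).strip())
    if empty:
        raise ValueError(f"empty values: {empty}")
    return creds
```

A malformed document raises `json.JSONDecodeError` (a subclass of `ValueError`), so a single `except ValueError` covers both syntax and content problems.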
Your first sync can take up to 48 hours; we will let you know when it is complete ✅
If you are not comfortable giving us access to your credentials, please continue to Client managed 👇
Client managed
Running the Extraction package
Install the PyPI package
pip install castor-extractor[powerbi]
For further details: link
Running the PyPI package
Once the package has been installed, you should be able to run the following command in your terminal:
castor-extract-powerbi [arguments]
The script will run and display logs as follows:
INFO - Starting extraction of PowerBiAsset.ACTIVITY_EVENTS
INFO - Wrote output file: ./files/1708021983-activity_events.json
INFO - Starting extraction of PowerBiAsset.DASHBOARDS
INFO - Wrote output file: ./files/1708021983-dashboards.json
INFO - Starting extraction of PowerBiAsset.DATASETS
INFO - Wrote output file: ./files/1708021983-datasets.json
INFO - Starting extraction of PowerBiAsset.METADATA
INFO - scan bbe1669a-8d4b-4598-a3a1-8763ea2babe7 ready
INFO - Wrote output file: ./files/1708021983-metadata.json
INFO - Starting extraction of PowerBiAsset.REPORTS
INFO - Wrote output file: ./files/1708021983-reports.json
INFO - Wrote output file: /tmp/catalog/1649078755-summary.json
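If you capture these logs (for instance in a scheduled job), the written file paths can be recovered from the "Wrote output file" lines. This is a small sketch that assumes the log format stays exactly as in the sample above; `written_files` is a hypothetical helper, not part of the package.

```python
import re

# Matches the path that follows "Wrote output file: " on a log line.
WROTE = re.compile(r"Wrote output file: (\S+)")


def written_files(log_text):
    """Return every output path reported in the extraction logs, in order."""
    return WROTE.findall(log_text)
```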
Arguments
-t: Tenant ID, your Power BI instance tenant identifier
-c: Client (Application) ID, the ID of the Catalog application for Power BI
-s: Secret Value, the value of the secret associated with the Catalog app
-o, --output: Target folder to store the extracted files
Optional arguments
-sc, --scopes: Power BI scopes to be used, optional
-l, --login_url: Login URL of your Microsoft Entra server, optional
-a, --api_base: Power BI REST API base URL, optional
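To keep the secret out of your shell history, the command can be assembled from environment variables and launched from a script. The sketch below uses only the four arguments listed above; the environment variable names and the default `./files` folder are assumptions made for this example.

```python
import os
import subprocess


def extract_command(tenant_id, client_id, secret, output_dir):
    """Build the argv list for castor-extract-powerbi from the four arguments."""
    return [
        "castor-extract-powerbi",
        "-t", tenant_id,
        "-c", client_id,
        "-s", secret,
        "-o", output_dir,
    ]


def run_extraction(output_dir="./files"):
    """Run the extraction; raises CalledProcessError on a non-zero exit."""
    # Hypothetical environment variable names, chosen for this sketch only.
    cmd = extract_command(
        os.environ["POWERBI_TENANT_ID"],
        os.environ["POWERBI_CLIENT_ID"],
        os.environ["POWERBI_SECRET"],
        output_dir,
    )
    os.makedirs(output_dir, exist_ok=True)
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems if the secret contains special characters.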
Scheduling and Push to Catalog
When moving out of trial, you'll want to refresh your Power BI content in Catalog. Here is how to do it:
The Catalog team will provide you with:
Source Id (an ID for us to match your files with your Catalog instance)
Catalog Token (an API token)
You can then use the castor-upload command:
castor-upload [arguments]
Arguments
-k, --token: Token provided by Catalog
-s, --source_id: Source id provided by Catalog
-t, --file_type: Source type to upload. Currently supported are {DBT|VIZ|WAREHOUSE}
Target files
To specify the target files, provide one of the following:
-f, --file_path: to push a single file
or
-d, --directory_path: to push several files at once (*)
(*) The tool will upload all files included in the given directory.
Make sure it contains only the extracted files before pushing.
Finally, schedule both the script run and the push to Catalog, using your preferred scheduler to create this job.
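As an illustration, the extraction and the push can be chained in one script that your scheduler calls. This is a sketch under assumptions: it uses only the flags documented above, it assumes VIZ is the right file type for Power BI content (confirm with the Catalog team), and the function names are hypothetical.

```python
import subprocess

# File types listed as supported in the castor-upload arguments above.
SUPPORTED_TYPES = {"DBT", "VIZ", "WAREHOUSE"}


def upload_command(token, source_id, directory, file_type="VIZ"):
    """Build the argv list for castor-upload, pushing a whole directory."""
    if file_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported file type: {file_type}")
    return [
        "castor-upload",
        "-k", token,
        "-s", source_id,
        "-t", file_type,
        "-d", directory,
    ]


def extract_and_push(extract_cmd, token, source_id, directory):
    """Run the extraction, then upload; stops at the first failing step."""
    subprocess.run(extract_cmd, check=True)
    subprocess.run(upload_command(token, source_id, directory), check=True)
```

Because the upload sends every file in the directory, pointing each run at a fresh, empty output folder is one way to make sure only the newly extracted files are pushed.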
You're done!