Power BI

Requirements

1. IP Whitelisting

This only applies if you need a VPN to connect to Power BI.

If applicable, whitelist the following Catalog IP address: 35.246.176.138

2. Create a Microsoft Entra app for Catalog in the Azure portal

For this step, you need your organization's Cloud Application Administrator or Application Administrator to perform the configuration. You may not have the access yourself.

Log in to the Azure portal and search for Microsoft Entra ID. There, create a new app with the following parameters, then click Register:

  • Name: Catalog

  • Supported account types: Accounts in this organizational directory only

On the homepage of your newly created application, from the Overview screen, copy the values for the following fields and store them in a secure location for later:

  • Application (client) ID

  • Directory (tenant) ID

From the left menu of your newly created application page, click Certificates & secrets and create a new client secret with the description and expiration date of your choosing.

Then, for the newly created client secret, click the clipboard icon to copy the Value and store it in a secure location for later.

For more details check here
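Before going further, it can be useful to check that the three values you stored (tenant ID, client ID, secret value) work together. The sketch below is one way to do so, assuming the standard Microsoft Entra v2.0 token endpoint and the default Power BI scope; the function names are hypothetical and are not part of the Catalog tooling.

```python
import json
import urllib.parse
import urllib.request

# Assumed defaults: the public Microsoft Entra login host and the Power BI
# ".default" scope. Sovereign clouds use different hosts.
LOGIN_URL = "https://login.microsoftonline.com"
POWERBI_SCOPE = "https://analysis.windows.net/powerbi/api/.default"


def build_token_request(tenant_id: str, client_id: str, secret: str) -> urllib.request.Request:
    """Build (but do not send) an OAuth2 client-credentials token request."""
    url = f"{LOGIN_URL}/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": secret,
            "scope": POWERBI_SCOPE,
        }
    ).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def fetch_access_token(tenant_id: str, client_id: str, secret: str) -> str:
    """Send the request and return the access token; raises on HTTP errors."""
    request = build_token_request(tenant_id, client_id, secret)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["access_token"]
```

Calling `fetch_access_token(...)` with your own values should return a token string if the app registration and secret are valid. An HTTP error at this point usually indicates a wrong tenant ID, client ID, or an expired secret.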

3. Create a Microsoft Entra security group

For this step, you need your organization's Cloud Application Administrator or Application Administrator to perform the configuration. You may not have the access yourself.

In the left menu of the Microsoft Entra ID page, under the Manage section, click Groups. Then create a new group with the following configuration:

  • Set the Group type to Security.

  • Enter "API AD" as the Group name and (optionally) a Group description.

  • Under Members, search for the application registration created above and add it to the list.

For more details check here

4. Enable the Power BI Service admin settings

For this step, you need your organization's Fabric Administrator, formerly known as Power BI Administrator, to perform the configuration. You may not have the access yourself.

Go to the Power BI Admin portal tenant settings (how to get to the Admin portal). For more details, check here

  • In the Developer Settings section, enable:

    • Service principals can use Fabric APIs

  • Add the group API AD to it.

For more details check here

  • In the Admin API Settings section, enable:

    • Service principals can access read-only admin APIs

    • Enhance admin APIs responses with detailed metadata

    • Enhance admin APIs responses with DAX and mashup expressions

  • Add the group API AD to all of them.

In order to get information on lineage from the Power BI API, make sure to:

  • Refresh or republish datasets (especially those that are not scheduled)

  • Republish datasets containing only DirectQuery tables
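Once the tenant settings above are enabled, you may want to confirm that the service principal can actually reach the read-only admin APIs. The sketch below assumes the public Power BI REST API base URL and uses the admin groups endpoint as a simple probe; the function names are hypothetical, and the access token is assumed to have been obtained for the Catalog app beforehand. Tenant setting changes can take some time to propagate, so a failure right after enabling them is not conclusive.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumed default: the public Power BI REST API base URL.
API_BASE = "https://api.powerbi.com/v1.0/myorg"


def build_admin_check_request(access_token: str) -> urllib.request.Request:
    """Build a GET request listing a single workspace through the admin API."""
    query = urllib.parse.urlencode({"$top": 1})
    url = f"{API_BASE}/admin/groups?{query}"
    headers = {"Authorization": f"Bearer {access_token}"}
    return urllib.request.Request(url, headers=headers)


def can_read_admin_api(access_token: str) -> bool:
    """Return True on HTTP 200, False on 401/403; re-raise anything else."""
    request = build_admin_check_request(access_token)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code in (401, 403):
            return False
        raise
```

A `False` result suggests that the API AD group is missing from one of the settings or that the app is not a member of the group.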

Catalog managed

Once we receive your credentials, Catalog will be able to directly pull the data from Power BI.

Please send us the following:

  • Tenant (Directory) ID: Your Power BI instance tenant identifier

  • Client (Application) ID: the ID of the Catalog application for Power BI

  • Secret Value: the value of the secret associated with the Catalog App

Enter your credentials directly in the Catalog App here, in the following format:

{
    "clientId": "****",
    "secret": "****",
    "tenantId": "****"
}
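To avoid typing the secret by hand, the JSON above can be generated from values held in environment variables. This is a minimal sketch; the environment variable names (`POWERBI_CLIENT_ID`, `POWERBI_SECRET`, `POWERBI_TENANT_ID`) are assumptions chosen for illustration, and only the three JSON keys come from the format above.

```python
import json
import os
from typing import Mapping

# Maps each JSON key expected by Catalog to a hypothetical environment variable.
ENV_NAMES = {
    "clientId": "POWERBI_CLIENT_ID",
    "secret": "POWERBI_SECRET",
    "tenantId": "POWERBI_TENANT_ID",
}


def build_credentials(env: Mapping[str, str] = os.environ) -> str:
    """Return the credentials JSON, or raise if any value is missing or empty."""
    values = {key: env.get(name, "") for key, name in ENV_NAMES.items()}
    missing = [ENV_NAMES[key] for key, value in values.items() if not value]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return json.dumps(values, indent=4)
```

Printing the result of `build_credentials()` gives a block you can paste as is; avoid writing it to a shared file or log.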

Your first sync can take up to 48 hours. We will let you know when it is complete ✅

If you are not comfortable giving us access to your credentials, please continue to Client managed 👇

Client managed

Running the Extraction package

Install the PyPI package

pip install castor-extractor[powerbi]

For further details: link

Running the PyPI package

Once the package has been installed, you should be able to run the following command in your terminal:

castor-extract-powerbi [arguments]

The script will run and display logs as follows:

INFO - Starting extraction of PowerBiAsset.ACTIVITY_EVENTS
INFO - Wrote output file: ./files/1708021983-activity_events.json
INFO - Starting extraction of PowerBiAsset.DASHBOARDS
INFO - Wrote output file: ./files/1708021983-dashboards.json
INFO - Starting extraction of PowerBiAsset.DATASETS
INFO - Wrote output file: ./files/1708021983-datasets.json
INFO - Starting extraction of PowerBiAsset.METADATA
INFO - scan bbe1669a-8d4b-4598-a3a1-8763ea2babe7 ready
INFO - Wrote output file: ./files/1708021983-metadata.json
INFO - Starting extraction of PowerBiAsset.REPORTS
INFO - Wrote output file: ./files/1708021983-reports.json
INFO - Wrote output file: /tmp/catalog/1649078755-summary.json
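The log lines above suggest that each output file is named with a numeric prefix followed by the asset name. Assuming that naming scheme holds and that the prefix grows with each run (both inferred from the sample logs, not from a documented contract), the sketch below picks the most recent file per asset from a list of paths. The function name is hypothetical.

```python
import re
from pathlib import Path
from typing import Dict, Iterable, Tuple

# Inferred from the sample logs: "<digits>-<asset_name>.json".
FILENAME_PATTERN = re.compile(r"^(?P<timestamp>\d+)-(?P<asset>[a-z_]+)\.json$")


def latest_files_by_asset(paths: Iterable[str]) -> Dict[str, Path]:
    """Return the file with the highest numeric prefix for each asset name."""
    latest: Dict[str, Tuple[int, Path]] = {}
    for path in map(Path, paths):
        match = FILENAME_PATTERN.match(path.name)
        if match is None:
            continue  # Ignore files that do not follow the inferred pattern.
        timestamp = int(match["timestamp"])
        asset = match["asset"]
        if asset not in latest or timestamp > latest[asset][0]:
            latest[asset] = (timestamp, path)
    return {asset: path for asset, (_, path) in latest.items()}
```

For example, passing the contents of the output folder (`Path("./files").iterdir()`) returns one path per asset, which helps spot an extraction that did not produce every expected file.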

Arguments

  • -t: Tenant ID, your Power BI instance tenant identifier

  • -c: Client (Application) ID, the ID of the Catalog application for Power BI

  • -s: Secret Value, the value of the secret associated with the Catalog App

  • -o, --output: Target folder to store the extracted files

Optional arguments

  • -sc, --scopes: Power BI scopes to be used

  • -l, --login_url: Login URL of your Microsoft Entra server

  • -a, --api_base: Power BI REST API base URL

You can also get help with the --help argument.
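If you prefer to launch the extraction from a script rather than typing the command, the required arguments listed above can be assembled as follows. This is a sketch only: it uses the flags documented in this section, while the function name is hypothetical.

```python
from typing import List


def build_extract_command(
    tenant_id: str, client_id: str, secret: str, output_dir: str
) -> List[str]:
    """Assemble the castor-extract-powerbi call with its required arguments."""
    return [
        "castor-extract-powerbi",
        "-t", tenant_id,   # Tenant ID
        "-c", client_id,   # Client (Application) ID
        "-s", secret,      # Secret Value
        "-o", output_dir,  # Target folder for the extracted files
    ]
```

The resulting list can be passed to `subprocess.run(command, check=True)`. Passing a list instead of a single shell string avoids quoting problems if the secret contains special characters.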

Scheduling and Push to Catalog

When moving out of trial, you'll want to refresh your Power BI content in Catalog. Here is how to do it:

The Catalog team will provide you with:

  1. Source ID (an ID for us to match your files with your Catalog instance)

  2. Catalog Token (an API token)

You can then use the castor-upload command:

castor-upload [arguments]

Arguments

  • -k, --token: Token provided by Catalog

  • -s, --source_id: Source id provided by Catalog

  • -t, --file_type: source type to upload. Currently supported are { DBT | VIZ | WAREHOUSE }

Target files

To specify the target files, provide one of the following:

  • -f, --file_path: to push a single file

or

  • -d, --directory_path: to push several files at once (*)

Finally, schedule both the extraction script and the push to Catalog. Use your preferred scheduler to create this job.
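A scheduled job typically runs the extraction first and the upload second, stopping if either step fails. The sketch below shows one way to wire this, using only the castor-upload flags listed above. The function names are hypothetical, and the choice of VIZ as the file type for Power BI is an assumption; confirm it with the Catalog team.

```python
import subprocess
from typing import Callable, List, Sequence

SUPPORTED_FILE_TYPES = ("DBT", "VIZ", "WAREHOUSE")


def build_upload_command(
    token: str, source_id: str, directory_path: str, file_type: str = "VIZ"
) -> List[str]:
    """Assemble the castor-upload call that pushes a whole directory."""
    if file_type not in SUPPORTED_FILE_TYPES:
        raise ValueError(f"Unsupported file type: {file_type}")
    return [
        "castor-upload",
        "-k", token,           # Token provided by Catalog
        "-s", source_id,       # Source ID provided by Catalog
        "-t", file_type,       # Assumed to be VIZ for Power BI
        "-d", directory_path,  # Push every file in the folder
    ]


def run_steps(
    commands: Sequence[List[str]], runner: Callable = subprocess.run
) -> None:
    """Run each command in order; check=True aborts on the first failure."""
    for command in commands:
        runner(command, check=True)
```

Note that `-s` and `-t` do not mean the same thing in both commands: for castor-extract-powerbi they are the secret and the tenant ID, while for castor-upload they are the source ID and the file type. Calling `run_steps([extract_command, upload_command])` from a cron job or any other scheduler then performs a full refresh.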

You're done!
