Snowflake Set up

It's time to start the technical onboarding with Catalog. It's pretty simple; you need to grant access to your Snowflake metadata to Catalog and to perform a couple of tests.

Which rights do you need to complete the technical setup of the Catalog user?

You will need one of the following top level roles on Snowflake to create the Catalog user following our suggested solution below:

  • 👩‍💼 SECURITYADMIN

  • 👨‍💼 ACCOUNTADMIN

  • 🧑‍💼 SYSADMIN

If your configuration of rights allows you to do so, you can of course use different roles.

What will Catalog do with its access ?

A bit more information on Snowflake: ❄️ Your Snowflake instance contains a SNOWFLAKE database. It is system defined, read-only, and contains only metadata. It is quite similar to PG-catalog in classic PostGres Databases.

Catalog's access: 🦫 Catalog will only have access to the SNOWFLAKE database. It will run queries on the tables of the ACCOUNT_USAGE schema. Catalog will only have access the metadata, and not the data itself. You can check-out the queries Catalog will run here.

Need some more info ?

If you want some further information on the database SNOWFLAKE and privileges and access control, please check out the Snowflake documentation:

  • 🗄 Database SNOWFLAKE documentation is here

  • 🔑 Privileges and access documentation is here

1. Whitelist Catalog's IP on Snowflake

Needed only if you've set up network policies, meaning your Snowflake instance only accepts connections coming from specific IPs. You can find the relevant Snowflake documentation here

If applicable then here are Catalog fixed IPs for the whitelist:

Snowflake's instruction on how to whitelist an IP address can be found here

2. Create Catalog User on Snowflake

You are asked to enter credentials in Catalog's app Interface. Here is a step-by-step guide to create them. You can run the SQL code below.

[Find the SQL procedure for steps 2 and 3 here]

2.1 Create a dedicated role METADATA_VIEWER_ROLE

The role you will create here will inherit the public role from your instance; it will not have any rights for the moment. It will be the role given to the Catalog user.

USE ROLE SECURITYADMIN;

CREATE ROLE METADATA_VIEWER_ROLE;
GRANT ROLE METADATA_VIEWER_ROLE TO ROLE SYSADMIN;

2.2 Grant access to Snowflake Object Tagging

Please skip this set if you are using Snowflake Standard Edition.

The APPLY TAGpermission is required to retrieve your Snowflake Tags. You can find more information in the Snowflake Documentation on Object Tagging.

USE ROLE SECURITYADMIN;

GRANT APPLY TAG ON ACCOUNT TO ROLE METADATA_VIEWER_ROLE;

2.3 Create a dedicated warehouse

This warehouse size must be of size small at least. Bear in mind that we only retrieve your data model and queries a few time a day and that the auto-suspend set up is on. We need that size as the tables within the account_usage schemas are not always optimised by Snowflake.

USE ROLE SYSADMIN;

CREATE WAREHOUSE METADATA_WH
WITH WAREHOUSE_SIZE = SMALL
  AUTO_SUSPEND = 59
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE
  COMMENT = 'metadata reader warehouse';

2.4 Grant usage on the new warehouse to new role

You are adding rights to the role you created in step 2.1. This role is granted usage rights to the new warehouse you created in step 2.2.

GRANT USAGE ON WAREHOUSE METADATA_WH TO ROLE METADATA_VIEWER_ROLE;

2.5 Grant usage on Snowflake shared database to the created role

You are adding further rights to the role you created in step 2.1. The following grants usage of the database SNOWFLAKE .

You must use a role that can grant usage to this database, the role ACCOUNTADMIN always can, do not hesitate to pick another one, based on your role configuration

USE ROLE ACCOUNTADMIN;
-- It is highly likely that you need to use the ACCOUNTADMIN role
-- to grant access to this specific database

GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE METADATA_VIEWER_ROLE;

2.6 Create the Catalog user on Snowflake

Time to create a user, which will be CATALOG.

USE ROLE SECURITYADMIN;
-- You may need to use ACCOUNTADMIN
CREATE OR REPLACE USER CATALOG
    LOGIN_NAME = CATALOG
    DEFAULT_ROLE = METADATA_VIEWER_ROLE
    DEFAULT_WAREHOUSE = METADATA_WH;

2.7 Grant the Catalog user the dedicated role

Finally, grant the Catalog user the role you created in 2.1.

GRANT ROLE METADATA_VIEWER_ROLE TO USER CATALOG;

3. Test the Catalog User

The Catalog User should now be correctly set up. Before you give Catalog the credentials, let's first run a test to make sure everything works well.

You will use the created role and run the following statements.

3.1 Check Warehouse

You must see the warehouse METADATA_WH, being the default one

USE ROLE METADATA_VIEWER_ROLE;

SHOW WAREHOUSES;

3.2 Check Role

You must see the role METADATA_VIEWER_ROLE, being the current and default one

SHOW ROLES;

3.3 Check Databases

This query must return all databases that exists on your Snowflake instance

SELECT
    database_id,
    database_name
FROM SNOWFLAKE.ACCOUNT_USAGE.DATABASES
WHERE NOT is_transient
    AND deleted IS NULL;

4. Creating a Key Pair

For more details, see Snowflake documentation about Key pair Authentication

4.1 Create Private Key

Open the terminal on your computer and generate a private key without a passphrase with the following command:

openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt

4.2 Create Public Key

Next, generate the public key using the private key just created above with the following command:

openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub

The key generated will be in the following format:

-----BEGIN PUBLIC KEY-----
MIIBIj...
-----END PUBLIC KEY-----

4.3 Assign Public Key to the Catalog User

Execute an ALTER USER command to add the public key to the Catalog user. To do this, you need to insert the public key without its delimiters in the command. Here is an example:

ALTER USER CATALOG SET RSA_PUBLIC_KEY='MIIBIjANBgkqh...';

5. Add new credentials in Catalog App

You must be a Catalog admin to do it.

You can now enter the newly created credentials in the Catalog App here.

  • Go to "Settings > Integrations"

  • Click on "Snowflake Add"

  • Add the following credentials

For the account field, enter the Account Locator of your Snowflake. This value can be found in the hostname, AWS Private Link or Azure Private Link endpoint for your instance. Here is an example: example.us-west-1 For the username field, put the LOGIN_NAME of the User created

For your private key, paste it in the Private Key field using the following format:

-----BEGIN PRIVATE KEY-----
<PRIVATE_KEY>
-----END PRIVATE KEY-----

Appendix

Full code for steps 2 and 3

-- Create a dedicated role
USE ROLE SECURITYADMIN;
CREATE ROLE METADATA_VIEWER_ROLE;
GRANT ROLE METADATA_VIEWER_ROLE TO ROLE SYSADMIN;

-- To skip if using Snowflake Standard edition
GRANT APPLY TAG ON ACCOUNT TO ROLE METADATA_VIEWER_ROLE;

-- Create dedicated warehouse
USE ROLE SYSADMIN;
CREATE WAREHOUSE METADATA_WH
WITH WAREHOUSE_SIZE = SMALL
  AUTO_SUSPEND = 59
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE
  COMMENT = 'metadata reader warehouse';

-- Grant usage on the created warehouse to the created role
GRANT USAGE ON WAREHOUSE METADATA_WH TO ROLE METADATA_VIEWER_ROLE;

-- Grant usage on Snowflake shared database to the created role
USE ROLE ACCOUNTADMIN;
-- It is higly likely that you need to use this role 
-- to grant access to this specific database
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE METADATA_VIEWER_ROLE;

-- Create a Catalog user
USE ROLE SECURITYADMIN;
CREATE OR REPLACE USER CATALOG
    LOGIN_NAME = CATALOG
    DEFAULT_ROLE = METADATA_VIEWER_ROLE
    DEFAULT_WAREHOUSE = METADATA_WH;

-- Grant the Catalog user the dedicated role
GRANT ROLE METADATA_VIEWER_ROLE TO USER CATALOG;

-- Test the Catalog user

USE ROLE METADATA_VIEWER_ROLE;

SHOW WAREHOUSES;
-- You must see the warehouse METADATA_WH, being the default one

SHOW ROLES;
-- You must see the role METADATA_VIEWER_ROLE, being the current and default one

SELECT
    database_id,
    database_name
FROM SNOWFLAKE.ACCOUNT_USAGE.DATABASES
WHERE NOT is_transient
    AND deleted IS NULL;
-- This query must return all databases that exists on your Snowflake instance

Then

Queries Ran by the Catalog User

  • Get Databases

    SELECT
        database_id
        ,database_name
        ,database_owner
        ,is_transient
        ,comment
        ,last_altered
        ,created
        ,deleted
    FROM SNOWFLAKE.ACCOUNT_USAGE.DATABASES
  • Get Schemas

    SELECT
        schema_id
        ,schema_name
        ,catalog_id as database_id
        ,catalog_name as database_name
        ,schema_owner
        ,is_transient
        ,comment
        ,last_altered
        ,created
        ,deleted
    FROM SNOWFLAKE.ACCOUNT_USAGE.SCHEMATA
  • Get Tables

    SELECT
        table_id,
        table_name,
        table_schema_id AS schema_id,
        table_schema AS schema_name,
        table_catalog_id AS database_id,
        table_catalog AS database_name,
        table_owner,
        table_type,
        is_transient,
        row_count,
        bytes,
        comment,
        created,
        last_altered,
        deleted
    FROM SNOWFLAKE.ACCOUNT_USAGE.TABLES
    WHERE
    		deleted is null
  • Get Columns

    SELECT
        column_id,
        column_name,
        table_id,
        table_name,
        table_schema_id AS schema_id,
        table_schema AS schema_name,
        table_catalog_id AS database_id,
        table_catalog AS database_name,
        ordinal_position,
        column_default,
        is_nullable,
        data_type,
        maximum_cardinality,
        character_maximum_length,
        character_octet_length,
        numeric_precision,
        numeric_precision_radix,
        numeric_scale,
        datetime_precision,
        interval_type,
        interval_precision,
        comment,
        deleted
    FROM SNOWFLAKE.ACCOUNT_USAGE.COLUMNS
  • Get Users

    SELECT
        name,
        created_on,
        deleted_on,
        login_name,
        display_name,
        first_name,
        last_name,
        email,
        has_password,
        disabled,
        snowflake_lock,
        default_warehouse,
        default_namespace,
        default_role,
        bypass_mfa_until,
        last_success_login,
        expires_at,
        locked_until_time,
        password_last_set_time,
        comment
    FROM SNOWFLAKE.ACCOUNT_USAGE.USERS
  • Get Roles

    SELECT
        created_on,
        deleted_on,
        name,
        comment
    FROM SNOWFLAKE.ACCOUNT_USAGE.ROLES
  • Get Grant to roles

    SELECT
        created_on,
        modified_on,
        privilege,
        granted_on,
        name,
        table_catalog AS database,
        table_schema AS schema,
        granted_to,
        grantee_name,
        grant_option,
        granted_by,
        deleted_on
    FROM SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_ROLES
  • Get Grant to users

    SELECT
        created_on,
        deleted_on,
        role,
        granted_to,
        grantee_name,
        granted_by
    FROM SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_USERS
  • Get Queries

    SELECT
        query_id,
        -- dropping INSERT values
        IFF(
            query_type = 'INSERT',
            REGEXP_REPLACE(query_text, 'VALUES (.*)', 'DEFAULT VALUES'),
            query_text
        ) AS query_text,
        database_id,
        database_name,
        schema_id,
        schema_name,
        query_type,
        session_id,
        user_name,
        role_name,
        warehouse_id,
        warehouse_name,
        execution_status,
        error_code,
        error_message,
        CONVERT_TIMEZONE('UTC', start_time) as start_time,
        CONVERT_TIMEZONE('UTC', end_time) as end_time,
        total_elapsed_time,
        bytes_scanned,
        percentage_scanned_from_cache
        bytes_written,
        bytes_written_to_result,
        bytes_read_from_result,
        rows_produced,
        rows_inserted,
        rows_updated,
        rows_deleted,
        rows_unloaded,
        bytes_deleted,
        partitions_scanned,
        partitions_total,
        compilation_time,
        execution_time,
        queued_provisioning_time,
        queued_repair_time,
        queued_overload_time,
        transaction_blocked_time,
        release_version,
        is_client_generated_statement
    FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
    WHERE TRUE
        AND DATE(CONVERT_TIMEZONE('UTC', start_time)) = DATEADD(DAY,-1,CURRENT_DATE)
        AND execution_status = 'SUCCESS'
        AND query_type NOT IN ('SHOW', 'USE', 'ROLLBACK', 'DESCRIBE', 'ALTER_SESSION')

References

Here's a list of all Snowflake documentation referenced above

  • SNOWFLAKE : click here

  • Privileges and access documentation : click here

  • Network policies: click here

  • How to whitelist an IP address : click here

Last updated

Was this helpful?