Snowflake
Prerequisites
Follow installation instructions here
We strongly advise to create a dedicated user to extract your metadata.
You can follow those instructions to create the catalog
user.
Run extraction script
Once the package has been installed, you should be able to run the following command in your terminal:
castor-extract-snowflake [arguments]
The script will run and display logs as following:
INFO - Extracting `DATABASE` ...
INFO - Results stored to /tmp/catalog/1649083626-database.csv
...
INFO - Extracting `USER` ...
INFO - Results stored to /tmp/catalog/1649083626-user.csv
INFO - Wrote output file: /tmp/catalog/1649083626-summary.json
Credentials
-a
,--account
: Snowflake account identifier-u
,--user
: Snowflake user-p
,--password
: Snowflake password
If you use Keep pair Authentication (documentation), with no passphrase, replace password with:
-pk
,--private_key
: Snowflake private key
The key should be formatted as follows:
The key should be formatted as follows:
-----BEGIN PRIVATE KEY-----\\xxxxxPRIVATE_KEYxxxxx\\n-----END PRIVATE KEY-----
Other arguments
-o
,--output
: target folder to store the extracted files
Optional arguments
--skip-existing
: Skip files already extracted instead of replacing them--fetch-transient
: Will fetch transients tables if added--warehouse
: Use a specificWAREHOUSE
to run extraction queries--role
: Use a specificROLE
to run extraction queries--db-allowed
: Database(s) you want to extract--db-blocked
: Database(s) you don't want to extract--insecure-mode
: Turns off OCSP checking for Snowflake Client. Use when facing OCSP issues
Use ENV variables
If you don't want to specify arguments every time, you can set the following ENV in your .bashrc
:
export CASTOR_SNOWFLAKE_ACCOUNT=aCC0un7.eu-central-1
export CASTOR_SNOWFLAKE_USER=us3R
export CASTOR_SNOWFLAKE_PASSWORD=******
export CASTOR_SNOWFLAKE_INSECURE_MODE=True
export CASTOR_OUTPUT_DIRECTORY="tmp/catalog_output"
Then the script can be executed without any arguments:
castor-extract-snowflake
It can also be executed with partial arguments (the script looks in your ENV
as a fallback):
castor-extract-snowflake --output /tmp/catalog
Database filtering
Database filters are optional. If you don't specify any filter, all available databases will be extracted (depending on the provided credentials)
# extract all but <...>
castor-extract-snowflake --db-blocked ZZ_DEPRECATED_DB
# extract only <...>
castor-extract-snowflake --db-allowed DB_1 DB_2 DB_2
# mixed (not really useful, but still doable)
castor-extract-snowflake --db-allowed DB_1 --db-blocked ZZ_DEPRECATED_DB
ROLE and WAREHOUSE
Here is how the extractor connects to SNOWFLAKE:
Use the provided
--role
&--warehouse
if anyOtherwise, use the default
ROLE
&WAREHOUSE
If no defaults are set, use an arbitrary available
ROLE
&WAREHOUSE
The --role
and --warehouse
arguments are useful when you need to enforce specific values.
If you extract your data with user catalog
(instructions here) you don't need to specify those values:
METADATA_WH
is the only available warehouse, it will be used by defaultMETADATA_VIEWER_ROLE
is the only available role, it will be used by default
Last updated
Was this helpful?