Redshift

Prerequisites

We strongly advise creating a dedicated user to extract your metadata.

You can follow these instructions to create the catalog user.

Run extraction script

Once the package has been installed, you should be able to run the following command in your terminal:

castor-extract-redshift [arguments]

The script will run and display logs as follows:

INFO - Extracting `DATABASE` ...
INFO - Results stored to /tmp/catalog/1649083626-database.csv


...

INFO - Extracting `USER` ...
INFO - Results stored to /tmp/catalog/1649083626-user.csv
INFO - Wrote output file: /tmp/catalog/1649083626-summary.json
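
The file names in the sample log share a numeric prefix that looks like a Unix timestamp; that reading is an assumption based on the log above, not something the tool documents here. Under that assumption, a minimal sketch for locating the most recent summary file in the output folder could look like this (latest_summary is a hypothetical helper, not part of the package):

```shell
# Hypothetical helper: print the newest *-summary.json found in a folder.
# Assumes the numeric prefixes have equal width, so name order matches age.
latest_summary() {
  dir="${1:-/tmp/catalog}"
  latest=""
  for f in "$dir"/*-summary.json; do
    # Skip the unexpanded pattern when nothing matches.
    [ -e "$f" ] || continue
    # Glob results are sorted by name, so the last match is the newest.
    latest="$f"
  done
  if [ -n "$latest" ]; then
    printf '%s\n' "$latest"
  fi
}

latest_summary /tmp/catalog
```

The function prints nothing when the folder is missing or holds no summary file, so it can be used safely before the first extraction.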

Credentials

  • -H, --host: hostname

  • -P, --port: port number

  • -d, --database: database name

  • -u, --user: user

  • -p, --password: password

Other arguments

  • -o, --output: target folder to store the extracted files

Optional arguments

  • --skip-existing: Skip files already extracted instead of replacing them

  • --serverless: Enables extraction for Redshift Serverless

You can also display help with the --help argument.
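
As an illustration, the flags listed above can be combined into a single call. This is a sketch only: the host, database, and user are placeholder values, and REDSHIFT_PASSWORD is a hypothetical variable used here so the password does not have to be typed on the command line.

```shell
# Placeholder values (assumptions): replace with your own connection details.
# REDSHIFT_PASSWORD is a hypothetical variable, not one read by the package.
set -- \
  --host "redshift.example.com" \
  --port 5439 \
  --database "db_name" \
  --user "extraction_user" \
  --password "${REDSHIFT_PASSWORD:-}" \
  --output "/tmp/catalog" \
  --skip-existing

# Run the extractor only if it is installed; otherwise report what is missing.
if command -v castor-extract-redshift >/dev/null 2>&1; then
  castor-extract-redshift "$@"
else
  echo "castor-extract-redshift not found on PATH" >&2
fi
```

For a Redshift Serverless target, --serverless would be added to the same list of arguments.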

Use ENV variables

If you don't want to specify the arguments every time, you can set the following environment variables in your .bashrc:

export CASTOR_REDSHIFT_HOST=127.0.0.0
export CASTOR_REDSHIFT_PORT=5439
export CASTOR_REDSHIFT_DATABASE=db_name
export CASTOR_REDSHIFT_USER=extraction_user
export CASTOR_REDSHIFT_PASSWORD=******

# Optional: enable Redshift Serverless
export CASTOR_REDSHIFT_SERVERLESS=TRUE

export CASTOR_OUTPUT_DIRECTORY="/tmp/catalog"

Then the script can be executed without any arguments:

castor-extract-redshift

It can also be executed with partial arguments (the script looks in your ENV as a fallback):

castor-extract-redshift --output /tmp/catalog
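
Because the script falls back to the environment, a variable that was never exported only surfaces as an error at run time. The following pre-flight check is a sketch, not part of the package: check_castor_env is a hypothetical helper, and treating all six variables as required is an assumption based on the list above.

```shell
# Hypothetical helper: report which of the variables listed above are unset.
check_castor_env() {
  missing=""
  for name in CASTOR_REDSHIFT_HOST CASTOR_REDSHIFT_PORT CASTOR_REDSHIFT_DATABASE \
              CASTOR_REDSHIFT_USER CASTOR_REDSHIFT_PASSWORD CASTOR_OUTPUT_DIRECTORY; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Unset variables:$missing" >&2
    return 1
  fi
  return 0
}
```

It can be chained in front of the extraction, for example: check_castor_env && castor-extract-redshift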
