Redshift
Prerequisites
Follow installation instructions here
We strongly advise creating a dedicated user to extract your metadata.
You can follow those instructions to create the catalog user.
This client connects to Redshift using sslmode=verify-ca, which means the server certificate is checked against a trusted certificate authority, so your certificates must be up-to-date. More information here
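To check certificates independently of the extraction script, one option is to open a connection with the same SSL mode from another PostgreSQL client. The sketch below only builds a libpq connection string. It assumes psql is installed and that you have a CA bundle for your cluster; the function name, host, and file path are placeholders, and nothing here reflects how the Castor client passes its certificates.

```shell
# Build a libpq connection string that uses the SSL mode named above.
# Arguments: host, port, database, user, path to a CA bundle (all placeholders).
# Assumption: values contain no spaces; quoting for such values is not handled.
build_conninfo() {
  printf 'host=%s port=%s dbname=%s user=%s sslmode=verify-ca sslrootcert=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Example (not run here): try a trivial query with the dedicated user.
# psql "$(build_conninfo my-cluster.example.com 5439 db_name extraction_user "$HOME/redshift-ca-bundle.crt")" -c 'SELECT 1'
```

If this connection fails with a certificate error, the extraction is likely to fail for the same reason.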
Run extraction script
Once the package has been installed, you should be able to run the following command in your terminal:
castor-extract-redshift [arguments]
The script will run and display logs as follows:
INFO - Extracting `DATABASE` ...
INFO - Results stored to /tmp/catalog/1649083626-database.csv
...
INFO - Extracting `USER` ...
INFO - Results stored to /tmp/catalog/1649083626-user.csv
INFO - Wrote output file: /tmp/catalog/1649083626-summary.json
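Judging from the logs above, the files of one run share a numeric prefix (1649083626 in this example). A minimal sketch for listing the files of the most recent run follows. It assumes that naming pattern holds for every run and that a larger prefix means a newer run; the function name latest_run_files is hypothetical and not part of the package.

```shell
# Print the file names of the most recent extraction found in a folder.
# Assumption: each run writes <prefix>-summary.json plus <prefix>-*.csv files,
# and the numeric prefix grows over time, as the sample logs suggest.
latest_run_files() {
  local dir="$1" file name prefix latest=""
  for file in "$dir"/*-summary.json; do
    [ -e "$file" ] || continue          # the glob matched nothing
    name="${file##*/}"                  # strip the folder part
    prefix="${name%%-*}"                # keep the text before the first dash
    case "$prefix" in ''|*[!0-9]*) continue ;; esac   # skip non-numeric prefixes
    if [ -z "$latest" ] || [ "$prefix" -gt "$latest" ]; then
      latest="$prefix"
    fi
  done
  [ -n "$latest" ] || return 1          # no summary file found
  for file in "$dir/$latest"-*; do
    printf '%s\n' "${file##*/}"
  done
}

# Example: latest_run_files /tmp/catalog
```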
Credentials
-H, --host: hostname
-P, --port: port number
-d, --database: database name
-u, --user: user
-p, --password: password
Other arguments
-o, --output: target folder to store the extracted files
Optional arguments
--skip-existing: Skip files already extracted instead of replacing them
--serverless: Enables extraction for Redshift Serverless
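Put together, the flags listed above could be combined as in the sketch below. The wrapper function run_extraction and the DRY_RUN variable are hypothetical helpers for this example only, and every value is a placeholder. The password is deliberately not passed on the command line; the sketch assumes it is provided through CASTOR_REDSHIFT_PASSWORD, as described under "Use ENV variables".

```shell
# Run the extraction with explicit connection flags.
# Arguments: host, port, database, user, output folder (all placeholders).
# DRY_RUN=1 prints the command instead of running it; it belongs to this
# sketch, not to the Castor CLI.
run_extraction() {
  # Replace the positional parameters with the flags documented above.
  set -- --host "$1" --port "$2" --database "$3" --user "$4" \
         --output "$5" --skip-existing
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo castor-extract-redshift "$@"
  else
    castor-extract-redshift "$@"
  fi
}

# Example: DRY_RUN=1 run_extraction my-cluster.example.com 5439 db_name extraction_user /tmp/catalog
```

For a Redshift Serverless workgroup, --serverless would be added to the same list of flags.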
Use ENV variables
If you don't want to specify arguments every time, you can set the following environment variables in your .bashrc:
export CASTOR_REDSHIFT_HOST=127.0.0.0
export CASTOR_REDSHIFT_PORT=5439
export CASTOR_REDSHIFT_DATABASE=db_name
export CASTOR_REDSHIFT_USER=extraction_user
export CASTOR_REDSHIFT_PASSWORD=******
# Optional: enable Redshift Serverless
export CASTOR_REDSHIFT_SERVERLESS=TRUE
export CASTOR_OUTPUT_DIRECTORY="/tmp/catalog"
Then the script can be executed without any arguments:
castor-extract-redshift
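Before relying on the environment alone, it can help to confirm that the variables are actually exported, since a variable assigned without export is not passed on to the script. The sketch below uses printenv, which only sees exported variables. The function name missing_castor_vars is hypothetical, and treating these six variables as required is an assumption; the source does not say which ones are mandatory.

```shell
# Print the name of every expected variable that is not exported.
# Assumption: the connection variables and the output directory are all
# needed when the script runs without arguments.
missing_castor_vars() {
  local name
  for name in CASTOR_REDSHIFT_HOST CASTOR_REDSHIFT_PORT \
              CASTOR_REDSHIFT_DATABASE CASTOR_REDSHIFT_USER \
              CASTOR_REDSHIFT_PASSWORD CASTOR_OUTPUT_DIRECTORY; do
    # printenv exits non-zero when the variable is absent from the environment
    printenv "$name" >/dev/null || echo "$name"
  done
}

# Example: an empty result means every variable is exported.
# missing_castor_vars
```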
It can also be executed with partial arguments (the script falls back to your environment variables for anything not provided):
castor-extract-redshift --output /tmp/catalog
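The fallback rule described above (an argument wins, otherwise the environment is used) can be illustrated with a short sketch. This is only a model of the behavior as documented, not the script's actual implementation, and resolve_setting is a hypothetical name.

```shell
# Resolve one setting: prefer the command-line value, else the environment.
# $1: value given on the command line (may be empty)
# $2: name of the fallback environment variable
resolve_setting() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  else
    # Fails with a non-zero status when the variable is not exported
    printenv "$2"
  fi
}

# Example: with CASTOR_OUTPUT_DIRECTORY exported as /tmp/catalog,
# resolve_setting "" CASTOR_OUTPUT_DIRECTORY prints the exported value,
# while resolve_setting /data/out CASTOR_OUTPUT_DIRECTORY prints /data/out.
```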