⬆️Uploader

This package lets you push your data to the Catalog infrastructure. It uploads your files to a dedicated bucket in Google Cloud Storage (GCS).

Prerequisites

The uploader is part of our castor-extractor PyPI package.

Follow the installation instructions for that package before using the uploader.

Usage

You will need:

  • Your source id, provided by Catalog and referred to as source_id in the code examples

  • Your Catalog token, provided by Catalog

We recommend using the castor-upload command:

castor-upload [arguments]

Arguments

  • -k, --token: Token provided by Catalog

  • -s, --source_id: Source id provided by Catalog

  • -t, --file_type: Type of source to upload. Must be one of DBT, VIZ, WAREHOUSE, or QUALITY

  • -z, --zone: Catalog zone to upload to. Supported values are US and EU; defaults to EU

For instances on app.us.castordoc.com, use the US zone. For instances on app.castordoc.com, use the EU zone.
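
The zone rule above can be written as a small lookup. This is only a sketch: zone_for_host is a hypothetical helper name, not part of castor-extractor, and it covers just the two hosts named above.

```shell
# Hypothetical helper: map the hostname of your Catalog instance
# to the value expected by --zone.
zone_for_host() {
  case "$1" in
    app.us.castordoc.com) echo "US" ;;
    app.castordoc.com)    echo "EU" ;;
    *) echo "unknown Catalog host: $1" >&2; return 1 ;;
  esac
}

ZONE="$(zone_for_host app.us.castordoc.com)"
echo "$ZONE"   # prints US
```

The resulting value can then be passed as --zone "$ZONE".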

Target files

To specify the target files, provide one of the following:

  • -f, --file_path: to push a single file

or

  • -d, --directory_path: to push several files at once (*)
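
Putting the arguments together, a complete call might look like the sketch below. The wrapper function names, the variable names, and the placeholder credentials are assumptions made for illustration; only the castor-upload flags come from the list above. The WAREHOUSE type and EU zone are example choices.

```shell
# Placeholder credentials: replace with the values provided by Catalog.
CATALOG_TOKEN="replace-with-your-token"
CATALOG_SOURCE_ID="00000000-0000-0000-0000-000000000000"

# Hypothetical wrapper: upload every file in the directory given as $1.
upload_directory() {
  castor-upload \
    --token "$CATALOG_TOKEN" \
    --source_id "$CATALOG_SOURCE_ID" \
    --file_type WAREHOUSE \
    --zone EU \
    --directory_path "$1"
}

# Hypothetical wrapper: upload the single file given as $1.
upload_file() {
  castor-upload \
    --token "$CATALOG_TOKEN" \
    --source_id "$CATALOG_SOURCE_ID" \
    --file_type WAREHOUSE \
    --zone EU \
    --file_path "$1"
}
```

Calling upload_directory with a directory, or upload_file with a file, then runs the corresponding upload. Defining the functions does not upload anything by itself.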

Use ENV variables

If you don't want to pass the arguments every time, you can set the following environment variables in your .bashrc:

export CASTOR_UPLOADER_FILE_TYPE=WAREHOUSE
export CASTOR_UPLOADER_SOURCE_ID=********-****-****-****-************
export CASTOR_UPLOADER_TOKEN=************************************************
export CASTOR_UPLOADER_DIRECTORY_PATH=./

Then the script can be executed without any arguments:

castor-upload
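
Before running the command without arguments, it can help to confirm that the variables are actually set in the current shell. The check below is a sketch: missing_castor_vars is a hypothetical helper, and it only looks at the four variables listed above.

```shell
# Hypothetical pre-flight check: print the name of every listed
# variable that is unset or empty in the current shell.
missing_castor_vars() {
  for name in CASTOR_UPLOADER_FILE_TYPE CASTOR_UPLOADER_SOURCE_ID \
              CASTOR_UPLOADER_TOKEN CASTOR_UPLOADER_DIRECTORY_PATH; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

missing="$(missing_castor_vars)"
if [ -z "$missing" ]; then
  echo "All uploader variables are set."
else
  echo "Missing uploader variables:" $missing >&2
fi
```

If the check reports missing names after you have edited .bashrc, open a new shell or run source ~/.bashrc so the changes take effect.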

Troubleshooting

If you have trouble uploading your files, you can increase the timeout or configure retries by setting these environment variables:

  • CASTOR_TIMEOUT_OVERRIDE: number of seconds before timeout (default = 60)

  • CASTOR_RETRY_OVERRIDE: number of retries (default = 1)
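
For example, the two overrides could be set in the current shell before the upload. The values 300 and 3 are illustrative assumptions, not recommended settings; pick values that match your file sizes and network.

```shell
# Allow up to 300 seconds before timing out (default is 60).
export CASTOR_TIMEOUT_OVERRIDE=300
# Configure 3 retries (default is 1).
export CASTOR_RETRY_OVERRIDE=3

# Then run castor-upload as usual from this same shell.
```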
