⬆️Uploader

This package lets you push your data to the Catalog infrastructure. It uploads your files to a dedicated bucket in Google Cloud Storage (GCS).

Prerequisites

The uploader is part of our castor-extractor PyPI package. To install it, follow the installation instructions for that package.

Usage

You will need:

  • Your source id, provided by Catalog and referred to as source_id in the code examples

  • Your Catalog token, provided by Catalog

We recommend using the castor-upload command:

castor-upload [arguments]

Arguments

  • -k, --token: Token provided by Catalog

  • -s, --source_id: Source id provided by Catalog

  • -t, --file_type: type of source to upload. Currently supported values are { DBT | VIZ | WAREHOUSE | QUALITY }

Target files

To specify the target files, provide one of the following:

  • -f, --file_path: to push a single file

or

  • -d, --directory_path: to push several files at once (*)
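
As a rough illustration of how the arguments above combine, here is a minimal sketch. The `run_upload` helper, the placeholder credentials, and the file paths are hypothetical and not part of the package; only the `castor-upload` flags come from this page. The helper prints the command by default so it can be reviewed first, and runs it only when `DRY_RUN=0`.

```shell
# Hypothetical placeholders: replace with the values provided by Catalog.
SOURCE_ID="00000000-0000-0000-0000-000000000000"
TOKEN="replace-with-your-token"

# run_upload (illustrative helper): prints the command when DRY_RUN is
# unset or 1, and executes castor-upload when DRY_RUN=0.
run_upload() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "castor-upload $*"
  else
    castor-upload "$@"
  fi
}

# Push a single file (illustrative path).
run_upload --token "$TOKEN" --source_id "$SOURCE_ID" \
  --file_type DBT --file_path ./my_file.json

# Push every file in a directory (illustrative path).
run_upload --token "$TOKEN" --source_id "$SOURCE_ID" \
  --file_type WAREHOUSE --directory_path ./extracts
```

Provide either a file path or a directory path in a given call, as described above.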

Use ENV variables

If you don't want to specify arguments every time, you can set the following environment variables in your .bashrc:

export CASTOR_UPLOADER_FILE_TYPE=WAREHOUSE
export CASTOR_UPLOADER_SOURCE_ID=********-****-****-****-************
export CASTOR_UPLOADER_TOKEN=************************************************
export CASTOR_UPLOADER_DIRECTORY_PATH=./

Then the script can be executed without any arguments:

castor-upload
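
Before running the command without arguments, it can help to confirm that the variables are actually set. The sketch below is one possible check, not part of the package: `missing_uploader_vars` is a hypothetical helper, and it only looks at the four variable names listed above as they appear in the current shell.

```shell
# missing_uploader_vars (illustrative helper): prints the name of each
# uploader variable that is unset or empty in the current shell.
missing_uploader_vars() {
  for name in CASTOR_UPLOADER_FILE_TYPE CASTOR_UPLOADER_SOURCE_ID \
              CASTOR_UPLOADER_TOKEN CASTOR_UPLOADER_DIRECTORY_PATH; do
    # eval reads the variable whose name is stored in $name (POSIX sh).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}

missing="$(missing_uploader_vars)"
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing environment variables:" $missing
else
  echo "All uploader variables are set."
fi
```

If a variable is reported as missing after editing .bashrc, opening a new shell or re-sourcing the file usually resolves it.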

Troubleshooting

If you have trouble uploading your files, you can increase the timeout or configure retries by setting these environment variables:

  • CASTOR_TIMEOUT_OVERRIDE: number of seconds before timeout (default = 60)

  • CASTOR_RETRY_OVERRIDE: number of retries (default = 1)
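
For example, the two overrides could be set in the shell before the upload. The values below (300 seconds, 3 retries) are illustrative assumptions, not recommendations from Catalog; adjust them to your file sizes and network.

```shell
# Illustrative values only: a 300-second timeout and 3 retries.
export CASTOR_TIMEOUT_OVERRIDE=300
export CASTOR_RETRY_OVERRIDE=3

# Confirm the values that the next castor-upload run will see.
echo "timeout: ${CASTOR_TIMEOUT_OVERRIDE}s, retries: ${CASTOR_RETRY_OVERRIDE}"

# Then run the upload as usual:
# castor-upload [arguments]
```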
