A proof of concept for deploying a data pipeline with GitHub Actions.
- Source: Fingrid Data
- Extract: dlt
- Transform & Load: dbt
- Warehouse: MotherDuck
- Visualize: Streamlit
Prerequisites:
- Fingrid Data API key
- MotherDuck access token
- MotherDuck database
The pipeline is deployed with GitHub Actions. To deploy your own copy, fork this repo.
The data is extracted from Fingrid using dlt; dlt provides step-by-step setup instructions. Add the following secret values (typically stored in ./.dlt/secrets.toml) to the Actions secrets of your fork (for this repo, https://github.com/jonbiemond/actions-data-pipeline/settings/secrets/actions):
- DB_NAME
- API_KEY
- MOTHERDUCK_TOKEN
To run and develop the code locally, install dependencies with uv.
uv sync
All secrets and other required settings are read from environment variables. Set the following in your shell, or manage them with a tool like mise-en-place.
API_KEY=<your_api_key>
DB_NAME=<your_db_name>
MOTHERDUCK_TOKEN=<your_motherduck_token>
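Before running the pipeline locally, it can help to confirm that all three variables are actually set. The following is a minimal sketch, not part of this repo: the helper name `missing_vars` is hypothetical, and it only checks that each variable is present and non-empty, not that the values are valid.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = ("API_KEY", "DB_NAME", "MOTHERDUCK_TOKEN")


def missing_vars(environ=os.environ):
    """Return the required variable names that are unset or empty.

    `environ` defaults to the process environment; any mapping can be
    passed instead, which makes the check easy to test.
    """
    return [name for name in REQUIRED_VARS if not environ.get(name)]
```

Calling `missing_vars()` with no arguments checks the current process environment; an empty list means every required variable has a value.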
dlt reads the secrets from .dlt/secrets.toml:
# .dlt/secrets.toml
[sources]
api_key = "env(API_KEY)"
[destination.motherduck.credentials]
database = "env(DB_NAME)"
password = "env(MOTHERDUCK_TOKEN)"
dbt reads the secrets directly from environment variables.