This project has been deprecated and is no longer in use or being actively maintained.
dim is a simple application for defining rules that the data in your tables should follow, such as the number of rows in a specific date partition. It also executes those checks and persists their metadata in a BigQuery table.
BigQuery is the only supported data source.
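To make the idea of a rule concrete, here is a minimal Python sketch of a row-count check for one date partition. It is not dim's actual implementation: the class `RowCountRule`, its fields, the `@partition_date` query parameter, and the example table and column names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowCountRule:
    """Hypothetical rule: a date partition must hold at least `min_rows` rows."""

    table: str             # fully qualified table name (placeholder value below)
    partition_column: str  # assumed name of the date partition column
    min_rows: int

    def query(self) -> str:
        # The date is left as a named query parameter rather than inlined.
        return (
            f"SELECT COUNT(*) AS row_count FROM `{self.table}` "
            f"WHERE DATE({self.partition_column}) = @partition_date"
        )

    def passes(self, row_count: int) -> bool:
        # The check succeeds when the partition has enough rows.
        return row_count >= self.min_rows


rule = RowCountRule("my-project.my_dataset.my_table", "submission_date", 1000)
print(rule.query())
print(rule.passes(1500))
```

Keeping the query text and the success condition on the same object mirrors the description above: a rule says both what to measure and what counts as acceptable.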
- python >=3.10, <3.11
- docker
- BigQuery
.
├── Dockerfile
├── Makefile
├── README.md
├── pyproject.toml
├── dim
│   ├── app.py
│   ├── bigquery_client.py
│   ├── cli.py
│   ├── const.py
│   ├── error.py
│   ├── models
│   │   ├── dim_config.py
│   │   └── dim_check_type - contains dim test type object definitions
│   │       └── templates - contains SQL templates used by corresponding dim tests
│   ├── slack.py
│   └── utils.py
├── dim_checks - contains dim configs that specify tests and their conditions for tables
│   ├── [project]
│   │   └── [dataset]
│   │       └── [table]
│   │           └── dim_checks.yaml
│   └── [project]
│       └── [dataset]
│           └── [table]
│               └── dim_checks.yaml
├── docs
│   └── static - contains static files used by docs in the repo
├── requirements - contains requirement files
└── tests
    ├── cli - cli module tests
    │   └── test_configs - contains .yaml configs used for testing
    └── dim_checks - contains tests for dim test types
- Inside `dim_checks/`, create a folder structure corresponding to `[project_id]/[dataset]/[table]` with a `dim_checks.yaml` file inside.
- Add the dim check definitions you'd like to run against your table (see TODO: section_name for more info).
- Build the docker app image using `make build`.
- Run the checks against your table using `docker run dim:latest dim run --project_id=[project_id] --dataset=[dataset] --table=[table] --date=[date_partition]`.
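The folder structure from the first step can also be created programmatically. The sketch below is only an illustration: the helper `scaffold_config` is hypothetical and not part of dim, and the project, dataset, and table names are placeholders.

```python
import tempfile
from pathlib import Path


def scaffold_config(root: Path, project_id: str, dataset: str, table: str) -> Path:
    """Create dim_checks/[project_id]/[dataset]/[table]/dim_checks.yaml under root."""
    table_dir = root / "dim_checks" / project_id / dataset / table
    table_dir.mkdir(parents=True, exist_ok=True)
    config = table_dir / "dim_checks.yaml"
    config.touch(exist_ok=True)  # creates an empty file; existing contents are kept
    return config


# Demo in a throwaway directory so nothing is written to the repository.
with tempfile.TemporaryDirectory() as tmp:
    path = scaffold_config(Path(tmp), "my-project", "my_dataset", "my_table")
    print(path.relative_to(tmp))
```

The created file is empty; the check definitions themselves still have to be added by hand.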
- Inside `dim_checks/`, create a folder structure corresponding to `[project_id]/[dataset]/[table]` with a `dim_checks.yaml` file inside.
- Run `make setup-venv` (creates a virtual environment called `venv` in the project directory).
- Run `make upgrade-pip` (upgrades pip inside the `venv` virtual environment).
- Run `make update-deps` (compiles requirements for the application and development).
- Run `make update-local-env` (installs requirements for the application and development).
- Run `venv/bin/python -m pip install .` (installs the dim package in your local virtual environment).
Additional info
Your container should have the following two environment variables set: `GOOGLE_APPLICATION_CREDENTIALS` and `SLACK_BOT_TOKEN`. It's up to you how you want to do this. One option is to set them in your docker command as follows:
`docker run -v [local_path_to_gcp_creds_file]:/sa.json -e GOOGLE_APPLICATION_CREDENTIALS=/sa.json -e SLACK_BOT_TOKEN="[SLACK_TOKEN]" dim:latest dim run --project_id=[project_id] --dataset=[dataset] --table=[table] --date=[date_partition]`
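If you launch the container from a script, the same command can be assembled as an argument list. This is a rough sketch, not part of dim: the helper `docker_run_args` is hypothetical, and the credentials path, token, and identifiers are placeholders.

```python
import shlex


def docker_run_args(creds_file: str, slack_token: str, dim_args: list[str]) -> list[str]:
    """Build the docker invocation with both environment variables set."""
    return [
        "docker", "run",
        "-v", f"{creds_file}:/sa.json",  # mount the credentials file
        "-e", "GOOGLE_APPLICATION_CREDENTIALS=/sa.json",
        "-e", f"SLACK_BOT_TOKEN={slack_token}",
        "dim:latest", "dim", *dim_args,
    ]


args = docker_run_args(
    "/path/to/creds.json",
    "example-token",
    ["run", "--project_id=my-project", "--dataset=my_dataset",
     "--table=my_table", "--date=2023-01-01"],
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(args))
```

Passing the list to `subprocess.run(args, check=True)` avoids shell quoting issues entirely; printing it is only for inspection.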
This is a very early stage application with very limited functionality. Currently, the following commands are available:
- `run` - runs all tests defined in the `dim` yaml for the specified table for a specific date partition.
- `backfill` - runs all tests defined in the `dim` yaml for the specified table for a specific range of date partitions (each processed individually).
- `validate` - validates a specific `dim` yaml config.
- `mute` - adds a record to the `muted_alerts` table containing information about which alerts for specific tables and date partitions should not be sent out.
- `unmute` - removes a record from the `muted_alerts` table for the specified table and date partitions.
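As an illustration of how a backfill might process each date partition individually, the sketch below expands a date range into single days. The function `partitions` is hypothetical, and treating the end date as inclusive is an assumption, not something the commands above specify.

```python
from datetime import date, timedelta
from typing import Iterator


def partitions(start: date, end: date) -> Iterator[date]:
    """Yield every date from start to end (end inclusive, by assumption)."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


for partition in partitions(date(2023, 1, 30), date(2023, 2, 1)):
    # A real backfill would run all checks for this single partition here.
    print(partition.isoformat())
```

Handling one partition per iteration means a failure on one date does not have to stop the remaining dates from being checked.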
- Build the docker test image using `make build-test`.
- Run `make test-all`. This runs the following commands: `make test-unit && make test-flake8 && make test-isort && make test-black` (these can also be run individually).
- Run the docker image with the `run` command.
- Check that the corresponding path exists (`dim_checks/[gcp_project]/[dataset]/[table]`).
- Validate matching configs at read time (this should be fine since they're very small).
- Create the corresponding test objects, including test execution and success parameters.
- Execute all tests and persist the results in BQ (a Looker dashboard can be built on top of this dim execution metadata).
- If alerting is enabled, send out an alert if any of the tests failed.
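The flow above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not dim's code: `Check`, `run_checks`, and the `persist` and `alert` callbacks are hypothetical stand-ins for the BigQuery and Slack integrations, and config validation is omitted.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class Check:
    name: str
    execute: Callable[[], bool]  # returns True when the check passes


def run_checks(
    config_dir: Path,
    checks: list[Check],
    alerting: bool,
    persist: Callable[[str, bool], None],
    alert: Callable[[list[str]], None],
) -> list[str]:
    """Run every check, persist each result, and alert on failures."""
    if not config_dir.is_dir():
        raise FileNotFoundError(f"no dim checks found at {config_dir}")
    failed = []
    for check in checks:
        passed = check.execute()
        persist(check.name, passed)  # stand-in for writing to BQ
        if not passed:
            failed.append(check.name)
    if alerting and failed:
        alert(failed)  # stand-in for the Slack notification
    return failed


results, alerts = [], []
with tempfile.TemporaryDirectory() as tmp:
    failed = run_checks(
        Path(tmp),
        [Check("row_count", lambda: True), Check("not_null", lambda: False)],
        alerting=True,
        persist=lambda name, ok: results.append((name, ok)),
        alert=alerts.append,
    )
print(failed, results, alerts)
```

Every result is persisted whether or not the check passed, so the stored metadata covers successes as well as failures.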
- *.md file linter
- *.py files (flake8 + black)
- *.yaml linter