Completed code challenge #4
Open
Cloud Data Engineer Challenge — MR
Summary
This MR delivers an end-to-end, free-tier–friendly AWS data pipeline using Terraform + Python:
S3 (incoming/) → Lambda (ingest) → RDS PostgreSQL (PostGIS)
GET /aggregated-data
Repository Layout
What’s Implemented
ObjectCreated:* with prefix=incoming/ and suffix=.csv → ingest Lambda.
city (Title Case), parse price (comma/locale tolerant), compute listing_count + avg_price.
CREATE EXTENSION postgis and table created on first connect.
GET /aggregated-data?limit=&city=
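The transform step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: the function names, the `MAX_PRICE` bound, and the rule that a lone comma followed by one or two digits is a decimal comma are all assumptions, as is the choice to count rows without a price while skipping rows whose price is present but invalid.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional

MAX_PRICE = 1_000_000_000.0  # hypothetical sanity bound, not taken from the MR


def parse_price(raw: str) -> Optional[float]:
    """Parse values such as '1,234.56' or '6,81'; return None if unparseable."""
    text = (raw or "").strip().replace(" ", "")
    if not text:
        return None
    if "," in text and "." in text:
        # Both separators present: whichever comes last is the decimal mark.
        if text.rfind(",") > text.rfind("."):
            text = text.replace(".", "").replace(",", ".")
        else:
            text = text.replace(",", "")
    elif "," in text:
        head, _, tail = text.rpartition(",")
        # Assumption: one comma followed by 1-2 digits is a decimal comma,
        # anything else is a thousands separator.
        if text.count(",") == 1 and len(tail) in (1, 2):
            text = head + "." + tail
        else:
            text = text.replace(",", "")
    try:
        return float(text)
    except ValueError:
        return None


def normalize_city(raw: str) -> str:
    """Collapse whitespace and convert to Title Case."""
    return " ".join((raw or "").split()).title()


def aggregate(rows: Iterable[dict]) -> Dict[str, dict]:
    """Compute listing_count and avg_price per normalized city."""
    counts = defaultdict(int)
    sums = defaultdict(float)
    priced = defaultdict(int)
    for row in rows:
        city = normalize_city(row.get("city") or "")
        if not city:
            continue  # invalid row: no city
        raw_price = (row.get("price") or "").strip()
        if raw_price:
            price = parse_price(raw_price)
            if price is None or not (0 <= price <= MAX_PRICE):
                continue  # invalid row: price present but unusable
            sums[city] += price
            priced[city] += 1
        counts[city] += 1
    return {
        city: {
            "listing_count": n,
            "avg_price": sums[city] / priced[city] if priced[city] else None,
        }
        for city, n in counts.items()
    }
```

Rows can come straight from `csv.DictReader`, since only the `city` and `price` keys are read and any other columns are ignored.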
Additional Features
Price parsing (1,234.56 and 6,81), city normalization, sanity bounds; invalids skipped.
idx_agg_count_city (listing_count DESC, city ASC) for API ordering
idx_agg_city (city) for lookups
idx_agg_geom (GIST) to enable spatial queries (nullable geom)
ANALYZE aggregated_city_stats after ingest
Errors > 0 to SNS (opt-in).
terraform fmt/validate, Python lint (ruff), detect-secrets.
Deploy Instructions
Prerequisites
make optional; PowerShell fallbacks are in README.md
Configure AWS
aws configure   # region: us-east-1
aws sts get-caller-identity
Build Lambda zips (Linux-compatible wheels)
Windows (PowerShell):
(If not using make, the README includes Docker one-liners to build both zips.)
Apply Terraform
Grab outputs
Test Instructions
1) End-to-end with S3 Put
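As a rough sketch of this step, a CSV can be uploaded under the incoming/ prefix with boto3. The helper names and the timestamp suffix are illustrative assumptions (the suffix simply guarantees a new key name on every run), and the bucket name must come from the Terraform outputs.

```python
import os
import time


def build_incoming_key(local_path: str, prefix: str = "incoming/") -> str:
    """Return a fresh key under the prefix that ends in .csv."""
    stem, _ext = os.path.splitext(os.path.basename(local_path))
    return f"{prefix}{stem}-{int(time.time())}.csv"


def upload_csv(bucket: str, local_path: str) -> str:
    """Upload the file so that the ObjectCreated:* notification fires."""
    import boto3  # imported lazily so the key helper works without boto3

    key = build_incoming_key(local_path)
    s3 = boto3.client("s3", region_name="us-east-1")
    s3.upload_file(local_path, bucket, key)
    return key
```

Calling `upload_csv("<bucket-from-outputs>", "listings.csv")` should then produce the ingest logs listed below.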
Expected logs: s3_get_object_before, csv_parsed, db_connect_*, done.
2) Query the API
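A minimal sketch of calling the endpoint from Python, using only the standard library. The base URL is a placeholder for the value in the Terraform outputs, the function names are illustrative, and the response is assumed to be JSON (its exact shape is not shown here).

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url: str, limit: Optional[int] = None,
              city: Optional[str] = None) -> str:
    """Build the /aggregated-data URL with optional limit and city filters."""
    params = {}
    if limit is not None:
        params["limit"] = limit
    if city:
        params["city"] = city
    url = base_url.rstrip("/") + "/aggregated-data"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


def fetch_aggregated(base_url: str, limit: Optional[int] = None,
                     city: Optional[str] = None):
    """GET the endpoint and decode the body, assumed to be JSON."""
    with urlopen(build_url(base_url, limit, city), timeout=10) as resp:
        return json.load(resp)
```

For example, `fetch_aggregated("<api-base-url-from-outputs>", limit=5, city="New York")` requests the top five rows for one city.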
3) Manual Lambda Invoke (optional)
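A manual invoke needs a payload shaped like an S3 notification, which is presumably what s3_put_event.json provides. The sketch below builds a reduced event with only the commonly read fields and sends it with boto3. The function name passed to `invoke_ingest` is hypothetical, and which fields the ingest handler actually reads is an assumption.

```python
import json


def make_s3_put_event(bucket: str, key: str) -> dict:
    """Build a minimal S3 ObjectCreated:Put notification for one object."""
    return {
        "Records": [
            {
                "eventSource": "aws:s3",
                "eventName": "ObjectCreated:Put",
                "awsRegion": "us-east-1",
                "s3": {
                    "bucket": {"name": bucket},
                    "object": {"key": key},
                },
            }
        ]
    }


def invoke_ingest(function_name: str, bucket: str, key: str):
    """Invoke the Lambda synchronously and return its decoded response."""
    import boto3  # imported lazily so the event builder works without boto3

    client = boto3.client("lambda", region_name="us-east-1")
    resp = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps(make_s3_put_event(bucket, key)).encode("utf-8"),
    )
    return json.loads(resp["Payload"].read() or b"null")
```

The object named in the event must already exist in the bucket, because the handler fetches it from S3 after receiving the event.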
Security & Networking
Deliverables Checklist
s3_put_event.json
Assumptions
city and optional price; others ignored.
geom is nullable; GIST index prepped for future spatial queries.
Troubleshooting
(ObjectCreated:*, prefix=incoming/, suffix=.csv), ensure region us-east-1, try a new key name.
psycopg2 missing → Rebuild zips via Docker (make build-linux-all), then terraform apply.
/aws/lambda/renzob-nanlabs-dev-api; confirm RDS is ready and Secrets accessible.
source_code_hash = filebase64sha256(...) is set for both Lambdas.