This project showcases an end-to-end customer data transformation and analytics workflow using dbt (Data Build Tool) and Google BigQuery.
It focuses on structuring raw customer data into clean, trusted datasets ready for analysis — and extracting valuable business insights like customer loyalty, churn behavior, and overall activity trends.
In real-world business environments, raw customer data is often messy, inconsistent, and hard to use directly for decision-making.
This project simulates a modern data workflow where raw data is:
- Transformed and modeled into clean structures
- Tested for quality
- Documented for transparency
- Analyzed to generate meaningful insights
By following best practices in data engineering and analytics, this project builds a robust foundation for trustworthy business intelligence.
- dbt (Data Build Tool): For data transformation, testing, and documentation.
- Google BigQuery: As the cloud data warehouse to store and query the data.
- SQL: For all transformation logic and insights extraction.
-
Data Transformation:
- Built dbt models to clean and organize raw customer data.
- Structured fields like customer IDs, names, first/most recent orders, and number of orders.
-
Data Testing:
- Implemented dbt tests to ensure data integrity (e.g., primary key uniqueness, no nulls in critical fields).
-
Data Documentation:
- Generated dbt documentation to provide model descriptions, column details, and lineage graphs.
-
Data Analysis:
- Queried BigQuery to extract insights such as:
- Top 10 loyal customers
- Average number of orders per customer
- Churned customers (inactive > 80 days)
- New vs Returning customer breakdown
- Customer active lifespan analysis
- Queried BigQuery to extract insights such as:
-
Visualization:
- Created basic charts directly in BigQuery to visualize key metrics like loyalty distribution and churn rates.
By implementing this workflow:
- Data becomes structured, consistent, and reliable for analysis.
- Businesses can quickly identify high-value customers and churn risks.
- Data quality is monitored automatically using dbt testing.
- Teams gain clear documentation and data lineage visibility, reducing onboarding time for new members.
Query 1 - finding top 10 loyal customers
Query 2 - average number of orders per customer
Query 3 - churn analysis (customers inactive for 2+ months)
Query 4 - new vs returning customer
Query 5 - customer activity time span
-
Most customers place only a single order. This indicates low customer retention across the user base. Business Focus: Introduce loyalty programs, reactivation offers, and retention campaigns to encourage repeat orders.
-
A small group of loyal customers contributes to repeat business. Top customers placed 8 to 10 orders each. These high-value users represent an opportunity for VIP loyalty programs, exclusive offers, and personalized marketing. Business Focus: Nurture loyal customers and increase their lifetime value.
-
Customer churn is a potential risk. Several customers have been inactive for over 80 days. Early signs of churn highlight the need for customer re-engagement strategies. Business Focus: Send reactivation campaigns targeting churned customers to win them back.
-
Returning customer rate is strong, but new customer acquisition is lower. 67% of customers are returning; only 33% are new. This suggests the business is retaining customers well, but may need to increase new customer acquisition efforts. Business Focus: Enhance marketing campaigns to attract new customers, while maintaining strong loyalty efforts for existing users.
-
Most customers have short activity spans. The longest customer activity period recorded is only 76 days (~2.5 months). This indicates that customers typically stop engaging within 2-3 months of their first order. Business Focus: Launch re-engagement campaigns around the 60-day mark to prevent churn.
This project demonstrates the power of combining dbt, BigQuery, and SQL for building scalable, tested, and insightful data workflows.
It highlights how structured data pipelines can translate directly into business value by enabling faster, more confident decision-making.