Technical Documentation
- Overview
 - System Architecture
 - Data Generation Module
 - Pricing Algorithms
 - Machine Learning Components
 - API Reference
 - Flow Diagrams
 - Technical Glossary
 - Configuration Parameters
 - Usage Examples
 
The Dynamic Pricing Strategies ML project is an educational framework that demonstrates various pricing optimization techniques using machine learning and rule-based approaches. The system simulates an e-commerce marketplace ("Elves' Marketplace") with synthetic transaction data and implements multiple pricing strategies.
- Primary: Data scientists and ML engineers
 - Secondary: Business analysts and pricing strategists
 - Tertiary: Software developers implementing pricing systems
 
- Python 3.8+
 - Required packages: pandas, numpy, scikit-learn, matplotlib, seaborn, plotly, jupyter, ipywidgets
 - Memory: Minimum 2GB RAM for dataset processing
 - Storage: 50MB for generated datasets
 
The system consists of three main components:
┌─────────────────────────────────────────────────────────────┐
│                    Dynamic Pricing System                   │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐ │
│  │   Data Layer    │  │ Algorithm Layer │  │ Interface Layer │ │
│  │                 │  │                 │  │                 │ │
│  │ • Data Generator│  │ • Rule-based    │  │ • Jupyter       │ │
│  │ • CSV Storage   │  │ • ML Models     │  │ • Interactive   │ │
│  │ • Validation    │  │ • Optimization  │  │ • Widgets       │ │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘ │
│                                                             │
└─────────────────────────────────────────────────────────────┘
Raw Parameters → Data Generation → Feature Engineering → Algorithm Processing → Price Output
      ↓                ↓                    ↓                    ↓               ↓
   Business Rules → Synthetic Data → ML Features → Price Calculation → Validation
This module creates synthetic e-commerce transaction data that mirrors real-world pricing scenarios.
Purpose: Generates a realistic dataset of e-commerce transactions with dynamic pricing factors.
Parameters: None (configuration is internal)
Returns: pandas.DataFrame with transaction data
Side Effects:
- Uses fixed random seed (42) for reproducibility
 - Creates deterministic but realistic data patterns
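A minimal sketch of what the fixed seed provides, using only the standard library. `draw_prices` is a hypothetical helper and not part of the project; the real generator may seed NumPy rather than `random`.

```python
import random
from typing import List

def draw_prices(seed: int, n: int = 5) -> List[float]:
    """Draw n base prices in the documented 10.0-500.0 range (hypothetical helper)."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    return [round(rng.uniform(10.0, 500.0), 2) for _ in range(n)]

first_run = draw_prices(42)
second_run = draw_prices(42)
other_seed = draw_prices(7)

print(first_run == second_run)  # same seed reproduces the same draws
print(first_run == other_seed)  # a different seed gives different data
```

Using a local `random.Random` instance instead of the module-level functions keeps the reproducibility guarantee independent of any other code that touches the global generator.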
 
Configuration Parameters:
n_transactions = 25000       # Total number of transactions
n_products = 34              # Number of unique products
start_date = '2023-01-01'    # Dataset start date
end_date = '2023-12-31'      # Dataset end date

Data Schema:
| Column | Type | Description | Range/Values | 
|---|---|---|---|
| transaction_id | string | Unique transaction identifier | TXN_XXXXXX | 
| product_id | string | Product identifier | ELF_XXX | 
| product_name | string | Human-readable product name | Various | 
| category | string | Product category | Potions, Tools, Jewelry, Scrolls, Enchanted Items | 
| original_price | float | Base product price | 10.0 - 500.0 | 
| price_paid | float | Final transaction price | Calculated | 
| quantity | int | Items purchased | 1-3 | 
| timestamp | datetime | Transaction timestamp | 2023-01-01 to 2023-12-31 | 
| customer_id | string | Customer identifier | CUST_XXXX | 
| customer_segment | string | Customer category | New, Loyal, High-Value, Regular | 
| inventory_level_before_sale | int | Stock level | 0-100 | 
| competitor_price_avg | float | Market price comparison | Calculated | 
| holiday_season | int | Holiday indicator | 0 or 1 | 
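The schema above can be checked row by row. The sketch below is illustrative only: `validate_row` is a hypothetical helper, and it assumes that each `X` in the ID templates stands for one digit (consistent with IDs such as `ELF_001` used elsewhere in this document).

```python
import re
from typing import Dict, List

# Assumption: every "X" in TXN_XXXXXX / ELF_XXX / CUST_XXXX is a single digit.
ID_PATTERNS = {
    "transaction_id": re.compile(r"TXN_\d{6}"),
    "product_id": re.compile(r"ELF_\d{3}"),
    "customer_id": re.compile(r"CUST_\d{4}"),
}
CATEGORIES = {"Potions", "Tools", "Jewelry", "Scrolls", "Enchanted Items"}
SEGMENTS = {"New", "Loyal", "High-Value", "Regular"}

def validate_row(row: Dict[str, object]) -> List[str]:
    """Return a list of schema problems for one transaction row (empty if valid)."""
    problems = []
    for column, pattern in ID_PATTERNS.items():
        if not pattern.fullmatch(str(row.get(column, ""))):
            problems.append(f"{column}: unexpected format")
    if row.get("category") not in CATEGORIES:
        problems.append("category: unknown value")
    if row.get("customer_segment") not in SEGMENTS:
        problems.append("customer_segment: unknown value")
    if not 10.0 <= row.get("original_price", 0.0) <= 500.0:
        problems.append("original_price: outside 10.0-500.0")
    if row.get("quantity") not in (1, 2, 3):
        problems.append("quantity: outside 1-3")
    if not 0 <= row.get("inventory_level_before_sale", -1) <= 100:
        problems.append("inventory_level_before_sale: outside 0-100")
    if row.get("holiday_season") not in (0, 1):
        problems.append("holiday_season: must be 0 or 1")
    return problems

good_row = {
    "transaction_id": "TXN_000001", "product_id": "ELF_001",
    "customer_id": "CUST_0042", "category": "Potions",
    "customer_segment": "Loyal", "original_price": 25.0, "quantity": 2,
    "inventory_level_before_sale": 12, "holiday_season": 0,
}
bad_row = dict(good_row, product_id="ELF_1", quantity=5)

print(validate_row(good_row))
print(validate_row(bad_row))
```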
The data generation implements several pricing multipliers:
- Holiday Multiplier: 1.10 - 1.30x during holiday periods
- Weekend Multiplier: 1.05 - 1.15x on weekends
- Inventory Multiplier:
  - Low stock (< 10): 1.15 - 1.25x
  - High stock (> 80): 0.90 - 0.95x
- Market Noise: 0.95 - 1.05x random variation
 
Algorithm:
final_price = original_price × holiday_multiplier × weekend_multiplier × inventory_multiplier × market_noise

| Category | Base Price Range | Characteristics | 
|---|---|---|
| Potions | $15 - $45 | Fast-moving consumer goods | 
| Tools | $25 - $80 | Durable goods with moderate demand | 
| Jewelry | $50 - $200 | Premium items with price elasticity | 
| Scrolls | $10 - $30 | Low-cost, high-volume items | 
| Enchanted Items | $100 - $500 | Luxury goods with low volume | 
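The multiplier formula from the data generation step can be sketched as follows. This is an illustration under stated assumptions, not the project's generator: `simulate_price_paid` is a hypothetical name, each multiplier is assumed to be drawn uniformly from its documented range, and rounding to cents is also an assumption.

```python
import random

def simulate_price_paid(original_price: float, inventory_level: int,
                        is_holiday: bool, is_weekend: bool,
                        rng: random.Random) -> float:
    """Apply the documented multiplier ranges to a base price (uniform draws assumed)."""
    holiday = rng.uniform(1.10, 1.30) if is_holiday else 1.0
    weekend = rng.uniform(1.05, 1.15) if is_weekend else 1.0
    if inventory_level < 10:        # low stock raises the price
        inventory = rng.uniform(1.15, 1.25)
    elif inventory_level > 80:      # high stock lowers it
        inventory = rng.uniform(0.90, 0.95)
    else:
        inventory = 1.0
    noise = rng.uniform(0.95, 1.05)  # market noise always applies
    return round(original_price * holiday * weekend * inventory * noise, 2)

rng = random.Random(42)
quiet_day = simulate_price_paid(100.0, 50, False, False, rng)
peak_day = simulate_price_paid(100.0, 5, True, True, rng)
print(quiet_day, peak_day)
```

With a base price of 100.0, a quiet day can only vary through market noise (95.00 to 105.00), while a low-stock holiday weekend lands between roughly 126.18 and 196.22, the products of the lower and upper ends of every range.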
Purpose: Applies business rule-based price adjustments using conditional logic.
Signature:
def rule_based_pricing(original_price: float, 
                      inventory_level: int, 
                      is_holiday: bool, 
                      is_weekend: bool, 
                      customer_segment: str) -> Tuple[float, List[str]]

Parameters:
- original_price (float): Base price before adjustments (required, > 0)
- inventory_level (int): Current stock quantity (required, 0-100)
- is_holiday (bool): Holiday period indicator (required)
- is_weekend (bool): Weekend timing indicator (required)
- customer_segment (str): Customer category (required, values: 'New', 'Regular', 'Loyal', 'High-Value')

Returns:
- Tuple[float, List[str]]: (adjusted_price, list_of_applied_adjustments)
Side Effects: None (pure function)
Business Rules:
- Inventory-Based Pricing:
    if inventory_level < 10:
        price *= 1.25  # +25% for scarcity
    elif inventory_level > 80:
        price *= 0.95  # -5% for excess inventory
- Temporal Pricing:
    if is_holiday:
        price *= 1.20  # +20% holiday premium
    if is_weekend:
        price *= 1.10  # +10% weekend premium
- Customer Segmentation:
    if customer_segment == 'Loyal':
        price *= 0.90  # -10% loyalty discount
 
Algorithm Complexity: O(1) - constant time execution
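The business rules above can be combined into one function. The following is a minimal sketch, not the project's implementation: the name `rule_based_pricing_sketch` is hypothetical, the adjustment labels are copied from the documented example outputs, and rounding the result to cents is an assumption. Only the Loyal discount listed in the rules is applied.

```python
from typing import List, Tuple

def rule_based_pricing_sketch(original_price: float,
                              inventory_level: int,
                              is_holiday: bool,
                              is_weekend: bool,
                              customer_segment: str) -> Tuple[float, List[str]]:
    """Apply the documented rules in order: inventory, holiday, weekend, segment."""
    price = original_price
    adjustments: List[str] = []

    # Inventory rule: at most one of the two branches applies.
    if inventory_level < 10:
        price *= 1.25
        adjustments.append("📦 Low stock (+25%)")
    elif inventory_level > 80:
        price *= 0.95
        adjustments.append("📦 High stock (-5%)")

    # Temporal rules are independent and can stack.
    if is_holiday:
        price *= 1.20
        adjustments.append("🎄 Holiday season (+20%)")
    if is_weekend:
        price *= 1.10
        adjustments.append("🌅 Weekend (+10%)")

    # Segment rule: only the Loyal discount appears in the rules above.
    if customer_segment == "Loyal":
        price *= 0.90
        adjustments.append("👑 Loyal customer (-10%)")

    return round(price, 2), adjustments  # rounding to cents is an assumption

price, applied = rule_based_pricing_sketch(50.0, 5, True, False, "Loyal")
print(price, applied)
```

Because the adjustments multiply, they compound: 50.0 × 1.25 × 1.20 × 0.90 = 67.50, which is not the same as adding the percentages (+35% would give 67.50 only by coincidence of rounding in other cases, so the multiplicative form is the one to rely on).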
Example Usage:
new_price, adjustments = rule_based_pricing(
    original_price=50.0,
    inventory_level=5,
    is_holiday=True,
    is_weekend=False,
    customer_segment='Loyal'
)
# Result: (67.50, ['📦 Low stock (+25%)', '🎄 Holiday season (+20%)', '👑 Loyal customer (-10%)'])

Purpose: Uses linear regression to predict optimal pricing based on historical patterns.
Signature:
def predict_optimal_price(original_price: float,
                         inventory_level: int,
                         is_holiday: bool,
                         is_weekend: bool,
                         competitor_price: float) -> Dict[str, Any]

Parameters:
- original_price (float): Base product price (required, > 0)
- inventory_level (int): Current inventory level (required, 0-100)
- is_holiday (bool): Holiday season indicator (required)
- is_weekend (bool): Weekend indicator (required)
- competitor_price (float): Average competitor pricing (required, > 0)

Returns:
- Dict[str, Any]: Comprehensive prediction results including:
  - predicted_price: Optimal price prediction
  - confidence_score: Model confidence (R² score)
  - revenue_estimate: Expected revenue
  - demand_forecast: Predicted demand
  - price_sensitivity: Elasticity coefficient
  - test_prices: Array of test prices for analysis
  - revenues: Corresponding revenue predictions

Machine Learning Model:
- Algorithm: Linear Regression (sklearn.linear_model.LinearRegression)
 - Features: [original_price, inventory_level, is_holiday, is_weekend, competitor_price]
 - Target: price_paid (from historical data)
 - Training Split: 80% training, 20% validation
 - Performance Metrics: R² score, RMSE
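For reference, the two performance metrics named above can be computed by hand. The sketch below uses only the standard library and made-up numbers; `r_squared` and `rmse` are hypothetical helper names, and in practice scikit-learn's own metric functions would normally be used instead.

```python
import math
from typing import Sequence

def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error, in the same units as the target (price)."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

# Illustrative values only, not taken from the generated dataset.
actual_prices = [50.0, 60.0, 70.0, 80.0]
predicted_prices = [52.0, 58.0, 71.0, 79.0]

print(round(r_squared(actual_prices, predicted_prices), 3))  # 0.98
print(round(rmse(actual_prices, predicted_prices), 2))       # 1.58
```

Here the squared errors sum to 10 and the total variance around the mean of 65 sums to 500, so R² = 1 - 10/500 = 0.98 and RMSE = sqrt(10/4) ≈ 1.58.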
 
Feature Engineering:
features = df[['original_price', 'inventory_level_before_sale', 
               'holiday_season', 'weekend', 'competitor_price_avg']]
target = df['price_paid']

Model Training Process:
- Load historical transaction data
 - Engineer weekend feature from timestamp
 - Split data into training/validation sets
 - Fit linear regression model
 - Calculate performance metrics
 - Generate price-revenue optimization curve
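Two of the steps above, engineering the weekend feature and splitting the data, can be sketched without any third-party packages. `weekend_flag` and `split_rows` are hypothetical helpers; the project itself works on a pandas DataFrame, and the shuffle seed of 42 is an assumption borrowed from the data generator.

```python
import random
from datetime import datetime
from typing import List, Tuple

def weekend_flag(timestamp: datetime) -> int:
    """Return 1 for Saturday/Sunday, else 0 (weekday() is 5 or 6 on weekends)."""
    return int(timestamp.weekday() >= 5)

def split_rows(rows: List[dict], train_fraction: float = 0.8,
               seed: int = 42) -> Tuple[List[dict], List[dict]]:
    """Shuffle a copy of the rows, then cut it into training and validation parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Illustrative rows: one per day for the first 100 days of 2023.
rows = [{"timestamp": datetime.fromordinal(datetime(2023, 1, 1).toordinal() + day)}
        for day in range(100)]
for row in rows:
    row["weekend"] = weekend_flag(row["timestamp"])

train_rows, validation_rows = split_rows(rows)
print(len(train_rows), len(validation_rows))
```

Shuffling before the cut matters for time-ordered transaction data: without it, the validation set would contain only the latest dates.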
 
The system implements a scikit-learn based linear regression model for price optimization:
Model Architecture:
Input Features (5) → Linear Regression → Price Prediction
                  ↓
              Model Coefficients → Business Insights
Feature Importance: The model learns coefficients for each input feature, allowing interpretation of pricing factors:
# Example coefficient interpretation
price_effect = {
    'base_price': 0.85,      # Strong positive correlation
    'inventory': -0.12,      # Negative correlation (more stock = lower price)
    'holiday': 8.45,         # Significant positive impact
    'weekend': 3.22,         # Moderate positive impact  
    'competitor': 0.18       # Slight positive correlation
}

The system generates revenue optimization curves by:
- Price Testing: Testing price points from 80% to 120% of original price
 - Demand Modeling: Using price elasticity assumptions
 - Revenue Calculation: Revenue = Price × Predicted_Demand
 - Optimization: Finding price point that maximizes revenue
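The four steps above can be sketched as a small grid search. This is an illustration, not the project's code: `revenue_curve` and `best_price` are hypothetical names, the demand formula is the constant-elasticity assumption this document uses, and `base_demand` and the elasticity values are made up. The 80% to 120% range and the 20 test points follow the documented configuration.

```python
from typing import List, Tuple

def revenue_curve(original_price: float, base_demand: float, elasticity: float,
                  low: float = 0.8, high: float = 1.2,
                  n_points: int = 20) -> List[Tuple[float, float, float]]:
    """Return (test_price, demand, revenue) for evenly spaced test prices."""
    step = (high - low) / (n_points - 1)
    curve = []
    for i in range(n_points):
        test_price = original_price * (low + i * step)
        # Constant-elasticity demand: demand falls as the test price rises.
        demand = base_demand * (original_price / test_price) ** elasticity
        curve.append((test_price, demand, test_price * demand))
    return curve

def best_price(curve: List[Tuple[float, float, float]]) -> float:
    """Pick the test price with the highest revenue."""
    return max(curve, key=lambda point: point[2])[0]

elastic = revenue_curve(original_price=50.0, base_demand=100.0, elasticity=1.5)
inelastic = revenue_curve(original_price=50.0, base_demand=100.0, elasticity=0.5)
print(round(best_price(elastic), 2), round(best_price(inelastic), 2))
```

One property of this demand model is worth knowing: revenue is proportional to price raised to the power (1 - elasticity), so it is monotonic in price. The grid search therefore always selects the lowest test price when elasticity is above 1 and the highest when it is below 1. An interior optimum requires a different demand model or additional constraints such as costs.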
 
Demand Elasticity Model:
# Simple elasticity assumption
demand = base_demand * (original_price / test_price) ** elasticity_coefficient
revenue = test_price * demand

def generate_elves_marketplace_data() -> pd.DataFrame
Generates synthetic e-commerce dataset with realistic pricing patterns.
Returns: DataFrame with 25,000 transactions across 34 products
def rule_based_pricing(original_price: float, 
                      inventory_level: int, 
                      is_holiday: bool, 
                      is_weekend: bool, 
                      customer_segment: str) -> Tuple[float, List[str]]
Applies business rule-based price adjustments.
Error Handling:
- Validates price > 0
 - Ensures inventory_level in valid range
 - Validates customer_segment values
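The three checks listed above might look like the following. This is a sketch under assumptions: `validate_pricing_inputs` is a hypothetical helper, and raising `ValueError` is an assumed convention, since the documentation does not state how invalid input is reported.

```python
VALID_SEGMENTS = ("New", "Regular", "Loyal", "High-Value")

def validate_pricing_inputs(original_price: float,
                            inventory_level: int,
                            customer_segment: str) -> None:
    """Raise ValueError (assumed convention) if any documented constraint fails."""
    if original_price <= 0:
        raise ValueError(f"original_price must be > 0, got {original_price}")
    if not 0 <= inventory_level <= 100:
        raise ValueError(f"inventory_level must be in 0-100, got {inventory_level}")
    if customer_segment not in VALID_SEGMENTS:
        raise ValueError(f"customer_segment must be one of {VALID_SEGMENTS}, "
                         f"got {customer_segment!r}")

validate_pricing_inputs(50.0, 5, "Loyal")  # valid input passes silently

try:
    validate_pricing_inputs(50.0, 5, "VIP")
except ValueError as error:
    message = str(error)
    print(message)
```

Running the validation before any price arithmetic keeps the pricing function itself free of defensive branches.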
 
def predict_optimal_price(original_price: float,
                         inventory_level: int,
                         is_holiday: bool,
                         is_weekend: bool,
                         competitor_price: float) -> Dict[str, Any]
Machine learning-based price optimization with revenue forecasting.
Performance:
- Training time: ~50ms on 25k records
 - Prediction time: <1ms per request
 - Memory usage: ~15MB for model storage
 
def interactive_pricing(original_price: float, 
                       inventory: int, 
                       is_holiday: bool, 
                       is_weekend: bool, 
                       customer_segment: str) -> None
Widget-compatible function for Jupyter notebook interaction.
Side Effects: Prints formatted pricing analysis to stdout
def create_pricing_visualization(original_price: float = 50.0,
                                inventory_level: int = 50,
                                is_holiday: bool = False,
                                is_weekend: bool = False,
                                competitor_price: float = 52.0) -> None
Generates interactive Plotly visualizations for pricing analysis.
Visualizations Created:
- Revenue vs Price curve
 - Demand vs Price relationship
 - Profit margin analysis
 - Competitive positioning
 
graph TD
    A[Business Requirements] --> B[Data Generation]
    B --> C[Synthetic Dataset]
    C --> D[Feature Engineering]
    D --> E{Pricing Strategy}
    E -->|Rule-Based| F[Apply Business Rules]
    E -->|ML-Based| G[Train ML Model]
    F --> H[Price Output]
    G --> I[Predict Optimal Price]
    I --> H
    H --> J[Revenue Analysis]
    J --> K[Business Insights]
graph TD
    A[Input: Original Price] --> B[Check Inventory Level]
    B -->|< 10 items| C[Apply +25% markup]
    B -->|> 80 items| D[Apply -5% discount]
    B -->|10-80 items| E[No inventory adjustment]
    C --> F[Check Holiday Status]
    D --> F
    E --> F
    F -->|Holiday = True| G[Apply +20% premium]
    F -->|Holiday = False| H[Check Weekend Status]
    G --> H
    H -->|Weekend = True| I[Apply +10% premium]
    H -->|Weekend = False| J[Check Customer Segment]
    I --> J
    J -->|Loyal Customer| K[Apply -10% discount]
    J -->|Other Segments| L[Final Price Calculation]
    K --> L
    L --> M[Return Price + Adjustments]
graph TD
    A[Historical Transaction Data] --> B[Data Preprocessing]
    B --> C[Feature Engineering]
    C --> D[Train/Test Split]
    D --> E[Model Training]
    E --> F[Model Validation]
    F --> G[Performance Evaluation]
    G -->|R² < 0.7| H[Hyperparameter Tuning]
    G -->|R² ≥ 0.7| I[Model Deployment]
    H --> E
    I --> J[Price Prediction Service]
    J --> K[Revenue Optimization]
graph TD
    A[Current Product State] --> B[Generate Price Candidates]
    B --> C[For Each Price Point]
    C --> D[Predict Demand]
    D --> E[Calculate Revenue]
    E --> F[Consider Constraints]
    F --> G[Compare to Current Revenue]
    G --> H[Select Optimal Price]
    H --> I[Validate Business Rules]
    I -->|Pass| J[Implement New Price]
    I -->|Fail| K[Apply Rule Constraints]
    K --> J
Dynamic Pricing: Algorithmic pricing strategy that adjusts prices in real time based on market conditions, demand patterns, and business constraints.
Price Elasticity: Measure of how responsive demand is to price changes. Calculated as % change in demand / % change in price.
Revenue Optimization: Process of finding the price point that maximizes total revenue (price × quantity sold).
Customer Segmentation: Division of customers into groups based on behavior, value, or demographics for targeted pricing strategies.
Linear Regression: Statistical method for modeling the relationship between input features and target variable using linear equations.
Feature Engineering: Process of selecting and transforming variables for machine learning models.
Cross-Validation: Technique for evaluating model performance by testing on multiple data subsets.
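R² Score (Coefficient of Determination): Metric measuring how much of the variance in the target variable the model explains. Values usually fall between 0 and 1, but the score can be negative on held-out data when the model fits worse than always predicting the mean.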
Base Price: Original manufacturer suggested retail price before any dynamic adjustments.
Markup: Percentage increase above base price.
Markdown: Percentage decrease below base price.
Holiday Premium: Additional price increase during high-demand seasonal periods.
Inventory Multiplier: Price adjustment factor based on current stock levels.
Competitor Price: Average market price for equivalent products.
Synthetic Data: Artificially generated data that mimics real-world patterns and distributions.
Jupyter Notebook: Interactive computing environment for data science and ML development.
IPython Widgets: Interactive HTML widgets for Jupyter notebooks.
Plotly: Plotting library for interactive visualizations; its Python package renders charts with the plotly.js JavaScript library.
# Dataset Configuration
DATASET_CONFIG = {
    'n_transactions': 25000,        # Total number of transactions
    'n_products': 34,              # Number of unique products
    'n_customers': 5000,           # Approximate customer base
    'date_range': {
        'start': '2023-01-01',     # Dataset start date
        'end': '2023-12-31'        # Dataset end date
    },
    'random_seed': 42,             # Reproducibility seed
    'holiday_periods': [           # Holiday date ranges
        ('2023-03-15', '2023-04-15'),  # Spring Festival
        ('2023-06-15', '2023-07-15'),  # Midsummer Festival
        ('2023-11-20', '2023-12-25')   # Winter Celebration
    ]
}

# Rule-Based Pricing Configuration
PRICING_RULES = {
    'inventory_thresholds': {
        'low_stock': 10,           # Units below which scarcity pricing applies
        'high_stock': 80,          # Units above which clearance pricing applies
        'low_stock_markup': 0.25,  # 25% markup for low inventory
        'high_stock_discount': 0.05 # 5% discount for excess inventory
    },
    'temporal_adjustments': {
        'holiday_premium': 0.20,   # 20% holiday season markup
        'weekend_premium': 0.10    # 10% weekend markup
    },
    'customer_discounts': {
        'loyal_discount': 0.10,    # 10% discount for loyal customers
        'high_value_discount': 0.15 # 15% discount for high-value customers
    }
}

# ML Model Configuration
ML_CONFIG = {
    'model_type': 'LinearRegression',
    'train_test_split': 0.8,       # 80% training, 20% testing
    'features': [
        'original_price',
        'inventory_level_before_sale',
        'holiday_season',
        'weekend',
        'competitor_price_avg'
    ],
    'target': 'price_paid',
    'performance_threshold': 0.7,   # Minimum R² score
    'price_test_range': (0.8, 1.2), # Test prices from 80% to 120% of original
    'price_test_points': 20         # Number of price points to test
}

# Plotting Configuration
PLOT_CONFIG = {
    'figure_size': (12, 8),
    'color_palette': 'husl',
    'interactive_plots': True,
    'plot_style': 'seaborn-v0_8',
    'dpi': 100,
    'font_size': 12
}

# Generate synthetic dataset
df = generate_elves_marketplace_data()
print(f"Generated {len(df)} transactions")
print(f"Date range: {df['timestamp'].min()} to {df['timestamp'].max()}")
print(f"Products: {df['product_id'].nunique()}")
print(f"Customers: {df['customer_id'].nunique()}")
# Save to CSV
df.to_csv('elves_marketplace_data.csv', index=False)

# Example 1: High-demand scenario
price, adjustments = rule_based_pricing(
    original_price=100.0,
    inventory_level=3,      # Very low stock
    is_holiday=True,        # Holiday period
    is_weekend=True,        # Weekend
    customer_segment='New'  # New customer
)
print(f"New price: ${price:.2f}")
print(f"Adjustments: {adjustments}")
# Output: New price: $165.00, Adjustments: ['📦 Low stock (+25%)', '🎄 Holiday season (+20%)', '🌅 Weekend (+10%)']
# Example 2: Clearance scenario
price, adjustments = rule_based_pricing(
    original_price=50.0,
    inventory_level=95,      # Excess inventory
    is_holiday=False,        # Regular period
    is_weekend=False,        # Weekday
    customer_segment='Loyal' # Loyal customer
)
print(f"New price: ${price:.2f}")
print(f"Adjustments: {adjustments}")
# Output: New price: $42.75, Adjustments: ['📦 High stock (-5%)', '👑 Loyal customer (-10%)']

import pandas as pd

# Load historical data
df = pd.read_csv('elves_marketplace_data.csv')
df['timestamp'] = pd.to_datetime(df['timestamp'])
# Prepare features
df['weekend'] = (df['timestamp'].dt.weekday >= 5).astype(int)
# Predict optimal price
result = predict_optimal_price(
    original_price=75.0,
    inventory_level=45,
    is_holiday=True,
    is_weekend=False,
    competitor_price=78.0
)
print(f"Predicted optimal price: ${result['predicted_price']:.2f}")
print(f"Expected revenue: ${result['revenue_estimate']:.2f}")
print(f"Model confidence: {result['confidence_score']:.3f}")
print(f"Demand forecast: {result['demand_forecast']:.1f} units")

# Create interactive pricing widget
import ipywidgets as widgets
from IPython.display import display
# Define widget controls
price_slider = widgets.FloatSlider(value=50.0, min=10.0, max=200.0, description='Price:')
inventory_slider = widgets.IntSlider(value=50, min=0, max=100, description='Inventory:')
holiday_checkbox = widgets.Checkbox(value=False, description='Holiday')
weekend_checkbox = widgets.Checkbox(value=False, description='Weekend')
segment_dropdown = widgets.Dropdown(
    options=['New', 'Regular', 'Loyal', 'High-Value'],
    value='Regular',
    description='Segment:'
)
# Create interactive function
pricing_widget = widgets.interactive(
    interactive_pricing,
    original_price=price_slider,
    inventory=inventory_slider,
    is_holiday=holiday_checkbox,
    is_weekend=weekend_checkbox,
    customer_segment=segment_dropdown
)
display(pricing_widget)

# Optimize prices for multiple products
products_to_optimize = [
    {'id': 'ELF_001', 'price': 25.0, 'inventory': 12, 'competitor': 27.0},
    {'id': 'ELF_002', 'price': 45.0, 'inventory': 8, 'competitor': 43.0},
    {'id': 'ELF_003', 'price': 75.0, 'inventory': 85, 'competitor': 72.0}
]
results = []
for product in products_to_optimize:
    result = predict_optimal_price(
        original_price=product['price'],
        inventory_level=product['inventory'],
        is_holiday=True,  # Holiday optimization
        is_weekend=False,
        competitor_price=product['competitor']
    )
    
    results.append({
        'product_id': product['id'],
        'original_price': product['price'],
        'optimal_price': result['predicted_price'],
        'revenue_lift': result['revenue_estimate'] - (product['price'] * result['demand_forecast']),
        'confidence': result['confidence_score']
    })
# Display results
for result in results:
    print(f"Product {result['product_id']}:")
    print(f"  Original: ${result['original_price']:.2f}")
    print(f"  Optimal: ${result['optimal_price']:.2f}")
    print(f"  Revenue Lift: ${result['revenue_lift']:.2f}")
    print(f"  Confidence: {result['confidence']:.3f}")
    print()

def hybrid_pricing_strategy(original_price, inventory_level, is_holiday, 
                          is_weekend, customer_segment, competitor_price):
    """
    Hybrid strategy combining rule-based and ML approaches
    """
    # Get rule-based price
    rule_price, rule_adjustments = rule_based_pricing(
        original_price, inventory_level, is_holiday, is_weekend, customer_segment
    )
    
    # Get ML prediction
    ml_result = predict_optimal_price(
        original_price, inventory_level, is_holiday, is_weekend, competitor_price
    )
    ml_price = ml_result['predicted_price']
    
    # Weighted combination (70% ML, 30% rules)
    final_price = 0.7 * ml_price + 0.3 * rule_price
    
    # Ensure reasonable bounds
    min_price = original_price * 0.7  # Never below 70% of original
    max_price = original_price * 1.5  # Never above 150% of original
    final_price = max(min_price, min(max_price, final_price))
    
    return {
        'final_price': round(final_price, 2),
        'rule_price': rule_price,
        'ml_price': ml_price,
        'adjustments': rule_adjustments,
        'confidence': ml_result['confidence_score']
    }
# Example usage
result = hybrid_pricing_strategy(
    original_price=60.0,
    inventory_level=15,
    is_holiday=True,
    is_weekend=True,
    customer_segment='Loyal',
    competitor_price=65.0
)
print(f"Hybrid pricing result: ${result['final_price']:.2f}")
print(f"Rule-based price: ${result['rule_price']:.2f}")
print(f"ML predicted price: ${result['ml_price']:.2f}")
print(f"Model confidence: {result['confidence']:.3f}")

- Data Generation: O(n) where n = number of transactions
- Rule-Based Pricing: O(1) constant time per prediction
- ML Model Training: O(n·f²) for the least-squares fit, where n = samples, f = features (effectively linear in n with only five features)
- ML Prediction: O(f) where f = number of features
 
- Dataset Storage: ~2.5MB for 25k transactions
 - ML Model: ~15MB in memory
 - Jupyter Environment: ~50MB for interactive widgets
 
- For Production Use:
  - Implement caching for frequently accessed prices
  - Use database storage instead of CSV files
  - Consider more sophisticated ML models (Random Forest, XGBoost)
  - Implement A/B testing framework for pricing strategies
- For Large Datasets:
  - Use chunked processing for datasets > 100k records
  - Implement data pipelines with Apache Airflow or similar
  - Consider distributed computing with Dask or Spark
- Real-time Pricing:
  - Pre-compute price tables for common scenarios
  - Use Redis or similar for fast price lookups
  - Implement circuit breakers for ML model failures
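As one illustration of the circuit-breaker recommendation, the sketch below stops calling a failing ML pricing function after a few consecutive errors and serves a fallback price instead. Everything here is hypothetical: the class name, the threshold of three failures, and the stand-in pricing functions. A production breaker would also need a timed recovery ("half-open") state, which is omitted.

```python
from typing import Callable, List

class PriceCircuitBreaker:
    """Serve a fallback price once the ML pricing call has failed too often."""

    def __init__(self, ml_price_fn: Callable[[float], float],
                 fallback_price_fn: Callable[[float], float],
                 max_failures: int = 3) -> None:
        self.ml_price_fn = ml_price_fn
        self.fallback_price_fn = fallback_price_fn
        self.max_failures = max_failures
        self.failures = 0  # consecutive failures seen so far

    @property
    def is_open(self) -> bool:
        """An open breaker means the ML function is no longer being called."""
        return self.failures >= self.max_failures

    def price(self, original_price: float) -> float:
        if self.is_open:
            return self.fallback_price_fn(original_price)
        try:
            result = self.ml_price_fn(original_price)
        except Exception:
            self.failures += 1
            return self.fallback_price_fn(original_price)
        self.failures = 0  # a success resets the counter
        return result

ml_calls: List[float] = []

def broken_model(original_price: float) -> float:
    """Stand-in for an ML service that is currently down."""
    ml_calls.append(original_price)
    raise RuntimeError("model unavailable")

def weekend_rule(original_price: float) -> float:
    """Stand-in fallback: the documented +10% weekend premium."""
    return round(original_price * 1.10, 2)

breaker = PriceCircuitBreaker(broken_model, weekend_rule, max_failures=3)
prices = [breaker.price(50.0) for _ in range(5)]
print(prices, len(ml_calls), breaker.is_open)
```

Every request still receives a price, and after the third failure the broken model is no longer invoked, which protects both latency and the failing service.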
 
 
This technical documentation provides comprehensive coverage of the Dynamic Pricing Strategies ML system, from low-level implementation details to high-level business concepts. It serves as both a reference guide for developers and an educational resource for understanding dynamic pricing methodologies.