
Real-Time AWS Data Processing Pipeline

This project demonstrates how to build a real-time data processing pipeline using AWS services. It's ideal for handling data from IoT sensors or any live data stream.

What This Project Does

This pipeline receives real-time sensor data, processes it, stores the raw data in Amazon S3, and saves a summary of the latest readings in DynamoDB.

It's a simplified version of what many companies use for real-time monitoring systems (e.g., smart homes, factory sensors).

AWS Services Used

   | Service        | Purpose                                                 |
   |----------------|---------------------------------------------------------|
   | Amazon Kinesis | Ingests real-time data (e.g., from IoT sensors)         |
   | AWS Lambda     | Processes each data record as it arrives                |
   | Amazon S3      | Stores raw data files (in JSON format)                  |
   | DynamoDB       | Stores the latest temperature reading per device        |
   | IAM            | Provides permissions for Lambda to access S3 & DynamoDB |

Project Structure

   real-time-pipeline/
   ├── lambda/
   │   └── process_stream.py          # Python code for processing data
   ├── iam/
   │   └── lambda_role_policy.json    # IAM permissions for the Lambda function
   ├── cloudformation/
   │   └── pipeline_stack.yaml        # CloudFormation template for the infrastructure
   └── README.md                      # Project documentation
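The exact contents of iam/lambda_role_policy.json aren't reproduced here, but a minimal policy along these lines would cover what the pipeline needs. The bucket name, table name, and Sids below are illustrative placeholders, not the repo's actual values:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadFromKinesis",
      "Effect": "Allow",
      "Action": [
        "kinesis:GetRecords",
        "kinesis:GetShardIterator",
        "kinesis:DescribeStream",
        "kinesis:ListStreams"
      ],
      "Resource": "*"
    },
    {
      "Sid": "WriteRawDataToS3",
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::raw-sensor-data-bucket/*"
    },
    {
      "Sid": "UpsertLatestReading",
      "Effect": "Allow",
      "Action": "dynamodb:PutItem",
      "Resource": "arn:aws:dynamodb:*:*:table/SensorReadings"
    },
    {
      "Sid": "WriteLogs",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }
  ]
}
```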

Sample Data (Sent to Kinesis)

```json
{
  "device_id": "sensor-123",
  "temperature": 27.5
}
```

This sample simulates a device sending temperature data.

How It Works

  1. Sensor or data source sends data to a Kinesis stream.

  2. Lambda function automatically triggers for every new record (see the handler sketch after this list):

    a) Decodes the record

    b) Saves raw JSON data to an S3 bucket

    c) Updates the latest temperature in DynamoDB

  3. You can view:

    a) Raw data in the S3 bucket

    b) Latest temperature per device in DynamoDB
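The actual handler lives in lambda/process_stream.py and isn't reproduced here; the following is a minimal sketch of steps 2a–2c, assuming placeholder resource names (raw-sensor-data-bucket, SensorReadings):

```python
# Minimal sketch of the stream processor (the role lambda/process_stream.py plays).
# Bucket and table names are illustrative placeholders, not the repo's actual ones.
import base64
import json
import time
from decimal import Decimal

import boto3

s3 = boto3.client("s3")
table = boto3.resource("dynamodb").Table("SensorReadings")  # assumed table name

BUCKET = "raw-sensor-data-bucket"  # assumed bucket name


def lambda_handler(event, context):
    for record in event["Records"]:
        # (a) Kinesis delivers each payload base64-encoded
        payload = base64.b64decode(record["kinesis"]["data"])
        # parse_float=Decimal because DynamoDB rejects Python floats
        reading = json.loads(payload, parse_float=Decimal)

        # (b) Save the raw JSON to S3, keyed by device and sequence number
        key = f"raw/{reading['device_id']}/{record['kinesis']['sequenceNumber']}.json"
        s3.put_object(Bucket=BUCKET, Key=key, Body=payload)

        # (c) Upsert the latest temperature for this device
        table.put_item(Item={
            "device_id": reading["device_id"],
            "temperature": reading["temperature"],
            "updated_at": int(time.time()),
        })

    return {"records_processed": len(event["Records"])}
```

Because put_item overwrites any existing item with the same device_id, the table always holds only the latest reading per device.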

Deployment Instructions

You must have an AWS account with the necessary permissions.

  1. Deploy the Infrastructure

    Use AWS CloudFormation with pipeline_stack.yaml (see the deployment sketch after this list) to create:

    a) Kinesis Stream

    b) S3 bucket

    c) DynamoDB table

    d) IAM role

    e) Lambda function
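    If you'd rather script this than use the console, a boto3 sketch like the following should work. The stack name is an assumption, and CAPABILITY_NAMED_IAM (or CAPABILITY_IAM) is required because the stack creates an IAM role:

```python
# Deploy cloudformation/pipeline_stack.yaml from the repo root.
import boto3

cfn = boto3.client("cloudformation")

with open("cloudformation/pipeline_stack.yaml") as f:
    template = f.read()

cfn.create_stack(
    StackName="real-time-pipeline",         # assumed stack name
    TemplateBody=template,
    Capabilities=["CAPABILITY_NAMED_IAM"],  # needed because the stack creates an IAM role
)
# Block until the stack finishes creating
cfn.get_waiter("stack_create_complete").wait(StackName="real-time-pipeline")
```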

  2. Upload the Lambda Code

    a) Zip the process_stream.py file and upload it to S3 (see the sketch below).
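    A minimal Python sketch of that packaging step, run from the repo root; the artifacts bucket name is a placeholder:

```python
# Zip the handler and upload the archive to S3 (run from the repo root).
import zipfile

import boto3

with zipfile.ZipFile("process_stream.zip", "w") as zf:
    # Store the file at the archive root so Lambda can find process_stream.lambda_handler
    zf.write("lambda/process_stream.py", arcname="process_stream.py")

boto3.client("s3").upload_file(
    "process_stream.zip",
    "my-lambda-artifacts-bucket",  # assumed artifacts bucket, not the repo's actual name
    "process_stream.zip",
)
```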

  3. Test It

    a) Use the AWS Kinesis console or the AWS CLI to send a test record like the one above; a boto3 example follows.
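    For example, from Python with boto3 (the stream name is an assumption; use whatever name the stack actually created):

```python
# Send the sample reading into the Kinesis stream.
import json

import boto3

kinesis = boto3.client("kinesis")
kinesis.put_record(
    StreamName="sensor-stream",  # assumed stream name
    Data=json.dumps({"device_id": "sensor-123", "temperature": 27.5}).encode("utf-8"),
    PartitionKey="sensor-123",   # keeps a given device's records on one shard
)
```

    After sending a record, check the S3 bucket for the raw JSON object and the DynamoDB table for the updated reading.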

Use Cases

1) IoT monitoring (temperature, pressure, location, etc.)

2) Real-time analytics dashboards

3) Fraud detection or alerting systems

4) Smart agriculture or factory systems
