Analysis and visualization of open-source police data from two areas, Leicestershire Street and Northumbria Street, to derive data-driven insights

SwethaJoseph/Crime-Pattern-Analysis-Project

Crime-Pattern-Analysis-Project

Overview

This project analyzes and visualizes open-source police data from two areas, Leicestershire Street and Northumbria Street, for March 2021. Apache Spark SQL is used for data cleansing, configuration, and pre-processing. The results are visualized with graphs and charts that depict crime patterns and their impact on public safety.

Key Technologies

  • Apache Spark SQL: Used for data processing and querying.
  • Python (PySpark, matplotlib, pandas): For data manipulation and visualization.
  • Jupyter Notebook: The environment for running and documenting the analysis.

Datasets

  • Leicestershire Street Data: Contains crime records for March 2021.
  • Northumbria Street Data: Contains crime records for March 2021.

Both datasets are sourced from data.police.uk.
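
As a rough sketch of how one of these files might be loaded, the snippet below reads a street-level CSV with pandas and normalises the column names. The column layout and the two sample rows are assumptions made for illustration (the rows are invented, not taken from the real data), and `load_street_csv` is a hypothetical helper, not a function from this repository. The project itself uses Spark SQL; pandas is used here only to keep the example small.

```python
import io

import pandas as pd

# Two invented rows in the column layout assumed for a street-level file.
SAMPLE_CSV = (
    "Crime ID,Month,Reported by,Falls within,Longitude,Latitude,Location,"
    "LSOA code,LSOA name,Crime type,Last outcome category,Context\n"
    "abc123,2021-03,Leicestershire Police,Leicestershire Police,-1.13,52.63,"
    "On or near High Street,E01000001,Leicester 001A,"
    "Violence and sexual offences,Under investigation,\n"
    ",2021-03,Northumbria Police,Northumbria Police,-1.61,54.97,"
    "On or near Park Road,E01000002,Newcastle 001A,Anti-social behaviour,,\n"
)


def load_street_csv(source):
    """Read one street-level CSV and convert column names to snake_case."""
    df = pd.read_csv(source, dtype=str)  # keep everything as text at load time
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    return df


# In practice `source` would be a file path; a string buffer stands in here.
crimes = load_street_csv(io.StringIO(SAMPLE_CSV))
print(crimes[["month", "falls_within", "crime_type"]])
```

Reading every column as text avoids silent type coercion at load time; numeric columns such as longitude and latitude can be converted later, once missing values have been handled.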

Analysis Steps

  • Environment Setup: Installation and configuration of Jupyter Notebook and necessary Python libraries.
  • Data Cleaning and Transformation: Removing or rectifying incorrect, inaccurate, or missing data, and transforming data into suitable formats.
  • Exploratory Data Analysis: Using SQL queries and Python functions to gain insights into crime patterns.
  • Visualization: Creating bar charts, pie charts, and maps to pictorially represent the data.
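
The cleaning and exploratory steps above can be sketched roughly as follows. This is a minimal pandas illustration, not the project's Spark SQL code: the records are invented, and the column names, the "Not recorded" placeholder, and the helpers `clean` and `crime_type_counts` are all assumptions.

```python
import pandas as pd

# Invented records for illustration only; not taken from the real datasets.
raw = pd.DataFrame(
    {
        "falls_within": [
            "Leicestershire Police",
            "Leicestershire Police",
            "Leicestershire Police",
            "Northumbria Police",
            "Northumbria Police",
            "Northumbria Police",
        ],
        "crime_type": [
            "Violence and sexual offences",
            "Violence and sexual offences",
            None,  # a record with no crime type
            "Anti-social behaviour",
            "Anti-social behaviour",
            "Burglary",
        ],
        "last_outcome_category": [
            "Under investigation",
            "Under investigation",
            "Under investigation",
            None,
            None,
            "Investigation complete; no suspect identified",
        ],
    }
)


def clean(df):
    """Drop rows without a crime type and label missing outcomes explicitly."""
    out = df.dropna(subset=["crime_type"]).copy()
    out["last_outcome_category"] = out["last_outcome_category"].fillna("Not recorded")
    return out


def crime_type_counts(df):
    """Count records per force and crime type, largest counts first per force."""
    return (
        df.groupby(["falls_within", "crime_type"])
        .size()
        .rename("n_crimes")
        .reset_index()
        .sort_values(["falls_within", "n_crimes"], ascending=[True, False])
    )


cleaned = clean(raw)
counts = crime_type_counts(cleaned)
print(counts)
```

A table shaped like `counts` is a convenient input for the bar and pie charts mentioned in the visualization step, for example through pandas' `DataFrame.plot.bar`, which relies on matplotlib.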

Key Crime Insights

  • Crime Types: Leicestershire records more "Violence and sexual offences", while Northumbria records more "Anti-social behaviour".
  • Geographic Influence: Crime rates and types vary significantly by location.
  • Investigation Outcomes: Many Leicestershire cases remain unresolved, while Northumbria cases often end with no suspect identified.
  • Population Density: Northumbria has higher "Anti-social behaviour" rates despite lower population density.
  • Data Gaps: Missing data affects the completeness of the analysis.
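
One way to quantify the data gaps noted above is to compute the share of missing values in each column for each force. The sketch below uses pandas with invented records; the column names and the `missing_share` helper are assumptions for illustration, not part of the project's code.

```python
import pandas as pd

# Invented records for illustration only.
records = pd.DataFrame(
    {
        "falls_within": [
            "Leicestershire Police",
            "Leicestershire Police",
            "Northumbria Police",
            "Northumbria Police",
        ],
        "crime_id": ["a1", "a2", None, "b1"],
        "last_outcome_category": [
            "Under investigation",
            None,
            None,
            "Unable to prosecute suspect",
        ],
    }
)


def missing_share(df, by):
    """Fraction of missing values in every other column, per value of `by`."""
    is_missing = df.drop(columns=[by]).isna()
    # The mean of a boolean column is the fraction of True values.
    return is_missing.groupby(df[by]).mean()


gaps = missing_share(records, by="falls_within")
print(gaps)
```

Reporting these shares alongside any comparison between the two areas makes it clearer how much each insight depends on incomplete records.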
