This repository contains Python scripts for extracting Sentinel-2 satellite data from the Copernicus Data Space Ecosystem. The scripts provide functionality for vegetation indices extraction, water mask generation, true color imagery, and time series analysis.
Disclaimer: These are unofficial scripts created for personal research and common use cases. They are inspired by the official CDSE use cases but are not affiliated with or endorsed by the Copernicus program or ESA.
ℹ️ Note: The Copernicus logo displayed above is used for informational purposes only to indicate compatibility with Copernicus data services. This repository is not affiliated with the official Copernicus program.
The Copernicus program is the European Union's Earth observation program, providing free and open access to satellite data and services. The Copernicus Data Space Ecosystem (CDSE) is the official platform for accessing Copernicus satellite data, including Sentinel-2 imagery.
Sentinel-2 is a European optical imaging mission that provides high-resolution imagery of Earth's land surface, supporting applications in agriculture, forestry, land use change, emergency response, and humanitarian relief operations.
The scripts utilize the Sentinel Hub API to access Sentinel-2 Level-2A data through the CDSE platform. They are designed for reproducible research workflows and can be easily configured for different study areas and time periods.
copernicus-data-utilis-scripts/
├── README.md # Main documentation
├── QUICKSTART.md # Quick start guide
├── .gitignore # Excludes configs and outputs
├── requirements.txt # Python dependencies
├── example_area.geojson # Sample study area
├── configs/ # Configuration files
│ ├── example_config.json # Example configuration
│ └── user_config.json # User credentials (gitignored)
├── extractors/ # Core extraction classes
│ ├── __init__.py # Package initialization
│ ├── vegetation_indices_extractor.py # Multi-index extraction
│ ├── water_mask_truecolor_extractor.py # Water & true color
│ ├── timeseries_extractor.py # Time series analysis
│ ├── visualization_tools.py # Visualization utilities
│ └── openeo_extractor.py # OpenEO API integration
├── scripts/ # Example and utility scripts
│ ├── setup_config.py # Configuration setup
│ ├── test_connection.py # Connection testing
│ ├── example_vegetation_indices.py # Vegetation indices example
│ ├── example_water_mask_truecolor.py # Water mask example
│ ├── example_visualization.py # Visualization example
│ ├── example_openeo_lake_monitoring.py # OpenEO lake monitoring example
│ ├── example_openeo_visualization.py # OpenEO data visualization
│ ├── example_openeo_simple.py # Simple OpenEO data loading
│ └── example_openeo_timeseries.py # OpenEO NDVI time series analysis
└── outputs/ # Generated data and visualizations
- Vegetation Indices Extraction: NDVI, EVI, MNDWI, SAVI, BSI, and NDMI
- Water Mask Generation: Automated water body detection and classification
- True Color Imagery: Enhanced RGB satellite images with cloud masking
- Time Series Analysis: Daily observations with statistical aggregation
- GeoTIFF Output: All raster outputs in standard GIS format with metadata
- Configurable Parameters: Easy-to-modify configuration files for reproducibility
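The indices listed above follow their commonly published definitions. The sketch below writes them out for single reflectance values, assuming the usual Sentinel-2 band roles (B02 blue, B03 green, B04 red, B08 near-infrared, B11 shortwave infrared). The function names are illustrative and are not taken from the extractors in this repository, which may use different coefficients or band choices.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else float("nan")


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def mndwi(green: float, swir1: float) -> float:
    """Modified Normalized Difference Water Index."""
    return normalized_difference(green, swir1)


def ndmi(nir: float, swir1: float) -> float:
    """Normalized Difference Moisture Index."""
    return normalized_difference(nir, swir1)


def savi(nir: float, red: float, soil_factor: float = 0.5) -> float:
    """Soil Adjusted Vegetation Index; 0.5 is the commonly used soil factor."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def evi(nir: float, red: float, blue: float) -> float:
    """Enhanced Vegetation Index with the standard coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def bsi(swir1: float, red: float, nir: float, blue: float) -> float:
    """Bare Soil Index."""
    return normalized_difference(swir1 + red, nir + blue)
```

The same arithmetic applies element-wise to raster arrays, although the zero-denominator guard would then need an array-aware replacement.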
This repository provides two complementary approaches for data extraction:
- Sentinel Hub Processing API (Primary)
  - Direct access to CDSE endpoints
  - Custom evalscripts for specialized processing
  - Real-time on-demand processing
  - Fine-grained control over data operations
- OpenEO API (Alternative)
  - Standardized workflows across providers
  - Multi-mission support (Sentinel-2, Sentinel-3, Landsat, etc.)
  - Integration with OpenEO ecosystem tools
  - Batch processing and job management
These scripts are inspired by common research workflows and use cases documented in the official CDSE documentation. They address typical requirements for:
- Agricultural Monitoring: Vegetation health assessment and crop development tracking
- Environmental Research: Water body mapping and land cover change detection
- Climate Studies: Time series analysis of environmental indicators
- GIS Applications: Standardized data formats for integration with existing workflows
The scripts follow CDSE best practices while providing additional functionality for research and analysis purposes.
The OpenEO approach specifically addresses the need for standardized workflows and multi-provider support, as demonstrated in the official CDSE use case examples for land monitoring, agriculture, and disaster monitoring applications.
- Python 3.8 or higher
- Sentinel Hub account with CDSE access
- OpenEO account for alternative API access
- Required Python packages (see requirements.txt)
- Clone this repository:
git clone <repository-url>
cd copernicus-data-utilis-scripts
- Install required packages:
pip install -r requirements.txt
- Set up your Sentinel Hub credentials:
  - Copy configs/example_config.json to configs/my_config.json
  - Update the client_id and client_secret with your credentials
  - Keep your config files private (they are excluded from git)
The scripts use JSON configuration files for easy parameter management. Each script can be configured independently or use a shared configuration.
{
"sentinel_hub": {
"client_id": "your_client_id_here",
"client_secret": "your_client_secret_here"
},
"extraction": {
"geometry_path": "example_area.geojson",
"output_base_dir": "./outputs",
"resolution": 10,
"max_cloud_coverage": 30
},
"time_ranges": {
"vegetation_indices": {
"start_date": "2023-07-01",
"end_date": "2024-01-31"
}
}
}
- Resolution defaults to 10 meters, the native resolution of the Sentinel-2 visible and near-infrared bands; after cloning, it can be adjusted in the extractors
- The default time range spans 7 months to ensure data availability and can also be adjusted
- Cloud coverage threshold is set to 30% to balance data quality and quantity
- All outputs are in GeoTIFF format with WGS84 coordinate system
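As a rough illustration of how a configuration file with the structure shown above could be read and sanity-checked before any request is made, here is a minimal sketch using only the standard library. The names `load_config`, `validate_config`, and `REQUIRED_KEYS` are hypothetical; the repository's own loading code may work differently.

```python
import json
from datetime import date

# Sections and keys taken from the example configuration above.
REQUIRED_KEYS = {
    "sentinel_hub": ("client_id", "client_secret"),
    "extraction": (
        "geometry_path",
        "output_base_dir",
        "resolution",
        "max_cloud_coverage",
    ),
}


def validate_config(config: dict) -> dict:
    """Raise KeyError or ValueError if the config is incomplete or inconsistent."""
    for section, keys in REQUIRED_KEYS.items():
        if section not in config:
            raise KeyError(f"missing section: {section}")
        missing = [key for key in keys if key not in config[section]]
        if missing:
            raise KeyError(f"missing keys in {section}: {missing}")

    cloud = config["extraction"]["max_cloud_coverage"]
    if not 0 <= cloud <= 100:
        raise ValueError("max_cloud_coverage must be a percentage between 0 and 100")

    # Every configured time range must be a valid, ordered pair of ISO dates.
    for name, period in config.get("time_ranges", {}).items():
        start = date.fromisoformat(period["start_date"])
        end = date.fromisoformat(period["end_date"])
        if start > end:
            raise ValueError(f"time range '{name}' ends before it starts")
    return config


def load_config(path: str) -> dict:
    """Read a JSON config file and validate it."""
    with open(path, encoding="utf-8") as handle:
        return validate_config(json.load(handle))
```

Calling `load_config("configs/my_config.json")` would then return the parsed dictionary or fail early with a descriptive error.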
Extract multiple vegetation and water indices for a specified area and time period:
python scripts/example_vegetation_indices.py
This script generates separate GeoTIFF files for each index with full geospatial metadata.
Generate water classification masks and true color satellite imagery:
python scripts/example_water_mask_truecolor.py
Outputs include water classification (blue=water, green=vegetation, red=other) and enhanced RGB imagery.
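The color coding described above can be pictured as a simple per-pixel decision rule. The sketch below is one plausible way to produce it from MNDWI and NDVI values; the thresholds (0.0 for water, 0.3 for vegetation) and the function name `classify_pixel` are assumptions chosen for illustration, not values confirmed by the extractor code.

```python
WATER_RGB = (0, 0, 255)       # blue
VEGETATION_RGB = (0, 255, 0)  # green
OTHER_RGB = (255, 0, 0)       # red


def classify_pixel(
    mndwi: float,
    ndvi: float,
    water_threshold: float = 0.0,       # assumed threshold
    vegetation_threshold: float = 0.3,  # assumed threshold
) -> tuple:
    """Map index values to the blue/green/red class colors."""
    # Water is tested first so that wet vegetation is not counted twice.
    if mndwi > water_threshold:
        return WATER_RGB
    if ndvi > vegetation_threshold:
        return VEGETATION_RGB
    return OTHER_RGB
```

Suitable thresholds depend on the scene, so they are exposed as parameters rather than hard-coded.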
Extract NDVI time series data using OpenEO API:
python scripts/example_openeo_timeseries.py
Creates a CSV file with NDVI observations and generates comprehensive visualizations.
Generate comprehensive visualizations of all extracted data:
python scripts/example_visualization.py
Produces static plots, interactive maps, and HTML reports for analysis and presentation.
Extract water monitoring data using Sentinel-3 SLSTR via OpenEO:
python scripts/example_openeo_lake_monitoring.py
This demonstrates the OpenEO approach with:
- Sentinel-3 SLSTR Level 2 data for water monitoring
- Lake Albert case study (Uganda/DRC border)
- Standardized OpenEO workflows for multi-mission support
- Water indices (NDWI, MNDWI) for lake delineation
Download and visualize OpenEO data with comprehensive plotting:
python scripts/example_openeo_visualization.py
This script demonstrates:
- Real data download from CDSE OpenEO backend
- Multiple output formats: NetCDF (scientific) + GeoTIFF (GIS)
- Temporal processing: Max and mean compositing
- Professional visualization: RGB + individual band plots
- Multi-band analysis: Red, Green, Blue, NIR, and SCL bands
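To make the temporal compositing step more concrete, the sketch below builds an openEO process graph as a plain dictionary: load a Sentinel-2 L2A cube, reduce the time dimension with `mean` or `max`, and save the result. It assumes the CDSE collection ID `SENTINEL2_L2A` and the time dimension name `t`; the function name and the bounding box in the usage lines are hypothetical, and the example script itself may build its workflow through the openeo Python client instead.

```python
import json


def build_composite_graph(west, south, east, north, start, end, bands,
                          reducer="mean", out_format="GTiff"):
    """Return an openEO process graph for a temporal composite."""
    if reducer not in ("mean", "max"):
        raise ValueError("reducer must be 'mean' or 'max'")
    return {
        "load": {
            "process_id": "load_collection",
            "arguments": {
                "id": "SENTINEL2_L2A",  # assumed CDSE collection ID
                "spatial_extent": {
                    "west": west, "south": south, "east": east, "north": north,
                },
                "temporal_extent": [start, end],
                "bands": list(bands),
            },
        },
        "composite": {
            "process_id": "reduce_dimension",
            "arguments": {
                "data": {"from_node": "load"},
                "dimension": "t",  # assumed name of the time dimension
                "reducer": {
                    "process_graph": {
                        reducer: {
                            "process_id": reducer,
                            "arguments": {"data": {"from_parameter": "data"}},
                            "result": True,
                        }
                    }
                },
            },
        },
        "save": {
            "process_id": "save_result",
            "arguments": {"data": {"from_node": "composite"}, "format": out_format},
            "result": True,
        },
    }


# Hypothetical bounding box and dates, used only to show the call.
graph = build_composite_graph(30.9, 1.4, 31.0, 1.5, "2023-07-01", "2023-07-31",
                              ["B04", "B03", "B02", "B08"], reducer="max")
print(json.dumps(graph, indent=2))
```

Passing `out_format="netCDF"` would request the NetCDF output mentioned above instead of GeoTIFF.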
Streamlined NDVI time series analysis using OpenEO workflows:
python scripts/example_openeo_timeseries.py
This script demonstrates:
- NDVI time series extraction from Sentinel-2 L2A data
- Focused study area (Yellowstone National Park)
- Automated data sorting and visualization
- Statistical analysis and trend detection
- Professional plotting with publication-ready outputs
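The statistical analysis and trend detection steps can be sketched with the standard library alone. The example below assumes the time series CSV has `date` and `ndvi` columns, which is a guess at the layout rather than the script's documented format, and the sample values are synthetic.

```python
import csv
import io
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_means(rows):
    """Average NDVI per (year, month), in chronological order."""
    buckets = defaultdict(list)
    for row in rows:
        day = date.fromisoformat(row["date"])
        buckets[(day.year, day.month)].append(float(row["ndvi"]))
    return {month: mean(values) for month, values in sorted(buckets.items())}


def linear_trend(rows):
    """Least-squares slope of NDVI against time, in NDVI units per day."""
    days = [date.fromisoformat(row["date"]).toordinal() for row in rows]
    values = [float(row["ndvi"]) for row in rows]
    x_mean, y_mean = mean(days), mean(values)
    spread = sum((x - x_mean) ** 2 for x in days)
    if spread == 0:
        raise ValueError("need observations on at least two different dates")
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(days, values)) / spread


# Synthetic observations, for illustration only.
SAMPLE_CSV = """date,ndvi
2023-07-01,0.50
2023-07-02,0.52
2023-07-03,0.54
2023-08-01,0.60
"""

sample_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(monthly_means(sample_rows))
print(linear_trend(sample_rows))
```

A positive slope indicates greening over the period; for real data, a seasonal cycle would usually need to be removed before interpreting the trend.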
All outputs are organized in the following structure:
outputs/
├── indices/ # Vegetation indices GeoTIFF files
├── water_mask.tif # Water classification mask
├── truecolor.tif # True color satellite image
├── daily_timeseries.csv # Time series data
└── visualizations/ # Generated plots and reports
The scripts implement several quality control measures:
- Cloud masking using Sentinel-2 Scene Classification Layer (SCL)
- Automatic handling of missing or invalid data
- Statistical aggregation for time series data
- Metadata preservation for reproducibility
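As an illustration of SCL-based cloud masking, the sketch below sets pixels to NaN wherever the Scene Classification Layer marks them as no data, defective, cloud shadow, cloud, or thin cirrus. The class codes are the standard Sentinel-2 L2A values; which classes the extractors actually exclude is an assumption, and `mask_with_scl` is a hypothetical helper name. NumPy is assumed to be installed.

```python
import numpy as np

# SCL codes treated as invalid in this sketch:
# 0 no data, 1 saturated or defective, 3 cloud shadows,
# 8 cloud medium probability, 9 cloud high probability, 10 thin cirrus.
INVALID_SCL_CLASSES = (0, 1, 3, 8, 9, 10)


def mask_with_scl(values, scl, invalid_classes=INVALID_SCL_CLASSES):
    """Return a float32 copy of `values` with invalid pixels set to NaN."""
    values = np.asarray(values, dtype="float32")
    scl = np.asarray(scl)
    if values.shape != scl.shape:
        raise ValueError("values and scl must have the same shape")
    return np.where(np.isin(scl, invalid_classes), np.nan, values)
```

NaN-aware reductions such as `np.nanmean` can then aggregate the remaining valid pixels.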
- API: Sentinel Hub REST API with CDSE endpoints
- Data Source: Sentinel-2 Level-2A (atmospherically corrected)
- Coordinate System: WGS84 (EPSG:4326)
- File Format: GeoTIFF with embedded metadata
- Cloud Masking: SCL band-based filtering
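For orientation, a request body for the Sentinel Hub Process API on CDSE might be assembled as in the sketch below. It only builds the JSON payload and does not send it. The endpoint URL and field names follow the public Process API, while `build_process_request`, the NDVI evalscript, and the parameter defaults are illustrative assumptions; the extractors in this repository may construct their requests differently. The bounding box uses CRS84, which is WGS84 with longitude/latitude axis order.

```python
import json

# CDSE Sentinel Hub Process API endpoint (the request is not sent here).
PROCESS_URL = "https://sh.dataspace.copernicus.eu/api/v1/process"

# Illustrative evalscript (JavaScript, executed server-side) returning NDVI.
NDVI_EVALSCRIPT = """//VERSION=3
function setup() {
  return {
    input: ["B04", "B08"],
    output: { bands: 1, sampleType: "FLOAT32" }
  };
}
function evaluatePixel(sample) {
  return [(sample.B08 - sample.B04) / (sample.B08 + sample.B04)];
}
"""


def build_process_request(bbox, start_date, end_date, width, height,
                          max_cloud_coverage=30, evalscript=NDVI_EVALSCRIPT):
    """Build a Process API payload for a GeoTIFF response.

    bbox is (min_lon, min_lat, max_lon, max_lat); dates are YYYY-MM-DD strings.
    """
    return {
        "input": {
            "bounds": {
                "bbox": list(bbox),
                "properties": {
                    "crs": "http://www.opengis.net/def/crs/OGC/1.3/CRS84"
                },
            },
            "data": [
                {
                    "type": "sentinel-2-l2a",
                    "dataFilter": {
                        "timeRange": {
                            "from": f"{start_date}T00:00:00Z",
                            "to": f"{end_date}T23:59:59Z",
                        },
                        "maxCloudCoverage": max_cloud_coverage,
                    },
                }
            ],
        },
        "output": {
            "width": width,
            "height": height,
            "responses": [
                {"identifier": "default", "format": {"type": "image/tiff"}}
            ],
        },
        "evalscript": evalscript,
    }
```

The resulting dictionary would be serialized with `json.dumps` and sent as an authenticated POST to `PROCESS_URL`, using a bearer token obtained with the client credentials from the configuration file.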
- API: OpenEO REST API with CDSE backend
- Data Sources: Multi-mission support (Sentinel-2, Sentinel-3, Landsat, etc.)
- Processing: Standardized OpenEO workflows and processes
- Authentication: OIDC-based authentication
- Batch Processing: Asynchronous job management
- Data Download: NetCDF and GeoTIFF export capabilities
- Visualization: RGB composites and multi-band analysis
When contributing to this repository:
- Follow the existing code style
- Update configuration examples if needed
- Test scripts with different parameters
- Document any new features or changes
- Create a pull request and I will take a look!
This project is licensed under the MIT License. See the LICENSE file for details.
- Copernicus Data Space Ecosystem for satellite data access
- Sentinel Hub for API infrastructure
- European Space Agency for Sentinel-2 mission data