A deep learning model that predicts US states from street view images with 76.66% panorama accuracy. This project demonstrates advanced computer vision techniques applied to geographical feature recognition.
GeoStateNet is a fine-tuned ResNet-101 model with a custom head, trained on the 50States10K dataset to classify which US state a street view image was taken in. Developed as my senior project at the University of Redlands, this project explores how deep learning can recognize subtle geographical patterns in visual data.
- 76.66% panorama accuracy - Significantly outperforming the original DeepGeo baseline (38.32%)
- Real-time inference - Integrated with the GeoGuessr game via a Chrome extension
- Comprehensive training pipeline - Multi-phase training with Weights & Biases integration
- Practical implementation - Includes an API server and browser extension for real-world use
| Model | Panorama Accuracy | Single Image Accuracy |
|---|---|---|
| DeepGeo (baseline) | 38.32% | 25.92% |
| GeoStateNet | 76.66% | 60.82% |
- Final validation loss: 1.34
- Best epoch: 4/5
- Training conducted on Google Colab with NVIDIA A100 GPU
Details on training runs are available upon request
# Clone the repository
git clone https://github.com/dcrew44/GeoStateNet.git
cd GeoStateNet
# Install the package
pip install -e .
# Download the trained model checkpoint (76% accuracy)
mkdir -p checkpoints
wget https://github.com/dcrew44/GeoStateNet/releases/download/v1.0/best_model.pth -O checkpoints/best_model.pth
# Quick inference on a single image
python predict.py street_view.jpg
# Use a specific checkpoint
python predict.py street_view.jpg --checkpoint path/to/model.pth
# Show top 10 predictions
python predict.py street_view.jpg --top-k 10
This project includes a complete ecosystem for real-time state prediction in GeoGuessr:
- GeoStateNet-API - FastAPI server for model inference
- GeoStateNet-Extension - Chrome extension for GeoGuessr integration
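As a rough sketch of how a client might talk to the API server (the `/predict` route, port, and response fields below are assumptions for illustration, not the documented API of GeoStateNet-API):

```python
# Hypothetical client for a locally running GeoStateNet-API server.
# The route, port, and response fields are assumptions, not the documented API.
import requests

API_URL = "http://localhost:8000/predict"  # assumed FastAPI default port + route

with open("street_view.jpg", "rb") as f:
    response = requests.post(API_URL, files={"file": f})
response.raise_for_status()

# Assumed response shape: {"predictions": [{"state": ..., "probability": ...}, ...]}
for pred in response.json()["predictions"][:5]:
    print(f"{pred['state']}: {pred['probability']:.1%}")
```

The Chrome extension forwards the current GeoGuessr view to the same server and displays the returned state probabilities in-game.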
This project uses the 50States10K and 50States2K datasets from the DeepGeo paper:
- Training: 500K images (2.5K locations × 4 views × 50 states)
- Testing: 100K images (500 locations × 4 views × 50 states)
- Resolution: 256×256 pixels
- Coverage: All 50 US states with stratified sampling
- Download the 50States10K and 50States2K datasets from the DeepGeo project
- Extract them to the `data/` directory following the structure in the repository
License note: The 50States10K/2K datasets are released by the DeepGeo authors without an explicit license. Redistribution here is limited to download links; users should cite the original paper when using the data.
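The exact folder layout is documented in the repository. As a minimal loading sketch, assuming a standard ImageFolder-style layout with one sub-directory per state (the `data/train` path and transforms below are illustrative, not the project's actual training pipeline):

```python
# Minimal loading sketch, assuming an ImageFolder-style layout
# (data/train/<state_name>/*.jpg). Paths and transforms are illustrative;
# the project's actual pipeline lives in the repository.
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

train_tfms = transforms.Compose([
    transforms.CenterCrop(224),                       # crop the 256x256 images to the 224x224 input size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

train_ds = datasets.ImageFolder("data/train", transform=train_tfms)
train_loader = DataLoader(train_ds, batch_size=256, shuffle=True, num_workers=4)
print(f"{len(train_ds)} images across {len(train_ds.classes)} states")
```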
GeoStateNet combines a pretrained backbone with a multi-phase fine-tuning strategy:
- Base Model: ResNet-101 pretrained on ImageNet
- Custom Head: AdaptiveConcatPool2d followed by fully connected layers with dropout (sketched below)
- Multi-Phase Training:
- Phase 1: Head-only training
- Phase 2: Unfreeze Layer 4
- Phase 3: Unfreeze Layers 2-4
- Panorama Aggregation during Inference: Averages predictions across 4 cardinal directions
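A minimal sketch of this architecture and the panorama aggregation in plain PyTorch. The hidden layer sizes and dropout rates are assumptions; only the ResNet-101 backbone, AdaptiveConcatPool2d head, 50-class output, and 4-view averaging come from the description above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class AdaptiveConcatPool2d(nn.Module):
    """Concatenate adaptive average and max pooling (fastai-style)."""
    def forward(self, x):
        return torch.cat([F.adaptive_avg_pool2d(x, 1),
                          F.adaptive_max_pool2d(x, 1)], dim=1)

backbone = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V2)
features = nn.Sequential(*list(backbone.children())[:-2])  # drop avgpool + fc

head = nn.Sequential(
    AdaptiveConcatPool2d(),
    nn.Flatten(),
    nn.BatchNorm1d(4096),           # 2048 avg-pooled + 2048 max-pooled features
    nn.Dropout(0.25),               # dropout rates are assumptions
    nn.Linear(4096, 512),
    nn.ReLU(inplace=True),
    nn.BatchNorm1d(512),
    nn.Dropout(0.5),
    nn.Linear(512, 50),             # one logit per US state
)
model = nn.Sequential(features, head)

def predict_panorama(model, views):
    """Average softmax probabilities over the 4 cardinal views of one location."""
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(views), dim=1)  # views: (4, 3, H, W)
    return probs.mean(dim=0)                        # (50,) state probabilities
```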
This project takes inspiration and implements techniques from several key works:
- Suresh, Chodosh, Abello (2018). DeepGeo: Photo Localization with Deep Neural Network.
  - Provided the foundational dataset and baseline results
- Victor De Fontnouvelle. GeoGuessrBot: Predicting the Location of Any Street View Image.
  - Demonstrated the effectiveness of averaging predictions across panorama views
  - This technique was crucial for reaching 76% accuracy on the test dataset
- Haas, Skreta, Alberti, Finn (2024). PIGEON: Predicting Image Geolocations.
  - Current state of the art in image geolocation
  - Inspired the Chrome extension for playing GeoGuessr
- Jeremy Howard and Sylvain Gugger. FastAI.
  - Training best practices implemented from the fastai library (sketched below) include:
    - AdaptiveConcatPool2d implementation
    - Discriminative learning rates
    - One-cycle learning rate scheduling
    - Progressive unfreezing strategy
    - Extensive augmentation pipeline
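As a rough illustration of how these fastai-inspired techniques look in plain PyTorch: the per-layer learning rates here are assumptions, while the peak head rate (0.004), weight decay, and label smoothing match the configuration listed further below.

```python
# Sketch of discriminative learning rates, one-cycle scheduling, and progressive
# unfreezing in plain PyTorch. Per-layer rates are assumptions; the peak head rate,
# weight decay, and label smoothing match the training configuration below.
import torch
import torch.nn as nn
from torchvision import models

net = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V2)
net.fc = nn.Linear(net.fc.in_features, 50)   # stand-in 50-state head for brevity

# Phase 1: freeze the backbone and train only the head.
for p in net.parameters():
    p.requires_grad = False
for p in net.fc.parameters():
    p.requires_grad = True

# Phase 3: unfreeze layers 2-4, keeping lower rates for earlier (more generic) layers.
for layer in (net.layer2, net.layer3, net.layer4):
    for p in layer.parameters():
        p.requires_grad = True

param_groups = [
    {"params": net.layer2.parameters(), "lr": 4e-4},   # assumed
    {"params": net.layer3.parameters(), "lr": 1e-3},   # assumed
    {"params": net.layer4.parameters(), "lr": 2e-3},   # assumed
    {"params": net.fc.parameters(),     "lr": 4e-3},   # Phase 3 peak rate (0.004)
]
optimizer = torch.optim.AdamW(param_groups, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=[g["lr"] for g in param_groups],
    total_steps=1000,            # normally epochs * steps_per_epoch
)
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
```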
- PyTorch and torchvision for model implementation
- Weights & Biases for experiment tracking
- Google Colab for GPU training resources
# Key hyperparameters that achieved 76% accuracy
optimizer: AdamW
batch_size: 256
learning_rates:
- Phase 1: 0.01 (head only)
- Phase 3: 0.004 (unfreeze layer2-4)
weight_decay: 0.01
label_smoothing: 0.1
# Local training (requires GPU)
python -m state_classifier.main --config config.yaml
# For Google Colab, see colab_training_example.ipynb
All experiments are tracked via Weights & Biases. Key metrics include:
- Per-state accuracy breakdown
- Confusion matrices
- Training/validation curves
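A minimal logging sketch of the kind used for these metrics (the project name and metric keys are placeholders, not the exact names in the tracked runs):

```python
# Minimal W&B logging sketch; the project name and metric keys are placeholders.
import wandb

run = wandb.init(project="geostatenet", config={"batch_size": 256, "optimizer": "AdamW"})

def log_epoch(epoch, train_loss, val_loss, val_acc, y_true, y_pred, state_names):
    """Log scalar curves plus a per-state confusion matrix panel."""
    wandb.log({
        "epoch": epoch,
        "train/loss": train_loss,
        "val/loss": val_loss,
        "val/accuracy": val_acc,
        "val/confusion_matrix": wandb.plot.confusion_matrix(
            y_true=y_true, preds=y_pred, class_names=state_names
        ),
    })
```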
- Multi-view transformer that learns cross-directional context
- Fine-grained geocells (≈12 km) using contrastive pre-training
- Per-state calibration plots & model card for bias analysis
This project is licensed under the MIT License - see the LICENSE file for details.
Special thanks to:
- My advisor at the University of Redlands, Professor Rick Cornez
- The authors of all the works that inspired me
- The authors of the DeepGeo paper for making their dataset publicly available
- The open-source community for invaluable tools and inspiration
- Professor Joanna Bieri who inspired me to pursue machine learning
If you found this project interesting, please consider giving it a ⭐!