This module provides S3 compatibility for the Data Universe subnet, allowing miners to upload data directly to S3 storage instead of (or alongside) HuggingFace. The implementation includes blockchain-based authentication and folder-based access for efficient uploads and downloads.
- Folder-based access using coldkey structure for miner isolation
- Strong security with policy-based restrictions and signature verification
- Blockchain authentication for miners and validators
- Policy-based uploads allowing efficient multi-file operations
- Comprehensive validator access to all miner data
- Rate limiting to prevent abuse
- Compatible with HuggingFace for smooth migration
- FastAPI implementation with automatic documentation
- Docker support for easy deployment
data/
├── x/ # X/Twitter data
│ ├── <miner_coldkey_1>/
│ │ ├── file1.parquet
│ │ ├── file2.parquet
│ │ └── ...
│ └── <miner_coldkey_2>/
│ ├── file1.parquet
│ └── ...
└── reddit/ # Reddit data
├── <miner_coldkey_1>/
│ ├── file1.parquet
│ └── ...
└── <miner_coldkey_2>/
└── ...
The security model ensures proper isolation between miners while giving validators complete access:
- Miner Isolation: Each miner can only access their own folder (
data/<source>/<coldkey>/
) - Policy Restrictions: Upload policies restrict operations to specific folders
- Size Limits: Files must be between 1KB and 5GB per upload
- Signature Verification: All requests require valid blockchain signatures
- Time-Limited Access: Access expires after 24 hours by default
/get-folder-access
: Get a policy for uploading multiple files to a folder- Creates a signed POST policy restricted to the miner's folder
- Includes a listing URL for verifying uploads
- Valid for 24 hours by default
/get-validator-access
: Get comprehensive read access to all miner data- Provides organized URLs for global, source-level, and miner-specific access
- Automatically discovers all miners in the system
- Includes file listing and download capabilities
/healthcheck
: Check server status and configuration
- Python 3.8+
- Required Python packages:
pip install -r requirements.txt
- S3-compatible storage (AWS S3, MinIO, Cloudflare R2, etc.)
- AWS credentials with bucket access
Edit the config.py
file to set your bucket and region:
# S3 configuration
S3_BUCKET = 'your-bucket-name' # Change to your bucket name
S3_REGION = 'us-east-1' # Change to your region
SERVER_PORT = 8000 # Server port
# Run the optimized FastAPI server
python -m uvicorn updated_server:app --host 0.0.0.0 --port 8000 --reload
# Or use Docker
docker build -t s3-auth-server .
docker run -p 8000:8000 s3-auth-server
The repository includes several test utilities:
cross_folder_test.py
: Tests the security of folder isolationtest_upload.py
: Demonstrates miner upload workflowsdownload_test.py
: Tests validator download capabilities
Run tests with:
# Test folder security
python cross_folder_test.py
# Test upload capabilities
python test_upload.py
# Test validator capabilities
python download_test.py
The recommended migration strategy:
- Dual Upload Phase: Upload to both HuggingFace and S3
- Validator Update: Update validators to check both systems
- S3-Only Phase: Stop HuggingFace uploads once S3 is validated
- Historical Migration: Move historical data as needed
- Never expose AWS secret keys to clients
- Keep the authentication server behind a secure proxy
- Validate all inputs and enforce rate limits
- Monitor for unusual access patterns
- Regularly rotate AWS credentials
- Use the smallest set of S3 permissions needed