
S3 Diff Archive

A powerful, efficient command-line tool for incremental backup and archiving of files to Amazon S3. This tool performs differential backups by only archiving files that have changed since the last backup, making it ideal for large datasets where full backups would be inefficient.

🚀 Features

  • Incremental Backups: Only archives files that have changed since the last backup
  • S3 Integration: Direct upload to Amazon S3 with configurable storage classes
  • Password Protection: Encrypt your archives with password-based encryption
  • File Filtering: Support for exclude patterns using glob syntax
  • Multiple Tasks: Configure multiple backup tasks in a single configuration file
  • Database Tracking: Uses BadgerDB to track file states and changes
  • Compression: Automatic ZIP compression with configurable size limits
  • Restoration: Experimental restore functionality for archived backups (DEEP_ARCHIVE objects must first be restored on the S3 side)
  • Detailed Logging: Comprehensive logging for monitoring and debugging
  • Notifications: Configurable notification system for operation status updates

📦 Installation

Download Pre-built Binaries

Pre-compiled binaries are available for download from the Releases section. Choose the appropriate binary for your operating system:

  • Linux: s3-diff-archive-linux-amd64
  • macOS: s3-diff-archive-darwin-amd64 (Intel) or s3-diff-archive-darwin-arm64 (Apple Silicon)
  • Windows: s3-diff-archive-windows-amd64.exe

Build from Source

If you prefer to build from source:

git clone https://github.com/fahidsarker/s3-diff-archive.git
cd s3-diff-archive
go build -o s3-diff-archive .

⚙️ Configuration

Environment Variables

Create a .env file in your working directory with your AWS credentials:

AWS_ACCESS_KEY_ID=your_access_key_here
AWS_SECRET_ACCESS_KEY=your_secret_key_here
AWS_REGION=us-east-1
S3_BUCKET=your-bucket-name
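
Internally, the credentials simply need to end up in the process environment. A minimal sketch of how a .env file is typically loaded in Go, assuming a library such as github.com/joho/godotenv (whether this project uses that library is an assumption):

package main

import (
	"fmt"
	"os"

	"github.com/joho/godotenv"
)

func main() {
	// Load .env into the process environment; the AWS SDK then picks the
	// variables up the same way it would from a normal shell export.
	if err := godotenv.Load(".env"); err != nil {
		fmt.Println("no .env file loaded:", err)
	}
	fmt.Println("region:", os.Getenv("AWS_REGION"))
}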

Configuration File

Create a YAML configuration file (e.g., config.yaml) based on the sample:

# Base path in S3 bucket where archives will be stored
s3_base_path: "backups/my-project"

# Directory to store logs (optional)
logs_dir: "./logs"

# Temporary directory for creating zip files
working_dir: "./tmp"

# Maximum size for each zip file in MB
max_zip_size: 5000

# Notification script for operation status updates (optional)
# Available placeholders: %icon%, %operation%, %status%, %message%
notify_script: 'echo "%icon% %operation% - %status% | %message%"'

# Backup tasks configuration
tasks:
  - id: photos
    dir: "./photos"
    storage_class: "DEEP_ARCHIVE"  # Cost-effective for long-term storage
    encryption_key: "MySecurePassword123"
    exclude: ["**/.DS_Store", "**/Thumbs.db", "**/*.tmp"]

  - id: documents
    dir: "./documents"
    storage_class: "STANDARD_IA"   # For infrequently accessed files
    encryption_key: "AnotherSecurePassword456"

  - id: videos
    dir: "./videos"
    storage_class: "GLACIER"       # Even more cost-effective for archives
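
For contributors, here is a hedged sketch of how this file might deserialize into Go structs with gopkg.in/yaml.v3 (field and type names are guesses from the sample keys above, not the tool's actual types in utils/config-parser.go):

package main

import (
	"fmt"
	"os"

	"gopkg.in/yaml.v3"
)

// Task mirrors one entry under "tasks" in the sample config.
type Task struct {
	ID            string   `yaml:"id"`
	Dir           string   `yaml:"dir"`
	StorageClass  string   `yaml:"storage_class"`
	EncryptionKey string   `yaml:"encryption_key"`
	Exclude       []string `yaml:"exclude"`
}

// Config mirrors the top-level keys of the sample config.
type Config struct {
	S3BasePath   string `yaml:"s3_base_path"`
	LogsDir      string `yaml:"logs_dir"`
	WorkingDir   string `yaml:"working_dir"`
	MaxZipSize   int    `yaml:"max_zip_size"`
	NotifyScript string `yaml:"notify_script"`
	Tasks        []Task `yaml:"tasks"`
}

func main() {
	data, err := os.ReadFile("config.yaml")
	if err != nil {
		fmt.Println(err)
		return
	}
	var cfg Config
	if err := yaml.Unmarshal(data, &cfg); err != nil {
		fmt.Println(err)
		return
	}
	if len(cfg.Tasks) > 0 {
		fmt.Printf("%d task(s), first dir: %s\n", len(cfg.Tasks), cfg.Tasks[0].Dir)
	}
}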

Storage Classes

Choose the appropriate S3 storage class based on your access patterns and cost requirements:

  • STANDARD: For frequently accessed data
  • INTELLIGENT_TIERING: Automatic cost optimization
  • STANDARD_IA: For infrequently accessed data
  • ONEZONE_IA: Lower cost for infrequently accessed data (single AZ)
  • GLACIER: For archival data accessed once or twice per year
  • DEEP_ARCHIVE: Lowest cost for long-term archival (7-10 years)
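
When uploading, these configuration strings have to become the typed constants the S3 API expects. A minimal sketch using the AWS SDK for Go v2 (assuming that SDK; the tool's actual mapping may differ):

package main

import (
	"fmt"

	s3types "github.com/aws/aws-sdk-go-v2/service/s3/types"
)

// toStorageClass converts a config string such as "DEEP_ARCHIVE" into the
// typed constant the SDK expects on PutObjectInput.StorageClass.
func toStorageClass(s string) (s3types.StorageClass, error) {
	for _, sc := range s3types.StorageClass("").Values() {
		if string(sc) == s {
			return sc, nil
		}
	}
	return "", fmt.Errorf("unknown storage class %q", s)
}

func main() {
	sc, err := toStorageClass("DEEP_ARCHIVE")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("resolved:", sc)
}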

Notification System

The tool supports configurable notifications for operation status updates. Configure the notify_script in your config file to receive notifications:

# Simple echo notification (default)
notify_script: 'echo "%icon% %operation% - %status% | %message%"'

# macOS notification using osascript
notify_script: 'osascript -e "display notification \"%message%\" with title \"S3 Archive - %operation%\" subtitle \"%status%\""'

# Linux notification using notify-send
notify_script: 'notify-send "S3 Archive - %operation%" "%message%" --urgency=normal'

# Slack webhook notification
notify_script: 'curl -X POST -H "Content-type: application/json" --data "{\"text\":\"%icon% %operation% - %status%: %message%\"}" YOUR_SLACK_WEBHOOK_URL'

# Discord webhook notification
notify_script: 'curl -H "Content-Type: application/json" -d "{\"content\":\"%icon% %operation% - %status%: %message%\"}" YOUR_DISCORD_WEBHOOK_URL'

Available Placeholders

  • %icon%: Status-specific emoji (✅ for success, ❌ for error, ⚠️ for warning, ❌⚠️❌⚠️ for fatal)
  • %operation%: The operation being performed (scan, archive, restore, system)
  • %status%: Operation status (success, error, warn, fatal)
  • %message%: Detailed message about the operation result
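
Placeholder expansion is plain string substitution before the script is handed to a shell. A hedged sketch of the mechanism (the function name and shell invocation are illustrative, not the tool's actual code):

package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// expandNotify fills the %icon%, %operation%, %status% and %message%
// placeholders in the configured notify_script.
func expandNotify(script, icon, op, status, msg string) string {
	return strings.NewReplacer(
		"%icon%", icon,
		"%operation%", op,
		"%status%", status,
		"%message%", msg,
	).Replace(script)
}

func main() {
	cmd := expandNotify(`echo "%icon% %operation% - %status% | %message%"`,
		"✅", "archive", "success", "12 files uploaded")
	// Run through a shell, as a notify_script would be (Unix-only here).
	out, err := exec.Command("sh", "-c", cmd).CombinedOutput()
	if err != nil {
		fmt.Println("notify failed:", err)
		return
	}
	fmt.Print(string(out))
}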

🔧 Usage

Basic Commands

# Scan directories for changes (dry run)
s3-diff-archive scan -config config.yaml

# Archive changed files to S3
s3-diff-archive archive -config config.yaml

# Restore files from S3
s3-diff-archive restore -config config.yaml

# View database contents for a specific task
s3-diff-archive view -config config.yaml -task photos
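
Note that the restore command can only download archives that S3 has already staged: GLACIER and DEEP_ARCHIVE objects must first be restored on the S3 side. A hedged sketch of issuing such a restore request with the AWS SDK for Go v2 (bucket and key are placeholders; you can equally do this through the AWS console or CLI):

package main

import (
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	awsconfig "github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
	s3types "github.com/aws/aws-sdk-go-v2/service/s3/types"
)

func main() {
	ctx := context.Background()
	cfg, err := awsconfig.LoadDefaultConfig(ctx) // reads the same AWS_* env vars
	if err != nil {
		log.Fatal(err)
	}
	client := s3.NewFromConfig(cfg)

	// Ask S3 to stage a DEEP_ARCHIVE object for retrieval; the object becomes
	// downloadable hours later and stays available for the requested days.
	_, err = client.RestoreObject(ctx, &s3.RestoreObjectInput{
		Bucket: aws.String("your-bucket-name"),               // placeholder
		Key:    aws.String("backups/my-project/archive.zip"), // placeholder
		RestoreRequest: &s3types.RestoreRequest{
			Days: aws.Int32(7),
			GlacierJobParameters: &s3types.GlacierJobParameters{
				Tier: s3types.TierBulk, // cheapest; Standard/Expedited are faster
			},
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Println("restore requested")
}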

Command-line Options

Each command supports the following flags:

  • -config: Path to configuration file (required)
  • -env: Path to environment file (default: .env)
  • -task: Task ID (required for view command only)

Example Workflow

  1. Initial Setup:

    # Create your configuration
    cp config.sample.yaml config.yaml
    # Edit config.yaml with your settings
    
    # Set up environment variables
    cp .env.example .env
    # Edit .env with your actual AWS credentials
  2. Scan for Changes:

    s3-diff-archive scan -config config.yaml
  3. Perform Backup:

    s3-diff-archive archive -config config.yaml
  4. Restore When Needed:

    s3-diff-archive restore -config config.yaml

📁 Project Structure

s3-diff-archive/
├── main.go                 # Main application entry point
├── go.mod                  # Go module dependencies
├── config.sample.yaml      # Sample configuration file
├── archiver/
│   ├── archiver.go        # File archiving logic
│   └── zipper.go          # ZIP compression utilities
├── constants/
│   └── colors.go          # Terminal color constants
├── crypto/
│   ├── files.go           # File encryption/decryption
│   └── strings.go         # String encryption utilities
├── db/
│   ├── container.go       # Database container management
│   ├── db-archiver.go     # Database archiving
│   ├── db.go              # Main database operations
│   ├── reg.go             # File registry management
│   └── view.go            # Database viewing utilities
├── logger/
│   ├── log.go             # Logging configuration
│   └── loggers.go         # Logger implementations
├── restorer/
│   ├── compare.go         # File comparison utilities
│   └── restorer.go        # File restoration logic
├── s3/
│   ├── s3-manager.go      # S3 operations manager
│   └── task-uploader.go   # Task-specific upload logic
├── scanner/
│   ├── scanner.go         # File system scanning
│   └── types.go           # Scanner type definitions
├── types/
│   ├── s3-config.go       # S3 configuration types
│   └── sfile.go           # File metadata types
└── utils/
    ├── config-parser.go   # Configuration parsing
    ├── notifier.go        # Notification system
    ├── rand-create.go     # Random data generation
    ├── tools.go           # General utilities
    └── zipper.go          # ZIP file utilities

🔍 How It Works

  1. Scanning: The tool scans specified directories and calculates checksums for all files
  2. Comparison: File states are compared against a local BadgerDB database that is kept in sync with a copy in S3
  3. Differential Detection: Only files that have changed (new, modified, or deleted) are identified
  4. Archiving: Changed files are compressed into password-protected ZIP archives
  5. Upload: Archives are uploaded to S3 with the specified storage class
  6. Database Update: The local database is updated and synchronized with S3
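
Steps 1-3 boil down to hashing every file and diffing against the recorded state. A simplified sketch, with a plain map standing in for the BadgerDB store and SHA-256 as an assumed checksum algorithm (the tool's actual hash is not stated here):

package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
)

// checksum streams a file through SHA-256 and returns the hex digest.
func checksum(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// changedFiles returns paths whose checksum differs from, or is absent in,
// the previously recorded state.
func changedFiles(root string, prev map[string]string) ([]string, error) {
	var changed []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return err
		}
		sum, err := checksum(path)
		if err != nil {
			return err
		}
		if prev[path] != sum {
			changed = append(changed, path)
		}
		return nil
	})
	return changed, err
}

func main() {
	prev := map[string]string{} // previously recorded path -> checksum
	files, err := changedFiles("./photos", prev)
	if err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	fmt.Println("changed:", files)
}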

🛡️ Security Features

  • Encryption: All archives are password-protected using ZIP encryption
  • AWS IAM: Leverages AWS IAM for secure access control
  • Credential Separation: AWS credentials are supplied via a separate .env file rather than the configuration file
  • Integrity Checking: File checksums ensure data integrity

🤝 Contributing

We welcome contributions! Please follow these steps:

  1. Fork the Repository

    git clone https://github.com/yourusername/s3-diff-archive.git
    cd s3-diff-archive
  2. Create a Feature Branch

    git checkout -b feature/your-feature-name
  3. Make Your Changes

    • Write clean, well-documented code
    • Follow Go best practices and conventions
    • Add tests for new functionality
  4. Test Your Changes

    go test ./...
    go build .
  5. Submit a Pull Request

    • Provide a clear description of your changes
    • Include any relevant issue numbers
    • Ensure all tests pass

Development Guidelines

  • Code Style: Follow standard Go formatting (go fmt)
  • Testing: Add unit tests for new features
  • Documentation: Update README and code comments as needed
  • Dependencies: Minimize external dependencies when possible

Reporting Issues

  • Use the GitHub Issues page
  • Provide detailed reproduction steps
  • Include configuration files (with sensitive data removed)
  • Specify your operating system and Go version

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

📞 Support

For support and questions:

  • 📫 Create an issue on GitHub Issues
  • 📖 Check the documentation and examples above
  • 🔍 Search existing issues for similar problems

Note: This tool is designed for efficient incremental backups. For initial backups of large datasets, the first run may take longer as it processes all files. Subsequent runs will be much faster as only changed files are processed.

This tool does not guarantee data integrity or security beyond the provided encryption and S3 storage features. Always test your backup and restore processes to ensure they meet your requirements.
