A Discord bot that stores and manages personal data through Direct Messages (DMs). The bot automatically categorizes and stores text, passwords, emails, and links, and extracts text from images using OCR.
- DM Only: Only responds to Direct Messages, ignores server messages
- Auto-categorization: Automatically detects and stores:
  - Passwords (with labels)
  - Email addresses
  - URLs and links
  - Notes and text
  - Image text (via OCR)
- Persistent Storage: SQLite database for reliable data storage
- Command Interface: Responds to specific trigger commands
- OCR Support: Extracts text from uploaded images using Tesseract
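The auto-categorization described above can be pictured as a chain of pattern checks. The following is a minimal sketch, not the bot's actual detection logic: the regular expressions and the `categorize` function are hypothetical, and credentials and images are left out for brevity. It assumes the `password: <label> <value>` form that this README uses for saving a password.

```python
import re

# Hypothetical patterns; the bot's real detection rules may differ.
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")
URL_RE = re.compile(r"^https?://\S+$", re.IGNORECASE)
PASSWORD_RE = re.compile(r"^password:\s*(\S+)\s+(\S+)$", re.IGNORECASE)


def categorize(message: str) -> str:
    """Return 'command', 'password', 'email', 'link', or 'note' for a DM's text."""
    text = message.strip()
    if text.startswith("!"):
        return "command"
    if PASSWORD_RE.match(text):
        return "password"
    if EMAIL_RE.match(text):
        return "email"
    if URL_RE.match(text):
        return "link"
    # Anything that matches no pattern is kept as a plain note.
    return "note"
```

Checking the most specific patterns first keeps a message such as `password: gmail mypassword123` from falling through to the note category.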
Command | Description | Example |
---|---|---|
!wake or !hey | Wake up bot and get conversation summary | !wake |
!get password <label> | Retrieve a saved password | !get password gmail |
!get credentials | Get all saved credentials (username/password) | !get credentials |
!get credential <label> | Get specific credentials by label | !get credential Gmail |
!get notes | Get all saved notes | !get notes |
!get emails | Get all saved emails | !get emails |
!get links | Get all saved links | !get links |
!list | List all stored data categories | !list |
!clear | Clear all your data | !clear |
!recent [number] | Show recent messages | !recent 10 |
!search <term> | Search through stored data | !search password |
!help | Show help message | !help |
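To illustrate how commands like these might be split into a name and arguments, here is a small sketch. `parse_command` and `recent_limit` are hypothetical helpers, not functions from `commands.py`; the default of 10 for `!recent` without a number is an assumption, while the cap of 50 mirrors the `MAX_RECENT_MESSAGES` default.

```python
from typing import List, Tuple


def parse_command(text: str) -> Tuple[str, List[str]]:
    """Split '!get password gmail' into ('get', ['password', 'gmail'])."""
    if not text.startswith("!"):
        raise ValueError("not a command")
    parts = text[1:].split()
    if not parts:
        raise ValueError("empty command")
    return parts[0].lower(), parts[1:]


def recent_limit(args: List[str], default: int = 10, maximum: int = 50) -> int:
    """Read the optional [number] of !recent, falling back to an assumed default."""
    if not args:
        return default
    try:
        requested = int(args[0])
    except ValueError:
        return default
    # Clamp to a sane range so '!recent 500' cannot flood the DM.
    return max(1, min(requested, maximum))
```

With this split, `!get credential Gmail` yields the name `get` and the arguments `credential` and `Gmail`, so one handler can branch on the first argument.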
- Credentials:
username: john password: mypass123
Gmail - username: [email protected] password: mypass123
Netflix: user: [email protected] pass: mypass123
Spotify - john: mypass123
- Passwords:
password: gmail mypassword123
- Notes: Any regular text message
- Emails:
[email protected] (automatically detected)
- Links:
https://github.com/user/repo (automatically detected)
- Images: Upload any image, OCR text will be extracted and stored
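The four credential formats listed above could be recognized with two regular expressions. This is a sketch under assumptions: `parse_credentials` and both patterns are hypothetical, the bot's real parser may accept more or fewer variations, and the addresses in the usage note are placeholders.

```python
import re
from typing import Dict, Optional

# Optional "Label - " or "Label: " prefix, then user/username and pass/password pairs.
_LABELED = re.compile(
    r"^(?:(?P<label>[^:\-]+?)\s*[-:]\s*)?"
    r"(?:username|user)\s*:\s*(?P<username>\S+)\s+"
    r"(?:password|pass)\s*:\s*(?P<password>\S+)$",
    re.IGNORECASE,
)
# Short form: "Label - username: password".
_SHORT = re.compile(
    r"^(?P<label>[^:\-]+?)\s*-\s*(?P<username>[^:\s]+)\s*:\s*(?P<password>\S+)$"
)


def parse_credentials(text: str) -> Optional[Dict[str, Optional[str]]]:
    """Return label/username/password for a credential message, else None."""
    text = text.strip()
    for pattern in (_LABELED, _SHORT):
        match = pattern.match(text)
        if match:
            # 'label' is None when the message has no label prefix.
            return match.groupdict()
    return None
```

For example, `parse_credentials("Netflix: user: viewer@example.com pass: mypass123")` yields the label `Netflix`, while a plain note such as `Remember to buy groceries tomorrow` returns `None`.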
- Python 3.8+
- Tesseract OCR (for image text extraction)
Windows:
- Download from: https://github.com/UB-Mannheim/tesseract/wiki
- Install and note the installation path (usually C:\Program Files\Tesseract-OCR\tesseract.exe)
- Add to PATH or set the TESSERACT_PATH environment variable
macOS:
brew install tesseract
Linux (Ubuntu/Debian):
sudo apt update
sudo apt install tesseract-ocr
Replit:
# In Replit shell
sudo apt update
sudo apt install tesseract-ocr
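After installing, it can help to confirm that Python can actually locate the Tesseract executable. The sketch below is one way to do that with the standard library only; `find_tesseract` is a hypothetical helper, and the lookup order (the `TESSERACT_PATH` variable first, then `PATH`) is an assumption about how the bot's auto-detection behaves.

```python
import os
import shutil
from typing import Mapping, Optional


def find_tesseract(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the Tesseract executable path, or None if it cannot be found."""
    configured = env.get("TESSERACT_PATH")
    if configured:
        # An explicit setting wins, but only if it points at a real file.
        return configured if os.path.isfile(configured) else None
    # Otherwise fall back to searching the directories on PATH.
    return shutil.which("tesseract")


if __name__ == "__main__":
    print(find_tesseract() or "Tesseract not found")
```

If this prints `Tesseract not found`, recheck the installation step for your platform or set `TESSERACT_PATH`.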
- Clone or download the bot files
- Install Python dependencies:
pip install -r requirements.txt
- Create a Discord Bot:
  - Go to https://discord.com/developers/applications
  - Create a new application
  - Go to the "Bot" section
  - Create a bot and copy the token
  - Enable "Message Content Intent" under "Privileged Gateway Intents" in the Bot settings
- Set environment variables:
# Required
export DISCORD_BOT_TOKEN="your_bot_token_here"
# Optional (Windows only if Tesseract not in PATH)
export TESSERACT_PATH="C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
Windows (PowerShell):
$env:DISCORD_BOT_TOKEN="your_bot_token_here"
$env:TESSERACT_PATH="C:\Program Files\Tesseract-OCR\tesseract.exe"
Windows (Command Prompt):
set DISCORD_BOT_TOKEN=your_bot_token_here
set TESSERACT_PATH=C:\Program Files\Tesseract-OCR\tesseract.exe
- Run the bot:
# Production mode (recommended)
python start_bot.py
# Or direct mode
python bot.py
# Or with validation only
python validate_setup.py
- Create a new Repl and upload all bot files
- Install Tesseract OCR in the Replit shell:
sudo apt update
sudo apt install tesseract-ocr
- Set environment variables in Replit Secrets:
  - DISCORD_BOT_TOKEN: Your Discord bot token
- Run the bot; it will automatically start the keep-alive server
- The bot will stay awake with the built-in keep-alive mechanism
Replit Features:
- ✅ Automatic keep-alive server (port 8080)
- ✅ On-demand processing with the !wake command
- ✅ Conversation history analysis
- ✅ Persistent SQLite database
- ✅ Health monitoring endpoints
- ✅ Performance metrics tracking
discord-personal-bot/
├── bot.py # Main bot entry point
├── storage.py # Database operations with validation
├── ocr.py # Image OCR processing
├── commands.py # Command handling
├── config.py # Configuration settings
├── monitoring.py # Performance monitoring
├── keep_alive.py # Replit keep-alive server
├── validate_setup.py # Setup validation script
├── start_bot.py # Production startup script
├── run_bot.py # Simple startup script
├── requirements.txt # Python dependencies
├── README.md # This file
├── logs/ # Log files (created automatically)
└── user_data.db # SQLite database (created automatically)
The bot can be configured using environment variables:
Variable | Default | Description |
---|---|---|
DISCORD_BOT_TOKEN | Required | Discord bot token |
DATABASE_PATH | user_data.db | SQLite database file path |
TESSERACT_PATH | Auto-detect | Path to Tesseract executable |
LOG_LEVEL | INFO | Logging level |
LOG_FILE | bot.log | Log file path |
MAX_RECENT_MESSAGES | 50 | Max messages for recent/search |
MAX_SEARCH_RESULTS | 10 | Max search results to display |
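As an illustration of how these variables and defaults fit together, the sketch below loads them into a single settings object. The `Settings` class and `load_settings` function are hypothetical and are not taken from `config.py`; only the variable names and default values come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    discord_bot_token: str
    database_path: str
    tesseract_path: Optional[str]  # None means "auto-detect"
    log_level: str
    log_file: str
    max_recent_messages: int
    max_search_results: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, applying the documented defaults."""
    token = env.get("DISCORD_BOT_TOKEN")
    if not token:
        # The token is the only required variable.
        raise RuntimeError("DISCORD_BOT_TOKEN is required")
    return Settings(
        discord_bot_token=token,
        database_path=env.get("DATABASE_PATH", "user_data.db"),
        tesseract_path=env.get("TESSERACT_PATH") or None,
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        log_file=env.get("LOG_FILE", "bot.log"),
        max_recent_messages=int(env.get("MAX_RECENT_MESSAGES", "50")),
        max_search_results=int(env.get("MAX_SEARCH_RESULTS", "10")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise without touching the real environment.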
You: password: gmail mypassword123
Bot: (stores password, no response)
You: [email protected]
Bot: (stores email, no response)
You: https://github.com/user/repo
Bot: (stores link, no response)
You: Remember to buy groceries tomorrow
Bot: (stores as note, no response)
You: !get password gmail
Bot: Password for 'gmail': `mypassword123`
You: !get notes
Bot: Your Notes:
1. Remember to buy groceries tomorrow
2. Meeting at 3 PM today
You: !list
Bot: Your Stored Data:
📝 Total Messages: 15
🔑 Passwords: 3
📄 Notes: 8
📧 Emails: 2
🔗 Links: 2
- All data is stored locally in a SQLite database
- Bot only responds to DMs, not server messages
- No data is shared between users
- Consider enabling encryption for sensitive data (see config.py)
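One common way to keep data separate between users in a single SQLite file is to tag every row with the sender's Discord user ID and filter on it in every query. The sketch below shows that idea only; the table layout and function names are assumptions and do not reflect the actual schema in `storage.py`.

```python
import sqlite3
from typing import List


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create a hypothetical notes table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes ("
        "id INTEGER PRIMARY KEY, user_id TEXT NOT NULL, content TEXT NOT NULL)"
    )
    return conn


def add_note(conn: sqlite3.Connection, user_id: str, content: str) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO notes (user_id, content) VALUES (?, ?)", (user_id, content)
        )


def get_notes(conn: sqlite3.Connection, user_id: str) -> List[str]:
    # Every read is filtered by user_id, so users never see each other's rows.
    rows = conn.execute(
        "SELECT content FROM notes WHERE user_id = ? ORDER BY id", (user_id,)
    )
    return [row[0] for row in rows]


def clear_user(conn: sqlite3.Connection, user_id: str) -> None:
    with conn:
        conn.execute("DELETE FROM notes WHERE user_id = ?", (user_id,))
```

Parameterized queries (the `?` placeholders) also keep message text from being interpreted as SQL.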
The bot includes comprehensive monitoring and health check capabilities:
- http://your-repl-url.repl.co/ - Bot status page
- http://your-repl-url.repl.co/health - Basic health check
- http://your-repl-url.repl.co/health/detailed - Detailed health information
- http://your-repl-url.repl.co/metrics - Performance metrics
- http://your-repl-url.repl.co/status - Bot status JSON
- http://your-repl-url.repl.co/ping - Simple ping endpoint
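For readers curious how endpoints like these can be served, here is a minimal sketch using only the standard library. It covers just `/ping` and `/health`; the response bodies, the `build_response` helper, and the handler class are assumptions and are not the contents of `keep_alive.py`. Port 8080 matches the keep-alive port mentioned earlier.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple

START_TIME = time.time()


def build_response(path: str) -> Tuple[int, str, str]:
    """Map a request path to (status code, content type, body)."""
    if path == "/ping":
        return 200, "text/plain", "pong"
    if path == "/health":
        body = {"status": "ok", "uptime_seconds": round(time.time() - START_TIME, 1)}
        return 200, "application/json", json.dumps(body)
    return 404, "text/plain", "not found"


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, content_type, body = build_response(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Block forever serving health checks; run this in a background thread."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Keeping the routing in a plain function separates it from the socket handling, so it can be checked without starting a server.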
- Message processing rate
- Command execution rate
- Error tracking and categorization
- OCR operation performance
- Database operation metrics
- Uptime tracking
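Metrics such as these can be tracked with a few counters and a start timestamp. The class below is a hypothetical sketch and not the implementation in `monitoring.py`; the event and error category names are made up for illustration.

```python
import time
from collections import Counter
from typing import Callable


class Metrics:
    """Count events and errors, and derive per-minute rates from uptime."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._started = clock()
        self.counts: Counter = Counter()
        self.errors: Counter = Counter()

    def record(self, event: str) -> None:
        self.counts[event] += 1

    def record_error(self, category: str) -> None:
        self.errors[category] += 1

    def uptime(self) -> float:
        """Seconds elapsed since this tracker was created."""
        return self._clock() - self._started

    def rate_per_minute(self, event: str) -> float:
        elapsed = self.uptime()
        # Avoid dividing by zero right after startup.
        return 0.0 if elapsed <= 0 else self.counts[event] * 60.0 / elapsed
```

A caller would invoke `record("message")` for each DM and `record_error("ocr")` when image processing fails, then read the rates when a metrics request arrives.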
- Comprehensive logging to logs/bot.log
- Console output for real-time monitoring
- Error categorization and tracking
- Performance metrics logging
- "Tesseract not found" error:
  - Install Tesseract OCR
  - Set TESSERACT_PATH environment variable (Windows)
- "Invalid Discord bot token" error:
  - Check your bot token is correct
  - Ensure bot has "Message Content Intent" enabled
- Bot not responding to DMs:
  - Make sure you're sending DMs, not server messages
  - Check bot permissions
- OCR not working:
  - Verify Tesseract installation
  - Check image format is supported (PNG, JPG, etc.)
Check bot.log for detailed error messages and debugging information.
- Add command logic to commands.py
- Update the handle_command method
- Add help text to the _handle_help method
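The three steps above can be pictured with a small dispatch table. This sketch is a stand-in and does not reproduce the real class in `commands.py`: only the method names `handle_command` and `_handle_help` come from this README, while the `register` method, the `!count` command, and the reply strings are invented for illustration.

```python
from typing import Callable, Dict, List

Handler = Callable[[List[str]], str]


class CommandHandler:
    """Hypothetical stand-in for the handler class in commands.py."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {"help": self._handle_help}
        self._help: Dict[str, str] = {"help": "!help - Show help message"}

    def register(self, name: str, handler: Handler, help_text: str) -> None:
        # Steps 1 and 3: add the command logic and its help text together.
        self._handlers[name] = handler
        self._help[name] = help_text

    def handle_command(self, text: str) -> str:
        # Step 2: look the command name up and pass the remaining words along.
        parts = text.lstrip("!").split()
        if not parts or parts[0].lower() not in self._handlers:
            return "Unknown command. Try !help"
        return self._handlers[parts[0].lower()](parts[1:])

    def _handle_help(self, args: List[str]) -> str:
        return "\n".join(self._help[name] for name in sorted(self._help))


def handle_count(args: List[str]) -> str:
    """Example command: report how many words followed '!count'."""
    return f"{len(args)} word(s)"


bot = CommandHandler()
bot.register("count", handle_count, "!count <words> - Count the words that follow")
```

Registering the help text alongside the handler makes it harder to add a command and forget to document it.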
- Add a new table to the storage.py initialize method
- Add corresponding storage/retrieval methods
- Update the get_all_categories method
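A sketch of what adding a data type might look like, assuming tables are created in one place and counted by category. The `phone_numbers` table, its columns, and the `TABLES` registry are hypothetical; only the names `initialize` and `get_all_categories` come from this README, and the real methods in `storage.py` may be structured differently.

```python
import sqlite3
from typing import Dict

# Fixed registry of tables; 'phone_numbers' is the hypothetical new data type.
TABLES = {
    "notes": (
        "CREATE TABLE IF NOT EXISTS notes ("
        "id INTEGER PRIMARY KEY, user_id TEXT NOT NULL, content TEXT NOT NULL)"
    ),
    "phone_numbers": (
        "CREATE TABLE IF NOT EXISTS phone_numbers ("
        "id INTEGER PRIMARY KEY, user_id TEXT NOT NULL, number TEXT NOT NULL)"
    ),
}


def initialize(conn: sqlite3.Connection) -> None:
    """Create every registered table if it does not exist yet."""
    with conn:
        for ddl in TABLES.values():
            conn.execute(ddl)


def add_phone_number(conn: sqlite3.Connection, user_id: str, number: str) -> None:
    with conn:
        conn.execute(
            "INSERT INTO phone_numbers (user_id, number) VALUES (?, ?)",
            (user_id, number),
        )


def get_all_categories(conn: sqlite3.Connection, user_id: str) -> Dict[str, int]:
    """Return the row count per table for one user."""
    counts = {}
    for table in TABLES:
        # Table names come from the fixed TABLES dict, never from user input.
        row = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE user_id = ?", (user_id,)
        ).fetchone()
        counts[table] = row[0]
    return counts
```

Because `get_all_categories` iterates over the same registry that `initialize` uses, a newly added table shows up in the summary without a separate change.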
This project is open source. Use responsibly and ensure compliance with Discord's Terms of Service.