A Wordle solver that uses Bayesian statistics to guess the correct word. This project demonstrates how probability theory can be applied to solve everyday puzzles.
The solver starts with a list of all possible 5-letter words and uses letter frequency analysis to assign initial probabilities. After each guess, it updates these probabilities based on the feedback received (which letters are in the right position, in the wrong position, or absent). The goal is to find the word in as few guesses as possible.
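The feedback step can be sketched as below. This is a minimal illustration, not necessarily how bayesian_wordle.R does it: the function name `wordle_feedback` and the "G"/"Y"/"-" encoding are assumptions made for this example. It handles repeated letters by consuming each unmatched target letter at most once.

```r
# Compute Wordle-style feedback for a guess against a target word.
# Encoding (assumed for this sketch): "G" = right letter, right position;
# "Y" = right letter, wrong position; "-" = letter not available.
wordle_feedback <- function(guess, target) {
  g <- strsplit(guess, "")[[1]]
  tg <- strsplit(target, "")[[1]]
  result <- rep("-", length(g))

  # First pass: mark exact positional matches.
  exact <- g == tg
  result[exact] <- "G"

  # Target letters that were not used up by an exact match.
  pool <- tg[!exact]

  # Second pass: mark misplaced letters, consuming one pooled letter each.
  for (i in which(!exact)) {
    hit <- match(g[i], pool)
    if (!is.na(hit)) {
      result[i] <- "Y"
      pool <- pool[-hit]
    }
  }
  paste(result, collapse = "")
}

wordle_feedback("crane", "crane")  # "GGGGG"
wordle_feedback("allee", "apple")  # "GY--G"
```

In the second call only the first "l" is marked "Y", because the target contains a single "l".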
- Letter Frequency Analysis: Calculates how often each letter appears in the word list
- Initial Scoring: Words with more common letters get higher initial probabilities
- Bayesian Updates: After each guess, eliminates words that don't match the feedback
- Guess Selection: Picks the word with highest remaining probability
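The scoring steps above can be sketched as follows. The five-word list, the `score_word` helper, and the choice to count each distinct letter only once are assumptions for illustration; the actual script may score words differently.

```r
# Hypothetical mini word list; the real solver reads valid-wordle-words.txt.
words <- c("crane", "slate", "trace", "mummy", "pious")

# Letter frequency: share of each letter across all words in the list.
letters_all <- unlist(strsplit(words, ""))
freq <- table(letters_all) / length(letters_all)

# Score a word by summing the frequencies of its distinct letters, so a
# repeated letter is not counted twice (an assumed design choice).
score_word <- function(word) {
  sum(freq[unique(strsplit(word, "")[[1]])])
}
scores <- vapply(words, score_word, numeric(1))

# Normalise the scores into a prior probability distribution over words.
prior <- scores / sum(scores)

# Guess selection: the word with the highest probability.
best_guess <- names(prior)[which.max(prior)]
best_guess  # "trace"
```

With this toy list, "mummy" scores lowest because it contributes only three distinct letters.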
- Letter frequency analysis for initial word scoring
- Bayesian probability updating after each guess
- Interactive mode where you provide feedback
- Automatic mode that solves random words
- Performance testing on multiple words
- Verbose logging to see the solver's thought process
- Make sure you have R installed
- Put the word list file in the same directory
- Run the script and choose your mode:
  - Automatic: Solver picks a random word and solves it
  - Interactive: You provide feedback for each guess
- bayesian_wordle.R: Main solver script
- valid-wordle-words.txt: List of valid 5-letter words
- performance_test.R: Script to test solver performance
- README.md: This file
The solver typically finds words in 3-6 guesses, with an average of about 4.5 guesses for successful solves. In testing on 50 random words, it had a 74% success rate (37 of 50 words solved).
Some words are easier than others: the solver found "elain" in just 2 guesses, while other words took all 6 guesses.
The solver uses entropy calculations to measure how much information each guess provides. Early guesses focus on words with common letters, while later guesses use more sophisticated selection when fewer words remain.
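One common way to measure the information in a guess is the Shannon entropy of the feedback patterns it could produce. The sketch below assumes that approach; the function name `guess_entropy` and the "G"/"Y"/"-" pattern strings are illustrative, and the patterns are written out by hand for four example candidates rather than computed.

```r
# Expected information (in bits) of a guess, given the feedback pattern each
# remaining candidate would produce and the candidates' probabilities.
guess_entropy <- function(patterns, probs) {
  probs <- probs / sum(probs)
  # Probability of seeing each distinct feedback pattern.
  p_pattern <- tapply(probs, patterns, sum)
  p_pattern <- p_pattern[p_pattern > 0]  # skip impossible patterns
  -sum(p_pattern * log2(p_pattern))
}

# Four remaining candidates, assumed equally likely: crane, crate, trace, grace.
probs <- rep(0.25, 4)

# Feedback each candidate would give if we guessed "crane" or "trace".
patterns_crane <- c("GGGGG", "GGG-G", "YGG-G", "YGG-G")
patterns_trace <- c("-GGYG", "YGGYG", "GGGGG", "-GGGG")

guess_entropy(patterns_crane, probs)  # 1.5 bits
guess_entropy(patterns_trace, probs)  # 2 bits
```

Here "trace" is the more informative guess: every candidate yields a different pattern, so the feedback identifies the answer exactly.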
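The Bayesian update is straightforward because the likelihood of the observed feedback given a candidate word is either 1 or 0. If a word would produce different feedback than what we received, its probability becomes zero. Otherwise, it keeps its probability relative to the other surviving words, and the surviving probabilities are rescaled to sum to 1.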
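In Bayes' rule terms, P(word | feedback) is proportional to P(feedback | word) * P(word), where the likelihood is 1 for consistent words and 0 otherwise. A minimal sketch, assuming the feedback pattern each candidate would produce is already known; `bayes_update`, the prior values, and the pattern strings are illustrative, not taken from the script:

```r
# Update word probabilities after observing feedback for a guess.
# prior:    named numeric vector of current word probabilities
# patterns: feedback each candidate would produce for the guess just made
# observed: feedback actually received
bayes_update <- function(prior, patterns, observed) {
  likelihood <- as.numeric(patterns == observed)  # 1 if consistent, else 0
  posterior <- prior * likelihood
  if (sum(posterior) == 0) {
    stop("No candidate word is consistent with the feedback")
  }
  posterior / sum(posterior)  # renormalise so probabilities sum to 1
}

# Hypothetical state after guessing "crane".
prior <- c(crane = 0.4, crate = 0.3, trace = 0.2, grace = 0.1)
patterns <- c(crane = "GGGGG", crate = "GGG-G", trace = "YGG-G", grace = "YGG-G")

posterior <- bayes_update(prior, patterns, observed = "YGG-G")
round(posterior, 3)  # crane 0, crate 0, trace 0.667, grace 0.333
```

The two surviving words keep their 2:1 ratio from the prior; only the scale changes.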
This is for educational purposes to demonstrate Bayesian statistics. It's not meant to cheat at the actual Wordle game.
The solver works best when it has a good word list. The current list has about 14,855 words, which covers most valid Wordle words.
Could add:
- Better word selection algorithms
- Machine learning approaches
- Support for different word lengths
- Web interface
To see how well the solver performs, run:
Rscript performance_test.R
This will test 50 random words and show statistics about success rate and average guesses.