graph LR;
A[Conv Block] --> B[Conv Block]
B --> C[FF block]
C --> D[Value head]
C --> G[Policy head]
Where:
Conv Block:
- Conv: 4n in, 4xn out, 5x5 convolution with stride 1, padding 2
- Activation: Selu
- MaxPool: 2x2 max pooling
- Dropout: 0.1

FF block:
- LazyLinear with output 256, Selu activation, and dropout

Value head:
- Linear: 256 in, 64 out
- Activation: Selu
- Dropout: 0.5
- Linear: 64 in, 1 out
- Activation: Tanh

Policy head:
- Linear: 256 in, 128 out
- Activation: Selu
- Linear: 128 in, 128 out
- Activation: Selu
- Linear: 128 in, 64 out
- Activation: Selu
- Dropout: 0.5
- Linear: 64 in, 8 out
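A minimal PyTorch sketch of this architecture, assuming a 4-channel board input and a channel multiplier `n` (the class name, default sizes, and exact wiring below are illustrative, not the repository's actual code):

```python
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    # Conv Block: 5x5 conv (stride 1, padding 2) -> SELU -> 2x2 max pool -> dropout
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=5, stride=1, padding=2),
        nn.SELU(),
        nn.MaxPool2d(2),
        nn.Dropout(0.1),
    )

class PolicyValueNet(nn.Module):
    def __init__(self, in_channels: int = 4, n: int = 16):
        super().__init__()
        # Two stacked conv blocks; the channel counts here are assumptions.
        self.backbone = nn.Sequential(
            conv_block(in_channels, 4 * n),
            conv_block(4 * n, 4 * n),
            nn.Flatten(),
        )
        # FF block: LazyLinear infers its input size from the flattened conv output.
        self.ff = nn.Sequential(nn.LazyLinear(256), nn.SELU(), nn.Dropout(0.1))
        # Value head: scalar position evaluation squashed into [-1, 1].
        self.value_head = nn.Sequential(
            nn.Linear(256, 64), nn.SELU(), nn.Dropout(0.5),
            nn.Linear(64, 1), nn.Tanh(),
        )
        # Policy head: one logit per move (8 here).
        self.policy_head = nn.Sequential(
            nn.Linear(256, 128), nn.SELU(),
            nn.Linear(128, 128), nn.SELU(),
            nn.Linear(128, 64), nn.SELU(),
            nn.Dropout(0.5),
            nn.Linear(64, 8),
        )

    def forward(self, x: torch.Tensor):
        features = self.ff(self.backbone(x))
        return self.policy_head(features), self.value_head(features)
```

Using `LazyLinear` lets the FF block infer its input size from the flattened convolutional output, so the board dimensions do not need to be hard-coded.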
To train the model, the following sequence of steps is applied (a sketch of this loop follows the list):
- For each episode, do the following:
  - Create two agents and randomly choose one to start.
  - Play the game until it is over.
  - Record the choices of each player.
  - The winner receives a positive score and the loser a negative score; draws score 0.
- Run around 50 episodes in parallel and record the results.
- Train the model on the recorded results.
- Repeat the process.
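A minimal sketch of this self-play loop, assuming a game environment exposing `reset`/`is_over`/`legal_moves`/`step`/`winner` and the two-headed network above; the helper names and the exact loss are assumptions, not the repository's trainer:

```python
import random
import torch

def play_episode(model, game):
    """Self-play one game; return (state, move, score) samples for training.

    `game` is an assumed environment interface; both players share one network here.
    """
    history = {1: [], -1: []}           # moves recorded separately for each player
    player = random.choice([1, -1])     # randomly choose which agent starts
    state = game.reset()
    while not game.is_over():
        with torch.no_grad():
            logits, _value = model(state.unsqueeze(0))
        # Mask illegal moves before sampling a move from the policy.
        mask = torch.full_like(logits, float("-inf"))
        mask[0, game.legal_moves()] = 0.0
        move = torch.distributions.Categorical(logits=logits + mask).sample().item()
        history[player].append((state, move))
        state = game.step(move)
        player = -player
    winner = game.winner()              # +1, -1, or 0 for a draw
    # Winner takes a positive score, loser a negative score, draws give 0.
    return [(s, m, 0.0 if winner == 0 else (1.0 if p == winner else -1.0))
            for p, moves in history.items() for s, m in moves]

def train_step(model, optimizer, batch):
    """One gradient step on a batch of recorded (states, moves, scores) tensors."""
    states, moves, scores = batch
    logits, values = model(states)
    policy_loss = torch.nn.functional.cross_entropy(logits, moves)
    value_loss = torch.nn.functional.mse_loss(values.squeeze(-1), scores)
    loss = policy_loss + value_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```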
- To train the model, run the `trainer_main.py` file. If you want to use the recorded model, use the `load` option to load the saved model.
- To test the model in an actual game, use the normal main file. You can use the `load` option to load the latest checkpoint of the model.
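A minimal sketch of how such a checkpoint could be saved and restored with PyTorch (the file name and dictionary keys are assumptions, not the repository's actual format):

```python
import torch

CHECKPOINT_PATH = "model_checkpoint.pt"   # assumed file name

def save_checkpoint(model, optimizer, path=CHECKPOINT_PATH):
    # Persist both model weights and optimizer state so training can resume.
    torch.save({"model": model.state_dict(), "optimizer": optimizer.state_dict()}, path)

def load_checkpoint(model, optimizer, path=CHECKPOINT_PATH):
    # Restore the latest saved state into existing model/optimizer objects.
    state = torch.load(path)
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
```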