BLOG

Here is a record of my progress so far.

PICKING A GAME

February 1, 2024

First week done! After researching two-player perfect-information games, I settled on two games that look promising. Santorini is a relatively new game that looks fun and interesting, but its state space could be too large to effectively train an ANN to play it well in a single semester. Othello (also known as Reversi) is a classic, and a safer choice in the event that Santorini proves overly ambitious. For now, I have programmed the basic logic for both games while I do more research on building and training my artificial neural network.


FOUNDATIONAL WORK

February 8, 2024

Continuing from last week, I optimized my code for Santorini. The game currently allows two human players to play against each other; eventually, I will need to support games between any combination of player types. The program now handles all edge cases and bad user inputs, which is a good foundation to build on. I have also begun developing another player type that makes completely random (but valid) move choices. I cleaned up most of my earlier experimental code so that I have reliable functions with clear documentation. Most of these functions verify that all of the game's rules are followed across all stages of the game.


DESIGNING A SIMPLE AI

February 20, 2024

This week I finished development of the random player and allowed games to be played with any combination of human and random player types. The random player first randomly selects one of its workers, randomly generates a change_in_x and change_in_y (repeating this step until a valid move is found), and moves the worker accordingly. It follows a very similar process for placing tiles.
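
Roughly, that loop looks like this in C++ (the Board interface here is a minimal hypothetical stand-in for my actual game logic, not the real class):

    #include <cstdlib>

    // Hypothetical stand-in for the real game logic.
    struct Board {
        bool isValidMove(int, int, int) const { return true; } // placeholder
        void moveWorker(int, int, int) {}                      // placeholder
    };

    // Rejection sampling: keep guessing a direction until a legal move appears.
    void makeRandomMove(Board& board) {
        int worker, dx, dy;
        do {
            worker = std::rand() % 2;    // each player controls two workers
            dx = (std::rand() % 3) - 1;  // change_in_x in {-1, 0, 1}
            dy = (std::rand() % 3) - 1;  // change_in_y in {-1, 0, 1}
        } while ((dx == 0 && dy == 0) || !board.isValidMove(worker, dx, dy));
        board.moveWorker(worker, dx, dy);
    }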

In order to develop the next player type, I need a firm understanding of strategy in Santorini. The next player is a simple AI whose decision-making process follows basic heuristics/strategies. This AI employs an evaluation function that determines the best move to make by assigning high values to board states that best follow the heuristics. As a result, this AI would follow basic strategy and should almost certainly beat a purely random AI. The choice of heuristics dramatically affects the performance of the simple AI player, but selecting 'good' heuristics is subjective and must be guided by good intuition for the game. A basic heuristic is "one-step lookahead", which checks whether there is a winning game state within one move and always takes it when available. In the context of Santorini, if one of the player's workers can move to the third floor, it will always make that move. Another basic heuristic is "two-step lookahead", which evaluates the best position within two moves. In the context of Santorini, if the opposing player has a winning move on their next turn that the simple AI can block, then blocking it is the best move to take.

It can be difficult to determine more nuanced heuristics. For example, the optimal play from an offensive standpoint is to move upwards whenever possible, while from a defensive standpoint it is best to keep the opposing player from moving upwards whenever possible. Taking both of these ideas into account, a 'good' heuristic to include in the evaluation function is the height differential between the two players, which considers both offense and defense.

While I'm still figuring out the specific implementations of the heuristics, I've decided which ones I would like to follow. My simple AI should (1) take a winning move when possible, (2) block the opposition's winning move when possible, (3) maximize the height differential of the players in favor of the AI, (4) stay central on the board to maximize available moves, and (5) maintain an optimal distance from the opposing workers. A rough sketch of how these could combine into one evaluation function is below.
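
As a rough illustration, the five heuristics could combine into a single weighted score like this (the BoardState fields and the weights are illustrative guesses, not final values):

    #include <cmath>

    // Hypothetical summary of a candidate board state; names are illustrative.
    struct BoardState {
        bool   aiWins;             // heuristic 1: this move wins outright
        bool   blocksOpponentWin;  // heuristic 2: denies the opponent's win
        int    heightDifferential; // heuristic 3: AI heights minus opponent's
        double centrality;         // heuristic 4: closeness to the board center
        double workerDistance;     // heuristic 5: distance to opposing workers
    };

    // Weighted sum of the heuristics; the weights are placeholders to be tuned.
    double evaluate(const BoardState& s) {
        if (s.aiWins) return 1e9;               // always take a winning move
        double value = 0.0;
        if (s.blocksOpponentWin) value += 1e6;  // always block a loss
        value += 10.0 * s.heightDifferential;
        value += 2.0 * s.centrality;
        value -= std::abs(s.workerDistance - 2.0); // penalize straying from an
                                                   // assumed "ideal" spacing
        return value;
    }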


ANN FUNDAMENTALS

March 1, 2024

As I implement the design of my simple AI, it has become increasingly important that I have a solid understanding of Artificial Neural Networks (ANNs), which my primary AI system will utilize. This week I researched ANNs and how to actually develop one. There is a wide variety of ANN libraries in C++; I have been sorting through their documentation and assessing which best suits my program's needs.

The ANN of my primary AI will essentially serve as that AI's evaluation function. Provided a boardstate, it will produce a value that estimates the likelihood of winning from that position. The network takes the boardstate as an input vector and performs a series of matrix transformations (filtering out some values along the way) to refine this state into its 'value'. The magic comes from discovering the proper matrices to use in these transformations and selecting the correct weights for them. I can improve my ANN by reducing the sum of the squared error (the difference between the produced and expected 'value' of a state) across all boardstates.
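
For reference, that error measure is just the sum of squared differences over a batch of states:

    #include <vector>

    // Sum of squared error between the network's produced values and the
    // expected values across a batch of boardstates (a minimal sketch).
    double sumSquaredError(const std::vector<double>& produced,
                           const std::vector<double>& expected) {
        double total = 0.0;
        for (size_t i = 0; i < produced.size(); ++i) {
            double diff = produced[i] - expected[i];
            total += diff * diff;
        }
        return total;
    }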


EVALUATION FUNCTIONS

March 8, 2024

This week I made some fundamental changes to how I handle various functions in my program. Previously, my rand_Player would randomly guess a move and check if it was valid. While this method worked well enough, it became necessary to choose a better one. I first encoded relative positions from the player as integers: 0 == "upper_left", 1 == "above", 2 == "upper_right", 3 == "left"... These relative positions can index a boolean array of size 8, with each entry representing whether or not moving in that direction is valid. I can also use these relative positions to index another array representing whether or not placing in that direction is valid. I used this method to construct a 2x8x8 array which holds the validity of each placement, for each move, for each worker of a player. Once this array is populated, I only need to randomly select a "true" entry from the array and decode the move embedded in it (see the sketch below).

Moving to this method is necessary for the evaluation functions of the ANN player and Heuristic player: I need to feed every possible move into these evaluation functions, determine the "value" of each move, and select the highest-valued one. With this in place, I implemented my first evaluation function for the Heuristic player. This function assigns high values to moves that keep workers central to the board, maximize the height differential between the player and the opposition, and win the game. With these three heuristics in place, the Heuristic player should always outperform the random player.
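
A minimal sketch of picking a random valid turn from the populated 2x8x8 array (TurnChoice is an illustrative name, and the direction integers follow the encoding described above):

    #include <cstdlib>
    #include <vector>

    // worker index, move direction (0-7), and place direction (0-7).
    struct TurnChoice { int worker, move, place; };

    // valid[worker][move][place] records whether moving that worker in
    // direction `move` and then placing in direction `place` is legal.
    TurnChoice pickRandomValidTurn(const bool valid[2][8][8]) {
        std::vector<TurnChoice> options;
        for (int w = 0; w < 2; ++w)
            for (int m = 0; m < 8; ++m)
                for (int p = 0; p < 8; ++p)
                    if (valid[w][m][p])
                        options.push_back({w, m, p});
        return options[std::rand() % options.size()]; // assumes at least
                                                      // one legal turn exists
    }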


MY FIRST ANN

March 20, 2024
This week I began designing a very basic ANN. Fundamentally, an ANN is a series of layers. The first layer is the input layer, which in my implementation is a representation of the current boardstate of the game. I am currently storing a boardstate as a sequence of 51 numbers: the first is the color of the player whose turn it is, the next 25 are the heights at each board position, and the last 25 record whether a worker is present at each board position (and if so, which player it belongs to). The internal layers (dubbed hidden layers) are intermediate steps within the ANN; each layer performs a matrix multiplication, a vector addition, and a nonlinear function that filters out some values. In this first implementation, I simply had each matrix encode adjacent positions. The last layer is a 1x50 row vector that condenses the output into a single value, which approximates the worth of the inputted boardstate. For the time being, I wrote out a matrix representation for each layer of an extremely basic (and extremely dumb) ANN. This will need to be improved with the training process later on.
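
A sketch of that 51-number encoding (the array shapes and the worker encoding of 0 for empty and +1/-1 by owner are assumptions for illustration):

    #include <vector>

    // Flatten a boardstate into the 51-number input vector described above:
    // the current player's color, then 25 heights, then 25 worker entries.
    std::vector<double> encodeBoard(int currentColor,
                                    const int heights[5][5],
                                    const int workers[5][5]) {
        std::vector<double> input;
        input.reserve(51);
        input.push_back(currentColor);          // 1 value: whose turn it is
        for (int r = 0; r < 5; ++r)
            for (int c = 0; c < 5; ++c)
                input.push_back(heights[r][c]); // 25 values: tower heights
        for (int r = 0; r < 5; ++r)
            for (int c = 0; c < 5; ++c)
                input.push_back(workers[r][c]); // 25 values: worker occupancy
        return input;
    }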


ANN SUPPORT FUNCTIONS

April 3, 2024

In order to improve an ANN, I need to be able to record its performance and adjust it. To record its performance, I adjusted my program so that every single move made in a game is recorded. To make the ANN easier to adjust, I initialize the player by reading each of its matrix layers and shift vectors in from a file. Finally, I implemented an evaluation function that handles the matrix multiplication, vector addition, and ReLU.
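
A minimal sketch of what one layer of that evaluation does (the real code reads its weights and shifts from a file, as noted above):

    #include <algorithm>
    #include <vector>

    // One ANN layer: matrix multiplication, shift-vector addition, then ReLU.
    std::vector<double> applyLayer(const std::vector<std::vector<double>>& weights,
                                   const std::vector<double>& shift,
                                   const std::vector<double>& input) {
        std::vector<double> output(weights.size(), 0.0);
        for (size_t row = 0; row < weights.size(); ++row) {
            for (size_t col = 0; col < input.size(); ++col)
                output[row] += weights[row][col] * input[col]; // matrix multiply
            output[row] += shift[row];                         // vector addition
            output[row] = std::max(0.0, output[row]);          // ReLU
        }
        return output;
    }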


TRAINING

April 12, 2024

This week is the first week of training... The AI does not have a lot of time to train, so I need to work in overdrive now! The first step is to translate my array of valid moves into an array of valid boardstates reachable in a single turn. Then, I need a parallel array holding the ANN's evaluation of each of those boardstates. Finally, I need a simple function that randomly selects a move within a threshold of the best move available. After the ANN can successfully make moves, I need to train it to make better moves by reducing the overall error across the entire ANN. Essentially, I must gather a batch of boardstates (from playing 10 or so games), assess the projected value of each boardstate (by feeding it through the ANN), assess the "actual" value of each boardstate (by playing 1000+ games from it as a starting position and measuring the win rate), and adjust each of the 7650 variables within the ANN to reduce the error. Obviously, this process is tedious and long, and I am now on a time crunch, but hopefully after the first round of training I will have a solid understanding of how best to conduct this process (and repeat it many times to improve the ANN)!!
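
A sketch of the "actual" value estimate, with the playout passed in as a hypothetical hook into the game engine (since that part depends on the rest of my code):

    #include <functional>

    // Estimate the true value of a boardstate by playing many games from it
    // and measuring the win rate. The callback should play one full game from
    // the state in question and return true if the AI's side wins.
    double estimateTrueValue(const std::function<bool()>& playGameFromState,
                             int numGames = 1000) {
        int wins = 0;
        for (int i = 0; i < numGames; ++i)
            if (playGameFromState()) ++wins;
        return static_cast<double>(wins) / numGames; // win rate in [0, 1]
    }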


FINAL PRESENTATION

April 19, 2024

We are nearing the final presentation! Writing an algorithm to train the ANN took longer than expected. To do so, I needed to calculate the gradient of the error function with respect to each variable in the neural net. This has proven difficult because the memory required exceeded the stack reserve size. Whether or not the gradient is correct is also extremely hard to test due to how large the computation is. This week I also had to create my final presentation for capstone!
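
One likely fix for the stack issue (a sketch of the general technique, not necessarily what my code will end up doing) is to keep large buffers on the heap rather than in local arrays:

    #include <vector>

    void accumulateGradients() {
        // double gradients[7650][7650];          // large local arrays live on
        //                                        // the stack and can overflow it
        std::vector<double> gradients(7650, 0.0); // heap-allocated, stack-safe
        // ... accumulate each variable's partial derivative of the error ...
        (void)gradients; // placeholder so this sketch compiles cleanly
    }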


DEFENSE

May 1, 2024

The defense is coming up, so this week I spent a lot of time improving my code annotations, bringing my website up to date, and making some diagrams. I also had to introduce a learning rate into my gradient calculation because of the large initial difference between the target and estimated values. Without the learning rate, the ANN made adjustments that were too large with each training iteration, overestimating how much each variable contributed to the error of the ANN. With the end of the semester coming up, I need to prioritize continuing training and fixing my minimax algorithm.
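
The update rule itself is simple: each variable moves by only a small fraction of its gradient (0.01 here is just an illustrative value, not the rate my project uses):

    #include <vector>

    // Gradient descent step with a learning rate; a minimal sketch.
    void applyGradientStep(std::vector<double>& weights,
                           const std::vector<double>& gradients,
                           double learningRate = 0.01) {
        for (size_t i = 0; i < weights.size(); ++i)
            weights[i] -= learningRate * gradients[i];
    }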


FINISHING UP

May 8, 2024

To finish up the project, I made a lot of changes to produce a version of the project that is publicly accessible. I spent some time creating a readme.txt file documenting a high-level overview of the code, and cleaned up the documentation for individual functions. I made many more features configurable at the user level (rather than hardcoded), like selecting player types for a simple game. Finally, I finished a cleaner implementation of the minimax algorithm and set the default turn-decision algorithm to use two-turn lookahead (minimizing the opponent's best move).
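
A minimal sketch of that two-turn lookahead, with legalSuccessors and evaluate as placeholder stand-ins for the project's own move generation and evaluation functions:

    #include <algorithm>
    #include <limits>
    #include <vector>

    struct State { /* boardstate fields omitted in this sketch */ };

    std::vector<State> legalSuccessors(const State&) { return {}; } // placeholder
    double evaluate(const State&) { return 0.0; }                   // placeholder

    // Depth-2 minimax: pick the move whose worst-case reply (the opponent's
    // best answer) leaves us with the highest evaluation.
    State chooseMoveTwoPly(const State& current) {
        double bestValue = -std::numeric_limits<double>::infinity();
        State bestMove = current;
        for (const State& mine : legalSuccessors(current)) {
            std::vector<State> replies = legalSuccessors(mine);
            // Assume the opponent picks the reply that is worst for us.
            double worstReply = replies.empty()
                ? evaluate(mine) // no legal reply: score the position directly
                : std::numeric_limits<double>::infinity();
            for (const State& reply : replies)
                worstReply = std::min(worstReply, evaluate(reply));
            if (worstReply > bestValue) {
                bestValue = worstReply;
                bestMove = mine;
            }
        }
        return bestMove;
    }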