CSCI460 Senior Capstone Experience

Creating Equal Baseball Leagues

St. Norbert College

May 6th, 2023

This is my final blog post and I am proud of all the work I put in this semester. During my defense meeting, we talked about adjustments I could make to my program that would make it an even better project. One of these ideas is to display more details about the most fit league so that we can gain insight into how the leagues evolved over time. I feel like this algorithm can be applied to many different problems in our everyday lives, and can help out businesses in the area.

Thinking back on when I decided to pursue this major, I was scared that I would be too stressed out to finish. I could not have been more wrong. None of the material we learned or the projects we were assigned felt like work, and the future I have because of CSCI is bright. I truly enjoyed working on my project throughout the semester, I am happy with how it turned out, and I feel a great sense of achievement. Before this project, I had no idea what a genetic algorithm was, but now I can see how many problems can be solved with the power a genetic algorithm brings.

April 26th, 2023

It is the night before my presentation and I have some new updates and progress to share. Firstly, Dr. McVey and I modified the way that the leagues for the next generation are created. Originally, we selected the leagues that move on to the next generation only after all crossovers and mutations were finished. We found that this was skewing our data and had to find a better way to go about this problem. We came up with the idea to automatically place each league that we modify into the next generation right after we change it, instead of waiting until the end of our processes. This also means that the number of trades (crossovers), mutations, and elite leagues (leagues that automatically move on) must add up to the number of leagues per generation.
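The bookkeeping described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: `GenerationPlan`, `isValidPlan`, `buildNextGeneration`, and the two callbacks are hypothetical names, and the current generation is assumed to be sorted with the most fit league first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical settings; the field names are illustrative.
struct GenerationPlan {
    std::size_t leaguesPerGeneration;
    std::size_t elites;     // leagues copied over unchanged
    std::size_t trades;     // leagues produced by a trade (crossover)
    std::size_t mutations;  // leagues produced by a mutation
};

// The three counts must fill the next generation exactly.
bool isValidPlan(const GenerationPlan& plan) {
    return plan.elites + plan.trades + plan.mutations == plan.leaguesPerGeneration;
}

// Each modified league goes straight into the next generation.
// tradedCopy / mutatedCopy are assumed to pick a league from `current`
// and return a modified copy of it.
template <typename League, typename TradeFn, typename MutateFn>
std::vector<League> buildNextGeneration(const std::vector<League>& current,
                                        const GenerationPlan& plan,
                                        TradeFn tradedCopy, MutateFn mutatedCopy) {
    assert(isValidPlan(plan) && current.size() >= plan.elites);
    std::vector<League> next;
    next.reserve(plan.leaguesPerGeneration);
    for (std::size_t i = 0; i < plan.elites; ++i) next.push_back(current[i]);
    for (std::size_t i = 0; i < plan.trades; ++i) next.push_back(tradedCopy(current));
    for (std::size_t i = 0; i < plan.mutations; ++i) next.push_back(mutatedCopy(current));
    return next;
}
```

With 10 leagues per generation, for example, a plan of 2 elites, 5 trades, and 3 mutations is valid, while 2, 5, and 2 would leave one slot unfilled.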

I also made more progress on my goal of having an easy-to-use GUI (Graphical User Interface), with a lot of help from Dr. McVey and Dr. Diederich. This GUI enables the user to modify and set their own parameters for the genetic algorithm. I felt this was important because I wanted the user to be able to see all of their changes and settings at once, as well as while the program was actually running. I allow the user to modify the number of leagues per generation, the number of TopDogs (leagues that automatically move on), the number of trades, and the number of mutations. The user can also choose their stopping conditions. Stopping conditions are criteria that the program uses to know when to stop computing generations: the program stops when either a maximum number of generations is reached or a goal difference is achieved. Finally, the user gets to choose how many generations are displayed.
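A minimal sketch of the stopping check, assuming the "difference" is the gap between the highest and lowest team WAR in the best league. The struct and function names are made up for illustration and are not taken from the actual program.

```cpp
#include <cassert>

// Hypothetical stopping settings a user might enter in the GUI.
struct StoppingConditions {
    int maxGenerations;     // stop once this many generations have run
    double goalDifference;  // stop once the best league's difference is this small
};

// True when either condition is met.
bool shouldStop(const StoppingConditions& stop, int generation, double bestDifference) {
    assert(generation >= 0);
    return generation >= stop.maxGenerations || bestDifference <= stop.goalDifference;
}
```

The main loop would call something like `shouldStop` once per generation, after the leagues have been sorted by fitness.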

[Images: gui1, gui2]

April 9th, 2023

It's been a while since I've posted on my blog. I have made significant progress, and that is partly the reason why I have been neglecting it. Since my last update, I have successfully created and implemented the crossover and mutation functions for my genetic algorithm. A crossover is basically a trade between teams. I randomly select a league from the current generation, and then randomly select 2 teams that will trade with each other. Since all of my teams have the same structure, it is pretty simple to trade between the teams because I am guaranteed to have the correct number of players at each position.
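One way such a trade could look is sketched below. The post does not say exactly which players change hands, so this sketch makes an assumption: every team stores its players in the same slot order, and the trade swaps the players in one randomly chosen slot. All names here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

struct Player {
    int position;  // hypothetical position code
    double war;
};

using Team = std::vector<Player>;  // assumption: same size and slot order on every team
using League = std::vector<Team>;

// Swap the players in one random roster slot between two distinct random teams.
void tradeWithinLeague(League& league, std::mt19937& rng) {
    assert(league.size() >= 2 && !league[0].empty());
    std::uniform_int_distribution<std::size_t> pickTeam(0, league.size() - 1);
    std::size_t a = pickTeam(rng);
    std::size_t b = pickTeam(rng);
    while (b == a) {
        b = pickTeam(rng);  // make sure the two teams are different
    }
    std::uniform_int_distribution<std::size_t> pickSlot(0, league[a].size() - 1);
    std::size_t slot = pickSlot(rng);
    std::swap(league[a][slot], league[b][slot]);
}
```

Because both players come from the same slot, each team keeps the same number of players at every position after the swap.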

Mutation is also fairly simple because I have a Player object that stores each player's data along with functions to access and modify it. To mutate, I also start by selecting a random league in the current generation and then select a random player from one of its teams. A mutation resembles either a hot or cold streak for a player. I give the player an updated WAR by selecting a random value between the lowest and highest WARs in the league. Once I perform crossover and mutation in my generation, the best leagues (besides the best leagues that automatically move on) are selected and move on to the next generation, where these processes are repeated.
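The mutation step might be sketched like this. To keep the example short, a team is reduced to a list of WAR values, which is a simplification and not how the project stores players. The function name is hypothetical, and a uniform draw is assumed since the post only says "a random value".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Team = std::vector<double>;  // assumption: a team is just its players' WAR values
using League = std::vector<Team>;

// Give one random player a new WAR drawn uniformly between the league's
// lowest and highest WAR. Assumes every team has at least one player.
void mutateLeague(League& league, std::mt19937& rng) {
    assert(!league.empty());
    double lo = league[0].at(0);
    double hi = lo;
    for (const Team& team : league) {
        for (double war : team) {
            lo = std::min(lo, war);
            hi = std::max(hi, war);
        }
    }
    std::uniform_int_distribution<std::size_t> pickTeam(0, league.size() - 1);
    Team& team = league[pickTeam(rng)];
    std::uniform_int_distribution<std::size_t> pickPlayer(0, team.size() - 1);
    std::uniform_real_distribution<double> newWar(lo, hi);
    team[pickPlayer(rng)] = newWar(rng);
}
```

A draw near the top of the range plays the role of a hot streak, and one near the bottom a cold streak.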

[Images: trade1, trade2, mutation]

March 15th, 2023

Dr. McVey helped me again, and we solved my solutions problem. Instead of creating a struct for my solutions, we created a 2-dimensional array that represents a grouping of leagues. I can sort the leagues in each solution by the difference between the highest team WAR and the lowest team WAR. We were also able to solve my memory allocation issue by allocating memory dynamically. Spring break is coming up and I am going down to Florida for a baseball tournament, and I hope to work on my project over break. Up next, I will try to tackle the crossover and mutation functions.
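Sorting by that difference could look roughly like the sketch below. It assumes each team has already been reduced to its total WAR, and it uses `std::vector` in place of the dynamically allocated 2-dimensional array. `leagueSpread` and `sortBySpread` are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumption: each entry is one team's total WAR, so a league is a list of totals.
using League = std::vector<double>;

// Difference between the strongest and weakest team; smaller means more balanced.
double leagueSpread(const League& league) {
    assert(!league.empty());
    const auto minMax = std::minmax_element(league.begin(), league.end());
    return *minMax.second - *minMax.first;
}

// Sort a solution (a grouping of leagues) so the most balanced league comes first.
void sortBySpread(std::vector<League>& solution) {
    std::sort(solution.begin(), solution.end(),
              [](const League& a, const League& b) {
                  return leagueSpread(a) < leagueSpread(b);
              });
}
```

After sorting, the first league in the solution is the most fit one under this measure.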

March 8th, 2023

I have been struggling lately when trying to form my solutions. I am trying to make a League struct, which I envision holding a private member for the WAR and a private array to hold leagues. This is causing me problems, and I am researching ways to fix it. I also have a memory-based problem, and I think that my data structures are allocating too much memory. I am getting errors when trying to create leagues of larger sizes. I am also starting to think about how my crossover and mutation functions will work.

March 1st, 2023

I changed my team composition today to emulate a more realistic team... a team now consists of 3 pitchers, 2 catchers, 6 infielders, and 6 outfielders. With the help of Dr. McVey, I created a Team object so that I can make a league, which is an array of Team objects. I can also find the difference of each league, which will be important when it comes to the crossover function and finding the most fit leagues when the leagues are in solutions (groupings of leagues).
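As a rough sketch of that structure, the roster sizes can be written down once and checked against any team. The type aliases and the `hasValidRoster` helper are assumptions made for illustration; the real Team object likely holds more than this.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

enum class Position { Pitcher, Catcher, Infielder, Outfielder };

struct Player {
    Position position;
    double war;
};

// Roster sizes from the post: 3 pitchers, 2 catchers, 6 infielders, 6 outfielders.
constexpr std::array<std::size_t, 4> kRosterSlots = {3, 2, 6, 6};

using Team = std::vector<Player>;
using League = std::vector<Team>;  // a league is a collection of teams

// True when the team has exactly the required number of players at each position.
bool hasValidRoster(const Team& team) {
    std::array<std::size_t, 4> counts = {0, 0, 0, 0};
    for (const Player& p : team) {
        ++counts[static_cast<std::size_t>(p.position)];
    }
    return counts == kRosterSlots;
}
```

A check like this is useful after any operation that moves players around, since every team should still have 17 players in the same position mix.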

February 22nd, 2023

Today I was able to create a couple of teams made up of the players from my data set. Each team is composed of 1 pitcher, 1 catcher, 4 infielders, and 3 outfielders without any duplicates or problems. I can also add up the team's total WAR, which is a way to show how good a team is based on its players. My next step is going to be creating multiple leagues with these teams... but for now I am happy and having fun!
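Adding up a team's WAR is a one-line sum. The sketch below uses a stripped-down `Player` and a hypothetical `teamTotalWar` function rather than the project's own class.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical, stripped-down player: only the WAR value matters here.
struct Player {
    double war;
};

// Sum the WAR of every player on a team.
double teamTotalWar(const std::vector<Player>& team) {
    return std::accumulate(team.begin(), team.end(), 0.0,
                           [](double sum, const Player& p) { return sum + p.war; });
}
```

Note that WAR can be negative, so a weak player lowers the team total instead of just adding nothing.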

February 20th, 2023

As of tonight, I have a base for my genetic algorithm. It works when I use 3 variables and plug them into a function, and see how close I can get to a certain value after the calculation is completed. Now, I have to figure out how I can plug teams of players into this algorithm to achieve my goal of creating equal teams. I also gathered all of my data for my players via https://www.fangraphs.com/leaders.aspx?pos=all&stats=bat&lg=all&qual=y&type=8&season=2022&month=0&season1=2022&ind=0&team=0&rost=1&age=0&filter=&players=0&startdate=&enddate=.

I have decided to group my players into 4 "positions" as of now: Pitchers, Catchers, Infielders, and Outfielders. I did this because I think it will be easier to work with, and if I want to go back and change this in the future when I know I have a working solution, I can do so. I made a Player class in C++ and successfully made an accurate Player object for all 560 players (I know it's a lot!). The object contains a player's name, position, and WAR (I call it rank). I am confident about my progress but I know I still have a long way to go.
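A Player class along these lines might look like the following minimal sketch. The member names, the `Position` enum, and the accessors are assumptions; only the three stored fields (name, position, and WAR as "rank") come from the description above.

```cpp
#include <cassert>
#include <string>
#include <utility>

// The four position groups described above.
enum class Position { Pitcher, Catcher, Infielder, Outfielder };

// Illustrative Player: a name, a position group, and a WAR value ("rank").
class Player {
public:
    Player(std::string name, Position position, double rank)
        : name_(std::move(name)), position_(position), rank_(rank) {}

    const std::string& name() const { return name_; }
    Position position() const { return position_; }
    double rank() const { return rank_; }

    // Lets a later step overwrite the WAR value.
    void setRank(double rank) { rank_ = rank; }

private:
    std::string name_;
    Position position_;
    double rank_;
};
```

Keeping the position as an enum with only four values makes it easy to regroup players later if the position categories change.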

February 15th, 2023

I started my attempt at coding my genetic algorithm today and found a good video as a reference for getting started. This YouTuber did a really good job of explaining what was going on step by step, and I found it really helpful since I do not have a lot of experience with genetic algorithms. I can see that using premade libraries is going to be very useful and I am excited to see how this portion of my project goes. https://www.youtube.com/watch?v=SWi-4IHFf1c.

February 12th, 2023

Our mini poster board sessions went really well. They helped me organize my thoughts, and explaining my project to others proved to me that I knew the overall plan going forward.

February 8th, 2023

I have been thinking about which language I would like to write in, and I think C++ will be my choice. Since my project is reliant on data and I do not need any visual tools, I think this is the right choice. My only problem is that I need a form for users to interact with, and I think this link will be very helpful: https://social.msdn.microsoft.com/Forums/vstudio/en-US/e6fbde42-d872-4ab3-8000-41ab22a4a584/visual-studio-2017-windows-forms?forum=winformsdesigner.

February 6th, 2023

Since I've had minimal experience with genetic algorithms, this past week has been spent watching videos and researching different examples and ways to design genetic algorithms. This video was a good source: https://www.youtube.com/watch?v=uQj5UNhCPuo. I also had a meeting with Dr. McVey this past week which helped clear up my initial questions. I am starting to get a better understanding of what I need to achieve and I am not as frightened as I was to begin with.

I decided to use the WAR statistic (Wins Above Replacement) to rank and compare my players. Essentially, WAR takes a complete look at a baseball player, and uses a formula to calculate how valuable they are with hitting, defense, pitching, and more. This statistic also adjusts for different ballparks and the level of competition on the opposing teams. The higher the WAR for a player, the more valuable they are, because that means they won more games for your team than a Minor League player or free agent playing the same position would have. Now I have to think of a way to measure the balance of each league.

January 30th, 2023

Hello and welcome to my first blog post! After receiving our topics last week, I found out that I will be working with genetic algorithms. I had a meeting with Dr. McVey to go over a plan and what to start thinking about and I am excited to get started.

I got assigned the task of creating balanced teams based on a pool of professional baseball players using a genetic algorithm. I see this project as two smaller projects. The first part is going to be deciding how to assign a player a value so that my algorithm can create equal teams. This will be challenging because there are countless different stats to consider for each player, and pitchers don’t have any comparable statistics with position players. In order to ensure my teams will be equally created, I need to find an accurate system to give value to these players. The second part will be creating my genetic algorithm and deciding how to know if and when the teams are equal in my set. For example, I could compare average values of each team in the leagues, or compare the difference between the “top” and “bottom” teams.

Right now, my project seems intimidating because I have never dealt with genetic algorithms, but I am excited to be working with baseball statistics and am willing to learn. I will spend this week brainstorming my system for evaluating players and trying to find relevant sources online.