*DQ: DQ stands for disqualification. A player is disqualified after making a set number of invalid moves in a game: 3 in Tic-Tac-Toe, 6 in Connect Four, and 15 in Gomoku. A move is invalid if the LLM's response does not follow the specified format, gives a row or column outside the allowed range, or chooses a position that is already occupied by a previous move.
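The disqualification rule can be sketched in a few lines of Python. This is a minimal illustration, not the simulator's actual code: the names `DQ_THRESHOLDS`, `is_valid_move`, and `InvalidMoveCounter` are hypothetical, the board is assumed to be a list of rows with `None` marking an empty cell, and a response that failed format parsing is assumed to arrive as `None`. It also assumes invalid moves are counted cumulatively over one game and that the player is disqualified as soon as the count reaches the threshold.

```python
# Thresholds as stated in the DQ note above.
DQ_THRESHOLDS = {"Tic-Tac-Toe": 3, "Connect Four": 6, "Gomoku": 15}


def is_valid_move(board, move):
    """Return True if `move` is a legal (row, col) on `board`.

    Assumptions: `board` is a list of rows, empty cells hold None,
    and `move` is None when the LLM response did not match the format.
    """
    if move is None:  # response did not follow the specified format
        return False
    row, col = move
    if not (0 <= row < len(board) and 0 <= col < len(board[0])):
        return False  # row or column out of the allowed range
    return board[row][col] is None  # position must not be occupied


class InvalidMoveCounter:
    """Tracks one player's invalid moves within a single game."""

    def __init__(self, game):
        self.threshold = DQ_THRESHOLDS[game]
        self.invalid_moves = 0

    def record(self, board, move):
        """Count the move if invalid; return True once the player is DQ'd."""
        if not is_valid_move(board, move):
            self.invalid_moves += 1
        return self.invalid_moves >= self.threshold
```

For example, in Tic-Tac-Toe a player who sends one malformed response, one out-of-range position, and one already occupied position would reach the threshold of 3 and be disqualified on the third invalid move, under the assumptions above.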
If you would like to submit your results to the leaderboard, please send the zip file that is downloaded after running the game simulation to research.explorations@gmail.com. Please contact us if you have any questions.
For a deeper look at the results of the games, please see the Results Matrix.