Reflections on the Process of Writing a Chess Engine

Over the last year, I have sporadically been working on writing a chess engine from scratch in C++. As an avid chess player in my youth, I decided that the project would provide a good opportunity both to revisit my interest in the subject and to further my applied programming experience. Prior to this project, I had not spent much time working in C++; as such, I used the project as an opportunity to develop my skills in the language. In this post, I'll briefly discuss the process of writing the engine, as well as certain issues I encountered along the way.

First Attempts

When first starting the process of writing the engine, I had to spend a significant amount of time familiarizing myself with the nuances and conventions of chess programming. In my final engine, I decided to go with bitboards to represent the locations of pieces on the board. In my implementation, bitboards are 64-bit integers in which each bit corresponds to a square on the board; for example, the least significant bit corresponds to A1, while the most significant bit corresponds to H8. Storing one bitboard for each type of piece fully represents the state of the board, where a set bit indicates the presence of that type of piece on that square. From what I can tell, bitboards are considered the current "standard" board representation. Other options exist, but in my experience bitboards have the greatest amount of preexisting literature on the subject, which greatly assists with construction of the engine.
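
As a small illustration (the types and names here are simplified stand-ins, not my engine's actual layout), the core of such a representation might look like this:

#include <cstdint>

// One 64-bit integer per piece type and color; bit 0 is A1, bit 63 is H8,
// and a set bit means a piece of that type stands on that square.
using Bitboard = std::uint64_t;

constexpr Bitboard square_bit(int square) { return Bitboard{1} << square; }

struct Position {
    Bitboard white_pawns   = 0;
    Bitboard white_knights = 0;
    // ... one bitboard for each of the twelve piece/color combinations ...
    Bitboard black_king    = 0;
};

inline void set_square(Bitboard& b, int square) { b |= square_bit(square); }
inline bool has_square(Bitboard b, int square)  { return (b >> square) & 1; }

With square indices running from A1 = 0 to H8 = 63, calling set_square(pos.white_pawns, 8) on a Position pos would, for example, place a white pawn on A2.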

After a while, I eventually got the engine to a point where it "worked" and I was able to play games against it. Its performance was adequate for the amount of time I'd spent on it; however, I soon hit a wall. My efforts to add further improvements to the engine, such as transposition tables and late move reductions, proved fruitless, and I ran into a seemingly exponentially growing number of bugs. As a result, I decided to set the project aside for the time being.

Starting Fresh

After eventually coming back to the project, I decided to take the opportunity to scrap all of my code and start anew, in hopes of avoiding bugs becoming ingrained deep in the framework of the engine. As such, I took a much more careful approach to writing my move generation code. After writing my initial pseudo-legal generator, I went through all of the Perft test positions from the linked page and checked whether my node counts matched the published numbers; Perft simply counts the leaf nodes reachable at a fixed depth from a given position, so any mismatch points to a move generation bug. Whenever there was a discrepancy, I followed the methodology described in this blog. Through this process, I was able to write a move generator that I am confident does not contain any bugs.
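
For reference, the Perft routine itself is only a few lines. Below is a rough sketch; the Board and Move interface names (generate_moves, make_move, unmake_move, is_legal) are illustrative rather than my engine's actual API, and the legality check is there because the generator is pseudo-legal:

#include <cstdint>

// Count the leaf nodes reachable from this position at the given depth.
template <typename Board>
std::uint64_t perft(Board& board, int depth) {
    if (depth == 0)
        return 1;

    std::uint64_t nodes = 0;
    for (const auto& move : board.generate_moves()) {
        board.make_move(move);
        if (board.is_legal())   // skip pseudo-legal moves that leave our own king in check
            nodes += perft(board, depth - 1);
        board.unmake_move(move);
    }
    return nodes;
}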

Once I finished that portion of the project, I again attempted to implement a transposition table. To my surprise, this time it worked as intended, and I achieved a significant speedup in my search. After that, I found that the combination of PVS/NegaScout, LMR, quiescence search, and null-move pruning gave very good results, more than doubling the depth I was originally able to reach. The evaluation function proved to be difficult to sort out, but even just including the basics of king safety, pawn structure, and mobility resulted in the engine being able to beat engines and people with Elo ratings up to around 2000.
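
Conceptually, the transposition table is just a large array of cached search results indexed by the position's Zobrist hash. A stripped-down sketch (with illustrative names, and with the replacement policy and mate-score handling omitted) might look like this:

#include <cstdint>
#include <vector>

enum class Bound : std::uint8_t { Exact, Lower, Upper };

// One cached search result, keyed by the position's Zobrist hash.
struct TTEntry {
    std::uint64_t key   = 0;   // Zobrist hash of the position
    int           score = 0;   // score found for this position
    int           depth = 0;   // depth the score was searched to
    Bound         bound = Bound::Exact;
};

struct TranspositionTable {
    std::vector<TTEntry> entries;

    explicit TranspositionTable(std::size_t size) : entries(size) {}

    TTEntry* probe(std::uint64_t key) {
        TTEntry& entry = entries[key % entries.size()];
        return entry.key == key ? &entry : nullptr;   // nullptr if nothing usable is stored
    }

    void store(std::uint64_t key, int score, int depth, Bound bound) {
        entries[key % entries.size()] = {key, score, depth, bound};
    }
};

Before searching a node, the engine probes the table; if an entry for the same position was already searched to at least the required depth, its score or bound can be reused instead of re-searching the subtree.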

Implementing the UCI Protocol

One major stumbling block I ran into was my attempt to connect my engine to a GUI, with the intention of facilitating matches against other engines. I decided that I wanted to use the UCI protocol to do so, which transfers commands between the engine and the GUI as plain text over standard input and output. The basic structure of the UCI protocol loosely follows the Model-View-Controller framework, and requires writing a loop that reads commands and manages the engine's responses. While ostensibly this is a simple procedure, I encountered a few problems along the way that greatly complicated the process. The issue that took the most time to resolve dealt with processing input while the engine is searching; for example, the engine needs to be able to stop, report search information, or confirm that it is still responsive at any point during a search. As I found, by far the easiest way to do this is with rudimentary multithreading. The basic structure of my loop method went roughly as follows:

#include <iostream>
#include <string>
#include <thread>

void loop() {
    std::string line;
    while (std::getline(std::cin, line)) {
        if (line == "isready") {
            std::cout << "readyok" << std::endl;
        } else if (line == "uci") {
            // ... identify the engine, list its options, and reply "uciok" ...
        } else if (line.rfind("go", 0) == 0) {   // "go", possibly with search options
            // launch the search on its own thread and detach it, so this loop
            // can keep reading input while the search runs
            std::thread([] { /* run the search */ }).detach();
        }
        // ... handle "position", "stop", "quit", and the rest ...
    }
}

I ended up creating and destroying one additional thread to handle searches in the background while still collecting input in the original thread, which fixed the majority of the issues. However, to terminate searches cleanly, the search method needs some way of knowing that it has been stopped. To do so, I created a struct that I passed by reference into the search function; among other things, it records whether the search has been stopped. If "stop" is sent to the loop at any point, the struct is updated and the search function terminates. After finishing this, the engine worked properly with Arena and other UCI-compatible GUIs.
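
In outline (the names below are placeholders rather than my actual fields), the shared state can be as small as a single flag; making it a std::atomic<bool> avoids a data race between the input thread writing it and the detached search thread reading it:

#include <atomic>

// Shared between the UCI input loop and the detached search thread.
struct SearchInfo {
    std::atomic<bool> stopped{false};   // set by the loop when "stop" arrives
    // ... time limits, node counts, and other bookkeeping ...
};

// The search polls this and unwinds as soon as the flag is set.
inline bool should_abort(const SearchInfo& info) {
    return info.stopped.load();
}

The search can then check should_abort periodically (for example, every few thousand nodes) and return up the call stack as soon as it reports true.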

Further Steps

I feel that I am mostly finished with the current iteration of the engine. Further improvements at this point would come either through further refinement of the evaluation function or through tedious tuning of various parameters. One thing I could still add is proper time management: at the moment, the engine by default simply searches to a fixed depth every time "go" is sent, which could result in unnecessary timeouts when the engine gets low on time. As such, I could take the remaining clock time into account when deciding how deeply to search each move. The engine can comfortably beat me now, which was my original goal; although the process proved longer than expected, I would consider the project a success.
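
If I do add it, one simple scheme (sketched below with placeholder names, an untuned divisor, and the assumption that the search is driven by iterative deepening) is to give each move a slice of the remaining clock and stop deepening once that slice runs out:

#include <chrono>

using Clock = std::chrono::steady_clock;

// Give the current move a fraction of the remaining clock time. The divisor is
// a rough guess at the number of moves left in the game, not a tuned value.
Clock::duration time_budget(Clock::duration time_left, int moves_to_go = 30) {
    return time_left / moves_to_go;
}

void think(Clock::duration time_left) {
    const auto deadline = Clock::now() + time_budget(time_left);
    for (int depth = 1; Clock::now() < deadline; ++depth) {
        // search to `depth`, keeping the best move from the last completed iteration
    }
}

A real implementation would also account for increment and leave a safety margin, but even a crude budget like this would avoid the fixed-depth timeouts described above.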

As a final addendum, I tested my engine using the Strategic Test Suite and received a rating of 1940 overall. This is an aggregate of 15 different ratings, each obtained from a set of 100 problems. These results seem reasonable considering the strength of my program in practice. Notably, however, the variance among these 15 ratings is massive: 444 points. The engine received scores as low as 1280 on pawn advancement and as high as 2871 on recapturing heuristics. This aligns well with what I perceived the strengths and weaknesses of my engine to be. If I were to continue tweaking evaluation features, my first focus would probably be on pawn advancement and passed-pawn scores.