James Landis
Hearts Neural Network Suitability Analysis
February 2000

There are several distinct but interrelated tasks that must be learned to create an effective AI to play Hearts. These tasks include passing cards, shooting the moon, and preventing other players from shooting. Over the course of the game, the AI should learn how its opponents play and adjust its own style of play accordingly. The first task is to learn which of the cards remaining in a hand to play on each turn in order to yield the lowest possible score for that hand relative to the opponents. The second task is to learn which cards in a given hand should be passed so that the hand is likely to earn the lowest possible score relative to the opponents. Finally, the AI should learn to base its decisions for the first two tasks on the play of the opponents over the course of the game.

There is one main performance measure for these tasks: the sum, over all opponents, of the difference between the opponent's score and the AI's score. This sum can be taken for a single hand or for the running total of the game.

The training experience should be gained mainly from the AI playing games against itself. Because the cards in each hand are dealt at random, training covers a broad sample of the space of possible hands and opponents' hands. At first, the algorithm will simply play random cards, but over time it will discover the nuances of the game, such as shooting the moon and sloughing cards on short-suited hands.

The target function should be learned in phases. In the first phase, the function will not learn how to pass or how to adjust its behavior to the opponents' play. This function takes as input the set of all cards in hand and the set of all cards played (and who played them). The second phase will learn how to pass. It will take the current hand as input and will consider the expected outcome of each hand that it might hold after the pass.
These hands will be evaluated using the function learned in the first phase. Game play will then also take as input the cards passed and the cards received. Finally, more inputs will be added to learn the play of the opponents. Over the course of the game, these inputs will be adjusted based on certain aspects of the opponents' play: whether or not they are ducking tricks, what kinds of cards they pass, how often they try to shoot the moon, and so on.

The function should be represented as a feed-forward neural network trained using backpropagation. Much like Backgammon, Hearts has an element of randomness (the dealing of the cards) and proceeds in a linear fashion to the end of the game. Tesauro's network was able to learn to play Backgammon very well because of these features, so a neural network should also be able to learn to play Hearts very well. A linear or polynomial function of the inputs is unlikely to capture the complex interactions among the variables inherent in the game.
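
To make the first-phase inputs and the performance measure described earlier more concrete, here is a minimal sketch in Python. It is only one possible encoding, not one specified in this analysis: the card naming, the one-hot layout of 260 input values, and the names encode_state and score_margin are all assumptions made for illustration.

```python
# Hypothetical encoding; none of these names come from the analysis itself.
SUITS = "CDHS"
RANKS = "23456789TJQKA"
DECK = [r + s for s in SUITS for r in RANKS]      # 52 cards, e.g. "QS"
INDEX = {card: i for i, card in enumerate(DECK)}  # card -> position 0..51
NUM_PLAYERS = 4


def encode_state(hand, played):
    """Build a first-phase input vector (assumed layout).

    hand   -- iterable of card strings currently held by the AI
    played -- iterable of (player, card) pairs, with player in 0..3
    Returns a flat list of 52 + 4 * 52 = 260 values, each 0.0 or 1.0:
    one block for the hand, then one block per player for cards played.
    """
    vec = [0.0] * (len(DECK) * (1 + NUM_PLAYERS))
    for card in hand:
        vec[INDEX[card]] = 1.0
    for player, card in played:
        vec[len(DECK) * (1 + player) + INDEX[card]] = 1.0
    return vec


def score_margin(scores, me):
    """Sum over opponents of (opponent score - own score).

    Lower Hearts scores are better, so a larger margin is better for `me`.
    Works for a single hand or for running game totals.
    """
    return sum(s - scores[me] for i, s in enumerate(scores) if i != me)
```

For example, with hand scores of 5, 0, 13, and 8 and the AI seated as player 0, score_margin([5, 0, 13, 8], 0) is (0 - 5) + (13 - 5) + (8 - 5) = 6. A vector such as the one returned by encode_state could serve as the input layer of the feed-forward network. The later phases would extend it with the cards passed and received and with features describing the opponents' play.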