class: center, middle

# Artificial Intelligence

### Lab 2

---

class: medium

# Lab 2

* In Lab 2 you will implement an agent that uses Monte Carlo Tree Search to play blackjack
* You will get a Python "framework" with an implementation of the game and some agents to compare with
* You can work in groups of up to two students
* The submission deadline is on Tuesday, 3/23, AoE

---

class: medium

# Blackjack

* Blackjack is a popular card game in which players play against the dealer/bank
* The player is dealt two cards and can request to draw more cards from the deck
* The goal is to get a sum of card values close to 21, but not over 21
* For example: Ten of Spades, Three of Hearts, Seven of Clubs are 10+3+7=20 points
* Jack, Queen and King are 10 points each, an Ace can count as 1 **or** 11 points (player's choice)

---

class: medium

# Blackjack

* After the player has performed their actions, the dealer draws cards until they have more than 16 points
* If the player has more than 21 points, they lose
* (Else) If the dealer has more than 21 points, the player wins
* (Else) If the player then has more points than the dealer, the player wins
* (Else) If there is a tie, no one wins
* (Else) The dealer has more points, and the player loses
* The winner wins the bet, which is counted as points

---

# Blackjack
([Source](https://www.blackjackapprenticeship.com/how-to-play-blackjack/))

---

class: medium

# Blackjack: Player Actions

A player can do one of four things:

* **Hit**: Request one more card
* **Stand**: Stop taking cards, passing the turn to the dealer
* **Double Down**: Draw exactly one more card and then stand, and double the bet (win or lose $2)
* **Split**: If the first two cards have the same value, the player can split them into two hands, and continue playing with these two independently (each hand wins/loses $1)

---

# Blackjack Strategy
([Source](https://www.blackjackinfo.com/blackjack-basic-strategy-engine/?numdecks=6&soft17=s17&dbl=all&das=yes&surr=ns&peek=yes))

---

# MCTS for Blackjack

* Blackjack is a nice game for our purposes
* There is randomness
* We only have (up to) 4 actions to choose from
* Games only last a few turns
* There is actual strategy!

---

# MCTS for Blackjack
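Randomness, few actions, and short games make a single random playout cheap, and random playouts are what the simulation phase of MCTS relies on. The following is a minimal sketch of one playout under the rules from the earlier slides (an Ace counts as 1 or 11, the dealer draws until they have more than 16 points). It is independent of the lab framework: `hand_value`, `random_playout`, `DECK`, and the plain-number card representation are assumptions made for illustration, and only hit and stand are modeled.

```python
import random

def hand_value(cards):
    """Best total of a hand; 11 stands for an Ace and is counted as 1 if needed."""
    total = sum(cards)
    aces = cards.count(11)
    while total > 21 and aces:
        total -= 10  # count one Ace as 1 instead of 11
        aces -= 1
    return total

def random_playout(deck, player, dealer, bet=1, rng=random):
    """Finish one round with random hit/stand choices and return the winnings."""
    deck = deck[:]  # work on copies so the caller's lists stay unchanged
    rng.shuffle(deck)
    player, dealer = player[:], dealer[:]
    # Player: hit or stand at random until standing or going over 21
    while hand_value(player) <= 21 and rng.choice(["hit", "stand"]) == "hit":
        player.append(deck.pop())
    if hand_value(player) > 21:
        return -bet
    # Dealer: draw until they have more than 16 points
    while hand_value(dealer) <= 16:
        dealer.append(deck.pop())
    p, d = hand_value(player), hand_value(dealer)
    if d > 21 or p > d:
        return bet
    return 0 if p == d else -bet

# Hypothetical deck: card values only, four suits (10, J, Q, K all count 10)
DECK = [2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10, 11] * 4

# Average result of many random playouts from one starting position
remaining = DECK[:]
for card in [10, 6, 9]:
    remaining.remove(card)  # visible cards are no longer in the deck
results = [random_playout(remaining, [10, 6], [9]) for _ in range(1000)]
print(sum(results) / len(results))
```

A single playout says very little, but the average over many playouts is a usable estimate of how good a position is, which is what MCTS accumulates in its tree.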
---

class: center, middle

# The "Framework"

---

# Framework Structure
---

# Running the Framework

```
> python blackjack.py -q
Average points: -1.58
> python blackjack.py -q basic
Average points: -0.3
> python blackjack.py -q -n 3 timid
Average points: -0.6666
> python blackjack.py --help
usage: blackjack.py [-h] [-n COUNT] [-s] [-r] [-d D] [player]

Run a simulation of a Blackjack agent.
...
```

---

# Framework: Game

```python
class Game:
    def __init__(self, cards, player, split_rule, verbose):
        pass

    def round(self):
        pass

    def continue_round(self, player_cards, dealer_cards, bet):
        pass

def main(dtype, split_rule, verbose):
    deck = deck_types[dtype]
    p = Player("Playername", deck[:])
    g = Game(deck, p, split_rule, verbose)
    winnings = g.round()
```

---

# Deck Types?!

* No one said you *have* to play with a standard 52-card deck
* There are several (fictional) deck "types" defined in the code (select with `-d type`)
* Your agent will be able to (automatically!) play with any of them
* If you have ever wondered what would happen if a card was valued 12 points, or 1.5 points, etc., you can find out
* If you haven't, you will still find out

---

# MCTS

```python
class MCTSPlayer(Player):
    def get_action(self, cards, actions, dcards):
        # Remove the cards that are already on the table from a copy of the deck
        deck = self.deck[:]
        for p in cards + dcards:
            deck.remove(p)
        p = RolloutPlayer("Rollout", deck)
        g1 = Game(deck, p, verbose=False)
        results = {}
        for i in range(MCTS_N):
            p.reset()
            # Play the round to its end, starting from the current cards
            res = g1.continue_round(cards, dcards, self.bet)
            # (elided on the slide) record res in results
        # (elided on the slide) pick best_action from results
        return best_action
```

---

class: medium

# MCTS

* Currently, the `RolloutPlayer` **does not construct a tree**
* You will have to come up with a way to store and construct the tree!
* When the `RolloutPlayer` should perform an action, you have to locate where in the tree it currently is:
    - All actions expanded: Use selection strategy
    - Not all actions expanded: Choose one to expand and start simulating
    - In simulation: Choose action randomly (optionally: with a different strategy)

---

class: medium

# Report and Evaluation

* Gambling is bad
* But let's see how bad
* Compare your agent's performance with the other agents'
* By default, you will get an average over 100 games
* For the final evaluation, use more repetitions to reduce the variance
* Optional: Statistical analysis

---

class: medium

# Report and Evaluation

* You can use whichever selection strategy you want (epsilon-greedy, roulette wheel, UCT, your own invention, ...)
* It will likely have some parameter(s)
* Test whether and how different parameter values affect your agent's performance
* Also vary the general `MCTS_N` parameter!

---

# References

* [MCTS Tutorial](https://www.cs.swarthmore.edu/~bryce/cs63/s16/reading/mcts.html)
* [MCTS Slides](https://www.lri.fr/~sebag/Slides/InvitedTutorial_CP12.pdf)
* [MCTS for a Card Game](http://teaching.csse.uwa.edu.au/units/CITS3001/project/2017/paper1.pdf)
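---

class: medium

# MCTS: Tree Sketch

The three cases the `RolloutPlayer` has to distinguish (all actions expanded, not all actions expanded, in simulation) could be handled roughly as sketched below. This is one possible design and not part of the framework: `SearchTree`, `choose`, `update`, the use of hashable state tuples as dictionary keys, and the exploration constant `c=1.4` are all assumptions. UCB1 is used for selection here, but any of the strategies from the evaluation slide could take its place.

```python
import math
import random

class SearchTree:
    """Per-state action statistics plus the path taken in the current rollout."""

    def __init__(self, c=1.4, rng=None):
        self.stats = {}  # state -> {action: [visits, total result]}
        self.c = c       # exploration constant for UCB1
        self.rng = rng or random.Random()
        self.reset()

    def reset(self):
        """Call before every rollout."""
        self.path = []           # (state, action) pairs chosen inside the tree
        self.simulating = False

    def choose(self, state, actions):
        if self.simulating:
            # In simulation: choose an action randomly
            return self.rng.choice(actions)
        node = self.stats.setdefault(state, {})
        untried = [a for a in actions if a not in node]
        if untried:
            # Not all actions expanded: expand one, then start simulating
            action = self.rng.choice(untried)
            self.simulating = True
        else:
            # All actions expanded: selection strategy (UCB1)
            n = sum(visits for visits, _ in node.values())

            def ucb1(a):
                visits, total = node[a]
                return total / visits + self.c * math.sqrt(math.log(n) / visits)

            action = max(actions, key=ucb1)
        self.path.append((state, action))
        return action

    def update(self, result):
        """Backpropagate the result of a finished rollout along the path."""
        for state, action in self.path:
            entry = self.stats[state].setdefault(action, [0, 0.0])
            entry[0] += 1
            entry[1] += result
```

A rollout would call `reset()` once, `choose(state, actions)` at every decision, and `update(result)` with the final winnings. A state could be, for example, a tuple of the player's and dealer's cards, so that different card draws after the same action lead to different entries.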