Jabril: D7 John-Green-bot: Miss. John-Green-bot: I-9 Jabril: NOOOOOOOOOO Hey, I’m Jabril and welcome to Crash Course AI! John-Green-bot might struggle with some things like natural language processing and moving through the real world, but using AI he’s pretty good at board games. AI researchers spend a LOT of time trying to teach AI how to beat humans at games, & this isn’t just because games are fun. Games provide constrained scenarios for testing new AI algorithms and approaches. In a game, it’s easy to know whether the AI is winning or losing, because there’s usually a score or some objective measure of “winning.” This is great because AI learns from examples, trying things out, and slowly improving. Games basically provide their own training data, which is a big relief because AI systems need lots of training data to get really good. An AI can even play against itself to generate training data and evolve better game strategies. Or an AI can be programmed to look at previous games (even games played by expert humans) for strategies that lead to victory. Comparing AIs against expert human gamers can also help us figure out how an AI is improving over time. This comparison also gives us a sense of the difficulty of problems an AI can solve. In other words, if I can teach John-Green-Bot to beat me at battleship, I can probably also teach him to beat me at any game that’s simpler than battleship. Finally, games are cool (and that’s important too!). Sometimes AI programming can feel a bit difficult or dry because of all the math and troubleshooting, and games can provide a fun motivation to learn all this stuff. This is the reason why some of my first AI demos were games. For all these reasons, games and computing have a rich history that’s worth diving into. For that, let’s go to the Thought Bubble. Humans have always been fascinated by the idea that machines could beat us at our own strategy games. In 1770, the inventor Wolfgang von Kempelen revealed an invention to the Empress Maria Theresa of Austria. The Mechanical Turk was a clock-like contraption that appeared to play chess. Chess pieces were attached to rods that were hooked into a bathtub-sized box. After Empress Maria made a move on the board, she turned a crank that activated the machine, which would move chess pieces mechanically. To her surprise, the machine was able to beat most challengers. However, it was an elaborate hoax and The Mechanical Turk was actually controlled by a person hidden inside! Getting a machine to play chess is actually really complicated. So when AI researchers first tried to tackle the problem in the late 1940s and early 1950s, they focused on simpler chess situations. Like, games with only a few pieces remaining on the board or full games played on a small 6x6 board without any bishops. At the same time, researchers worked on an AI that could play checkers, because checkers looked easier… although it was really almost as complicated. The first program to play a legitimate game of checkers was built by IBM in 1956. And, in a classic cold war move, two programs that could play a full chess game were developed in parallel by the US and Russia in 1957. But these programs didn’t get /good/ for another 40 years. Checkers was first, with a program called Chinook which started dominating masters in 1995. Chess followed when a computer called Deep Blue beat the chessmaster Garry Kasparov in 1997. Thanks Thought Bubble. Since then, strategy games have been mastered one-by-one, with the most recent victories over humans at Go in 2017, DOTA 2 in 2018, and Starcraft II in 2019. Okay, so the best way to understand the difficulty of teaching AI to play games is through an example. Oh John Green Bot. So let’s start with a really simple goal: teaching John-Green-bot to play tic-tac-toe. One of the ways that we can think about playing tic-tac-toe is as a tree with all the possible moves from any given state of what the game board looks like. For example, if this is the current game state, it’s John-Green-bot’s turn, and he’s using Xs… there are three places he can go. We can draw a tree representing possible outcomes for each of these options, and all of the options his opponent (me, or anyone else) can take: Because computers think with numbers, each outcome can be assigned a reward -- a number like a 1 for a win, and -1 for a loss or tie. Basically, John-Green-bot will need to search through the tree of possibilities to find his win. To decide which choice to make, John-Green-bot will assume that in each tree, both he AND his opponent will make the best possible choices. In other words, his policy (or his framework for making decisions) will alternate between choosing the branch that will maximize the outcome of winning on his turn, and minimize the outcome of his opponent winning on their turn. This is called the minimax algorithm. Then, each game state can be assigned a value based on how likely it leads to John-Green-bot winning, and he can decide which move to make based on his policy. Looking at this tree, John-Green-bot will always pick option 1.0, and win the game! Of course, this was a pretty small tree because we were looking at a game in progress. To draw the whole tic-tac-toe tree from beginning to end, we would need to represent about 250,000 boards. That seems like a lot, but it would take like a half a second for a powerful modern computer to compute this many options. By laying out all the possibilities and taking the paths that led to a win, John-Green-bot can solve tic-tac-toe. This means that John-Green-bot will always achieve the best possible outcome, either a win or a tie, no matter how his opponent plays. Thanks John Green Bot. But we can’t solve all games this way. Checkers, for example, has about 10 to the 20th power board states… or 10 followed by 20 zeros. That’s more board states than there are grains of sand on Earth! Chess has 10 to the 50th power board states. And Go has 10 to the 250th power board states. To put those huge numbers into perspective, there are only 10 to the 80th atoms in the entire known universe! Computer scientists have theorized that it would be impossible for conventional computers to calculate this many states due to the laws of physics. Like, for example, if you combined all planets and stars and everything in the whole universe into a single supercomputer, it still wouldn’t be powerful enough to solve the game of Go. But some people have hope that quantum computers may be able to get there one day... So, if figuring out all of the board states could be mathematically impossible, how did computers beat the number one ranked human masters in Chess and Go? Many modern systems, including Google’s AlphaGo computer that beat a human master in Go in 2017, use an algorithm called Monte Carlo Tree Search. Monte Carlo is a famous casino, so whenever you see the term “monte carlo,” it’s a good bet that the algorithm will be using randomness and chance to solve a problem. Combining Monte Carlo randomness and regular tree search like minimax, modern game AIs decide which part of the huge tree to search by guessing at odds. Basically, they want higher odds that the part of the game tree they search will lead to a win. But these aren’t just random guesses like we would make in many casino games, AI systems can simulate millions of “what-if” scenarios and use math to estimate the likelihood of winning if they choose one path or another. In each “what-if” scenario, the AI considers making one particular move and then simulates playing a large number of (but not all) possible games, where the next moves are chosen at random. By averaging these possible outcomes, the AI estimates how “good” that particular move is. It’s so much faster to estimate a handful of choices than exhaustively calculate each branch of the game tree. And some computers can even do this estimation in real time. One example of this is Google’s DeepMind which defeated human professional players at Starcraft II in 2019 --- where time is very critical. Of course, Starcraft II, Go, and Tic-Tac-Toe aren’t all the types of games that humans play. Other games require other strategies and have other computational challenges: IBM’s Watson question-answering system was able to beat human Jeopardy! champions in two televised matches in 2011. Watson listened for keywords in the clue and tried to use a knowledge graph to figure out responses. And we’ll talk more about knowledge graphs in a future episode. Watson wasn’t perfect and struggled a bit with context. For example, it famously guessed “What is Toronto?” on something in the category “US Cities.” But Watson was still able to do better than human contestants overall. Evolutionary neural networks use the environment as an input, like reinforcement learning. But this approach introduces /multiple/ agents who try multiple neural network structures, and then build on successful ones for the next generation. Sort of like animals, the ones that are better at surviving get to pass on their genes. For example, the AI MarI/O can learn how to play a Super Mario World level by telling MarI/O what buttons it can push and that getting farther to the “right” in the level is good. This AI will start by basically mashing buttons at random, but as some button mashes get it farther to the right, it remembers and learns from those successful attempts. In the next lab, we’ll build our own AI to use this approach to /crush/ a video game that we built where John-Green bot destroys trash. So, are there any games that are safe to play, where humans will always have an edge and AI won’t be able to beat us? Computers definitely seem to struggle with parts of language like humor, irony, metaphor, and wordplay. Computers also aren’t great at understanding and predicting real people, who don’t always act “optimally,” so social games could be more difficult too. But AI systems are finding some success in bluffing games like online poker, so it’s important not to underestimate them. John Green Bot: All - in. Computers might also struggle with creativity or surprise, because there’s not a really clear way to assign values to states. It’s difficult to assign a number to “how creative” something is, compared to saying “go as far right as you can in the Mario level” or “achieve a winning state in a chess game.” So, considering all of that, maybe games like charades would be pretty stacked for a human victory. Or… what about pictionary? Hide-and seek??? We’d love to hear in the comments what games you think are safe from AI. But in the next episode, which is another lab, we’ll program an AI system to learn how to play an arcade-like game… and I’ll beg John Green bot for my poker money back.