Sherlock Holmes And The Final Problem: A Game Of Pure Conflict

Last night the BBC’s Sherlock Holmes returned for a New Year’s special. It was great. If you missed it (and have access) you should catch it on BBC iPlayer before it goes away forever (click here). The episode inspired me to think of how game theory can be used to solve some of Sherlock’s many problems. That’s what I’m going to investigate in this post.

Just a quick disclaimer before we begin. There are many people out there much smarter than me that could use game theoretic tools to capture how Mr. Holmes uses his logic and deductive reasoning to solve all of his little puzzles. I’m a bit slow, so I’m going to assess a pretty simple puzzle faced by Mr. Holmes a long time ago in 1891.

Indeed, the return of Sherlock reminded me of an old, but pretty famous, Sherlock Holmes story written by Sir Arthur Conan Doyle called “The Final Problem”. (You can access a PDF of The Final Problem here.) In this story strategic interdependence between Holmes and his arch nemesis Moriarty is particularly prevalent. Therefore, the tools of game theory can be used and, moreover, we can ask a big question: “Did Sherlock Holmes act rationally?”

The Final Problem

This story, set in 1891, introduces Holmes’s greatest opponent, the criminal mastermind Professor James Moriarty. Here’s a recap of The Final Problem to set the scene for our game.

One evening Holmes arrives at Dr. John Watson’s residence in an agitated state and with grazed and bleeding knuckles. Much to Watson’s surprise, he had apparently escaped three separate murder attempts that day after a visit from Professor Moriarty, who warned Holmes to withdraw from his pursuit of justice against him to avoid any regrettable outcome.

Holmes has been tracking Moriarty and his agents for months and is on the brink of snaring them all and delivering them to Scotland Yard. Moriarty is the criminal genius behind a highly organised and secret criminal force and Holmes will consider it the crowning achievement of his career if only he can defeat Moriarty. Moriarty is out to thwart Holmes’s plans and is well capable of doing so, for he is, as Holmes admits, the great detective’s only intellectual equal.

In response to Moriarty’s threats, Holmes asks Watson to come to the continent with him, giving him unusual instructions designed to hide his tracks to Victoria station. On meeting at Victoria Station Holmes plans that the two head to Dover in order to flee to the continent. The next day Watson follows Holmes’s instructions to the letter and finds himself waiting in the reserved first class coach for his friend, but only an elderly Italian priest is there. The cleric soon makes it apparent that he is in fact, Holmes in disguise.

As the train pulls out of Victoria, Holmes spots Moriarty on the platform, apparently trying to get someone to stop the train. Holmes is forced to take action as Moriarty has obviously tracked Watson, despite extraordinary precautions. He and Watson strategically alight at Canterbury (before reaching Dover), making a change to their planned route. As they are waiting for another train to Newhaven a special one-coach train roars through Canterbury, as Holmes suspected it would. It contains Moriarty, who has hired the train in an effort to overtake Holmes and catch him before he and Watson were to reach Dover. Holmes and Watson are forced to hide behind luggage, but they manage to make their escape to the continent!

Figure 1: Sherlock Holmes and Professor Moriarty at the Reichenbach Falls.

Having made their way to Strasbourg via Brussels, Holmes gets word that Moriarty is searching for them. After a chase through Europe, finally settling in Switzerland, Moriarty catches up with Holmes and fights with him atop a waterfall at Reichenbach Falls. They both appear to fall to their deaths. Indeed, this was depicted in last nights episode!

Applying some game theory

The story of Holmes and Watson fleeing England describes a situation of strategic interdependence to which we can apply some game theory. The three elements that we need in any game are players, strategies, and payoffs, so let’s figure these out…

  • Players. For simplicity we consider only two players in this game: Sherlock Holmes and Professor Moriarty.
  • Strategies. Holmes is faced with the decision of either going straight to Dover or disembarking at Canterbury, which is the only intermediate station. Moriarty, whose intelligence allows him to recognise these possibilities, has the same set of options. Therefore the strategy sets for both players contain only Dover and Canterbury.
  • Payoffs. Holmes believes that if they should find themselves on the same platform, it is likely that he’ll be killed by Moriarty. If Holmes reaches Dover unharmed, he can then make good his escape. Even if Moriarty guesses correctly, Holmes prefers Dover, as then, if Moriarty does fail, Holmes can better escape to the continent.

In this game there is imperfect information: Sherlock does not know what Moriarty is going to do for sure before he moves, and vice versa. Both players must therefore select their strategies simultaneously.

The strategic-form of the game is given by the payoff matrix below:

Figure 2: Payoff matrix for The Final Problem.

So what type of game is this? How do we solve it?

Pure conflict and maximin strategies

The game between Holmes and Moriarty is a constant-sum game where the sum of the payoffs in each cell of the matrix give a fixed number (in this case 100). A two-player constant-sum game is a game in which the two players’ payoffs always sum to a constant. Since this property implies that a change in behaviour which raises one player’s payoff must lower the other player’s payoff, constant-sum games are situations of pure conflict. In other words, what is good for one player is bad for the other.

If you have read many Sherlock Holmes stories, you know that he is both brilliant and arrogant. While it would be uncharacteristic of him to take a conservative tack in handling a strategic situation, he is smart enough to know that Moriarty may well be his match. So, rather than think about Holmes formulating a conjecture about what Moriarty would do and then choosing a best reply, let us presume that he takes a more cautious route in selecting a strategy.

Suppose, then, that Holmes believes that whatever strategy he selects, Moriarty will have foreseen it and will act so as to minimise Holmes’ expected payoff. Holmes then wants to choose the mixed strategy that maximises his own expected payoff given his belief about Moriarty. In other words, he exercises caution by optimising against his pessimistic beliefs.

What we’ve just described Holmes as choosing is what is known as his maximin strategy. Generally speaking, a maximin strategy maximises a player’s payoff, given that the other players are expected to respond by choosing strategies to minimise that player’s payoff.

Solving games of pure conflict

To solve this game we first let p be the probability that Holmes chooses to go to Dover, and then we let q be the probability that Moriarty chooses to go to Dover.

Holmes’s maximin strategy. To derive Holmes’s maximin strategy we need to derive his expected payoff function when Moriarty chooses to go to Dover. This expected payoff function is given by: p * 20 + (1-p) * 70 = 70 – 50p .

If Moriarty chooses Canterbury, Holmes’s expected payoff function is: p * 90 + (1-p) * 10 = 10 + 80p .

We can map these expected payoff functions in the diagram below:

Figure 3: Holmes’s expected payoffs. Green is where Moriarty selects Dover. Yellow is where Moriarty selects Canterbury.

Thus, p = 6/13 is the mixed strategy that maximises Holmes’s expected payoff, given that whatever mixed strategy Holmes chooses, Moriarty responds so as to minimise Holmes’s expected payoff.

Moriarty’s maximin strategy. To derive Moriarty’s maximin strategy we need to derive his expected payoff function when Holmes chooses to go to Dover. This expected payoff function is given by: q * 80 + (1-q) * 10 = 10 + 70q .

If Holmes chooses Canterbury, Moriarty’s expected payoff function is: q * 80 + (1-q) * 90 = 90 – 60q .

We can map these expected payoff functions in the diagram below:

Figure 4: Moriarty’s expected payoffs. Green is where Holmes selects Canterbury. Yellow is where Holmes selects Dover.

Thus, q = 8/13 is the mixed strategy that maximises Moriarty’s expected payoff, given that whatever mixed strategy Moriarty chooses, Holmes responds so as to minimise Moriarty’s expected payoff. This strategy pair is the maximin solution to the game.

The Final Solution

The Mixed Strategy Nash equilibrium of this game, therefore, is where Holmes will go to Dover with probability 6/13 (and therefore goes to Canterbury with probability 7/13),  while Moriarty will go to Dover with probability 8/13 (and therefore goes to Canterbury with probability 5/13). Thus, there is a higher probability for Sherlock to alight at Canterbury in an effort get rid of Moriarty.

Yo, this is exactly what happened in the story! Gosh darn, game theory has done it again! Moreover, Sherlock seems to have acted rationally! With a good grip on game theory you can be as clever as Sherlock Holmes too!

Conclusions & John von Neumann

I think that the real hero here isn’t Sherlock Holmes… Nor is it Dr. John Watson… The real hero can only be attributed to the genius that is John von Neumann.

John von Neumann is one of the greatest mathematicians that has ever lived, and should be considered as one of the founding fathers of game theory. Specifically, he has to be credited with providing the solution concept for games of pure conflict.

He was first to really discuss the maximin property which states that for any two-player game of pure conflict, the maximin solution is a Nash equilibrium.

Furthermore, if a two-player game of pure conflict has a Nash equilibrium in which both players randomise (i.e., they don’t use pure strategies), then each player’s Nash equilibrium strategy is also his maximin strategy.

What You Do Tells Me Who You Are: Why Signalling Is Important

In many strategic situations a player knows something that another player would like to know. A player can have private information. When Neville Chamberlain was negotiating with Adolf Hitler in 1938, he would have liked to have known Hitler’s true intentions. A player makes the best decision they can in light of the information they have, but they would always like to know more about those on the other side of the strategic divide.

In some scenarios, a player may have the opportunity to learn something about the private information that another player knows. For example, consider a bargaining scenario that might occur at a car dealership. How badly the buyer wants to buy the car may be unknown to the dealer, but the dealer may be able to infer something from the initial offer that the buyer makes. Why do you think car dealers often ask what you’re willing to pay for the car? They want you to reveal information about your willingness to pay, while concealing information about the price at which they’re willing to sell the car.

Another example is given in by the attacker-defender game from the previous post. An attacker can be ‘strong’ or ‘weak’ and it’s strength may be unknown to the defender. However, the attacker may signal its strength to the defender either visually (what they are wearing, how well they are armed, etc.) or through their reputation (have they overcome other defenders in the past, etc.). An attacker can signal their type to the defender, and the defender can act appropriately.

In this post we model and analyse such a scenario. The basic model is known as a signalling game and involves two players: the sender and the receiver. The sender has a certain type, which is given by Nature. The sender’s type is private information to her and is thus unknown to the receiver. The sender chooses an action and may thereby be “signalling” or “sending” information to the receiver. The receiver observes the sender’s action and then chooses an action himself. However, what action is best for the receiver depends on the sender’s type. (It may also depend on the sender’s action.)

To introduce signalling games, and to see what kind of mischief can occur in a signalling game, we consider a situation that you’re apt to find yourself in at some point in life: making a good first impression when dating.

Example I: Dating people from Tinder

You’ve done it. You finally got that Tinder date you’ve been longing for. To this point your chat on Tinder has been pretty minimal… but now it’s your time to shine. You’re meeting up with your date for a few drinks at a bar in town and you need to put your best foot forward. This is the one.

The only advice you’ve been getting from your friends is, ‘just be yourself.’ But you seem to know better. You should absolutely not be yourself; that’s not how you should play this game… So how should you play it? Let’s find out.

Setting the scene

Assume that we have a guy meeting up with a girl (of course it can be two girls meeting up, or two guys meeting up, or whatever, I’m not ruling anything out here). Let’s also assume that the guy is a bit weird and is already completely in love with the girl, and the girl is pretty indifferent about him… so, naturally, the girl has all the power in this scenario.

The girl accepts to meet up with the guy and, on the basis of a few dates, will decide whether to be in a more long-term relationship with him. By long-term relationship I just mean that the girl puts some investment into the guy: meeting each others parents, keeping a toothbrush and clothes at each others’ places, that kind of stuff.

One of the attributes to be learned during this dating period is how ‘nice’ the guy actually is (although, this may not be best characteristic for the girl to observe… we’ll come back to this later!). How ‘nice’ a guy is is just positively related to how much effort he is willing to put into the date: finding good places to go on dates, turning up on time, not burping while eating & drinking, not having several other girls on the side while in the relationship, etc.

To keep matters simple, imagine there are two types of guy in the world: douchebags and gentlemen / true gents. A douchebag is inclined to put minimal effort into the relationship, while a gentleman has inherent natural tendency to put more effort into the relationship.

Suppose that the girl has been messed around by douchebags in the past (and the guy knows this) and wants to build a relationship with a gent. And, regardless of whether he is a douchebag or gentleman, the guy would like to be in a relationship with the girl. She’s great banter.

So what should the guy do to enhance his chances of being in a relationship if he is, in fact, a douchebag? Although he’s not willing to put in much effort into the relationship, even a douchebag may be willing to put in extra effort during the dating period if it will convince the girl that he is gentlemanly and thus worthy of a relationship.

Suppose a douchebag does act like a true gent. The girl, being at least as clever as the lowly douchebag, should recognise that a guy who puts in a lot of effort is not necessarily a gent, but could in fact be a douchebag masquerading as a gent. In that case, the girl cannot infer the guy’s type during the dating period!

Now, consider a guy who is a true gent. If he’s clever, he’ll realise that even a douchebag type will put in a lot of effort in order to avoid conveying the fact that he is a douchebag. So, what should a true gent do? Put in even more effort to being nice! The gent may have to go overboard to distinguish himself from a douchebag. For the girl to distinguish a douchebag from a gent the gent needs to step up his game even more. Geez, life is hard if you’re a nice guy!

Signalling games tend to involve a lot of subtle strategies. So, what is the optimal strategy that the guy should take? What is the optimal strategy that the girl should take? How do we solve these complex games?

Modelling the Tinder date

Thus far in our Tinder date we have players, we have strategies for each player, and we have different types of player.

  • Players. A guy (the sender) and a girl (the receiver).
  • Strategies. The guy needs to put in some amount of effort during the dating period, and the girl needs to decide whether to keep or dump the guy.
  • Types. There are two types of guy in the world: douchebags and gents.

But what about payoffs and beliefs? Determining payoffs and beliefs are important for solving signalling games, so we’ll need to be more specific about these.

Payoffs. We need to assume that all payoffs have some number value that represent utilities. Suppose the payoff to the guy from being in the relationship is 130 and the payoff from being dumped is 70 (not 0 because there are plenty more fish in the sea, blah, blah, blah).

The guy has three options in terms of the amount of effort he can put in—40 units of effort, 60 units of effort, and 80 units of effort—and the personal cost to putting in effort depends on his type, as expressed in the table below. The guy’s overall payoff is the value of being in the relationship (or not), less the personal cost of putting in effort.

Figure 1: The guy’s personal cost of putting in effort into the date.

We can see from the table that due to the gentleman’s innate tendency to put in more effort that this type of guy incurs less personal cost of putting in more effort. Makes sense.

As noted, the girl would rather form a relationship with a genuinely nice guy as opposed to a douchebag. Accidentally forming a relationship with a douchebag will give her a payoff of 25. Forming a relationship with a gent will provide her a payoff of 100. Continuing to be single provides her a certain payoff of 60.

But what’s the probability that the guy she is dating is a genuinely good guy because, let’s face it, there are a lot of douchebags in the world? From past experience the girl assigns a prior probability of 0.75 (75%) that the guy she is going on a date with is genuinely a douchebag and a prior probability of 0.25 (25%) that the guy is a gent.

These beliefs have the implication that, unless the girl is able to acquire more information about the guy’s type, she’ll not form a relationship with him. Indeed, based on these probabilities the expected payoff from forming a relationship is 0.75 * 25 + 0.25 * 100 = 43.75, which is less than the payoff of 60 from just dumping the guy.

Finally, note that a strategy for the guy assigns an action—40, 60, or 80 units of effort—to each possible type—douchebag or gent—while a strategy for the girl assigns an action—dump or keep—to each possible action of the guy. This is seen in the Bayesian game is depicted below.

Figure 2: The decision tree representing the signalling game for our Tinder date.

This is the structure of the game. Note that the different information sets of the girl are given by dotted lines of different colours… but how do we solve this game?

Solving signalling games

The setup is the easy part, solving it is more difficult. Depending on the complexity of the game, it can be really difficult. Yo, the dating game is a hard game to play! In fact, signalling games in general are pretty intense. So, sorry if this gets a bit complicated.

In the previous post we suggested that to solve a Bayesian game we needed to introduce a solution concept called Bayes-Nash equilibrium. A Bayes–Nash equilibrium is a strategy profile that prescribes optimal behaviour for each and every type of a player, given the other players’ strategies and given beliefs about other players’ types.

Solving signalling games is much like solving a Bayesian game. To do so we need to use a solution concept called a perfect Bayes-Nash equilibrium.

Solution concept: Perfect Bayes-Nash equilibrium

As illustrated by the Tinder dating game, there are three stages to a signalling game.

  • First, Nature chooses the sender’s type.
  • Second, the sender learns his type and chooses an action.
  • Third, the receiver observes the sender’s action, modifies her beliefs about the sender’s type in light of this new information, and chooses an action.

A strategy for the sender assigns an action to each possible type, and a receiver’s strategy assigns an action to each possible action of the sender. The proposed method for solving such a game goes under the grandiose title of perfect Bayes–Nash equilibrium.

Perfect Bayes–Nash equilibrium is founded on two key concepts: sequential rationality and consistent beliefs. Sequential rationality means that, at each point in a game, a player’s strategy prescribes an optimal action, given her beliefs about what other players will do. In the particular context of a signalling game, sequential rationality requires that a sender’s strategy be optimal for each of her types (just as with Bayes–Nash equilibrium) and that a receiver’s strategy be optimal in response to each of the sender’s possible actions.

Sequential rationality requires optimal behaviour, given beliefs. As you can imagine, beliefs can’t be just any old thing, but rather should be cleverly derived in light of the strategic behaviour of other players. In a signalling game, a receiver starts with a set of beliefs about a sender’s type—which are referred to as his prior beliefs and are the probabilities given by Nature—and then he gets to observe the sender’s action before having to act himself.

Because the sender’s action may contain information about the sender’s type, the receiver then modifies his original beliefs to derive a set of posterior beliefs (or beliefs conditional on the sender’s action).

A receiver has consistent beliefs if his posterior beliefs are consistent with the sender’s acting in her own best interests. In other words, a receiver should ask, “Having observed the sender’s behaviour, what types of sender would act in such a way?”

Given this, let’s think back to our Tinder dating game…

Solving the Tinder dating game

For the perfect Bayes-Nash equilibrium we need to consider three aspects. First we need to consider the optimal strategy to each possible type of guy (sender). Second, we need to consider the girl’s (the receiver’s) optimal strategy to each possible action of the guy. And finally, the consistent beliefs of the girl (the receiver).

With regards to the perfect Bayes–Nash equilibrium of the dating game consider the following collection of strategies and beliefs:

  • Guy’s strategy: If douchebag, then put in 40 units of effort. If true gent, then put in 80 units of effort.
  • Girl’s strategy: If the guy puts in 40 or 60 units of effort, then do not form a relationship. If the guy puts in 80 units of effort then form relationship.
  • Girl’s beliefs: If the guy puts in 40 units of effort, then assign a probability of 1 to him being a douchebag. If the guy puts in 60 units of effort, then assign a probability of 0.6 to him being a gent. If the guy puts in 80 units of effort, then assign a probability of 1 to him being a gent.

To determine whether this is a perfect Bayes–Nash equilibrium, start with the guy’s strategy. Given the girl’s strategy noted above, the payoffs from various actions are shown in the table below.

Figure 3: The total payoffs from each type of guy putting in different effort levels given the girl’s optimal strategy.

For example, if the guy is a douchebag and puts in 60 units of effort, his payoff is -5 since he incurs a personal cost of 75 and is dumped (as dictated by the girl’s strategy). Given that the guy is a douchebag, it is indeed optimal for him to put in 40 units of effort, as a payoff of 20 exceeds that of -5 and that of 10. However, if the guy is get, he should put in 80 units of effort. Doing so means a payoff of 50—since he gets a girlfriend—and putting in less effort results in a lower payoff because he will be dumped.

Thus, the guy’s strategy is optimal for both of his types, given the girl’s strategy. Moreover, the girl’s strategy is optimal.  Further, the girl’s strategy is then optimal for each possible action of the guy. Accordingly, this scenario is a perfect Bayes–Nash equilibrium!

There are multiple forms of perfect Bayes-Nash equilibrium. We can have a separating equilibrium, a pooling equilibrium, or a semi-separating equilibrium. The dating scenario is an example of a separating strategy—a strategy that assigns a distinct action to each type of player. Hence, the receiver can “separate out” each player’s type from her observed play.

This separating equilibrium provides a nice concluding point: To effectively signal her type, a sender who is of an attractive type may need to distort her behaviour in order to prevent being mimicked by a less desirable type.

However… not all signalling games have a separating equilibrium! We’ll talk more about this with regards the attacker-defender application in the next post!

What life lessons can we take away from our dating game?

I’d like think that there are four main lessons to take away from our Tinder dating game.

  • First, signalling is an important aspect in everyday life. Dating is just one example.
  • Second, good guys have an incentive to be overly nice and put in an extraordinary amount of effort during the dating phase, which can (as an unintended consequence) put a lot of girls off. Of course, we didn’t model this, but because of this douchebags may win regardless.
  • Third, evaluating a person based on their effort level may not be the most efficient strategy. Other characteristics may be better, like… I don’t know… how funny they are. A person is either funny or not; it’s a difficult variable to influence in the dating period.
  • Finally, douchebags will always be douchebags. They will only be in relationships with other douchebags, and the circle of life continues. This is especially true with regards the separating equilibrium of the signalling game.

Conclusions & next post

This was a pretty long post and we began our investigation of a concept that we will use in the next post… so, let’s recap!

By now we know two things about games and life in general. One, people have private information. Two, people move sequentially. The significance of sequential moves is that a person who moves first may reveal information about what it is she knows that other players do not. A player who moves second may then be able to glean information about a player’s type from her observed behaviour and use that information in deciding what to do. But if that is what is going on, then the player who moves first may adjust her behaviour with the intent to mislead those who are drawing inferences from it!

An objective of here was to sort out these various forces and identify solutions in which all players are acting in their best interests and no players are fooled. This analysis was conducted in the context of a signalling game, which is the simplest structure that embodies these various features. A signalling game involves a player, known as the sender, who has some private information and moves first by choosing an action. The action is observed by a second player, known as the receiver, who, after updating his beliefs as to the first player’s private information, selects an action as well. The solution concept utilised was a perfect Bayes–Nash equilibrium.

These concepts can be applied to the dating world, in which we saw that a separating equilibrium comes into play and helps the receiver (the girl in our case) identify the type of sender they are faced with.

Next post. We’ll look at signalling games in a deeper way and see how these signalling games can be applied to attacker-defender games. Specifically, we will look at how pooling strategies may come into play and why statistics may be important here.

Attacker-Defender Games with Incomplete Information

A key assumption in game theory is that the game being played is common knowledge to all players. Specifically, each player knows who is participating, what their options are, and how they evaluate outcomes. But what if this assumption of common knowledge didn’t hold? What if some player had private information that wasn’t known to other players?

Going back to the initial example from the previous post what if, say, a defender didn’t know the identity of the attacker? Or the defender didn’t know the agenda of the attacker? Or the defender didn’t know the strength (the potential threat) of the attacker? This is a scenario in which some information can be private to the players, i.e., the identity, agenda, and strength of the attacker is private information to the attacker and not known by the defender

There are many real-world settings in which people have some information that is known only to them. How can games of private information be solved? Why may these games of private information be important to attacker-defender games?

Example I: The Munich Agreement

In order to motivate our discussion of games with private information let’s first turn the clock back to 1938. Nazi Germany had just annexed Austria, and it was believed that Adolf Hitler was considering a similar action against Czechoslovakia’s Sudetenland. With the Great War (now known as World War I) a recent memory, Europeans feared a repeat of such misery and horror. In an effort to preserve the peace, Prime Minister Neville Chamberlain of Great Britain traveled to Munich, Germany, to reach an agreement with Hitler. On September 28, 1939, Chamberlain and Hitler signed the Munich Agreement, giving Germany the Sudetenland in exchange for Hitler’s promise that he would go no further. A chunk of Czechoslovakia had been delivered as a concession to forestall war. Of course, peace proved to be an illusion. Germany would enter Prague the following spring and invade Poland a year later, starting World War II.

In deciding whether to propose and then sign this agreement, Chamberlain was uncertain as to the ultimate intentions of Hitler. Was Hitler only seeking additional lebensraum (“living space”) for the German people? If so, then perhaps a concession such as the Sudetenland would placate him and indeed avoid war. Or was Hitler concocting a more grandiose plan to invade much of Europe?

The situation in Munich can be described by the extensive-form game below.


Chamberlain moves first by deciding whether to offer concessions or stand firm. The presumption is that Hitler will accept the concessions, and our attention will focus on the decision regarding the pursuit of war. The preferences of Chamberlain are clear: His most preferred outcome is to stand firm whereupon Hitler avoids war, while his least preferred outcome is to provide concessions but then Hitler goes to war. Having been offered concessions, Hitler is given more time to prepare his war machine; thus, we shall suppose that Chamberlain finds that outcome less desirable than standing firm and going to war.

The challenge with analysing this situation lies with Hitler’s payoffs. While Hitler is presumed to know them, Chamberlain does not. The unknown payoffs of Hitler in the decision tree are given by question marks ‘?’.

Without knowing Hitler’s payoffs, how can Chamberlain determine what Hitler will do?

Determining Hitler’s payoffs

Let’s contemplate the possibilities that might have been racing through Chamberlain’s mind. One thought is that Hitler is amicable, as reflected in the payoffs presented in the first decision tree below. We refer to Hitler as amicable because his most preferred outcome is to gain concessions and avoid war. Note, however, that if Chamberlain stands firm, Hitler will go to war in order to gain additional land. Thus, if Chamberlain really did face an amicable Hitler and knew this fact, then he ought to provide concessions.

The other possibility is that Hitler is belligerent, as summarised by the payoffs in the second decision tree below. Here, Hitler has a dominant strategy of going to war, although he prefers to do so after receiving concessions. If this is the game Chamberlain is playing, then he would do better to stand firm.


In actuality, Chamberlain was uncertain as to whether he was playing the game described in the ‘amicable decision tree’ on the left or the ‘belligerent decision tree’ on the right. This situation is known as a game of incomplete information.

Introducing Nature

The trick to solving a game of incomplete information is to convert it to a game of imperfect information—that is, transform it from something we don’t know how to solve into something we do know how to solve!

This is done by introducing a new player referred to as Nature. Nature is intended, not to refer to trees, fleas, and bees, but rather random forces in players’ environment.

Nature takes the form of exogenously specified probabilities over various actions and is intended to represent players’ beliefs about random events. In the context at hand, Nature determines Hitler’s preferences (or payoffs) and thus the game that is being played, as is shown in the decision tree below.


Nature is modelled as moving first by choosing whether Hitler is amicable or belligerent. This move by Nature is not observed by Chamberlain—thereby capturing his lack of knowledge as to what Hitler’s payoffs are—but is observed by Hitler—since Hitler knows his own preferences. It is important to assume that the probabilities assigned by Nature to these two possibilities are common knowledge, and here we assume that there is a 60% chance that Hitler is amicable and a 40% chance that he is belligerent.

Determining the unique equilibrium

What should Chamberlain do given this strategy for Hitler?

Given Chamberlain’s uncertainty as to Hitler’s preferences, he isn’t sure how Hitler will respond to his action. Thus, Chamberlain will need to calculate expected payoffs in evaluating his two strategies.

By doing some trickery we can calculate that Chamberlain’s expected payoff from providing concessions is: 0.6 * 3 + 0.4 * 1 = 2.2.

If, instead, Chamberlain stands firm, his expected payoff is: 0.6 * 2 + 0.4 * 2 = 2.

The calculations result in an expected payoff of 2.2 by appeasing Hitler with the Sudetenland. Standing firm causes both Hitler types to go to war, so the payoff is 2.

The equilibrium. In sum, we contend that a solution to this game has Chamberlain offer concessions, in which case Hitler avoids war if he is amicable and goes to war if he is belligerent. Of course, we all know that he was belligerent, which obviously lead to World War II.

So, what are Bayesian games?

We saw in The Munich Agreement example above that the idea is to convert a game of incomplete information into a game of imperfect information, which is known as a Bayesian game.

A Bayesian game modifies a standard game by having an initial stage at which Nature determines the private information held by players.

A commonly used solution method for Bayesian games is Bayes–Nash (or Bayesian) equilibrium, which is a strategy profile that prescribes optimal behaviour for each and every type of a player, given the other players’ strategies, and does so for all players. From this we can use expected payoffs to determine the equilibrium strategies of each player.

Example II: Another attacker-defender game

Consider the attacker-defender game that was described in the previous post. This game consisted of two different players-an attacker and a defender-where the attacker was trying to capture a town that is being defended. Let’s specifically consider the situation where the attacker and defender make their move in a simultaneously, but now let’s add a little twist to include some private information.

The twist that the defender does not know the experience and strength of the attackers’ army. For simplicity we assume that the attackers army can be either ‘weak‘ or ‘strong‘, and that there is only one entrance into the town.

Now the strategies for the both the attacker and the defender are to either ‘engage in conflict’ or ‘wait’. Given this set-up, the defender would rather engage in conflict if they feel that the attacker is going to engage, and would rather wait if they feel that the attacker is going to wait. The rationale being that the defender does not want to waste resources on engaging with a benign attacker.

A strong attacker would rather engage in conflict if the defender does not, as the attacker would be neutralised if they do not. A weak attacker would rather not engage even if the defender engages.

The payoff matrices that illustrate the incentives can be shown below:


From the matrices we can note that the attackers strictly dominant strategy is to ‘wait’ if they are weak regardless of what the defender is going to do. Likewise, if the attacker is strong then attacker has an incentive to always engage in conflict. This makes sense, and due to the strict domination these dominant strategies inform part of the Bayesian equilibrium.

For generality (and unlike The Munich Agreement example above) let’s just assume that the attacker is strong with a probability p and that the attacker is weak with a probability 1 – p.

Given this we can use some magic to calculate that in Bayes-Nash equilibrium: if p > 1/3 then the defender will engage in conflict; if p < 1/3 then the defender will wait; and if p = 1/3 then the defender is completely indifferent as to what strategy they should choose.

Adding probabilities to the mix

So, we finished up the previous example with some probabilities. Specifically, the defenders’ optimal strategy depended on the the probability of the attacker being either ‘weak’ or ‘strong’. So much depended on p. But what is p? What is the probability of the attacker being strong? Can we estimate it? … well, that’s the key… in fact two things become important here…

The first thing is signalling: maybe the attacker is wearing some strong armour, or have daunting weaponry, or they look coordinated in their activities, or have built a reputation for being strong against other defenders. This allows us to more accurately estimate their probability of being strong.

The second thing is the use of statistics: by acquiring data on how these signals relate to the strength of the attacker we can build nice models to accurately estimate the probability of the attackers’ strength and better inform the actions of the defender.

Conclusions & next post

We considered a common and crucial feature of many strategic settings: A person may know something about him- or herself that others do not know. This scenario frequently arises in the form of a player’s payoffs being private information. As originally cast, the game is not common knowledge, because, for example, one player doesn’t know another player’s payoffs. We refer to such a game as having incomplete information. The trick to solving a game of incomplete information is to convert it into a game of imperfect information. The initial move in the game is now made by random forces, labeled Nature, that determine each player’s type, where a type encompasses all that is privately known to a player. This Nature-augmented game, which is known as a Bayesian game, is common knowledge, since, at its start, no player knows his type and thus has no private information. What is commonly known is that Nature will determine players’ types. What also is commonly known are the probabilities used by Nature in assigning a player’s type.

When solving Bayesian games the probabilities that Nature enforces  become important when informing the payoffs and equilibrium strategies of players. We noted that these probabilities are typically unknown, but can be estimated to some extent by two elements: signalling and statistical analysis.

Next post. Signalling and statistical analysis are two elements that we will discuss in more detail in future posts. Specifically, in the next post we will look at signalling games and how they apply to Bayesian games and our project regarding attacker-defender games.

Attacker-Defender Games: An Introduction

I had the pleasure of lecturing “Game Theory” this semester. Although I only prepared the module about a day in advance (it was a busy semester), I think it went down pretty well. But let’s face it, you can’t really go wrong with game theory; everyone loves it. More importantly though, I loved it. It’s always good to go back to the basics and, for whatever reason, concepts and theories tend to sink in a lot more when you’re forced to stand in front of students and talk about them for hours on end.

So over the next few posts I want to talk a bit about game theory, but I also want to talk about a new project I will be working on from January 2016. The project regards an application of game theoretic concepts, but also uses notions from econometrics and some machine learning too. It’s an exciting project I promise.

What is game theory?

Before I go into any form of detail I should probably first explain what game theory actually is. In short, game theory provides a toolkit for studying situations of interactive decision-making. It’s a general theory that helps us analyse how people, firms, armies, or governments (generally just called ‘players’) make decisions in social situations where there is an element of strategic interdependence. This strategic interdependence refers to situations in which a players utility (or ‘payoff’) depends not only on their own actions (or ‘strategies’), but also on the actions of other players in the game.

Game theory can be defined as the study of mathematical models of conflict and cooperation between intelligent rational decision-makers.

Therefore, all games have three main elements: players, strategies for each player, and utilities allocated to each player for each potential outcome of the game. Game theory provides a set of tools to analyse games; these are essential tools for economists but are now frequently used in computer science, law, psychology, and politics.

There are a number of different game formats. Two main formats include scenarios in which players move simultaneously (think of rock-paper-scissors) and scenarios in which players move sequentially (think of chess). Simultaneous games can be represented in the form of payoff matrices and sequential games can be represented in terms of decision trees.

There are a number of different ways to solve a game. The solution concept that should be used largely depends on the game form that is being analysed. A famous and commonly used solution concept is known as Nash equilibrium, which occurs where no player has any incentive to individually (unilaterally) deviate their strategy. A refinement of the Nash equilibrium concept is known as the subgame perfect Nash equilibrium, which is based on the notion of backward induction, and another concept is known as the mixed strategy Nash equilibrium.

Seriously though, if you’re new to game theory just check out the Wikipedia page for more information (here)… it will tell you everything you need to know!

The project

The project investigates ‘attacker-defender games’. These are games in which a number of defenders are attempting to defend a resource or piece of territory whilst a number of attackers are attempting to destroy or capture the same resource or territory that is being defended. These games can be applied to a lot of different scenarios such as terrorism & counter-terrorism, cybersecurity, industrial organisation, competition & firm entry, penalty shootouts, and international & civil wars. The games are typically represented in the form of payoff matrices and decision trees, but can take more complex forms when different tools are applied.

As noted above, all games have three basic elements: a set of players, a set of strategies given to each player, and a set of payoffs corresponding to each strategy tuple. Therefore attacker-defender games are comprised of a number of players; specifically, a set of ‘attackers’ and a set of ‘defenders’. Each player has a set of strategies; these strategies are determined by the type of player that is acting, either the attacker or defender. Each players payoffs are determined by their success in either attacking or defending a piece of territory or resource. Due to the interdependent nature of games, each players payoffs are dependent on both the strategies that they execute and the strategies that all other players execute as well.

Attacker-defender games: A simple example

I will assume that most people know a little bit of game theory and therefore know how attacker-defender games are traditionally constructed. For completeness, here is a simple attacker-defender game to jog your memory. For any game we need three main elements: players, strategies, and payoffs. Let’s go through each one below.

Players. The attacker-defender game we will initially analyse has just two players: an ‘attacker’ army and a ‘defender’ army.

Strategies. The defender army is defending a town that has two main entrances: one entrance is via an easy pass and the other entrance is via a hard pass. Therefore the defender army can defend either the easy path or the hard path. Likewise the attacker army can attack the town from either the easy path or the hard path. In all both players have identical strategy sets which contain two strategies each: the hard path or the easy path. To keep the terminology simple we will denote the easy path by ‘easy’ and the hard path by ‘hard’.

In all, the strategic situation is effectively shown by the simple diagram in Figure 1 below.

Figure 1: Attackers can attack via either the easy path (easy) or the hard path (hard) and the defender can also defend either the easy path (easy) or the hard path (hard).

Outcomes & payoffs. Given that there are two players, both of whom have two strategies each, there are four unique outcomes of the game. These are: (easy, easy); (easy, hard); (hard, easy); and (hard, hard), where the defenders executed strategy is the first element in the parenthesis and the attackers executed strategy is the second element.

Payoffs for each player depends on the strategy implemented by the other player. The payoffs allocated to each outcome are as follows.

  • If the attacker attacks via the easy path and the defender defends the easy path then both armies will meet and fight. Given that both armies choose the easy path, both armies are equally matched so there is a 50-50 chance that a given army wins. The payoffs to the outcome (easy, easy) is (1, 1).
  • If the attacker attacks via the easy pass and the defender defends the hard path then the armies will not meet and the attacker can position themselves to capture the town without serous confrontation. The defending army lose control of the town and lose the assault. The payoffs to the outcome (hard, easy) is (0, 2).
  • If the attacker attacks via the hard path and the defender defends the easy path then, again, the armies will not meet. However, the attacker army traverses the hard path meaning that they lose half (or a substantial number) of the army due to casualties and starvation, etc. Given the number of casualties endured by the attackers they are not in such a position to take over the town. As a consequence the defending army can regroup and defend the town. Therefore, the payoffs to the outcome (easy, hard) is (1, 1).
  • Finally, if the attacker attacks via the hard path and the defender defends the hard path then both armies meet. However, since the attacker has endured the casualties they are easily defeated by the defender. The payoffs to the outcome (hard, hard) is (2, 0).

So, we have defined all of the elements we need for a simple game. Specifically, we have a nice asymmetrical game which we can set up in a number of different ways. In this example we’ll look at two different ways: one in which the defender sees the attacker coming and selects a route to defend, and the other in which the attacker and defender have no idea where each other are going to attack or defend. The first game is one of perfect information and is described with the use of a decision tree, and the other is a game of imperfect information and is described with the use of a payoff matrix.

Game 1: Perfect information

Let’s say that the players move sequentially: first the attacker moves first, the defender sees the attacker coming and chooses to defend a certain path. This game is represented by the decision tree below.

Figure 2: Decision tree of the attacker-defender game with perfect information where the attacker moves first and the defender responds.

The unique subgame perfect Nash equilibrium (SPNE) of this game is obvious and is given by (easy, easy / hard). This is indicated by the red arrows. Regardless of what the attacker chooses, the defender always wants to coordinate their actions with the attacker. Knowing this, the attacker will rationally choose to attack via the easy path and the defender will defend the easy path. The result isn’t all that interesting.

More interesting is when the moves are reversed. If the defender were to move first and the attacker were to respond, then the defender should always defend the easy path and the attacker will be indifferent as to attacking the easy or hard path. Therefore there are two equally valid subgame perfect Nash equilibria: (easy, easy / easy) and (easy, hard / easy). Regardless, the defender always has an incentive to defend the easy path is they know that the attacker perfectly responding to them.

In all subgame Nash equilibria discussed the payoffs are all (1, 1).

Game 2: Imperfect information

A more realistic scenario is where the attacker doesn’t know where the defender is going to defend, and the defender doesn’t know where the attack is going to attack. This is a game of imperfect information and is represented, again, by a decision tree as below. The only difference is that there exists a dotted line between the decision nodes of the defender: the defender does not know what move the attacker has taken.

Figure 3: Decision tree of the attacker-defender game with imperfect information.

The attacker and defender have limited information; specifically, the defender does not know where the attacker will approach from. The game can be transformed into a payoff matrix of the following form.

Figure 4: Payoff matrix representing the attacker-defender game with imperfect information.

In this situation the defender again wants to coordinate actions with the attacker, but the attackers strategy of attacking via the easy path weakly dominates their strategy of attacking via the hard path. Therefore, there are is a single pure strategy Nash equilibrium where the attacker attacks via the easy path and the defender defends the easy path, leading to a payoff of (1, 1).

There are (technically) an infinite number of mixed strategy Nash equilibria. Let p be the probability that the defender will select the easy path and q be the probability that the attacker will select the easy path.

If the defender believes that q > 0.5 then they should choose the easy path. Conversely, if the defender believes that q < 0.5 then they should select the easy path. If q = 0.5 then the defender is indifferent of what strategy they should take.

Likewise, if the attacker believes that the defender will definitely select the easy path p = 1 then they are indifferent to choosing the easy or hard paths. If, on the other hand, p < 1 then the attacker should definitely select the easy path. The best response curves can be seen in the axes below; all Nash equilbria are shown by the purple, where the best response curves intersect.

Figure 5: Best response functions for the attacker and defender and full set of mixed strategy Nash equilibria.

Conclusions & future posts

I like game theory and I have recently been asked to undertake a project regarding attacker-defender games. Over the next few posts I want to elaborate on the problem, potential solutions, and applications of attacker-defender games. In doing so I will look more closely at minimax strategies, games of incomplete information, Bayesian games, machine learning, and network games with Byzantine players.

The next post will be on games of incomplete information and Bayesian reasoning… I’ll try not to leave it two years until my next post.