Simulating Prisoner's Dilemma
A recent Veritasium video sparked my interest in Game Theory, and I decided to try to implement some of the experiments in it.
When I was talking about this with a friend of mine, she incessantly pushed me to write an article about it in Marathi in a magazine. This is a post accompanying the article. I will only discuss the implementation in this post. I would highly recommend watching the Veritasium video before reading further. Derek has done a phenomenal job in this one.
The word “game” in game theory makes it seem that it is related to (possibly silly) computer or board games. But that is not true. In game theory, we create models which attempt to replicate real life situations in terms of rules and some scoring system to measure the outcomes. A “game” also has rules and a scoring system, hence the name.
Prisoner’s Dilemma is one of the famous games in game theory. You can read about it here.
We are going to implement an inverted version of this game. The rules of our game are as follows:
There are two players who can either cooperate with each other or defect. If both cooperate, they each get 3 points. If one cooperates and the other defects, the defector gets 5 points while the cooperator gets 0. If both defect, they each get 1 point.
Player 1 cooperates | Player 1 defects | |
---|---|---|
Player 2 cooperates | P1 - 3 P2 - 3 |
P1 - 5 P2 - 0 |
Player 2 deffects | P1 - 0 P2 - 5 |
P1 - 1 P2 - 1 |
My plan is to implement a two player version first and then implement a multi-player version (which will be a new post).
Implementation
This game will be played between two players over many rounds. Therefore, we need to store the moves that each player makes in all the rounds, as well as their scores. Let’s define the state of the game:
Now let’s see how the players can make a move. For this, we need a
function for each player which will return whether the player wants to
cooperate or defect in the current round. Let’s represent cooperation
with :co
and defection with :de
.
For the players to decide their next moves, we need to provide them with the history of their previous moves. So that they can devise a strategy which takes into account the previous moves made by them and their opponent.
The tit-for-tat
strategy replicates the other player’s previous
move. If it is the first round, it cooperates. The devil
strategy
always defects.
Now that we know how to write strategy functions to determine moves, let’s compute the scores based on the moves made by both the players. Here, we will implement the scoring system as described by the table above.
The compute-score
function, as the name suggests, calculates the
score based on the moves of the first and second players.
Now to play this game over and over again for many rounds, we have to update the state of the game after each round to keep the track of the moves history and the total score. Let’s do that:
This is the complete implementation of the two-player version of this game. Now we can simulate the game between two strategies:
I have written some more strategies here.
Play around and find out which is the best strategy! :-)
I have also written a multi player version here. I will soon write a post about it.
Feel free to make a PR if you think anything in the code needs to be improved.
Happy learning! :-)