Simulating P vs. G#

Building on the previous section, we now construct a simulation of the Police vs. Guerrillas game. The functions below build the game matrix and calculate the winning percentages of the various strategies.

Based on these winning percentages, the simulation chooses among the strategies that have performed best in the past. The winning percentages are updated after each round of simulation using an exponential moving average (EMA). Because the EMA weights recent results more heavily than older, less accurate ones, it speeds up convergence to the correct strategy mixture for each player.
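
As a minimal sketch of the update rule described above, the snippet below blends each new outcome into a running estimate. The helper name `ema_update`, the fixed `alpha = 0.5`, and the outcome list are illustrative assumptions, not part of the simulation code itself.

```python
def ema_update(previous, outcome, alpha):
    """Blend the newest outcome into the running estimate (hypothetical helper)."""
    return alpha * outcome + (1 - alpha) * previous

# Start from a neutral estimate and feed in a few made-up win/loss outcomes.
estimate = 0.5
for outcome in [1, 0, 1, 1]:
    estimate = ema_update(estimate, outcome, alpha=0.5)

print(estimate)  # 0.84375
```

A larger `alpha` makes the estimate react faster to recent outcomes, while a smaller one retains more of the history.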

The simulation functions are described below.

## Do not change this cell, only execute it. 
## This cell initializes Python so that pandas, numpy and scipy packages are ready to use.

import random
from IPython.display import display, Markdown
import pandas as pd
%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')
import numpy as np
import scipy as sp
import math
import scipy.stats as stats
from fractions import Fraction

Setting up the Game#

When playing Police vs. Guerrillas, we would like functions that create the correct game matrix given any values of

\[2\leq p,g \leq 20\]

where, as discussed above, we have:

\[g < p < 2g \]

We will initialize the values we need:

  • \(p\) : number of police units

  • \(g\) : number of guerrilla units

  • \(\alpha\) : the memory factor for the exponential moving average
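
The two constraints on \(p\) and \(g\) can be checked before a game is built. The sketch below is only an illustration: the function name `valid_parameters` and its `lower` and `upper` arguments are assumptions introduced here.

```python
def valid_parameters(p, g, lower=2, upper=20):
    """Return True when p and g are in range and satisfy g < p < 2g."""
    in_range = lower <= p <= upper and lower <= g <= upper
    return in_range and g < p < 2 * g

print(valid_parameters(8, 7))  # True: 7 < 8 < 14
print(valid_parameters(7, 8))  # False: p must exceed g
```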

The player dictionaries will be created based upon the specific values of \(p\) and \(g\), and two functions will help accomplish the task:

  1. create_player_dict()

  2. game_matrix()

Dictionaries#

The function below creates player dictionaries and keeps track of the following:

  • Row-Column indexes in the results table,

  • Strategy numbers (integers), and

  • Strategy names.

def create_player_dict(name, anchor_val):
    strat = np.arange(math.ceil(anchor_val / 2), anchor_val + 1)
    index = np.arange(len(strat))
    strat_name = [f"{x} - {anchor_val - x}" for x in strat]
    return {
        'player': name,
        'index': index,
        'strat': strat,
        'strat_name': strat_name
    }
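
To see what the dictionary holds, the sketch below restates `create_player_dict` so that it runs on its own and prints the strategies for a player with 8 units. The value 8 is chosen only for illustration.

```python
import math
import numpy as np

def create_player_dict(name, anchor_val):
    # Restated from above so this snippet is self-contained.
    strat = np.arange(math.ceil(anchor_val / 2), anchor_val + 1)
    index = np.arange(len(strat))
    strat_name = [f"{x} - {anchor_val - x}" for x in strat]
    return {'player': name, 'index': index, 'strat': strat, 'strat_name': strat_name}

example = create_player_dict('P', 8)
print(example['strat'].tolist())  # [4, 5, 6, 7, 8]
print(example['strat_name'])      # ['4 - 4', '5 - 3', '6 - 2', '7 - 1', '8 - 0']
```

Each strategy is identified by the larger part of the split, so only splits from half the units (rounded up) to all of the units are listed.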

Game Matrix#

Given that we have two player dictionaries, the following function will create a game matrix with outcomes all set to \(1/2\).

def game_matrix(player1, player2):
    # Create a DataFrame whose entries all start at 1/2, with player1's
    # strategies as row labels and player2's strategies as column labels
    res = pd.DataFrame(1/2, index=player1['strat_name'], columns=player2['strat_name'],
                       dtype='object')
    return res

Setting Up the Game#

The code below initializes values for \(p\) and \(g\) and creates the initial game matrix:

p = 8
g = 7
# Creating the dictionaries
P = create_player_dict('P', p)
G = create_player_dict('G', g)
results = game_matrix(P,G)
results
       4 - 3  5 - 2  6 - 1  7 - 0
4 - 4    0.5    0.5    0.5    0.5
5 - 3    0.5    0.5    0.5    0.5
6 - 2    0.5    0.5    0.5    0.5
7 - 1    0.5    0.5    0.5    0.5
8 - 0    0.5    0.5    0.5    0.5

We can access the \((i,j)\) entry in the matrix above by position using the pandas indexer

.iloc[i,j]

We can also use

.loc[P['strat_name'][i],G['strat_name'][j]]

if we know the row or column strategy identifier.
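
The two access styles can be compared on a small stand-in table. The sketch below is self-contained; the labels and the value 0.25 are made up for illustration and are not simulation results.

```python
import pandas as pd

rows = ['4 - 4', '5 - 3']
cols = ['4 - 3', '5 - 2']
table = pd.DataFrame(0.5, index=rows, columns=cols)

# Write one entry by position, then read it back both ways.
table.iloc[1, 0] = 0.25
by_position = table.iloc[1, 0]
by_label = table.loc[rows[1], cols[0]]

print(by_position == by_label)  # True
```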

Play() Function#

The interior entries of the results dataframe will be the winning percentages of the row strategies when played against the column strategies. To calculate these entries, we begin by playing each column strategy against each row strategy in \(10,000\) simulated battles. The play() function code is given below.

def play(p_max, g_max, p_strat, g_strat):
    # Police split their units into two parts and pick the order at random (once)
    A_0 = p_strat
    B_0 = p_max - A_0
    A = np.random.choice([A_0, B_0])
    B = p_max - A
    reps = 10000
    win_pct_ema = 0.5
    for k in range(reps):
        # Guerrillas pick the order of their split at random in every battle
        a_0 = g_strat
        b_0 = g_max - a_0
        a = np.random.choice([a_0, b_0])
        b = g_max - a
        # v = 0 when the guerrillas outnumber the police on either side, v = 1 otherwise
        if (a > A) or (b > B):
            v = 0
        else:
            v = 1
        # Update the EMA instead of appending to a list
        alpha = min(0.5, 2 / (k + 1))
        win_pct_ema = (v * alpha) + (win_pct_ema * (1 - alpha))
    temp = win_pct_ema
    # Snap near-certain outcomes to 0 or 1; otherwise use the closest
    # fraction whose denominator is at most 20
    if temp <= 0.015:
        t = Fraction(0)
    elif temp >= 0.985:
        t = Fraction(1)
    else:
        t = Fraction(temp).limit_denominator(20)
    ## Uncomment the line below to help with debugging the play() function.
    # print(f"police play ({A},{B}) while guerrillas play ({a},{b})")
    return t
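
As a cross-check on the simulated values, the win probability can also be computed by direct enumeration. The sketch below is not part of the simulation: the function name `exact_win_probability` is hypothetical, and it assumes that each player is equally likely to place the parts of their split in either order, averaging over both orders for both players.

```python
from fractions import Fraction

def exact_win_probability(p_max, g_max, p_strat, g_strat):
    """Fraction of equally likely orderings in which the police are not outnumbered."""
    police_splits = [(p_strat, p_max - p_strat), (p_max - p_strat, p_strat)]
    guerrilla_splits = [(g_strat, g_max - g_strat), (g_max - g_strat, g_strat)]
    wins = sum(
        1
        for (A, B) in police_splits
        for (a, b) in guerrilla_splits
        if a <= A and b <= B  # same win condition as in play()
    )
    return Fraction(wins, len(police_splits) * len(guerrilla_splits))

print(exact_win_probability(8, 7, 4, 4))  # 1
print(exact_win_probability(8, 7, 5, 4))  # 1/2
print(exact_win_probability(8, 7, 6, 4))  # 0
```

Because the simulated estimate is noisy, it need not match the enumerated value exactly before it is rounded to a nearby fraction.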

The function above will be used in a nested for-loop to calculate the needed values and assign them to the correct entry in the results dataframe. The code below accomplishes the initialization of the results table created above.

for i in P['index']:
   ##print(f"Processing row {i}...") # Debugging line
   for j in G['index']:
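      # Use the integer strategy values directly; parsing the first character of the
      # name would fail for two-digit strategies such as "10 - 2"
      results.loc[P['strat_name'][i],G['strat_name'][j]] = play(p, g, int(P['strat'][i]), int(G['strat'][j]))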

results = np.round(results, 4)
# To give the "table" a title in a Jupyter Notebook display:
results.style.set_caption(f"<b>{p} Police vs. {g} Guerrillas</b>")
8 Police vs. 7 Guerrillas
       4 - 3  5 - 2  6 - 1  7 - 0
4 - 4      1      0      0      0
5 - 3    1/2    1/2      0      0
6 - 2      0    1/2    1/2      0
7 - 1      0      0    1/2    1/2
8 - 0      0      0      0    1/2