Solution sketch: Game theory¶

In [ ]:
import numpy as np
import quantecon.game_theory as gt
from quantecon import cartesian

1. In the notebook by Oyama, he writes a function for a sequential games and shows how to solve a sequential cournot game. Take the case of 2 players and the parameters we used above (grid with 100 values, a=100, c=1) and work, step-by-step to find a solution. Try to explain each step.¶

We start by creating our cournout game just as in the lab

In [ ]:
a, c = 100, 1
q_grid = np.arange(0,100,1)

q_grid
Out[ ]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
       68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
       85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])
In [ ]:
prod_combis = cartesian([q_grid, q_grid])
prod_combis
Out[ ]:
array([[ 0,  0],
       [ 0,  1],
       [ 0,  2],
       ...,
       [99, 97],
       [99, 98],
       [99, 99]])
In [ ]:
tot_prods = prod_combis.sum(axis=1)
tot_prods = tot_prods.reshape([100,100])
tot_prods
Out[ ]:
array([[  0,   1,   2, ...,  97,  98,  99],
       [  1,   2,   3, ...,  98,  99, 100],
       [  2,   3,   4, ...,  99, 100, 101],
       ...,
       [ 97,  98,  99, ..., 194, 195, 196],
       [ 98,  99, 100, ..., 195, 196, 197],
       [ 99, 100, 101, ..., 196, 197, 198]])

We create a payoff array for p1, which is the marginal profit for the player mp=a-c-Q (price less marginal cost, c)

In [ ]:
payoff_array_p1 = a-c-tot_prods
payoff_array_p1
Out[ ]:
array([[ 99,  98,  97, ...,   2,   1,   0],
       [ 98,  97,  96, ...,   1,   0,  -1],
       [ 97,  96,  95, ...,   0,  -1,  -2],
       ...,
       [  2,   1,   0, ..., -95, -96, -97],
       [  1,   0,  -1, ..., -96, -97, -98],
       [  0,  -1,  -2, ..., -97, -98, -99]])

To get the full payoff we then multiply by the production (in the form of our initial grid from 0 to 100)

In [ ]:
payoff_array = payoff_array_p1*q_grid.reshape([100,1])
payoff_array
Out[ ]:
array([[    0,     0,     0, ...,     0,     0,     0],
       [   98,    97,    96, ...,     1,     0,    -1],
       [  194,   192,   190, ...,     0,    -2,    -4],
       ...,
       [  194,    97,     0, ..., -9215, -9312, -9409],
       [   98,     0,   -98, ..., -9408, -9506, -9604],
       [    0,   -99,  -198, ..., -9603, -9702, -9801]])
In [ ]:
player1 = gt.Player(payoff_array)
player2 = gt.Player(payoff_array)
In [ ]:
cournGame = gt.NormalFormGame([player1, player2])

Now instead of using a brute force solver to find the nash equilibrium, we choose initial quantities, and iteratively let each player choose a best production based on the others choice.

In [ ]:
N=2 #players
a = np.array([0,0]) #set initial actions for each player to each produce 0
In [ ]:
a
Out[ ]:
array([0, 0])
In [ ]:
 
In [ ]:
new_a = np.empty(N, dtype=int) #the new action based on observing the action of the other
max_iter = np.prod(cournGame.nums_actions) #np.prod - product of the array - gives the max number of iterations
#set to the size of the action space 100x100 = 10,000
In [ ]:
cournGame.players
Out[ ]:
(Player([[    0,     0,     0, ...,     0,     0,     0],
         [   98,    97,    96, ...,     1,     0,    -1],
         [  194,   192,   190, ...,     0,    -2,    -4],
         ...,
         [  194,    97,     0, ..., -9215, -9312, -9409],
         [   98,     0,   -98, ..., -9408, -9506, -9604],
         [    0,   -99,  -198, ..., -9603, -9702, -9801]]),
 Player([[    0,     0,     0, ...,     0,     0,     0],
         [   98,    97,    96, ...,     1,     0,    -1],
         [  194,   192,   190, ...,     0,    -2,    -4],
         ...,
         [  194,    97,     0, ..., -9215, -9312, -9409],
         [   98,     0,   -98, ..., -9408, -9506, -9604],
         [    0,   -99,  -198, ..., -9603, -9702, -9801]]))

Below is the for loop that goes iteratively from the starting actions (0,0) and finds the best response for each of the players, then updates the actions

In [ ]:
tie_breaking='smallest'

for t in range(max_iter):
        new_a[:] = a #updates a
        for i, player in enumerate(cournGame.players): #iterates over both players
            a_except_i = new_a[1-i] #the other players action
            #the best response given the other players action
            new_a[i] = player.best_response(a_except_i,
                                            tie_breaking=tie_breaking) 
            print('player {0}: {1}'.format(i, new_a))
        if np.array_equal(new_a, a):
            break
        else:
            a[:] = new_a
player 0: [49  0]
player 1: [49 25]
player 0: [37 25]
player 1: [37 31]
player 0: [34 31]
player 1: [34 32]
player 0: [33 32]
player 1: [33 33]
player 0: [33 33]
player 1: [33 33]

Here we see we converge on the actions (33, 33)

  1. For our cournot game, consider the situation with two producers, but where the marginal costs are different. Player 1 has marginal costs of 1 and player 2 has marginal costs of 1.1. Can you compute a nash equilibria in this case (assuming all other parameters are the same? Can you experiment with using a finer grid?
In [ ]:
a = 100
c1 = 1 #now two seperate marginal costs for each producer
c2 = 1.1 
q_grid = np.arange(0,100,1)

Creation of grid is unchanged

In [ ]:
prod_combis = cartesian([q_grid, q_grid])
tot_prods = prod_combis.sum(-1)
tot_prods = tot_prods.reshape([100,100])

Now payoff arrays for each player

In [ ]:
payoff_array_p1 = a-c1-tot_prods
payoff_array1 = payoff_array_p1*q_grid.reshape([100,1])
In [ ]:
payoff_array_p2 = a-c2-tot_prods
payoff_array2 = payoff_array_p2*q_grid.reshape([100,1])
In [ ]:
payoff_array1 = payoff_array1.astype(float)
In [ ]:
player1 = gt.Player(payoff_array1)
player2 = gt.Player(payoff_array2)
In [ ]:
 
In [ ]:
payoff_array_p2.dtype
Out[ ]:
dtype('float64')
In [ ]:
cournGame = gt.NormalFormGame([player1, player2])
In [ ]:
courn_NEs = gt.pure_nash_brute(cournGame)
print(courn_NEs)
[(33, 33), (34, 32)]
In [ ]:
 
In [ ]: