Linear Algebra II¶

MET 430¶

NTNU Business School¶

Johannes Mauritzen¶

Learning Goals¶

  • Understand how a solution to a system of equations can be represented by an upper triangular matrix
  • Understand how to use row operations to obtain a solution matrix for a system of equations
  • Understand the geometric interpretation of vectors and vector operations
  • Understand the geometric meaning of orthogonality of two vectors.

Literature¶

  • EMEA Ch. 12.8-12.10
  • PPNM Ch. 14.1, 14.5

Gaussian Elimination¶

Now we return to our original motivation of solving a system of linear equations. Previously, we discussed our "ad hoc" method of substituting one equation into another in order to eliminate a variable. In this first of two videos, we will show how we can systematize this method using matrix notation, in what is called Gaussian elimination. The goal is a straightforward step-by-step procedure for solving a system of equations of any size. Importantly, we can also program a computer to do these steps.

Gaussian Elimination¶

Now that we have seen how a solution to a system of equations can be represented by an upper triangular matrix, we show how we can use row operations on an augmented coefficient matrix in order to obtain such a matrix.

Here is the Khan Academy video for using Gaussian elimination to solve a system of equations.
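To give a flavor of how these steps can be programmed, here is a minimal Python sketch (my own illustration, not from the textbook): forward elimination turns the augmented coefficient matrix into upper triangular form, and back substitution then reads off the solution. It assumes no pivot element is ever zero, so it is not a robust general-purpose solver.

In [ ]:
import numpy as np

def gaussian_elimination(A, b):
    # build the augmented coefficient matrix [A | b]
    M = np.hstack([A.astype(float), b.reshape(-1, 1).astype(float)])
    n = len(b)

    # forward elimination: create zeros below each pivot (upper triangular form)
    for i in range(n):
        for j in range(i + 1, n):
            M[j] = M[j] - (M[j, i] / M[i, i]) * M[i]

    # back substitution: solve for x starting from the last row
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (M[i, -1] - M[i, i + 1:n] @ x[i + 1:]) / M[i, i]
    return x

# small example system (made up for illustration)
A = np.array([[1, 1, 1],
              [1, 2, 3],
              [2, 3, 1]])
b = np.array([6, 14, 11])
print(gaussian_elimination(A, b))   # [1. 2. 3.], the same as np.linalg.solve(A, b)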

Try it¶

Solve the following system by Gaussian elimination

$$\begin{aligned} x_1 + 2x_2 + x_3 &= 4 \\ x_1 - x_2 + x_3 &= 5 \\ 2x_1 + 3x_2 - x_3 &= 1 \end{aligned}$$

(EMEA exercise 12.8.1)

Geometric interpretation of vectors - 2d (video 2.3)¶

Operationally, vectors are just matrices where one of the dimensions is 1. But they hold a special place in linear algebra, partly because of the clear geometric interpretation we can give them. In the rest of this lab, we spend time discussing this interpretation. We start with a simple 2-dimensional space:

Try it¶

On a 2-dimensional plot, draw the vector $\mathbf{a} = \begin{bmatrix} 5\\-1 \end{bmatrix}$
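If you want to check your drawing afterwards, here is one possible way to plot a vector from the origin in Python (a minimal matplotlib sketch; the `quiver` arguments just tell matplotlib to draw the arrow in ordinary data coordinates):

In [ ]:
import matplotlib.pyplot as plt

# draw the vector [5, -1] as an arrow starting at the origin
fig, ax = plt.subplots()
ax.quiver(0, 0, 5, -1, angles='xy', scale_units='xy', scale=1, color='blue')
ax.axhline(0, color='gray', linewidth=0.5)  # x-axis
ax.axvline(0, color='gray', linewidth=0.5)  # y-axis
ax.set_xlim(-1, 6)
ax.set_ylim(-3, 2)
ax.grid(True)
plt.show()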

Geometric interpretation of vectors: vector operations (video 2.4)¶

Vector operations like addition, subtraction and scalar multiplication can also be given straightforward geometric interpretations:

Addition and subtraction¶

Scalar multiplication¶

Here is the Khan Academy lesson on vectors in linear algebra.
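As a quick numerical illustration (with vectors chosen just for this example), componentwise addition and scalar multiplication in NumPy mirror the geometric operations above:

In [ ]:
import numpy as np

u = np.array([1, 3])
v = np.array([2, -1])

print(u + v)     # [3 2]      -- place v tip-to-tail after u
print(u - v)     # [-1 4]     -- the vector pointing from v to u
print(3 * u)     # [3 9]      -- stretch u to three times its length
print(-0.5 * v)  # [-1.  0.5] -- flip v and halve its length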

Try it¶

Let $\mathbf{a} = [5,-1]$ and $\mathbf{b} = [-2,4]$. Compute $\mathbf{a+b}$ and $\mathbf{-\frac{1}{2}a}$, then illustrate geometrically with vectors starting at the origin.

(EMEA Exercise 12.9.1)

Vector length / norm (video 2.5)¶

In two dimensions, we define our vector by a width (x-axis) and a height (y-axis). From this information, we can easily calculate the length of the vector using the Pythagorean theorem. We can extend this idea to vectors of higher dimension. The length of a vector is called its norm.
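In symbols, for a vector $\mathbf{a} = [a_1, a_2, \ldots, a_n]$ the norm is

$$\Vert \mathbf{a} \Vert = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$$

which in two dimensions is exactly the Pythagorean theorem.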

Try it¶

Let $\mathbf{a} = [1,2,2]$, $\mathbf{b}=[0,0,-3]$ and $\mathbf{c}=[-2,4,-3]$. Compute $\|\mathbf{a}\|$, $\|\mathbf{b}\|$ and $\|\mathbf{c}\|$. Verify that $\mathbf{a}$ and $\mathbf{b}$ satisfy the Cauchy-Schwarz inequality (see EMEA p. 484).

Orthogonality (video 2.6)¶

Mathematically, orthogonality is a simple concept. In two dimensions, it just means that two vectors point at a 90 degree angle to each other. But the concept of orthogonality has major implications for how we use and even interpret mathematical models that make use of linear algebra. For example, in econometrics we use orthogonality to help us decide whether a variable can be interpreted as causal.
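In symbols, two vectors $\mathbf{a}$ and $\mathbf{b}$ are orthogonal exactly when their dot product is zero:

$$\mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n = 0$$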

Try it¶

Check which of the following three pairs of vectors are orthogonal:

a.) [1,2] and [-2,1]

b.) [1,-1,1] and [-1,1,-1]

c.) [a,-b,1] and [b,a,0]

(EMEA Exercise 12.9.7)

Lines and vectors (video 2.7)¶

To someone who does not study mathematics, a vector and a line can appear to be the same. They are not. A vector has a direction. A line extends infinitely in both directions. But vectors and lines are related. In fact, we can use two vectors to define any line:

Try it¶

Find the equation for the line:

a.) that passes through the points (3,-2,2) and (10,2,1)

b.) that passes through the point (1,3,2) and has the same direction as the vector $[0,-1,1]$

(EMEA Exercise 12.10.1)

Hint: If you have a point $\mathbf{p}$ and a vector $\mathbf{a}$, then the line $\mathbf{x}$ that passes through $\mathbf{p}$ in the direction of $\mathbf{a}$ can be written $\mathbf{x} = \mathbf{p} + t\mathbf{a}$, where $t$ is a scalar. So your answer in (b) will be three equations (for $x_1$, $x_2$, and $x_3$) as functions of the scalar $t$.
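For example (with a point and direction chosen just for illustration, not from the exercise), the line through $\mathbf{p} = (1, 0, 2)$ in the direction of $\mathbf{a} = [2, 1, -1]$ is

$$x_1 = 1 + 2t, \qquad x_2 = t, \qquad x_3 = 2 - t$$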

Planes and vectors (video 2.8)¶

Visualising vectors and lines in two dimensions is relatively straightforward. But even just going up to 3 dimensions makes the intuition quite a bit harder. Here we move from defining a 1-dimensional line to a 2-dimensional plane (imagine a piece of paper floating in the air that extends infinitely from each edge).

The visualisation and intuition become tough here, so we try it both with pen and paper and with my super sophisticated 3D simulator.

If you still are having a hard time with the intuition, don't worry, that's normal.
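For reference, the standard algebraic description of a plane uses a point $\mathbf{p}$ in the plane and a normal vector $\mathbf{n}$ (a vector orthogonal to the plane): the plane consists of all points $\mathbf{x}$ satisfying

$$\mathbf{n} \cdot (\mathbf{x} - \mathbf{p}) = 0$$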

Bonus: Video with super sophisticated 3D simulator¶

Vectors and solving systems of linear equations in Python¶

In [ ]:
import numpy as np
from numpy.linalg import norm

We start by defining a row vector, then transposing it to make a column vector:

In [ ]:
vector_row = np.array([[2,2]])

myVector = vector_row.T

myVector
Out[ ]:
array([[2],
       [2]])

Length and norm¶

To calculate the length (or norm) as we defined above, we can use the norm function from numpy:

In [ ]:
norm(myVector, 2)
Out[ ]:
2.8284271247461903

The "2" paramater refers to the fact that we are calculating the norm by taking the square (2) root of the sum of squared entries in the vector.

If we want to calculate the norm as the sum of absolute values (the 1-norm), we would calculate:

In [ ]:
norm(myVector,1)
Out[ ]:
4.0
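In symbols, this is the 1-norm,

$$\Vert \mathbf{v} \Vert_1 = |v_1| + |v_2| + \cdots + |v_n|,$$

which for our vector $[2, 2]$ gives $2 + 2 = 4$, matching the output above.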

Angles and orthogonality¶

The following formula relates the angle $\theta$ between two vectors $v$ and $w$ to their dot product: $$v \cdot w = \Vert v \Vert \, \Vert w \Vert \cos(\theta)$$

Creating two vectors

In [ ]:
v = np.array([1,2]).T #try to be consistent in representing vectors as column vectors
w = np.array([3, 1]).T #(note: .T has no effect on a 1-D numpy array, but we keep the convention)

Recall that the inverse of $\cos$ is $\arccos$: if $x=\cos(\theta)$, then $\arccos(x) = \theta$.
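So we can solve the formula above for the angle:

$$\theta = \arccos\!\left(\frac{v \cdot w}{\Vert v \Vert \, \Vert w \Vert}\right)$$

This is exactly what the code below computes.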

In [ ]:
theta = np.arccos(np.dot(v, w.T)/(norm(v,2)*norm(w,2)))

theta
Out[ ]:
0.7853981633974484

Python returns angles in radians, not degrees: $\pi \approx 3.1416$ corresponds to $180^{\circ}$, and $\pi/2$ corresponds to $90^{\circ}$.
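If you prefer degrees, NumPy can convert the angle for you:

In [ ]:
# convert the angle from radians to degrees
np.degrees(theta)   # approximately 45 degrees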

In our lecture we said that a test for whether two vectors are orthogonal is whether their inner ("dot") product is equal to zero. Recall that the definition of orthogonal vectors is that they point at a 90 degree angle to each other.

Let's check this.

First, let's define two vectors whose inner product is 0

In [ ]:
v = np.array([1,0,0])
w = np.array([0,2,1])

np.dot(v,w.T)
Out[ ]:
0

Now let's calculate the angle:

In [ ]:
theta = np.arccos(np.dot(v, w.T)/(norm(v,2)*norm(w,2)))

theta
Out[ ]:
1.5707963267948966

which is $\pi/2$, that is, $90^{\circ}$

In [ ]:
np.pi/2
Out[ ]:
1.5707963267948966

Solving systems of equations with Python¶

Remember all the work required to solve even a 3x3 system of equations by hand? Well, this will either make you happy or frustrated over all the wasted time:

In [ ]:
import numpy as np

A = np.array([[4, 3, -5], 
              [-2, -4, 5], 
              [8, 8, 0]])
y = np.array([2, 5, -3])

x = np.linalg.solve(A, y)
print(x)
[ 2.20833333 -2.58333333 -0.18333333]

Behind the scenes, the algorithm being used is not actually the Gaussian elimination we did by hand (though that could easily be programmed). Instead, it uses something called LU decomposition, which is more efficient (quicker). We won't cover this algorithm in this course, but you can read more about it here
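If you are curious, here is a minimal peek (not required for the course) at what the LU factors of the matrix A above look like, using scipy.linalg.lu:

In [ ]:
from scipy.linalg import lu

# factor A as P @ L @ U, where P is a permutation matrix,
# L is lower triangular and U is upper triangular
P, L, U = lu(A)

print(L)
print(U)
print(np.allclose(P @ L @ U, A))   # True: the factorization reproduces A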

Assignment¶

1.) Gaussian elimination¶

Solve each of the following equation systems by Gaussian elimination

a.)

$$\begin{aligned} x_1 + 4x_2 &= 1 \\ 2x_1 + 2x_2 &= 8 \end{aligned}$$

b.)

$$\begin{aligned} 2x_1 + 2x_2 - x_3 &= 2 \\ x_1 - 3x_2 + x_3 &= 0 \\ 3x_1 + 4x_2 - x_3 &= 1 \end{aligned}$$

c.)

$$\begin{aligned} x_1 + 3x_2 + 4x_3 &= 0 \\ 5x_1 + x_2 + x_3 &= 0 \end{aligned}$$

(EMEA Review exercise 12.7)

2.) Length and norms¶

Let $\mathbf{a} = [-1,5,3]$, $\mathbf{b} = [1,1,-3]$ and $\mathbf{c} = [-1,2,8]$. Compute $\|\mathbf{a}\|$, $\|\mathbf{b}\|$ and $\|\mathbf{c}\|$. Then verify that the Cauchy-Schwarz inequality holds for $\mathbf{a}$ and $\mathbf{b}$.

(EMEA Review exercise 12.9)

3.) Computational problem: drawing vectors with Python¶

In this lecture from the QuantEcon website, code for drawing vectors in Python is presented. It is copied below with three 2-dimensional example vectors:

In [ ]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (11, 5)  # set default figure size
import numpy as np
In [ ]:
fig, ax = plt.subplots(figsize=(10, 8))
# Set the axes through the origin
for spine in ['left', 'bottom']:
    ax.spines[spine].set_position('zero')
for spine in ['right', 'top']:
    ax.spines[spine].set_color('none')

ax.set(xlim=(-5, 5), ylim=(-5, 5))
ax.grid()
vecs = ((2, 4), (-3, 3), (-4, -3.5))
for v in vecs:
    ax.annotate('', xy=v, xytext=(0, 0),
                arrowprops=dict(facecolor='blue',
                shrink=0,
                alpha=0.7,
                width=0.5))
    ax.text(1.1 * v[0], 1.1 * v[1], str(v))
plt.show()

See if you can draw a similar figure for the vectors $\mathbf{a} = [3,2]$ and $\mathbf{b} = [1,-2]$, as well as the vector $\mathbf{c}$ formed by $\mathbf{c=a+b}$. First, alter the code above. Then see if you can draw a similar figure, but instead of the annotate command, use the plt.plot() command.

4.) Computational problem: solving a larg(er) system of equations in Python¶

Below is a system of equations

$$\begin{aligned} x_1 + 2x_2 + 3x_3 &= 80 \\ x_1 + 2x_2 + 2x_4 &= 100 \\ x_1 + 2x_3 + 2x_4 &= 75 \\ 2x_1 + x_2 + x_3 + 3x_4 &= 100 \end{aligned}$$

Put this system in matrix form and solve numerically.

5.) Showing the relationship between orthogonality and OLS.¶

In an OLS regression, the predicted values (or "fitted" values) from a regression model will be orthogonal to the residuals of the model.

We simulate data as below for the model $y=\alpha + \beta_1 x + \beta_2 z + e$, where $e\sim N(0,1)$:

In [ ]:
import numpy as np
#initiate a random number generator
rng = np.random.default_rng()

#generate 10,000 random numbers from a normal distribution with mean 3 and standard deviation 2
x = rng.normal(3, 2, 10000)
zeta = .5
z = zeta*x + rng.normal(0, 1, 10000)
e = rng.normal(0, 1, 10000)

beta1 = .3
beta2 = .2
alpha = 2
y = alpha + beta1*x + beta2*z + e

Now we can use statsmodels to estimate an incorrectly specified model with only x as an explanatory variable in the OLS regression:

In [ ]:
#import the statsmodels API
import statsmodels.api as sm


mod1 = sm.OLS(y, sm.add_constant(x)).fit()
#the "add_constant()" command simply says we also want to estimate an intercept term.

#we save the residuals and fitted values for later use
resids1 = mod1.resid
fitted1 = mod1.fittedvalues
mod1.summary()
Out[ ]:
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.385
Model:                            OLS   Adj. R-squared:                  0.384
Method:                 Least Squares   F-statistic:                     6246.
Date:                Fri, 14 Apr 2023   Prob (F-statistic):               0.00
Time:                        15:23:59   Log-Likelihood:                 -14337.
No. Observations:               10000   AIC:                         2.868e+04
Df Residuals:                    9998   BIC:                         2.869e+04
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          1.9937      0.018    108.904      0.000       1.958       2.030
x1             0.4000      0.005     79.031      0.000       0.390       0.410
==============================================================================
Omnibus:                        1.695   Durbin-Watson:                   2.005
Prob(Omnibus):                  0.429   Jarque-Bera (JB):                1.694
Skew:                          -0.013   Prob(JB):                        0.429
Kurtosis:                       2.942   Cond. No.                         6.88
==============================================================================


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

Here we see a result we know from a multivariate statistics course: if a variable z that is correlated with both x and y is left out of the model, then we get a biased estimate of the coefficient on x (omitted variable bias). Here we get an estimate of about .4, where the true value is really .3; since $z = 0.5x + \text{noise}$, the bias is $\beta_2 \cdot \mathrm{Cov}(x,z)/\mathrm{Var}(x) = 0.2 \cdot 0.5 = 0.1$.

Estimate the correctly specified model with both x and z as explanatory variables (remember to include a constant term). Save the residuals and fitted values in objects. Show that in both the correctly and incorrectly specified models, the vector of residuals and the vector of fitted values are orthogonal to each other. Can you provide intuition for this?