Linear Algebra & Geometry

Roman Schubert

School of Mathematics, University of Bristol

May 22, 2012
© University of Bristol 2011. This material is copyright of the University unless explicitly stated otherwise. It is provided exclusively for educational purposes at the University and is to be downloaded or copied for your private study only.

These are lecture notes for the first-year Linear Algebra and Geometry course in Bristol. This is an evolving version of them, and it is very likely that they still contain many misprints. Please report serious errors you find to me (roman.schubert@bristol.ac.uk) and I will post an update on the Blackboard page of the course.

These notes cover the main material we will develop in the course, and they are meant to be used in parallel with the lectures. The lectures will follow roughly the content of the notes, but sometimes in a different order and sometimes with additional material. On the other hand, we sometimes refer in the lectures to additional material which is covered in the notes. Besides the lectures and the lecture notes, the homework on the problem sheets is the third main ingredient in the course. Solving problems is the most efficient way of learning mathematics, and experience shows that students who regularly hand in homework do reasonably well in the exams.

These lecture notes do not replace a proper textbook on linear algebra. Since linear algebra appears in almost every area of mathematics, a slightly more advanced textbook which complements the lecture notes will be a good companion throughout your mathematics courses. There is a wide choice of books in the library you can consult.
Contents

1 The Euclidean plane and complex numbers
  1.1 The Euclidean plane R^2
  1.2 The dot product and angles
  1.3 Complex numbers
2 Euclidean space R^n
  2.1 Dot product
  2.2 Angle between vectors in R^n
  2.3 Linear subspaces
3 Linear equations and matrices
  3.1 Matrices
  3.2 The structure of the set of solutions to a system of linear equations
  3.3 Solving systems of linear equations
    3.3.1 Elementary row operations
  3.4 Elementary matrices and inverting a matrix
4 Linear independence, bases and dimension
  4.1 Linear dependence and independence
  4.2 Bases and dimension
  4.3 Orthonormal bases
5 Linear maps
  5.1 Abstract properties of linear maps
  5.2 Matrices
  5.3 Rank and nullity
6 Determinants
  6.1 Definition and basic properties
  6.2 Computing determinants
  6.3 Some applications of determinants
    6.3.1 Inverse matrices and linear systems of equations
    6.3.2 Bases
    6.3.3 Cross product
7 Vector spaces
  7.1 On numbers
  7.2 Vector spaces
  7.3 Subspaces
  7.4 Linear maps
  7.5 Bases and dimension
  7.6 Direct sums
  7.7 The rank–nullity theorem
  7.8 Projections
  7.9 Isomorphisms
  7.10 Change of basis and coordinate change
  7.11 Linear maps and matrices
8 Eigenvalues and eigenvectors
9 Inner product spaces
10 Linear maps on inner product spaces
  10.1 Complex inner product spaces
  10.2 Real matrices
Chapter 1

The Euclidean plane and complex numbers
1.1 The Euclidean plane R^2

To develop some familiarity with the basic concepts of linear algebra, let us start by discussing the Euclidean plane R^2.

Definition 1.1. The set R^2 consists of ordered pairs (x, y) of real numbers x, y ∈ R.

Remarks:

• In the lectures we will often denote elements of R^2 by underlined letters and arrange the numbers x, y vertically as a column. Other common notations for elements of R^2 are boldface letters v, which is the notation we will use in these notes, or an arrow above the letter. But often no special notation is used at all, and one writes v ∈ R^2 and v = (x, y).

• That the pair is ordered means that (x, y) ≠ (y, x) if x ≠ y.

• The two numbers x and y are called the x-component, or first component, and the y-component, or second component, respectively. For instance, the vector (1, 2) has x-component 1 and y-component 2.

• We visualise a vector in R^2 as a point in the plane, with the x-component on the horizontal axis and the y-component on the vertical axis.
Figure 1.1: Left: an element v = (x, y) of R^2 represented by a vector in the plane. Right: vector addition v + w, and the negative −w.

We will define two operations on vectors. The first one is addition:
Definition 1.2. Let v = (v1, v2), w = (w1, w2) ∈ R^2. Then we define the sum of v and w by

    v + w := (v1 + w1, v2 + w2).

And the second operation is multiplication by real numbers:

Definition 1.3. Let v = (v1, v2) ∈ R^2 and λ ∈ R. Then we define the product of v by λ by

    λv := (λv1, λv2).

Some typical quantities in nature which are described by vectors are velocities and forces. The addition of vectors appears naturally for these: for example, if a ship moves through the water with velocity vS and there is a current in the water with velocity vC, then the velocity of the ship over ground is vS + vC.

By combining these two operations we can form expressions like αv + βw for v, w ∈ R^2 and α, β ∈ R; we call this a linear combination of v and w. For instance

    5(1, 1) + 6(0, 2) = (5, 5) + (0, 12) = (5, 17).

We can as well consider linear combinations of k vectors v1, v2, …, vk ∈ R^2 with coefficients λ1, λ2, …, λk ∈ R:

    λ1 v1 + λ2 v2 + ⋯ + λk vk = Σ_{i=1}^k λi vi.
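The component-wise operations above are easy to mirror in code. Here is a minimal sketch in Python; the helper names `add`, `scale`, and `lincomb` are ours, not from the notes:

```python
# Vectors in R^2 (or R^n) represented as tuples; helpers are illustrative.

def add(v, w):
    """Component-wise sum of two vectors."""
    return tuple(vi + wi for vi, wi in zip(v, w))

def scale(lam, v):
    """Multiply a vector by the real number lam."""
    return tuple(lam * vi for vi in v)

def lincomb(coeffs, vectors):
    """Linear combination sum_i coeffs[i] * vectors[i]."""
    result = (0,) * len(vectors[0])
    for lam, v in zip(coeffs, vectors):
        result = add(result, scale(lam, v))
    return result

# The linear combination 5*(1,1) + 6*(0,2) from the text:
print(lincomb([5, 6], [(1, 1), (0, 2)]))  # (5, 17)
```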
Notice that 0v = (0, 0) for any v ∈ R^2, and we will in the following denote the vector whose entries are both 0 by 0, so we have

    v + 0 = 0 + v = v

for any v ∈ R^2. We will also use the shorthand −v to denote (−1)v, and w − v := w + (−1)v. Notice that with this notation

    v − v = 0

for all v ∈ R^2.

The norm of a vector is defined as follows:

Definition 1.4. Let v = (v1, v2) ∈ R^2. Then the norm of v is defined by

    ‖v‖ := √(v1² + v2²).

By Pythagoras' theorem the norm is just the geometric distance between the point in the plane with coordinates (v1, v2) and the origin 0. Furthermore, ‖v − w‖ is the distance between the points v and w.

For instance, the norm of a vector of the form v = (5, 0), which has no y-component, is just ‖v‖ = 5, whereas for w = (3, 1) we find ‖w‖ = √(9 + 1) = √10, and the distance between v and w is ‖v − w‖ = √(4 + 1) = √5.

Let us now look at how the norm relates to the structures we defined previously, namely addition and scalar multiplication:
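The norm and distance computations in the example can be checked directly; a small sketch in Python (the function name `norm` is ours):

```python
import math

def norm(v):
    """Euclidean norm ||v|| = sqrt(v1^2 + v2^2 + ...)."""
    return math.sqrt(sum(vi * vi for vi in v))

v = (5, 0)
w = (3, 1)
print(norm(v))                                   # 5.0
print(norm(w))                                   # sqrt(10) ~ 3.1623
print(norm(tuple(a - b for a, b in zip(v, w))))  # ||v - w|| = sqrt(5) ~ 2.2361
```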
Theorem 1.5. The norm satisfies

(i) ‖v‖ ≥ 0 for all v ∈ R^2, and ‖v‖ = 0 if and only if v = 0;
(ii) ‖λv‖ = |λ|‖v‖ for all λ ∈ R, v ∈ R^2;
(iii) ‖v + w‖ ≤ ‖v‖ + ‖w‖ for all v, w ∈ R^2.

Proof. We will only prove the first two statements; the third, which is called the triangle inequality, will be proved in the exercises.

For the first statement we use the definition of the norm, ‖v‖ = √(v1² + v2²) ≥ 0. It is clear that ‖0‖ = 0. Conversely, if ‖v‖ = 0, then v1² + v2² = 0; but this is a sum of two non-negative numbers, so in order that they add up to 0 they must both be 0, hence v1 = v2 = 0 and so v = 0.

The second statement follows from a direct computation:

    ‖λv‖ = √((λv1)² + (λv2)²) = √(λ²(v1² + v2²)) = √(λ²) √(v1² + v2²) = |λ|‖v‖.

We have represented a vector by its two components and interpreted them as Cartesian coordinates of a point in R^2. We could specify a point in R^2 as well by giving its distance to the origin and the angle between the x-axis and the line connecting the point to the origin. We will develop this idea, which leads to polar coordinates in calculus, a bit more:
Definition 1.6. A vector u ∈ R^2 is called a unit vector if ‖u‖ = 1.

Remark: A unit vector has length one, hence all unit vectors lie on the circle of radius one in R^2, and therefore a unit vector is determined by its angle θ with the x-axis. By elementary geometry we find that the unit vector with angle θ to the x-axis is given by

    u(θ) := (cos θ, sin θ).    (1.1)

Theorem 1.7. For every v ∈ R^2, v ≠ 0, there exist unique θ ∈ [0, 2π) and ρ ∈ (0, ∞) with

    v = ρ u(θ).

Figure 1.2: A vector v in R^2 represented by Cartesian coordinates (x, y) or by polar coordinates ρ, θ. We have x = ρ cos θ, y = ρ sin θ, and ρ = √(x² + y²), tan θ = y/x.

Proof. Given v = (v1, v2) ≠ 0 we have to find ρ > 0 and θ ∈ [0, 2π) such that

    (v1, v2) = ρ u(θ) = (ρ cos θ, ρ sin θ).

Since ‖ρ u(θ)‖ = ρ‖u(θ)‖ = ρ (note that ρ > 0, hence |ρ| = ρ), we get immediately

    ρ = ‖v‖.

To determine θ we have to solve the two equations

    cos θ = v1/‖v‖,   sin θ = v2/‖v‖,

which is in principle easy, but we have to be a bit careful with the signs of v1, v2. If v2 > 0 we can divide the first equation by the second and obtain cos θ / sin θ = v1/v2, hence

    θ = arccot(v1/v2) ∈ (0, π).

Similarly, if v1 > 0 we obtain θ = arctan(v2/v1), and analogous relations hold if v1 < 0 and v2 < 0.

The converse is of course also true: given θ ∈ [0, 2π) and ρ ≥ 0 we get a unique vector with direction θ and length ρ:

    v = ρ u(θ) = (ρ cos θ, ρ sin θ).
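The sign case analysis in the proof is exactly what the two-argument arctangent does for us. A sketch in Python, with our own helper names `to_polar` and `from_polar`:

```python
import math

def to_polar(v):
    """Return (rho, theta) with v = rho * (cos theta, sin theta), theta in [0, 2*pi).

    The notes resolve the sign cases of (v1, v2) by hand; math.atan2 does the
    same case analysis, returning an angle in (-pi, pi] which we shift into
    [0, 2*pi)."""
    v1, v2 = v
    rho = math.hypot(v1, v2)
    theta = math.atan2(v2, v1) % (2 * math.pi)
    return rho, theta

def from_polar(rho, theta):
    """The converse map: Cartesian components of rho * u(theta)."""
    return (rho * math.cos(theta), rho * math.sin(theta))

rho, theta = to_polar((1, 1))
print(rho, theta)  # sqrt(2), pi/4
```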
1.2 The dot product and angles

Definition 1.8. Let v = (v1, v2), w = (w1, w2) ∈ R^2. Then we define the dot product of v and w by

    v·w := v1 w1 + v2 w2.

Note that v·v = ‖v‖², hence ‖v‖ = √(v·v).

The dot product is closely related to the angle between vectors:

Theorem 1.9. Let θ be the angle between v and w. Then

    v·w = ‖v‖‖w‖ cos θ.

Proof. There are several ways to prove this result; let us present two.

(i) The first method uses the following trigonometric identity:

    cos φ cos θ + sin φ sin θ = cos(φ − θ).    (1.2)

We will give a proof of this identity in (1.9). We use the representation of vectors by length and angle relative to the x-axis, see Theorem 1.7, i.e., v = ‖v‖ u(θv) and w = ‖w‖ u(θw), where θv and θw are the angles of v and w with the x-axis, respectively. Using these we get

    v·w = ‖v‖‖w‖ u(θv)·u(θw).

So we have to compute u(θv)·u(θw), and using the trigonometric identity (1.2) we obtain

    u(θv)·u(θw) = cos θv cos θw + sin θv sin θw = cos(θw − θv),

and this completes the proof since θ = θw − θv.

(ii) A different proof can be given using the law of cosines, which was proved in the exercises. The sides of the triangle spanned by the vectors v and w have lengths ‖v‖, ‖w‖ and ‖v − w‖. Applying the law of cosines together with ‖v − w‖² = ‖v‖² + ‖w‖² − 2v·w gives the result.

Remarks:

(i) If v and w are orthogonal, then v·w = 0.

(ii) If we rewrite the result as

    cos θ = v·w / (‖v‖‖w‖)    (1.3)

for v, w ≠ 0, then we see that we can compute the angle between vectors from the dot product. For instance, if v = (−1, 7) and w = (2, 1), then we find v·w = 5, ‖v‖ = √50 and ‖w‖ = √5, hence cos θ = 5/√250 = 1/√10.

(iii) Another consequence of the result above is that since |cos θ| ≤ 1 we have

    |v·w| ≤ ‖v‖‖w‖.    (1.4)

This is called the Cauchy–Schwarz inequality, and we will prove a more general form of it later.
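Formula (1.3) translates directly into code; here is a sketch checking the worked example above (the function name `angle` is ours):

```python
import math

def angle(v, w):
    """Angle between nonzero vectors via cos(theta) = v.w / (||v|| ||w||)."""
    dot = sum(a * b for a, b in zip(v, w))
    nv = math.sqrt(sum(a * a for a in v))
    nw = math.sqrt(sum(b * b for b in w))
    return math.acos(dot / (nv * nw))

# The example from the text: v = (-1, 7), w = (2, 1), cos(theta) = 1/sqrt(10).
theta = angle((-1, 7), (2, 1))
print(math.cos(theta))  # ~0.3162 = 1/sqrt(10)
```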
1.3 Complex numbers

One way of looking at complex numbers is to view them as elements of R^2 which can be multiplied; this is a nice application of the theory of R^2 we have developed so far. The basic idea underlying the introduction of complex numbers is to extend the set of real numbers in such a way that polynomial equations have solutions. The standard example is the equation

    x² = −1,

which has no solution in R. We then introduce, in a formal way, a new number i with the property

    i² = −1,

which is a solution to this equation. The set of complex numbers is the set of linear combinations of multiples of i and real numbers:

    C := {x + iy : x, y ∈ R}.

We will denote complex numbers by z = x + iy and call x = Re z the real part of z and y = Im z the imaginary part of z.

We define an addition and a multiplication on this set by setting, for z1 = x1 + iy1 and z2 = x2 + iy2,

    z1 + z2 := x1 + x2 + i(y1 + y2),
    z1 z2 := x1 x2 − y1 y2 + i(x1 y2 + x2 y1).

Notice that the definition of multiplication just follows if we multiply z1 z2 like normal numbers and use i² = −1:

    z1 z2 = (x1 + iy1)(x2 + iy2) = x1 x2 + i x1 y2 + i y1 x2 + i² y1 y2 = x1 x2 − y1 y2 + i(x1 y2 + x2 y1).

A complex number is defined by a pair of real numbers, and so we can associate a vector in R^2 with every complex number z = x + iy by v(z) = (x, y). I.e., with every complex number we associate a point in the plane, which we then call the complex plane. E.g., if z = x is real, then the corresponding vector lies on the real axis; if z = i, then v(i) = (0, 1), and any purely imaginary number z = iy lies on the y-axis. The addition of vectors corresponds to the addition of complex numbers as we have defined it, i.e.,

    v(z1 + z2) = v(z1) + v(z2).
Figure 1.3: Complex numbers as points in the plane: with the complex number z = x + iy we associate the point v(z) = (x, y) ∈ R^2.

But the multiplication is a new operation which had no correspondence for vectors. Therefore we want to study the geometric interpretation of multiplication a bit more carefully. To this end let us first introduce another operation on complex numbers, complex conjugation: for z = x + iy we define

    z̄ := x − iy.

This corresponds to reflection in the x-axis. Using complex conjugation we find

    z z̄ = (x + iy)(x − iy) = x² − ixy + iyx + y² = x² + y² = ‖v(z)‖²,

and we denote the modulus of z by

    |z| := √(z z̄) = √(x² + y²).

Complex conjugation is useful when dividing complex numbers: we have, for z ≠ 0,

    1/z = z̄/(z z̄) = z̄/|z|² = x/(x² + y²) − i y/(x² + y²),

and so, e.g.,

    z1/z2 = z1 z̄2 / |z2|².

Examples:

    (2 + 3i)(4 − 2i) = 8 − 6i² + 12i − 4i = 14 + 8i,

    1/(2 + 3i) = (2 − 3i)/((2 + 3i)(2 − 3i)) = (2 − 3i)/(4 + 9) = 2/13 − (3/13)i,

    (4 − 2i)/(2 + 3i) = (4 − 2i)(2 − 3i)/((2 + 3i)(2 − 3i)) = (2 − 16i)/(4 + 9) = 2/13 − (16/13)i.
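Python's built-in complex type implements exactly the arithmetic above (1j denotes i), so the worked examples can be checked directly:

```python
# Verify the worked examples using Python's built-in complex numbers.
z1, z2 = 2 + 3j, 4 - 2j

print(z2 * z1)          # (14+8j)
print(1 / z1)           # 2/13 - (3/13)i ~ (0.1538-0.2308j)
print(z2 / z1)          # 2/13 - (16/13)i
print(z1.conjugate())   # (2-3j)
print(abs(z1) ** 2)     # |z1|^2 = 13 (up to rounding)
```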
It turns out that to discuss the geometric meaning of multiplication it is useful to switch to the polar representation. Recall the exponential function e^z, which is defined by the series

    e^z = 1 + z + z²/2! + z³/3! + z⁴/4! + ⋯ = Σ_{n=0}^∞ z^n/n!.    (1.5)

This definition can be extended to z ∈ C, since we can compute powers z^n of z and we can add complex numbers. (We ignore the issue of convergence here, but the sum is actually convergent for all z ∈ C.) We will use that for arbitrary complex z1, z2 the exponential function satisfies

    e^{z1} e^{z2} = e^{z1 + z2}    (1.6)

(the proof of this relation for real z can be directly extended to complex z). We then have:

Theorem 1.10 (Euler's formula). We have

    e^{iθ} = cos θ + i sin θ.    (1.7)

Proof. This is basically a calculus result; we will sketch the proof, but you might need more calculus to fully understand it. We recall that the sine and cosine functions can be defined by the following power series:

    sin(x) = x − x³/3! + x⁵/5! − ⋯ = Σ_{k=0}^∞ (−1)^k x^{2k+1}/(2k+1)!,
    cos(x) = 1 − x²/2! + x⁴/4! − ⋯ = Σ_{k=0}^∞ (−1)^k x^{2k}/(2k)!.

Now we use (1.5) with z = iθ, and since (iθ)² = −θ², (iθ)³ = −iθ³, (iθ)⁴ = θ⁴, (iθ)⁵ = iθ⁵, …, we find by comparing the power series

    e^{iθ} = 1 + iθ − θ²/2! − iθ³/3! + θ⁴/4! + iθ⁵/5! − ⋯
           = (1 − θ²/2! + θ⁴/4! − ⋯) + i(θ − θ³/3! + θ⁵/5! − ⋯)
           = cos θ + i sin θ.

Using Euler's formula we see that

    v(e^{iθ}) = u(θ),

see (1.1), so we can use the results from the previous section. We find in particular that we can write any complex number z ≠ 0 in the form

    z = ρ e^{iθ},

where ρ = |z| and θ is called the argument of z.
For the multiplication of complex numbers we then find that if z1 = ρ1 e^{iθ1} and z2 = ρ2 e^{iθ2}, then

    z1 z2 = ρ1 ρ2 e^{i(θ1 + θ2)},   z1/z2 = (ρ1/ρ2) e^{i(θ1 − θ2)},

so multiplication corresponds to adding the arguments and multiplying the moduli. In particular, if ρ = 1, then multiplying by e^{iθ} corresponds to a rotation by θ in the complex plane.

The result (1.7) has as well some nice applications to trigonometric functions.

(i) By (1.6) we have for n ∈ N that (e^{iθ})^n = e^{inθ}, and since e^{iθ} = cos θ + i sin θ and e^{inθ} = cos(nθ) + i sin(nθ), this gives us the following identity, known as de Moivre's theorem:

    (cos θ + i sin θ)^n = cos(nθ) + i sin(nθ).    (1.8)

If we choose for instance n = 2 and multiply out the left hand side, we obtain cos²θ + 2i sin θ cos θ − sin²θ = cos(2θ) + i sin(2θ), and separating real and imaginary parts leads to the two angle-doubling identities

    cos(2θ) = cos²θ − sin²θ,   sin(2θ) = 2 sin θ cos θ.

Similar identities can be derived for larger n.

(ii) If we use e^{iθ} e^{−iφ} = e^{i(θ − φ)} and apply (1.7) to both sides, we obtain (cos θ + i sin θ)(cos φ − i sin φ) = cos(θ − φ) + i sin(θ − φ), and multiplying out the left hand side gives the two relations

    cos(θ − φ) = cos θ cos φ + sin θ sin φ,   sin(θ − φ) = sin θ cos φ − cos θ sin φ.    (1.9)

(iii) The relation (1.7) can as well be used to obtain the following standard representations of the sine and cosine functions:

    sin θ = (e^{iθ} − e^{−iθ})/(2i),   cos θ = (e^{iθ} + e^{−iθ})/2.    (1.10)
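Euler's formula (1.7) and de Moivre's theorem (1.8) can be spot-checked numerically with the standard-library `cmath` module; the test angle 0.7 below is arbitrary:

```python
import cmath
import math

theta = 0.7  # arbitrary test angle

# Euler's formula (1.7): e^{i theta} = cos(theta) + i sin(theta)
lhs = cmath.exp(1j * theta)
rhs = complex(math.cos(theta), math.sin(theta))
print(abs(lhs - rhs))  # ~0

# de Moivre's theorem (1.8) with n = 5: (e^{i theta})^5 = e^{5 i theta}
n = 5
print(abs(lhs ** n - cmath.exp(1j * n * theta)))  # ~0
```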
Chapter 2

Euclidean space R^n

We introduced R^2 as the set of ordered pairs (x1, x2) of real numbers; we now generalise this concept by allowing longer lists of numbers. For instance, instead of ordered pairs we could take ordered triples (x1, x2, x3) of numbers x1, x2, x3 ∈ R, and if we take 4, 5 or more numbers we arrive at the general concept of R^n.

Definition 2.1. Let n ∈ N be a positive integer. The set R^n consists of all ordered n-tuples x = (x1, x2, x3, …, xn), where x1, x2, …, xn are real numbers. I.e.,

    R^n = {(x1, x2, …, xn) : x1, x2, …, xn ∈ R}.

Figure 2.1: A vector v = (x, y, z) in R^3.

Examples:

(i) For n = 1 we just get the set of real numbers R.

(ii) For n = 2 this is the case we studied before, R^2.

(iii) For n = 3 this is R^3, and the elements of R^3 provide for instance coordinates in 3-space. To a vector x = (x, y, z) we associate a point in 3-space by choosing x to be the distance to the origin in the x-direction, y the distance to the origin in the y-direction, and z the distance to the origin in the z-direction.

(iv) Let f(x) be a function defined on the interval [0, 1]; then we can consider a discretisation of f. I.e., we consider a grid of points xi = i/n, i = 1, 2, …, n, and evaluate f at these points:

    (f(1/n), f(2/n), …, f(1)) ∈ R^n.

These values of f form a vector in R^n which gives us an approximation of f. The larger n becomes, the better the approximation will usually be.

We will mostly write elements of R^n in the form x = (x1, x2, x3, …, xn), but in some areas, e.g., physics, one often sees the components arranged vertically as a column vector, and we might occasionally use this notation, too.

The elements of R^n are just lists of n real numbers, and in applications these are often lists of data relevant to the problem at hand. As we have seen in the examples, these could be coordinates giving the position of a particle, but they could as well have a completely different meaning, like a string of economic data, e.g., the outputs of n different economic sectors, or some biological data like the numbers of n different species in an eco-system.

Another way in which the sets R^n often show up is by taking direct products.

Definition 2.2. Let A, B be non-empty sets. Then the set A × B, called the direct product, is the set of ordered pairs (a, b) where a ∈ A and b ∈ B, i.e.,

    A × B := {(a, b) : a ∈ A, b ∈ B}.    (2.1)

If A = B we sometimes write A × A = A².

Examples:

(i) If A = {1, 2} and B = {1, 2, 3}, then the set A × B has the elements (1,1), (1,2), (1,3), (2,1), (2,2), (2,3).

(ii) If A = {1, 2}, then A² has the elements (1,1), (1,2), (2,1), (2,2).

(iii) If A = R, then R × R is the set with elements (x, y) where x, y ∈ R, so it coincides with the set we already called R^2.

A further way in which sets of the form R^n for large n can arise in applications is the following example. Assume we have two particles in 3-space. The position of particle A is described by points in R^3, and the position of particle B is likewise described by points in R^3. If we want to describe both particles at once, then it is natural to combine the two vectors with three components into one with six components:

    R^3 × R^3 = R^6.    (2.2)

This example can be generalised: if we have N particles in R^3, then the positions of all these particles give rise to R^{3N}.
The construction of direct products can of course be extended to other sets; for instance, C^n is the set of n-tuples of complex numbers (z1, z2, …, zn).

Now we will extend the results from Chapter 1. The definitions of addition and multiplication by scalars carry over directly from R^2 to R^n.

Definition 2.3. Let x = (x1, x2, …, xn), y = (y1, y2, …, yn) ∈ R^n. We define the sum of x and y, x + y, to be the vector

    x + y := (x1 + y1, x2 + y2, …, xn + yn).

If λ ∈ R, we define the multiplication of x ∈ R^n by λ by

    λx := (λx1, λx2, …, λxn).

A simple consequence of the definition is that we have, for any x, y ∈ R^n and λ ∈ R,

    λ(x + y) = λx + λy.    (2.3)

We will usually write 0 ∈ R^n to denote the vector whose components are all 0. We have that −x := (−1)x satisfies x − x = 0, and 0x = 0, where the 0 on the left hand side is 0 ∈ R, whereas the 0 on the right hand side is 0 = (0, 0, …, 0) ∈ R^n.
2.1 Dot product

We can extend the definition of the dot product from R^2 to R^n:

Definition 2.4. Let x, y ∈ R^n. Then the dot product of x and y is defined by

    x·y := x1 y1 + x2 y2 + ⋯ + xn yn = Σ_{i=1}^n xi yi.

Theorem 2.5. The dot product satisfies, for all x, y, v, w ∈ R^n and λ ∈ R,

(i) x·y = y·x;
(ii) x·(v + w) = x·v + x·w and (x + y)·v = x·v + y·v;
(iii) (λx)·y = λ(x·y) and x·(λy) = λ(x·y).

Furthermore x·x ≥ 0, and x·x = 0 is equivalent to x = 0.

Proof. All these properties follow directly from the definition, so we leave most of them as an exercise; let us just prove (ii) and the last remark. To prove (ii) we use the definition:

    x·(v + w) = Σ_{i=1}^n xi(vi + wi) = Σ_{i=1}^n (xi vi + xi wi) = Σ_{i=1}^n xi vi + Σ_{i=1}^n xi wi = x·v + x·w,

and the second identity in (ii) is proved the same way. Concerning the last remark, we notice that

    v·v = Σ_{i=1}^n vi²

is a sum of squares, i.e., no term in the sum can be negative. Therefore, if the sum is 0, all terms in the sum must be 0, i.e., vi = 0 for all i, which means that v = 0.

Definition 2.6. The norm of a vector in R^n is defined as

    ‖x‖ := √(x·x) = (Σ_{i=1}^n xi²)^{1/2}.

As in R^2, we think of the norm as a measure of the size, or length, of a vector. We will see below that we can use the dot product to define the angle between vectors, but a special case we introduce already here, namely orthogonal vectors.

Definition 2.7. x, y ∈ R^n are called orthogonal if x·y = 0. We often write x ⊥ y to indicate that x·y = 0 holds.
Pythagoras' theorem:

Theorem 2.8. If x·y = 0, then

    ‖x + y‖² = ‖x‖² + ‖y‖².

This will be shown in the exercises.
A fundamental property of the dot product is the Cauchy–Schwarz inequality:

Theorem 2.9. For any x, y ∈ R^n,

    |x·y| ≤ ‖x‖‖y‖.

Proof. Notice that v·v ≥ 0 for any v ∈ R^n, so let us try to use this inequality by applying it to v = x − ty, where t is a real number which we will choose later. First we get

    0 ≤ (x − ty)·(x − ty) = x·x − 2t x·y + t² y·y,

and we see how the dot products and norms appearing in the Cauchy–Schwarz inequality enter. Now we have to make a clever choice for t (we may assume y ≠ 0, since for y = 0 the inequality is trivial); let us try

    t = (x·y)/(y·y),

which is actually the value of t for which the right hand side becomes minimal. With this choice we obtain

    0 ≤ ‖x‖² − (x·y)²/‖y‖²,

and so (x·y)² ≤ ‖x‖²‖y‖², which after taking the square root gives the desired result.

This proof is maybe not very intuitive; we will later give another proof which is a bit more geometrical.
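Both the inequality and the minimising choice of t in the proof can be spot-checked numerically; a sketch in Python on random vectors in R^5 (helper names `dot` and `norm` are ours):

```python
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

# Spot-check |x.y| <= ||x|| ||y|| on random vectors, and that
# t = x.y / y.y minimises ||x - t*y|| among nearby values of t.
random.seed(0)
for _ in range(100):
    x = [random.uniform(-1, 1) for _ in range(5)]
    y = [random.uniform(-1, 1) for _ in range(5)]
    assert abs(dot(x, y)) <= norm(x) * norm(y) + 1e-12
    t = dot(x, y) / dot(y, y)
    best = norm([a - t * b for a, b in zip(x, y)])
    for dt in (-0.1, 0.1):  # perturbing t can only increase the distance
        assert best <= norm([a - (t + dt) * b for a, b in zip(x, y)]) + 1e-12
print("Cauchy-Schwarz verified on random samples")
```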
Theorem 2.10. The norm satisfies

(i) ‖x‖ ≥ 0, and ‖x‖ = 0 if and only if x = 0;
(ii) ‖λx‖ = |λ|‖x‖;
(iii) ‖x + y‖ ≤ ‖x‖ + ‖y‖.

Proof. (i) follows from the definition and the remark in Theorem 2.5; (ii) follows as well just by using the definition, see the corresponding proof in Theorem 1.5. To prove (iii) we consider

    ‖x + y‖² = (x + y)·(x + y) = x·x + 2x·y + y·y = ‖x‖² + 2x·y + ‖y‖²,

and now applying the Cauchy–Schwarz inequality in the form x·y ≤ ‖x‖‖y‖ to the right hand side gives

    ‖x + y‖² ≤ ‖x‖² + 2‖x‖‖y‖ + ‖y‖² = (‖x‖ + ‖y‖)²,

and taking the square root gives the triangle inequality (iii).

2.2 Angle between vectors in R^n

We found in R^2 that for x, y ∈ R^2 with ‖x‖, ‖y‖ ≠ 0 the angle between the vectors satisfies

    cos φ = x·y / (‖x‖‖y‖).

For R^n we take this as the definition of the angle between two vectors.

Definition 2.11. Let x, y ∈ R^n with x ≠ 0 and y ≠ 0. The angle φ between the two vectors is defined by

    cos φ = x·y / (‖x‖‖y‖).

Notice that this definition makes sense because the Cauchy–Schwarz inequality holds; namely, Cauchy–Schwarz gives us

    −1 ≤ x·y / (‖x‖‖y‖) ≤ 1,

and therefore there exists a φ ∈ [0, π] such that cos φ = x·y / (‖x‖‖y‖).
2.3 Linear subspaces

A leitmotiv of linear algebra is to study the two operations of addition of vectors and multiplication of vectors by numbers. In this section we want to study the following two closely related questions:

(i) Which type of subsets of R^n can be generated by using these two operations?
(ii) Which type of subsets of R^n stay invariant under these two operations?

The second question immediately leads to the following definition:

Definition 2.12. A subset V ⊆ R^n is called a linear subspace of R^n if

(i) V ≠ ∅, i.e., V is non-empty;
(ii) for all v, w ∈ V we have v + w ∈ V, i.e., V is closed under addition;
(iii) for all λ ∈ R, v ∈ V we have λv ∈ V, i.e., V is closed under multiplication by numbers.

Examples:

• There are two trivial examples: V = {0}, the set containing only 0, is a subspace, and V = R^n itself satisfies the conditions for a linear subspace as well.

• Let v ∈ R^n be a non-zero vector and let us take the set of all multiples of v, i.e.,

    V := {λv : λ ∈ R}.

This is a subspace since (i) V ≠ ∅; (ii) if x, y ∈ V then there are λ1, λ2 ∈ R such that x = λ1 v and y = λ2 v (this follows from the definition of V), and hence x + y = λ1 v + λ2 v = (λ1 + λ2)v ∈ V; and (iii) if x ∈ V, i.e., x = λ1 v, then μx = μλ1 v ∈ V.

In geometric terms V is a straight line through the origin; e.g., if n = 2 and v = (1, 1), then V is just the diagonal in R^2.

Figure 2.2: The subspace V ⊆ R^2 (a line) generated by a vector v ∈ R^2.
The second example we looked at is related to the first question we asked initially: here we fixed one vector and took all its multiples, and that gave us a straight line. Generalising this idea to two and more vectors, and taking sums into account as well, leads us to the following definition:

Definition 2.13. Let x1, x2, …, xk ∈ R^n be k vectors. The span of this set of vectors is defined as

    span{x1, x2, …, xk} := {λ1 x1 + λ2 x2 + ⋯ + λk xk : λ1, λ2, …, λk ∈ R}.

We will call an expression like

    λ1 x1 + λ2 x2 + ⋯ + λk xk    (2.4)

a linear combination of the vectors x1, …, xk with coefficients λ1, …, λk. So the span of a set of vectors is the set generated by taking all linear combinations of the vectors from the set. We have seen one example already above, but if we take for instance two vectors x1, x2 ∈ R^3, and if they point in different directions, then their span is a plane through the origin in R^3. The geometric picture associated with a span is that it is a generalisation of lines and planes through the origin in R^2 and R^3 to R^n.

Figure 2.3: The subspace V ⊆ R^3 generated by two vectors x and y; it contains the lines through x and y, and is spanned by these.

Theorem 2.14. Let x1, x2, …, xk ∈ R^n. Then span{x1, x2, …, xk} is a linear subspace of R^n.

Proof. The set is clearly non-empty. Now assume v, w ∈ span{x1, x2, …, xk}, i.e., there exist λ1, λ2, …, λk ∈ R and μ1, μ2, …, μk ∈ R such that

    v = λ1 x1 + λ2 x2 + ⋯ + λk xk and w = μ1 x1 + μ2 x2 + ⋯ + μk xk.

Therefore

    v + w = (λ1 + μ1)x1 + (λ2 + μ2)x2 + ⋯ + (λk + μk)xk ∈ span{x1, x2, …, xk},

and

    νv = νλ1 x1 + νλ2 x2 + ⋯ + νλk xk ∈ span{x1, x2, …, xk}

for all ν ∈ R. So span{x1, x2, …, xk} is closed under addition and multiplication by numbers, hence it is a subspace.
Examples:

(a) Consider the set of vectors of the form (x, 1) with x ∈ R, i.e., V = {(x, 1) : x ∈ R}. Is this a linear subspace? To answer this question we have to check the three properties in the definition. (i) Since for instance (1, 1) ∈ V, we have V ≠ ∅. (ii) Choose two elements in V, e.g., (1, 1) and (2, 1); then (1, 1) + (2, 1) = (3, 2) ∉ V, hence condition (ii) is not fulfilled and V is not a subspace.

(b) Now for comparison choose V = {(x, 0) : x ∈ R}. Then (i) V ≠ ∅; (ii) since (x, 0) + (y, 0) = (x + y, 0) ∈ V, V is closed under addition; (iii) since λ(x, 0) = (λx, 0) ∈ V, V is closed under scalar multiplication. Hence V satisfies all three conditions of the definition and is a linear subspace.

(c) Now consider the set V = span{x1, x2} with x1 = (1, 1, 1) and x2 = (2, 0, 1). The span is the set of all vectors of the form

    λ1 x1 + λ2 x2,

where λ1, λ2 ∈ R can take arbitrary values. For instance, if we set λ2 = 0 and let λ1 run through R we obtain the line through x1; similarly, by setting λ1 = 0 we obtain the line through x2. The set V is now the plane containing these two lines, see Figure 2.3. To check if a vector is in this plane, i.e., in V, we have to see if it can be written as a linear combination of x1 and x2.

(i) Let us check if (1, 0, 0) ∈ V. We have to find λ1, λ2 such that

    (1, 0, 0) = λ1 x1 + λ2 x2 = (λ1 + 2λ2, λ1, λ1 + λ2).

This gives us three equations, one for each component:

    1 = λ1 + 2λ2,   0 = λ1,   0 = λ1 + λ2.

From the second equation we get λ1 = 0, then the third equation gives λ2 = 0, but the first equation then becomes 1 = 0. Hence there is a contradiction, and (1, 0, 0) ∉ V.

(ii) On the other hand, (0, 2, 1) ∈ V, since (0, 2, 1) = 2x1 − x2.

Another way to create a subspace is by giving conditions on the vectors contained in it. For instance, let us choose a vector a ∈ R^n and look at the set of vectors x ∈ R^n which are orthogonal to a, i.e., which satisfy

    a·x = 0,    (2.5)

i.e., Wa := {x ∈ R^n : x·a = 0}.
Figure 2.4: The plane orthogonal to a non-zero vector a is a subspace Wa.
Theorem 2.15. Wa := {x ∈ R^n : x·a = 0} is a subspace of R^n.

Proof. Clearly 0 ∈ Wa, so Wa ≠ ∅. If x·a = 0, then (λx)·a = λ x·a = 0, hence λx ∈ Wa; and if x·a = 0 and y·a = 0, then (x + y)·a = x·a + y·a = 0, and so x + y ∈ Wa.

For instance, if n = 2, then Wa is the line perpendicular to a (if a ≠ 0, otherwise Wa = R^2), and if n = 3, then Wa is a plane perpendicular to a (if a ≠ 0, otherwise Wa = R^3). There can be different vectors a which determine the same subspace; in particular, notice that since for λ ≠ 0 we have x·a = 0 if and only if x·(λa) = 0, we get Wλa = Wa for λ ≠ 0. In terms of the subspace V := span{a} this means

    Wa = Wb for all b ∈ V \ {0},

and so Wa is actually perpendicular to the whole subspace V spanned by a. This motivates the following definition:

Definition 2.16. Let V be a subspace of R^n. Then the orthogonal complement V⊥ is defined as

    V⊥ := {x ∈ R^n : x·y = 0 for all y ∈ V}.

So the orthogonal complement consists of all vectors x ∈ R^n which are perpendicular to all vectors in V. For instance, if V is a plane in R^3, then V⊥ is the line perpendicular to it.

Theorem 2.17. Let V be a subspace of R^n. Then V⊥ is a subspace of R^n as well.

Proof. Clearly 0 ∈ V⊥, so V⊥ ≠ ∅. If x ∈ V⊥, then for any v ∈ V we have x·v = 0 and therefore (λx)·v = λ x·v = 0, so λx ∈ V⊥; thus V⊥ is closed under multiplication by numbers. Finally, if x, y ∈ V⊥, then x·v = 0 and y·v = 0 for all v ∈ V, and hence (x + y)·v = x·v + y·v = 0 for all v ∈ V, therefore x + y ∈ V⊥.
If $V=\operatorname{span}\{a_1,\dots,a_k\}$ is spanned by $k$ vectors, then $x\in V^\perp$ means that the $k$ conditions
\[ a_1\cdot x=0,\quad a_2\cdot x=0,\quad \dots,\quad a_k\cdot x=0 \]
hold simultaneously.
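These $k$ conditions give a direct computational test for membership in $V^\perp$. A small sketch in Python (function names ours):

```python
def dot(x, y):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def in_orthogonal_complement(x, spanning_vectors, tol=1e-12):
    """x lies in V-perp iff a_i . x = 0 for every spanning vector a_i of V."""
    return all(abs(dot(x, a)) < tol for a in spanning_vectors)

# V = span{(1,1,1), (2,0,1)}, as in example (c) earlier in this section
V = [(1, 1, 1), (2, 0, 1)]
print(in_orthogonal_complement((1, 1, -2), V))  # True: orthogonal to both
print(in_orthogonal_complement((1, 0, 0), V))   # False: (1,0,0).(1,1,1) = 1
```

Here $(1,1,-2)$ satisfies both conditions, so it spans the line $V^\perp$ perpendicular to the plane $V$.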
Subspaces can be used to generate new subspaces:

Theorem 2.18. Assume $V,W$ are subspaces of $\mathbb{R}^n$; then
(i) $V\cap W$ is a subspace of $\mathbb{R}^n$,
(ii) $V+W:=\{v+w\,;\,v\in V,\ w\in W\}$ is a subspace of $\mathbb{R}^n$.

The proof of this result will be left as an exercise, as will the following generalisation:

Theorem 2.19. Let $W_1,W_2,\dots,W_m$ be subspaces of $\mathbb{R}^n$; then
\[ W_1\cap W_2\cap\dots\cap W_m \]
is a subspace of $\mathbb{R}^n$, too.

We will sometimes use the notion of a direct sum:

Definition 2.20. A subspace $W\subset\mathbb{R}^n$ is said to be the direct sum of two subspaces $V_1,V_2\subset\mathbb{R}^n$, written $W=V_1\oplus V_2$, if
(i) $W=V_1+V_2$,
(ii) $V_1\cap V_2=\{0\}$.

As an example, consider $V_1=\operatorname{span}\{e_1\}$, $V_2=\operatorname{span}\{e_2\}$ with $e_1=(1,0)$, $e_2=(0,1)\in\mathbb{R}^2$; then
\[ \mathbb{R}^2=V_1\oplus V_2. \]

If a subspace $W$ is the sum of two subspaces $V_1,V_2$, every element of $W$ can be written as a sum of two elements of $V_1$ and $V_2$, and if $W$ is a direct sum this decomposition is unique:

Theorem 2.21. Let $W=V_1\oplus V_2$; then for any $w\in W$ there exist unique $v_1\in V_1$, $v_2\in V_2$ such that $w=v_1+v_2$.

Proof. It is clear that there exist $v_1,v_2$ with $w=v_1+v_2$; what we have to show is uniqueness. So let us assume there is another pair $v_1'\in V_1$ and $v_2'\in V_2$ such that $w=v_1'+v_2'$; then we can subtract the two different expressions for $w$ and obtain
\[ 0=(v_1+v_2)-(v_1'+v_2')=v_1-v_1'-(v_2'-v_2), \]
and therefore $v_1-v_1'=v_2'-v_2$. But in this last equation the left-hand side is a vector in $V_1$ and the right-hand side is a vector in $V_2$, and since they have to be equal, they lie in $V_1\cap V_2=\{0\}$, so $v_1'=v_1$ and $v_2'=v_2$. □
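The unique decomposition of Theorem 2.21 can be computed explicitly in a small example. Here we take $V_1=\operatorname{span}\{(1,1)\}$ and $V_2=\operatorname{span}\{(1,-1)\}$ in $\mathbb{R}^2$ (our own illustrative choice, not from the notes); solving $\alpha(1,1)+\beta(1,-1)=w$ gives $\alpha=(w_1+w_2)/2$ and $\beta=(w_1-w_2)/2$, which are uniquely determined:

```python
def decompose(w):
    """Split w in R^2 along V1 = span{(1,1)} and V2 = span{(1,-1)}.

    Solving a*(1,1) + b*(1,-1) = w gives a = (w1+w2)/2, b = (w1-w2)/2;
    these coefficients are unique, as Theorem 2.21 asserts.
    """
    a = (w[0] + w[1]) / 2
    b = (w[0] - w[1]) / 2
    v1 = (a, a)    # component in V1
    v2 = (b, -b)   # component in V2
    return v1, v2

v1, v2 = decompose((3, 1))
print(v1, v2)  # (2.0, 2.0) (1.0, -1.0), and indeed v1 + v2 = (3, 1)
```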
Chapter 3
Linear equations and Matrices
The simplest linear equation is an equation of the form
\[ 2x=7, \]
where $x$ is an unknown number which we want to determine. For this example we find the solution $x=7/2$. Linear means that no powers or more complicated expressions of $x$ occur; for instance, the equations
\[ 3x^5-2x=3,\qquad \cos(x)+1=e^{\tan(x^2)} \]
are not linear. But more interesting than the case of one unknown are equations where we have more than one unknown. Let us look at a couple of simple examples:

(i)
\[ 3x-4y=3, \]
where $x$ and $y$ are two unknown numbers. In this case the equation is satisfied for all $x,y$ such that
\[ y=\tfrac{3}{4}x-\tfrac{3}{4}, \]
so instead of determining a single solution the equation defines a set of $x,y$ which satisfy the equation. This set is a line in $\mathbb{R}^2$.

(ii) If we add another equation, i.e., consider the solutions to two equations, e.g.,
\[ 3x-4y=3,\qquad 3x+y=1, \]
then we find a single solution: subtracting the second equation from the first gives $-5y=2$, hence $y=-2/5$, and then from the first equation $x=1+\tfrac{4}{3}y=7/15$. Another way to look at the two equations is that they define two lines in $\mathbb{R}^2$, and the joint solution is the intersection of these two straight lines.
(iii) But if we look instead at the slightly modified system of two equations
\[ 3x-4y=3,\qquad -6x+8y=0, \]
then we find that these two equations have no solutions. To see this we multiply the first equation by $-2$, and then the set of two equations becomes
\[ -6x+8y=-6,\qquad -6x+8y=0, \]
so the two equations contradict each other and the system has no solutions. Geometrically speaking this means that the straight lines defined by the two equations have no intersection, i.e., are parallel.

Figure 3.1: Left: A system of two linear equations in two unknowns $(x,y)$ which determines two lines; their intersection gives the solution. Right: A system of two linear equations in two unknowns $(x,y)$ where the corresponding lines have no intersection, hence the system has no solution.

We found examples of linear equations which have exactly one solution, many solutions, and no solutions at all. We will see in the following that these examples cover all the cases which can occur in general.

So far we have talked about linear equations but haven't really defined them. A linear equation in $n$ variables $x_1,x_2,\dots,x_n$ is an equation of the form
\[ a_1x_1+a_2x_2+\dots+a_nx_n=b, \]
where $a_1,a_2,\dots,a_n$ and $b$ are given numbers. The important fact is that no higher powers of the variables $x_1,x_2,\dots,x_n$ appear. A system of $m$ linear equations is then just a collection of $m$ such linear equations:

Definition 3.1. A system of $m$ linear equations in $n$ unknowns $x_1,x_2,\dots,x_n$ is a collection of $m$ linear equations of the form
\[ a_{11}x_1+a_{12}x_2+\dots+a_{1n}x_n=b_1 \\
   a_{21}x_1+a_{22}x_2+\dots+a_{2n}x_n=b_2 \\
   \vdots \\
   a_{m1}x_1+a_{m2}x_2+\dots+a_{mn}x_n=b_m \]
where the coefficients $a_{ij}$ and $b_i$ are given numbers. When we ask for a solution $x_1,x_2,\dots,x_n$ to a system of linear equations, we ask for a set of numbers $x_1,x_2,\dots,x_n$ which satisfy all $m$ equations simultaneously.

One often looks at the set of coefficients $a_{ij}$ defining a system of linear equations as an independent entity in its own right.

Definition 3.2. Let $m,n\in\mathbb{N}$; an $m\times n$ matrix $A$ (an "$m$ by $n$" matrix) is a rectangular array of numbers $a_{ij}\in\mathbb{R}$, $i=1,2,\dots,m$ and $j=1,\dots,n$, of the form
\[ A=\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}. \tag{3.1} \]
The numbers $a_{ij}$ are called the elements of the matrix $A$, and we often write $A=(a_{ij})$ to denote the matrix $A$ with elements $a_{ij}$. The set of all $m\times n$ matrices with real elements will be denoted by
\[ M_{m,n}(\mathbb{R}), \]
and if $n=m$ we will write
\[ M_n(\mathbb{R}). \]
One can similarly define matrices with elements in other sets; e.g., $M_{m,n}(\mathbb{C})$ is the set of matrices with complex elements.

An example of a $3\times 2$ matrix is
\[ \begin{pmatrix} 1 & 3\\ -1 & 0\\ 2 & 2 \end{pmatrix}. \]
An $m\times n$ matrix has $m$ rows and $n$ columns. The $i$'th row or row vector of $A=(a_{ij})$ is given by
\[ (a_{i1},a_{i2},\dots,a_{in})\in\mathbb{R}^n \]
and is a vector with $n$ components, and the $j$'th column vector of $A$ is given by
\[ \begin{pmatrix} a_{1j}\\ a_{2j}\\ \vdots\\ a_{mj} \end{pmatrix}\in\mathbb{R}^m, \]
and it has $m$ components. For the example above the first and second column vectors are
\[ \begin{pmatrix} 1\\ -1\\ 2 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 3\\ 0\\ 2 \end{pmatrix}, \]
respectively, and the first, second and third row vectors are
\[ (1,3),\quad (-1,0),\quad (2,2). \]
In Definition 3.1 the rows of the matrix of coefficients are combined with the $n$ unknowns to produce $m$ numbers $b_i$; we will take these formulas and turn them into a definition for the action of $m\times n$ matrices on vectors with $n$ components:

Definition 3.3. Let $A=(a_{ij})$ be an $m\times n$ matrix and $x\in\mathbb{R}^n$ with components $x=(x_1,x_2,\dots,x_n)$; then the action of $A$ on $x$ is defined by
\[ Ax:=\begin{pmatrix} a_{11}x_1+a_{12}x_2+\dots+a_{1n}x_n\\ a_{21}x_1+a_{22}x_2+\dots+a_{2n}x_n\\ \vdots\\ a_{m1}x_1+a_{m2}x_2+\dots+a_{mn}x_n \end{pmatrix}\in\mathbb{R}^m. \tag{3.2} \]

$Ax$ is a vector in $\mathbb{R}^m$, and if we write $y=Ax$ then the components of $y$ are given by
\[ y_i=\sum_{j=1}^n a_{ij}x_j, \tag{3.3} \]
which is the dot product between $x$ and the $i$'th row vector of $A$. The action of $A$ on elements of $\mathbb{R}^n$ is a map from $\mathbb{R}^n$ to $\mathbb{R}^m$, i.e.,
\[ A:\mathbb{R}^n\to\mathbb{R}^m. \tag{3.4} \]
Another way of looking at the action of a matrix on a vector is as follows: let $a_1,a_2,\dots,a_n\in\mathbb{R}^m$ be the column vectors of $A$; then
\[ Ax=x_1a_1+x_2a_2+\dots+x_na_n. \tag{3.5} \]
So $Ax$ is a linear combination of the column vectors of $A$ with coefficients given by the components of $x$. This relation follows directly from (3.3).
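The row picture (3.3) and the column picture (3.5) compute the same vector $Ax$, which is easy to confirm in a few lines of pure Python (function names ours):

```python
def matvec_rows(A, x):
    """y_i = sum_j a_ij x_j: dot product of x with each row, as in (3.3)."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def matvec_columns(A, x):
    """Ax as the combination x_1 a_1 + ... + x_n a_n of columns, as in (3.5)."""
    m, n = len(A), len(A[0])
    y = [0] * m
    for j in range(n):          # walk through the columns
        for i in range(m):
            y[i] += x[j] * A[i][j]
    return y

A = [[1, 3], [-1, 0], [2, 2]]   # the 3x2 example matrix from the text
x = [2, -1]
print(matvec_rows(A, x), matvec_columns(A, x))  # both give [-1, -2, 2]
```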
This map has the following important properties:

Theorem 3.4. Let $A$ be an $m\times n$ matrix; then the map defined in Definition 3.3 satisfies the two properties
(i) $A(x+y)=Ax+Ay$ for all $x,y\in\mathbb{R}^n$,
(ii) $A(\lambda x)=\lambda Ax$ for all $x\in\mathbb{R}^n$ and $\lambda\in\mathbb{R}$.

Proof. This is most easily shown using (3.3). Let us denote the components of the vector $A(x+y)$ by $z_i$, $i=1,2,\dots,m$, i.e., $z=A(x+y)$ with $z=(z_1,z_2,\dots,z_m)$; then by (3.3)
\[ z_i=\sum_{j=1}^n a_{ij}(x_j+y_j)=\sum_{j=1}^n a_{ij}x_j+\sum_{j=1}^n a_{ij}y_j, \]
and on the right-hand side we have the sum of the $i$'th components of $Ax$ and $Ay$, again by (3.3). The second assertion $A(\lambda x)=\lambda Ax$ follows again directly from (3.3) and is left as a simple exercise. □

Corollary 3.5. Assume $x=\lambda_1x_1+\lambda_2x_2+\dots+\lambda_kx_k\in\mathbb{R}^n$ is a linear combination of $k$ vectors, and $A\in M_{m,n}(\mathbb{R})$; then
\[ Ax=\lambda_1Ax_1+\lambda_2Ax_2+\dots+\lambda_kAx_k. \tag{3.6} \]
Proof. We use (i) and (ii) from Theorem 3.4:
\[ Ax=A(\lambda_1x_1+\lambda_2x_2+\dots+\lambda_kx_k)=A(\lambda_1x_1)+A(\lambda_2x_2+\dots+\lambda_kx_k)=\lambda_1Ax_1+A(\lambda_2x_2+\dots+\lambda_kx_k), \tag{3.7} \]
and we repeat this step $k-1$ times. □

Using the notation of matrices and their action on vectors we have introduced, a system of linear equations of the form in Definition 3.1 can now be rewritten as
\[ Ax=b. \tag{3.8} \]
So using matrices allows us to write a system of linear equations in a much more compact way. Before exploiting this we will pause and study matrices in some more detail.
3.1 Matrices
The most important property of matrices is that one can multiply them, under suitable conditions on the number of rows and columns. The product of matrices appears naturally if we consider a vector $y=Ax$ and apply another matrix to it, i.e., $By=B(Ax)$; the question is then whether there exists a matrix $C$ such that
\[ Cx=B(Ax); \tag{3.9} \]
then we would call $C=BA$ the matrix product of $B$ and $A$. If we use the representation (3.5) and Corollary 3.5 we obtain
\[ B(Ax)=B(x_1a_1+x_2a_2+\dots+x_na_n)=x_1Ba_1+x_2Ba_2+\dots+x_nBa_n. \tag{3.10} \]
Hence if $C$ is the matrix with columns $Ba_1,\dots,Ba_n$, then, again by (3.5), we have $Cx=B(Ax)$.

We formulate this now a bit more precisely:

Theorem 3.6. Let $A=(a_{ij})\in M_{m,n}(\mathbb{R})$ and $B=(b_{ij})\in M_{l,m}(\mathbb{R})$; then there exists a matrix $C=(c_{ij})\in M_{l,n}(\mathbb{R})$ such that for all $x\in\mathbb{R}^n$ we have
\[ Cx=B(Ax), \tag{3.11} \]
and the elements of $C$ are given by
\[ c_{ij}=\sum_{k=1}^m b_{ik}a_{kj}. \tag{3.12} \]
Note that $c_{ij}$ is the dot product between the $i$'th row vector of $B$ and the $j$'th column vector of $A$. We call $C=BA$ the product of $B$ and $A$.
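Formula (3.12) translates directly into code, and the defining property $Cx=B(Ax)$ of Theorem 3.6 can then be checked on any vector. A minimal sketch (function names ours):

```python
def matmul(B, A):
    """C = BA with c_ij = sum_k b_ik a_kj, as in (3.12); B is l x m, A is m x n."""
    l, m, n = len(B), len(A), len(A[0])
    return [[sum(B[i][k] * A[k][j] for k in range(m)) for j in range(n)]
            for i in range(l)]

def matvec(A, x):
    """y_i = sum_j a_ij x_j."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in A]

# check the defining property Cx = B(Ax) on an arbitrary x
A = [[1, 2, 0], [0, 1, -1]]   # 2x3
B = [[2, 1], [1, 3]]          # 2x2, so BA is 2x3
x = [1, -2, 3]
print(matvec(matmul(B, A), x) == matvec(B, matvec(A, x)))  # True
```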
The theorem follows from (3.10), but to provide a different perspective we give another proof:

Proof. We write $y=Ax$ and note that $y=(y_1,y_2,\dots,y_m)$ with
\[ y_k=\sum_{j=1}^n a_{kj}x_j, \tag{3.13} \]
and similarly we write $z=By$ and note that $z=(z_1,z_2,\dots,z_l)$ with
\[ z_i=\sum_{k=1}^m b_{ik}y_k. \tag{3.14} \]
Now inserting the expression (3.13) for $y_k$ into (3.14) gives
\[ z_i=\sum_{k=1}^m b_{ik}\sum_{j=1}^n a_{kj}x_j=\sum_{j=1}^n\sum_{k=1}^m b_{ik}a_{kj}x_j=\sum_{j=1}^n c_{ij}x_j, \tag{3.15} \]
where we have exchanged the order of summation. □

Note that in order to form the product $BA$ of two matrices $A$ and $B$, the number of rows of $A$ must be the same as the number of columns of $B$.

Theorem 3.7. Let $A,B$ be $m\times n$ matrices and $C$ an $l\times m$ matrix; then
\[ C(A+B)=CA+CB. \]
Let $A,B$ be $m\times l$ matrices and $C$ an $l\times n$ matrix; then
\[ (A+B)C=AC+BC. \]
Let $C$ be an $m\times n$ matrix, $B$ an $l\times m$ matrix and $A$ a $k\times l$ matrix; then
\[ A(BC)=(AB)C. \]
The proof of this theorem will be a simple consequence of general properties of linear maps which we will discuss in Chapter 5. Now let us look at a few examples of matrices and products of them.

We say that a matrix is a square matrix if $m=n$. If $A=(a_{ij})$ is an $n\times n$ square matrix, then we call the elements $a_{ii}$ the diagonal elements of $A$ and the $a_{ij}$ for $i\neq j$ the off-diagonal elements of $A$. A square matrix $A$ is called a diagonal matrix if all off-diagonal elements are $0$; e.g., the following is a $3\times 3$ diagonal matrix:
\[ \begin{pmatrix} -2 & 0 & 0\\ 0 & 3 & 0\\ 0 & 0 & 1 \end{pmatrix}, \]
with diagonal elements $a_{11}=-2$, $a_{22}=3$ and $a_{33}=1$. A special role is played by the so-called unit matrix $I$; this is a matrix with elements
\[ \delta_{ij}:=\begin{cases} 1 & i=j,\\ 0 & i\neq j, \end{cases} \tag{3.16} \]
i.e., a diagonal matrix with all diagonal elements equal to $1$:
\[ I=\begin{pmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1 \end{pmatrix}. \]
The symbol $\delta_{ij}$ is often called the Kronecker delta. If we want to specify the size of the unit matrix we write $I_n$ for the $n\times n$ unit matrix. The unit matrix is the identity for matrix multiplication, i.e., we have for any $m\times n$ matrix $A$
\[ AI_n=I_mA=A. \]
Let us now look at a couple of examples of products of matrices. Let's start with $2\times 2$ matrices; a standard product gives, e.g.,
\[ \begin{pmatrix} 1 & 2\\ -1 & 0 \end{pmatrix}\begin{pmatrix} -5 & 1\\ 3 & -1 \end{pmatrix}=\begin{pmatrix} 1\cdot(-5)+2\cdot 3 & 1\cdot 1+2\cdot(-1)\\ -1\cdot(-5)+0\cdot 3 & -1\cdot 1+0\cdot(-1) \end{pmatrix}=\begin{pmatrix} 1 & -1\\ 5 & -1 \end{pmatrix}, \tag{3.17} \]
where we have explicitly written out the intermediate step in which each element of the product matrix is the dot product of a row vector of the first matrix and a column vector of the second matrix. For comparison, let us compute the product the other way round:
\[ \begin{pmatrix} -5 & 1\\ 3 & -1 \end{pmatrix}\begin{pmatrix} 1 & 2\\ -1 & 0 \end{pmatrix}=\begin{pmatrix} -6 & -10\\ 4 & 6 \end{pmatrix}, \tag{3.18} \]
and we see that the result is different. So contrary to the multiplication of numbers, the product of matrices depends on the order in which we take the product. I.e., in general we have
\[ AB\neq BA. \]
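The two products (3.17) and (3.18) can be reproduced mechanically; a short sketch (the `matmul` helper is ours):

```python
def matmul(X, Y):
    """XY with entries sum_k x_ik y_kj, as in (3.12)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 2], [-1, 0]]
B = [[-5, 1], [3, -1]]
print(matmul(A, B))  # [[1, -1], [5, -1]]  : AB, as in (3.17)
print(matmul(B, A))  # [[-6, -10], [4, 6]] : BA, as in (3.18)
```

The two results differ in every entry, so these two matrices witness $AB\neq BA$.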
A few other interesting matrix products are
(a) $\begin{pmatrix} 1 & 0\\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & 0\\ 0 & 1 \end{pmatrix}=\begin{pmatrix} 0 & 0\\ 0 & 0 \end{pmatrix}=0$: the product of two non-zero matrices can be $0$.

(b) $\begin{pmatrix} 0 & 1\\ 0 & 0 \end{pmatrix}^2=\begin{pmatrix} 0 & 1\\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1\\ 0 & 0 \end{pmatrix}=\begin{pmatrix} 0 & 0\\ 0 & 0 \end{pmatrix}=0$: the square of a non-zero matrix can be $0$.

(c) Let $J=\begin{pmatrix} 0 & 1\\ -1 & 0 \end{pmatrix}$; then $J^2=-I$, i.e., the square of $J$ is $-I$, very similar to $\mathrm{i}=\sqrt{-1}$.

These examples show that matrix multiplication behaves very differently from the multiplication of numbers which we are used to.

It is instructive to look at products of matrices which are not square as well. Recall that by definition we can only form the product $AB$ of $A$ and $B$ if the number of rows of $B$ is equal to the number of columns of $A$. Consider for instance the following matrices:
\[ A=\begin{pmatrix} 1 & 1 & 2 \end{pmatrix},\quad B=\begin{pmatrix} 2\\ 0\\ 1 \end{pmatrix},\quad C=\begin{pmatrix} 1 & 3 & 0\\ -2 & 1 & 3 \end{pmatrix},\quad D=\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}. \]
Then $A$ is a $1\times 3$ matrix, $B$ a $3\times 1$ matrix, $C$ a $2\times 3$ matrix and $D$ a $2\times 2$ matrix. So we can form the following products:
\[ AB=(4),\quad BA=\begin{pmatrix} 2 & 2 & 4\\ 0 & 0 & 0\\ 1 & 1 & 2 \end{pmatrix},\quad CB=\begin{pmatrix} 2\\ -1 \end{pmatrix},\quad DC=\begin{pmatrix} -2 & 1 & 3\\ 1 & 3 & 0 \end{pmatrix}, \]
and, apart from $D^2=I$, no others.

There are a few types of matrices which occur quite often and therefore have special names. We will give a list of some we will encounter:

- triangular matrices: these come in two types,
  - upper triangular: $A=(a_{ij})$ with $a_{ij}=0$ if $i>j$, e.g., $\begin{pmatrix} 1 & 3 & 1\\ 0 & 2 & 1\\ 0 & 0 & 3 \end{pmatrix}$;
  - lower triangular: $A=(a_{ij})$ with $a_{ij}=0$ if $i<j$, e.g., $\begin{pmatrix} 1 & 0 & 0\\ 5 & 2 & 0\\ 2 & 7 & 3 \end{pmatrix}$;
- symmetric matrices: $A=(a_{ij})$ with $a_{ij}=a_{ji}$, e.g., $\begin{pmatrix} 1 & 2 & 3\\ 2 & 1 & 0\\ 3 & 0 & 1 \end{pmatrix}$;
- anti-symmetric matrices: $A=(a_{ij})$ with $a_{ij}=-a_{ji}$, e.g., $\begin{pmatrix} 0 & 1 & 2\\ -1 & 0 & 3\\ -2 & -3 & 0 \end{pmatrix}$.

The following operation on matrices occurs quite often in applications.

Definition 3.8. Let $A=(a_{ij})\in M_{m,n}(\mathbb{R})$; then the transpose of $A$, $A^t$, is the matrix in $M_{n,m}(\mathbb{R})$ with elements $(A^t)_{ij}=a_{ji}$ (the indices $i$ and $j$ are switched). I.e., $A^t$ is obtained from $A$ by exchanging the rows with the columns.

For the matrices $A,B,C,D$ we considered above we obtain
\[ A^t=\begin{pmatrix} 1\\ 1\\ 2 \end{pmatrix},\quad B^t=\begin{pmatrix} 2 & 0 & 1 \end{pmatrix},\quad C^t=\begin{pmatrix} 1 & -2\\ 3 & 1\\ 0 & 3 \end{pmatrix},\quad D^t=\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}, \]
and, for instance, a matrix is symmetric if $A^t=A$ and anti-symmetric if $A^t=-A$. Any square matrix $A\in M_{n,n}(\mathbb{R})$ can be decomposed into a sum of a symmetric and an anti-symmetric matrix by
\[ A=\tfrac{1}{2}(A+A^t)+\tfrac{1}{2}(A-A^t). \]
One of the reasons why the transpose is important is the following relation with the dot product.

Theorem 3.9. Let $A\in M_{m,n}(\mathbb{R})$; then we have for any $x\in\mathbb{R}^n$ and $y\in\mathbb{R}^m$
\[ y\cdot(Ax)=(A^ty)\cdot x. \]
Proof. The $i$'th component of $Ax$ is $\sum_{j=1}^n a_{ij}x_j$, and so $y\cdot(Ax)=\sum_{i=1}^m\sum_{j=1}^n y_ia_{ij}x_j$. On the other hand, the $j$'th component of $A^ty$ is $\sum_{i=1}^m a_{ij}y_i$, and so $(A^ty)\cdot x=\sum_{j=1}^n\sum_{i=1}^m x_ja_{ij}y_i$. Since the order of summation does not matter in a double sum, the two expressions agree. □

One important property of the transpose which can be derived from this relation is

Theorem 3.10.
\[ (AB)^t=B^tA^t. \]

Proof. Using Theorem 3.9 for $AB$ gives $((AB)^ty)\cdot x=y\cdot(ABx)$, and now we apply Theorem 3.9 first to $A$ and then to $B$, which gives $y\cdot(ABx)=(A^ty)\cdot(Bx)=(B^tA^ty)\cdot x$, and so we have
\[ ((AB)^ty)\cdot x=(B^tA^ty)\cdot x. \]
Since this is true for any $x,y$ we have $(AB)^t=B^tA^t$. □

3.2 The structure of the set of solutions to a system of linear equations

In this section we will study the general structure of the set of solutions to a system of linear equations, in case it has solutions. In the next section we will then look at methods to actually solve a system of linear equations.
Definition 3.11. Let $A\in M_{m,n}(\mathbb{R})$ and $b\in\mathbb{R}^m$; then we set
\[ S(A,b):=\{x\in\mathbb{R}^n\,;\,Ax=b\}. \]
This is a subset of $\mathbb{R}^n$ and consists of all the solutions to the system of linear equations $Ax=b$. If there are no solutions then $S(A,b)=\emptyset$.

One often distinguishes between two types of systems of linear equations.

Definition 3.12. The system of linear equations $Ax=b$ is called homogeneous if $b=0$, i.e., if it is of the form
\[ Ax=0. \]
If $b\neq 0$ the system is called inhomogeneous.

The structure of the set of solutions of a homogeneous equation leads us back to the theory of subspaces.

Theorem 3.13. Let $A\in M_{m,n}(\mathbb{R})$; then $S(A,0)\subset\mathbb{R}^n$ is a linear subspace.

Before proving the theorem let us just spell out what this means in detail for a homogeneous set of linear equations $Ax=0$:
(i) there is always at least one solution, namely $x=0$;
(ii) the sum of any two solutions is again a solution;
(iii) any multiple of a solution is again a solution.
Proof. We will actually give two proofs, just to emphasise the relation of this result to the theory of subspaces we developed previously. The equation $Ax=0$ means that the following $m$ equations hold:
\[ a_1\cdot x=0,\quad a_2\cdot x=0,\quad \dots,\quad a_m\cdot x=0, \]
where $a_1,a_2,\dots,a_m$ are the $m$ row vectors of $A$. But the $x\in\mathbb{R}^n$ which satisfy $a_1\cdot x=0$ form a subspace $W_{a_1}$ by Theorem 2.15, and similarly the other $m-1$ equations $a_i\cdot x=0$ define subspaces $W_{a_2},\dots,W_{a_m}$. Now a solution to $Ax=0$ lies in all these spaces and, vice versa, if $x$ lies in all these spaces it is a solution to $Ax=0$; hence
\[ S(A,0)=W_{a_1}\cap W_{a_2}\cap\dots\cap W_{a_m}, \]
and this is a subspace by Theorem 2.19, which was proved in the exercises.

The second proof uses Theorem 3.4 to check directly the conditions a subspace has to fulfil. We find: (i) $S(A,0)$ is non-empty, since $A0=0$, hence $0\in S(A,0)$; (ii) if $x,y\in S(A,0)$, then $A(x+y)=Ax+Ay=0+0=0$, hence $x+y\in S(A,0)$; and finally (iii) if $x\in S(A,0)$ then $A(\lambda x)=\lambda Ax=\lambda 0=0$, and therefore $\lambda x\in S(A,0)$. □

The second proof we gave is more direct, but the first proof has a geometrical interpretation which generalises our discussion of examples at the beginning of this chapter. In $\mathbb{R}^2$ the spaces $W_{a_i}$ are straight lines, and the solution to a system of equations was given by the intersection of these lines. In $\mathbb{R}^3$ the spaces $W_{a_i}$ are planes; intersecting two of them will typically give a line, and intersecting three will usually give a point. The generalisations to higher dimensions are called hyperplanes, and the solution to a system of linear equations can be described in terms of intersections of these hyperplanes.

If the system is inhomogeneous, then it doesn't necessarily have a solution. But for those which have a solution we can determine the structure of the set of solutions. The key observation is that if we have one solution, say $x_0\in\mathbb{R}^n$ which satisfies $Ax_0=b$, then we can create further solutions by adding solutions of the corresponding homogeneous system $Ax=0$, since if $Ax=0$ then
\[ A(x_0+x)=Ax_0+Ax=b+0=b, \]
and so $x_0+x$ is another solution to the inhomogeneous system.

Theorem 3.14. Let $A\in M_{m,n}(\mathbb{R})$ and $b\in\mathbb{R}^m$, and assume there exists an $x_0\in\mathbb{R}^n$ with $Ax_0=b$; then
\[ S(A,b)=\{x_0\}+S(A,0):=\{x_0+x\,;\,x\in S(A,0)\}. \tag{3.19} \]

Proof. As we noticed above, if $x\in S(A,0)$, then $A(x_0+x)=b$, hence $\{x_0\}+S(A,0)\subset S(A,b)$. On the other hand, if $y\in S(A,b)$ then $A(y-x_0)=Ay-Ax_0=b-b=0$, and so $y-x_0\in S(A,0)$. Therefore $S(A,b)\subset\{x_0\}+S(A,0)$, and so $S(A,b)=\{x_0\}+S(A,0)$. □

Remarks:
(a) The structure of the set of solutions is often described as follows: the general solution of the inhomogeneous system $Ax=b$ is given by a special solution $x_0$ to the inhomogeneous system plus a general solution to the corresponding homogeneous system $Ax=0$.
(b) The case that there is a unique solution to $Ax=b$ corresponds to $S(A,0)=\{0\}$; then $S(A,b)=\{x_0\}$.

At first sight the definition of the set $\{x_0\}+S(A,0)$ seems to depend on the choice of the particular solution $x_0$ to $Ax_0=b$. But this is not so; another choice $y_0$ just corresponds to a different labelling of the elements of the set. Let us look at an example of three equations with three unknowns:
\[ 3x+z=0,\qquad y-z=1,\qquad 3x+y=1. \]
This set of equations corresponds to
\[ A=\begin{pmatrix} 3 & 0 & 1\\ 0 & 1 & -1\\ 3 & 1 & 0 \end{pmatrix},\qquad b=\begin{pmatrix} 0\\ 1\\ 1 \end{pmatrix}. \]
To solve this set of equations we try to simplify it: if we subtract the first equation from the third, the third equation becomes $y-z=1$, which is identical to the second equation; hence the initial system of 3 equations is equivalent to the following system of 2 equations:
\[ 3x+z=0,\qquad y-z=1. \]
In the first one we can solve for $x$ as a function of $z$ and in the second for $y$ as a function of $z$, hence
\[ x=-\tfrac{1}{3}z,\qquad y=1+z. \tag{3.20} \]
So $z$ is arbitrary, but once $z$ is chosen, $x$ and $y$ are fixed, and the set of solutions is given by
\[ S(A,b)=\{(-z/3,\,1+z,\,z)\,;\,z\in\mathbb{R}\}. \]
A similar computation gives for the corresponding homogeneous system of equations
\[ 3x+z=0,\qquad y-z=0,\qquad 3x+y=0 \]
the solutions $x=-z/3$, $y=z$, with $z\in\mathbb{R}$ arbitrary; hence
\[ S(A,0)=\{(-z/3,\,z,\,z)\,;\,z\in\mathbb{R}\}. \]
A solution to the inhomogeneous system is given by choosing $z=0$ in (3.20), i.e., $x_0=(0,1,0)$, and then the relation
\[ S(A,b)=\{x_0\}+S(A,0) \]
can be seen directly, since for $x=(-z/3,\,z,\,z)\in S(A,0)$ we have $x_0+x=(0,1,0)+(-z/3,\,z,\,z)=(-z/3,\,1+z,\,z)$, which was the general form of an element in $S(A,b)$. But what
happens if we choose another element of $S(A,b)$? Let $\lambda\in\mathbb{R}$; then $x_\lambda:=(-\lambda/3,\,1+\lambda,\,\lambda)$ is in $S(A,b)$ and we again have
\[ S(A,b)=\{x_\lambda\}+S(A,0), \]
since $x_\lambda+x=(-\lambda/3,\,1+\lambda,\,\lambda)+(-z/3,\,z,\,z)=(-(\lambda+z)/3,\,1+(\lambda+z),\,\lambda+z)$, and if $z$ runs through $\mathbb{R}$ we again obtain the whole set $S(A,b)$, independent of which $\lambda$ we chose initially. The choice of $\lambda$ only determines the way in which we label the elements of $S(A,b)$.

Finally we should notice that the set $S(A,0)$ is spanned by one vector; namely, we have $(-z/3,\,z,\,z)=z(-1/3,\,1,\,1)$, and hence with $v=(-1/3,\,1,\,1)$ we have $S(A,0)=\operatorname{span}\{v\}$ and
\[ S(A,b)=\{x_\lambda\}+\operatorname{span}\{v\}. \]
In the next section we will develop systematic methods to solve large systems of linear equations.
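For the example just worked out, the decomposition of the solution set into a particular solution plus a multiple of $v$ can be checked numerically; a small sketch in pure Python (helper name ours):

```python
def matvec(A, x):
    # y_i = sum_j a_ij x_j
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[3, 0, 1], [0, 1, -1], [3, 1, 0]]   # the system 3x+z=0, y-z=1, 3x+y=1
b = [0, 1, 1]
x0 = [0, 1, 0]       # particular solution (z = 0 in (3.20))
v = [-1/3, 1, 1]     # spans S(A, 0)

# every x0 + z*v solves Ax = b, for any choice of z
all_solve = all(
    all(abs(lhs - rhs) < 1e-9
        for lhs, rhs in zip(matvec(A, [x0[k] + z * v[k] for k in range(3)]), b))
    for z in [-2.0, 0.0, 0.5, 3.0]
)
print(all_solve)  # True
```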
3.3 Solving systems of linear equations
To solve a system of linear equations we will introduce a systematic way to simplify it until we can read off directly whether it is solvable and compute the solutions easily. Again it will be useful to write the system of equations in matrix form.
3.3.1 Elementary row operations
Let us return for a moment to the original way of writing a set of $m$ linear equations in $n$ unknowns:
\[ a_{11}x_1+a_{12}x_2+\dots+a_{1n}x_n=b_1 \\
   a_{21}x_1+a_{22}x_2+\dots+a_{2n}x_n=b_2 \\
   \vdots \\
   a_{m1}x_1+a_{m2}x_2+\dots+a_{mn}x_n=b_m \]
We can perform the following operations on the set of equations without changing the solutions:
(i) multiply an equation by a non-zero constant,
(ii) add a multiple of any equation to one of the other equations,
(iii) exchange two of the equations.
It is clear that operations (i) and (iii) don't change the set of solutions; to see that operation (ii) doesn't change the set of solutions we can argue as follows. If $x\in\mathbb{R}^n$ is a solution to the system of equations and we change the system by adding $\lambda$ times equation $i$ to equation $j$, then $x$ is clearly a solution of the new system, too. But if $x'\in\mathbb{R}^n$ is a solution to the new system, we can return to the old system by subtracting $\lambda$ times equation $i$ from equation $j$, and the previous argument gives that $x'$ must be a solution to the old system, too. Hence both systems have the same set of solutions.
The way to solve a system of equations is to use the above operations to simplify it systematically until we can essentially read off the solutions. It is useful to formulate this using the matrix representation $Ax=b$ of a system of linear equations.

Definition 3.15. Let $Ax=b$ be a system of linear equations; the augmented matrix associated with this system is
\[ (A\,|\,b). \]
It is obtained by adding $b$ as the final column to $A$; hence it is an $m\times(n+1)$ matrix if the system has $n$ unknowns and $m$ equations.

Now we translate the above operations on systems of equations into operations on the augmented matrix.

Definition 3.16. An elementary row operation (ERO) is one of the following operations on matrices:
(i) multiply a row by a non-zero number (row $i\to\lambda\,$row $i$),
(ii) add a multiple of one row to another row (row $i\to$ row $i+\lambda\,$row $j$),
(iii) exchange two rows (row $i\leftrightarrow$ row $j$).

Theorem 3.17. Let $A\in M_{m,n}(\mathbb{R})$, $b\in\mathbb{R}^m$, and assume $(A'\,|\,b')$ is obtained from $(A\,|\,b)$ by a sequence of EROs; then the corresponding systems of linear equations have the same solutions, i.e.,
\[ S(A,b)=S(A',b'). \]

Proof. If we apply these operations to the augmented matrix of a system of linear equations, then they clearly correspond to the three operations (i), (ii), and (iii) we introduced above; hence the system corresponding to the new matrix has the same set of solutions. □

We want to use these operations to systematically simplify the augmented matrix. Let us
look at an example to get an idea of which type of simplification we can achieve.

Example: Consider the following system of equations:
\[ x+y+2z=9 \\
   2x+4y-3z=1 \\
   3x+6y-5z=0 \]
This is of the form $Ax=b$ with
\[ A=\begin{pmatrix} 1 & 1 & 2\\ 2 & 4 & -3\\ 3 & 6 & -5 \end{pmatrix},\qquad b=\begin{pmatrix} 9\\ 1\\ 0 \end{pmatrix}; \]
hence the corresponding augmented matrix is given by
\[ \begin{pmatrix} 1 & 1 & 2 & 9\\ 2 & 4 & -3 & 1\\ 3 & 6 & -5 & 0 \end{pmatrix}. \]
Applying elementary row operations gives
\[ \begin{pmatrix} 1 & 1 & 2 & 9\\ 2 & 4 & -3 & 1\\ 3 & 6 & -5 & 0 \end{pmatrix}
\to \begin{pmatrix} 1 & 1 & 2 & 9\\ 0 & 2 & -7 & -17\\ 0 & 3 & -11 & -27 \end{pmatrix}
\quad\begin{matrix}\text{row}_2-2\,\text{row}_1\\ \text{row}_3-3\,\text{row}_1\end{matrix} \]
\[ \to \begin{pmatrix} 1 & 1 & 2 & 9\\ 0 & 2 & -7 & -17\\ 0 & 1 & -4 & -10 \end{pmatrix}
\quad \text{row}_3-\text{row}_2 \]
\[ \to \begin{pmatrix} 1 & 1 & 2 & 9\\ 0 & 1 & -4 & -10\\ 0 & 2 & -7 & -17 \end{pmatrix}
\quad \text{row}_3\leftrightarrow\text{row}_2 \]
\[ \to \begin{pmatrix} 1 & 1 & 2 & 9\\ 0 & 1 & -4 & -10\\ 0 & 0 & 1 & 3 \end{pmatrix}
\quad \text{row}_3-2\,\text{row}_2 \]
where we have written next to each matrix which elementary row operations we applied in order to arrive at it from the previous one. The system of equations corresponding to the last matrix is
\[ x+y+2z=9,\qquad y-4z=-10,\qquad z=3, \]
so we have $z=3$ from the last equation; substituting this in the second equation gives $y=-10+4z=-10+12=2$, and substituting this in the first equation gives $x=9-y-2z=9-2-6=1$. So we see that if the augmented matrix is in the above triangular-like form we can solve the system of equations easily by what is called back-substitution. But we can also continue applying elementary row operations and find
\[ \begin{pmatrix} 1 & 1 & 2 & 9\\ 0 & 1 & -4 & -10\\ 0 & 0 & 1 & 3 \end{pmatrix}
\to \begin{pmatrix} 1 & 0 & 6 & 19\\ 0 & 1 & -4 & -10\\ 0 & 0 & 1 & 3 \end{pmatrix}
\quad \text{row}_1-\text{row}_2 \]
\[ \to \begin{pmatrix} 1 & 0 & 0 & 1\\ 0 & 1 & 0 & 2\\ 0 & 0 & 1 & 3 \end{pmatrix}
\quad\begin{matrix}\text{row}_1-6\,\text{row}_3\\ \text{row}_2+4\,\text{row}_3\end{matrix} \]
Now the corresponding system of equations is of even simpler form,
\[ x=1,\qquad y=2,\qquad z=3, \]
and gives the solution directly.

The different forms into which we brought the matrix by elementary row operations are of a special type:

Definition 3.18. A matrix $M$ is in row echelon form if
(i) in each non-zero row the leftmost non-zero number is $1$ (the leading $1$ in that row),
(ii) if row $i$ is above row $j$, then the leading $1$ of row $i$ is to the left of the leading $1$ of row $j$.
A matrix is in reduced row echelon form if, in addition to (i) and (ii), it satisfies
(iii) in each column which contains a leading $1$, all other numbers are $0$.

The following matrices are in row echelon form:
\[ \begin{pmatrix} 1 & 4 & 3 & 2\\ 0 & 1 & 6 & 2\\ 0 & 0 & 1 & 5 \end{pmatrix},\quad
   \begin{pmatrix} 1 & 1 & 0\\ 0 & 1 & 0\\ 0 & 0 & 0 \end{pmatrix},\quad
   \begin{pmatrix} 0 & 1 & 2 & 6 & 0\\ 0 & 0 & 1 & 1 & 0\\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \]
and these ones are in reduced row echelon form:
\[ \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix},\quad
   \begin{pmatrix} 1 & 0 & 0 & 4\\ 0 & 1 & 0 & 7\\ 0 & 0 & 1 & 1 \end{pmatrix},\quad
   \begin{pmatrix} 0 & 1 & 2 & 0 & 1\\ 0 & 0 & 0 & 1 & 3\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}. \]
In each example the leading $1$ is the first non-zero entry of its row; we will see that the distribution of the leading $1$'s determines the nature of the solutions of the corresponding system of equations.

The reason for introducing these definitions is that elementary row operations can be used to bring any matrix to these forms:

Theorem 3.19. Any matrix $M$ can be brought by a finite number of elementary row operations to
- row echelon form; this procedure is called Gaussian elimination,
- reduced row echelon form; this procedure is called Gauss-Jordan elimination.

Proof. Let $M=(m_1,m_2,\dots,m_n)$, where the $m_i\in\mathbb{R}^m$ are the column vectors of $M$. Take the leftmost column vector which is non-zero, say $m_j$; exchange rows until the first entry in that column is non-zero, and divide the first row by that number. Now the matrix is $M'=(m_1',m_2',\dots,m_n')$, and its leftmost non-zero column vector is of the form
\[ m_j'=\begin{pmatrix} 1\\ a_{2j}\\ \vdots\\ a_{mj} \end{pmatrix}. \]
Now we can subtract multiples of the first row from the other rows until all numbers in the $j$'th column below the top $1$ are $0$; more precisely, we subtract from row $i$ the $a_{ij}$-fold of the first row. We have now transformed the matrix to the form
\[ \begin{pmatrix} 0 & \cdots & 0 & 1 & * & \cdots & *\\ 0 & \cdots & 0 & 0 & & \tilde M & \end{pmatrix}, \]
and now we apply the same procedure to the submatrix $\tilde M$. Eventually we arrive at row echelon form. To arrive at reduced row echelon form we start from row echelon form and use the leading $1$'s to clear out all non-zero elements in the columns containing a leading $1$. □

The example above is an illustration of how the reduction to row echelon and reduced row echelon form works. Let us now turn to the question of what the row echelon form tells us about the structure of the set of solutions to a system of linear equations. The key information lies in the distribution of the leading $1$'s.

Theorem 3.20. Let $Ax=b$ be a system of equations in $n$ unknowns, and let $M$ be the row echelon form of the associated augmented matrix. Then
(i) the system has no solutions if and only if the last column of $M$ contains a leading $1$;
(ii) the system has a unique solution if every column except the last one contains a leading $1$;
(iii) the system has infinitely many solutions if the last column of $M$ does not contain a leading $1$ and there are fewer than $n$ leading $1$'s. Then there are $n-k$ unknowns which can be chosen arbitrarily, where $k$ is the number of leading $1$'s of $M$.

Proof. Let us first observe that the leading $1$'s of the reduced row echelon form of a system are the same as the leading $1$'s of the row echelon form. Therefore we can assume the system is in reduced row echelon form; that makes the arguments slightly simpler. Let us start with the last non-zero row, that is, the row with the rightmost leading $1$, and consider the corresponding equation. If the leading $1$ is in the last column, then this equation is of the form
\[ 0x_1+0x_2+\dots+0x_n=1, \]
and so we have the contradiction $0=1$; therefore there is no $x\in\mathbb{R}^n$ solving the set of equations. This is case (i) of the theorem.

If the last column does not contain a leading $1$, but all other columns contain leading $1$'s, then the reduced row echelon form is
\[ \begin{pmatrix} 1 & 0 & \cdots & 0 & b_1'\\ 0 & 1 & \cdots & 0 & b_2'\\ \vdots & & \ddots & \vdots & \vdots\\ 0 & 0 & \cdots & 1 & b_n' \end{pmatrix}, \]
and if $m>n$ there are $m-n$ rows containing only $0$'s. The corresponding system of equations is then
\[ x_1=b_1',\quad x_2=b_2',\quad \dots,\quad x_n=b_n', \]
and so there is a unique solution. This is case (ii) of the theorem.

Finally, let us consider the case that there are $k$ leading $1$'s with $k<n$ and none of them is in the last column. Let us index the columns with leading $1$'s by $j_1,j_2,\dots,j_k$; then the system of equations corresponding to the reduced row echelon form is of the form
\[ x_{j_1}+\sum_{i\ \text{not leading}} a_{j_1i}x_i=b_{j_1}' \\
   x_{j_2}+\sum_{i\ \text{not leading}} a_{j_2i}x_i=b_{j_2}' \\
   \vdots \\
   x_{j_k}+\sum_{i\ \text{not leading}} a_{j_ki}x_i=b_{j_k}' \]
where the sums contain only those $x_i$ whose index does not label a column with a leading $1$. These are $n-k$ unknowns $x_i$ whose value can be chosen arbitrarily, and once their value is fixed, the remaining $k$ unknowns are determined uniquely. This proves part (iii) of the theorem. □

Let us note one simple consequence of this general theorem, which we will use in a couple of proofs later on; it gives a rigorous basis to the intuitive idea that if you have $n$ unknowns, you need at least $n$ linear equations to determine the unknowns uniquely.
Corollary 3.21. Let $A\in M_{m,n}(\mathbb{R})$ and assume that
\[ S(A,0)=\{0\}, \]
i.e., the only solution to $Ax=0$ is $x=0$; then $m\geq n$.

Proof. This follows from part (ii) of the previous theorem: if every column has a leading $1$, then there are at least as many rows as columns, i.e., $m\geq n$. □

3.4 Elementary matrices and inverting a matrix
We now want to discuss inverses of matrices in some more detail.

Definition 3.22. A matrix $A\in M_{n,n}(\mathbb{R})$ is called invertible, or non-singular, if there exists a matrix $A^{-1}\in M_{n,n}(\mathbb{R})$ such that
\[ A^{-1}A=I. \]
If $A$ is not invertible then it is called singular.

We will first give some properties of inverses, namely that a left inverse is also a right inverse, and that the inverse is unique. These properties are direct consequences of the corresponding properties of linear maps, see Theorem 5.23, so we will give a proof in Chapter 5.

Theorem 3.23. Let $A\in M_n(\mathbb{R})$ be invertible with inverse $A^{-1}$; then
(i) $AA^{-1}=I$,
(ii) if $BA=I$ for some matrix $B\in M_n(\mathbb{R})$ th