Remaining TODOs: 30
Relevant reading:
For chapters 1-5: Jim Hefferon. Linear Algebra.
For chapter 6: Lloyd N. Trefethen & David Bau. Numerical Linear Algebra.
1 Linear systems
1.i Solving Linear Systems
In a system of equations, the th row is denoted with the Greek letter rho as .
Definition 1.1.3: Elementary row operations are operations that can be carried out on a system of linear equations without changing the solution set.
These operations, all affecting row , are:
- swapping two rows, , where ;
- scaling a row, where ;
- adding one row to another row, , where and .
Lemma 1.1.4: Elementary row operations preserve the solution set of a linear system.
Proof:
Consider some system of linear equations
For the tuple to satisfy this system of equations, we must have that and that , and so on; swapping the order of two of the linear equations in the system does not change the validity of this solution, as logical and is commutative. Hence operation (i) is an elementary row operation.
If we multiply a linear equation by some non-zero scalar , that equation still holds true, and so the solution set to the system is still valid. Hence operation (ii) is an elementary operation.
Any solution set that satisfies the system of linear equations must satisfy each individual equation. If we add two equations together, the solutions will still hold, so operation (iii) is an elementary row operation.
Example: If we have a system of equations
then and are free variables. We can parametrise the solution set using and as parameters to say that
but it is more clear to write and as new variables:
The Gaussian elimination of a system of linear equations follows the same operations as the Gaussian elimination of its associated system of homogeneous linear equations.
It is helpful to study homogeneous systems because these always have a particular solution, the zero vector.
In general, we can say that the general solution to a system of linear equations is the sum of a particular solution and the solution to the associated system of homogeneous equations.
Lemma 1.1.9: The zero vector is always a particular solution of a homogeneous system of linear equations.
Proof:
Take substituting for in each equation in a homogeneous system gives which is trivially true and so is a particular solution of any homogeneous system of linear equations.
Lemma 1.1.10: For any homogeneous linear system of equations, there exist vectors such that the solution set of the system is
where is the number of free variables in an echelon form of the system.
This can be rephrased as: every leading variable in a homogeneous system can be expressed as a linear combination of free variables.
Proof:
Consider some homogeneous system that has been transformed into echelon form, ignoring any trivially true equations (). If the system consists entirely of tautologies, the lemma is trivially true as there are no leading variables which need be expressed.
We will denote the index of the leading variable in a row as .
Consider the bottom equation, row ,
where . Here, are free variables as there are no rows below this in which they could be leading. The leading variable can be expressed in terms of the free variables:
Suppose that every leading variable for for arbitrary can be expressed as a linear combination of free variables. Then the leading variable of the th equation can be expressed as a linear combination of free variables, since every other variable in the th equation is either a free variable, or is a leading variable in a lower equation and so can be expressed as a linear combination of free variables itself.
Hence if the th leading variable can be expressed as a linear combination of free variables, so can the th equation. Hence by induction, every leading variable can be expressed as a linear combination of free variables.
Lemma 1.1.12: For a linear system and a particular solution , the system is satisfied by the solution set
Proof:
Suppose that satisfies the system. Then satisfies the associated homogeneous system since for all
where is the constant of the th equation and and are the th components of and respectively.
Suppose that we have a vector that satisfies the associated homogeneous system, then satisfies the system because, for all ,
Theorem 1.1.13: The solution set of any linear system has the form
where is any particular solution and is the number of free variables in the echelon form of the system.
Proof:
This follows from Lemma 1.1.10 and Lemma 1.1.12.
1.ii Linear Geometry
Definition 1.2.2: The line (in ), plane (in ), or -dimensional linear surface, -flat or -dimensional hyperplane (in is a set of the form
where and . More specifically, it is comprised of the endpoints of these vectors when their tips are at the origin.
Definition 1.2.3: The length of a vector is defined as
Definition 1.2.5: The dot product, scalar product or inner product of two vectors and is defined as
Theorem 1.2.6 (Cauchy-Schwartz inequality): For any ,
with equality only if there exists such that .
Proof:
which is true because is always positive, so is also always positive.
Equality occurs iff .
If , then .
If , then
Hence equality occurs iff .
Theorem 1.2.7 (Triangle inequality): For any , with equality iff .
Proof:
Both sides of the inequality are positive, so squaring both sides does not change the validity of the inequality.
which is true by the Cauchy-Schwartz inequality (Theorem 1.2.6).
Equality occurs iff , by the Cauchy-Schwartz inequality again.
Definition 1.2.8: The angle between two vectors is defined as
Remark: That this is always defined since, following from the Cauchy-Schwartz inequality,
Remark: The formula in Definition 1.2.8 follows from the Law of Cosines.
Proof:
Consider two vectors , and call the angle between them . Using the Law of Cosines on the triangle with sides and gives that
The left side gives
while the right side gives
Combining the two sides and cancelling like terms gives
which simplifies and rearranges to give Definition 1.2.8.
Remark: That the Law of Cosines is just an extension of Pythagoras’ theorem:
1.iii Reduced echelon form
This is useful because we can just read the solution off.
Example: Consider the following system matrix in reduced echelon form:
The solution set of this system is simply
To get to reduced echelon form, we use Gauss-Jordan reduction, whereby we first carry out Gaussian elimination, then scale each row such that the leading variables have coefficients of 1, then eliminate any non-leading and non-free variables in each row by working upwards.
Lemma 1.3.2: Elementary row operations are reversible.
Proof:
has the inverse of . has the inverse of for . has the inverse , where .
Lemma 1.3.4: Between matrices, ‘reduces to’ is an equivalence relation.
Proof:
To be an equivalence relation, ‘reduces to’ must satisfy reflexivity, symmetry and transitivity.
Reflexivity is trivial since we can reduce through carryig out no operations.
Symmetry follows from Lemma 1.3.2.
The relation is transitive because if we can reduce and , we can carry out the operations in order to go from to reduce .
Lemma 1.3.5 (Linear combination lemma): A linear combination of linear combinations is a linear combination.
Proof:
Given the set of linear combinations , consider the linear combination of these
which is a linear combination.
Lemma 1.3.7: In an echelon form matrix, no nonzero row is a linear combination of the other nonzero rows.
Proof:
Consider any row of the matrix. The leading entry in that row is one, but the entry in that column in every other row is equal to zero (by the definition of reduced echelon form), so there is no linear combination of other rows that can make this row.
Proof:
We proceed by induction on the number of columns, , for a fixed arbitrary number of rows .
Let .
If is the zero matrix, then it is in reduced echelon form and is not row-equivalent to any other matrix (this follows from Corollary 1.3.6). Otherwise, when reduced to reduced-echelon form, the only nonzero entry must be a 1 in the first row. Regardless, the reduced echelon form matrix is unique.
Now assume that, for some , all matrices for have a unique reduced echelon form matrix, and consider .
Suppose that reduces to two distinct reduced echelon form matrices, and . Let be the matrix consisting of the first columns of . Any sequence of operations that reduces to reduced echelon form must also reduce to reduced echelon form, and by the inductive hypothesis, reduces to a unique reduced echelon form matrix. Therefore, and must differ in the th column.
We write the linear system with matrix of coefficients as
the systems with matrices of coefficients and are written similarly. These systems must all have the same solution sets, since carrying out elementary row operations on linear systems does not change the solution set. With and differing only in the th column, suppose they differ in row . We subtract row of the systems represented by and to get that . Since , .
Therefore, is not a free variable, so the th columns of and must contain the leading entry of some row, since any column that does not have a leading entry is associated with a free variable.
Because the first columns of and are equal, a leading variable in the th column must be in the same position in and , and must be equal to 1; therefore, and are equal, so reduces to a unique reduced echelon form matrix.
2 Vector spaces
2.i Definition of vector space
Definition 2.1.1: A vector space (over ) consists of a set along with two operations ‘’ and ‘’ subject to the condition that, for all :
- is closed under addition i.e.
- addition is commutative i.e.
- addition is associative
- there is an additive identity, i.e.
- every element of has an additive inverse, i.e.
- is closed under scalar multiplication i.e.
- scalar multiplication distributes over addition of scalars, i.e.
- scalar multiplication distributes over vector addition, i.e.
- ordinary scalar multiplication associates with scalar multiplication, i.e.
Lemma 2.1.3: In any vector space , for any and , we have
Proof:
For (i), note that . Add to both sides the additive inverse such that .
(ii): .
(iii)
Lemma 2.1.6: For a nonempty subset of a vector space, under the inherited operations the following are equivalent statements
- is a subspace of that vector space
- is closed under linear combinations of pairs of vectors
- is closed under linear combinations of any number of vectors
Lemma 2.1.9: In a vector space, the span of any subset is a subspace.
Proof:
If the subset is empty, its span is the trivial subspace by definition. Otherwise, for two elements of , , where and ,
which is a linear combination of elements of and so is a member of , hence is closed under linear combinations of pairs of elements; by Lemma 2.1.6, is a subspace.
2.ii Linear Independence
Lemma 2.2.3: A subset of a vector space is linearly independent iff among its elements the only linear relationship is the trivial one, .
Proof:
If is linearly independent, there is no vector that can be expressed as a linear combination of other elements of , so there is no linear relationship between elements of with non-zero coefficients, i.e. the only linear relationship is the trivial one.
If is linearly dependent, there exists a vector that can be represented as a linear combination of other elements of , so there is a linear relationship among the elements of with at least one non-zero coefficient, being that of . Hence by contraposition, if the only linear relationship among elements of is the trivial one, is linearly independent.
Lemma 2.2.4: Suppose is a vector space, is a subset of that space, and , then
Proof:
: Suppose , then since ; the implication follows from contraposition.
: Suppose , then for . Then for any ,
so . Therefore, . Note also that clearly , so by double inclusion .
Lemma 2.2.7: Suppose that is linearly independent and that , then is linearly independent iff .
Proof:
Suppose that , so for . Since , it is not equal to any of the s so this is a nontrivial linear dependence among elements of . Therefore, is not linearly independent.
Now suppose that is linearly dependent, so there exists a nontrivial combination . By assumption, is linear independent, so . Therefore, we can rearrange the equation to find that , so .
Therefore we have that is linearly dependent iff , so is linear independent iff .
Corollary 2.2.8: In a vector space, any finite set has a linearly independent subset with the same span.
Proof:
For a finite set , if is linearly independent then the statement is trivially true as . Note that if is a singleton set, it is trivially linearly independent.
Otherwise, there exists an which can be expressed as a linear combination of the other elements of ; by Corollary 2.2.5, has the same span as . , and since is finite we can keep removing dependent elements until we either reach a nontrivial independent set with the same span, or we reach a singleton set, which is trivially independent.
Corollary 2.2.9: A subset is linearly dependent iff some can be represented as a linear combination of other elements in listed before it.
Proof:
Let . Then if is linearly dependent, there exists a first such that is linearly dependent, i.e. .
2.iii Basis and dimension
Definition 2.3.2: For any ,
is the standard basis or natural basis. We denote these vectors . (Sometimes the vectors in are denoted .)
Theorem 2.3.3: In any vector space, a subset is a basis iff each vector in the space can be expressed as a linear combination of elements of the subset in exactly one way.
Proof:
A sequence is a basis iff its elements form a spanning subset of the space, and is linearly independent. A subset is spanning iff each vector in the space can be expressed as a linear combination of elements in the subset, so we need only show that a spanning subset is linearly independent iff every vector in the space has a unique expression as a linear combination of elements of the subset.
Consider two such expressions of a vector , and for a subset . These are equal by transitivity of equality, so we can subtract them to get that . The s equal the s iff for , which holds iff is linearly independent, since the only linear dependence amongst its elements is the trivial one.
Definition 2.3.4: In a vector space with bases the representation of with respect to is the column vector of the coefficients used to express as a linear combination of the bases vectors:
where and .
are the coordinates of with respect to .
Lemma 2.3.5: If is an -element basis of a vector space , then for any set of vectors ,
Proof:
Take a basis and suppose is the th column of a matrix for such that etc.
Then we can substitute these into to give
Because elements of bases are linearly independent, this only holds if . This gives us a linear system which we can re-express with column vectors, giving
Note that not only does the relationship hold, but the coefficients are the same.
Theorem 2.3.7: In any finite-dimensional vector space, all bases have the same number of elements.
Proof:
Consider some finite-dimensional vector space; by definition, it has a finite basis. Pick one finite basis of minimal size, and, for the sake of contradiction, take any other basis , with , noting that since is minimal.
Assume that the elements of are unique (otherwise it would not be a basis), then the only linear relationship amongst the s must be the trivial one. By Lemma 2.3.5, the only linear relationship amongst the representations of the s with respect to is the trivial one.
A linear relationship amongst the representations of the s with respect to can be represented by a homogeneous system of linear equations, with unknowns. Since there are more unknowns than equations, and the system is homogeneous, there are infinitely many solutions and so there must be more than just the trivial linear relationship among the s. Therefore, the s are not linearly independent, so is not a basis; this is a contradiction, so all bases of must be the same size.
Corollary 2.3.10: Any linearly independent set can be expanded to make a basis.
Proof:
Take a linearly independent set of a vector space .
If is not already a basis for , then it must not span the space. Therefore we can add a vector that is not in to , and retain linear independence by Lemma 2.2.7. Keep adding until the resulting set does span the space, which will occur in a finite number of steps by Corollary 2.3.9.
Corollary 2.3.11: Any spanning set can be shrunk to a basis.
Proof:
Take a spanning set . If then it is already a basis (and the enclosing space must be trivial). If then it can be shrunk to the empty basis without changing its span.
Otherwise, s.t. and we can form a basis . If then we are done. Otherwise, s.t. , so we can append to . By Lemma 2.2.7, is still linearly independent. We can then check if , and keep repeating this process until the spans are equal, which must happen in a finite number of steps.
Corollary 2.3.12: In an -dimensional space, a set composed of vectors is linearly independent iff it spans the space.
Proof:
A set composed of vectors is linearly independent iff it is a basis: the “if” direction is clear, and the other directions holds because any linearly independent set can be expanded to a basis, but it already has elements so must already be a basis.
A subset of vectors spans the space iff it is a basis: again the “if” direction is clear, and the other direction holds because any spanning set can be shrunk to a basis, but it already has elements so must already be a basis.
Lemma 2.3.14: If two matrices and are related by a row operation, , or for , then their row spaces are equal.
Proof:
By Corollary 1.3.6, each row of is a linear combination of the rows of ; that is, each row of is in the rowspace of . It follows by the Linear Combination Lemma that Rowspace() Rowspace(), since each element of the rowspace of is a linear combination of the rows of , so is a linear combination of a linear combination of the rows of , i.e. just a linear combination of the rows of . Because elementary row operations are reversible (by Lemma 1.3.2), the same argument can be applied to show that Rowspace() Rowspace(); by double-inclusion, the rowspaces are equal.
Lemma 2.3.16: The nonzero rows of an echelon form matrix make up a linearly independent set.
Proof:
This is a restatement of Lemma 1.3.7 using new terminology.
Lemma 2.3.19: Row operations do not change the column rank.
Proof:
By Lemma 1.1.4, row operations do not change the solution space of a system, so they must preserve linear relationships among columns. Therefore, they do not change the number of linearly independent columns, so they preserve column rank.
Theorem 2.3.20: For any matrix, the row rank and column rank are equal.
Proof:
Any matrix is row-equivalent to a reduced echelon form matrix, with the same row and column rank as by Lemmas 2.3.15 and 2.3.19. The row rank of is the number of nonzero rows in the reduced echelon form matrix, which is equal to the number of leading entries in the reduced echelon form matrix; the column rank of is also equal to the number of leading entries, since the columns containing leading entries form a subset of the standard basis, so all other columns can be expressed as a linear combination of the rows containing leading entries.
Theorem 2.3.22: For linear systems with unknowns and with matrix of coefficients , the rank of is iff the vector space of solutions of the associated homogeneous system has dimension .
Proof:
The rank of is iff the reduced echelon form matrix of has leading variables. There must be free variables in the reduced echelon form matrix, so the solution space of the associated homogeneous system has dimension .
Corollary 2.3.23: Where the matrix is , the following statements are equivalent:
- the rank of is
- is nonsingular
- the rows of form a linearly independent set
- the columns of form a linearly independent set
- any linear system whose matrix of coefficients is has one and only one solution.
Proof:
TODO
Definition 2.3.25: The concatenation of the sequences adjoins them into a single sequence.
Lemma 2.3.27: Let be a vector space that is the sum of some of its subspaces, . Let be bases for these subspaces. Then the following are equivalent:
- Any can be expressed as a combination of unique , ;
- The concatenation is a basis for ;
- The s are independent.
Corollary 2.3.29: The dimension of a direct sum is the sum of the dimensions of its summands.
Proof:
In Lemma 2.3.27, the number of vectors in the concatenated basis is the same as the sum of the number of vectors in each sub-basis.
Definition 2.3.30: When a vector space is the direct sum of two of its subspaces, those subspaces are complements.
Lemma 2.3.31: A vector space is the direct sum of two of its subspaces iff and i.e. their intersection is trivial.
Proof:
If , then by definition of direct sum, ; furthermore, since and are independent, the linear combination for can only have the trivial solution if we want a member of on the left side and a member of on the right, w.l.o.g.
Conversely, if then if we express as a linear combination of , then must also be in , as linear combinations of elements of a vector space are also in that vector space. But the intersection of and is trivial, so and satisfy the definition of independence; since they sum to , they form a direct sum of .
3 Maps between spaces
3.i Isomorphisms
Definition 3.1.1: An isomorphism between two vector spaces and is a map such that
- is a bijection/correspondence;
-
preserves structure: if then
and if and then
(We write , read “V is isomorphic to W”, when such a map exists.)
To verify that is an isomorphism, show that:
- is injective/one-to-one, i.e. ;
- is surjective/onto, i.e. ;
- preserves addition, i.e. ;
- preserves scalar multiplication, i.e. .
Lemma 3.1.3: An isomorphism maps the zero vector to the zero vector.
Proof: Where is an isomorphism, fix some . Then
Lemma 3.1.4: For any map between vector spaces, the following statements are equivalent:
- preserves structure
- preserves linear combinations of two vectors
- preserves linear combinations of any finite number of vectors
Proof:
TODO (in Hefferon)
Lemma 3.1.5: The inverse of an isomorphism is also an isomorphism.
Proof:
Suppose that by . Because isomorphisms are bijections, has an inverse .
Suppose that . Because it is an isomorphism, is surjective and . Then
Therefore preserves structure and hence satisfies the necessary conditions to be an isomorphism.
Theorem 3.1.6: Isomorphism is an equivalence relation between vector spaces.
Proof:
Isomorphism is reflexive, since the identity function is an isomorphism, since it is clearly a bijection and preserves structure:
Symmetry holds by Lemma 3.1.5.
For transitivity, suppose that and , and let and be isomorphisms. Then is bijective, and structure preserving:
Lemma 3.1.7: If spaces are isomorphic then they have the same dimension.
Proof:
TODO (in Hefferon)
Lemma 3.1.8: If spaces have the same dimension then they are isomorphic.
Proof:
TODO (in Hefferon)
Theorem 3.1.9: Vector spaces are isomorphic iff they have the same dimension.
Proof:
From Lemma 3.1.7 and Lemma 3.1.8.
Corollary 3.1.10: Every finite-dimensional vector space is isomorphic to exactly one of the .
Thus the real spaces form a set of canonical representatives of the isomorphism classes.
3.ii Homomorphisms
Lemma 3.2.2: A linear map maps the zero vector to the zero vector.
Proof:
The proof is the same as the proof to Lemma 3.1.3.
Lemma 3.2.3: The following statements are equivalent for any map between vector spaces:
- is a homomorphism;
- ;
- .
Proof:
TODO (in Hefferon)
Definition 3.2.4: The trace of a square matrix is the sum down the upper-left to bottom-right diagonal. So is:
Theorem 3.2.5: A homomorphism is determined by its action on a basis: if is a vector space with basis , is a vector space, and (not necessarily distinct), then there exists a homomorphism from to sending each to , and that homomorphism is unique.
Proof:
TODO (in Hefferon)
Example (Derivation of (rotation)):
Extending linearly:
Lemma 3.2.8: For vector spaces and , the set of linear functions from to is itself a vector space, a subspace of the space of all functions from to . We denote the space of linear maps from to by .
Alternatively: a linear combination of homomorphisms is also a homomorphism.
Proof:
TODO (in Hefferon)
Lemma 3.2.9: Under a homomorphism, the image of any subspace of the domain is a subspace of the codomain. In particular, the image of the entire space, the range of the homomorphism, is a subspace of the codomain.
That is, given vector spaces and a homomorphism , where and are subspaces of and respectively.
Proof:
TODO (in Hefferon)
Definition 3.2.10: The range space of a homomorphism is
sometimes denoted . The dimension of the range space is the map’s rank.
Lemma 3.2.12: For any homomorphism the inverse image of a subspace of the range is a subspace of the domain. In particular, the inverse image of the trivial subspace of the range is a subspace of the domain.
Proof:
TODO (in Hefferon)
Definition 3.2.13: The null space of kernel of a linear map is the inverse image of .
Theorem 3.2.15 (rank-nullity theorem): Given a linear map ,
Proof:
TODO (in Hefferon)
Lemma 3.2.16: Under a linear map, the image of a linearly dependent set is linearly dependent.
Proof:
Suppose that with some nonzero. Apply to both sides: and . We still have some nonzero, so the set of the s is linearly dependent.
Theorem 3.2.17: Where is an -dimensional vector space, these are equivalent statements:
- is one-to-one/injective;
- has an inverse from its range to its domain that is a linear map;
- , that is, ;
- ;
- if is a basis for then is a basis for .
Proof:
TODO (in Hefferon)
3.iii Computing linear maps
Definition 3.3.1: Suppose that and are vector space of dimensions and with bases and , and that is a linear map. If