Math Methods 1 Lecture Notes
To show this first note that x ∈ D means x ∈ A or x is a member of both B and C. If x ∈ A then x ∈ (A ∪ B) and x ∈ (A ∪ C). If x ∈ (B ∩ C) then x ∈ (A ∪ B) and x ∈ (A ∪ C). In either case x ∈ (A ∪ B) ∩ (A ∪ C). This shows that D ⊂ E. To show E ⊂ D note that x ∈ E means that x is a member of both (A ∪ B) and (A ∪ C). If x ∈ A then x ∈ A ∪ (B ∩ C). If x ∉ A, then it must be in both B and C, i.e. in B ∩ C. Combining these cases we see x ∈ E ⇒ x ∈ D. Thus E = D.

In this course we will deal with sets of three different sizes:

1. Finite sets.
2. Countably infinite sets.
3. Uncountably infinite sets.

Two sets, A and B, are the same size if there is a 1-to-1 correspondence between the members of A and the members of B. A set has N members if its members can be put into a 1-to-1 correspondence with the first N natural numbers, {0, 1, 2, ..., N − 1}. Sets with N members are finite sets. Countably infinite sets are sets whose members can be put into a 1-to-1 correspondence with the natural numbers N. Infinite sets that are not countable are uncountable. Uncountable sets also come in different sizes; however, we will not need to concern ourselves with the ordering of different uncountable sets.

The real numbers R are uncountable while the rational numbers Q are countable. It might seem odd that the set of irrational numbers is much larger than the set of rational numbers, since between every pair of irrational numbers there is a rational number. Later we will show that the volume occupied by the irrational numbers between zero and one is one, while the volume occupied by the rational numbers between zero and one is zero. For this lecture I give a standard proof that the rational numbers are countable.

To show this it is necessary to construct a 1-1 correspondence between the elements of Q and N. To construct the desired correspondence note that any rational number can be expressed in the form

x ∈ Q ⇒ x = (−1)^l n/m,   n, m ∈ N, l ∈ {0, 1}.   (2)
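The correspondence can be made concrete. The sketch below is my own illustration, not part of the notes, and the traversal order is one conventional choice rather than the one the notes may use: it walks the pairs (n, m) along anti-diagonals n + m = constant, keeps only fractions in lowest terms, and emits both signs, so every rational appears exactly once.

```python
from fractions import Fraction
from math import gcd

def rationals():
    """Enumerate Q without repetition by walking the (n, m) pairs along anti-diagonals."""
    yield Fraction(0)            # handle zero separately
    s = 2                        # s = n + m labels an anti-diagonal
    while True:
        for n in range(1, s):
            m = s - n
            if gcd(n, m) == 1:   # keep only fractions already in lowest terms
                yield Fraction(n, m)    # l = 0 in the representation (2)
                yield Fraction(-n, m)   # l = 1
        s += 1

# usage: the first few rationals in this ordering
gen = rationals()
print([next(gen) for _ in range(11)])   # 0, 1, -1, 1/2, -1/2, 2, -2, 1/3, -1/3, 3, -3
```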


A set is closed if it contains all of its accumulation points.

Example: (a, b) := {x ∈ R | a < x < b} is open; [a, b] := {x ∈ R | a ≤ x ≤ b} is closed.

A region is an open set with the property that any two points can be connected by a continuous line in the set.

Example: (a, b) is an open region; (a, b) ∪ (c, d) with c > b is open, but it is not a region.

We say F(a) = b from metric space A to metric space B is continuous at a if for every ε > 0 there exists an R > 0 such that for every a′ ∈ N_{a,R}, ρ_B(b, F(a′)) < ε.

We say F(a) = b from metric space A to metric space B is uniformly continuous on C ⊂ A if for every ε > 0 and every a ∈ C there exists an R > 0 (independent of a ∈ C) such that for every a′ ∈ N_{a,R}, ρ_B(F(a), F(a′)) < ε.

Note that in formulating these definitions we only used the metric functions on A and B (the metric function on A was used to define the neighborhood N_{a,R}). You should convince yourselves that this reduces to the standard definition that you already know when it is applied to real valued functions of a real variable.

In mathematics there are a number of ways of proving results. Understanding how this is done helps to understand when certain results can be extended or not. In this class we will use three methods to prove results:

1. Proof by construction: In this method the existence of an object satisfying certain mathematical properties is established by an explicit construction of the desired object.

Example: The statement that there exists a prime number greater than 11 can be established by showing that 13 is prime and that it is greater than 11.

Example: In physics, quantum field theory can be characterized by axioms that define the expected physically sensible properties of quantum fields:

A1: The theory is a quantum theory.


A2: The theory is consistent with special relativity.
A3: The energy of the system is bounded from below.
A4: The theory can describe particles.
A5: Microscopic causality.

It turns out that non-trivial examples of theories satisfying these axioms are difficult to construct. One can however prove the consistency of these axioms by showing that free field theories are an example of a (trivial) theory that is consistent with these axioms. This is an example of proof by construction.

2. Proof by contradiction: To prove something by contradiction, assume that the desired result is false. Use that assumption to establish a contradiction. This then implies that the starting assumption is incorrect.

Example: (Aristotle) To prove √2 is irrational, assume by contradiction that it is rational. Then we can write

√2 = m/n

where m and n are assumed to have no common divisors. Squaring this equation gives

2n^2 = m^2.

Since m^2 is even (by the above) and the square of any odd number is odd, it follows that m is even. This means that m^2 is divisible by 4. If this is true then n^2 is divisible by 2. This means that n cannot be odd, so it must be even. This contradicts the assumption that m and n have no common divisors. Thus the assumption that √2 is rational must be false. This establishes that √2 is irrational.

3. Proof by induction: When proving an infinite or large number of results, the next best thing to proof by construction is proof by induction. In this case we label the desired results R_1 ··· R_n ···. We begin by establishing that R_1 is true by construction or contradiction. Then we show by construction or contradiction that if R_n is true for n < m then R_m is true. The principle of mathematical induction then implies that R_k is true for any k ≥ 1.


Example: For real numbers x it is easy to show |x_1 x_2| = |x_1||x_2| by going through all four possible sign combinations for the two numbers. If we assume (our induction assumption) |x_1 ··· x_{n−1}| = |x_1| ··· |x_{n−1}| then

|x_1 ··· x_n| = |x_1 ··· x_{n−1}||x_n| = |x_1| ··· |x_n|   (9)

which shows that if the result holds for N − 1 factors it also holds for N factors.

Example: Cluster properties in relativistic quantum mechanics is the statement that if a relativistically invariant system is divided into isolated subsystems, then each isolated subsystem is relativistically invariant. It turns out it is easy to establish the result for a system of two particles. One can then show that if the result holds for M < N particles it also holds for N particles. This is a problem that is central to my own research program.

Complex numbers

Consider the polynomial equation

P(z) = z^2 + 1 = 0.   (10)

This equation has no solutions when z is restricted to be a real number. Complex numbers are an extension of the real numbers in which polynomial equations always have roots. In order to construct the desired extension it is enough to introduce the new number

i = √−1.   (11)

This notation was introduced by Euler in 1779. With the introduction of i equation (10) can be factored

P(z) = (z + i)(z − i)   (12)

with roots

z = ±i.   (13)

A general complex number z has the form

z = x + iy   (14)


where x, y ∈ R. The number x is called the real part of z, denoted by

x = R(z).   (15)

The number y is called the imaginary part of z, denoted by

y = I(z).   (16)

A complex number z = x + iy is real if y = 0. It is imaginary if x = 0. It is zero if x = y = 0.

Complex numbers can be added, subtracted, and multiplied:

z_1 + z_2 = (x_1 + iy_1) + (x_2 + iy_2) = (x_1 + x_2) + i(y_1 + y_2)   (17)

z_1 − z_2 = (x_1 + iy_1) − (x_2 + iy_2) = (x_1 − x_2) + i(y_1 − y_2)   (18)

z_1 z_2 = (x_1 + iy_1)(x_2 + iy_2) = x_1 x_2 + i(x_1 y_2 + y_1 x_2) + i^2 y_1 y_2 = (x_1 x_2 − y_1 y_2) + i(x_1 y_2 + y_1 x_2).   (19)
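As a quick sanity check (my own illustration, not from the notes), the component formulas (17) and (19) can be compared against Python's built-in complex arithmetic:

```python
def add(z1, z2):
    (x1, y1), (x2, y2) = z1, z2
    return (x1 + x2, y1 + y2)                      # eq. (17)

def mul(z1, z2):
    (x1, y1), (x2, y2) = z1, z2
    return (x1 * x2 - y1 * y2, x1 * y2 + y1 * x2)  # eq. (19)

z1, z2 = (3.0, -2.0), (1.5, 4.0)                   # arbitrary test values
print(add(z1, z2), complex(*z1) + complex(*z2))    # (4.5, 2.0)  vs (4.5+2j)
print(mul(z1, z2), complex(*z1) * complex(*z2))    # (12.5, 9.0) vs (12.5+9j)
```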


0.3 Lecture 3

If a complex number z ≠ 0 then it has a multiplicative inverse

1/z = [1/(x + iy)] · [(x − iy)/(x − iy)] = x/(x^2 + y^2) − i y/(x^2 + y^2).   (20)

The complex conjugate z* of a complex number z = x + iy is

z* = (x + iy)* = x − iy.   (21)

It follows that

i* = −i.   (22)

A direct calculation shows for any complex number z = x + iy that

zz* = (x + iy)(x − iy) = x^2 + y^2 ≥ 0.   (23)

The real number

|z| := √(zz*) = √(x^2 + y^2)   (24)

is called the modulus of the complex number z.

Let z = x + iy and write

z = |z| (x/|z| + i y/|z|).   (25)

This can be written

z = |z| (cos(φ) + i sin(φ))   (26)

where

cos(φ) = x/|z|,   sin(φ) = y/|z|.   (27)

The angle φ is the argument of z. The argument of a complex number is only defined up to an integer multiple of 2π.

Any complex number can be expressed in terms of its real and imaginary part or in terms of its modulus and argument. It can be represented by a vector in the complex plane with x coordinate x and y coordinate y. The modulus and argument are like a representation of the same vector in terms of polar coordinates. The modulus is the length of the vector and the argument is the angle that the vector makes with the x axis.
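In Python the modulus and argument of (24)-(27) correspond to abs and cmath.phase; the following short check is my own illustration, not part of the notes (cmath.phase returns the branch of the argument in (−π, π]):

```python
import cmath

z = 3 + 4j
r, phi = abs(z), cmath.phase(z)                     # modulus and argument of z
print(r, phi)                                       # 5.0 and arctan(4/3)
print(r * (cmath.cos(phi) + 1j * cmath.sin(phi)))   # reconstructs z, eq. (26)
```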


Note that

z_1 z_2 = |z_1|(cos(φ_1) + i sin(φ_1)) |z_2|(cos(φ_2) + i sin(φ_2)) = |z_1||z_2|[(cos(φ_1)cos(φ_2) − sin(φ_1)sin(φ_2)) + i(cos(φ_1)sin(φ_2) + sin(φ_1)cos(φ_2))].   (28)

Using the trigonometric identities

cos(φ_1 + φ_2) = cos(φ_1)cos(φ_2) − sin(φ_1)sin(φ_2)   (29)

sin(φ_1 + φ_2) = cos(φ_1)sin(φ_2) + sin(φ_1)cos(φ_2)   (30)

gives

z_1 z_2 = |z_1||z_2|(cos(φ_1 + φ_2) + i sin(φ_1 + φ_2)).   (31)

This means that the modulus of the product of two complex numbers is the product of the moduli of each complex number,

|z_1 z_2| = |z_1||z_2|   (32)

and the argument of the product of two complex numbers is the sum of the arguments of each individual complex number,

arg(z_1 z_2) = arg(z_1) + arg(z_2).   (33)

The addition of complex numbers can be represented graphically in the complex plane. From the geometric observation that the length of any side of a triangle is less than the sum of the lengths of the two other sides we get

|z_1 + z_2| ≤ |z_1| + |z_2|   (34)

|z_1| ≤ |z_1 + z_2| + |z_2|   (35)

|z_2| ≤ |z_1 + z_2| + |z_1|   (36)

which for obvious reasons are called triangle inequalities.

If we define

ρ(z_1, z_2) := |z_1 − z_2|   (37)

then using the definition of the modulus (23) and the triangle inequality (34) it is easy to show

ρ(z_1, z_2) ≥ 0,   ρ(z_1, z_2) = 0 ⇒ z_1 = z_2   (38)


ρ(z_1, z_2) ≤ ρ(z_1, z_3) + ρ(z_3, z_2)   (39)

which means that ρ(z_1, z_2) is a metric function for the complex numbers.

We started our discussion of complex numbers by looking for roots of the equation

P(z) = z^2 + 1.   (40)

Initially this was considered as a real valued function of a real variable; however we showed that it could also be considered as a complex valued function of a complex variable. This new interpretation has the advantage that this equation has two roots when considered as a complex valued function of a complex variable. For this reason it is useful to introduce a new class of complex valued functions of a complex variable.

In general a function f is a mapping from a set A to a set B. The set A is called the domain of f and the set B is called the range of f. When A and B are both subsets of the complex plane C, f is called a complex function of a complex variable.

It is also possible to have real valued functions of a complex variable such as

f(z) := zz* + 4(zz*)^2   (41)

f(z) := I(z) = y   (42)

and complex valued functions of a real variable such as

f(φ) = cos(φ) + i sin(φ).   (43)

Polynomials in z are a class of complex valued functions of complex arguments. A degree N polynomial is a function of the form

P(z) = Σ_{n=0}^{N} a_n z^n   (44)

where z is a complex variable and the a_n are complex coefficients.

One of the great features of complex numbers is that while we introduced the complex number i to find roots of 0 = z^2 + 1, we will show later that any degree N polynomial has N complex roots. No further extensions of the complex numbers are needed to completely factorize any polynomial. This property of the complex numbers is called algebraic completeness. The theorem that


the complex numbers are algebraically complete is called the fundamental theorem of algebra.

One of the simplest non-trivial complex functions of a complex argument is the exponential function. It is defined by

e^z := 1 + Σ_{n=1}^{∞} z^n/n!.   (45)

Note that

|e^z − 1 − Σ_{n=1}^{N} z^n/n!| = |Σ_{n=N+1}^{∞} z^n/n!|.   (46)

Mathematical induction can be used to show that |z^n| = |z|^n follows from (32), |z^2| = |z|^2. Combining this result with repeated use of the triangle inequality gives

|Σ_{n=N+1}^{∞} z^n/n!| ≤ Σ_{n=N+1}^{∞} |z|^n/n!.   (47)

This is a real sum which is known to converge to zero as N → ∞ because this is a property of the real valued exponential function. (I will show this shortly.) Thus this series representation converges to the exponential function as N goes to infinity for every z. This shows that the definition (45) makes sense.

One special case of the exponential function is

e^{iφ} = Σ_{n=0}^{∞} (iφ)^n/n! = Σ_{n=0}^{∞} (i)^{2n}/(2n)! φ^{2n} + i Σ_{n=0}^{∞} (i)^{2n}/(2n+1)! φ^{2n+1}
       = Σ_{n=0}^{∞} (−1)^n/(2n)! φ^{2n} + i Σ_{n=0}^{∞} (−1)^n/(2n+1)! φ^{2n+1} = cos(φ) + i sin(φ).   (48)

The series on the last line of (48) are recognized as the series representations for cos(φ) and sin(φ).
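Partial sums of (45) converge rapidly for moderate |z|. The short numerical check below is my own illustration, not from the notes; it uses Python's cmath.exp only as a reference value:

```python
import cmath

def exp_partial(z, N):
    """Partial sum 1 + z + z^2/2! + ... + z^N/N! of the series (45)."""
    term, total = 1.0 + 0j, 1.0 + 0j
    for n in range(1, N + 1):
        term *= z / n            # builds z^n/n! recursively
        total += term
    return total

z = 1.0 + 2.0j
for N in (5, 10, 20):
    print(N, abs(exp_partial(z, N) - cmath.exp(z)))   # error shrinks with N
```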


This allows us to write

z = |z|(cos(φ) + i sin(φ)) = |z| e^{iφ}   (49)

and leads to

z_1 z_2 = |z_1||z_2|(cos(φ_1 + φ_2) + i sin(φ_1 + φ_2)) = |z_1||z_2| e^{i(φ_1+φ_2)}.   (50)

One way to construct new functions is to approximate them by known functions. This can be done formally by using the metric function on C.

A sequence of functions f_n(x): A → B converges to a function f(x) at x = p if for every ε > 0 there is an N such that for every n > N

ρ(f(p), f_n(p)) < ε   (51)

where ρ(·,·) is a metric function on B.

In general for a given ε the number N may depend on the point p. In the special case when N can be chosen independent of p ∈ A the convergence is uniform on A:

A sequence of functions f_n(x): A → B converges uniformly to a function f(x) if for every ε > 0 there is an N such that for every n > N

ρ(f(x), f_n(x)) < ε   (52)

for all x ∈ A.

Since convergence can be expressed in terms of a metric function, these definitions can be applied equally to real or complex valued functions. In many cases one does not know the limiting function. It is possible to test for convergence of a sequence of approximate functions by considering only the terms in the sequence.

A Cauchy sequence f_n is a sequence of elements of a metric space B with the property that for every ε > 0 there is an N such that whenever m, n > N

ρ(f_n, f_m) < ε.   (53)

Every convergent sequence is a Cauchy sequence. To prove this assume that f_n converges to f. Pick ε > 0; then there is an N with

ρ(f, f_n) < ε/2  for all n > N.   (54)

Let m, n > N. Then

ρ(f_n, f_m) ≤ ρ(f_n, f) + ρ(f, f_m) < ε/2 + ε/2 = ε.   (55)


29:171 - Homework Assignment #1

1. Prove

e^{z_1} e^{z_2} = e^{z_1+z_2}

for any pair of complex numbers z_1 and z_2. Use the definition

e^z = 1 + z + z^2/2! + z^3/3! + ··· = Σ_{n=0}^{∞} z^n/n!.

2. Find the real and imaginary parts of sin(z) and cos(z). Express your results in terms of real valued functions.

3. Show that zz* is always real and non-negative.

4. Use the quadratic formula to factorize the polynomial

P(z) = z^2 + 3z + 12.

Use complex arithmetic to verify that the polynomial is recovered by multiplying the factors.

5. Calculate the real and imaginary parts of

(10 + i5)/(7 − i6).

6. Find the modulus and argument of cos(ix).


0.4 Lecture 4

Conversely assume that {f_n} is a Cauchy sequence. We can choose a subsequence {f_{m_1}, f_{m_2}, ..., f_{m_k}, ...} with the m_l sufficiently large so that

ρ(f_{m_k}, f_{m_{k+n}}) < 1/2^k.   (56)

To do this first choose m_1 so that n > m_1 implies ρ(f_{m_1}, f_n) < 1/2. Next choose m_2 > m_1 so that n > m_2 implies ρ(f_{m_2}, f_n) < 1/2^2. This process can be continued to satisfy the inequalities in (56). Define

f := f_{m_1} + Σ_{k=1}^{∞} (f_{m_{k+1}} − f_{m_k}).   (57)

Then

ρ(f, f_{m_l}) ≤ Σ_{k=l}^{∞} ρ(f_{m_{k+1}}, f_{m_k}) ≤ (1/2^l) Σ_{m=0}^{∞} (1/2)^m = 1/2^{l−1}.   (58)

The right hand side can be made as small as desired by choosing a large enough l. Specifically, for any ε > 0 it is possible to find an l such that ε > 1/2^{l−1}. This shows that the sequence f_{m_k} converges to f as k → ∞.

This shows that a necessary and sufficient condition for a sequence to converge is that it is a Cauchy sequence. We also showed how to express the limit as a convergent series.

In the above the Cauchy sequence was a sequence of numbers. The above proof also extends to the case that f → f(z) and f_n → f_n(z) are functions. Cauchy sequences can be used to show either convergence or uniform convergence of sequences of functions:

A sequence f_n(z) converges for z = p if for every ε > 0 there is an N such that whenever m, n > N

ρ(f_n(p), f_m(p)) < ε.   (59)

A sequence f_n(z) is uniformly convergent on a set S if for every ε > 0 there is an N such that whenever m, n > N

ρ(f_n(z), f_m(z)) < ε   (60)

independent of z ∈ S.


Example: We show that the partial sums of the exponential series are a Cauchy sequence for each z; they are not however uniformly Cauchy. First define the partial sums:

z_m := Σ_{n=0}^{m} z^n/n!.   (61)

To show {z_m} is a Cauchy sequence consider (for k > m)

z_k − z_m = Σ_{n=m+1}^{k} z^n/n!.   (62)

We can write the right hand side of (62) as

[z^{m+1}/(m+1)!] (1 + z/(m+2) + z^2/((m+2)(m+3)) + ··· + z^{k−m−1}/((m+2)···k)).

Repeated use of the triangle inequality means that the modulus of this complex number is bounded by

[|z|^{m+1}/(m+1)!] (1 + |z|/(m+2) + |z|^2/(m+2)^2 + ··· + |z|^{k−m−1}/(m+2)^{k−m−1})
≤ [|z|^{m+1}/(m+1)!] (1 + |z|/(m+2) + |z|^2/(m+2)^2 + ···) = [|z|^{m+1}/(m+1)!] · 1/(1 − |z|/(m+2))   (63)

where we have used the geometric series

Σ_{n=0}^{∞} x^n = 1/(1 − x)   (64)

for |x| < 1 (here x = |z|/(m+2), valid once m + 2 > |z|). For any fixed |z| it is possible to choose m sufficiently large to make (63) as small as desired.


This same proof can be used to show that the series for the following functions

sin(z) := Σ_{n=0}^{∞} (−1)^n/(2n+1)! z^{2n+1}   (65)

cos(z) := Σ_{n=0}^{∞} (−1)^n/(2n)! z^{2n}   (66)

sinh(z) := Σ_{n=0}^{∞} 1/(2n+1)! z^{2n+1}   (67)

cosh(z) := Σ_{n=0}^{∞} 1/(2n)! z^{2n}   (68)

converge for all z in the complex plane.

Because these functions are represented by absolutely convergent series, many identities that one derives for the corresponding functions of a real variable have generalizations to a complex variable. An immediate consequence of equations (65), (66), (67), (68) are the identities

cosh(iz) = cos(z),   cos(iz) = cosh(z)   (69)

sinh(iz) = i sin(z),   sin(iz) = i sinh(z)   (70)

cosh(z) = (1/2)(e^z + e^{−z}),   sinh(z) = (1/2)(e^z − e^{−z})   (71)

e^{±z} = cosh(z) ± sinh(z)   (72)

e^{±iz} = cos(z) ± i sin(z)   (73)

cos(z) = (1/2)(e^{iz} + e^{−iz}),   sin(z) = (1/2i)(e^{iz} − e^{−iz})   (74)

For homework I will ask you to prove

cos(z_1 + z_2) = cos(z_1) cos(z_2) − sin(z_1) sin(z_2)   (75)

sin(z_1 + z_2) = sin(z_1) cos(z_2) + cos(z_1) sin(z_2)   (76)

From (69)-(76) we get

cos(x + iy) = cos(x) cosh(y) − i sin(x) sinh(y)   (77)


sin(x + iy) = sin(x) cosh(y) + i cos(x) sinh(y)   (78)

cosh(z_1 + z_2) = cosh(z_1) cosh(z_2) + sinh(z_1) sinh(z_2)   (79)

sinh(z_1 + z_2) = sinh(z_1) cosh(z_2) + cosh(z_1) sinh(z_2)   (80)

cosh(x + iy) = cosh(x) cos(y) + i sinh(x) sin(y)   (81)

sinh(x + iy) = sinh(x) cos(y) + i cosh(x) sin(y)   (82)

We can also define the logarithm of a complex variable. It is defined to satisfy the relation

e^{ln(z)} = z.   (83)

If we set

ln(z) = u(z) + iv(z)   (84)

then using the result of homework problem 1 gives

e^{ln(z)} = e^{u(z)+iv(z)} = e^{u(z)} e^{iv(z)} = |z| e^{iφ}   (85)

or

u(z) = ln|z|,   v(z) = φ(z) + 2nπ.   (86)

Putting all of this together gives

ln(z) = ln|z| + i(φ(z) + 2nπ).   (87)

Here n is any integer. The complex logarithm is an example of a multiple valued complex function.
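A few of these identities, and the multivaluedness of the logarithm in (87), can be checked numerically. The sketch below is my own illustration, not from the notes; Python's cmath.log returns the branch whose imaginary part lies in (−π, π]:

```python
import cmath

z = 0.7 - 1.3j
# identities (69) and (70)
print(cmath.cos(1j * z), cmath.cosh(z))        # cos(iz) = cosh(z)
print(cmath.sin(1j * z), 1j * cmath.sinh(z))   # sin(iz) = i sinh(z)

# the logarithm (87): every branch ln|z| + i(phi + 2*pi*n) exponentiates back to z
for n in (-1, 0, 1):
    w = cmath.log(z) + 2j * cmath.pi * n
    print(n, cmath.exp(w))                     # all three branches reproduce z
```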


0.5 Lecture 5

Complex derivatives

A general complex-valued function of a complex variable z = x + iy has the form

f(z) = u(x, y) + iv(x, y).   (88)

This function has a complex derivative if

df/dz (z) = lim_{∆z→0} [f(z + ∆z) − f(z)]/∆z   (89)

exists and is unique. The existence of a complex derivative is a restrictive condition. The problem is that while this looks similar to the ordinary definition of the derivative of a real valued function of a real variable, the uniqueness requirement means that the result is independent of the direction in which ∆z approaches zero in the complex plane. If we write

∆z = |∆z| e^{i∆φ}   (90)

it follows that |∆z| → 0 implies ∆z → 0; however in general the limit may have a residual dependence on ∆φ. When the function has a complex derivative there can be no residual ∆φ dependence.

Example 1: Let z = x + iy and f(z) = x + 2iy. Writing ∆z = x + iy,

df/dz (0) = lim_{∆z→0} (x + 2iy)/(x + iy) = lim_{∆z→0} (x^2 + 2y^2 + ixy)/(x^2 + y^2).   (91)

If we set y = 0 and let x → 0 then df/dz (0) = 1; on the other hand if we set x = 0 and let y → 0 we get df/dz (0) = 2. Clearly the derivative of this function at zero depends on how ∆z approaches 0. This means that f(z) = x + 2iy does not have a complex derivative.

Example 2: Consider f(z) = z^2. Then

df/dz = lim_{∆z→0} [(z + ∆z)^2 − z^2]/∆z = lim_{∆z→0} (2z + ∆z) = 2z   (92)

which is clearly independent of how ∆z → 0.
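Example 1 can be reproduced numerically: approach ∆z → 0 along different directions and watch the difference quotient change. This is an illustration of mine, not code from the notes:

```python
import cmath

def f(z):
    return z.real + 2j * z.imag            # the function x + 2iy of Example 1

def difference_quotient(f, z0, dz):
    return (f(z0 + dz) - f(z0)) / dz

z0 = 0.0 + 0.0j
for phi in (0.0, cmath.pi / 4, cmath.pi / 2):
    dz = 1e-6 * cmath.exp(1j * phi)        # small step in the direction phi
    print(phi, difference_quotient(f, z0, dz))
# along the real axis the quotient is 1, along the imaginary axis it is 2,
# so f(z) = x + 2iy has no complex derivative; repeating the loop with
# f(z) = z**2 gives (approximately) 2*z0 in every direction.
```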


When f(z) does have a complex derivative, since the definition of the derivative is essentially the same as it is for real valued functions of a real argument, it follows that the complex derivative of f(z) obeys the standard rules of differential calculus:

d/dz (f_1(z) + f_2(z)) = df_1/dz (z) + df_2/dz (z)   (93)

d/dz (f_1(z) f_2(z)) = df_1/dz (z) f_2(z) + f_1(z) df_2/dz (z)   (94)

d/dz (f_1(f_2(z))) = df_1/dz (f_2(z)) df_2/dz (z)   (95)

d/dz [f_1(z)/f_2(z)] = df_1/dz (z) · 1/f_2(z) − f_1(z) df_2/dz (z)/f_2(z)^2   (96)

   = [1/f_2(z)^2] (f_2(z) df_1/dz (z) − f_1(z) df_2/dz (z)).   (97)

The proofs of these results are identical to the corresponding proofs for real valued functions.

Since the existence of a complex derivative is not always immediately apparent, it is useful to have a simple test to see if a function has a complex derivative. Let

f(z) = u(x, y) + iv(x, y).   (98)

If f(z) has a complex derivative then we can compute it by letting ∆z = ∆x or ∆z = i∆y. Both must give the same result:

df/dz (z) = ∂f/∂x (z) = −i ∂f/∂y (z).   (99)

Expressing this in terms of the real and imaginary components gives

df/dz (z) = ∂u/∂x (z) + i ∂v/∂x (z) = ∂v/∂y (z) − i ∂u/∂y (z).   (100)

Equating the real and imaginary parts of these expressions gives

∂u/∂x = ∂v/∂y,   ∂u/∂y = −∂v/∂x.   (101)


Equations (101) are called the Cauchy-Riemann equations. They are a necessary condition for f(z) = u(x, y) + iv(x, y) to have a complex derivative. One immediate consequence of the existence of a complex derivative is that it can be computed by letting ∆z → 0 in any direction in the complex plane.

The Cauchy-Riemann equations have some interesting consequences. If

f(z) = u(x, y) + iv(x, y)   (102)

and the second partial derivatives of u and v with respect to x and y exist, then we have

∂^2u/∂x^2 = ∂^2v/∂x∂y   (103)

∂^2u/∂y^2 = −∂^2v/∂y∂x.   (104)

Adding equations (103) and (104) gives

∂^2u/∂x^2 + ∂^2u/∂y^2 = 0   (105)

which means that if u is the real part of a function with a complex derivative then it is necessarily a solution of Laplace's equation in two dimensions. Using the same method it also follows that the imaginary part of a function with a complex derivative is also a solution of Laplace's equation

∂^2v/∂x^2 + ∂^2v/∂y^2 = 0.   (106)

These two solutions are not independent; they are necessarily related by the Cauchy-Riemann equations.

This has an interesting geometric interpretation. Consider the gradients of the functions u(x, y) and v(x, y):

∇u(x, y) = (∂u/∂x, ∂u/∂y),   ∇v(x, y) = (∂v/∂x, ∂v/∂y).   (107)

The dot product of these two vectors is

∇u(x, y) · ∇v(x, y) = (∂u/∂x)(∂v/∂x) + (∂u/∂y)(∂v/∂y).   (108)


Use the Cauchy-Riemann equations in the last two terms to get

∇u(x, y) · ∇v(x, y) = (∂u/∂x)(∂v/∂x) + (−∂v/∂x)(∂u/∂x) = 0.   (109)

To understand the meaning of this relation note that the equations

u(x, y) = c_1   (110)

v(x, y) = c_2,   (111)

where c_1 and c_2 are constants, define curves in the complex plane. The gradient of u is perpendicular to the curve u(x, y) = c_1, while the gradient of v is perpendicular to the curve v(x, y) = c_2. Since we have shown that the gradients are perpendicular to each other, it follows that if the curves u(x, y) = c_1 and v(x, y) = c_2 intersect, they necessarily intersect at right angles!

Thus the real and imaginary parts of a complex function with a complex derivative are solutions of Laplace's equation whose level curves are everywhere perpendicular. Functions satisfying Laplace's equation (in any number of variables) are called harmonic functions. The requirement that the real and imaginary parts of a function with a complex derivative are harmonic indicates the restrictive nature of functions that have complex derivatives.

We have shown that the Cauchy-Riemann equations are a necessary condition for a complex function of a complex variable to have a complex derivative. I will show that if a complex valued function of a complex variable satisfies the Cauchy-Riemann equations and the first partial derivatives of that function are continuous, then that function has a complex derivative. This provides a simple means to test for the existence of a complex derivative.

Theorem 5.1: Let f(z) = u(x, y) + iv(x, y) and assume

∂u/∂x, ∂u/∂y, ∂v/∂x, ∂v/∂y   (112)

exist, are continuous in a region R containing z, and satisfy the Cauchy-Riemann equations. Then f(z) has a complex derivative.

To prove this we first pick a point z = x + iy. The existence of the partial derivatives means that


u(x + ∆x, y) − u(x, y) = ∆x[∂u/∂x (x, y) + δ_ux(x, y; ∆x)]   (113)

where

δ_ux(x, y; ∆x) → 0 as ∆x → 0.   (114)

Similarly

u(x, y + ∆y) − u(x, y) = ∆y[∂u/∂y (x, y) + δ_uy(x, y; ∆y)]   (115)

where

δ_uy(x, y; ∆y) → 0 as ∆y → 0,   (116)

and

v(x + ∆x, y) − v(x, y) = ∆x[∂v/∂x (x, y) + δ_vx(x, y; ∆x)]   (117)

where

δ_vx(x, y; ∆x) → 0 as ∆x → 0,   (118)

and

v(x, y + ∆y) − v(x, y) = ∆y[∂v/∂y (x, y) + δ_vy(x, y; ∆y)]   (119)

where

δ_vy(x, y; ∆y) → 0 as ∆y → 0.   (120)

Continuity of these partial derivatives means

∂u/∂x (x + ∆x, y) = ∂u/∂x (x, y) + ξ_uxx(x, y; ∆x)   (121)

∂u/∂x (x, y + ∆y) = ∂u/∂x (x, y) + ξ_uxy(x, y; ∆y)   (122)

⋮   (123)

where continuity requires

ξ_uxx(x, y; ∆x) → 0   (124)

ξ_uxy(x, y; ∆y) → 0   (125)

etc. as ∆x → 0 and ∆y → 0.


I use these relations along with the Cauchy-Riemann equations to compute the complex derivative of f(z):

[f(z + ∆z) − f(z)]/∆z = [u(x + ∆x, y + ∆y) + iv(x + ∆x, y + ∆y) − u(x, y) − iv(x, y)]/(∆x + i∆y).   (126)

I expand all terms using the above relations and use the Cauchy-Riemann equations to factor out a ∆z. I first do the expansion for u and v individually.

u(x + ∆x, y + ∆y)
 = u(x, y + ∆y) + ∆x[∂u/∂x (x, y + ∆y) + δ_ux(x, y + ∆y; ∆x)]
 = u(x, y + ∆y) + ∆x[∂u/∂x (x, y) + ξ_uxy(x, y; ∆y) + δ_ux(x, y + ∆y; ∆x)]
 = u(x, y) + ∆y[∂u/∂y (x, y) + δ_uy(x, y; ∆y)] + ∆x[∂u/∂x (x, y) + ξ_uxy(x, y; ∆y) + δ_ux(x, y + ∆y; ∆x)].   (127)

It follows that

u(x + ∆x, y + ∆y) − u(x, y) = ∆y[∂u/∂y (x, y) + δ_uy(x, y; ∆y)] + ∆x[∂u/∂x (x, y) + ξ_uxy(x, y; ∆y) + δ_ux(x, y + ∆y; ∆x)].   (128)

An identical calculation replacing u with v gives

v(x + ∆x, y + ∆y) − v(x, y) = ∆y[∂v/∂y (x, y) + δ_vy(x, y; ∆y)] + ∆x[∂v/∂x (x, y) + ξ_vxy(x, y; ∆y) + δ_vx(x, y + ∆y; ∆x)].   (129)

Using the Cauchy-Riemann equations and adding i times the second expression to the first gives

u(x + ∆x, y + ∆y) − u(x, y) + iv(x + ∆x, y + ∆y) − iv(x, y) =


∆y[−∂v/∂x (x, y) + δ_uy(x, y; ∆y)]
+ ∆x[∂u/∂x (x, y) + ξ_uxy(x, y; ∆y) + δ_ux(x, y + ∆y; ∆x)]
+ i∆y[∂u/∂x (x, y) + δ_vy(x, y; ∆y)]
+ i∆x[∂v/∂x (x, y) + ξ_vxy(x, y; ∆y) + δ_vx(x, y + ∆y; ∆x)]
= (∆x + i∆y)[∂u/∂x (x, y) + i ∂v/∂x (x, y)]
+ ∆y[δ_uy(x, y; ∆y) + i δ_vy(x, y; ∆y)]
+ ∆x[ξ_uxy(x, y; ∆y) + δ_ux(x, y + ∆y; ∆x) + i ξ_vxy(x, y; ∆y) + i δ_vx(x, y + ∆y; ∆x)].   (130)

Dividing by ∆z = ∆x + i∆y gives

[f(z + ∆z) − f(z)]/∆z = [∂u/∂x (x, y) + i ∂v/∂x (x, y)]
+ (∆y/∆z)[δ_uy(x, y; ∆y) + i δ_vy(x, y; ∆y)]
+ (∆x/∆z)[ξ_uxy(x, y; ∆y) + δ_ux(x, y + ∆y; ∆x) + i ξ_vxy(x, y; ∆y) + i δ_vx(x, y + ∆y; ∆x)].   (131)

If I take the limit as ∆z → 0 the last two lines of equation (131) vanish, since |∆x/∆z| ≤ 1 and |∆y/∆z| ≤ 1 while the bracketed quantities go to zero. The surviving term is the complex derivative:

df/dz (z) = lim_{∆z→0} [f(z + ∆z) − f(z)]/∆z = ∂u/∂x (x, y) + i ∂v/∂x (x, y).   (132)

This result has no residual dependence on the argument of ∆z. This completes the proof of Theorem 5.1. It shows that the existence of a complex derivative of f(z) can be established by checking (1) that the real and imaginary parts of f(z) satisfy the Cauchy-Riemann equations and (2) that the partial derivatives are continuous functions.
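The criterion of Theorem 5.1 is easy to apply numerically. The sketch below is my own illustration (finite-difference partial derivatives, not from the notes); it tests the Cauchy-Riemann equations (101) at a sample point for f(z) = z^2, which satisfies them, and for f(z) = x + 2iy, which does not:

```python
def partials(g, x, y, h=1e-6):
    """Central finite-difference approximations of dg/dx and dg/dy."""
    gx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return gx, gy

def check_cauchy_riemann(f, x, y):
    u = lambda a, b: f(complex(a, b)).real
    v = lambda a, b: f(complex(a, b)).imag
    ux, uy = partials(u, x, y)
    vx, vy = partials(v, x, y)
    return abs(ux - vy), abs(uy + vx)       # both vanish when (101) hold

print(check_cauchy_riemann(lambda z: z**2, 1.3, -0.7))                 # ~ (0, 0)
print(check_cauchy_riemann(lambda z: z.real + 2j*z.imag, 1.3, -0.7))   # (1, 0): CR fail
```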


0.6 Lecture 6

One more simple calculation will help clarify the meaning of a function with a complex derivative. Let

f(x, y) = u(x, y) + iv(x, y)   (133)

and assume that the double Taylor series in x and y about the origin converges. Then

series = Σ_{m,n=0}^{∞} (x^n y^m)/(n! m!) ∂^n/∂x^n ∂^m/∂y^m (u + iv)|_{0,0}.   (134)

The Cauchy-Riemann equations imply

∂/∂y (u + iv)|_{0,0} = ∂/∂x (−v + iu)|_{0,0} = i ∂/∂x (u + iv)|_{0,0}.   (135)

I use this to convert all of the y derivatives in the Taylor expansion to x derivatives to get

series = Σ_{m,n=0}^{∞} (x^n (iy)^m)/(n! m!) ∂^{n+m}/∂x^{n+m} (u + iv)|_{0,0}.   (136)

Next I let k = m + n and replace the sum over m and n by a sum over k and n:

series = Σ_{k=0}^{∞} (1/k!) Σ_{n=0}^{k} [k!/(n!(k − n)!)] x^n (iy)^{k−n} ∂^k/∂x^k (u + iv)|_{0,0}.   (137)

The binomial theorem gives

series = Σ_{k=0}^{∞} [(x + iy)^k/k!] ∂^k/∂x^k (u + iv)|_{0,0}.   (138)

This shows that if the function has a convergent Taylor series and the real and imaginary parts satisfy the Cauchy-Riemann equations, then the function depends only on the combination (x + iy), rather than independently on x and y.

Since we can always write a general function of x and y as a function of z and z*, what we have shown is that if the function satisfies the Cauchy-Riemann equations then

∂f(z, z*)/∂z* = 0.   (139)


For homework I will ask you to show that equation (139) is equivalent to the Cauchy-Riemann equations.

Integrals of functions of a complex variable

Complex functions can also be integrated. I will define the integral of a complex valued function of a complex variable along a path in the complex plane. By a path in the complex plane we mean a function of the form

z(t) = u(t) + iv(t)   (140)

where t is a real parameter that varies between t_a and t_b with

z(t_a) = a,   z(t_b) = b.   (141)

We are interested in integrals where the path is continuous. In addition, while we can tolerate kinks in the path, we do not want to consider paths that have an infinite number of kinks when t ranges over a finite interval. A regular curve C is a curve in the complex plane where the functions u(t) and v(t) are piecewise differentiable. For any finite interval [t_a, t_b], this means that it is possible to divide [t_a, t_b] into a finite number of subintervals

[t_a, t_b] = [t_a, t_{a1}] ∪ [t_{a1}, t_{a2}] ∪ ··· ∪ [t_{aN−1}, t_b]   (142)

where u(t) and v(t) are continuous on [t_a, t_b] and have continuous derivatives on each of the subintervals [t_{ak}, t_{ak+1}].

To define the integral of a complex function f(z) along a regular path C first subdivide the curve into n segments

a = z_0, z_1, z_2, ..., z_{n−1}, z_n = b.   (143)

On each of the subintervals I choose a point ζ_k and define the sum

I_n := Σ_{k=1}^{n} f(ζ_k)(z_k − z_{k−1}).   (144)

This quantity is a sum of products of complex numbers and is thus a complex number.


This can be repeated for any number of points n. I choose the points so that |z_k − z_{k−1}| → 0 as n → ∞. The limit

I := lim_{n→∞} I_n,   (145)

provided it exists and is independent of how the points z_k and ζ_k are chosen, is called the contour integral of f(z) along C and is written as

I := ∫_C f(z) dz.   (146)

In general the value of a contour integral will depend on both the endpoints and the choice of path, C.

While a contour integral can be evaluated directly from the definition, it can be reduced to an ordinary integral by writing

I_n := Σ_{k=1}^{n} f(ζ_k)(z_k − z_{k−1})
   = Σ_{k=1}^{n} (u(ζ_k) + iv(ζ_k))(x_k − x_{k−1} + i y_k − i y_{k−1})
   = Σ_{k=1}^{n} [u(ζ_k)(x_k − x_{k−1}) − v(ζ_k)(y_k − y_{k−1})]
   + i Σ_{k=1}^{n} [v(ζ_k)(x_k − x_{k−1}) + u(ζ_k)(y_k − y_{k−1})].   (147)

Since

|z_k − z_{k−1}| → 0   (148)

implies

|x_k − x_{k−1}| → 0,   |y_k − y_{k−1}| → 0,   (149)

this becomes the definition of the integral of a function of two variables along a path in a plane:

I = ∫ [u(x, y) dx − v(x, y) dy] + i ∫ [v(x, y) dx + u(x, y) dy]   (150)


and if the path is C = (x(t), y(t)) this becomes

I = ∫_a^b [u(x, y) dx/dt − v(x, y) dy/dt] dt + i ∫_a^b [v(x, y) dx/dt + u(x, y) dy/dt] dt   (151)

which is an ordinary integral. If C has continuous derivatives, then dx/dt and dy/dt are continuous functions of t. If C has only piecewise continuous derivatives, then the integral is replaced by a sum of terms of the above form, corresponding to integrals between the points where the derivatives have discontinuities.

Equation (151) is the standard way to calculate the contour integral. In general there will be many ways to parameterize the same curve. Different parameterizations will give the same value of the integral.
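Equation (151) also suggests a simple way to evaluate contour integrals numerically: parameterize the curve, then integrate f(z(t)) dz/dt with any quadrature rule. The following sketch is my own illustration (a midpoint rule on a straight-line path), not code from the notes:

```python
def contour_integral(f, z, dzdt, ta, tb, n=2000):
    """Approximate the integral of f along the path z(t), ta <= t <= tb (midpoint rule)."""
    h = (tb - ta) / n
    total = 0.0 + 0.0j
    for k in range(n):
        t = ta + (k + 0.5) * h
        total += f(z(t)) * dzdt(t) * h
    return total

# example: integrate f(z) = z**2 along the line z(t) = t + i t, 0 <= t <= 1
z = lambda t: t + 1j * t
dzdt = lambda t: 1 + 1j
print(contour_integral(lambda w: w**2, z, dzdt, 0.0, 1.0))   # ~ (1+i)^3 / 3
print((1 + 1j) ** 3 / 3)
```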


29:171 - Homework Assignment #2

1. Show

cos(z_1 + z_2) = cos(z_1) cos(z_2) − sin(z_1) sin(z_2).

You may use the results from problem 1 of the first set, but be careful to work from definitions.

2. Show that if f_1(z) = u_1(x, y) + iv_1(x, y) and f_2(z) = u_2(x, y) + iv_2(x, y) have continuous partial derivatives and satisfy the Cauchy-Riemann equations then g(z) = f_1(f_2(z)) has continuous partial derivatives and satisfies the Cauchy-Riemann equations.

3. Show that f(z) = e^{z^2} has a complex derivative.

4. Find the real and imaginary parts of f(z) = sin(z). Show explicitly that they both are solutions of Laplace's equation and show that their gradients are orthogonal at any point.

5. For z = re^{iφ} = x + iy, let f(z) = u(r, φ) + iv(r, φ). Find the form of the Cauchy-Riemann equations in terms of the (r, φ) variables.

6. Let u(x, y) = ax^3 + bx^2y + cxy^2 + dy^3. Find the values of a, b, c, d for which this function satisfies Laplace's equation. For these values of a, b, c, d, find a corresponding v(x, y) that satisfies Laplace's equation and the Cauchy-Riemann equations with u(x, y).

7. Let a and b be complex numbers. Show that

f(z) = (a − z)/(b − z)

has a complex derivative for z ≠ b.


0.7 Lecture 7

One property of ordinary real integrals that is shared by contour integrals is that the integral of a derivative depends only on the value of the function at the endpoints. Equation (151) can be written as

I = ∫_a^b f(z(t)) (dz/dt) dt.   (152)

Assume that

f(z) = dF/dz (z),   F(z) = U(z) + iV(z)   (153)

where F(z) has a complex derivative at all points on the path C. Then using the definition of the complex derivative it is easy to show, because the derivative is independent of direction, that

I = ∫_a^b (dF/dz)(z(t)) (dz/dt) dt = ∫_a^b dF(z(t))/dt dt = ∫_a^b [dU(z(t))/dt + i dV(z(t))/dt] dt   (154)

  = (U(b) + iV(b)) − (U(a) + iV(a)) = F(b) − F(a).   (155)

This result is the complex version of the fundamental theorem of calculus. In applying this theorem, some extra care is required to treat integrals of derivatives of multiple valued functions, like the complex logarithm. This will be discussed later.

Contour integrals satisfy standard properties that are shared by ordinary Riemann integrals, since both are expressed as limits of sums of products:

∫_C a f(z) dz = a ∫_C f(z) dz   (156)


∫_C (f(z) + g(z)) dz = ∫_C f(z) dz + ∫_C g(z) dz   (157)

If c is a point on C that divides C into two paths C_1 and C_2 then

∫_C f(z) dz = ∫_{C_1} f(z) dz + ∫_{C_2} f(z) dz.   (158)

The integration by parts formula follows from

f(b)g(b) − f(a)g(a) = ∫_a^b d/dz (f(z)g(z)) dz = ∫_a^b f(z) (dg/dz)(z) dz + ∫_a^b g(z) (df/dz)(z) dz.   (159)

Example 1: Let f(z) = x + 2iy and let C be the path composed by joining the two curves

C_1: z_1(t) = 2t,  0 ≤ t ≤ 1   (160)

C_2: z_2(t) = 2 + it,  0 ≤ t ≤ 1   (161)

I = ∫_{C_1+C_2} f(z) dz = ∫_{C_1} f(z) dz + ∫_{C_2} f(z) dz
  = ∫_0^1 f(z_1(t)) (dz_1/dt) dt + ∫_0^1 f(z_2(t)) (dz_2/dt) dt
  = ∫_0^1 2t · 2 dt + ∫_0^1 (2 + 2it) i dt = 2 + (2i − 1) = 1 + 2i.   (162)

In this example the value of the integral will depend on the path in the complex plane.

Example 2: Let f(z) = z^2 and let C be the curve

C: z(t) = t + it,  0 ≤ t ≤ 1.   (163)

I = ∫_C f(z) dz = ∫_0^1 f(z(t)) (dz/dt) dt


 = ∫_0^1 (t + it)^2 (1 + i) dt = (1/3)(t + it)^3 |_0^1 = (1/3)(1 + i)^3.   (164)

In this case we note that since

z^2 = (1/3) d(z^3)/dz   (165)

we can also use the fundamental theorem of calculus to show that

I = ∫_C f(z) dz = (1/3) ∫_C (d/dz) z^3 dz = (1/3) z^3 |_0^{1+i} = (1/3)(1 + i)^3.   (166)

This gives the same result as the explicit integral over C, but since this is the integral of a derivative, all that matters is the value of the function being differentiated at the endpoints of C. In this case the integral is independent of the choice of path.

Example 3: Let f(z) = 1/z and let C be a circle of radius 1 about the origin

C: z(φ) = e^{iφ},  0 ≤ φ ≤ 2π.   (167)

I = ∮_C f(z) dz = ∫_0^{2π} f(z(φ)) (dz/dφ) dφ = ∫_0^{2π} (1/e^{iφ}) i e^{iφ} dφ = i ∫_0^{2π} dφ = 2πi.   (168)

The interesting thing about this integral is that

1/z = d ln(z)/dz,   (169)

however the integral around a closed curve does not give zero. The problem in this case is that the natural logarithm is a multiple valued function. While the value of the integral does not depend on the starting value, going around the loop once always increases the imaginary part by 2π above its previous value. So while the fundamental theorem of calculus still works, we have to pay attention to which part of the multiple valued function is evaluated at each point. This example will be important later.

In applications that follow it will be necessary to compute bounds on the modulus of certain contour integrals. The simplest such bound follows


immediately from the definition:

I_n = Σ_{k=1}^{n} f(ζ_k)(z_k − z_{k−1}).   (170)

Using our standard inequalities

|I_n| ≤ Σ_{k=1}^{n} |f(ζ_k)||z_k − z_{k−1}|.   (171)

Since

|f(ζ_k)| ≤ |f|   (172)

where |f| is the maximum value of the modulus of f(z) on C, the above can be replaced by

|I_n| ≤ |f| Σ_{k=1}^{n} |z_k − z_{k−1}|.   (173)

But the sum is the length of a polygon inscribed in the curve, which is bounded by the length of the curve. If L is the length of the curve we get

|I_n| ≤ |f| L.   (174)

Since the right hand side is independent of n it follows that if the integral exists then

|∫_C f(z) dz| ≤ |f| L.   (175)

This inequality is called Darboux's inequality.

Analytic functions

A complex function f(z) is analytic at a point z = z_0 if there is a neighborhood of z_0 where f(z) is single valued and has a complex derivative. The set of points in the complex plane where f(z) is analytic is called the domain of analyticity of f(z). A complex function f(z) whose domain of analyticity is the entire complex plane is called an entire function. A point z_0 where f(z) is analytic is called a regular point of f(z). A point z_0 where f(z) is not analytic is called a singular point of f(z).


A singular point z_0 of f(z) is isolated if there is a neighborhood of z_0 where f(z) is analytic for z ≠ z_0.

Conformal Mapping

For a function f(x) of a single real variable, the condition that the equation y = f(x) can be solved for x = g(y) in a neighborhood of y_0 = f(x_0) is that

df/dx (x_0) ≠ 0   (176)

and df/dx is continuous in a neighborhood of x_0. This theorem is called the inverse function theorem. It can be proved using concepts from calculus.

The basic idea behind the proof is to note that the existence of a derivative at x = x_0 means that it is possible to write

y = f(x) = f(x_0) + (df/dx)(x_0)(x − x_0) + R(x_0, x)   (177)

where the remainder

R(x_0, x) = f(x) − f(x_0) − (df/dx)(x_0)(x − x_0)   (178)

is a known function of x and x_0 that vanishes as x → x_0. Using y_0 = f(x_0) and the assumption that

df/dx (x_0) ≠ 0   (179)

gives

x = x_0 + [1/(df/dx)(x_0)] [y − y_0 − R(x_0, x)].   (180)

Equation (180) is not a solution of the inverse problem because it has an x dependence on both sides; however when y is close to y_0 one expects that the remainder R(x_0, x) will be small. This suggests trying to solve this equation by iteration:

x_0(y) = x_0

x_1(y) = x_0 + [1/(df/dx)(x_0)] [y − y_0 − R(x_0, x_0)]


⋮

x_n(y) = x_0 + [1/(df/dx)(x_0)] [y − y_0 − R(x_0, x_{n−1}(y))].   (181)

The sequence {x_n(y)} is a sequence of functions of y that converges to the desired inverse function provided that the sequence is Cauchy. It can be shown that this sequence will be a Cauchy sequence for small enough |y − y_0| if df/dx (x) is non-zero and continuous in a neighborhood of x_0.

Clearly this construction can be generalized to complex functions. This implies that an analytic function with a non-zero complex derivative at z = z_0 can be interpreted as a locally invertible mapping from a neighborhood U of z_0 in the complex plane to a neighborhood V of z_0′ = f(z_0). The mapping can be expressed as a mapping from (x, y) to (x′, y′):

z′ = x′ + iy′ = f(z) = u(x, y) + iv(x, y).   (182)

Equating the real and imaginary parts gives

x′ = u(x, y),   y′ = v(x, y).   (183)

A mapping of this form, where f(z) is analytic in a neighborhood of z = z_0 with a non-zero derivative at z_0, is called a conformal transformation.
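The iteration (181) can be carried out numerically. The sketch below is my own illustration, not from the notes: it inverts f(z) = z^2 near z_0 = 1, i.e. it computes the square root of a y close to 1, using exactly the fixed-point update written above.

```python
def local_inverse(f, dfdz, z0, y, steps=40):
    """Iterate x_n = z0 + (y - y0 - R(z0, x_{n-1})) / f'(z0), with R from eq. (178)."""
    y0 = f(z0)
    R = lambda x: f(x) - y0 - dfdz(z0) * (x - z0)   # remainder term
    x = z0
    for _ in range(steps):
        x = z0 + (y - y0 - R(x)) / dfdz(z0)
    return x

f = lambda z: z**2
dfdz = lambda z: 2 * z
w = local_inverse(f, dfdz, 1.0 + 0.0j, 1.1 + 0.2j)
print(w, w**2)    # w**2 reproduces 1.1 + 0.2j, so w is the nearby square root
```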


0.8 Lecture 8

In general conformal mappings are transformations of N dimensional spaces into themselves that preserve angles between intersecting curves. I demonstrate that complex conformal transformations have this property.

Let

f(z) = u(x, y) + iv(x, y)   (184)

be a conformal transformation from a neighborhood U of z_0 to a neighborhood V of z_0′ = f(z_0). Let c_1(t) = x_1(t) + iy_1(t) and c_2(t) = x_2(t) + iy_2(t) be two curves in U that intersect at z_0 when t = 0. The images of these two curves are two curves in V that intersect at z_0′ when t = 0:

c_1′(t) = u(x_1(t), y_1(t)) + iv(x_1(t), y_1(t))   (185)

c_2′(t) = u(x_2(t), y_2(t)) + iv(x_2(t), y_2(t))   (186)

To show that both pairs of curves intersect at the same angle let

c_1(t) − z_0 = r_1(t) e^{iφ_1(t)}   (187)

c_1′(t) − z_0′ = r_1′(t) e^{iφ_1′(t)}   (188)

Computing the complex derivative at z = z_0 along the curve c_1(t) gives

dz′/dz (z_0) = lim_{t→0} [c_1′(t) − z_0′]/[c_1(t) − z_0] = lim_{t→0} [r_1′(t)/r_1(t)] e^{i(φ_1′(t) − φ_1(t))} = [r_1′(0)/r_1(0)] e^{i(φ_1′(0) − φ_1(0))}.   (189)

I can also compute this same derivative along the second curve. Since the value of the complex derivative is independent of how z_0 is approached it follows that

[r_1′(0)/r_1(0)] e^{i(φ_1′(0) − φ_1(0))} = [r_2′(0)/r_2(0)] e^{i(φ_2′(0) − φ_2(0))}.   (190)

Comparing the arguments of these two complex numbers gives

φ_1′(0) − φ_1(0) = φ_2′(0) − φ_2(0).   (191)

While it is possible to satisfy these equations by adding multiples of 2π, that does not change the position of the tangent line in the plane.


Equation (191) can be written as

φ_1′(0) − φ_2′(0) = φ_1(0) − φ_2(0)   (192)

which shows that the two curves in U and their images in V intersect with the same angle, φ_1(0) − φ_2(0).

Application: Heat equation

Consider the thermal energy of a two dimensional solid of volume V. If we ignore thermal expansion and assume that the heat capacity c per unit volume of the solid is constant, then the thermal energy of a small volume ∆V is ∆E = cT∆V. Integrating this over a finite volume V gives the thermal energy in the volume:

E = ∫_V cT dV.   (193)

If there are no sources of heat in the volume, the change in thermal energy in the volume is due to the heat transported across the boundary of the volume V. This can be expressed in terms of the heat current J by the equation

d/dt ∫ cT dV = −∮ J · dS.   (194)

The heat current is proportional to the temperature gradient, and points in the opposite direction of the gradient. This gives

J = −κ∇T   (195)

where the constant κ depends on the conductivity of the plate. Using the divergence theorem gives

d/dt ∫ cT dV = κ ∮ ∇T · dS = ∫ κ∇^2 T dV.   (196)

Since the difference

∫ [∂T/∂t − (κ/c)∇^2 T] dV   (197)


must vanish for all volumes, it follows that the temperature is a solution of the heat equation

∂T/∂t = (κ/c)∇^2 T.   (198)

Once the system has reached a steady state the left side of this equation becomes 0 and then the temperature is given by a solution of Laplace's equation subject to the appropriate boundary conditions. Conformal mapping techniques can be used to transform complex boundary conditions into simpler ones.

Let us assume that we have a plate in the quarter of the complex plane x > 0 and y > 0. Assume that the boundary along the y axis is maintained at temperature T = 0. On the x axis, for 0 ≤ x ≤ 1 the plate is insulated so ∂T/∂y = 0, and for x > 1 it is maintained at T = 1. When the system reaches a steady state the temperature of the plate will satisfy Laplace's equation

∂^2T/∂x^2 + ∂^2T/∂y^2 = 0,   x, y > 0   (199)

subject to the boundary conditions

T(0, y) = 0,   ∂T/∂y (x, 0) = 0 for 0 ≤ x ≤ 1,   T(x, 0) = 1 for x ≥ 1.   (200)

These boundary conditions are fairly complex. I show how this problem can be solved using conformal mapping techniques. Let z = x + iy and w = u + iv and let

z = sin(w)   (201)

x + iy = sin(u + iv) = sin(u) cosh(v) + i cos(u) sinh(v)   (202)

x = sin(u) cosh(v),   y = cos(u) sinh(v).   (203)

The boundary x = 0 corresponds to u = 0. The boundary component y = 0, 0 ≤ x ≤ 1 corresponds to v = 0 and 0 ≤ u ≤ π/2, and the region y = 0, x ≥ 1 corresponds to u = π/2 and v > 0. Thus the quarter plane is mapped into a half infinite strip of width π/2. The boundary conditions on the strip are simpler:

T = 0 for u = 0, v > 0   (204)


T = 1 for u = π/2, v > 0   (205)

∂T/∂v = 0 for 0 ≤ u ≤ π/2, v = 0.   (206)

A simple solution of Laplace's equation in this region subject to these boundary conditions is

T(u, v) = U(u, v) = (2/π) u.   (207)

The desired solution to the original problem is

T(x, y) = U(u(x, y), v(x, y)) = (2/π) u(x, y).   (208)

This means that we have to solve for u as a function of x and y. From (203)

x^2 = sin^2 u cosh^2 v,   y^2 = cos^2 u sinh^2 v   (209)

x^2/sin^2 u − y^2/cos^2 u = cosh^2 v − sinh^2 v = 1.   (210)

We can solve this for u(x, y):

1 = x^2/sin^2 u − y^2/cos^2 u = x^2/sin^2 u − y^2/(1 − sin^2 u).   (211)

This can be solved by multiplying through by (1 − sin^2 u) sin^2 u:

(1 − sin^2 u) sin^2 u = x^2 (1 − sin^2 u) − y^2 sin^2 u.   (212)

This can be expressed as a quadratic equation for sin^2 u:

(sin^2 u)^2 − (1 + x^2 + y^2)(sin^2 u) + x^2 = 0   (213)

sin^2 u = (1 + x^2 + y^2)/2 ± (1/2)√((1 + x^2 + y^2)^2 − 4x^2).   (214)

The root with the − sign vanishes when x = y = 0, which is consistent with x + iy = sin(u + iv). Using some more algebra it is possible to write the right hand side as

(1/4)(√((1 + x)^2 + y^2) − √((1 − x)^2 + y^2))^2.   (215)

This leads to

T(x, y) = (2/π) Arcsin[(1/2)(√((1 + x)^2 + y^2) − √((1 − x)^2 + y^2))].   (216)
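It is easy to verify the boundary conditions satisfied by (216) numerically. The following check is my own illustration, not part of the notes:

```python
from math import asin, sqrt, pi

def T(x, y):
    """Steady-state temperature (216) on the quarter plane x, y > 0."""
    s = 0.5 * (sqrt((1 + x)**2 + y**2) - sqrt((1 - x)**2 + y**2))
    return (2 / pi) * asin(s)

print(T(1e-9, 2.0))                  # ~ 0 on the boundary x = 0
print(T(3.0, 1e-9))                  # ~ 1 on the boundary y = 0, x > 1
print(T(0.5, 1e-9), T(0.5, 1e-3))    # insulated segment: T barely changes with y
```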


In this problem the solution to the transformed problem was the real part of

f(z′) = (2/π) z′ = (2/π)(u + iv).   (217)

Homographic transformations

The class of conformal transformations that arise from the bilinear relation

czz′ + dz′ − az − b = 0   (218)

will be used next semester to generate new solutions of the hypergeometric differential equation from existing solutions. These transformations are called homographic transformations. It is customary to write this relation in solved form

z′ = (az + b)/(cz + d).   (219)

Direct calculation shows that

dz′/dz = a/(cz + d) − (az + b)c/(cz + d)^2 = (ad − bc)/(cz + d)^2   (220)

which will be non-zero if ad − bc ≠ 0. It is conformal in the complex plane for z ≠ −d/c. One can add a point at infinity to the complex plane; then one can interpret z = −d/c as the point that gets mapped to complex infinity.

Homographic transformations can be generated from three elementary homographic transformations:

Translations

z → z′ = z + e   (221)

Changes of scale

z → z′ = ez   (222)

Inversions

z → z′ = 1/z   (223)


To use these to generate a general homographic transformation consider the following sequence of transformations

z → z_1 = z + d/c   (224)

z_1 → z_2 = c^2 z_1   (225)

z_2 → z_3 = 1/z_2   (226)

z_3 → z_4 = (bc − ad) z_3   (227)

z_4 → z′ = a/c + z_4   (228)

Making successive substitutions gives

z′ = a/c + z_4 = a/c + (bc − ad) z_3 = a/c + (bc − ad)(1/z_2) = a/c + (bc − ad)/(c^2 z_1)
   = a/c + (bc − ad)/(c^2(z + d/c)) = a/c + (bc − ad)/(c(cz + d))
   = [a(cz + d) + bc − ad]/(c(cz + d)) = (acz + ad + bc − ad)/(c(cz + d)) = (az + b)/(cz + d).   (229)

Translations, changes of scale, and inversions have the property that they map circles in the complex plane to circles (or to lines in the case of inversions). It is customary to consider a line as a circle through the point at infinity. These transformations all involve functions that are analytic in the complex plane (except for the inversion at z = 0). Analyticity can be established using the Cauchy-Riemann equations or by directly using the definitions.

Since each of the elementary transformations maps circles to circles, and every homographic transformation can be expressed as a product of a finite number of elementary transformations, it follows that every homographic transformation maps circles to circles. It remains to show that each of the elementary transformations maps circles to circles. The equation for a circle of radius R and center c in the complex plane is

|z − c| = R.   (230)


If z → z′ = z + e then

|z′ − (e + c)| = R   (231)

is a circle of radius R centered at e + c. If z → z′ = ez then

|z′ − ec| = R|e|   (232)

which is a circle with center ec and radius |e|R. I will leave it as a homework exercise to show that inversions also map circles to circles (circles through the origin get mapped to lines, i.e. circles through the point at infinity).
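The decomposition (224)-(228) can be checked directly. The snippet below is my own sketch, not from the notes; it applies the translation, scaling, inversion, scaling, and translation steps in turn and compares the result with (az + b)/(cz + d) for arbitrary coefficients with c ≠ 0 and ad − bc ≠ 0:

```python
def homographic(a, b, c, d, z):
    return (a * z + b) / (c * z + d)

def via_elementary(a, b, c, d, z):
    """Compose the elementary steps (224)-(228); requires c != 0 and ad - bc != 0."""
    z1 = z + d / c              # translation (224)
    z2 = c * c * z1             # change of scale (225)
    z3 = 1 / z2                 # inversion (226)
    z4 = (b * c - a * d) * z3   # change of scale (227)
    return a / c + z4           # translation (228)

a, b, c, d = 2 + 1j, -1.0, 1j, 3.0
z = 0.4 - 2.2j
print(homographic(a, b, c, d, z))
print(via_elementary(a, b, c, d, z))   # the two agree
```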


0.9 Lecture 9These is one more useful property of of homographic transformations. Considerthe composition of two successive homographic transformationsz ′ = az + bcz + dz ′′ = a′ z ′ + b ′c ′ z ′ + d ′ =az+ba′ + cz+d b′c ′ az+b+ =cz+d d′a ′ (az + b) + b ′ (cz + d)c ′ (az + b) + d ′ (cz + d) = (a′ a + b ′ c)z + (a ′ b + b ′ d)(c ′ a + d ′ c)z + (c ′ b + d ′ d)(233)This expression is another homographic transformation.To get more insight into this transformation consider the matrix product( ) ( ) ( ) ( a′′b ′′ a′bc ′′ d ′′ =′ a b ac ′ d ′ =′ a + b ′ c a ′ b + b ′ dc d c ′ a + d ′ c c ′ b + d ′ d)Inspection of these equations show that the transformation of the coefficientsof successive homographic transformations transform exactly like multiplicationof complex 2 × 2 matrices. The condition ab − cd ≠ 0 means that thematrices have inverses.We have a 1−1 correspondence between complex invertible 2×2 matricesand homographic transformations. Under this correspondence matrix multiplicationgets mapped into composition of homographic transformations.A group is a set S with a product · such that1 The product of two elements of S is an element of S.2 The product is associative: (a · b) · c = a · (b · c).3 There is an identity element e satisfying a · e = e · a = a for every a inS.4 Every element a of S has an inverse a −1 in S satisfying aa −1 = a −1 a = e.Both the complex 2×2 matrices with non-zero determinant under matrixmultiplication and the set of homographic transformations under composition45


of functions are groups. The set of complex 2 × 2 matrices with non-zero determinant under matrix multiplication is a group called GL(2, C) (the complex general linear group in 2 dimensions).

An invertible mapping φ from a group G_1 to a group G_2 that satisfies

φ(e_1) = e_2   (234)

φ(a · b) = φ(a) · φ(b)   (235)

is called a group isomorphism. What we have shown is that the group of homographic transformations in the complex plane is isomorphic to the group GL(2, C). The nice feature about this relation is that any property of GL(2, C) translates to a property of homographic transformations.

The Cauchy-Goursat Theorem:

The Cauchy-Goursat theorem is the most important theorem in complex analysis. The precise statement of the theorem is

Cauchy-Goursat Theorem: Let C denote a piecewise regular closed curve in the complex plane. Let f(z) be analytic on the curve C and within the region enclosed by C. Then

∫_C f(z) dz = 0.   (236)

Remarks:

1. The integral around a closed curve C is sometimes denoted by

∮_C f(z) dz.   (237)

2. Analyticity on C and in the interior are essential requirements.

3. Cauchy proved this assuming that df/dz was continuous. Goursat proved the theorem without this assumption.

To prove the Cauchy-Goursat theorem first note that for any integer n ≥ 0

z^n = [1/(n + 1)] d/dz z^{n+1}.   (238)


It follows that

∮_C z^n dz = [1/(n + 1)] ∮_C (d/dz) z^{n+1} dz = z_0^{n+1}/(n + 1) − z_0^{n+1}/(n + 1) = 0   (239)

where z_0 is the starting and ending point of the curve C. Since complex integration is a linear operation this result extends to any polynomial P(z) in z:

∮_C P(z) dz = 0.   (240)

Next consider the general case. Imagine putting a grid of small squares over the region enclosed by C. Because of the linearity of the integral, the integral around the closed curve can be expressed as a sum of integrals around the interior squares and the geometric figures bounded by parts of C and interior squares

∮_C f(z) dz = Σ_{n=1}^{N} ∮_{C_n} f(z) dz.   (241)

The orientation of the integrals is all taken in the same sense (counterclockwise or clockwise). This works because the integrals on common boundaries are in opposite directions and cancel. What remains are a number of integrals over segments that add up to give C. The triangle inequality gives

|∮_C f(z) dz| ≤ Σ_{n=1}^{N} |∮_{C_n} f(z) dz|.   (242)

The next step is to show that this sum can be made as small as desired. To do this choose an arbitrary ε > 0. Let A be the set consisting of C and the part of the complex plane enclosed by C. The existence of a derivative of f(z) at z_0 ∈ A means that

[f(z) − f(z_0)]/(z − z_0) = df/dz (z_0) + r(z, z_0)   (243)

where the remainder r(z, z_0) → 0 as z → z_0. Specifically, for every ε > 0 there is a δ such that |z − z_0| < δ implies |r(z, z_0)| ≤ ε.

Subdivide A into a grid with sides of length l plus boundary pieces. Ask if in the intersection of A with a given square of side l there is a z_0 such that


|r(z, z_0)| < ε for every z in this set. If a square fails to have this property, subdivide the edges by a factor of two and ask the same question again.

Either this process terminates, and there is a finite square size where this inequality holds on all squares that intersect A, or there is an infinite sequence of nested smaller squares where in each square there is no z_0 for which this holds for every z in the square. The intersection of this nested sequence of squares is a single point z_∞ that is in A and in all of the squares in this sequence. This point has the property that for any l > 0 there is a point z satisfying |z − z_∞| ≤ √2 l where |r(z, z_∞)| > ε. The point z_∞ is in the set A, since it is in every one of the nested sets, each of which is in A.

It follows that no matter how small we choose δ, at z_∞ there is at least one z with |z − z_∞| < δ such that

|[f(z) − f(z_∞)]/(z − z_∞) − df/dz (z_∞)| > ε   (244)

which contradicts the assumption that f(z) is differentiable at z_∞ ∈ A. What this argument shows is that for every positive ε there is a finite grid size l such that in each square there is a z_0 for which |z − z_0| ≤ √2 l implies |r(z, z_0)| < ε.


29:171 - Homework Assignment #3

1. Let f(z) = e^z. Let C be the curve in the complex plane that starts at the origin, z = 0, goes along the positive real axis to the point z = 2, and then proceeds in a straight line in the positive imaginary direction from z = 2 to z = 2 + 3i. Calculate the contour integral

∫_C f(z) dz

directly. Check your answer by noting that df/dz (z) = f(z).

2. Show that the conformal mapping

z → z′ = 1/z

maps the circle

|z − a| = r

into another circle. Find the center and radius of the transformed circle. Determine the condition for the transformed circle to become a line (i.e. a circle of infinite radius).

3. Show that any real valued analytic function is a constant.

4. If C is the circle |z| = 1, calculate the line integral (in the counterclockwise direction)

∫_C dz/z.

5. Consider the homographic transformation

z′ = (az + b)/(cz + d),   bc − ad ≠ 0.

Calculate the inverse transformation. Is it homographic?

6. Use Darboux's theorem to put a bound on the integral

|∫_C sin(z) dz|

where C is the circle |z| = 5.


0.10 Lecture 10

We are now in a position to complete the proof of the Cauchy-Goursat theorem. Let z_{0n} be a point in each square where the remainder term is bounded by a fixed ε and note

f(z) = f(z_{0n}) + df/dz (z_{0n})(z − z_{0n}) + r(z, z_{0n})(z − z_{0n}).   (245)

Using (242) with (245) along with the triangle inequality again gives

|∮_C f(z) dz| ≤ Σ_{n=1}^{N} |∮_{C_n} [f(z_{0n}) + df/dz (z_{0n})(z − z_{0n})] dz| + Σ_{n=1}^{N} |∮_{C_n} r(z, z_{0n})(z − z_{0n}) dz|.   (246)

The integrands in the first term are polynomials, which have been shown to have vanishing integral.

Darboux's theorem can be used to put upper bounds on the integrals over the small squares. The path length of C_n is 4l for interior squares, and less than 4l + s_n for the boundary pieces, where s_n is the length of the part of C in the square and l is the edge length of a square. The integrand is bounded by

|r(z, z_{0n})(z − z_{0n})| ≤ √2 l ε.   (247)

This means that the nth integral is bounded by

|∮_{C_n} r(z, z_{0n})(z − z_{0n}) dz| ≤ ε √2 l (4l + s_n).   (248)

Summing over all squares

Σ ε √2 · 4l^2 ≤ 4√2 ε A   (249)


where A is the area of a rectangle bounding the curve C, and

Σ ε √2 l s_n ≤ ε √2 l S   (250)

where S is the length of C. This gives

|∮_C f(z) dz| ≤ ε (4√2 A + √2 l S).   (251)

Since ε was arbitrary, the right hand side can be taken smaller than any given positive quantity. This requires

∮_C f(z) dz = 0   (252)

which completes the proof of the Cauchy-Goursat theorem.

I will now use Cauchy's theorem to prove some useful results. For the first result assume that the closed curve C in Cauchy's theorem can be expressed as a sum of two curves, C_1 and C_2. By this I mean that the end of C_1 is the beginning of C_2, and the end of C_2 is the beginning of C_1. It follows that

0 = ∮_C f(z) dz = ∫_{C_1+C_2} f(z) dz = ∫_{C_1} f(z) dz + ∫_{C_2} f(z) dz = ∫_{C_1} f(z) dz − ∫_{−C_2} f(z) dz   (253)

where −C_2 indicates the curve that follows the same path as C_2 but goes in the opposite direction. It has the same start and end points as C_1. Therefore

∫_{C_1} f(z) dz = ∫_{−C_2} f(z) dz.   (254)

This shows that the integral of an analytic function along any two paths in the complex plane with the same starting and end points is independent of path, provided f(z) is analytic in the region bounded by both curves.

In developing further consequences of Cauchy's theorem it is useful to introduce a definition.

Definition: A region R in the complex plane is called simply connected if the interior of every closed curve in R contains only points of R.


Definition: A region R in the complex plane is called multiply connected if it is not simply connected.

Theorem: Let f(z) be analytic throughout a simply connected region R, let C be a closed regular curve in R, and let z be a point not on C ({z} ∩ C = ∅). Then

(1/2πi) ∮_C f(z′)/(z′ − z) dz′ = { f(z) : z interior to C;   0 : z exterior to C }   (255)

Proof: 1/(z′ − z) is analytic as a function of z′ except at the point z′ = z. The product f(z′)/(z′ − z) is therefore a product of functions analytic in z′, except when z′ = z. If z ∉ R then f(z′)/(z′ − z) is analytic on C and at every point interior to C, so by Cauchy's theorem

∮_C f(z′)/(z′ − z) dz′ = 0.   (256)

The same result follows if z ∈ R but strictly outside of the curve C.

If z is in the interior of C, using Cauchy's theorem it is possible to replace C by a small circle of radius r about z such that the circle is bounded by C. This can be done by drawing a line from a point on the circle to a point on C that does not intersect z. The path that starts at the line, goes around C, extends along the line, goes around the small circle in the opposite direction, and returns to the starting point along the line does not bound z, so the value of this integral is zero. The two integrals along the line are in opposite directions and cancel. The integral over C and the integral around the circle in the opposite direction therefore cancel; if the direction of the integral around the circle is reversed, the sign changes and it must be identical to the integral around C in the same direction.

Let ⊙ denote the circle. Then

(1/2πi) ∮_C f(z′)/(z′ − z) dz′ = (1/2πi) ∮_⊙ f(z′)/(z′ − z) dz′
   = [f(z)/2πi] ∮_⊙ 1/(z′ − z) dz′ + (1/2πi) ∮_⊙ [f(z′) − f(z)]/(z′ − z) dz′.   (257)


Since $f(z')$ is analytic at $z$ it is continuous at $z$. This means that for every $\epsilon > 0$ there is an $r > 0$ such that $|z' - z| < r \Rightarrow |f(z) - f(z')| < \epsilon$. Choose the radius of $\odot$ to be $r$. Then by Darboux's theorem

$$\Big|\frac{1}{2\pi i}\oint_{\odot} \frac{f(z') - f(z)}{z' - z}\,dz'\Big| \le \frac{1}{2\pi}\,\frac{\epsilon}{r}\,2\pi r = \epsilon \qquad (258)$$

On the other hand, if we let $z' = z + r e^{i\varphi}$,

$$\frac{f(z)}{2\pi i}\oint_{\odot} \frac{1}{z' - z}\,dz' = \frac{f(z)}{2\pi i}\int_0^{2\pi} \frac{1}{r e^{i\varphi}}\, i r e^{i\varphi}\,d\varphi = f(z) \qquad (259)$$

What we have shown is that for any $\epsilon > 0$ we have

$$\Big|f(z) - \frac{1}{2\pi i}\oint_C \frac{f(z')}{z' - z}\,dz'\Big| < \epsilon \qquad (260)$$

where $\epsilon > 0$ is arbitrary. Since the left hand side is independent of $\epsilon$ it follows that

$$f(z) = \frac{1}{2\pi i}\oint_C \frac{f(z')}{z' - z}\,dz' \qquad (261)$$

as desired.

Note that implicit in the proof of this result is the assumption that $C$ is a counterclockwise curve. If $C$ were traversed in the opposite direction the formula would have a minus sign.


0.11 Lecture 11

The implication of this theorem is that if $f(z)$ is known to be analytic in a simply connected region, then knowledge of $f(z)$ on a curve uniquely fixes the value of $f(z)$ at all points interior to the curve.

The values of $f(z)$ on the curve can be thought of as boundary conditions that determine the solution of Laplace's equation in the interior.

The Cauchy integral formula can be used to prove that analytic functions are infinitely differentiable. To show this let $C$ be a regular curve in a simply connected region. Let $g(z)$ be continuous on $C$ and on the interior of $C$. Define the function

$$f(z) = \frac{1}{2\pi i}\oint_C \frac{g(z')}{z' - z}\,dz' \qquad (262)$$

Consider the difference

$$\Delta := \frac{f(z + \Delta z) - f(z)}{\Delta z} - \frac{1}{2\pi i}\oint_C \frac{g(z')}{(z' - z)^2}\,dz' \qquad (263)$$

The assumption that $z \notin C$ implies

$$|\Delta| = \Big|\frac{1}{2\pi i}\oint_C \frac{g(z')}{\Delta z}\Big[\frac{1}{z' - z - \Delta z} - \frac{1}{z' - z} - \frac{\Delta z}{(z' - z)^2}\Big]dz'\Big| \qquad (264)$$

Elementary algebra implies

$$\frac{1}{\Delta z}\Big[\frac{1}{z' - z - \Delta z} - \frac{1}{z' - z} - \frac{\Delta z}{(z' - z)^2}\Big] = \frac{\Delta z}{(z' - z)^2 (z' - z - \Delta z)}. \qquad (265)$$

Using (265) in (264) gives

$$|\Delta| = \frac{|\Delta z|}{2\pi}\Big|\int_C \frac{g(z')}{(z' - z)^2 (z' - z - \Delta z)}\,dz'\Big|. \qquad (266)$$

The integral is bounded by Darboux's theorem, so $|\Delta| \to 0$ as $|\Delta z| \to 0$. This shows that $\frac{df}{dz}(z)$ exists for all $z$ in the interior of the region bounded by $C$. More important, it gives an integral expression for the derivative:

$$\frac{df}{dz}(z) = \frac{1}{2\pi i}\oint_C \frac{g(z')}{(z' - z)^2}\,dz' \qquad (267)$$
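The following numerical sketch is not part of the original notes. It spot-checks the Cauchy integral formula (261) and the derivative formula (267) by discretizing a circular contour; the test function $e^z$, the contour radius, and the sample point are all arbitrary illustration choices.

```python
# Not part of the original notes: a minimal numerical check of eqs. (261) and
# (267), approximating the contour integral on a circle with the trapezoidal rule.
import numpy as np

def contour_integral(h, z0=0.0, radius=2.0, n=2000):
    """Approximate (1/(2*pi*i)) * integral of h(z') dz' over the circle |z'-z0| = radius."""
    phi = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    zp = z0 + radius * np.exp(1j * phi)        # points z' on the contour
    dzp = 1j * radius * np.exp(1j * phi)       # dz'/dphi
    return np.sum(h(zp) * dzp) * (2.0 * np.pi / n) / (2.0j * np.pi)

f = np.exp
z = 0.3 + 0.2j   # a point interior to the contour |z'| = 2

value = contour_integral(lambda zp: f(zp) / (zp - z))          # eq. (261)
deriv = contour_integral(lambda zp: f(zp) / (zp - z) ** 2)     # eq. (267)

print(value, f(z))    # both close to exp(z)
print(deriv, f(z))    # derivative of exp is exp, so again close to exp(z)
```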


Now we use <strong>math</strong>ematical induction to show that all derivatives of f(z)exist in this region. The induction assumption is that the n − th derivativeexists and is given byf (n) (z) := dn f n!(z) =dzn 2πi∮Cg(z ′ )(z ′ − z) n+1 dz′ (268)Assuming (268) I show that the same equation holds for n → n + 1.Consider∆ = f (n) (z + ∆z) − f (n) ∮(z) (n + 1)! g(z ′ )−∆z2πi C (z ′ − z) n+2 dz′ =∮n!g(z ′ 1)[2πi C (z ′ − z − ∆z) − 1n+1 (z ′ − z) − n + 1n+1 (z ′ − z) n+2 ]dz′ (269)To estimate this note(11(z ′ − z − ∆z) = n+1 z ′ − z + 1) n+1z ′ − z ∆z 1=z ′ − z − ∆z∑n+1k=0∑n+1k=2( ) n+1−k(n + 1)! 1 1k!(n + 1 − k)! (z ′ − z) k z ′ − z ∆z 1=z ′ − z − ∆z1 (n + 1)+(z ′ − z)n+1(z ′ − z) ∆z 1n+1 z ′ − z − ∆z +(n + 1)!k!(n + 1 − k)!(1 1(z ′ − z) k z ′ − z ∆z 1z ′ − z − ∆z) n+1−k(270)Using (270) in (269) gives∆ =∮n!g(z ′ (n + 1) 1)[2πi C (z ′ − z) n+1 z ′ − z − ∆z − n + 1(z ′ − z) + n+2∑n+1( ) n+1−k(n + 1)! 1 1 1k!(n + 1 − k)! (z ′ − z) k ∆z z ′ − z ∆z 1]dz ′ (271)z ′ − z − ∆zk=2The last term is of the form ∆z times a quantity that is bounded by Darboux’stheorem. The other two terms can be estimated by noting(n + 1)(z ′ − z) n+1 1z ′ − z − ∆z − n + 1(z ′ − z) n+2 =55


$$\frac{n+1}{(z' - z)^{n+1}}\Big[\frac{1}{z' - z - \Delta z} - \frac{1}{z' - z}\Big] = \frac{\Delta z\,(n+1)}{(z' - z)^{n+2}(z' - z - \Delta z)} \qquad (272)$$

This leads to another term of the form $\Delta z$ multiplied by a quantity bounded by Darboux's theorem. Letting $|\Delta z| \to 0$ shows that the $(n+1)$st derivative exists and reproduces the formula assumed in the induction assumption.

Thus, unlike the calculus of real functions, if a function is analytic in a region then all complex derivatives of the function exist in that region.

Equation (262) is an example of an integral representation of an analytic function. This result can be generalized as follows:

Theorem: Given the integral representation

$$f(z) := \int_C K(z, z')\,g(z')\,dz'; \qquad z \in R \qquad (273)$$

the complex derivative of $f(z)$ exists and is given by

$$\frac{df}{dz}(z) = \int_C \frac{\partial K(z, z')}{\partial z}\,g(z')\,dz'; \qquad z \in R \qquad (274)$$

provided the following conditions are satisfied:

1. For $z \in R$, $K(z, z')$ is an analytic function of $z$ for every $z' \in C$.

2. For $z \in R$, $K(z, z')\,g(z')$ is a continuous function of $z'$.

To prove this, note that the analyticity in $z$ allows us to use the Cauchy integral formula

$$K(z, z') = \frac{1}{2\pi i}\int_{C'} \frac{K(t, z')}{t - z}\,dt, \qquad C' \subset R \qquad (275)$$

Next we use the definition along with the above to write

$$f(z) = \frac{1}{2\pi i}\int_C \Big[\int_{C'} \frac{K(t, z')}{t - z}\,dt\Big] g(z')\,dz'; \qquad z \in R \qquad (276)$$

Since $\frac{K(t, z')}{t - z}\,g(z')$ is continuous for $z \neq t$, the order of the integrals can be interchanged (recall that these two complex integrals can be converted to ordinary


integrals where continuity of the integrand is sufficient to interchange theorder of the integrals:f(z) := 1 ∫ ∫K(t, z ′ )g(z ′ )dz ′ ]dt; z ∈ R (277)2πi t − zC ′ [CUsing (267) gives∫ ∫df 1 K(t, z ′ )g(z ′ )(z) := [dz ′ ]dt;dz 2πi C ′ C (t − z) 2 z ∈ R (278)Note that (267) gives∂K(z, z ′ )∂z= 1 ∫K(t, z ′ )2πi C (t − z) dt ′ 2 C′ ⊂ R (279)If we interchange the order of integration in (279) and use (278) we get∫df 1 ∂K(z, z ′ )(z) := g(z ′ )dz ′ (280)dz 2πi ∂zC57


0.12 Lecture 12

Next I derive some properties of analytic functions.

Theorem 12.1: The modulus of an analytic function cannot have a local maximum within the region of analyticity.

Proof: Let $z_0$ be a regular point of $f(z)$. Then for small enough $r$, $f(z)$ is analytic in the region $|z - z_0| \le r$. Denote the circle $|z - z_0| = r$ by $\odot$. The Cauchy integral formula implies

$$f(z_0) = \frac{1}{2\pi i}\int_{\odot} \frac{f(z)}{z - z_0}\,dz. \qquad (281)$$

Applying Darboux's inequality to this integral gives

$$|f(z_0)| \le \max_{z \in \odot} |f(z)|. \qquad (282)$$

It follows that there must be at least one point $z$ on $\odot$ where $|f(z)| \ge |f(z_0)|$. This must also be true on every circle of radius $r' < r$, since $r$ was chosen arbitrarily. This means that there is no neighborhood of $z_0$ where $|f(z_0)|$ is a local maximum. This proves the theorem.

If we let $g(z) = 1/f(z)$ and note that $g(z)$ is analytic in a neighborhood of $z = z_0$ provided $f(z_0) \neq 0$, then $|g(z_0)|$ cannot be a local maximum of $|g(z)|$. This is equivalent to saying that $|f(z_0)|$ cannot be a local minimum.

The same applies separately to the real and imaginary parts of an analytic function, since

$$|e^{f(z)}| = e^{\mathrm{Re}\,f(z)}, \qquad |e^{-i f(z)}| = e^{\mathrm{Im}\,f(z)} \qquad (283)$$

and $e^x$ is increasing.

Theorem 12.2: A bounded entire function must be constant.

Proof: Since $f(z)$ is entire it is possible to write

$$\frac{df}{dz}(z) = \frac{1}{2\pi i}\int_{\odot} \frac{f(z')}{(z' - z)^2}\,dz'. \qquad (284)$$

Choosing $\odot$ to have a radius $R$ and using Darboux's theorem gives

$$\Big|\frac{df}{dz}(z)\Big| \le \frac{\max_{z' \in \odot}|f(z')|}{R}. \qquad (285)$$


Since the right hand side can be made as small as desired by increasing $R$, and the left hand side is independent of $R$, it follows that

$$\frac{df}{dz}(z) = 0 \qquad (286)$$

which means that $f(z) = $ constant.

The implication of this result is that any non-constant analytic function that is bounded at $\infty$ necessarily has a singularity somewhere in the complex plane.

By looking at higher derivatives this result can be generalized to show that entire functions bounded by polynomials have to be polynomials. This will be the subject of homework.

Cauchy's theorem also has a converse, called Morera's theorem.

Theorem 12.3 (Morera's Theorem): If the integral

$$\int_C f(z)\,dz \qquad (287)$$

of a continuous function $f(z)$ vanishes for every closed contour $C$ in a region $R$, then $f(z)$ is analytic in $R$.

Proof: The proof of Morera's theorem is by construction. Define the function

$$F(z) = \int_a^z f(z')\,dz'. \qquad (288)$$

Since the integral is independent of path, this function is well defined and depends only on the choice of $z$ and $a$ in $R$. Compute the derivative of $F(z)$ directly from the definition:

$$\frac{F(z + \Delta z) - F(z)}{\Delta z} = \frac{1}{\Delta z}\Big[\int_a^{z + \Delta z} f(z')\,dz' - \int_a^z f(z')\,dz'\Big] = \frac{f(z)}{\Delta z}\int_z^{z + \Delta z} dz' + \frac{1}{\Delta z}\int_z^{z + \Delta z} [f(z') - f(z)]\,dz'. \qquad (289)$$

The first term is $f(z)$, while Darboux's theorem implies that the second is bounded by

$$\max_{|z' - z| \le |\Delta z|} |f(z') - f(z)| \qquad (290)$$

which vanishes in the limit $|\Delta z| \to 0$ by the assumed continuity of $f(z)$. This shows that

$$f(z) = \frac{dF}{dz}(z). \qquad (291)$$


Since $F(z)$ is analytic, all of its derivatives exist, and it follows that $f(z) = \frac{dF}{dz}(z)$ is analytic.

Next I discuss manipulations of sequences of analytic functions. Consider a sequence of analytic functions $f_1(z), f_2(z), \cdots$ defined in some region such that

$$f(z) = \sum_{n=1}^{\infty} f_n(z) \qquad (292)$$

converges uniformly for all $z$ on some curve $C$. Then, assuming all of the integrals exist,

$$\int_C f(z)\,dz = \int_C \sum_{n=1}^{\infty} f_n(z)\,dz = \sum_{n=1}^{\infty} \int_C f_n(z)\,dz \qquad (293)$$

This means that whenever we have uniform convergence the order of the sum and integral can be interchanged. To prove this let

$$s_n(z) = \sum_{k=1}^{n} f_k(z) \qquad (294)$$

be the partial sum consisting of the first $n$ terms.

From Darboux's theorem

$$\Big|\int_C (f(z) - s_n(z))\,dz\Big| \le \max_{z \in C}|f(z) - s_n(z)|\, L \qquad (295)$$

where $L$ is the length of the curve. Uniform convergence of the series means that for a given $\epsilon$ it is possible to choose $n$ large enough so that

$$|f(z) - s_n(z)| < \epsilon \qquad (296)$$

independent of $z$. On the other hand

$$\int_C s_n(z)\,dz = \sum_{k=1}^{n} \int_C f_k(z)\,dz \qquad (297)$$

which shows that

$$\Big|\int_C f(z)\,dz - \sum_{k=1}^{n} \int_C f_k(z)\,dz\Big| \to 0 \qquad (298)$$

as $n \to \infty$. This means

$$\int_C f(z)\,dz = \sum_{k=1}^{\infty} \int_C f_k(z)\,dz. \qquad (299)$$
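The following is not part of the original notes; it is a small numerical illustration of eq. (299) for a series whose closed form is known. The choice of the geometric series and the endpoint $w$ are arbitrary.

```python
# Not in the original notes: termwise integration of a uniformly convergent series.
# On |z| <= 1/2 the geometric series sum_n z^n converges uniformly to 1/(1-z), so
# integrating term by term along a segment from 0 to w should reproduce -log(1-w).
import numpy as np

w = 0.4 + 0.25j                       # endpoint of the integration path, |w| < 1
exact = -np.log(1.0 - w)              # integral of 1/(1-z) from 0 to w

termwise = sum(w ** (n + 1) / (n + 1) for n in range(200))  # integral of z^n is z^(n+1)/(n+1)

print(exact, termwise)                # the two values agree to many digits
```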


29:171 - Homework Assignment #41. Let f(z) and g(z) be entire functions. Assume that they agree on aline segment of the real axis. Show that they agree for all z.2. If f(z) is analytic and non-vanishing in a region R, and continuous inR and its boundary, show that |f| assumes its minimum and maximumvalues on the boundary of R.3. Show that if f(z) is entire and |f(z)| < C|z n | for sufficiently largevalues of |z|, where C is a constant, then f(z) must be a polynomial ofdegree ≤ n.4. Show that any non-constant polynomial in z has at least one complexroot.5. Let f(t) be continuous for t ∈ [a, b]. ShowF (z) =∫ bae izt f(t)dtis an entire function. Calculate the derivative.6. Prove that for charge free two-dimensional space the value of the electrostaticpotential at any point is equal to the average of the potentialover the circumference of any circle centered on the point. You mayassume that the potential is the real part of an analytic function (theelectrostatic potential in a charge free region is a solution of Laplace’sequation).61


0.13 Lecture 13

We can use the previous result to show that $f(z)$ is analytic. To show this apply Morera's theorem to the previous result:

$$\oint_C f(z)\,dz = \sum_{n=1}^{\infty} \oint_C f_n(z)\,dz = 0 \qquad (300)$$

for any closed curve in $R$. This implies that $f(z)$ is analytic.

Since the function $f(z)$ is analytic it follows that

$$\frac{df}{dz}(z_0) = \frac{1}{2\pi i}\oint_C \frac{f(z)}{(z - z_0)^2}\,dz \qquad (301)$$

By the uniform convergence of the partial sums of the functions $f_n$ we have uniform convergence of the series on $C$ for an interior point $z_0$ of $C$:

$$\frac{1}{2\pi i}\,\frac{f(z)}{(z - z_0)^2} = \sum_{n=1}^{\infty} \frac{1}{2\pi i}\,\frac{f_n(z)}{(z - z_0)^2} \qquad (302)$$

If we integrate this it follows that

$$\frac{1}{2\pi i}\int_C \frac{f(z)}{(z - z_0)^2}\,dz = \sum_{n=1}^{\infty} \frac{1}{2\pi i}\int_C \frac{f_n(z)}{(z - z_0)^2}\,dz \qquad (303)$$

which is equivalent to the convergence of

$$\frac{df}{dz}(z_0) = \sum_{n=1}^{\infty} \frac{df_n}{dz}(z_0) \qquad (304)$$

This generalizes to all higher derivatives by induction.

The two important lessons from these exercises are, first, that it is always possible to change the order of sums and integrals provided the sum converges uniformly on the path of integration (and the path has a finite length). The second important observation is that if a series of analytic functions converges uniformly, the resulting function is analytic, and the series of derivatives of the terms converges to the derivative of the function.


converges uniformly for all points z ′ on ⊙. This givesf(z) = 1 ∫f(z ′ )2πi ⊙∞∑n=0(z − z 0 ) n(z ′ − z 0 ) n+1 (311)Because of the uniform convergence we can interchange the order of the sumand integral to getf(z) =∞∑∫1 n!n! 2πin=0where we have previously shown thatd n fdz (z 0) = n! ∫n 2πiwhich givesf(z) =∞∑n=0⊙⊙f(z ′ ) (z − z 0) n(z ′ − z 0 ) n+1 (312)f(z ′ )(z ′ − z 0 ) n+1 dz′ (313)1 d n fn! dz (z 0)(z − z n 0 ) n (314)Which is exactly the Taylor series about z 0 .Note that we have shown that this series converges for z in the interior of⊙. The circle ⊙ was only required to be in the region of analyticity. It canbe replaced by the largest circle centered at z 0 that is still in the region ofanalyticity and by the above proof the Taylor series will converge providedz is in this larger circle.Taylor’s theorem has a converse. Assume thatf(z) =∞∑a n (z − z 0 ) n (315)n=0is absolutely convergent in a neighborhood of z 0 . By the ratio test convergencefails if| a n+1r n+1a n r n | > 1 (316)as n → ∞. Convergence requires that |a n r n | < A where A is a finite constant(if this is not true the ratio test fails). This means that there is a constantA with the property|a n | ≤ A r n . (317)64


Choose $z$ so that $|z - z_0| < r$. Then

$$\Big|\sum_{n=0}^{\infty} a_n (z - z_0)^n\Big| \le \sum_{n=0}^{\infty} |a_n (z - z_0)^n| \le \qquad (318)$$

$$\sum_{n=0}^{\infty} A \Big|\frac{z - z_0}{r}\Big|^n \le \frac{A}{1 - \big|\frac{z - z_0}{r}\big|}. \qquad (319)$$

This means that the convergence is uniform for $|z - z_0| < r$.

It follows that the integral of $f(z)$ around any closed curve in the region bounded by a circle of radius $r$ about $z_0$ can be computed by interchanging the order of the sum and integral:

$$\oint_C f(z)\,dz = \sum_{n=0}^{\infty} a_n \oint_C (z - z_0)^n\,dz = 0 \qquad (320)$$

We can now apply Morera's theorem to conclude that $f(z)$ is analytic for $z$ inside the circle of radius $r$.

What we have shown is that, at least locally, any analytic function behaves like a convergent series in powers of $z$, and conversely any convergent power series defines an analytic function.

Next I consider a generalization of Taylor's theorem that can be applied to a function that is analytic in a region $R$ bounded by two concentric circles. The function does not have to be analytic inside the inner circle.

Pick a point $z \in R$ that lies between the two circles. Consider a curve consisting of the large circle, the small circle, and a small circle between these two circles that surrounds $z$. Include pairs of lines that connect the outer circle to the small circle around $z$ and the inner circle to that small circle. The combined path lies in $R$, and the integral

$$\oint \frac{f(z')}{z' - z}\,dz' = 0 \qquad (321)$$

by Cauchy's theorem.

Since the integrals over the lines cancel, this means that the integral over the circle around $z$ can be expressed as the difference of the integral over the large circle minus the integral over the small circle:

$$\oint_{\odot_z} \frac{f(z')}{z' - z}\,dz' = \oint_{\odot_L} \frac{f(z')}{z' - z}\,dz' - \oint_{\odot_S} \frac{f(z')}{z' - z}\,dz' \qquad (322)$$


The integral around the circle around $z$ is $2\pi i f(z)$, so we have

$$f(z) = \frac{1}{2\pi i}\oint_{\odot_L} \frac{f(z')}{z' - z}\,dz' - \frac{1}{2\pi i}\oint_{\odot_S} \frac{f(z')}{z' - z}\,dz' \qquad (323)$$

Let $z_0$ be the center of the two concentric circles and write the above as

$$f(z) = \frac{1}{2\pi i}\oint_{\odot_L} \frac{f(z')}{z' - z_0 + z_0 - z}\,dz' - \frac{1}{2\pi i}\oint_{\odot_S} \frac{f(z')}{z' - z_0 + z_0 - z}\,dz' \qquad (324)$$

On the large circle $|z' - z_0| > |z - z_0|$, while on the small circle $|z - z_0| > |z' - z_0|$. Use these inequalities to write the above as

$$f(z) = \frac{1}{2\pi i}\oint_{\odot_L} \frac{f(z')}{z' - z_0}\,\frac{1}{1 - \frac{z - z_0}{z' - z_0}}\,dz' - \frac{1}{2\pi i}\oint_{\odot_S} \frac{f(z')}{z_0 - z}\,\frac{1}{1 - \frac{z_0 - z'}{z_0 - z}}\,dz' \qquad (325)$$

Using the uniform convergence of the geometric series this becomes

$$f(z) = \frac{1}{2\pi i}\oint_{\odot_L} \frac{f(z')}{z' - z_0}\sum_{n=0}^{\infty}\Big(\frac{z - z_0}{z' - z_0}\Big)^n dz' - \frac{1}{2\pi i}\oint_{\odot_S} \frac{f(z')}{z_0 - z}\sum_{n=0}^{\infty}\Big(\frac{z_0 - z'}{z_0 - z}\Big)^n dz' \qquad (326)$$

Since the sums are uniformly convergent on the respective circles we can interchange the order of the sum and the integral, and the result still converges:

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n + \sum_{n=0}^{\infty} b_{n+1} (z - z_0)^{-n-1} = \sum_{n=0}^{\infty} a_n (z - z_0)^n + \sum_{n=1}^{\infty} \frac{b_n}{(z - z_0)^n} \qquad (327)$$

where

$$a_n = \frac{1}{2\pi i}\oint_{\odot_L} \frac{f(z')}{(z' - z_0)^{n+1}}\,dz' \qquad (328)$$

$$b_{n+1} = \frac{1}{2\pi i}\oint_{\odot_S} f(z')\,(z' - z_0)^n\,dz' \qquad (329)$$

The series is called the Laurent series. It is clear from the proof that it converges when $z$ lies between any pair of concentric circles in $R$.
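Before moving on, here is a numerical spot check of the coefficient formulas (328)-(329); it is not part of the original notes, and the test function and contour are arbitrary choices.

```python
# Not part of the original notes: a check of the Laurent coefficient formulas.
# For f(z) = 1/(z(1-z)) in the annulus 0 < |z| < 1 about z0 = 0 the Laurent
# series is 1/z + 1 + z + z^2 + ..., so a_n = 1 for n >= 0 and b_1 = 1.
import numpy as np

def circle_integral(h, radius=0.5, n=4000):
    """(1/(2*pi*i)) * integral of h(z') dz' over the circle |z'| = radius."""
    phi = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    zp = radius * np.exp(1j * phi)
    dzp = 1j * zp
    return np.sum(h(zp) * dzp) * (2.0 * np.pi / n) / (2.0j * np.pi)

f = lambda z: 1.0 / (z * (1.0 - z))

a = [circle_integral(lambda zp, k=k: f(zp) / zp ** (k + 1)) for k in range(4)]
b1 = circle_integral(lambda zp: f(zp))          # b_1 is the residue at 0

print(np.round(a, 6))   # approximately [1, 1, 1, 1]
print(np.round(b1, 6))  # approximately 1
```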


What is different is that the convergence of the Laurent series is notrestricted to simply connected sets. As in the case of Taylor’s theorem, theproof of the theorem gives integral formulas to compute the coefficients ofthe series in terms of the function f(z).As an example, the function f(z) = 1/z is analytic except at z = 0. Ifwe consider a point z 0 ≠ this function has a convergent Taylor series for allz satisfying |z − z 0 | < |z 0 |. On the other hand it has a Laurent series thatconverges everywhere in the complex plane except at z = 0. Bu using theLaurent series the region where the series converges is much bigger.We have primarily been considering functions that are analytic. Thepoints where the function is not analytic are called singularities. It is usefulto discuss both the structure of 0 ′ s and the singularities of analytic functions.Definition: An analytic function f(z) has a zero of order n at z 0 if0 = f(z 0 ) = dfdz (z 0) = d2 fdz 2 (z 0) = · · · dn−1 fdz n−1 (z 0) (330)andd n fdz (z 0) ≠ 0 (331)nIf f(z) has a zero of order n then Taylor’s theorem impliesf(z) = (z − z 0 ) n h(z) (332)where h(z) is analytic and does not vanish at z = z 0 . By continuity it followsthat f(z) must be non-vanishing in a neighborhood of z 0 .Note also that if the function has a zero of infinite order then the functionmust be identically zero by Taylor’s theorem. This means that the zeros ofall analytic functions, except the zero function, are isolated.If f(z) has an isolated singularity at z = z 0 then we can always computethe Laurent series of f(z) about this point. The function f(z) is singular atz 0 if at least one of the Laurent coefficients b k ≠ 0.The function f(z) has a pole of order n if b n ≠ 0 and b k = 0 for k > n.The sumn∑ b k(333)(z − z 0 ) kis called the principal part of f(z) at the singular point z = z 0 .A pole of order 1 is called a simple pole.k=167


A function f(z) that is analytic, except at a set of isolated singularities,where the function has poles, is called meromorphic.If an infinite number of the b n ≠ 0 in the Laurent series, then f(z) is saidto have an isolated essential singularity at z = z 0 .The function f(z) behaves wildly in the neighborhood of an isolated essentialsingularity. This is illustrated by the next theorem.68


0.14 Lecture 14

Theorem 14.1 (Weierstrass): If $f(z)$ has an isolated essential singularity at $z = z_0$ then for arbitrary positive numbers $\epsilon$ and $\delta$ and any complex number $a$ one has

$$|f(z) - a| < \epsilon \qquad (334)$$

for some $z$ satisfying $|z - z_0| < \delta$.

This theorem means that $f(z)$ oscillates so violently near $z_0$ that it gets arbitrarily close to every complex number!

Since the singularity is isolated the Laurent series converges, with

$$b_n = \frac{1}{2\pi i}\int_C (z' - z_0)^{n-1} f(z')\,dz' \qquad (335)$$

for a circle $C$ of radius $r$ about $z_0$. Darboux's theorem gives

$$|b_n| \le \frac{2\pi r}{2\pi}\, r^{n-1}\, |f|_{\max} = r^n\, |f|_{\max} \qquad (336)$$

where $|f|_{\max}$ is the largest value of $|f(z)|$ on the circle.

If $f(z)$ is bounded in a neighborhood of $z_0$, then letting $r \to 0$ in the above implies $b_n \to 0$ for $n > 0$, so $f(z)$ cannot be bounded if any of the $b_n$, $n > 0$, are non-zero.

Pick an arbitrary complex number $a$. Either $z_0$ is an accumulation point of points where $f(z) - a$ is arbitrarily small, in which case we are done, or it is not. If it is not, then there is an $\eta > 0$ such that $|z - z_0| < \eta$ implies $|f(z) - a| > 0$.

This means that

$$g(z) = \frac{1}{f(z) - a} \qquad (337)$$

is well defined for $|z - z_0| < \eta$. Solving for $f(z)$ allows us to express $f$ in terms of $g$:

$$f(z) = a + \frac{1}{g(z)}. \qquad (338)$$

If $g(z_0)$ is finite and non-zero then $f(z)$ is analytic at $z_0$. If $g(z)$ has a zero of finite order at $z_0$, then $f(z)$ has a pole of the same order. The only other possibility is that $g(z)$ has an essential singularity.

By our previous argument, $g(z)$ cannot be bounded in a neighborhood of $z_0$, which means that

$$|f(z) - a| = \frac{1}{|g(z)|} \qquad (339)$$


can be made as small as desired in any neighborhood of $z_0$. This completes the proof of the theorem.

While the residue theorem is a straightforward consequence of Cauchy's theorem, it is one of the most useful theorems of complex analysis.

Recall that if $f(z)$ is analytic in a neighborhood of $z_0$ then Cauchy's theorem implies

$$0 = \frac{1}{2\pi i}\oint_c f(z')\,dz' \qquad (340)$$

for any closed regular curve $c$ in the neighborhood about $z_0$.

When $z_0$ is an isolated singularity of $f(z)$ we define the residue of $f(z)$ at $z_0$ by

$$\mathrm{Res}(f(z_0)) = \frac{1}{2\pi i}\oint_c f(z')\,dz' \qquad (341)$$

for any closed regular curve in the neighborhood of $z_0$ containing no other singularities.

If a curve encloses $N$ isolated singularities then it is possible, using Cauchy's theorem, to replace the single curve by $N$ curves, where the $n$th curve $C_n$ contains only the $n$th singular point. Application of Cauchy's theorem and the definition of the residue gives

Theorem 14.2 (residue theorem): If $f(z)$ is meromorphic in a region $R$ and $C$ is a regular curve in $R$ that does not pass through any singularity of $f(z)$, then

$$\frac{1}{2\pi i}\oint_C f(z')\,dz' = \sum_{n=1}^{N} \frac{1}{2\pi i}\oint_{C_n} f(z')\,dz' = \sum_{n=1}^{N} \mathrm{Res}(f(z_n)) \qquad (342)$$

where the sum is over the residues of the singularities enclosed by $C$.

The theorem is only useful if we can compute the residues of different kinds of curves around different kinds of isolated singularities.

Given an isolated singularity at $z = z_0$ we can always use Cauchy's theorem to replace a given curve by a circle of small radius about $z_0$. We still must specify whether the curve is counterclockwise (positive) or clockwise (negative), and how many times the curve winds around $z_0$ in the counterclockwise ($n_+$) and clockwise ($n_-$) directions.

We first consider the case that $f(z)$ has a simple pole at $z = z_0$. In that case we can write

$$f(z) = \frac{g(z)}{z - z_0} \qquad (343)$$

where $g(z_0) \neq 0$ and $g(z)$ is analytic in a neighborhood of $z_0$.


Res(f(z)) = 1 ∮ g(z0 )2πi re iφ ireiφ dφ + 1 ∮ g(z) − g(z0 )dz. (344)2πi z − z 0The integrand of the second term is analytic so this integral gives zero. Theintegral of the first term isRes(f(z)) = g(z ∮0)idφ (345)2πiIf the curve c is counterclockwise residue isif the curve c is clockwise residue isg(z 0 ), (346)−g(z 0 ), (347)and if the curve c goes around z 0 n + times in the counterclockwise and n −time in the clockwise direction the residue is(n + − n − )g(z 0 ) (348)Next we consider the case of poles of order n > 1. First letf(z) =Let c be a single counterclockwise circle. Theb n(z − z 0 ) n (349)Resf(z 0 ) = 1 ∫ 2πire iφ dφ2πi 0 r n e = inφ∫1 2πr n−1ie −i(n−1)φ dφ = 0 (350)2πi 0In general, if f(z) has a pole of order n at z 0 thenf(z) =g(z)(z − z 0 ) n (351)71


where g(z) is analytic at z = z 0 . If we expand g(z) in a power series we getf(z) = g(z 0)(z − z 0 ) + dgn dz (z 10)(z − z 0 ) + · · · +n−11 d n−1 g(n − 1)! dz (z 10) + h(z) (352)n−1 (z − z 0 )1where h(z) is analytic. Integrating this and computing the residue givesRes(f(z 0 )) =1 d n−1 g(n − 1)! dz (z 0) (353)n−1whereg(z) = f(z)(z − z 0 ) n (354)This must be non-zero if f(z) has a pole of order n.Note that the residue of a function with f(z) with an order n pole atz = z 0 is always the Laurent coefficient b 1 .When a function has multiple poles at different points considerThenf(z) =N∏n=11z − z nz i ≠ z j , i ≠ j (355)f(z) =N∑n=0c nz − z n(356)The coefficients c n can be computed by taking the difference of these functionsand evaluating the integral over a small circle that only contains the n − thsingularity. In that case we get the identityc n = 2πi2πiN∏k≠nThis leads to the useful representationf(z) =N∏n=11z − z n=1z n − z k=N∑N∏n=0 k≠nN∏k≠n1z n − z k(357)1z n − z k1z − z n(358)72


This has the immediate generalization to the case that the product ismultiplied by an arbitrary polynomial P (z):f(z) =N∏n=1P (z)z − z n=N∑N∏n=0 k≠n1z n − z kP (z n )z − z n(359)Note that this formula still holds if P (z) vanishes at some of the roots.73
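Before turning to applications, the following symbolic spot check (not part of the original notes) verifies the residue formula (353) for a pole of order $n$; the test function, with a third-order pole, is an arbitrary choice.

```python
# Not part of the original notes: a sympy check of the residue formula (353),
#   Res f(z0) = g^(n-1)(z0)/(n-1)!  with  g(z) = f(z)(z - z0)^n.
import sympy as sp

z = sp.symbols('z')
f = sp.exp(2*z) / (z - 1)**3          # pole of order n = 3 at z0 = 1
n, z0 = 3, 1

g = sp.simplify(f * (z - z0)**n)      # g(z) = exp(2z), analytic at z0
formula = sp.diff(g, z, n - 1).subs(z, z0) / sp.factorial(n - 1)

print(formula)                        # 2*exp(2)
print(sp.residue(f, z, z0))           # sympy's residue agrees: 2*exp(2)
```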


0.15 Lecture 15

Theorem 15.1 (Jordan's Lemma): Let $\Gamma_R$ be a semicircle in the upper half of the complex plane of radius $R$ with center at the origin. Let $f(z) \to 0$ uniformly in $\arg(z)$ as $|z| \to \infty$ for $0 \le \varphi \le \pi$. Let

$$I_R := \int_{\Gamma_R} e^{i\alpha z} f(z)\,dz \qquad (360)$$

Then for $\alpha > 0$

$$\lim_{R \to \infty} I_R = 0 \qquad (361)$$

Proof: The proof uses Darboux's theorem.

$$I_R = \int_{\Gamma_R} e^{i\alpha z} f(z)\,dz = \int_0^{\pi} e^{i\alpha R\cos(\varphi) - \alpha R\sin(\varphi)}\, f(R e^{i\varphi})\, i R e^{i\varphi}\,d\varphi \qquad (362)$$

Since $f(z)$ vanishes uniformly in the argument of $z$ as $|z| \to \infty$ for $0 \le \varphi \le \pi$, for any $\epsilon > 0$ we can find a large enough $R$ so that $|f(R e^{i\varphi})| < \epsilon$ and

$$|I_R| \le \int_0^{\pi} \epsilon R\, e^{-\alpha R \sin(\varphi)}\,d\varphi = 2\epsilon R \int_0^{\pi/2} e^{-\alpha R \sin(\varphi)}\,d\varphi \qquad (363)$$

Since

$$\sin(\theta) \ge \frac{2\theta}{\pi}, \qquad 0 \le \theta \le \frac{\pi}{2} \qquad (364)$$

this is bounded by

$$|I_R| \le 2\epsilon R \int_0^{\pi/2} e^{-2\alpha R \varphi/\pi}\,d\varphi = 2 R \epsilon\, \frac{\pi}{2\alpha R}\big(1 - e^{-\alpha R}\big) = \frac{\epsilon\pi}{\alpha}\big(1 - e^{-\alpha R}\big) \le \frac{\epsilon\pi}{\alpha} \qquad (365)$$

Since $\epsilon$ can be made as small as desired for sufficiently large $R$, it follows that $I_R \to 0$ as $R \to \infty$. This completes the proof of Jordan's lemma.

Next I consider some applications of the residue theorem.

Example 1: Let

$$f(z) = \frac{g(z)}{h(z)} \qquad (366)$$


where $g(z)$ and $h(z)$ are analytic, $h(z)$ has a simple zero at $z_0$, and $g(z_0) \neq 0$. Then we have

$$h(z) = (z - z_0)\Big[\frac{dh}{dz}(z_0) + (z - z_0)\, w(z)\Big] \qquad (367)$$

where $w(z)$ is analytic and can be computed from Taylor's theorem. If $C$ is a curve around the point $z = z_0$ in the counterclockwise direction then

$$\oint_C f(z)\,dz = 2\pi i\, \frac{g(z_0)}{\frac{dh}{dz}(z_0)} \qquad (368)$$

Example 2: Let

$$I = \int_0^{\infty} \frac{x^2\,dx}{(x^2 + 1)(x^2 + 4)} \qquad (369)$$

Since the integrand is an even function the integral can be written as

$$I = \frac{1}{2}\int_{-\infty}^{\infty} \frac{x^2\,dx}{(x^2 + 1)(x^2 + 4)} \qquad (370)$$

Next we convert this to an integral of a complex function around a closed contour. To do this replace all of the $x$'s by a complex variable $z$. Then extend the contour to include a large semicircle in the upper half plane and take the limit as the radius of the semicircle goes to infinity. If we can show that the integral over the semicircle is zero, then the integral over the line is equal to the contour integral, which is $2\pi i$ times the sum of the residues of the poles bounded by the real axis and the semicircle.

To show that the integral over the semicircle gives no contribution let $z = R e^{i\varphi}$ and $dz = i R e^{i\varphi}\,d\varphi$, giving

$$\frac{1}{2}\int_0^{\pi} \frac{i R^3 e^{3i\varphi}\,d\varphi}{(R^2 e^{2i\varphi} + 1)(R^2 e^{2i\varphi} + 4)} = \frac{i}{2R}\int_0^{\pi} \frac{e^{-i\varphi}\,d\varphi}{(1 + e^{-2i\varphi}/R^2)(1 + 4 e^{-2i\varphi}/R^2)} \qquad (371\text{-}372)$$

Using Darboux's theorem, the modulus of this is bounded by

$$\frac{1}{2R}\,\frac{\pi}{(1 - 1/R^2)(1 - 4/R^2)} \qquad (373)$$

for $R > 2$. This clearly vanishes as $R \to \infty$.


Next factor the denominator to get

$$I = \frac{1}{2}\oint_C \frac{z^2\,dz}{(z + i)(z - i)(z + 2i)(z - 2i)} \qquad (374)$$

This has simple poles inside $C$ at $i$ and $2i$, giving

$$I = \frac{2\pi i}{2}\Big[\frac{i^2}{(i + i)(i + 2i)(i - 2i)} + \frac{(2i)^2}{(2i + i)(2i - i)(2i + 2i)}\Big] = \pi\Big[-\frac{1}{6} + \frac{1}{3}\Big] = \frac{\pi}{6} \qquad (375)$$

Example 3: Let

$$I = \int_0^{2\pi} \frac{d\varphi}{1 + a\sin(\varphi)}, \qquad 0 \le a^2 < 1 \qquad (376)$$

Let $z = e^{i\varphi}$, $dz = i z\,d\varphi$, and note that

$$\sin(\varphi) = \frac{1}{2i}\big(z - z^{-1}\big) \qquad (377)$$

It follows that the integral can be written as a contour integral around the unit circle $C$:

$$\oint_C \frac{dz}{iz}\,\frac{1}{1 + \frac{a}{2i}(z - 1/z)} = \qquad (378)$$

$$\oint_C \frac{2\,dz}{2iz + a(z^2 - 1)} = \qquad (379)$$

$$\frac{2}{a}\oint_C \frac{dz}{\Big(z + i\big[\frac{1}{a} - \sqrt{\frac{1}{a^2} - 1}\big]\Big)\Big(z + i\big[\frac{1}{a} + \sqrt{\frac{1}{a^2} - 1}\big]\Big)} \qquad (380)$$

The pole at $-i\big[\frac{1}{a} - \sqrt{\frac{1}{a^2} - 1}\big]$ lies inside the unit circle, giving

$$I = \frac{2}{a}\,\frac{2\pi i}{-i\big[\frac{1}{a} - \sqrt{\frac{1}{a^2} - 1}\big] + i\big[\frac{1}{a} + \sqrt{\frac{1}{a^2} - 1}\big]} = \frac{2}{a}\,\frac{2\pi i}{2i\sqrt{\frac{1}{a^2} - 1}} = \frac{2\pi}{\sqrt{1 - a^2}} \qquad (381)$$
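The following numerical checks of Examples 2 and 3 are not part of the original notes; the value $a = 0.5$ is an arbitrary choice.

```python
# Not part of the original notes: numerical sanity checks of Examples 2 and 3.
import numpy as np
from scipy.integrate import quad

# Example 2: integral of x^2/((x^2+1)(x^2+4)) over [0, inf) should be pi/6.
val2, _ = quad(lambda x: x**2 / ((x**2 + 1) * (x**2 + 4)), 0, np.inf)
print(val2, np.pi / 6)

# Example 3: integral of 1/(1 + a sin(phi)) over [0, 2*pi) should be 2*pi/sqrt(1-a^2).
a = 0.5
val3, _ = quad(lambda phi: 1.0 / (1.0 + a * np.sin(phi)), 0, 2 * np.pi)
print(val3, 2 * np.pi / np.sqrt(1 - a**2))
```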


29:171 - Homework Assignment #51. Find the Laurent series for e 1/z about the origin. What kind of isolatedsingularity does this function have at z = 0?2. Find the Laurent series for cosh(z + 1/z) about the origin. What kindof isolated singularity does this function have at z = 0?3. Let f 1 (z) be analytic in a region R 1 . Assume that f 2 (z) is analyticin a region R 2 . Assume that R 1 and R 2 have a not empty, simplyconnected intersection that contains an open set, and that f 1 and f 2agree on the intersection. Use Morerra’s theorem to show that thefunction g(z) = f 1 (z) on R 1 and f 2 (z) on R 2 is analytic.4. What is the radius of convergence of the Taylor series of the analyticfunction1f(z) =(z − 4)(z 2 + 5)about the point z 0 = 10i?5. Use Cauchy’s theorem to evaluate the integralwhere b is real.∫ ∞e iby206. What can you say about an entire function that is bounded by |z 3/2 |for large z?77


0.16 Lecture 16Example 4: LetI =∫ ∞−∞First complete the square in the exponent.I =∫ ∞−∞Next let x ′ = x + i a 2b and dx = dx′ to get∫ ∞+iaI = e − a22b4be −iax−bx2 dx. (382)ia−b(x+e 2b )2 − ba24b 2 dx (383)−∞+i a 2be −b(x′ ) 2 dx. (384)In order to evaluate this make a rectangle bounded by [−R, R] on the realaxis, the line [−R+i a , R+i a ], and the edges [R, R+i a ], and [−R, −R+i a ].2b 2b 2b 2bThe integrand is analytic inside of this rectangle. Applying Cauchy’s theoremgives∫ R0 = e − a24b−R∫ ae −bx2 dx + e − a2 2b4b∫ R+ia−e − a22b4b e −bx2 dx + e − a24b−R+i a 2b0∫ 0a2be −b(R+iy)2 dye −b(R−iy)2 dy. (385)In the limit that R → ∞ the first integral becomes a standard Gaussianintegral, while the third integral becomes the integral that we are trying tocompute. Thus this reduces to a standard Gaussian integral if we can showthat the boundary terms give no contribution in the limit that R → ∞. Todo this consider|e − a24b∫ a2b0∫ ae −b(R+iy)2 dy| = e − a2 2b4b0e −bR2 e −2iyb e by2 dy ≤ (386)as R → ∞. It follows thate − a24b e−bR 2 e b( 1 2b )2 → 0 (387)a2I = lim e− 4bR→∞∫ R−Re −bx2 dx =78√ πb e −a24b (388)
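The following numerical check of Example 4, eq. (388), is not part of the original notes; the parameter values are arbitrary.

```python
# Not in the original notes: check that the integral of exp(-i a x - b x^2) over
# the real line equals sqrt(pi/b) * exp(-a^2/(4b)), as in eq. (388).
import numpy as np
from scipy.integrate import quad

a, b = 1.3, 0.7
re, _ = quad(lambda x: np.cos(a * x) * np.exp(-b * x**2), -np.inf, np.inf)
im, _ = quad(lambda x: -np.sin(a * x) * np.exp(-b * x**2), -np.inf, np.inf)

exact = np.sqrt(np.pi / b) * np.exp(-a**2 / (4 * b))
print(re + 1j * im, exact)   # imaginary part is ~0, real part matches the exact value
```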


Example 5: Let

$$I = \int_0^{\infty} \frac{\sin(x)}{x}\,dx. \qquad (389)$$

To calculate this note that

$$I = \frac{1}{2i}\int_0^{\infty} \Big(\frac{e^{ix}}{x} - \frac{e^{-ix}}{x}\Big)dx. \qquad (390)$$

Consider the curve $C$ in the upper half plane bounded by a small half circle of radius $r$ about the origin, a large half circle of radius $R$ about the origin, and the line segments on the real line connecting these circles.

Compute

$$0 = \oint_C \frac{e^{iz}}{z}\,dz = \int_{\odot_R} \frac{e^{iz}}{z}\,dz + \int_{-R}^{-r} \frac{e^{ix}}{x}\,dx + \int_{\pi}^{0} i\, e^{i r\cos(\varphi) - r\sin(\varphi)}\,d\varphi + \int_r^R \frac{e^{ix}}{x}\,dx. \qquad (391)$$

The integral over the large semicircle vanishes in the limit $R \to \infty$ by Jordan's lemma. Changing $x \to -x$ in the second integral gives

$$0 = -\int_r^R \frac{e^{-ix}}{x}\,dx + \int_{\pi}^{0} i\, e^{i r\cos(\varphi) - r\sin(\varphi)}\,d\varphi + \int_r^R \frac{e^{ix}}{x}\,dx. \qquad (392)$$

The integrand of the integral over the small circle can be expanded in a uniformly convergent series in $r$; changing the order of the sum and integral gives $-i\pi$ plus a function that vanishes as $r \to 0$. This gives

$$i\pi = \lim_{R \to \infty,\, r \to 0}\int_r^R \Big(\frac{e^{ix}}{x} - \frac{e^{-ix}}{x}\Big)dx \qquad (393)$$

which leads to

$$I = \frac{\pi}{2}. \qquad (394)$$

One type of integral that often appears in problems involving scattering is

$$I = \int_{-\infty}^{\infty} \frac{f(x)}{x - x_0}\,dx \qquad (395)$$


where |f(x)| < c|x| −α with α > 0 and f(z) is analytic in a neighborhood ofx 0 .It is useful to define two semicircular paths around x 0 :Γ + r = x 0 + re iφ φ : π → 0 (396)Γ − r = x 0 + re iφ φ : π → 2π (397)The integral is not well-defined due to the singularity, however it can bemade into a well-defined integral deforming the path to avoid the singularity:∫ x0 −r−∞∫f(x)dx +x − x 0 Γ ± rI ± :=∫f(x)∞dx +x − x 0 x 0 +rf(x)x − x 0dx. (398)80


0.17 Lecture 17For very small r the middle integral becomes∫limr→0Γ ± r∫ f(x0 + re iφ )i∫if(x)x − x 0dx =re iφ ire iφ dφ =f(x 0 + re iφ )dφ =∫∓iπf(x 0 ) + [f(x 0 + re iφ ) − f(x 0 )] (399)The last terms vanishes by Darboux’s theorem and the continuity of f(x)as r → 0. This gives∫f(x)lim dx = ∓iπf(x 0 ) (400)r→0Γ ± x − xr 0The value of the middle integral depends on the path taken around thesingularity.The remaining integral is called the principal value of the integral.I show that it is finite:P∫ ∞−∞f(x)x − x 0dx := limr→0[∫ x0 −r−∞∫f(x)∞dx +x − x 0 x 0 +rThe term in [· · · ] can be written as a sum of integrals∫ x0 −a−∞∫ x0 +bx 0 +r[· · · ] =∫f(x)x0 −rdx +x − x 0 x 0 −af(x) − f(x 0 )dx+x − x 0f(x)x − x 0dx]f(x 0 ) [ln( −r−a ) + ln( b r )] +} {{ }ln( b a ) ∫f(x) − f(x 0 )→∞f(x)dx +dx (401)x − x 0 x 0 +b x − x 081


The singular terms (r = 0) in the middle integral cancel before we take thelimit. Every other term in the is integral is well defined. In the limit thatr → 0 this is well defined and defines the principal value of the integral.Notef(x) − f(x 0 )x − x 0=∞∑n=11 d n fn! x (x 0)(x − x n 0 ) n−1 (402)converges uniformly in a neighborhood of x 0 . The points a and b can bechosen small enough so these expressions converge on x ∈ [x 0 − a, x 0 + b].If this integral is combined with our two integrals over the semicircle, theintegral over the lower semicircle can be replaced by an integral over the linewith the singularity raised infinitesimally above the lineI ± =∫ ∞∫ ∞−∞f(x)x − x 0 ∓ i0 +f(x)P± iπf(x 0 ) (403)−∞ x − x 0This formula is often written as1x − x 0 ∓ i0 = P 1± iπδ(x − x + 0 ) (404)x − x 0We will introduce the ”δ-function” later.Note that in this integral a slight change in the position of the singularitycan change the value of the integral. When these integrals arise in problems,the physics determines the position of the singularity.Example: Compute the principal value of∫ ∞dx(405)0 x 2 − x 2 0We write this as the ∫ ∞dxP=0 x 2 − x 2 0∫ x0 −ɛ ∫dx∞dxlim[+] (406)ɛ→00 x 2 − x 2 0 x 0 +ɛ x 2 − x 2 0Let u = x/x 0 , du = dx/x 0 in the first integral and v = x 0 /x, dv = x 0 dx/x 2in the second to get∫ 1−ɛ/x0lim [ɛ→00∫ 1dux 2 0(u 2 − 1) + 1−ɛ/x 0820dvx 2 0(1 − v 2 ) ] (407)


Relabeling v → u in the second integral gives∫ 1−ɛ/x0limɛ→0 11−ɛ/x 0dux 2 0(u 2 − 1)(408)The combined integral can be bounded using Darboux’s theorem∫ 1−ɛ/x011−ɛ/x 0dux 2 0(u 2 − 1) ≤ | 1 x 2 0x 0ɛ | 11 − ɛ/x 0− 1 − ɛ/x 0 |. (409)The last term vanishes like ɛ 2 , and when combined with the 1/ɛ, the integralvanishes like a constant ×ɛ as ɛ → 0. This shows thatMulti-valued functions;Recall thatcan solved to getwhereP∫ ∞0dxx 2 − x 2 0= 0 (410)e ln z = z (411)ln(z) = ln(|z|) + iφ (412)φ = arg(z) + 2πn. (413)If we start by considering the branch of the complex logarithm correspondingto a 0 ≤ φ < 2π then let φ increase by 2π this function does not return toits original value. It returns to its original value plus 2πi.Definition: A point z 0 in the complex plane such that f(z) does not returnto its initial value after going around any closed curve is called a branch pointof f(z).Definition: A line connecting two branch points is called a branch cut off(z).By these definitions we see that 0 is a branch point of ln(z). Sinceln(1/z) = − ln(z) we see that ∞ is also a branch point. Any line from 0to ∞ can be taken as a branch cut for the logarithm.While the branch points of a function are fixed by the function, the branchcuts can be chosen as desired.The ln(z), while multiple valued, is analytic in a small region about anyz, provided z is not a branch point. If we do not choose to identify the83


points in the complex plane that differ by 2πn, we get a more complexgeometrical surface called a Riemann surface. The Riemann surface for ln(z)can be thought of as an infinite spiral of complex planes that wind aroundthe origin. Each plane in the Riemann surface is called a Riemann sheet.Branch cuts are places where one passes from one sheet to the next. Forexample, if we choose the branch cut of ln(z) to be the real axis, 0 < φ < 2πis the n = 0 sheet of ln(z), negative −2π < φ < 0 corresponds to the n = −1sheet, while, 2π < φ < 4π corresponds to the n = 1 sheet.We can write the complex logarithm as an infinite collection of functionsln n (z) = ln(|z|) + i(arg(φ) + 2nπ) 0 ≤ φ < 2π (414)This function has complex derivatives at each point on the branch cut,but there are different values at the points where different adjacent sheetsmeet. This is easy to see by noting that adding 2πni to a function with acomplex derivative does not impact its differentiability.The structure of the Riemann surface depends on the function. Multivaluedfunctions often arise when complex variables are raised to non-integerpowers.84


0.18 Lecture 18For the next example of a multi-valued function consider the square root ofz. Letz = re iφ . (415)If φ is increased by 2π inz 1/2 = r 1/2 e iφ/2 (416)the function does not return to its original value. Instead it becomesz 1/2 → r 1/2 e i(φ/2+π) = −r 1/2 e iφ/2 (417)which is the well-known second value of the square root. If the phase φ isincreased by an additional factor of 2π this returns to the original function.This shows that √ z has a branch point at z = 0. Since we can make thesame argument for 1 √ zit follows that √ z also has a branch point at infinity.A branch cut is any line that starts at 0 and extends to ∞. This functionis double valued. If we start at the branch cut and increase φ by 2π we changethe sign of the square root; if we continue increasing by a second factor of2π we return to the original branch of the function.A straight forward computation shows that the two branches of √ z areAnother example of a multiple valued function isz 1/2 = r 1/2 e iφ/2 (418)z 1/2 = −r 1/2 e iφ/2 (419)f(z) := z α (420)for real α. Like the square root this has branch points at zero and infinity,since each curve around z = 0 increases the phase by e i2πα .This function will have an infinite number of branches if α is irrational,while it will have a finite number when α is rational.Treating functions with multiple branch points requires some care. Considerthe functionf(z) = (z 2 − 1) 1/2 = (z − 1) 1/2 (z + 1) 1/2 . (421)This function is a product of square roots. The first term has branch pointsat 1 and ∞ while the second term has branch points at −1 and ∞.85


If we let $z \to 1/z$ in this function the resulting function becomes

$$g(z) = f\Big(\frac{1}{z}\Big) = \frac{(1 - z)^{1/2}(1 + z)^{1/2}}{z} \qquad (422)$$

which has a simple pole at $z = 0$. Thus this product has no branch point at infinity.

The result is that if we consider a curve that goes around one of the branch points the phase increases by $\pi$; on the other hand, if the curve goes around both branch points the function returns to its original value.

In this case the branch cut is any line between $-1$ and $1$. Sometimes it is convenient to use the line segment of length 2 that connects these points. It is also possible to deform this curve so it goes through infinity, i.e. $(-\infty, -1] \cup [1, \infty)$. How it is chosen is a matter of convenience.

Integrals where the integrands have branch cuts

The value of a multivalued function changes discontinuously across a branch cut. It is still possible to use Cauchy's theorem and the residue theorem with functions that have branch cuts; we just have to make sure that our curves do not cross branch cuts, and we must realize that the integrand is not continuous across a branch cut.

As an example consider the integral

$$I := \int_0^{\infty} \frac{x^{p-1}}{x^2 + 1}\,dx, \qquad 0 < p < 2 \qquad (423)$$

The integrand has simple poles at $z = \pm i$ and branch points at $0$ and $\infty$ (for $p \neq 1$).

Take the branch cut along the positive real axis.

To do this integral we first choose a branch of the integrand. The value of this function on the $n$th branch, for $z = r e^{i\varphi}$ with $0 \le \varphi < 2\pi$, can be labeled by

$$f_n(z) = \frac{r^{p-1} e^{i(p-1)(\varphi + 2\pi n)}}{r^2 e^{2i\varphi} + 1} \qquad (424)$$

To do this integral choose the branch $n = 0$. With this choice the value of $f(z)$ just above the real line ($\varphi \to 0$) is

$$f_0(x) = \frac{x^{p-1}}{x^2 + 1} \qquad (425)$$

while the value just below the real line ($\varphi \to 2\pi$) is

$$f_0(x) = \frac{x^{p-1} e^{i(p-1)2\pi}}{x^2 + 1} \qquad (426)$$


To evaluate this integral choose a contour that starts near the origin just above the real axis and follows the real axis out a distance $R$. To this add a circle of radius $R$ around the entire plane; it returns to the positive real axis on the other side of the branch cut, corresponding to $\varphi = 2\pi$. Next integrate back to the origin, this time below the branch cut. Finally connect the two lines at the origin by a small circle of radius $r$.

The final curve does not cross the branch cut. The integrand has two simple poles at $z = \pm i$ in the interior of the curve, so the integral around the closed curve can be computed using the residue theorem. In order to use the theorem it has to be applied to the chosen branch; in our case the $n = 0$ branch is convenient. In this case the residue theorem gives

$$I_t = 2\pi i\Big[\frac{1}{2i}\, e^{i(p-1)\pi/2} - \frac{1}{2i}\, e^{i(p-1)3\pi/2}\Big] = \pi\big[e^{i(p-1)\pi/2} - e^{i(p-1)3\pi/2}\big] \qquad (427)$$

Here we used $3\pi/2$ rather than $-\pi/2$ because $3\pi/2$ corresponds to the chosen $n = 0$ branch.

The contributions from the large and small circles can be estimated using Darboux's theorem. The conditions on $p$ are chosen so that the contributions to the integral from the large and small circles vanish in the limits where the radii become infinite and zero respectively.

The surviving integrals are

$$I_t = \lim_{R \to \infty}\lim_{r \to 0}\Big[\int_r^R \frac{x^{p-1}}{x^2 + 1}\,dx - \int_r^R \frac{x^{p-1}}{x^2 + 1}\, e^{i(p-1)2\pi}\,dx\Big] = \int_0^{\infty} \frac{x^{p-1}}{x^2 + 1}\,dx \times \big[1 - e^{i2\pi(p-1)}\big] = \int_0^{\infty} \frac{x^{p-1}}{x^2 + 1}\,dx \times \big(-2i\, e^{i p\pi}\sin(p\pi)\big) \qquad (428)$$

which is the desired integral multiplied by a complex factor. Combining this with (427) gives

$$\int_0^{\infty} \frac{x^{p-1}}{x^2 + 1}\,dx = \pi\,\frac{\cos(\frac{\pi p}{2})}{\sin(\pi p)} = \frac{\pi}{2\sin(\frac{\pi p}{2})} \qquad (429)$$
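The following numerical check of eq. (429) is not part of the original notes; the value of $p$ is an arbitrary choice with $0 < p < 2$.

```python
# Not part of the original notes: check that the integral of x^(p-1)/(x^2+1)
# over [0, inf) equals pi / (2 sin(pi p / 2)), as in eq. (429).
import numpy as np
from scipy.integrate import quad

p = 1.3
val, _ = quad(lambda x: x**(p - 1) / (x**2 + 1), 0, np.inf)
print(val, np.pi / (2 * np.sin(np.pi * p / 2)))
```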


29:171 - Homework Assignment #6Compute the following integrals1. ∫ 2π2. ∫ 2π00dθa + b sin(θ)sin 2 (θ)dθa + b cos(θ)a > b > 0a > b > 03. ∫ 2π4. ∫ ∞5. ∫ ∞6. ∫ ∞7. ∫ ∞0dθ(a + b cos(θ)) 2 a > b > 00000dx1 + x 4x 2 dx(x 2 + a 2 ) 3sin(x)dxx(x 2 + a 2 ) 2sin 2 (x)dxx 288


0.19 Lecture 19

The residue theorem can also be employed to sum infinite series. The following theorem is useful.

Theorem 19.1: Let $f(z)$ be meromorphic and let $C$ be a regular curve that encloses the zeros of $\sin(\pi z)$ located at $z = m, m+1, \cdots, m+n$. If the poles of $f(z)$ and the zeros of $\sin(\pi z)$ are distinct then

$$\sum_{k=m}^{m+n} f(k) = \frac{1}{2\pi i}\oint_C \pi\cot(\pi z)\, f(z)\,dz - \sum_{\text{poles of } f(z)} \mathrm{Res}[\pi\cot(\pi z)\, f(z)] \qquad (430)$$

The proof of this theorem is a direct application of the residue theorem to the meromorphic function $\pi\cot(\pi z) f(z)$, which has poles for $z \in \mathbb{Z}$ in addition to the poles of $f(z)$.

The residues at the poles of $\pi\cot(\pi z)$ are all 1, which can be seen by expanding the Taylor series of $\sin(\pi z)$ about $z = n$:

$$\pi\cot(\pi z) \approx \frac{\pi\cos(\pi n)}{\pi\cos(\pi n)(z - n) + \cdots} \qquad (431)$$

To use this note that

$$|\cot(\pi z)| = \Big|\frac{\cos(\pi(x + iy))}{\sin(\pi(x + iy))}\Big| = \Big|\frac{\cos(\pi x)\cosh(\pi y) - i\sin(\pi x)\sinh(\pi y)}{\sin(\pi x)\cosh(\pi y) + i\cos(\pi x)\sinh(\pi y)}\Big| = \Big|\frac{\cos^2(\pi x) + \sinh^2(\pi y)}{\sin^2(\pi x) + \sinh^2(\pi y)}\Big|^{1/2} \qquad (432)$$

This is bounded for large $|y|$. When $y = 0$ it is bounded for $x = n + \frac{1}{2}$.

If this is multiplied by a function that vanishes for large $|z|$, and the contour is chosen to be a large rectangle that intersects the $x$-axis at half integers, then in the limit of infinite rectangle size

$$\frac{1}{2\pi i}\oint_C \pi\cot(\pi z)\, f(z)\,dz \to 0. \qquad (433)$$

This means that

$$\sum_{n=-\infty}^{\infty} f(n) = -\sum_{\text{poles of } f(z)} \mathrm{Res}[\pi\cot(\pi z)\, f(z)]. \qquad (434)$$


This is the most useful form of this theorem.

As an example let

$$f(z) = \frac{1}{a^2 + z^2}. \qquad (435)$$

This has poles at $z = \pm i a$. The above theorem implies

$$\sum_{n=-\infty}^{\infty} \frac{1}{a^2 + n^2} = -\pi\Big(\cot(\pi i a)\frac{1}{2ia} + \cot(-\pi i a)\frac{1}{-2ia}\Big). \qquad (436)$$

Note that

$$\cot(\pi i a) = \frac{\cos(i\pi a)}{\sin(i\pi a)} = -i\coth(\pi a), \qquad (437)$$

$$\cot(-\pi i a) = i\coth(\pi a). \qquad (438)$$

The right hand side of this expression gives

$$\frac{\pi\coth(\pi a)}{a}. \qquad (439)$$

The series can be rewritten as

$$\sum_{n=1}^{\infty} \frac{1}{a^2 + n^2} = \frac{\pi\coth(\pi a)}{2a} - \frac{1}{2a^2}. \qquad (440)$$

While $\pi\cot(\pi z)$ is useful for summing series with positive terms, $\pi\csc(\pi z)$ can be used to sum alternating series. It can also be shown to be bounded in the same sense as $\pi\cot(\pi z)$; however, the residues of the poles at $z = n$ are $(-1)^n$.

Next I consider the problem of extending analytic functions. First I state a uniqueness theorem for analytic functions.

Theorem 17.1: Let $f_1(z)$ and $f_2(z)$ be analytic in a region $R$. Let $S$ be a set in $R$ with an accumulation point $z_0$ in $R$. If $f_1(z) = f_2(z)$ for $z \in S$ then $f_1(z) = f_2(z)$ for all $z \in R$.

To prove this note that $g(z) = f_1(z) - f_2(z)$ is analytic in $R$. By assumption $g(z) = 0$ for all $z \in S$. Since $z_0$ is an accumulation point of $S$ we can find a sequence $z'_n$ in $S$ such that $z'_n \to z_0$. Since $g(z)$ is analytic, it is also continuous. It follows that

$$g(z_0) = \lim_{n\to\infty} g(z'_n) = 0 \qquad (441)$$


Thus z 0 is a zero of an analytic function that is not isolated. The only analyticfunction that does not have isolated zeros is the zero function. Thereforef 1 (z) = f 2 (z) (442)in all of R.Since curves and small neighborhoods contain points with accumulationpoints, this theorem shows that analytic functions that agree on a curve ora small neighborhood are also identical.91
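As a quick numerical check of the summation formula (440) from the previous lecture, the following sketch (not part of the original notes, with an arbitrary value of $a$) compares a long partial sum against the closed form.

```python
# Not in the original notes: check sum_{n>=1} 1/(a^2+n^2) = pi*coth(pi*a)/(2a) - 1/(2a^2).
import numpy as np

a = 0.8
partial = sum(1.0 / (a**2 + n**2) for n in range(1, 200001))
closed_form = np.pi / (2 * a * np.tanh(np.pi * a)) - 1.0 / (2 * a**2)   # coth = 1/tanh
print(partial, closed_form)   # agree up to the truncation error, roughly 1/200000
```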


0.20 Lecture 20

Next assume that $f(z)$ is analytic in a region $D \subset R$ and known explicitly in a neighborhood of $z_0$. Consider a curve $C$ between $z_0$ and any other point $z$ in $R$.

At each point $z \in C \cap D$ the function has a Taylor expansion with a radius of convergence that is determined by the largest circle that can be drawn around $z_0$ in $D$. Pick a point on $C$ in this circle near the boundary of the circle. It is possible to generate a new Taylor series about this point whose radius of convergence extends to the nearest singularity; in general the new circle of convergence will extend outside of the original circle. Since the two series agree in the intersection of the circles, they define a single analytic function in the combined region. This process can be repeated until we either hit a singularity on $C$ or reach the final point $z$. This shows that the function can be extended to a single analytic function at all points that can be connected to the original point by a curve in the new domain of analyticity. If there is more than one path to $z$ and the two paths go around a branch point, the resulting function could be multiple valued.

The process whereby an analytic function is extended to a single analytic function in a larger region is called analytic continuation.

Theorem 20.1: Let $R_1$ and $R_2$ be non-overlapping regions with a common boundary $B$. Assume that $f_1(z)$ is analytic in $R_1$, $f_2(z)$ is analytic in $R_2$, and $f_1(z)$ and $f_2(z)$ are continuous and equal on $B$. Then

$$h(z) := \begin{cases} f_1(z) & z \in R_1 \cup B \\ f_2(z) & z \in R_2 \cup B \end{cases} \qquad (443)$$

is analytic in $R_1 \cup R_2 \cup B$.

This theorem can be proved using Morera's theorem. Any closed curve that is not entirely in $R_1$ or $R_2$ can be broken up into parts in $B$ and parts that cross $B$. By continuity the parts in $B$ can be moved infinitesimally into $R_1$ or $R_2$ without changing the value of the integral. For a closed curve the number of crossings must be even. The curve can be replaced by a sum of curves that are entirely in $R_1$ or $R_2$ by adding pairs of curves in opposite directions on either side of $B$. Using Cauchy's theorem in either region shows that the integral around any closed curve in $R_1 \cup R_2 \cup B$ vanishes. Since $h(z)$ is also continuous in this region, it is analytic by Morera's theorem.

Theorem 20.2 (Schwartz reflection principle): Let $f(z)$ be analytic in a region $R$ that has part of the real line as a boundary. Assume that $f(z)$ is


real for $z$ real, and continuous on the boundary. Define

$$\bar{R} = \{z \mid z^* \in R\} \qquad (444)$$

Then

$$g(z) := (f(z^*))^* \qquad (445)$$

is an analytic continuation of $f(z)$ into

$$\bar{R} \cup R \qquad (446)$$

To prove this note that if $z \in \bar{R}$ then $z' = z^* \in R$. For $z_1, z_2 \in \bar{R}$

$$\Big[\frac{f(z'_1) - f(z'_2)}{z'_1 - z'_2}\Big]^* = \frac{g(z_1) - g(z_2)}{z_1 - z_2} \qquad (447)$$

Since the term on the left has a limit as $z'_1 \to z'_2$ that is independent of direction, the same is true for the term on the right. This means that $g(z)$ is analytic in $\bar{R}$. The reality for real $z$ means that $f(z) = g(z)$ for $z$ real. By theorem 17.1, $f(z)$ and $g(z)$ define a single analytic function on $\bar{R} \cup R$. This function has the property that

$$f^*(z) = f(z^*) \qquad (448)$$

The nature of this theorem is easy to understand. The reality means that if $x_0$ is on the real line then

$$f(x) = g(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n \qquad (449)$$

where $c_n = c_n^*$. Taking the conjugate of $f$ conjugates $c_n \to c_n^* = c_n$ and $(z - x_0)^n \to (z^* - x_0)^n$. The conjugation of $z$ then turns $(z^* - x_0)^n$ back into $(z - x_0)^n$, leading to the original series, which is now valid for $z$ in either region.

Dispersion Relations

Many functions that appear in physics are analytic except for a branch cut along the real axis from $x_0$ to $\infty$, are real for $x < x_0$, fall off faster than $\frac{c}{|z|}$ for large $|z|$, and vanish at the branch point.

The source of these functions normally comes from expressions involving resolvents of self-adjoint operators (defined later). If $O$ is a linear operator and $z$ is a complex number, we can ask if the operator $(z - O)$ has an inverse. The operator $(O - z)^{-1}$, considered as an operator valued function of the complex variable $z$, is called the resolvent of $O$. While we will study these operators


next semester, this operator has poles when O has isolated eigenvalues, andbranch cuts where O has continuous eigenvalues. The operator is analytic inz when the inverse exists and is continuous. In this class of problems branchcuts on a half line are typical associated with operators that have eigenvaluesbounded from below, like energy operators.Normally the information on the branch cut is experimentally accessible.Dispersion relations provide a simple relation between the value of such afunction anywhere and the imaginary part of the function along the branchcut. They follow as an application of the Schwartz reflection principle.Considerf(z) = 12πi∮cf(z ′ )z ′ − z dz′ (450)where curve is the sum of a circle at infinity and the lines in either side ofthe branch cut. The assumption that |f(z)| < c 1 means that there is no|z|contribution from the large circle at ∞. Vanishing at the origin means thatthere is no contribution from a small circle around the branch point. Whatremains are the integrals on either side of the branch cut:f(z) = 12πi∫ ∞f(x + i0 + )x 0x + i0 + − z dx − 12πi∫ ∞f(x − i0 + )dx (451)x 0x − i0 + − zTaking the limit as ɛ → 0 for z not on the branch cut, using the Schwartzreflection principle (f ∗ (z) = f(z ∗ )), givesf(z) = 12πi∫ ∞x 02I f(x) dx (452)x − zThis is called a dispersion relation. It expresses the value of f(z) in the cutplane in terms of the imaginary part of the function along the branch cut.94


0.21 Lecture 21

The Fundamental Theorem of Algebra:

We initially introduced complex numbers to find roots of the equation $x^2 + 1 = 0$. We then mentioned that introducing the new number $i$ was enough to factor any polynomial. We are now in a position to prove this result.

Theorem 19.3: Let $f(z)$ be meromorphic in a region $R$ and let $g(z)$ be analytic in $R$. Let $C$ be a closed regular curve in $R$ on which $f(z)$ is analytic and nowhere zero. If $f(z)$ has $M$ zeros $\{z_j\}_{j=1}^M$ of order $\{m_j\}_{j=1}^M$ and $N$ poles $\{p_j\}_{j=1}^N$ of order $\{n_j\}_{j=1}^N$ inside $C$ then

$$\frac{1}{2\pi i}\oint_C g(z)\,\frac{\frac{df}{dz}}{f(z)}\,dz = \sum_{j=1}^M m_j\, g(z_j) - \sum_{j=1}^N n_j\, g(p_j) \qquad (453)$$

To prove this note that in general near a pole or zero $f(z)$ has the form

$$f(z) = a_1 (z - z_0)^n + a_2 (z - z_0)^{n+1} + \cdots \qquad (454)$$

and

$$\frac{df}{dz}(z) = n a_1 (z - z_0)^{n-1} + (n+1) a_2 (z - z_0)^{n} + \cdots \qquad (455)$$

where $n$ may be positive (zero) or negative (pole). Near a pole or zero the ratio has the form

$$\frac{\frac{df}{dz}}{f(z)} = \frac{n a_1 (z - z_0)^{n-1} + (n+1) a_2 (z - z_0)^{n} + \cdots}{a_1 (z - z_0)^{n} + a_2 (z - z_0)^{n+1} + \cdots} = \frac{n}{z - z_0} + \text{analytic function} \qquad (456)$$

Multiplying by $g(z)$ gives, in the neighborhood of a zero or pole,

$$g(z)\,\frac{\frac{df}{dz}}{f(z)} = \frac{n\, g(z)}{z - z_0} + \text{analytic function} \qquad (457)$$

If we apply the residue theorem to this expression we get

$$\oint_C g(z)\,\frac{\frac{df}{dz}}{f(z)}\,dz = 2\pi i\Big(\sum_{j=1}^M m_j\, g(z_j) - \sum_{j=1}^N n_j\, g(p_j)\Big) \qquad (458)$$


which is the desired result.We use this result to count the zeros of an n-th degree polynomial. Itis obvious that for large z and n-th degree polynomial grows like |z| n . Thismeans that all of the zeros must lie inside of a circle of sufficiently largeradius R. From the above theorem∮12π1dP n MdzP n (z) = ∑j=1m j (459)which adds up to the total number of zeros if we count order m zeros as mzeros. On the other hand as the circle gets very large∮12π112π1dP ndzP n (z) → (460)∮ ndz = n (461)zwhere n is the degree of the polynomial.This proves the fundamental theorem of algebra - i.e. that any polynomialof degree n has n complex roots.A powerful technique for approximating integrals involving integrals ofthe form∫I(w) = e wf(z) g(z)dz (462)Cfor very large |w| is called the method of saddle point integration. The intuitiveobservation is that when |w| is very large the major contribution to theabove integral is due to the points on the curve where Re(f(z)w) is largest.Near these points we also have to deal with high frequency oscillations.These integrals were first studied by Debye. He deformed the path C soone a part of the deformed path C 0a. I(f(z)) is locally constant.b. There is a point z 0 ∈ C 0 wheredfdz (z 0) = 0 (463)c. At z = z 0 along the path R(z) goes through a relative maximum.96


To understand the precise meaning of these conditions consider the behaviorof an analytic function in the neighborhood of a point z 0 where itsfirst complex derivative is zero:f(z) = f(z 0 ) + 1 d 2 f2 dz (z 0)(z − z 2 0 ) 2 + · · · (464)We first assume that d2 fdz 2 (z 0 ) ≠ 0. Letz − z 0 = re iφ (465)and1 d 2 f2 dz = 2 Reiψ (466)Then we have for sufficiently small r the approximation:f(z) = f(z 0 ) + Rr 2 e iψ+2iφ + · · · =f(z 0 ) + Rr 2 (cos(ψ + 2iφ) + i sin(ψ + 2iφ) + · · · (467)Express this in terms of the real and imaginary parts of f(z) = u(z) + iv(z):u(z) = u(z 0 ) + Rr 2 (cos(ψ + 2iφ) + · · · (468)v(z) = v(z 0 ) + Rr 2 (sin(ψ + 2iφ) + · · · (469)The condition that f(z) has a relative maximum at f(z 0 ) means that thecurve is designed to goes through z 0 in the direction given bycos(ψ + 2φ) = −1 sin(ψ + 2φ) = 0 (470)which means thatψ + 2φ = (2n + 1)π (471)ψ = 1 2 ((2n + 1)π − φ) = nπ + π 2 − φ 2(472)Along this path the imaginary part of the function is constant to thirdorder, which minimizes oscillations, while the real part reaches a local maximumalong the deformed curve at z = z 0 .In this case the integral is replaced by∫∫I(w) = e wf(z) g(z)dz = e wf(z) g(z)dz ≈ (473)CC 097


On C 0 near z 0 we havewhich givesw d 2 f2 dz (z 0)(z − z 2 0 ) 2 = −wτ 2 (474)∫I(w) ≈ e wf(z 0)e −wτ 2 g(z(τ)) dzC 0dτ(475)98


29:171 - Homework Assignment #7Compute the following integrals1. ∫ ∞2.P∫ ∞−∞3. ∫ ∞4. ∫ ∞5. ∫ ∞6. ∫7. CalculateandC0ln(x)b 2 + x 2 dx1(x − x 0 ) 2 + a 2 1(x − x 1 ) dx0e 1/z dz0−∞dxa 3 + x 3dx(a 3 + x 3 ) 2e ikx dx(x 2 + a 2 )∞∑n=1C = unit circle1n 2∞∑(−1) n+1 1 n 2n=199


0.22 Lecture 22The last step is to extend the integral from the small region of C 0 near z 0 to(−∞, ∞). We also expandg(z(τ)) dzdτ = ∑ nc n τ n (476)giving the approximationI(w) ≈ e wf(z 0)∫ ∞−∞e −wτ 2 ∑ nc n τ n dτ (477)The integrals are zero for odd n and for even n they are determined byThus∫I 2n :=e −wτ 2 τ 2n dτ =(− d ) n ∫dwe −wτ 2 dτ(− ddw) n√ πw(478)√ 1 π2 · 52 · · · 2n − 1 w −(2n+1)/2 (479)2I(w) ≈ e wf(z 0)∞∑I 2n (480)In general this series may not converge. It is often the case that the seriesgenerated using the method of steepest decent is an asymptotic series. Wewrite the series∞∑f n z −n (481)is asymptotic to f(z) if for any n:n=0lim {z n (f(z) −|z|→∞n=0n∑f k z −k )} = 0 (482)k=0What this equation means is that for any fixed n|f(z) −n∑f k z −k |


which means that the error made by using the finite sum can be made as small as desired by choosing $z$ large enough.

Unfortunately this does not mean that the full series converges for any fixed $z$. While asymptotic series are not always convergent, a given function can have at most one asymptotic series. In addition, different functions can have the same asymptotic series.

Example: The method of steepest descent is best illustrated by example. Consider the integral

$$I(w) := \int_0^{\infty} e^{-z} z^{w}\,dz \qquad (484)$$

where $w$ is large. First transform this integral to an integral of the desired form by letting $z = w\nu$:

$$I(w) = \int_0^{\infty} e^{-w\nu} w^{w} \nu^{w}\, w\,d\nu = w^{w+1}\int_0^{\infty} e^{w(\ln(\nu) - \nu)}\,d\nu \qquad (485)$$

In this example $f(\nu) = \ln(\nu) - \nu$. Note that

$$\frac{df}{d\nu}(\nu) = \frac{1}{\nu} - 1 \qquad (486)$$

which vanishes when $\nu = 1$. Expanding about $\nu = 1$,

$$f(\nu) = -1 - \frac{1}{2\nu^2}\Big|_{\nu=1}(\nu - 1)^2 + \cdots = -1 - \frac{1}{2}(\nu - 1)^2 + \cdots \qquad (487)$$

If we let $\nu = 1 + r e^{i\varphi}$ we get

$$f(\nu) = -1 - \frac{1}{2}\, r^2 e^{2i\varphi} + \cdots \qquad (488)$$

Choose $\varphi = 0, \pi, \cdots$, so that the path of steepest descent runs along the real axis through $\nu = 1$. This gives

$$I(w) \approx w^{w+1}\, e^{-w}\int_{-\infty}^{\infty} e^{-\frac{w}{2} r^2}\,dr = \sqrt{2\pi}\, w^{w + \frac{1}{2}}\, e^{-w} \qquad (489)$$

In general this is only an approximation that is expected to improve as $w$ gets large.

The Gamma and Beta functions

Special functions are an important element of mathematical physics. These functions can be expressed in many different forms. Integral representations are often useful, especially when considering analytic properties of these functions.
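Before turning to the Gamma function, here is a quick numerical check (not part of the original notes) of the steepest-descent estimate (489). Since the integral (484) is $\Gamma(w+1)$, the estimate is Stirling's approximation.

```python
# Not part of the original notes: compare log Gamma(w+1) to the steepest-descent
# estimate log(sqrt(2*pi) * w^(w+1/2) * exp(-w)) for a few values of w.
import numpy as np
from scipy.special import gammaln

for w in (5.0, 20.0, 100.0):
    exact_log = gammaln(w + 1.0)                                   # log Gamma(w+1)
    approx_log = 0.5 * np.log(2 * np.pi) + (w + 0.5) * np.log(w) - w
    print(w, exact_log, approx_log)   # the relative error shrinks as w grows
```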


The Gamma function is defined for all values of z by the following integralrepresentation1Γ(z) := 1 ∫2πi Ce tdt.tz (490)The functiont z = e z ln t (491)has branch points in t at zero and ∞ unless z is an integer. It is analytic inz when t ≠ 0. When t is not an integer ln(t) has branch points at 0 and ∞.The branch cut is along the negative real axis, from x = −∞ to x = 0. Thecontour C in the above expression goes from −∞ to 0 below the cut andfrom 0 to −∞ above the branch point. The term e t falls off exponentiallyfor large negative t. The curve also includes a circle around the origin thatavoids the point t = 0 When z = n is an integer this curve is equivalent tointegrating around a closed circle about the origin. When z = −n we have1Γ(−n) := 12πi∫Ce t t n dt = 0 (492)by Cauchy’s theorem; when n is a positive integer1Γ(n) := 1 ∫e t 2πi 1dt =2πi tn 2πi (n − 1)!C(493)Taking inverses givesΓ(n) = ∞ n < 0 (494)Γ(n + 1) = n! n ≥ 0 (495)1The properties of this integral representation imply that is defined andΓ(z)differentiable for all z. Γ(z) is analytic, except to poles alone the negativereal axis.An important property of the Gamma function isTo prove this note thatΓ(z + 1) = zΓ(z) (496)1Γ(z + 1) = ∫ce t t −(z+1) dt = − 1 z∫ce t d dt t−(z) (497)102


Integrate by parts noting that the endpoint contributions on the curve Chave the real part of t = −∞ where the integrand vanishes for all z.1Γ(z + 1)∫c = e t t −(z+1) dt = 1 ∫e t t −(z) = 1 (498)zzΓ(z)Taking inverses gives the desired result:cΓ(z + 1) = zΓ(z) (499)103


0.23 Lecture 23

Next we show that there is an alternative integral representation for the $\Gamma$ function, given by

$$\Gamma_1(z) = \int_0^{\infty} e^{-t}\, t^{z-1}\,dt, \qquad \mathrm{Re}(z) > 0 \qquad (500)$$

that is valid for $z$ in the right half plane.

Again using integration by parts,

$$\Gamma_1(z+1) = \int_0^{\infty} e^{-t}\, t^{z}\,dt = \int_0^{\infty} \Big(-\frac{d}{dt} e^{-t}\Big) t^{z}\,dt = \int_0^{\infty} e^{-t}\,\frac{d}{dt} t^{z}\,dt = z\,\Gamma_1(z) \qquad (z \neq 0) \qquad (501)$$

where we have used that the endpoint contributions vanish for $z$ in the right half plane.

Next I argue that the two representations, $\Gamma(z)$ and $\Gamma_1(z)$, of the $\Gamma$ function are equal. To do this we first define the $\beta$ function for $\mathrm{Re}(a), \mathrm{Re}(b) > 0$:

$$\beta(a, b) := \int_0^1 t^{a-1} (1 - t)^{b-1}\,dt \qquad (502)$$

Letting $t \to 1/t$ and $t = \sin^2(\theta)$ in the above expression gives two equivalent expressions for the $\beta$ function:

$$\beta(a, b) = \int_1^{\infty} t^{-a-b} (t - 1)^{b-1}\,dt = 2\int_0^{\pi/2} \sin^{2a-1}(\theta)\cos^{2b-1}(\theta)\,d\theta \qquad (503)$$

Next note that

$$\Gamma(a)\Gamma(b) = \int_0^{\infty}\int_0^{\infty} dt\,dw\; e^{-t-w}\, t^{a-1} w^{b-1} \qquad (504)$$

Next let $t = y^2$, $w = x^2$ to get

$$\Gamma(a)\Gamma(b) = 4\int_0^{\infty}\int_0^{\infty} dx\,dy\; e^{-x^2 - y^2}\, y^{2a-1} x^{2b-1} \qquad (505)$$

Next change to polar coordinates:

$$\Gamma(a)\Gamma(b) = 4\int_0^{\infty} r^{2a+2b-1} e^{-r^2}\,dr \int_0^{\pi/2} \sin^{2a-1}(\theta)\cos^{2b-1}(\theta)\,d\theta \qquad (506)$$


Finally let u = r 2 to getΓ(a)Γ(b) =∫ ∞Comparing with (503) we get0∫ πu a+b−1 e −u du22dθ sin 2a−1 (θ) cos 2b−1 (θ) (507)0Γ(a)Γ(b) = Γ(a + b)β(a, b) (508)orβ(a, b) = Γ(a)Γ(b)Γ(a + b)which is all in terms of our second expression for the Γ function.Let a = z b = 1 − z so a + b = 1. Then the above becomes(509)Γ(z)Γ(1 − z) =∫ ∞0e −u du2∫ π2Let x = tan(θ), dx = (1 + x 2 )dθ to get2∫ ∞00dθ sin 2z−1 (θ) cos 1−2z (θ) = (510)∫ π22dθ tan 2z−1 (θ)(θ) (511)0dθ x2z−1dx = π csc(πz) (512)x 2 + 1which is the integral done is section 18 for real z between 0 and 1 . Thisextends to x → z at points of analyticity.One convenient by-product of this formula is for z = 1/2 we getΓ( 1 2 )2 = π (513)orΓ( 1 2 ) = √ π. (514)To show the desired equivalence change variables t → −t ′ = e −iπ t ′ in theexpression of Γ(z):1Γ(1 − z) = 1 ∫e t t −(1−z) dt = − 1 ∫2πi C2πi C105e −t′t ′(1−z) eiπ(1−z) dt ′ (515)


Under this transformation the phase along the bottom half of the original curve, which ran from e^{−iπ} to e^{iπ}, now runs from 0 to e^{2πi}, and the curve is still traversed in the counterclockwise direction. The integral can be broken up into integrals along the two semi-infinite lines and the circle:

−(1/2πi) ∫_∞^0 e^{−t′} t′^{−(1−z)} e^{iπ(1−z)} dt′ − (1/2πi) ∫_0^∞ e^{−t′} t′^{−(1−z)} e^{iπ(1−z)} e^{i2π(z−1)} dt′ − (1/2πi) ∫_0^{2π} e^{−re^{iφ}} r^{−(1−z)} e^{iφ(z−1)} e^{iπ(1−z)} i r dφ    (516)

The integral over the circle vanishes when Re(z) > 0 as the radius → 0. The first two integrals can be combined to give

(1/2πi) ∫_0^∞ e^{−t} t^{z−1} e^{iπ(1−z)} (1 − e^{i2π(z−1)}) dt = (1/2πi) ∫_0^∞ e^{−t} t^{z−1} [e^{iπ(1−z)} − e^{−iπ(1−z)}] dt = (sin(πz)/π) ∫_0^∞ e^{−t} t^{z−1} dt = (sin(πz)/π) Γ_1(z)    (517)

This shows that 1/Γ(1 − z) = (sin(πz)/π) Γ_1(z). Comparing with (512), which was derived entirely from the second representation, this means Γ(1 − z) = Γ_1(1 − z), or equivalently

Γ_1(z) = Γ(z)    (518)

for Re(z) > 0. Since both Γ functions satisfy Γ(z + 1) = zΓ(z), we can always choose n large enough that Re(n + z) > 0, which then implies the equality in the whole plane provided z is not zero or a negative integer.
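A quick numerical spot check of (509), (512) and (513)-(514), assuming SciPy is available; the sample values of a, b and z are illustrative choices only.

import numpy as np
from scipy.special import gamma, beta
from scipy.integrate import quad

a, b = 1.3, 2.4
integral, _ = quad(lambda t: t**(a - 1) * (1 - t)**(b - 1), 0, 1)   # definition (502)
print(integral, beta(a, b), gamma(a) * gamma(b) / gamma(a + b))      # all three agree, eq. (509)

z = 0.37
print(gamma(z) * gamma(1 - z), np.pi / np.sin(np.pi * z))            # reflection formula (512)
print(gamma(0.5)**2, np.pi)                                          # eq. (513): Gamma(1/2)^2 = pi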


0.24 Lecture 24:This section begins the study of vectors and vector spaces. Following the text,I will use a notation that was introduced by Dirac for quantum mechanics.Also, since I will be using both finite and infinite dimensional vector spaces,what I discuss applies to all types of vector spaces, unless specifically stated.Vectors, |a〉, |b〉, |c〉, · · · are elements of a set S. There are two operationsdefined on vectors.Vector addition: Adding vectors |a〉 and |b〉 gives a new vector |c〉. Vectoraddition is expressed as|a〉 + |b〉 = |c〉. (519)Scalar multiplication: If |a〉 is a vector and α is a complex number then anew vector |c〉 is defined by|c〉 = α|a〉. (520)I also assume the existence of some special vectors.Zero vector: The zero vector, |0〉, satisfies|a〉 + |0〉 = |a〉 (521)for any vector |a〉 ∈ S.Inverse vector: Given a vector |a〉 ∈ S, the inverse vector | − a〉 ∈ S satisfies|a〉 + | − a〉 = |0〉. (522)A complex vector space is a set S with the operations of vector additionand scalar multiplication, where S has a zero vector, every vector in S hasan inverse in S, and the rules of vector addition and scalar multiplicationsfor vectors |a〉, |b〉, |c〉 ∈ S and complex numbers α, β ∈ C are:1.)2.)(|a〉 + |b〉) + |c〉 = |a〉 + (|b〉 + |c〉) (523)|a〉 + |b〉 = |b〉 + |a〉 (524)1 · |a〉 = |a〉 (525)107


3.)β(α|a〉) = (αβ)|a〉 (526)(α + β)|a〉) = (α|a〉) + (β|a〉) (527)α(|a〉 + |b〉) = (α|a〉) + (α|b〉) (528)Examples of vector spaces are below:Example 1: Complex numbers with addition and multiplication of complexnumbers:|a〉 = x + iy (529)Example 2: Complex 2 × 2 matrices( ) α β|a〉 =γ δ(530)where α, β, γ, δ ∈ C and addition is addition of complex matrices and scalarmultiplication is multiplication of the matrix by a complex constant.Example 3: Degree 2 polynomials:|a〉 = α + βz + γz 2 (531)where α, β, γ ∈ C and addition and scalar multiplication are addition andscalar multiplication of polynomials.There are some elementary consequences that follow directly from thedefinition of a vector space.The vector |0〉 is unique. Assume that there are two zero vectors, |0 1 〉 and|0 2 〉, and let |a〉 and | − a〉 be inverses. ThenThereforeGiven a vector |a〉 the inverse | − a〉 is unique.inverses, | − a 1 〉 and | − a 2 〉. Then|0 1 〉 = |a〉 + | − a〉 = |0 2 〉 (532)|0 1 〉 = |0 2 〉. (533)Assume that |a〉 has two|a〉 + | − a 1 〉 = |0〉 = |a〉 + | − a 2 〉. (534)Add | − a 1 〉 to both sides of this equation to get|0〉 + | − a 1 〉 = |0〉 + | − a 2 〉 (535)108


or0|a〉 = |0〉: To show this note| − a 1 〉 = | − a 2 〉. (536)|a〉 = 1|a〉 = (1 + 0)|a〉 = 1|a〉 + 0|a〉 = |a〉 + 0|a〉 (537)Add the inverse | − a〉 of |a〉 to both sides to getwhich becomesThe inverse | − a〉 of |a〉 isTo prove this note that| − a〉 + |a〉 = | − a〉 + |a〉 + 0|a〉 (538)|0〉 = |0〉 + 0|a〉 = 0|a〉. (539)| − a〉 = (−1)|a〉. (540)|a〉 + (−1)|a〉 = 1|a〉 + (−1)|a〉 = (1 − 1)|a〉 = 0|a〉 = |0〉. (541)Given the identification above it is customary to define−|a〉 = (−1)|a〉. (542)These properties apply to any vector space.Just because a set of vectors with a definition of addition and scalarmultiplication form a vector space, it does not automatically follow that thevectors have a length or scalar product.A vector space is a metric space if there is real valued function, ρ(·, ·),defined on pairs of vectors satisfying1. ρ(|a〉, |b〉) ≥ 02. ρ(|a〉, |b〉) = 0 ⇐⇒ |a〉 = |b〉3. ρ(|a〉, |b〉) = ρ(|b〉, |a〉)4. ρ(|a〉, |b〉) ≤ ρ(|a〉, |c〉) + ρ(|c〉, |b〉)109


Not that metric spaces do not have to be vector spaces.Example 1: Let S be the set of points on the surface of a unit three dimensionalsphere. Define a metric on this space byρ(|a〉, |b〉) = minimum arc length of curve between |a〉 and |b〉 on sphere(543)It follows from this definition thatρ(|b〉, |a〉) ≥ 0 (544)ρ(|a〉, |b〉) = 0 ⇐⇒ |a〉 = |b〉 (545)ρ(|a〉, |c〉) + ρ(|c〉, |b〉) ≥ ρ(|a〉, |b〉) (546)This means the ρ(·, ·) is a metric on S. This metric space is not a vectorspace.A set of vectors is a normed linear space if there is real valued function,‖ · ‖, defined on vectors satisfying1. ‖|a〉‖ ≥ 02. ‖|a〉‖ = 0 ⇐⇒ |a〉 = 03. ‖α|a〉‖ = |α|‖|a〉‖4. ‖(|a〉 + |b〉)‖ ≤ ‖|a〉‖ + ‖|b〉‖A set of vectors is a inner product space if there is complex valued function,〈·|·〉, defined on pairs vectors satisfying1. 〈a|b〉 ∗ = 〈b|a〉2. For |c〉 = |a〉 + |b〉3. For |c〉 = α|a〉, α ∈ C4.〈d|c〉 = 〈d|a〉 + 〈d|b〉 (547)〈b|c〉 = α〈b|a〉 (548)〈a|a〉 ≥ 0 (549)〈a|a〉 = 0 ≡ |a〉 = |0〉 (550)110


All inner product spaces are normed linear spaces with the norm‖|a〉‖ = 〈a|a〉 1/2 (551)and all normed linear spaces are metric spaces with the metricρ(|a〉, |b〉) = ‖(|a〉 − |b〉)‖. (552)The distinctions are important because there are important metric spacesthat are not normed spaces, and important normed linear spaces that are notinner product spaces. Most examples where these distinctions are relevantinvolve infinite dimensional vector spaces, which will discussed later.To show that all normed linear spaces are metric spaces we useρ(|a〉, |b〉) := ‖(|a〉 − |b〉)‖ ≥ 0 (553)ρ(|a〉, |b〉) := ‖(|a〉 − |b〉)‖ = 0 ⇐⇒ |a〉 = |b〉 (554)ρ(|a〉, |b〉) := ‖(|a〉 − |b〉)‖ = ‖(|b〉 − |a〉)‖ = ρ(|b〉, |a〉) (555)ρ(|a〉, |b〉) + ρ(|b〉, |c〉) := ‖(|a〉 − |b〉)‖ + ‖(|b〉 − |c〉)‖ ≥ (556)‖(|a〉 − |c〉)‖ = ρ(|a〉, |c〉) (557)To show that all inner product spaces are normed linear spaces we firstprove the Cauchy Schwartz inequality.Consider the vector|c〉 = |a〉 − λ〈b|a〉|b〉 (558)It follows that〈c|c〉 = (〈a| − λ ∗ 〈b|a〉 ∗ 〈b|)(|a〉 − λ〈b|a〉|b〉) = (559)〈a|a〉 − λ〈b|a〉〈a|b〉 − λ ∗ 〈b|a〉 ∗ 〈b|a〉 + |λ| 2 〈b|a〉〈a|b〉〈b|b〉 ≥ 0 (560)This inequality holds for any λ. Restrict to the case that λ is real. Theinequality also holds for all real λ. This means that this polynomial in realλ can have not real roots in λ The roots of this polynomial in λr = −b ± √ b 2 − 4ac2aThis polynomial will have no real roots if(561)b 2 − 4ac < 0 (562)111


whereb = −2|〈a|b〉| 2 (563)a = 〈a|a〉 (564)c = 〈b|b〉|〈a|b〉| 2 (565)This condition is equivalent to√〈a|a〉√〈b|b〉 ≥ |〈a|b〉| (566)This is the Cauchy-Schwartz inequality. It is a property of any inner productspace.Using the Cauchy Schwartz inequality it is possible to show that everyinner product space is a normed linear space. First note‖|a〉‖ := 〈a|a〉 1/2 ≥ 0 (567)0 = ‖|a〉‖ = 〈a|a〉 1/2 ⇐⇒ |a〉 = |0〉 (568)‖α|a〉‖ := 〈αa|αa〉 1/2 = (α ∗ α) 1/2 〈a|a〉 1/2 = |α|‖|a〉‖ (569)‖|a + b〉‖ 2 = 〈a + b|a + b〉 = (570)〈a|a〉 + 〈b|b〉 + 〈a|b〉 + 〈b|a〉 ≤ (571)〈a|a〉 + 〈b|b〉 + 2|〈a|b〉| ≤ (572)〈a|a〉 + 〈b|b〉 + 2 √ 〈a|a〉〈b|b〉| = ( √ 〈a|a〉 + √ 〈b|b〉) 2 (573)Taking square root of both sides gives the triangle inequality‖(|a〉 + |b〉)‖ ≤ ‖|a〉‖ + ‖|b〉‖ (574)112
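The Cauchy-Schwartz inequality (566) and the triangle inequality (574) can be illustrated numerically for the standard inner product on C^n. A minimal sketch, assuming NumPy is available; the random vectors and dimension are arbitrary example choices.

import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    a = rng.normal(size=5) + 1j * rng.normal(size=5)
    b = rng.normal(size=5) + 1j * rng.normal(size=5)
    inner = np.vdot(a, b)                       # <a|b>, conjugation on the left argument
    assert abs(inner) <= np.linalg.norm(a) * np.linalg.norm(b) + 1e-12   # Cauchy-Schwartz (566)
    assert np.linalg.norm(a + b) <= np.linalg.norm(a) + np.linalg.norm(b) + 1e-12  # triangle (574)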


29:171 - Homework Assignment #81. By integrating ∫zdza − e −izover a rectangular curve with corners at −π, π, π + in −π + in andletting n → ∞ show∫ π0x sin(x)dx1 + a 2 − 2a cos(x) = π log(1 + a) (0 < a < 1)a2. Evaluate ∫ ∞0ln 2 (z)z 2 + 1 dz3. Express the integral ∫ ∞e −αx2 x β dx0where α and β are real and positive in terms of the Gamma function.4. Prove that if a > 0, − 1π < aλ < 1π2 2∫ ∞0e −ra cos(aλ) cos(r a sin(aλ))dr = cos(λ) 1 a Γ(1 a )5. Calculate ∫ π/2sin α (θ) cos β (θ)dθ0for α, β > 0.6. Evaluate β(m, n) and relate it to the binomial coefficients.113


0.25 Lecture 25Example 2: Let S be the set of continuous complex valued functions of areal variable in the interval [0, 1] is a vector space. The function,‖f‖ = sup |f(z)| (575)x∈[0,1]where sup means least upper bound, is a norm on this space. This normcannot be constructed from an inner product in the manner discussed above.Example 3: Let S = C. Then〈a|b〉 = a ∗ b (576)is an inner product on the vector space of complex numbers.Example 4: Let S the vector space of complex 2 × 2 matrices. If( )( )a11 aA =12aand A † ∗= 11 a ∗ 21a 21 a 22 a ∗ 12 a ∗ 22(577)Then〈A|B〉 = Tr(A † B) (578)is a scalar product. Here AB is the matrix product and Tr(A) is the sum ofdiagonal elements.Example 5: Let S the vector space of second degree complex polynomials.Then〈P |Q〉 =∫ 10dxP ∗ (x)Q(x) (579)is a scalar product.From these examples it is clear the there are many different types ofvector spaces. Our goal is to exploit the common features of these differentlooking spaces.Definition: A vector space with a metric ρ(|a〉, |b〉) is complete if everyCauchy sequence of vectors converges to a vector in the space.Since all vectors spaces with norms or scalar products are also metricspaces, we have the following definitions.Definition: A complete normed linear space is called a Banach space.Definition: A complete inner product space is called a Hilbert space.114
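Two of the examples above can be probed numerically. The sketch below, which assumes NumPy is available, checks that ⟨A|B⟩ = Tr(A†B) of Example 4 behaves like an inner product on 2 × 2 matrices, and that the sup norm of Example 2 violates the parallelogram law ‖f+g‖² + ‖f−g‖² = 2‖f‖² + 2‖g‖² that every norm coming from an inner product must satisfy; the parallelogram-law argument and the test functions f = 1, g = x are additions for illustration, not part of the notes.

import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
B = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

ip = lambda X, Y: np.trace(X.conj().T @ Y)          # <X|Y> = Tr(X^dagger Y), Example 4
print(np.isclose(ip(A, B).conjugate(), ip(B, A)))   # conjugate symmetry
print(ip(A, A).real > 0)                            # positivity for A != 0

# Example 2: sup norm of continuous functions on [0,1], sampled on a grid
x = np.linspace(0, 1, 2001)
f, g = np.ones_like(x), x
sup = lambda h: np.max(np.abs(h))
lhs = sup(f + g)**2 + sup(f - g)**2                 # = 4 + 1 = 5
rhs = 2 * sup(f)**2 + 2 * sup(g)**2                 # = 4
print(lhs, rhs)                                     # parallelogram law fails for the sup norm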


Spaces that are not complete can be made complete including new vectorsdefined as Cauchy sequences of vectors in the space. This is exactly how thereal numbers are constructed by completing the rational numbers.Consider a vector space with an inner product. The inner product of thevectors|a〉 := β|b〉 + γ|c〉 (580)isThis can be factored as|a ′ 〉 := δ|d〉 + ɛ|e〉 (581)〈a ′ |a〉 = βδ ∗ 〈d|b〉 + βɛ ∗ 〈e|b〉 + γδ ∗ 〈d|c〉 + γɛ ∗ 〈e|c〉. (582)〈a ′ |a〉 = δ ∗ 〈d|a〉 + ɛ ∗ 〈e|a〉 = (583)β〈a ′ |b〉 + γ〈a ′ |c〉 (584)This shows that the inner product is linear in the arguments on the rightwhile it is conjugate linear in the arguments on the left.It is possible to treat this more symmetrically by introducing a space S ∗of dual vectors denoted by 〈a|.Dual vectors form a vector space by definition. In particular linear combinationsof dual vectors with complex coefficients are dual vectors. Thereis no complex conjugation involved.Each dual vector 〈a| is associated with a linear functional on the originalvector space. A linear functional is an operator L that gives a complexnumber when it is applied to a vector. It also satisfies the linearity conditionL(|b〉 + γ|c〉) = L(|b〉) + γL(|c〉) (585)The linear functional associated with the dual vector 〈a| on an inner productspace isL 〈a| (|b〉) = 〈a|b〉 (586)This dual vector acts linearly on S:L 〈a| (|b〉 + γ|c〉) =L 〈a| |b〉 + γL 〈a| |c〉 =〈a|b〉 + γ〈a|c〉 (587)115


Consistency with the previous definitions requiresL 〈(a+βb| = L 〈a| + β ∗ L 〈b| (588)which means that the relation that assigns a dual vector to a vector is notlinear.Note that while the above discussion applies to dual vectors in an innerproduct space, the concept of a continuous linear functional makes sense onall vector spaces. These functionals define the dual space of any vector space,but in the general case there is not necessarily a 1−1 correspondence betweenvector and dual vectors.Operators:Given two vectors spaces, S 1 and S 2 , an operator, f, is a function thatassigns vectors in a subset D of S 1 to vectors in a subset R of S 2 . This iswrittenf : D → R (589)The set D ⊂ S 1 is called the domain of f The set R ⊂ S 2 is called the rangeof f.The function f is onto S 2 if R = S 2 . It is one to one if |a〉 ≠ |b〉 impliesf(|a〉) ≠ f(|b〉).This section primarily concerns the class of linear operators. An operatorf : D ⊂ S 1 → R ⊂ S 2 is linear if for and |a〉, |b〉 ∈ D and α ∈ Cf(α|a〉 + |b〉) = αf(|a〉) + f(|b〉) (590)The following vector space operations are defined on linear operatorsaddition of linear operators(f 1 + f 2 )(|a〉) = f 1 (|a〉) + f 2 (|a〉) (591)multiplication of linear operators by complex numbers(αf 1 (|a〉)) = αf 1 (|a〉) (592)composition of linear operators: Let f 1 : D 1 ⊂ S 1 → R 2 ⊂ S 2 and letf 2 : D 2 = R 2 ⊂ R 3 ⊂ S 3 . Thenf 2 · f 1 (|a〉) := f 2 (f 1 (|a〉)) (593)116


In general, if f_1 · f_2 is defined it does not follow that f_2 · f_1 is defined. If they are both defined, in general f_1 · f_2 ≠ f_2 · f_1. This means that multiplication of linear operators is not generally commutative.

Two linear operators f_1 and f_2 are equal if D_1 = D_2, R_1 = R_2 and for every |a⟩ ∈ D_1

f_1(|a⟩) = f_2(|a⟩)    (594)

The zero operator, O, maps every vector in S_1 to the zero vector |0⟩_2 in S_2:

O|a⟩ = |0⟩_2    (595)

In what follows I assume that D = S_1 = S_2. I will also use upper case Latin letters to represent linear operators:

f → A    (596)

f_1 → A_1, f_2 → A_2 ⇒ f_2 · f_1 → A_2A_1    (597)

If D = R = S then the identity operator, I, is the linear operator that satisfies

I|a⟩ = |a⟩ ∀|a⟩ ∈ S    (598)

The following identities are consequences of the definitions:

A0 = 0A = 0, IA = AI = A    (599)

To prove these note

0A|a⟩ = 0(A|a⟩) = |0⟩    (600)

A0|a⟩ = A|0⟩ = A(|0⟩ − |0⟩) = A|0⟩ − A|0⟩ = |0⟩    (601)

IA|a⟩ = A|a⟩ = AI|a⟩    (602)

The commutator of linear operators A and B is defined by

[A, B] := AB − BA    (603)

The anticommutator of linear operators A and B is defined by

{A, B} := AB + BA    (604)

It is obvious from the definitions that

[A, B] = −[B, A]    (605)


{A, B} = {B, A}    (606)

AB = (1/2){A, B} + (1/2)[A, B]    (607)

Example 1: Products of anti-commuting operators commute. Assume

{A_i, A_j} = 0, i ≠ j

Then

[A_1A_2, A_3A_4] = A_1A_2A_3A_4 − A_3A_4A_1A_2
= A_1{A_2, A_3}A_4 − A_1A_3A_2A_4 − A_3A_4A_1A_2
= A_1{A_2, A_3}A_4 − {A_1, A_3}A_2A_4 + A_3A_1A_2A_4 − A_3A_4A_1A_2
= A_1{A_2, A_3}A_4 − {A_1, A_3}A_2A_4 + A_3A_1{A_2, A_4} − A_3A_1A_4A_2 − A_3A_4A_1A_2
= A_1{A_2, A_3}A_4 − {A_1, A_3}A_2A_4 + A_3A_1{A_2, A_4} − A_3{A_1, A_4}A_2 = 0    (608)

since each anticommutator vanishes. This identity is very important in quantum field theory. It explains why operators built from products of pairs of anticommuting fermion fields can commute, as observables must.
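The identity can be checked with any concrete set of pairwise anticommuting matrices. The sketch below, assuming NumPy, uses the Dirac gamma matrices in the standard representation (an example choice, for which {γ_μ, γ_ν} = 0 whenever μ ≠ ν) and verifies that the pair products commute.

import numpy as np

I2 = np.eye(2)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
zero = np.zeros((2, 2))

g0 = np.block([[I2, zero], [zero, -I2]])
g1, g2, g3 = (np.block([[zero, s], [-s, zero]]) for s in (sx, sy, sz))

comm = lambda a, b: a @ b - b @ a
anti = lambda a, b: a @ b + b @ a

A1, A2, A3, A4 = g0, g1, g2, g3
# pairwise anticommutators with distinct indices vanish ...
print(all(np.allclose(anti(x, y), 0) for x, y in [(A1, A2), (A1, A3), (A1, A4),
                                                  (A2, A3), (A2, A4), (A3, A4)]))
# ... and therefore the pair products commute, [A1 A2, A3 A4] = 0, as in (608)
print(np.allclose(comm(A1 @ A2, A3 @ A4), 0))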


0.26 Lecture 26

Since linear operators are closed under addition, operator multiplication, and multiplication by complex scalars, it is possible to use these operations to create new linear operators.

For example, we can construct linear operators that are polynomials in a given linear operator:

P(A) = p_0 I + p_1 A + p_2 AA + · · · + p_n AA · · · AA  (n factors)    (609)

It is customary to define

A^n := AA · · · AA  (n factors)    (610)

so the above polynomial has the traditional form

P(A) = p_0 I + p_1 A + p_2 A² + · · · + p_n A^n    (611)

It is possible to consider polynomials of infinite degree. As in the case of ordinary functions, they can be defined in terms of Cauchy sequences of finite degree polynomials. In order to define a Cauchy sequence it is necessary to have a notion of convergence for sequences of linear operators.

While there are many ways to do this, I consider the special case of linear operators on normed linear spaces. On these spaces it is possible to define the norm of a linear operator as follows:

‖|A|‖ := sup_{‖|v⟩‖=1} ‖A|v⟩‖ = sup_{‖|v⟩‖>0} ‖A|v⟩‖ / ‖|v⟩‖    (612)

where sup is the supremum or least upper bound.

It is an immediate consequence of this definition that

‖A|v⟩‖ ≤ ‖|A|‖ · ‖|v⟩‖    (613)

for any |v⟩.

A linear operator A on a normed linear space is bounded if

‖|A|‖ < ∞    (614)
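In a finite-dimensional space the operator norm (612) can be computed directly. A small numerical sketch, assuming NumPy; the supremum over unit vectors is approximated by random sampling and compared with the largest singular value, which is the exact value of ‖|A|‖ for matrices acting on C^n with the usual norm.

import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))

# approximate sup over unit vectors of ||A v||, as in (612)
best = 0.0
for _ in range(20000):
    v = rng.normal(size=4)
    v /= np.linalg.norm(v)
    best = max(best, np.linalg.norm(A @ v))

op_norm = np.linalg.norm(A, 2)          # largest singular value
print(best, op_norm)                     # the sample maximum approaches the operator norm
print(np.linalg.norm(A @ v) <= op_norm * np.linalg.norm(v))   # the bound (613)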


Bounded operators A on a normed linear space are continuous. To see thisnote if‖(|a m 〉 − |a n 〉)‖ < ɛ (615)thenThis means that‖(A|a m 〉 − A|a n 〉)‖ = ‖A(|a m 〉 − |a n 〉)‖ ≤ ‖|A|‖ɛ (616)‖(|a m 〉 − |a n 〉)‖ → 0 ⇒ ‖(A|a m 〉 − A|a n 〉)‖ → 0 (617)or that it is permissible to pull limits through bounded operators.I now show that ‖|A|‖ is a norm on the vector space of bounded operators.By definition‖|A|‖ ≥ 0 (618)if0 = ‖|A|‖ = ‖A|v〉‖ ⇒ A|v〉 = |0〉 ⇒ A = 0 (619)where the zero on the left is complex zero while the zero on the right is thezero operator.‖|αA|‖ := sup ‖αA|v〉‖ = |α| sup ‖A|v〉‖ = |α| · ‖|αA|‖ (620)}{{}}{{}‖|v〉=1‖|v〉=1‖|A + B|‖ := sup ‖A + B|v〉‖ ≤ sup ‖A|v〉‖ + sup ‖B|v〉‖ = ‖|A|‖ + ‖|B|‖}{{}}{{} }{{}‖|v〉=1‖|v〉=1‖|v〉=1(621)In addition to addition and multiplication by scalars, operators can bemultiplied. It is useful to know how to calculate the operator norm of productsof operators‖|AB|‖ = sup ‖AB|v〉‖ ≤ (622)}{{}‖|v〉=1‖|A|‖ sup ‖B|v〉‖ ≤ (623)}{{}‖|v〉=1‖|A|‖|‖|B|‖ (624)Now that I have a norm on the space of bounded linear operators I candefine the exponential function of an operator in terms Cauchy sequences of120


finite degree polynomials:

(e^A)_N = I + Σ_{n=1}^{N} (1/n!) A^n    (625)

The limit is the infinite series

e^A = I + Σ_{n=1}^{∞} (1/n!) A^n    (626)

It is a simple consequence of the definitions that

‖|e^A|‖ = ‖| I + Σ_{n=1}^{∞} (1/n!) A^n |‖    (627)

Since ‖| · |‖ is a norm, repeated use of the triangle inequality gives

‖|e^A|‖ ≤ 1 + Σ_{n=1}^{∞} (1/n!) ‖|A^n|‖    (628)

Equation (624) implies

‖|A^n|‖ ≤ ‖|A|‖^n    (629)

which when used in (628) gives

‖|e^A|‖ ≤ 1 + Σ_{n=1}^{∞} (1/n!) ‖|A|‖^n = e^{‖|A|‖} < ∞    (630)

This shows that the series converges in norm, and hence when applied to any vector in our normed linear space. In addition, this convergence is uniform in the sense that the bound above holds for all vectors and does not depend on the vector.

For the same reason it is possible to define f(A) for any entire function f(z) and any bounded operator A, since such functions can always be expressed as convergent power series.

Certain classes of linear operators appear often in applications.

Inverse operators: If A is a linear operator and there exists another linear operator A_r^{−1} with the property

AA_r^{−1} = I    (631)


then A −1r is called a right inverse of A.If A is a linear operator and there exists another linear operator A −1lthe propertywithA −1r A = I (632)then A −1lis called a left inverse of A.Theorem: If both A −1land A −1r exist then they are unique and A −1lor0 = A −1l 1AA −1r= A −1rA −1l 1A − A −1l 2A = I − I = 0 (633)− A −1l 2AA −1rA −1l 1An analogous argument shows that A −1rA −1r== A −1l 1I − A −1l 2I (634)− A −1l 2(635)is unique. The simple calculation= A −1r AA −1l= A −1l(636)shows that both operators must be the same.A linear operator A has an inverse if it has both a left and right inverse.The inverse operator of A is denoted by A −1 .Theorem: If both A −1 and B −1 exist then (AB) −1 exists and(AB) −1 = B −1 A −1 (637)B −1 A −1 AB = B −1 IB = I (638)ABB −1 A −1 = AIA −1 = I (639)which shows that B −1 A −1 is a left and right inverse of AB.If A is a linear operator and z is a complex number then the resolventoperator, R(z, A) of A is defined byR(z, A) = (zI − A) −1 (640)when it exists.It satisfies two important identities called the first resolvent identity:R(z 1 , A)−R(z 2 , A) = R(z 1 , A)(z 2 −z 1 )R(z 2 , A) = R(z 2 )(z 2 −z 1 )R(z 1 ) (641)122


and the second resolvent identity:R(z, A)−R(z, B) = R(z, B)(A−B)R(z, A) = R(z, B)(A−B)R(z, A). (642)A nice property of the resolvent of a bounded linear operator is that if R(z, A)is bounded at z = z 0 then it is analytic as an operator valued function of zfor z near z 0 . This follows by iterating the first resolvent equation to get theseries expansionR(z, A) = R(z 0 , A)[I +which converges uniformly for∞∑(z 0 − z) n R(z 0 , A) n ] (643)n=1|z 0 − z|‖|R(z 0 , A)‖| < 1 (644)The points z in the complex plane where R(z, A) is bounded is called theresolvent set of A. I have just demonstrated that R(z, A) is analytic on theresolvent set of A.When linear operators act on an inner product space there are additionalways to classify linear operators that are important in applications.Let A be a linear operator on an inner product space. Define the adjointoperator A † by〈b|(A|a〉) = ( 〈a|(A † |b〉) ) ∗(645)On inner product spaces we use the notationIn this notation equation (645) becomesthenNote that if〈b|A|a〉 := 〈b|(A|a〉) (646)〈b|A|a〉 = 〈a|A † |b〉 ∗ (647)|c〉 = A|b〉 (648)〈c|d〉 = 〈d|c〉 ∗ = 〈d|A|b〉 ∗ = 〈b|A † |d〉 (649)This means that 〈c| := 〈b|A † is the dual vector to |c〉 = A|b〉.The adjoint operation has a number of elementary properties that followfrom the definition.123


(A † ) † = A:〈a|A|b〉 = 〈b|A † |a〉 ∗ = 〈a|(A † ) † |b〉 ∗∗ = (650)Since this holds for all |a〉 and |b〉 is follows that(AB) † = B † A † :Let |c〉 = B|b〉 and |d〉 = A † |a〉. Then(A † ) † = A (651)〈a|AB|b〉 = 〈b|(AB) † |a〉 ∗ (652)〈a|AB|b〉 = 〈a|A|c〉 = 〈c|A † |a〉 ∗ = 〈d|c〉 = (653)〈d|B|b〉 = 〈b|B † |d〉 ∗ = 〈b|B † A † |a〉 ∗ (654)Comparing (652) and (654) gives (AB) † = B † A †(A + βB) † = A † + β ∗ B † :〈b|(A + βB) † |a〉 ∗ = 〈a|(A + βB)|b〉 = 〈a|A|b〉 + β〈a|B|b〉 = (655)〈b|A † |a〉 ∗ + β〈b|B † |a〉 ∗ (656)Taking complex conjugates of both sides of this equation gives the desiredresult.124
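In a finite-dimensional space with an orthonormal basis the adjoint is the conjugate transpose, so the identities just derived can be spot checked numerically. A minimal sketch, assuming NumPy; the matrices and the scalar β are arbitrary examples.

import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
beta = 0.7 - 2.1j
dag = lambda M: M.conj().T               # the adjoint in an orthonormal basis

print(np.allclose(dag(dag(A)), A))                        # (A^dag)^dag = A
print(np.allclose(dag(A @ B), dag(B) @ dag(A)))           # (AB)^dag = B^dag A^dag
print(np.allclose(dag(A + beta * B), dag(A) + np.conj(beta) * dag(B)))  # (A + beta B)^dag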


0.27 Lecture 27The adjoint is an essential part of the definition of some kinds of operators:Definition: A linear operator A on an inner product space is hermitian ifand only if A = A † .Definition: A linear operator A on an inner product space is unitary if andonly if A −1 = A † .Definition: A linear operator A on an inner product space is normal if andonly if [A, A † ] = 0.By these definitions it follows that all hermitian and unitary operatorsare normal.Definition: A linear operator A on an inner product space is a projectionoperator if and only ifA = A † and A 2 = A (657)Definition: A linear operator A on an inner product space is a positive ifand only ifA = A † and 〈a|A|a〉 ≥ 0 (658)for all |a〉 in the inner product space.The Cauchy Schwartz inequality impliesTaking square roots gives|〈a|A|a〉| 2 ≤ ‖|a〉‖ 2 ‖A|a〉‖ 2 ≤ ‖|a〉‖ 4 ‖|A|‖ 2 (659)|〈a|A|a〉| ≤ ‖|a〉‖ 2 ‖|A|‖ (660)One property of a positive operator A is that it satisfies a generalizedCauchy Schwartz inequality:|〈a|A|b〉| 2 ≤ 〈a|A|a〉〈b|A|b〉 (661)The proof of this result is left as a homework exercise.It has the following useful consequence. If A is positive‖A|a〉‖ 4 = 〈a|A 2 |a〉 2 ≤ 〈a|A|a〉〈Aa|A|Aa〉 (662)which is equivalent to‖A|a〉‖ 2 ≤ 〈a|A|a〉 〈Aa|‖A|a〉‖ A |Aa〉‖A|a〉‖(663)125


This means

sup_{‖|a⟩‖=1} ‖A|a⟩‖² ≤ sup_{‖|a⟩‖=1} ⟨a|A|a⟩ · sup_{‖|b⟩‖=1} ⟨b|A|b⟩ = ( sup_{‖|a⟩‖=1} ⟨a|A|a⟩ )²    (664)

Comparing (660) and (664) gives

‖|A|‖ = sup_{‖|a⟩‖=1} ⟨a|A|a⟩    (665)

when A is a positive operator. The right hand side of this equation is easier to compute than the left hand side.

Definition: Let A be a linear operator. An eigenvector |v⟩ of A is a vector satisfying

A|v⟩ = λ|v⟩    (666)

where λ is a complex constant called the eigenvalue of A associated with the eigenvector |v⟩.

Example 1: Every vector is an eigenvector of the identity operator I with eigenvalue 1:

I|a⟩ = 1|a⟩    (667)

Example 2: If P is a projection operator and P|a⟩ ≠ 0 then P|a⟩ is an eigenvector of P with eigenvalue 1:

P(P|a⟩) = P²|a⟩ = 1 · P|a⟩    (668)

Theorem 26.1: The eigenvalues of a Hermitian operator are real.

Proof: Let

A = A†, A|v⟩ = λ|v⟩    (669)

It follows that

⟨v|A|v⟩ = λ⟨v|v⟩ = (⟨v|A†|v⟩)∗ = (⟨v|A|v⟩)∗ = λ∗⟨v|v⟩    (670)

so

(λ − λ∗) ‖|v⟩‖² = 0    (671)

which means that if |v⟩ is not the zero vector then λ = λ∗.

Theorem 26.2: The eigenvectors of a Hermitian operator corresponding to different eigenvalues are orthogonal.


Proof:which givesThus if λ 1 − λ 2 ≠ 0 then〈v 1 |A|v 2 〉 = λ 2 〈v 1 |v 2 〉 = (〈v 2 |A † |v 1 〉) ∗ =(〈v 2 |A|v 1 〉) ∗ = λ ∗ 1(〈v 2 |v 1 〉) ∗ = λ 1 〈v 1 |v 2 〉 (672)(λ 1 − λ 2 )〈v 1 |v 2 〉 = 0 (673)〈v 1 |v 2 〉 = 0 (674)Theorem 26.3: If λ is an eigenvalue of a Unitary operator then λλ ∗ = 1Proof:Which gives〈v|v〉 = 〈v|U † U|v〉 = λ〈v|U † |v〉 =λ(〈v|U|v〉) ∗ = λλ ∗ (〈v|v〉) ∗ = λλ ∗ 〈v|v〉 (675)(1 − λλ ∗ )‖v〉‖ 2 = 0 (676)or λλ ∗ = 1 if |v〉 ≠ |0〉.Theorem 26.4: The eigenvectors of a unitary operator corresponding todifferent eigenvalues are orthogonalProof:Which gives〈v 1 |v 2 〉 = 〈v 1 |U † U|v 2 〉 = λ 2 〈v 1 |U † |v 2 〉 =λ 2 (〈v 2 |U|v 1 〉) ∗ = λ 2 λ ∗ 1(〈v 2 |v 1 〉) ∗ = λ 2 λ ∗ 1〈v 2 |v 2 〉 (677)(λ 2 λ ∗ 1 − 1)〈v 1 |v 2 〉 = 0 (678)or 〈v 1 |v 2 〉 = 0 for λ 1 ≠ λ 2 .In your homework you will show that eigenvectors of normal operatorswith different eigenvalues are also orthogonal/Let |a〉 and |b〉 be elements of an inner product space and let 〈b| be thedual vector to |b〉. Define the linear operator|a〉〈b| (679)by|a〉〈b|(|c〉 + α|d〉) := |a〉(〈b|c〉 + α〈b|d〉) (680)127
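Theorems 26.1-26.4 are easy to illustrate with matrices. The sketch below, assuming NumPy, builds a random Hermitian matrix and a random unitary matrix (both just examples) and checks that the Hermitian eigenvalues are real, the unitary eigenvalues lie on the unit circle, and eigenvectors belonging to different eigenvalues are orthogonal.

import numpy as np

rng = np.random.default_rng(4)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

H = M + M.conj().T                              # Hermitian
w, V = np.linalg.eigh(H)
print(np.allclose(w.imag, 0))                   # real eigenvalues (Theorem 26.1)
print(np.allclose(V.conj().T @ V, np.eye(4)))   # orthonormal eigenvectors (Theorem 26.2)

U, _ = np.linalg.qr(M)                          # unitary
lam, W = np.linalg.eig(U)
print(np.allclose(np.abs(lam), 1))              # |lambda| = 1 (Theorem 26.3)
# eigenvectors of U for different eigenvalues are orthogonal (Theorem 26.4)
print(abs(np.vdot(W[:, 0], W[:, 1])) < 1e-10)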


Note thatwhich immediately gives(〈c|a〉〈b|d〉) ∗ = 〈d|(|a〉〈b|) † |c〉 = 〈d|b〉〈a|d〉 (681)(|a〉〈b|) † = (|b〉〈a|) (682)Consider the special case when |a〉 = |b〉 with 〈a|a〉 = 1:It follows immediately thatP a := |a〉〈a| = P † a (683)P a P a = |a〉〈a|a〉〈a| = |a〉1〈a| = |a〉〈a| = P a (684)These are the equation that define a projection operator. This projectionoperator maps every vector to a complex multiple of the vector |a〉.This is not the most general projection operator. To see this let |a〉 and|b〉 be orthogonal unit vector〈a|b〉 = 0 (685)LetIt follows from (685) thatTaking adjoints givesIt follows thatP = P a + P b (686)P a P b = |a〉〈a|b〉〈b| = |a〉0〈b| = 0 (687)0 = (0) † = (P a P b ) † = P † b P † a = P b P a (688)P † = (P a + P b ) † = (P a ) † + (P b ) † = P a + P b = P (689)P 2 = P a P a + P a P b + P b P a + P b P b = P a + 0 + 0 + P b = P (690)Thus P is a projection operator.The identity operator is another operator that is trivially a projectionoperator.128


If P is a projection operator and P has an inverse then P is the identity. To prove this note

I = PP^{−1} = PPP^{−1} = PI = P    (691)

If P_1 and P_2 are projection operators then P = P_1 + P_2 is a projection operator if and only if P_1P_2 = 0.

If P_1P_2 = 0 then P_2P_1 = (P_1P_2)† = 0 and

(P_1 + P_2)² = P_1P_1 + P_1P_2 + P_2P_1 + P_2P_2 = P_1 + 0 + 0 + P_2 = (P_1 + P_2)    (692)

Conversely, if

(P_1 + P_2)² = (P_1 + P_2)    (693)

then

P_1P_2 + P_2P_1 = 0    (694)

Multiplying by P_1 on the left gives

0 = P_1(P_1P_2 + P_2P_1) = P_1P_2 + P_1P_2P_1    (695)

while multiplying by P_1 on the right gives

0 = (P_1P_2 + P_2P_1)P_1 = P_1P_2P_1 + P_2P_1    (696)

Subtracting (696) from (695) gives

P_1P_2 = P_2P_1    (697)

which when combined with (694) gives

P_1P_2 = P_2P_1 = 0    (698)

Projection operators {P_i} satisfying

P_iP_j = δ_ij P_j    (699)

are called orthogonal projectors. The Kronecker delta function, δ_ij, is

δ_ij = 1 if i = j, 0 if i ≠ j    (700)
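A short numerical sketch of these statements, assuming NumPy: starting from two orthonormal vectors (obtained here from a QR factorization, an implementation convenience), P_a = |a⟩⟨a| and P_b = |b⟩⟨b| are orthogonal projectors in the sense of (699), and their sum is again a projection operator.

import numpy as np

rng = np.random.default_rng(5)
Q, _ = np.linalg.qr(rng.normal(size=(4, 2)) + 1j * rng.normal(size=(4, 2)))
a, b = Q[:, 0], Q[:, 1]                    # orthonormal: <a|b> = 0, <a|a> = <b|b> = 1

Pa = np.outer(a, a.conj())                 # |a><a|
Pb = np.outer(b, b.conj())                 # |b><b|
P = Pa + Pb

print(np.allclose(Pa @ Pb, 0))                              # orthogonal projectors
print(np.allclose(P, P.conj().T), np.allclose(P @ P, P))    # P^dag = P and P^2 = P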


29:171 - Homework Assignment #9

1. The commutator and anti-commutator of two linear operators are defined by

[A, B] := AB − BA,  {A, B} := AB + BA

Prove the following identities:

[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0

[A, BC] = [A, B]C + B[A, C]

[A, BC] = {A, B}C − B{A, C}

2. Let K be a linear Hermitian operator. Define

W := (I + iK)(I − iK)^{−1}

Show that W is a unitary operator. Express K in terms of W. (W is called the Cayley transform of K.)

3. Let P be an orthogonal projection operator. Let Q := I − P. Show that Q is an orthogonal projection operator. Evaluate QP.

4. A linear operator N is nilpotent if for some finite n, N^n = 0. Show that e^N is a finite degree polynomial in N if N is nilpotent. Show that e^{αN} e^{βN} = e^{(α+β)N} still holds when N is nilpotent.

5. Let A be a bounded linear operator on a normed linear space. Define the partial sums

F_n(A) = I + Σ_{m=1}^{n} (1/m!) A^m

Show that this is a Cauchy sequence of operators.


6. Show that if [A, B] = 0 thatexp(A + B) = exp(A)exp(B) = exp(B)exp(A)What happens to these relations if [A, B] = αI where α ∈ C and I isthe identity operator?7. Let P be a positive operator. Prove the generalized Cauchy Schwartzinequality:|〈a|P |b〉| 2 ≤ 〈a|P |a〉〈b|P |b〉131


0.28 Lecture 28

A nice application of constructing new operators in terms of polynomials is given by the square root theorem.

Theorem: Every bounded positive operator has a unique positive square root.

Proof: Without loss of generality I assume that ‖|A|‖ < 1. Otherwise I can define

A′ = A/(2‖|A|‖) and √A = √(2‖|A|‖) √(A′)    (701)

where ‖|A′|‖ = 1/2.

I construct √A as a limit of polynomials in A with real positive coefficients. Define

C := I − A    (702)

and replace the unknown √A by X:

X := I − √A    (703)

Then

X² = I − 2√A + A = I − 2(I − X) + (I − C)    (704)

or

X = (1/2)(C + X²)    (705)

Next I note that C is a positive operator. Using (665) with ‖|a⟩‖ = 1,

⟨a|C|a⟩ = 1 − ⟨a|A|a⟩ ≥ 1 − ‖|A|‖ > 0    (706)

On the other hand, since ⟨a|A|a⟩ ≥ 0,

⟨a|C|a⟩ ≤ 1    (707)

Using (665) again gives

‖|C|‖ = sup_{‖|a⟩‖=1} ⟨a|C|a⟩ ≤ 1    (708)

Given this bound, consider the iteration of (705):

X_0 = (1/2)C    (709)


X 1 = 1 2 (C + X2 0 ) = 1 2 (C + 1 4 C2 ) (710). (711)X n+1 = 1 2 (C + X2 n) (712)Note that X 0 is a polynomial in C with positive coefficients.By induction assume that X n is a polynomial in C with constant coefficients,then by (712) X n+1 is a polynomial in C with positive coefficients.This implies that X n is positive.Assume that ‖|X n |‖ ≤ 1. Then‖|X n+1 |‖ = 1 2 ‖|C + X2 n|‖ ≤ 1 2 ‖|C|‖ + 1 2 ‖|X n|‖‖|X n |‖ ≤ 1 2 + 1 2 = 1 (713)This shows that the X n are positive and bounded in norm by 1.Next I show that the sequence of operators X n is a Cauchy sequenceX n − X n−1 = 1 2 (C + X2 n−1 ) − 1 2 (C + X2 n−2 ) = 1 2 (X2 n−1 − X2 n−2 ) (714)Since all of the X n are polynomials in C they commute soX n − X n−1 = 1 2 (X n−1 + X n−2 )(X n−1 − X n−2 ) (715)Repeating this n − 1 times givesX n − X n−1 = 1 2 (X n−1 + X n−2 ) 1 2 (X n−2 + X n−3 ) · · · 12 (X 1 + X 0 ) 1 2 X 0 = (716)This immediately impliesX n − X n−1 = 1 2 n X 0‖|X n − X n−1 |‖ ≤ 1 2 n ‖|X 0|‖n∏(X k−1 + X k−2 ) (717)k=2n∏(‖|X k−1 |‖ + ‖|X k−2 |‖) < 1 (718)k=2On the other hand (717) also implies thatX n − X n−1 (719)133


is a polynomial in C with positive coefficients.Consider the sequence of vectorsNote thatX n |a〉 (720)‖(X n − X m )|a〉‖ 4 = 〈a|(X n − X m )(X n − X m )|a〉 2 (721)Since X n −X m is positive the generalized Cauchy Schwartz inequality (homework)gives〈a|(X n − X m )(X n − X m )|a〉 2 ≤ 〈a|(X n − X m )|a〉〈a|(X n − X m ) 3 |a〉 2 ≤〈a|(X n − X m )|a〉1 (722)Note {〈a|X n |a〉} is an increasing sequence of positive numbers, boundedby 1. This sequence must be Cauchy, otherwise we can find infinite subsequenceswith 〈a|(X m − X n )|a〉 > 1/K. Using K + 1 elements leads to aviolation of the bound.The left side of the inequality implies‖(X m − X n )|a〉‖ → 0 (723)which means that the sequence of vectors X n |a〉 is Cauchy. If the innerproduct space is complete this converges to a vector in the space. DefineSolving for √ A givesX|a〉 = limn→∞X n |a〉 = |a〉 − √ A|a〉 (724)√A|a〉 = |a〉 − limn→∞X n |a〉 (725)Since ‖|X n |‖ ≤ 1 for all n, and X n are all Hermitian, it follows that I − X nis positive.Note that while there may be many operators that satisfy B 2 = A, thereis only one positive square root.Since √ A is a limit of polynomials in A it follows that[ √ A, A] = 0 (726)134


andLetsoIf A and B are both positive operators and [A, B] = 0 then AB is positive.To prove this note(AB) † = B † A † = BA = AB (727)AB = √ A √ AB = √ AB √ A (728)|c〉 = √ A|a〉 (729)〈a|AB|a〉 = 〈c|B|c〉 ≥ 0 (730)Assume that A is positive and A −1 exists, then√A−1= A−1 √ A (731)Assume that A −1 exists. ThenNote thatis positive (square root of a positive operator) andA = (AA † ) 1/2 (AA † ) −1/2 A (732)P := (AA † ) 1/2 ≥ 0 (733)(AA † ) −1/2 A[(AA † ) −1/2 A] † = (AA † ) −1/2 AA † (AA † ) −1/2 = I (734)This means that U := (AA † ) −1/2 A is unitary. It follows that every invertibleoperator can be written as the product of a positive and a unitary operator:Note that we also haveA = P U (735)A = A(AA † ) −1/2 (AA † ) 1/2 (736)with a (different) unitary operator on the left and a positive operator on theright. This result is called the polar decomposition theorem.135
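Both constructions in this lecture can be tried out numerically. The sketch below, assuming NumPy and SciPy are available, runs the iteration (709)-(712) for a positive matrix scaled so that ‖|A|‖ < 1 and compares I − lim X_n with a library square root, and then builds the polar decomposition A = PU of an invertible matrix; the specific matrices and tolerances are illustrative choices.

import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(6)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
A = Q @ np.diag([0.1, 0.2, 0.3, 0.4]) @ Q.T      # positive operator with ||A|| < 1

C = np.eye(4) - A
X = 0.5 * C                                       # X_0 = C/2, eq. (709)
for _ in range(300):
    X = 0.5 * (C + X @ X)                         # X_{n+1} = (C + X_n^2)/2, eq. (712)
root = np.eye(4) - X                              # candidate square root of A
print(np.allclose(root @ root, A, atol=1e-8))
print(np.allclose(root, sqrtm(A), atol=1e-8))

M = rng.normal(size=(4, 4))                       # generically invertible
P = sqrtm(M @ M.T)                                # positive factor (M M^dagger)^{1/2}
U = np.linalg.inv(P) @ M                          # U = (M M^dagger)^{-1/2} M
print(np.allclose(U @ U.T, np.eye(4)), np.allclose(P @ U, M))   # M = P U with U orthogonal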


0.29 Lecture 29Linear combinations of vectors can be used to generate new vectors:|b〉 =N∑α i |a i 〉 (737)i=1Definition: A set of vectors {|a i 〉} i∈I are linearly independent if the onlysolution to∑α i |a i 〉 = |0〉 (738)i∈Iis α i = 0 for all i ∈ I where I is an index set.Definition: A set of vectors {|a i 〉} i∈I span a vector space V if any vector|b〉 ∈ V can be expressed as|b〉 = ∑ i∈Iα i |a i 〉 (739)Theorem 29.1: Let n < ∞ vectors span a vector space containing r linearlyindependent vectors. Then n ≥ r.To prove this let {|b i 〉} 1≤i≤r be the linear independent vectors and let{|a i 〉} 1≤i≤n be the vectors that span the vector space. I assume that |b i 〉 ≠ |0〉.First noten∑|b 1 〉 = α 1j |a j 〉 (740)i=1At least one of the α 1j ≠ 0. By relabeling the index set on the vectors |a i 〉we can assume without loss of generality that α 11 ≠ 0. It follows that()|a 1 〉 = 1n∑|b 1 〉 − ( α 1j |a j 〉)(741)α 11It follows that {|b 1 〉, |a 2 〉, · · · |a n 〉} span the same vector space.Next express |b 2 〉 in terms of this new spanning set.i=2|b 2 〉 = β 21 |b 1 〉 +n∑α 2j |a j 〉 (742)i=2136


Since |b 1 〉 and |b 2 〉 are independent, at least one of the α 2j ≠ 0. Again Ican relabel indices so α 22 ≠ 0. This allows me to replace |a 2 〉 by |b 2 〉 in thespanning set, giving a new spanning set{|b 1 〉, |b 2 〉, |a 3 〉 · · · |a n 〉} (743)This process can be continued until all n of the |a i 〉 are replaced by the firstn |b i 〉s. In this case the first n |b i 〉 span the space. If r > n then |b n+1 〉 · · · |b r 〉can all be expressed in terms of the |b i 〉 for 1 ≤ i ≤ n. This means thatthe vectors |b n+1 〉 · · · |b r 〉 are not linearly independent of |b 1 〉 · · · |b n 〉. Thiscompletes the proof of the theorem.Definition: The dimension of a vector space is the maximal number oflinearly independent vectors.Definition: A basis of a vector space is a linearly independent set of vectorsthat span the space.Theorem 29.2: The number of basis vectors in an n dimensional space isn.Proof: Let {|b l 〉} m l=1 be a basis for a n-dimensional vector space. Since the|b l 〉 are linearly independent, by definition of dimension, m ≤ n. On theother hand since |b l 〉 span the vector space, Theorem 29.1 implies n ≥ m. Tosatisfy both inequalities requires m = n.Corollary: Any two bases of the same vector space have the same numberof basis vectors.Let {|n〉} N n=1 be a basis for a vector space V. Then any vector can bewritten asN∑|a〉 = |n〉a n (744)thenn=1This decomposition is unique because if|a〉 =|0〉 = |a〉 − |a〉 =N∑|n〉a ′ n (745)n=1N∑|n〉(a n − a ′ n) (746)n=1Since the basis vectors are linearly independenta n − a ′ n = 0 (747)137


for every n.This shows that there is a 1-1 correspondence between vectors in a N-dimensional vector space and ordered sets of N complex numbers. Thismeans the study of abstract vector spaces can always be reduced to thestudy of ordered sets of complex numbers.Definition: The numbers a n are coordinates of the vector |a〉 in the basis{|n〉} N n=1 .Note that a vector |a〉 will have different coordinates in different bases:|a〉 =N∑|n〉a n =n=1N∑|¯n〉b n (748)Vector operations become operations on components. The componentsof the sum of two vectors is represented by the sum of the components of theindividual vectors: The components of the scalar multiple of a vector by acomplex constant α is the product of α with the components of the vector:N∑N∑N∑|n〉a n + |n〉b n = |n〉(a n + b n ) (749)asn=1α(n=1N∑|n〉a n ) =n=1n=1n=1N∑|n〉(αa n ) (750)n=1Ii is sometimes useful to write the components of vectors in a given basis⎛ ⎞a 1a 2|a〉 =⎜.(751)⎟⎝ a n−1⎠a nIn this notationcan be written as⎛ ⎞ ⎛a 1a 2⎜.+ β⎟ ⎜⎝ a n−1 ⎠ ⎝a n|a〉 + β|b〉 (752)b 1b 2.b n−1b n⎞⎛=⎟ ⎜⎠ ⎝138a 1 + βb 1a 2 + βb 2.a n−1 + βb n−1a n + βb n⎞⎟⎠(753)


Implicit in these expressions is a set of basis vectors. Using the same coordinates with different basis vectors results in a different vector.

Let A be a linear operator and let {|n⟩}_{n=1}^{N} be a basis. Each of the vectors A|n⟩ can be expressed as a linear combination of the basis vectors:

A|n⟩ = Σ_{m=1}^{M} |m⟩ A_mn    (754)

It follows that

A Σ_{n=1}^{N} |n⟩ a_n = Σ_{n=1}^{N} A|n⟩ a_n = Σ_{n=1}^{N} Σ_{m=1}^{M} |m⟩ A_mn a_n    (755)

This means that if a_n are the components of |a⟩ in the basis {|n⟩}_{n=1}^{N}, then the components of A|a⟩ in this basis are

b_m = Σ_{n=1}^{N} A_mn a_n    (756)

Definition: An N × M array of numbers that represents a linear operator A in a basis is called a matrix. Normally the first index is called the "row" index and the second index is called the "column" index. The matrix A_mn is sometimes written as

A_mn = ⎛ a_11      a_12      · · ·  a_1,n−1      a_1n     ⎞
       ⎜ a_21      a_22      · · ·  a_2,n−1      a_2n     ⎟
       ⎜ .         .         · · ·  .            .        ⎟
       ⎜ a_n−1,1   a_n−1,2   · · ·  a_n−1,n−1    a_n−1,n  ⎟
       ⎝ a_n1      a_n2      · · ·  a_n,n−1      a_nn     ⎠    (757)

Definition: The operation

a′_n = Σ_{m=1}^{M} A_nm a_m    (758)

is called multiplication of a vector by a matrix.

The product of linear operators C = AB can be represented by matrices as

C_mn = Σ_{k=1}^{M} A_mk B_kn    (759)


where M is the dimension of the first index of B kn and the dimension of thelast index of A mk This operation is called multiplication of a matrix by amatrix.To compute the elements of a matrix A mn note that〈m|A|n〉 =N∑〈m|k〉A kn (760)k,l=1In order to extract the numbers A kl is is necessary to invert 〈m|k〉. Whilethis can always be done, it can be avoided by using orthonormal bases.IfN∑|a〉 = |n〉a n (761)then the dual vector has the form〈a| =n=1N∑a ∗ n〈n| (762)n=1where 〈n| is the dual of |n〉.The action of the linear functional 〈b| on the vector |a〉 is〈b|a〉 =N∑m=1 n=1N∑b ∗ m〈m|n〉a n (763)Here the same matrix the appears in (760) appears in (763).The relationship between the matrix elements of A and A † can be computedfrom the definition:The left side of this expression is〈b|A|a〉 = 〈a|A † |b〉 ∗ (764)while the right side isN∑ N∑ N∑b ∗ m〈m|k〉A kn a n = (765)m=1 n=1 k=1N∑N∑N∑m=1 n=1 k=1a m 〈k|m〉A †∗kn b∗ n (766)140


Comparing coefficients of the same vectors,

Σ_{k=1}^{N} ⟨k|n⟩ A∗_km = Σ_{k=1}^{N} ⟨m|k⟩ A†_kn    (767)

The same matrix that appears in (760) appears again here.

Definition: The transpose of the matrix A_mn is the matrix A^T_mn = A_nm. This definition allows us to write

Σ_{k=1}^{N} A^{T∗}_mk ⟨k|n⟩ = Σ_{k=1}^{N} ⟨m|k⟩ A†_kn    (768)

The matrices M_mk := ⟨m|k⟩ are characteristic of inner product spaces. While they are always invertible, if the basis vectors are almost parallel the inverse matrix can become very large and lead to computational instabilities. The matrix ⟨m|n⟩ becomes the identity when the basis is orthonormal.

Definition: A basis {|n⟩}_{n=1}^{N} is orthonormal if

⟨m|n⟩ = δ_mn    (769)

In an orthonormal basis (768) becomes

A^{T∗}_mn = A†_mn    (770)

and

A_mn = ⟨m|A|n⟩    (771)

If a basis {|n⟩}_{n=1}^{N} is not orthonormal it is possible to use it to construct a new orthonormal basis. This method is important in quantum mechanics and is called the Gram-Schmidt method. The new basis is denoted by {|n̄⟩}_{n=1}^{N}. The construction is

|1̄⟩ := c_1 |1⟩    (772)

where

c_1 = 1/‖|1⟩‖    (773)

|2̄⟩ = c_2 [|2⟩ − |1̄⟩⟨1̄|2⟩]    (774)

c_2 = 1/‖|2⟩ − |1̄⟩⟨1̄|2⟩‖    (775)


|3̄⟩ = c_3 [|3⟩ − |1̄⟩⟨1̄|3⟩ − |2̄⟩⟨2̄|3⟩]    (776)

c_3 = 1/‖|3⟩ − |1̄⟩⟨1̄|3⟩ − |2̄⟩⟨2̄|3⟩‖    (777)

⋮    (778)

|n̄⟩ = c_n [|n⟩ − Σ_{k=1}^{n−1} |k̄⟩⟨k̄|n⟩]    (779)

c_n = 1/‖|n⟩ − Σ_{k=1}^{n−1} |k̄⟩⟨k̄|n⟩‖    (780)

This completes the construction of the new orthonormal basis.

0.30 Lecture 30

In an orthonormal basis a matrix is Hermitian if

A_mn = A^{T∗}_mn = A†_mn    (781)

In an orthonormal basis a matrix is unitary if

A^{−1}_mn = A^{T∗}_mn = A†_mn    (782)

Next I discuss inverses of linear operators. In what follows I assume that the dimension of the vector space, N, is finite.

Define

ɛ^{1,2,···,N−1,N}_{n_1,n_2,···,n_{N−1},n_N}    (783)

by the conditions

ɛ^{1,2,3,···,N}_{1,2,3,···,N} = 1    (784)

and the requirement that ɛ^{1,2,···,N−1,N}_{n_1,n_2,···,n_{N−1},n_N} is antisymmetric under interchanging any n_i with n_j for j ≠ i.

Given this definition, the determinant of an N × N matrix A_mn is defined by

det(A) = Σ_{n_1,···,n_N} ɛ^{1,2,···,N−1,N}_{n_1,n_2,···,n_{N−1},n_N} A_{1,n_1} A_{2,n_2} · · · A_{N,n_N}    (785)
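As an aside, the Gram-Schmidt construction (772)-(780) above translates directly into a short program. A minimal sketch, assuming NumPy; the random complex vectors are only an example.

import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize `vectors` following (772)-(780)."""
    basis = []
    for v in vectors:
        w = v.astype(complex).copy()
        for e in basis:
            w -= e * np.vdot(e, w)           # subtract the component |e><e|v>
        basis.append(w / np.linalg.norm(w))  # normalize, giving the constants c_n
    return basis

rng = np.random.default_rng(7)
vecs = [rng.normal(size=4) + 1j * rng.normal(size=4) for _ in range(3)]
ortho = gram_schmidt(vecs)
gram = np.array([[np.vdot(x, y) for y in ortho] for x in ortho])
print(np.allclose(gram, np.eye(3)))          # <m|n> = delta_mn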


Theorem 30.1: If det(A) ≠ 0 then A_{1,n}, · · ·, A_{N,n} are the components of N linearly independent vectors |v_k⟩,

|v_k⟩ = Σ_{n=1}^{N} A_{k,n} |n⟩    (786)

To prove this, assume by contradiction that the |v_k⟩ are linearly dependent. Then for some m

|v_m⟩ = Σ_{l≠m} c_l |v_l⟩    (787)

It follows that

det(A) = Σ ɛ^{1,2,···,N−1,N}_{n_1,n_2,···,n_{N−1},n_N} A_{1,n_1} A_{2,n_2} · · · A_{N,n_N}
= Σ ɛ^{1,2,···,N−1,N}_{n_1,n_2,···,n_{N−1},n_N} A_{1,n_1} A_{2,n_2} · · · (Σ_{l≠m} c_l A_{l,n_m}) · · · A_{N,n_N}
= Σ_{l≠m} c_l Σ ɛ^{1,2,···,N−1,N}_{n_1,n_2,···,n_{N−1},n_N} A_{1,n_1} A_{2,n_2} · · · A_{l,n_m} · · · A_{N,n_N}    (788)

For each l in the sum the summand contains terms of the form

ɛ^{···}_{···n_m···n_l···} A_{l,n_m} · · · A_{l,n_l} = −ɛ^{···}_{···n_l···n_m···} A_{l,n_m} · · · A_{l,n_l} = −ɛ^{···}_{···n_m···n_l···} A_{l,n_l} · · · A_{l,n_m}    (789)

where I have used the antisymmetry of ɛ^{1,2,···,N−1,N}_{n_1,n_2,···,n_{N−1},n_N} and relabeled dummy indices. This shows that each such term, and thus each term in the sum, is equal to its negative and therefore vanishes. This contradicts the assumption that det(A) is non-zero.

A similar proof shows that if det(A) ≠ 0 then the "columns" of the matrix A_mn must also be the coordinates of linearly independent vectors.

There is an alternate way to write the determinant of a matrix. A permutation σ on the integers {1, · · ·, N} is an invertible function on {1, · · ·, N}. There are N! distinct permutations of N objects. They can be generated by taking products of pairwise interchanges. We define |σ| = 0 if σ can be obtained from the identity by an even number of pairwise interchanges, and |σ| = 1 if σ can be obtained from the identity by an odd number of pairwise interchanges.


P(N) denotes the set of all N! permutations of the integers {1, · · ·, N}.

With this definition the determinant can be written as

det(A) = Σ_{σ∈P(N)} (−1)^{|σ|} A_{1,σ(1)} · · · A_{N,σ(N)}    (790)

This follows because the sum over n_1 · · · n_N is non-zero only when all of the n_i are different, which corresponds to a permutation. In that case ɛ^{σ(1)···σ(N)}_{1···N} = (−1)^{|σ|}.

Assume that D = det(A) ≠ 0. If I treat the components of the matrix A_mn as a set of N² independent variables, then for any l

D = Σ_n A_ln ∂D/∂A_ln    (791)

This is an immediate consequence of the fact that, for fixed l, each contribution to D contains exactly one factor of the form A_ln.

On the other hand, if l ≠ l′ then

Σ_n A_{l′n} ∂D/∂A_{ln} = 0    (792)

since it is equivalent to replacing the l-th row of the matrix with the l′-th row of the matrix. In this case the resulting expression is the determinant of a matrix with two identical rows, which must vanish.

Combining these two results gives

D δ_{ll′} = Σ_n A_ln ∂D/∂A_{l′n}    (793)

or, if D ≠ 0,

δ_{ll′} = Σ_n A_ln (1/D) ∂D/∂A_{l′n}    (794)

δ_{ll′} = Σ_n A_ln ∂ ln(D)/∂A_{l′n}    (795)

This shows that whenever det(A) ≠ 0 the matrix A_mn has an inverse given by

A^{−1}_mn = ∂ ln(D)/∂A_nm    (796)


The matrix

C_nl := ∂D/∂A_ln    (797)

is called the cofactor matrix of A_nl. I have just shown that

A^{−1}_mn = (1/det(A)) C_mn    (798)
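A numerical sketch of the two results above, assuming NumPy: the determinant computed from the permutation sum (790) agrees with the library value, and the cofactor matrix obtained by (numerically) differentiating det(A) with respect to the entries reproduces the inverse as in (796)-(798). The finite-difference step is an implementation choice made for illustration.

import numpy as np
from itertools import permutations

def det_by_permutations(A):
    """det(A) = sum over permutations of (-1)^|sigma| A_{1,sigma(1)} ... A_{N,sigma(N)}, eq. (790)."""
    n = A.shape[0]
    total = 0.0
    for sigma in permutations(range(n)):
        sign = round(np.linalg.det(np.eye(n)[list(sigma)]))   # (-1)^{|sigma|}
        total += sign * np.prod([A[i, sigma[i]] for i in range(n)])
    return total

rng = np.random.default_rng(8)
A = rng.normal(size=(4, 4))
D = np.linalg.det(A)
print(np.isclose(det_by_permutations(A), D))

# derivatives dD/dA_{mn} by central differences; then A^{-1}_{mn} = (1/D) dD/dA_{nm}
eps = 1e-6
dDdA = np.zeros_like(A)
for m in range(4):
    for n in range(4):
        dA = np.zeros_like(A); dA[m, n] = eps
        dDdA[m, n] = (np.linalg.det(A + dA) - np.linalg.det(A - dA)) / (2 * eps)
print(np.allclose(dDdA.T / D, np.linalg.inv(A), atol=1e-6))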


29:171 - Homework Assignment #11

1. Prove that the eigenvectors of a normal operator corresponding to distinct eigenvalues are orthogonal.

2. Consider the vector space of polynomials with inner product

⟨P_1|P_2⟩ = ∫_0^∞ e^{−x} P_1∗(x) P_2(x) dx

The polynomials 1, x, x², x³, x⁴ are linearly independent vectors in this space. Use the Gram-Schmidt method to make a set of four orthonormal basis functions with this scalar product. Compare your results to expressions for the first four Laguerre polynomials.

3. Show that

A^{−1}_ij := ∂ ln(det(A))/∂A_ji

is both a left and a right inverse of the matrix A_ij.

4. Find the characteristic polynomial Φ(λ) of

A = ⎛ a  b ⎞
    ⎝ b  a ⎠

a. Find the roots of the characteristic polynomial.
b. Show Φ(A) = 0.
c. Find φ_n(A) for each eigenvalue λ_n of A.
d. Show φ_1(A) + φ_2(A) = I.

5. Let A be a Hermitian operator in a d-dimensional vector space with d distinct eigenvalues. Show that

P_m = Π_{n≠m} (A − λ_n)/(λ_m − λ_n)


a. is an orthogonal projection operator
b. P_m P_n = δ_mn P_m.

6. Let J be

J := ⎛  0  1  0 ⎞
     ⎜ −1  0  0 ⎟
     ⎝  0  0  0 ⎠

a. Find the eigenvalues of J.
b. Find the eigenvectors of J.
c. Show that J³ can be expressed as a linear combination of J², J and I.
d. Calculate

R := exp(iJθ)

where θ is a parameter.


0.31 Lecture 31:Example 1:A =( )a11 a 12a 21 a 22(799)det(A) = a 11 a 22 − a 12 a 21 (800)ln(det(A)) = ln(a 11 a 22 − a 12 a 21 ) (801)a −111 = ∂ ln(det(A))∂a 11=a −112 = ∂ ln(det(A))∂a 21=a −121 = ∂ ln(det(A))∂a 12=a −122 = ∂ ln(det(A)) =∂a 22a 22a 11 a 22 − a 12 a 21(802)−a 12a 11 a 22 − a 12 a 21(803)−a 21a 11 a 22 − a 12 a 21(804)a 11a 11 a 22 − a 12 a 21(805)Multiplication shows that this is the inverse of the matrix A.Since the determinant of a matrix plays such an important role it is usefulto establish some other properties of determinants.The most useful of these isTo prove this I first note thatdet(A) = ∑det(AB) = det(A) det(B) (806)1,2,··· ,N−1,Nɛn 1···n N∑σ∈P (N)n 1 ,n 2 ,···n N−1 ,n NA 1,n1 · · · A N,nN = (807)(−) |σ| A 1,σ(1) · · · A N,σ(N) (808)where the sum is over all permutations σ on N objects where |σ| is 1 if σ canbe constructed out of an odd number of pairwise interchanges of (1, 2, · · · , n)and 0 if σ can be constructed out of an even number of pairwise interchangesof (1, 2, · · · , n).It is easy to see that this definition is completely equivalent to the originaldefinition. With this definitiondet(AB) = ∑ ∑(−) |σ| A 1,n1 B n1 ,σ(1) · · · A N,nN B Nn,σ(N) = (809)n 1···n N σ∈P (N)148


∑n 1···n NA 1,n1 · · · A N,nN∑σ∈P (N)(−) |σ| B n1 ,σ(1) · · · B Nn,σ(N) (810)The second sum vanishes unless all of the n i are different. This means thatI can replace the n i sums by a sum over permutations∑∑A 1,σ ′ (1) · · · A N,σ ′ (N) (−) |σ| B σ ′ (1),σ(1) · · · B σ ′ (N),σ(N) (811)σ ′ ∈P (N)which can be rewritten as∑(−) |σ′| A 1,σ ′ (1) · · · A N,σ ′ (N)σ ′ ∈P (N)The second term becomesI can redefine the indexσ∈P (N)∑σ∈P (N)(−) |σ|+(−)|σ′ |B σ ′ (1),σ(1) · · · B σ ′ (N),σ(N)B σ ′ (1),σ(1) · · · B σ ′ (N),σ(N) = B 1,σ −1′ σ(1) · · · B N,σ −1′ σ(N)where the sum still runs over all permutations, to get(812)σ ′′ = σ ′−1 σ, (813)|σ ′′ | = |σ −1′ | + |σ| = −|σ −1′ | + |σ| = −|σ ′ | + |σ| (814)It follows thatdet(AB) =∑∑(−) |σ′| A 1,σ ′ (1) · · · A N,σ ′ (N) (−) |σ′′| B 1,σ ′′ (1) · · · B N,σ ′′ (N) =σ ′ ∈P (N)σ ′′ ∈P (N)In a similar fashion it is possible to showThis will be left as a homework exercise.Note also thatdet(A) det(B) (815)det(A) = det(A T ) (816)det(I) = 1 1 = det(I) = det(AA −1 ) = det(A) det(A −1 ) (817)which impliesdet(A −1 ) =1491det(A)(818)


If {|n⟩}_{n=1}^{N} and {|n̄⟩}_{n=1}^{N} are both orthonormal bases then any vector can be written as

|a⟩ = Σ_{n=1}^{N} |n⟩ a_n = Σ_{n=1}^{N} |n̄⟩ ā_n    (819)

The orthonormality of the basis vectors gives

a_n = ⟨n|a⟩,  ā_n = ⟨n̄|a⟩    (820)

which can be used to express (819) as

|a⟩ = Σ_{n=1}^{N} |n⟩⟨n|a⟩ = Σ_{n=1}^{N} |n̄⟩⟨n̄|a⟩    (821)

Comparing the left and right hand sides of these equations, if I remove |a⟩ from both sides I get two expressions for the identity operator:

I = Σ_{n=1}^{N} |n⟩⟨n| = Σ_{n=1}^{N} |n̄⟩⟨n̄|    (822)

Now consider

|a⟩ = I²|a⟩ = Σ_{m=1}^{N} |m̄⟩⟨m̄| Σ_{n=1}^{N} |n⟩ a_n = Σ_{m=1}^{N} |m̄⟩ ā_m    (823)

Comparing these expressions,

ā_m = Σ_{n=1}^{N} ⟨m̄|n⟩ a_n    (824)

The quantities

⟨m̄|n⟩    (825)

are matrix elements of the operator U:

⟨m̄|n⟩ = ⟨m̄|U|n̄⟩ = ⟨m|U|n⟩    (826)


whereNote thatU := ∑ n|n〉〈¯n| (827)UU † := ∑ nm|n〉〈¯n|| ¯m〉〈m| = ∑ m|m〉〈m| = I (828)which show that U is unitary.This shows that any operator that changes orthonormal bases is unitary.It is also easy to show that any unitary operator can be written in theformU := ∑ |n〉〈¯n| (829)nwhere {|n〉} N n=1 and {|¯n〉} N n=1 are both orthonormal basesTo do this start with the basis {|¯n〉} N n=1 . Define|n〉 := U|ˆn〉 (830)for 1 ≤ n ≤ N.The unitarity of U implies that these transformed vectors are also orthonormaland thatN∑U = |n〉〈ˆn| (831)n=1This shows that unitary operators can always be associated with changesof orthonormal bases. It is useful to contrast the transformation propertiesof basis vectors with the transformation properties of the components ofvectors:N∑ā m = 〈 ¯m|U|¯n〉a n (832)| ¯m〉 =n=1N∑|¯n〉〈 ¯m|U|¯n〉 ∗ =n=1N∑|¯n〉〈¯n|U † | ¯m〉 (833)which show that the basis vectors and components have transformation properties.n=1151


0.32 Lecture 32In the discussion the on matrices I showed that the condition for a linearoperator have an inverse was that the determinant of a matrix representationof the linear operator be non zero.My definition of determinant appears to be basis dependent. Note howeverthat for a linear operator AA mn = 〈m|A|n〉 = ∑ 〈m|¯k〉〈¯k|A|¯l〉〈¯l|n〉=∑〈m|U|k〉 Ā kl 〈l|U † |n〉= ∑ klU mn Ā kl U † ln(834)It follows thatdet(A) = det(UĀU † ) = det(U) det(Ā) det(U † ) = det(UU † ) det(Ā) = det(Ā)(835)This shows that the determinant is invariant with respect to a change ofbasis.I showed this assuming a unitary change of basis. If one does not care ifthe bases are orthonormal then U mn can be replaced by any invertible matrixWA mn = ∑ W mn Ā kl W −1ln(836)klThis will also have the same determinant. The proof is identical.Tensors: Consider an inner product space.Vectors and dual vectors can be written as|a〉 = ∑ n|n〉a n (837)〈a| = ∑ na ∗ N〈n| (838)The scalar product can be expressed in terms of the components asN∑〈b|a〉 = b ∗ n a n (839)n=1152


In a different orthonormal basis this equation becomes

⟨b|a⟩ = Σ_{n=1}^{N} b̄∗_n ā_n    (840)

which is equivalent to

⟨b|a⟩ = Σ_{n,m,k=1}^{N} b̄∗_m U†_mk U_kn a_n    (841)

The relevant observation is that vector components transform with U_kn under a change of basis, while components of dual vectors transform with U†_kn under a change of basis.

Equation (841) can be written in the form

⟨b|a⟩ = Σ_{n,m,k=1}^{N} b̄∗_m U†_mk U_kn a_n = Σ_{n,m,k=1}^{N} U∗_km U_kn b̄∗_m a_n    (842)

It is possible to consider objects that transform like products of n vectors and m dual vectors. An object

T^{a_1···a_n}_{b_1···b_m}    (843)

is called a rank (m, n) tensor if it has the following transformation properties under the transformation U:

T̄^{a_1···a_n}_{b_1···b_m} = Σ U_{a_1a′_1} · · · U_{a_na′_n} U∗_{b_1b′_1} · · · U∗_{b_mb′_m} T^{a′_1···a′_n}_{b′_1···b′_m}    (844)

where the sum runs over the primed indices. With tensors constructed this way, when a vector index and a dual vector index are set equal and summed, the resulting sum is invariant. The remaining indices describe a tensor of rank (n − 1, m − 1). This operation is called a contraction.

Tensors normally appear in a more abstract setting. Typically one has a vector space, a quadratic form, and a dual vector space. The scalar product is generalized to have the form

Q(a, b) = Σ_{m,n} a∗_m Q_mn b_n    (845)


The unitary transformations are replaced by linear transformations that leave the quadratic form Q invariant:

Σ_{c,d} W†_ac Q_cd W_db = Q_ab    (846)

These transformations form a group under matrix multiplication:

W′_ab = Σ_c W¹_ac W²_cb    (847)

The vectors may be real or complex. A rank (m, n) tensor associated with this quadratic form transforms like

T̄^{a_1···a_n}_{b_1···b_m} = Σ W_{a_1a′_1} · · · W_{a_na′_n} W∗_{b_1b′_1} · · · W∗_{b_mb′_m} T^{a′_1···a′_n}_{b′_1···b′_m}    (848)

The important feature of tensors is that the zero tensor is invariant with respect to the transformations W_ab. This means that if an equation implies that two tensor quantities are equal, then the transformed tensors are also equal.

Tensors of the same rank form a vector space. They can be added and multiplied by scalars.

Example: The vector space is a real four-dimensional vector space. The quadratic form is the diagonal matrix

η_ab := ⎛ −1  0  0  0 ⎞
        ⎜  0  1  0  0 ⎟
        ⎜  0  0  1  0 ⎟
        ⎝  0  0  0  1 ⎠    (849)

and the transformations are real transformations that preserve this quadratic form:

Σ_{c,d} Λ_ac Λ_bd η_cd = η_ab    (850)

These transformations are called Lorentz transformations.

The Maxwell field strength tensor

F_ab = ∂A_a/∂x^b − ∂A_b/∂x^a = ⎛  0   −E_1  −E_2  −E_3 ⎞
                               ⎜  E_1   0   −B_3   B_2 ⎟
                               ⎜  E_2   B_3   0   −B_1 ⎟
                               ⎝  E_3  −B_2   B_1   0  ⎠    (851)


is rank two antisymmetric tensor. The structure of this tensor arises becauseit involves derivatives of a vector. Tensors associated with stress and strainand dielectric strength are also related to Taylor expansions.Another well-known tensor is the moment of inertia tensor of a system ofpoint masses:I ij = ∑ m k (⃗x k · ⃗x k δ ij − x i kx j k ) (852)kIn this case the quadratic form is δ ij , the coordinates are real, and the transformationsW ij are three dimensional orthogonal matrices.Tensors appear in many situations. In general they are tied to a quadraticform and a group of transformations that leave the form invariant. Thissetting naturally leads to tensors with indices that transform like vectors(contravariant indices) and indices that transform like dual vectors (covariantindices.)Normally the transformations W ab and Wab ∗ acting on a vector space ofdimension d are the fundamental and conjugate representations of the groupthat preserve the d-dimensional quadratic form Q ab . Tensors can be thoughtof as vectors in a higher dimensional vector space that transform under ahigher dimensional representations of the same group.155
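As a concrete check of the Lorentz-transformation example, the sketch below (assuming NumPy) builds a boost along the x-axis and verifies that it preserves the quadratic form (849)-(850), Λ^T η Λ = η. The boost velocity is an arbitrary example value.

import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])         # the quadratic form (849)

v = 0.6                                       # boost velocity in units of c (example value)
g = 1.0 / np.sqrt(1 - v**2)
Lam = np.array([[g,      -g * v, 0, 0],
                [-g * v,  g,     0, 0],
                [0,       0,     1, 0],
                [0,       0,     0, 1]])

print(np.allclose(Lam.T @ eta @ Lam, eta))    # the invariance condition (850)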


0.33 Lecture 33The Cayley Hamiltonian TheoremLet A be a linear operator represented by a N × N matrix with componentsA mn in an orthonormal basis.Definition:The characteristic polynomial of the linear operator A is thefunctionφ(λ) := det(λI − A). (853)1. Since I have shown that the determinant is independent of the choiceof basis, it follows that this function depends on the operator A, independentof its matrix representation.2. The definition of the determinant implies that φ(λ) is a polynomial ofdegree N.Theorem 31.1Cayley Hamiltonian Theorem:φ(A) = 0. (854)The Cayley-Hamilton is one of the most important theorems in linearalgebra.To prove this recalldet(M)δ ij =N∑k=1∂ det(M)∂M ikM jk . (855)Applying this result to the characteristic polynomialφ(λ)δ ij =N∑k=1∂ det(φ(λ))∂M ik(λI − A) jk . (856)Replacing λ by A in this expression givesφ(A)δ ij =N∑k=1∂ det(φ(λ = A))∂M ik(A − A) jk = 0. (857)While the proof of this result is elementary, it has strong consequences.156
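A numerical check of the theorem (854), assuming NumPy: inserting a matrix into its own characteristic polynomial gives the zero matrix. The random 4 × 4 matrix is only an example.

import numpy as np

rng = np.random.default_rng(9)
A = rng.normal(size=(4, 4))

coeffs = np.poly(A)            # coefficients of det(lambda I - A), highest power first
phi_A = sum(c * np.linalg.matrix_power(A, len(coeffs) - 1 - k) for k, c in enumerate(coeffs))
print(np.allclose(phi_A, 0))   # phi(A) = 0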


Theorem 31.2: Let r > N. Then A^r can be expressed as a polynomial in A of degree N − 1.

To see this, note that the Cayley-Hamilton theorem implies

φ(A) = 0 = A^N + Σ_{k=0}^{N−1} c_k A^k    (858)

which can be written as

A^N = −Σ_{k=0}^{N−1} c_k A^k    (859)

By induction, assume that

A^s = Σ_{k=0}^{N−1} d_k A^k    (860)

This implies

A^{s+1} = Σ_{k=0}^{N−2} d_k A^{k+1} + d_{N−1} A^N = Σ_{k=1}^{N−1} d_{k−1} A^k − Σ_{k=0}^{N−1} d_{N−1} c_k A^k    (861)

so that

A^{s+1} = Σ_{k=0}^{N−1} d′_k A^k,  d′_k = d_{k−1} − d_{N−1} c_k  (with d_{−1} := 0)    (862)

This completes the proof of the theorem. It shows that any function of the operator A that can be approximated by polynomials can in fact be represented by a polynomial of finite degree. The order of the polynomial depends on the dimension of the space.

Since φ(λ) is a polynomial of degree N, the fundamental theorem of algebra implies that

φ(λ) = Π_{l=1}^{L} (λ − λ_l)^{r_l}    (863)

where Σ_{l=1}^{L} r_l = N and the λ_l are the isolated zeros of φ(λ).

The function

1/φ(λ)    (864)


has isolated poles at λ = λ l .1φ(λ) = 1∏ Ll=1 (λ − λ l) r l(865)I define functions of the formf(λ) =L∑ ∑r lc kl((λ − λ l ) ) (866)kl=1k=1Note that if γ l is a positively oriented curve abound λ l that does not containany of the other isolated singularities thenc kl = 1 ∫f(λ)(λ − λ l ) k−1 dλ (867)2πi γ iif I define these constants bythenc kl = 12πi∫γ i1φ(λ) (λ − λ l) k−1 dλ (868)g(λ) = 1 − f(λ) (869)φ(λ)is entire by Morerra’s theorem (integrals around the λ l vanish and it is analyticeverywhere else). It is constant because it is bounded. The constantmust be zero because the constant function vanishes at λ = λ l .Thus I can writeI define the polynomialsthen1φ(λ) =c kl = 12πi∫L∑ ∑r l(l=1k=1c kl(λ − λ l ) k ) (870)γ i1φ(λ) (λ − λ l) k−1 dλ (871)∑r lf l (λ) = c kl (λ − λ l ) rl −k1φ(λ) =k=1L∑ 1f l (λ)(λ − λ l ) r ll=1158(872)(873)


Now that I have this representation, I have

1 = φ(λ) · (1/φ(λ)) = Σ_{l=1}^{L} f_l(λ)/(λ − λ_l)^{r_l} · Π_{k=1}^{L} (λ − λ_k)^{r_k} = Σ_{l=1}^{L} f_l(λ) Π_{k=1,k≠l}^{L} (λ − λ_k)^{r_k} = Σ_{l=1}^{L} φ_l(λ)    (874)

where

φ_l(λ) = f_l(λ) Π_{k=1,k≠l}^{L} (λ − λ_k)^{r_k}    (875)

is a polynomial in λ.

Inserting A for λ in this polynomial gives

I = Σ_{l=1}^{L} φ_l(A)    (876)

Next I investigate properties of the polynomials φ_l(A).

Theorem 31.3: φ_m(A)φ_l(A) = 0 for l ≠ m.

To prove this, observe

φ_l(A)φ_m(A) = f_l(A)f_m(A) Π_{k≠l,m} (A − λ_k)^{r_k} Π_{t=1}^{L} (A − λ_t)^{r_t}    (877)

The factor Π_{t=1}^{L} (A − λ_t)^{r_t} = φ(A) = 0 by the Cayley-Hamilton theorem.

Theorem 31.4: φ_m(A)φ_m(A) = φ_m(A).

To prove this, use Theorem 31.3 and (876) to write

φ_l(A)φ_l(A) = φ_l(A) Σ_{k=1}^{L} φ_k(A) = φ_l(A)I = φ_l(A)    (878)

Pick any vector |ξ⟩ and consider

φ_m(A)|ξ⟩ := |λ_m⟩,    (879)

then

φ_m(A)|λ_m⟩ = |λ_m⟩    (880)


φ n (A)|λ m 〉 = 0 n ≠ m (881)Define S k as the linear subspace of vectors spanned by all vectors of theform φ k (A)|ξ〉. By construction all vectors in this subspace satisfyφ k (A)|χ〉 := |χ〉 (882)φ n (A)|χ〉 := 0 n ≠ k (883)On the other hand any vector can be expressed as|v〉 = I|v〉 =L∑φ l (A)|v〉 (884)l=1as a sum of vectors in each of the subspaces S kDefinition: A generalized eigenvector |v〉 of a linear operator A of order rwith eigenvalue λ is a vector satisfyingand(λI − A) r |v〉 = 0 (885)(λI − A) r−1 |v〉 ≠ 0 (886)Theorem 31.5: Every vector |ξ〉 ∈ S k is a generalized eigenvector of orderr ≤ r k of A with eigenvalue λ kProof:|ξ〉 ∈ S k ⇒ |ξ〉 = φ k (A)|χ〉 (887)by the Cayley Hamilton theorem.Theorem 31.6: If r k = 1 then(λ k I − A) r kφ k (A)|χ〉 = f k (A)φ(A)|χ〉 = 0 (888)|ξ〉 = φ k (A)|χ〉 (889)is an eigenvector of A with eigenvalue λ k .This follows from Theorem 30.5 by setting r k = 1.Theorem 31.7: If |χ〉 ∈ S k then|ξ〉, (λ k I − A)|ξ〉, · · · (λ k I − A) r k−1 |ξ〉 (890)are either 0 or linearly independent.160


To prove this let l be the smallest integer such that (λ k I − A) l |ξ〉 ≠ 0 andconsiderl∑α m (λ k I − A) m |ξ〉 = 0 (891)m=1Multiplying by (λ k I − A) l implies α 0 = 0. Next multiply by (λ k I − A) l−1 toshow α 1 = 0. This can be repeated l + 1 times to show α 0 = α 1 = · · · α l = 0.Putting everything together gives the following result:Theorem 31.7: Any vector |v〉 can be expanded as a linear combination ofthe generalized eigenvectors of the linear operator A.The expansion isL∑|v〉 = φ l (A)|v〉 (892)l=1Theorem 31.8: Let A be a normal operator. Then all generalized eigenvectorsof A have rank 1.Proof: Assume(λI − A) m |ξ〉 ≠ 0 m ≥ 1 (893)ThenThis means thatwhich gives〈ξ|(λI − A) †m (λI − A) m |ξ〉 > 0 (894)(λI − A) †m (λI − A) m |ξ〉 ≠ 0 (895)〈ξ|(λI − A) †m (λI − A) m (λI − A) †m (λI − A) m |ξ〉 = (896)〈ξ|(λI − A) †2m (λI − A) 2m |ξ〉 > 0 (897)This can be repeated until 2 k m > r which gives a contradiction.If A is unitary, Hermitian, or normal it follows thatwith|ξ〉 =L∑φ l (A)|ξ〉 (898)l=1Aφ l (A)|ξ〉 = λ l φ l (A)|ξ〉 (899)This shows that every vector in the vector space is a linear combination ofeigenvectors of A. Since ∑ Lk=1 r k = N is means that each of the subspaces S k161


is r k dimensional, or has r k linearly independent eigenvectors with eigenvalueλ k . Without loss of generality these can be chosen to be orthonormal usingthe Gram Schmid method.This proves the theorem:Theorem 31.9 Every normal operator has a complete set of eigenstates.These can always be chosen to be orthonormal.If A is normal then A can be expressed as the following polynomial in A:Sinceit follows that itA = AI =is a norm convergent series thenL∑Aφ l (A) =l=1L∑λ l φ l (A) (900)l=1A m φ l (A) = λ l A m−1 φ l (A) = · · · = λ m l φ l (A) (901)f(A) =∞∑c m A m (902)m=0f(A) = f(A)I = f(A)L∑l=1 m=0∞∑c m λ m l φ l (A)L∑φ l (A) =l=1L∑f(λ l )φ l (A) (903)This expresses f(A) as a polynomial in A.Example: Let A be normal linear operator on an 3 dimensional space withdistinct eigenvalues λ 1 , λ 2 , λ 3 . It is an elementary exercise to show that inthis simple caseφ 1 (A) = (A − λ 2)(A − λ 3 )(904)(λ 1 − λ 2 )(λ 1 − λ 3 )Using this in the above formula givesl=1φ 2 (A) = (A − λ 1)(A − λ 3 )(λ 2 − λ 1 )(λ 2 − λ 3 )φ 3 (A) = (A − λ 1)(A − λ 2 )(λ 3 − λ 1 )(λ 3 − λ 2 )(905)(906)e A = e λ 1φ 1 (A) + e λ 2φ 2 (A) + e λ 3φ 3 (A) (907)162
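For a normal operator with distinct eigenvalues, the projectors (904)-(906) and the function formula (903)/(907) can be checked numerically. A minimal sketch, assuming NumPy and SciPy are available; the 3 × 3 Hermitian matrix is just an example.

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(10)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = M + M.conj().T                         # normal (Hermitian), generically distinct eigenvalues
lam = np.linalg.eigvals(A)
I3 = np.eye(3)

def phi(l):
    """phi_l(A) = prod over k != l of (A - lam_k)/(lam_l - lam_k), as in (904)-(906)."""
    P = I3.astype(complex)
    for k in range(3):
        if k != l:
            P = P @ (A - lam[k] * I3) / (lam[l] - lam[k])
    return P

projectors = [phi(l) for l in range(3)]
print(np.allclose(sum(projectors), I3))                     # (876): sum of phi_l(A) equals I
print(np.allclose(projectors[0] @ projectors[1], 0))        # Theorem 31.3
eA = sum(np.exp(lam[l]) * projectors[l] for l in range(3))  # the formula (907)
print(np.allclose(eA, expm(A)))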


or more explicitly

e^A = e^{λ_1} (A − λ_2)(A − λ_3)/[(λ_1 − λ_2)(λ_1 − λ_3)] + e^{λ_2} (A − λ_1)(A − λ_3)/[(λ_2 − λ_1)(λ_2 − λ_3)] + e^{λ_3} (A − λ_1)(A − λ_2)/[(λ_3 − λ_1)(λ_3 − λ_2)]    (908)

Previously e^A was defined in terms of an infinite series containing matrix products of all orders. In this representation the only products that appear are I, A, and A².

This shows that whenever a normal operator has N distinct eigenvalues

φ_l(A) = Π_{k≠l}^N (A − λ_k) / Π_{n≠l}^N (λ_l − λ_n)    (909)

and

f(A) = Σ_{l=1}^N f(λ_l) Π_{k≠l}^N (A − λ_k) / Π_{n≠l}^N (λ_l − λ_n)    (910)

The origin of these results is the Cayley-Hamilton theorem, which allows one to reduce any function of A to a polynomial in A of degree at most N − 1.
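The expansion (909)-(910) is easy to test numerically. Here is a minimal sketch, assuming numpy and scipy are available; the random Hermitian matrix is an arbitrary example (being Hermitian it is normal, and its eigenvalues are distinct with probability one), and the result is compared against scipy's matrix exponential.

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = X + X.conj().T                         # Hermitian, hence normal
lam = np.linalg.eigvals(A)                 # N eigenvalues, generically distinct
N = len(lam)
I = np.eye(N)

def phi(l):
    """phi_l(A) = prod_{k != l} (A - lam_k) / prod_{n != l} (lam_l - lam_n), eq. (909)."""
    P = np.eye(N, dtype=complex)
    for k in range(N):
        if k != l:
            P = P @ (A - lam[k] * I) / (lam[l] - lam[k])
    return P

# resolution of the identity and the function formula f(A) = sum_l f(lam_l) phi_l(A), eq. (910)
print(np.allclose(sum(phi(l) for l in range(N)), I))                           # True
print(np.allclose(sum(np.exp(lam[l]) * phi(l) for l in range(N)), expm(A)))    # True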


Next I consider the case that A is any N × N matrix. As discussed above, the characteristic polynomial still has N complex roots. For each distinct eigenvalue it is possible to construct the operator φ_l(A). The range of each of these operators is a subspace S_m of the N dimensional vector space. Since

Σ_{l=1}^L φ_l(A) = I    (911)

any vector can be written as

|v〉 = Σ_{l=1}^L φ_l(A)|v〉    (912)

The vectors φ_l(A)|v〉 are linearly independent, because if

0 = Σ_l a_l φ_l(A)|v〉    (913)

and I multiply by φ_m(A), it follows that

0 = Σ_l a_l φ_m(A)φ_l(A)|v〉 = a_m φ_m(A)|v〉    (914)

which requires that a_m = 0 whenever φ_m(A)|v〉 ≠ 0.

I construct a basis by first constructing a basis on each of the subspaces S_m. Fix m and let r_1 be the largest integer such that

(λ_m − A)^{r_1} |ξ〉 ≠ 0    (915)

for some |ξ〉 ∈ S_m. It follows that

|ξ_{m10}〉 = |ξ〉
|ξ_{m11}〉 = (A − λ_m)|ξ〉
⋮
|ξ_{m1r_1}〉 = (A − λ_m)^{r_1}|ξ〉

are all linearly independent and span a subspace of S_m. Next let r_2 be the largest integer such that

(A − λ_m)^{r_2}|ξ〉 ≠ 0

for some |ξ〉 in S_m but not in the subspace constructed above. This leads to r_2 + 1 new independent vectors, which are labeled

|ξ_{m2n}〉,   0 ≤ n ≤ r_2

This process can be repeated until we exhaust all of the independent vectors in S_m, and then repeated for all distinct eigenvalues.

This leads to a set of basis vectors with labels |mnl〉. By construction

A|mnl〉 = λ_m |mnl〉 + |mn(l+1)〉    (916)

for l + 1 ≤ r_{nm}, while A|mnl〉 = λ_m|mnl〉 when l = r_{nm}. Recall that in a given basis

A|n〉 = Σ_m |m〉 A_{mn}


so it follows that in this new basis

A_{mn} = δ_{mn} λ_n + δ_{m,n+1} η_{n+1,n}    (917)

where η_{n+1,n} is either zero or one.

In this basis A has its eigenvalues along the diagonal. When the same eigenvalue is repeated there may be small blocks with 1's directly below the diagonal.

This is called the Jordan canonical form of the matrix. If {|n̄〉} is any other basis then

|n̄〉 = Σ_m |m〉 S_{mn},    |n〉 = Σ_m |m̄〉 S^{−1}_{mn}    (918)

for some invertible matrix S_{mn}. It follows that

A|m̄〉 = Σ_n |n̄〉 Ā_{nm} = Σ_{l,n} |l〉 S_{ln} Ā_{nm},   or equivalently   Σ_k A|k〉 S_{km} = Σ_{l,n} |l〉 S_{ln} Ā_{nm}    (919)

so that

A|k〉 = Σ_{l,n,m} |l〉 S_{ln} Ā_{nm} S^{−1}_{mk} = Σ_l |l〉 A_{lk}    (920)

Identifying coefficients of independent basis vectors gives

A_{lk} = Σ_{n,m} S_{ln} Ā_{nm} S^{−1}_{mk}    (921)

This shows that any matrix can be brought into Jordan canonical form by a suitable change of basis. Note that in the general case the change of basis matrix S is not unitary; it is just an invertible matrix.
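For explicit matrices this change of basis is implemented in computer algebra systems. Here is a minimal sketch assuming sympy; the Jordan matrix J0 and the invertible matrix T below are arbitrary choices used only to manufacture a test matrix. Note that sympy's convention places the 1's above the diagonal, whereas these notes place them below.

import sympy as sp

# manufacture a matrix with known Jordan structure: one 2 x 2 block for eigenvalue 2,
# plus simple eigenvalues 3 and 5
J0 = sp.Matrix([[2, 1, 0, 0],
                [0, 2, 0, 0],
                [0, 0, 3, 0],
                [0, 0, 0, 5]])
T = sp.Matrix([[1, 0, 0, 0],
               [1, 1, 0, 0],
               [0, 1, 1, 0],
               [1, 0, 1, 1]])          # lower triangular with unit diagonal, so det T = 1
A = T * J0 * T.inv()

S, J = A.jordan_form()                  # A = S J S^{-1}; S is invertible but not unitary in general
print(J)                                # recovers the block structure of J0 (up to block order)
assert sp.simplify(S * J * S.inv() - A) == sp.zeros(4, 4)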


When the matrix is normal the Jordan matrix is diagonal and the basis vectors can be chosen to be orthonormal. In this case, if {|n̄〉} is also an orthonormal basis, then

|n̄〉 = Σ_m |m〉 U_{mn},    U_{mn} = 〈m|n̄〉,    U^{−1}_{mn} = 〈m̄|n〉    (922)

and the operator can be brought into diagonal form by a unitary transformation. The entries on the diagonal are just the eigenvalues of the operator.

When two matrices commute,

[A, B] = 0    (923)

then polynomials in these operators also commute. It follows that

φ_l(A) φ_m(B) = φ_m(B) φ_l(A)    (924)

Applying this operator to an arbitrary vector |ξ〉 gives

φ_l(A) φ_m(B) |ξ〉    (925)

If this vector is not zero, then it follows that

(λ_{al} − A)^{r_{al}} φ_l(A) φ_m(B) |ξ〉 = 0    (926)

and

(λ_{bm} − B)^{r_{bm}} φ_l(A) φ_m(B) |ξ〉 = (λ_{bm} − B)^{r_{bm}} φ_m(B) φ_l(A) |ξ〉 = 0    (927)

which means that this vector is a generalized eigenvector of both A and B. We also have

I = Σ_{l=1}^L Σ_{m=1}^M φ_l(A) φ_m(B)    (928)

which means that commuting operators have complete sets of simultaneous generalized eigenstates.

When A and B are normal, the generalized eigenvectors become eigenvectors. The above result then means there is a basis of simultaneous eigenvectors of both A and B.

If |a〉 is the only eigenvector of a normal operator A with eigenvalue a, and B is a normal operator satisfying [B, A] = 0, then |a〉 must be an eigenvector of B for some eigenvalue. This follows because

φ_l(A) = Σ_m φ_m(B) φ_l(A) ≠ 0    (929)

which means that there is at least one value of m with φ_m(B)φ_l(A) ≠ 0. Since the range of φ_l(A) is one dimensional, there can be no more than one value of m with this property.
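A small numerical illustration of the simultaneous eigenvector statement (numpy assumed; the matrices are arbitrary examples constructed to share an eigenbasis, not matrices from these notes):

import numpy as np

rng = np.random.default_rng(1)
N = 5
Q, _ = np.linalg.qr(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))   # random unitary

A = Q @ np.diag(rng.normal(size=N)) @ Q.conj().T      # Hermitian, nondegenerate spectrum
B = Q @ np.diag(rng.normal(size=N)) @ Q.conj().T      # Hermitian, same eigenvectors as A

print(np.allclose(A @ B, B @ A))                       # True: [A, B] = 0

# Because the eigenvalues of A are nondegenerate, any eigenbasis of A must also
# diagonalize B, as the argument above requires:
_, V = np.linalg.eigh(A)
B_in_A_basis = V.conj().T @ B @ V
print(np.allclose(B_in_A_basis, np.diag(np.diag(B_in_A_basis))))   # True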


Weyl Pairs and irreducibility

Note that if A is an N × N matrix the Cayley-Hamilton theorem implies that any function of A can be expressed as a polynomial in A of degree at most N − 1. If I consider the space of N × N matrices as a complex vector space, the Cayley-Hamilton theorem implies that the powers of A span a subspace of this space of at most N dimensions. On the other hand this vector space has N² dimensions. This means that not every operator can be written as a function of A. What we want to show next is that polynomials in carefully chosen pairs of matrices can be used to represent any matrix.

Begin by assuming that A is normal and has N distinct eigenvalues with orthonormal eigenvectors

A|a_n〉 = a_n |a_n〉    (930)

where 1 ≤ n ≤ N.

Next define the shift operator

U|a_n〉 = |a_{n+1}〉,   n < N    (931)
U|a_N〉 = |a_1〉    (932)

It follows that

U = Σ_{n=1}^{N−1} |a_{n+1}〉〈a_n| + |a_1〉〈a_N|    (933)

has the form of a unitary change of basis.

By construction

U^N |a_n〉 = |a_n〉    (934)

for all basis vectors, so

U^N − I = 0    (935)

Thus the characteristic polynomial is λ^N − 1, which has roots

η_n = e^{2πin/N},   n = 1, 2, ⋯ , N    (936)

or equivalently

η_n = e^{2πin/N},   n = 0, 1, ⋯ , N − 1    (937)

This means that

U^N − I = Π_{n=1}^N (U − η_n)    (938)


If I pick a particular fixed value k of n, it follows that

(U/η_k)^N − I = 0 = (U/η_k − I) Σ_{l=0}^{N−1} (U/η_k)^l    (939)

where I have used the factorization

(x^N − 1) = (x − 1)(1 + x + x² + ⋯ + x^{N−1})

Next I show that the operator

Π_k := (1/N) Σ_{l=0}^{N−1} (U/η_k)^l    (940)

is an orthogonal projection operator. To see this note that, since U is unitary and η_k is a phase,

Π_k^† = (1/N) Σ_{l=0}^{N−1} (U^† η_k)^l = (1/N) Σ_{l=0}^{N−1} (U/η_k)^{−l}

Multiplying each term by (U/η_k)^N = I and setting m = N − l gives

Π_k^† = (1/N) Σ_{l=0}^{N−1} (U/η_k)^{N−l} = (1/N) Σ_{m=1}^{N} (U/η_k)^m = Π_k

where the m = N term equals the m = 0 term because (U/η_k)^N = I. I also have

Π_k² = (1/N²) Σ_{l,m=0}^{N−1} (U/η_k)^{l+m}

Clearly for each fixed value of l, l + m runs from l to l + N − 1, which means each power of U/η_k (mod N) appears exactly once. Since there are N values of l this sum is repeated N times, giving

Π_k² = (N/N²) Σ_{l=0}^{N−1} (U/η_k)^l = Π_k


In addition, (939) implies

(U − η_k) Π_k = 0

This means each Π_k projects onto a different invariant subspace of U. Since there are N such operators with distinct eigenvalues, each of these subspaces must be one dimensional. This demonstrates that

Π_k = |u_k〉〈u_k|    (941)

where

U|u_k〉 = η_k |u_k〉

These equations determine the basis vectors |u_k〉 up to a phase. To choose the phase of |u_k〉 note that

〈a_N|u_k〉〈u_k|a_N〉 = (1/N) Σ_{l=0}^{N−1} 〈a_N|(U/η_k)^l|a_N〉 = 1/N

I choose the phase so that

〈a_N|u_k〉 = 1/√N

Using this choice gives

〈a_N|u_k〉〈u_k|a_m〉 = (1/N) Σ_{l=0}^{N−1} 〈a_N|(U/η_k)^l|a_m〉

The only surviving term has l + m = N, or l = N − m, which gives

(1/√N) 〈u_k|a_m〉 = (1/N) η_k^m

or

〈u_k|a_m〉 = (1/√N) η_k^m

What is distinctive about these two operators is that the magnitude of the inner product of any basis vector in one set with any basis vector in the other set is entirely independent of the choice of basis vectors. In all cases the relevant magnitude is 1/√N.

It follows that

|a_m〉 = Σ_k |u_k〉〈u_k|a_m〉 = Σ_k |u_k〉 (1/√N) η_k^m
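The projection operators Π_k are easy to construct explicitly. A minimal numpy sketch follows; the dimension N = 6 is an arbitrary choice, and U is written in the a-basis, where it is just a cyclic permutation matrix.

import numpy as np

N = 6
U = np.roll(np.eye(N), 1, axis=0)                  # U|a_n> = |a_{n+1}>, cyclically
eta = np.exp(2j * np.pi * np.arange(N) / N)        # the eigenvalues eta_k

def Pi(k):
    """Pi_k = (1/N) sum_{l=0}^{N-1} (U / eta_k)^l, eq. (940)."""
    return sum(np.linalg.matrix_power(U, l) / eta[k] ** l for l in range(N)) / N

for k in range(N):
    P = Pi(k)
    assert np.allclose(P, P.conj().T)              # Hermitian
    assert np.allclose(P @ P, P)                   # idempotent
    assert np.allclose(U @ P, eta[k] * P)          # (U - eta_k) Pi_k = 0
    assert np.isclose(np.trace(P).real, 1.0)       # trace one, so Pi_k = |u_k><u_k| is rank one

# the vector in the range of Pi_k is a discrete Fourier phase pattern over the a-basis
k = 2
u_k = eta[k] ** (-np.arange(N)) / np.sqrt(N)
assert np.allclose(U @ u_k, eta[k] * u_k)
assert np.allclose(Pi(k) @ u_k, u_k)
print("all checks passed")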


The next step is to define an adjoint shift operator on the eigenstates |u_k〉 of U. Define a new operator V by

〈u_k|V = 〈u_{k+1}|,   k < N,    〈u_N|V = 〈u_1|

As in the case of U we have

V^N − I = 0

with characteristic roots η_k. I also have

0 = (V − η_k) Σ_{l=1}^N (V/η_k)^l

with

|v_k〉〈v_k| = (1/N) Σ_{l=1}^N (V/η_k)^l

where |v_k〉 is the eigenvector of V with eigenvalue η_k and I choose the phase of |v_k〉 so that

〈u_N|v_k〉 = 1/√N

It follows that

〈u_m|v_k〉〈v_k|u_N〉 = 〈u_m| (1/N) Σ_{l=1}^N (V/η_k)^l |u_N〉

The only surviving term has l = N − m, which gives

〈u_m| (1/N) Σ_{l=1}^N (V/η_k)^l |u_N〉 = (1/N)(1/η_k)^{N−m} = (1/N) η_k^m

and therefore

〈u_m|v_k〉 = (1/√N) η_k^m

Since η_k^m = e^{2πikm/N} is symmetric under interchange of k and m, this also gives 〈u_k|v_m〉 = (1/√N) η_k^m, so

|v_m〉 = Σ_k |u_k〉〈u_k|v_m〉 = Σ_k |u_k〉 (1/√N) η_k^m

Comparing this to the corresponding expression for |a_m〉,

|a_m〉 = Σ_k |u_k〉 (1/√N) η_k^m = |v_m〉


From this it follows that |v_k〉 = |a_k〉, so we return to the original set of starting vectors.

Note that

U V |v_k〉 = η_k U|v_k〉 = η_k |v_{k+1}〉
V U |v_k〉 = V |v_{k+1}〉 = η_{k+1} |v_{k+1}〉

Comparing these expressions gives the Weyl relation

V U = U V e^{2πi/N}

Note that

|u_m〉〈u_n| = |u_m〉〈u_m| V^{n−m} = (1/N) Σ_{k=1}^N e^{−2πimk/N} U^k V^{n−m}

A general operator O can be written as

O = Σ_{m,n} |u_m〉 O_{mn} 〈u_n| = (1/N) Σ_{k,m,n=1}^N O_{mn} e^{−2πimk/N} U^k V^{n−m}

This shows that any operator has the general form

O = Σ_{n,m=1}^N o(m, n) U^m V^n

Unlike the powers A^m of a single operator, the N² operators U^n V^m can be used to expand any N × N matrix.

The operators U and V are called Weyl pairs. In quantum mechanics the pair U and V are called complementary operators. They are associated with a maximal mixing of eigenstates.
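In the a-basis this Weyl pair has a very concrete matrix form: U is the cyclic shift matrix and, because the |v_m〉 coincide with the |a_m〉, V is diagonal with the N-th roots of unity on the diagonal. A minimal numpy sketch (the dimension N = 4 is an arbitrary choice) verifying the Weyl relation and the fact that the N² monomials U^m V^n span all N × N matrices:

import numpy as np

N = 4
eta = np.exp(2j * np.pi / N)
U = np.roll(np.eye(N), 1, axis=0)                # shift operator in the a-basis
V = np.diag(eta ** np.arange(1, N + 1))          # V|a_m> = eta^m |a_m>

# Weyl relation: V U = e^{2 pi i / N} U V
assert np.allclose(V @ U, eta * (U @ V))

# the N^2 monomials U^m V^n are linearly independent, so they span all N x N matrices
monomials = [(np.linalg.matrix_power(U, m) @ np.linalg.matrix_power(V, n)).ravel()
             for m in range(N) for n in range(N)]
print(np.linalg.matrix_rank(np.array(monomials)))    # prints 16 = N**2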


In many cases the Weyl pairs can be decomposed into smaller parts. Assume that the dimension N can be factored as a product N = KM. Define

U_1 = U^M,    U_2 = U^K    (942)

Then U_1^K = I and U_2^M = I. Since they are both powers of the same operator they commute. Define η_{1k} = e^{2πik/K} and η_{2m} = e^{2πim/M}.

We label the eigenstates of U using pairs of indices,

|u_n〉 → |u_{1k_1} u_{2k_2}〉    (943)

that is, the eigenstates of U can be labeled by ordered pairs of integers n → (k_1, k_2) where 1 ≤ k_1 ≤ K and 1 ≤ k_2 ≤ M. In this notation

U_1 |u_{1k_1} u_{2k_2}〉 = η_{1k_1} |u_{1k_1} u_{2k_2}〉    (944)
U_2 |u_{1k_1} u_{2k_2}〉 = η_{2k_2} |u_{1k_1} u_{2k_2}〉    (945)

Next I define the V operators by

〈u_{1k_1} u_{2k_2}| V_1 = 〈u_{1(k_1+1)} u_{2k_2}|    (946)
〈u_{1k_1} u_{2k_2}| V_2 = 〈u_{1k_1} u_{2(k_2+1)}|    (947)

where in each index the last basis vector is mapped back to the first. These equations lead to

V_1^K − I = V_2^M − I = 0    (948)

These characteristic polynomials imply that U_i and V_i have the same eigenvalues. We also have

V_1 V_2 = V_2 V_1    (949)

so these operators have simultaneous eigenstates

|v_{1k_1} v_{2k_2}〉    (950)

Appealing to the previous construction it also follows that

U_1 |v_{1k_1} v_{2k_2}〉 = |v_{1(k_1+1)} v_{2k_2}〉    (951)
U_2 |v_{1k_1} v_{2k_2}〉 = |v_{1k_1} v_{2(k_2+1)}〉    (952)

while the V operators act on the u-basis as above,

〈u_{1k_1} u_{2k_2}| V_1 = 〈u_{1(k_1+1)} u_{2k_2}|    (953)
〈u_{1k_1} u_{2k_2}| V_2 = 〈u_{1k_1} u_{2(k_2+1)}|    (954)

The standard construction gives

U_1^K = V_1^K = U_2^M = V_2^M = I    (955)


V_1 U_1 = e^{2πi/K} U_1 V_1,    V_2 U_2 = e^{2πi/M} U_2 V_2    (956)

Simple computations on the states give

U_1 U_2 = U_2 U_1,    V_1 V_2 = V_2 V_1,    U_1 V_2 = V_2 U_1,    V_1 U_2 = U_2 V_1    (957)

For example

〈u_{1k_1} u_{2k_2}| U_1 V_2 = η_{1k_1} 〈u_{1k_1} u_{2(k_2+1)}| = 〈u_{1k_1} u_{2k_2}| V_2 U_1    (958)

We see that the U operators commute with each other and have common eigenvectors.

The decomposition above can be repeated if either of the smaller dimensions K or M admits a factorization into a pair of smaller numbers. This process can be continued until all U and V operators correspond to prime factors.

In the general case all of the U and V operators are needed as a basis for the most general operator. Specifically, the dimension of the vector space can be factored into a product of prime numbers. Each of these primes appears twice in calculating the dimension of the linear space of matrices on this space; one factor corresponds to a U and the other identical prime corresponds to the corresponding V. Thus the most general operator has the form

F = Σ f(a_1, ⋯ , a_k, b_1, ⋯ , b_k) U_1^{a_1} ⋯ U_k^{a_k} V_1^{b_1} ⋯ V_k^{b_k}    (959)

Thus we see that the basic building blocks of any operator are Weyl pairs of operators associated with prime dimensions.
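One concrete realization of such a factorization (a numpy sketch for N = 6 = 2 · 3; the particular powers V^3 and V^4 used below to build V_1 and V_2 are an illustrative choice that happens to reproduce (955)-(957), not a prescription taken from these notes):

import numpy as np

N, K, M = 6, 2, 3
eta = np.exp(2j * np.pi / N)
U = np.roll(np.eye(N), 1, axis=0)
V = np.diag(eta ** np.arange(1, N + 1))
mp = np.linalg.matrix_power

U1, U2 = mp(U, M), mp(U, K)            # U1 = U^M, U2 = U^K
V1, V2 = mp(V, 3), mp(V, 4)            # assumed powers of V, chosen to satisfy (955)-(957)

Id = np.eye(N)
assert np.allclose(mp(U1, K), Id) and np.allclose(mp(V1, K), Id)        # eq. (955)
assert np.allclose(mp(U2, M), Id) and np.allclose(mp(V2, M), Id)
assert np.allclose(V1 @ U1, np.exp(2j * np.pi / K) * (U1 @ V1))         # eq. (956)
assert np.allclose(V2 @ U2, np.exp(2j * np.pi / M) * (U2 @ V2))
assert np.allclose(U1 @ U2, U2 @ U1) and np.allclose(V1 @ V2, V2 @ V1)  # eq. (957)
assert np.allclose(U1 @ V2, V2 @ U1) and np.allclose(V1 @ U2, U2 @ V1)
print("Weyl pair factorization checks passed")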


Moore Penrose Generalized Inverse

In our discussion of matrices, the condition for a matrix to have an inverse is the requirement that its determinant is non-vanishing. Sometimes computational methods lead to instabilities when the determinant of a large matrix gets too small. It is often useful to have an algorithm for computing something that always exists and becomes the inverse when the inverse exists.

A quantity with these properties is called the Moore-Penrose generalized inverse. If A is a square matrix, its generalized inverse is a matrix X satisfying

A X A = A    (960)
X A X = X    (961)
A X = (A X)^†    (962)
X A = (X A)^†    (963)

First note that there is at most one X satisfying these equations. Let X_1 and X_2 satisfy these equations. First note

A X_2 = (A X_2)^† = [(A X_1 A) X_2]^† = [(A X_1)(A X_2)]^† = (A X_2)(A X_1) = (A X_2 A) X_1 = A X_1    (964)

Similarly

X_2 A = (X_2 A)^† = (X_2 A X_1 A)^† = (X_1 A)^† (X_2 A)^† = (X_1 A)(X_2 A) = X_1 (A X_2 A) = X_1 A    (965)

It now follows that

X_1 = X_1 A X_1 = X_2 A X_1 = X_2 A X_2 = X_2    (966)

which shows the uniqueness.
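Numerically, the Moore-Penrose generalized inverse is available as numpy.linalg.pinv. A minimal sketch (the singular matrix below is an arbitrary example) checking the four defining equations (960)-(963) and the fact that the generalized inverse reduces to the ordinary inverse when the latter exists:

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],       # second row is twice the first, so det A = 0
              [0.0, 1.0, 1.0]])
X = np.linalg.pinv(A)                # exists even though A has no ordinary inverse

assert np.allclose(A @ X @ A, A)                 # (960)
assert np.allclose(X @ A @ X, X)                 # (961)
assert np.allclose(A @ X, (A @ X).conj().T)      # (962)
assert np.allclose(X @ A, (X @ A).conj().T)      # (963)

B = np.array([[2.0, 1.0],
              [1.0, 1.0]])                       # invertible
assert np.allclose(np.linalg.pinv(B), np.linalg.inv(B))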


29:171 - Homework Assignment #10

1. In classical mechanics we can consider the space of functions f(x, p) of coordinates and linear momentum as vectors in a vector space. The energy E = E(x, p) of a system can be expressed as the sum of a kinetic and a potential energy, which is a function of coordinates and linear momentum.

Define the operator A by

(Af)(x, p) = (∂E/∂x)(∂f/∂p) − (∂E/∂p)(∂f/∂x)

Show that A is a linear operator. We sometimes write

Af = {E, f}_p.b.

which is called the Poisson bracket of E and f.

Let E = ½(p² + x²) be the energy of a one dimensional simple harmonic oscillator with mass 1 and spring constant 1. Calculate

Ax = {E, x}_p.b.
Ap = {E, p}_p.b.

Use these results to calculate

x(t) = e^{−tA} x

Show that this solution describes simple harmonic motion corresponding to an initial coordinate and momentum given by x and p.

2. Let

σ_x = ( 0  1 )      σ_y = ( 0  −i )      σ_z = ( 1   0 )
      ( 1  0 )            ( i   0 )            ( 0  −1 )

These matrices are called the Pauli spin matrices.


a. Let n̂ be a real unit vector. Show that

n̂ · σ⃗ = n̂_x σ_x + n̂_y σ_y + n̂_z σ_z

is a Hermitian matrix.

b. Show that e^{λ n̂·σ⃗} is a positive operator for every real λ.

3. Let N be a nilpotent operator satisfying N³ = 0. Let |v〉 be an initial vector and define the polynomial in t

|v(t)〉 = |v〉 + tN|v〉 + (t²/2!) N²|v〉

Show that |v(t)〉 satisfies the differential equation

d|v(t)〉/dt := lim_{ε→0} (|v(t + ε)〉 − |v(t)〉)/ε = N|v(t)〉

4. Show that three vectors ⃗a, ⃗b, ⃗c in a three dimensional space are linearly dependent if and only if their components satisfy

⃗a · (⃗b × ⃗c) = 0

5. Show that det(A) = det(A^T), where A^T_{mn} = A_{nm}.

6. Complex numbers are vectors in a one-dimensional vector space. Positive operators on this space are given by multiplication by a real positive number. Let A = 1/3. Approximate the positive square root of this operator using the method used in class:

C = 1 − A = 2/3,    X = 1 − √A,    X_n = ½(C + X²_{n−1}),    X_0 = C/2,    X = lim_{n→∞} X_n

Compare your approximation to what you get using a calculator.


29:171 - Homework Assignment #12

Consider the matrix

M := ( 1  −i  0 )
     ( i  −1  0 )
     ( 0   0  1 )

1. Find the characteristic polynomial of M.
2. Find the eigenvalues of M.
3. Find the polynomials φ_i(λ).
4. Calculate φ_i(M).
5. Let |v〉 be any vector such that φ_i(M)|v〉 ≠ 0. Show that

M φ_i(M)|v〉 = λ_i φ_i(M)|v〉

where λ_i is the i-th eigenvalue.

6. Calculate sin(M).
7. Show that Σ_i φ_i(M) = I.
8. Find a similarity transform that diagonalizes M.
9. Find M^{−1}.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!