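Announcement: Infinitesimal Calculus site redesign, seeking a successor

    Dear readers, the New Year is approaching. The Infinitesimal Calculus website is about to be redesigned, and we are looking for a successor to take it over.
    This is hereby announced.
Yuan Meng (袁萌), Chen Qiqing (陈启清), December 30
Attachment: original text of "Hyperreal Calculus"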
Hyperreal Calculus MAT2000 –– Project in Mathematics
Arne Tobias Malkenes Ødegaard Supervisor: Nikolai Bjørnestøl Hansen
Abstract This project deals with doing calculus not with epsilons and deltas, but with a number system called the hyperreal numbers. The hyperreal numbers are an extension of the ordinary real numbers in which both infinitely small and infinitely large numbers have been added. We first show how this system can be constructed, and then establish some basic properties of the hyperreal numbers. We then show how convergence, continuity, limits and differentiation can be treated in this system, and that the two approaches give rise to the same definitions and results.
Contents
1 Construction of the hyperreal numbers
1.1 Intuitive construction
1.2 Ultrafilters
1.3 Formal construction
1.4 Infinitely small and large numbers
1.5 Enlarging sets
1.6 Extending functions
2 The transfer principle
2.1 Stating the transfer principle
2.2 Using the transfer principle
3 Properties of the hyperreals
3.1 Terminology and notation
3.2 Arithmetic of hyperreals
3.3 Halos
3.4 Shadows
4 Convergence
4.1 Convergence in hyperreal calculus
4.2 Monotone convergence
5 Continuity
5.1 Continuity in hyperreal calculus
5.2 Examples
5.3 Theorems about continuity
5.4 Uniform continuity
6 Limits and derivatives
6.1 Limits in hyperreal calculus
6.2 Differentiation in hyperreal calculus
6.3 Examples
6.4 Increments
6.5 Theorems about derivatives
1 Construction of the hyperreal numbers
1.1 Intuitive construction
We want to construct the hyperreal numbers as sequences of real numbers $\langle r_n \rangle = \langle r_1, r_2, \ldots \rangle$. The idea is to let sequences with $\lim_{n\to\infty} r_n = 0$ represent infinitely small numbers, or infinitesimals, and to let sequences with $\lim_{n\to\infty} r_n = \infty$ represent infinitely large numbers. However, if we simply define each hyperreal number to be a sequence of real numbers, with addition and multiplication defined elementwise, we have the problem that this structure is not a field, since
$$\langle 1,0,1,0,\ldots \rangle \cdot \langle 0,1,0,1,\ldots \rangle = \langle 0,0,0,0,\ldots \rangle.$$
We solve this by introducing an equivalence relation on the set of real-valued sequences. We want to identify two sequences if the set of indices on which they agree is a large subset of $\mathbb{N}$, for a certain technical meaning of large. Let us first discuss some properties we should expect this concept of largeness to have.
• $\mathbb{N}$ itself must be large, since a sequence must be equivalent to itself.
• If a set contains a large set, it should itself be large.
• The empty set $\emptyset$ should not be large.
• We want our relation to be transitive, so if the sequences $r$ and $s$ agree on a large set, and $s$ and $t$ agree on a large set, we want $r$ and $t$ to agree on a large set.
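The zero-divisor problem above can be checked on a finite prefix of the two sequences. The following is only an illustrative sketch: the prefix length `N` and the helper `prefix` are assumptions made for the example and do not appear in the original text.

```python
from itertools import cycle, islice

N = 12  # length of the finite prefix we inspect (arbitrary choice for this sketch)

def prefix(pattern, n=N):
    """Return the first n terms of the periodic sequence with the given pattern."""
    return list(islice(cycle(pattern), n))

r = prefix([1, 0])  # <1, 0, 1, 0, ...>
s = prefix([0, 1])  # <0, 1, 0, 1, ...>

# Elementwise product: every term is 0 although neither factor is the zero sequence.
product = [a * b for a, b in zip(r, s)]

# Index sets (1-based) on which each factor agrees with the zero sequence.
zeros_of_r = {n for n, a in enumerate(r, start=1) if a == 0}
zeros_of_s = {n for n, b in enumerate(s, start=1) if b == 0}

print(product)
print(sorted(zeros_of_r))
print(sorted(zeros_of_s))
```

The two zero sets are complementary (the even and the odd indices). A notion of largeness under which exactly one of a set and its complement is large would therefore identify exactly one of the two factors with zero, which removes this particular obstruction.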
1.2 Ultrafilters
Our model of a large set is a mathematical structure called an ultrafilter.
Definition 1.1 (Ultrafilters). An ultrafilter on $\mathbb{N}$ is a set $\mathcal{F}$ of subsets of $\mathbb{N}$ such that:
• If $X \in \mathcal{F}$ and $X \subseteq Y \subseteq \mathbb{N}$, then $Y \in \mathcal{F}$. That is, $\mathcal{F}$ is closed under supersets.
• If $X \in \mathcal{F}$ and $Y \in \mathcal{F}$, then $X \cap Y \in \mathcal{F}$. That is, $\mathcal{F}$ is closed under intersections.
• $\mathbb{N} \in \mathcal{F}$, but $\emptyset \notin \mathcal{F}$.
• For any subset $A$ of $\mathbb{N}$, $\mathcal{F}$ contains exactly one of $A$ and $\mathbb{N} \setminus A$.
We say that an ultrafilter is free if it contains no finite subsets of $\mathbb{N}$. Note that a free ultrafilter contains all cofinite subsets of $\mathbb{N}$ (sets with finite complement), by the last property of an ultrafilter.
Theorem 1.2. There exists a free ultrafilter on $\mathbb{N}$.
Proof. See [Kei76, p. 49].
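1.3 Formal construction
Let $\mathcal{F}$ be a fixed free ultrafilter on $\mathbb{N}$. We define a relation $\equiv$ on the set of real-valued sequences $\mathbb{R}^{\mathbb{N}}$ by letting
$$\langle r_n \rangle \equiv \langle s_n \rangle \iff \{n \in \mathbb{N} \mid r_n = s_n\} \in \mathcal{F}.$$
Proposition 1.3 (Equivalence). The relation $\equiv$ is an equivalence relation on $\mathbb{R}^{\mathbb{N}}$.
Proof. We check the three defining properties of an equivalence relation.
Reflexivity: Since $\{n \in \mathbb{N} \mid r_n = r_n\} = \mathbb{N}$ and $\mathbb{N} \in \mathcal{F}$, $\equiv$ is reflexive.
Symmetry: The sets $\{n \in \mathbb{N} \mid r_n = s_n\}$ and $\{n \in \mathbb{N} \mid s_n = r_n\}$ are the same, so if one belongs to $\mathcal{F}$, so does the other.
Transitivity: Assume that $\langle r_n \rangle \equiv \langle s_n \rangle$ and $\langle s_n \rangle \equiv \langle t_n \rangle$. Then both $\{n \in \mathbb{N} \mid r_n = s_n\} \in \mathcal{F}$ and $\{n \in \mathbb{N} \mid s_n = t_n\} \in \mathcal{F}$. Since
$$\{n \in \mathbb{N} \mid r_n = s_n\} \cap \{n \in \mathbb{N} \mid s_n = t_n\} \subseteq \{n \in \mathbb{N} \mid r_n = t_n\},$$
and $\mathcal{F}$ is closed under intersections and supersets, $\{n \in \mathbb{N} \mid r_n = t_n\} \in \mathcal{F}$, and so $\langle r_n \rangle \equiv \langle t_n \rangle$, as desired.
Since $\equiv$ is an equivalence relation, we can define the set of hyperreal numbers ${}^*\mathbb{R}$ as the set of real-valued sequences modulo $\equiv$. In symbols,
$${}^*\mathbb{R} = \{[r] \mid r \in \mathbb{R}^{\mathbb{N}}\} = \mathbb{R}^{\mathbb{N}} / \equiv.$$
We define addition and multiplication of elements of ${}^*\mathbb{R}$ by elementwise addition and multiplication of representing sequences:
$$[r] + [s] = [\langle r_n \rangle] + [\langle s_n \rangle] = [\langle r_n + s_n \rangle],$$
$$[r] \cdot [s] = [\langle r_n \rangle] \cdot [\langle s_n \rangle] = [\langle r_n \cdot s_n \rangle].$$
We define the ordering relation $<$ by letting
$$[r] < [s] \iff \{n \in \mathbb{N} \mid r_n < s_n\} \in \mathcal{F}.$$
At this point, let us introduce some notation to make our arguments easier to read. For two sequences $\langle r_n \rangle$ and $\langle s_n \rangle$, we denote the agreement set $\{n \in \mathbb{N} \mid r_n = s_n\}$ by $\llbracket r = s \rrbracket$. We apply the same notation to other relations, so for example $\llbracket r < s \rrbracket = \{n \in \mathbb{N} \mid r_n < s_n\}$.
Proposition 1.4. The operations $+$ and $\cdot$ are well-defined, and so is the relation $<$.
Proof. We first show that $+$ is well-defined. If $\langle r_n \rangle \equiv \langle r'_n \rangle$ and $\langle s_n \rangle \equiv \langle s'_n \rangle$, then $\llbracket r = r' \rrbracket \in \mathcal{F}$ and $\llbracket s = s' \rrbracket \in \mathcal{F}$, which means that $\llbracket r = r' \rrbracket \cap \llbracket s = s' \rrbracket \in \mathcal{F}$. We need to show that $\llbracket r + s = r' + s' \rrbracket \in \mathcal{F}$. If, for some $k \in \mathbb{N}$, both $r_k = r'_k$ and $s_k = s'_k$, then $r_k + s_k = r'_k + s'_k$. Hence if $k \in \llbracket r = r' \rrbracket \cap \llbracket s = s' \rrbracket$, then $k \in \llbracket r + s = r' + s' \rrbracket$, which shows that
$$\llbracket r = r' \rrbracket \cap \llbracket s = s' \rrbracket \subseteq \llbracket r + s = r' + s' \rrbracket.$$
Since $\llbracket r = r' \rrbracket \cap \llbracket s = s' \rrbracket \in \mathcal{F}$ and $\mathcal{F}$ is closed under supersets, $\llbracket r + s = r' + s' \rrbracket \in \mathcal{F}$ as well. So if $r \equiv r'$ and $s \equiv s'$, then $r + s \equiv r' + s'$, which shows that the operation is well-defined. Showing that $\cdot$ is well-defined is similar.
We now show that $<$ is well-defined, which means showing that if $\langle r_n \rangle \equiv \langle r'_n \rangle$, $\langle s_n \rangle \equiv \langle s'_n \rangle$ and $\llbracket r < s \rrbracket \in \mathcal{F}$, then $\llbracket r' < s' \rrbracket \in \mathcal{F}$. So assume that $\llbracket r = r' \rrbracket \in \mathcal{F}$, $\llbracket s = s' \rrbracket \in \mathcal{F}$ and $\llbracket r < s \rrbracket \in \mathcal{F}$. Since $\mathcal{F}$ is closed under intersections, $\llbracket r = r' \rrbracket \cap \llbracket s = s' \rrbracket \cap \llbracket r < s \rrbracket \in \mathcal{F}$. If $k$ lies in this intersection, then $r_k = r'_k$, $s_k = s'_k$ and $r_k < s_k$, and therefore $r'_k < s'_k$, so $k \in \llbracket r' < s' \rrbracket$. Thus
$$\llbracket r = r' \rrbracket \cap \llbracket s = s' \rrbracket \cap \llbracket r < s \rrbracket \subseteq \llbracket r' < s' \rrbracket,$$
and since $\mathcal{F}$ is closed under supersets, we conclude that $\llbracket r' < s' \rrbracket \in \mathcal{F}$, which shows that $<$ is well-defined.
1.4 Infinitely small and large numbers
One of the main reasons for constructing the hyperreals is that we want access to infinitely large and infinitely small numbers, and now we can prove their existence.
Theorem 1.5. There exists a number $\varepsilon \in {}^*\mathbb{R}$ such that $0 < \varepsilon < r$ for any positive real number $r$, and there exists a number $\omega \in {}^*\mathbb{R}$ such that $\omega > r$ for any real number $r$.
Proof. First, we need to talk about real numbers in ${}^*\mathbb{R}$. Given a real number $r \in \mathbb{R}$, we identify it with the hyperreal number ${}^*r = [\langle r, r, \ldots \rangle] \in {}^*\mathbb{R}$. We will generally omit the $*$-decoration and simply refer to this number as $r$.
Now, let us turn to the actual proof. Let $\varepsilon = [\langle 1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots \rangle] = [\langle \tfrac{1}{n} \rangle]$. For any positive real number $r$, the set $\{n \in \mathbb{N} \mid \tfrac{1}{n} \geq r\}$ is finite, and therefore its complement $\{n \in \mathbb{N} \mid \tfrac{1}{n} < r\}$ is cofinite, and hence belongs to our free ultrafilter $\mathcal{F}$. Therefore $\varepsilon < r$. Also, since $\{n \in \mathbb{N} \mid 0 < \tfrac{1}{n}\} = \mathbb{N} \in \mathcal{F}$, it must be the case that $0 < \varepsilon$. So $\varepsilon$ is a hyperreal number which is greater than 0, but smaller than any positive real number.
Let $\omega = [\langle 1, 2, 3, \ldots \rangle] = [\langle n \rangle]$. For any real number $r$, the set $\{n \in \mathbb{N} \mid r \geq n\}$ is finite, and hence $\{n \in \mathbb{N} \mid r < n\}$ is cofinite and belongs to $\mathcal{F}$, which means that $\omega > r$. This proves that $\omega$ is a hyperreal number greater than any real number.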
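The finiteness claims in the proof of Theorem 1.5 can be made concrete by listing the exceptional indices explicitly. This is a minimal sketch, not part of the original text; the function names `exceptions_for_epsilon` and `exceptions_for_omega` are hypothetical helpers introduced for illustration.

```python
import math
from fractions import Fraction

def exceptions_for_epsilon(r):
    """Indices n >= 1 with 1/n >= r, i.e. the indices where 1/n < r fails."""
    r = Fraction(r)
    if r <= 0:
        raise ValueError("r must be a positive real number")
    # 1/n >= r  <=>  n <= 1/r, so the exceptions are 1, ..., floor(1/r).
    return list(range(1, math.floor(1 / r) + 1))

def exceptions_for_omega(r):
    """Indices n >= 1 with n <= r, i.e. the indices where r < n fails."""
    return list(range(1, math.floor(r) + 1))

print(exceptions_for_epsilon(Fraction(1, 4)))
print(exceptions_for_omega(Fraction(7, 2)))
```

In both cases the list of exceptions is finite for every admissible $r$, so its complement is cofinite and therefore belongs to any free ultrafilter, whichever one was fixed.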
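1.5 Enlarging sets
For a given subset $A$ of $\mathbb{R}$ we can define an "enlarged" subset ${}^*A$ of ${}^*\mathbb{R}$ by saying that a hyperreal number $[r]$ is an element of ${}^*A$ if and only if the set of $n$ such that $r_n$ is an element of $A$ is large. Formally,
$$[r] \in {}^*A \iff \{n \in \mathbb{N} \mid r_n \in A\} \in \mathcal{F}.$$
Again, we need to check that this is well-defined. Using the $\llbracket \ldots \rrbracket$ notation, let $\llbracket r \in A \rrbracket = \{n \in \mathbb{N} \mid r_n \in A\}$. We have that $\llbracket r = r' \rrbracket \cap \llbracket r \in A \rrbracket \subseteq \llbracket r' \in A \rrbracket$, so if $r \equiv r'$ and $\llbracket r \in A \rrbracket \in \mathcal{F}$, then $\llbracket r' \in A \rrbracket \in \mathcal{F}$, which shows that enlargements are well-defined.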
As an example, let $A = \mathbb{N}$ and $\omega = [\langle 1, 2, 3, \ldots \rangle]$. Then $\llbracket \omega \in \mathbb{N} \rrbracket = \mathbb{N} \in \mathcal{F}$, so $\omega \in {}^*\mathbb{N}$. We will refer to the set ${}^*\mathbb{N}$ as the hypernaturals. Similarly, if $A = (0,1)$ and $r = [\langle 0.9, 0.99, 0.999, \ldots \rangle]$, then $\llbracket r \in (0,1) \rrbracket = \mathbb{N} \in \mathcal{F}$, so $r \in {}^*(0,1)$.
1.6 Extending functions
An important tool in non-standard analysis is to take a function $f\colon \mathbb{R} \to \mathbb{R}$ and extend it to a function ${}^*f\colon {}^*\mathbb{R} \to {}^*\mathbb{R}$. This is done by applying the function to each element of the sequence representing the given hyperreal number. We define the extension as follows:
$${}^*f([\langle r_1, r_2, \ldots \rangle]) = [\langle f(r_1), f(r_2), \ldots \rangle].$$
Again, we need to prove that this is well-defined. Let $f \circ r$ denote $\langle f(r_1), f(r_2), \ldots \rangle$. In general, $\llbracket r = r' \rrbracket \subseteq \llbracket f \circ r = f \circ r' \rrbracket$, and so if $r \equiv r'$, then $f \circ r \equiv f \circ r'$, that is, ${}^*f([r]) = {}^*f([r'])$. Hence the function is well-defined.
A function $f\colon A \to \mathbb{R}$ defined on some subset $A$ of $\mathbb{R}$ can also be extended to a function ${}^*f\colon {}^*A \to {}^*\mathbb{R}$, but not in exactly the same way as above. Since $[r]$ can be in ${}^*A$ without all elements of $r$ being in $A$, there can be indices $i$ for which $f(r_i)$ is not defined. To get around this, we let the $i$-th term be 0 whenever $r_i \notin A$. More formally, let
$$s_n = \begin{cases} f(r_n) & \text{if } r_n \in A, \\ 0 & \text{otherwise,} \end{cases}$$
and define ${}^*f([\langle r_n \rangle]) = [\langle s_n \rangle]$.
Since ${}^*f(r) = f(r)$ whenever $r \in A$, ${}^*f$ extends $f$. Therefore we will often drop the $*$-decoration and refer to the extended function as $f$ as well.
An important subject related to this construction is sequences. A sequence $\langle s_1, s_2, \ldots \rangle$ is simply a function $s\colon \mathbb{N} \to \mathbb{R}$, and so by this construction it can be extended to a hypersequence $s\colon {}^*\mathbb{N} \to {}^*\mathbb{R}$, which means that the term $s_n$ is defined even when $n \in {}^*\mathbb{N} \setminus \mathbb{N}$.
2 The transfer principle
2.1 Stating the transfer principle
One of the most important tools of non-standard analysis is the transfer principle, a way to show that a certain type of statement is true of the real numbers if and only if a certain related statement is true of the hyperreal numbers.¹
First, we introduce the set of sentences to which the transfer principle applies. This is essentially the set of all sentences (formulas with no free variables) in a language of first-order logic which consists of a constant for each real number, a function symbol for each real function, and a relation symbol for each relation on the reals. However, instead of using the quantifiers $(\forall x)$ and $(\exists y)$, our sentences use quantifiers of the form $(\forall x \in A)$ and $(\exists y \in B)$, where $A$ and $B$ are subsets of $\mathbb{R}$. Some examples of such sentences are
$$(\forall n \in \mathbb{N})(\exists m \in \mathbb{N})(m > n), \qquad (\exists x \in \mathbb{R})(\forall y \in \mathbb{R})(x + y = y),$$
which state respectively that there is no biggest natural number and that there is an additive identity for the reals. Let us call such a sentence an $\mathcal{L}$-sentence.
¹ This is a rather cursory introduction to the transfer principle. For a more in-depth explanation, see [Gol98, pp. 35-47].
Now we define the $*$-transform of an $\mathcal{L}$-sentence. We take a sentence $\varphi$ and create a related sentence ${}^*\varphi$. An $\mathcal{L}$-sentence $\varphi$ contains symbols $P$, $f$ and $r$ for relations, functions and constants on $\mathbb{R}$. To create ${}^*\varphi$, we replace $P$ by ${}^*P$ for all relations $P$, replace $f$ by ${}^*f$ for all functions $f$, and replace $r$ by ${}^*r$ for all constants $r$. Some examples of this are:
• The $*$-transform of the sentence $(\forall n \in \mathbb{N})(\exists m \in \mathbb{N})(m > n)$ is $(\forall n \in {}^*\mathbb{N})(\exists m \in {}^*\mathbb{N})(m \mathrel{{}^*\!>} n)$.
• The $*$-transform of $(\forall x \in \mathbb{R})(\sin(x) < 2)$ is $(\forall x \in {}^*\mathbb{R})({}^*\!\sin(x) \mathrel{{}^*\!<} {}^*2)$.
We will generally follow the convention of omitting the $*$ for constants, most functions, and simple equalities and inequalities. With these conventions, the above sentences become $(\forall n \in {}^*\mathbb{N})(\exists m \in {}^*\mathbb{N})(m > n)$ and $(\forall x \in {}^*\mathbb{R})(\sin(x) < 2)$.
Now we state the transfer principle, which we will take as true without proof.
Theorem 2.1 (Transfer principle). An $\mathcal{L}$-sentence $\varphi$ is true if and only if its $*$-transform ${}^*\varphi$ is true.
Some remarks are in order. It is worth pointing out that one can go in both directions, that is, from $\mathbb{R}$ to ${}^*\mathbb{R}$ and from ${}^*\mathbb{R}$ to $\mathbb{R}$. When going in the latter direction, it is important that the statement is the $*$-transform of an $\mathcal{L}$-sentence, so for example it can contain no hyperreal constants. A way to get around this is to replace the constant with a variable $x$ and add the quantifier $(\exists x \in {}^*A)$ for some $A \subseteq \mathbb{R}$ in front, which is a technique we will use. In many cases, we will not explicitly write down the full sentence, but rather state things like "since $s < n$ for all natural $n$, by transfer it is also true for any hypernatural $n$".
2.2 Using the transfer principle
Theorem 2.2. The structure $\langle {}^*\mathbb{R}, +, \cdot, < \rangle$ is an ordered field with zero and unity.
Proof. We prove this using the transfer principle. We take the fact that $\mathbb{R}$ is an ordered field as true. This can be stated by a number of logical sentences. The fact that addition is commutative in $\mathbb{R}$ can be expressed as the sentence $(\forall x, y \in \mathbb{R})(x + y = y + x)$, and so by the transfer principle we can conclude that $(\forall x, y \in {}^*\mathbb{R})(x + y = y + x)$, so addition is commutative in ${}^*\mathbb{R}$. We leave out the full details, but this procedure can be carried out for all the axioms for ordered fields (since they are all first-order axioms), and so we conclude that $\langle {}^*\mathbb{R}, +, \cdot, < \rangle$ is also an ordered field.
Remark. One important property of the standard real numbers is that they are complete, that is, any subset of $\mathbb{R}$ which is non-empty and bounded above has a least upper bound. The reason this cannot be proven to hold for ${}^*\mathbb{R}$ by transfer is that it can only be expressed using second-order logic, since one needs to talk about subsets of $\mathbb{R}$, not just elements of $\mathbb{R}$. In fact, ${}^*\mathbb{R}$ is not complete. An example of this is that the open interval of real numbers $(0,1)$ does not have a least upper bound in ${}^*\mathbb{R}$.
Proposition 2.3. For any two subsets $A$ and $B$ of $\mathbb{R}$, we have that
• ${}^*(A \cup B) = {}^*A \cup {}^*B$
• ${}^*(A \cap B) = {}^*A \cap {}^*B$
• ${}^*(A \setminus B) = {}^*A \setminus {}^*B$.
Proof. We prove the statement about unions; the other two statements can be proven similarly. The statement $(\forall x \in \mathbb{R})(x \in (A \cup B) \leftrightarrow x \in A \lor x \in B)$ is true for any two subsets $A$ and $B$ of $\mathbb{R}$, essentially by the definition of unions. By the transfer principle, the statement $(\forall x \in {}^*\mathbb{R})(x \in {}^*(A \cup B) \leftrightarrow x \in {}^*A \lor x \in {}^*B)$ is also true. We also have that for any two subsets $X$ and $Y$ of ${}^*\mathbb{R}$, $(\forall x \in {}^*\mathbb{R})(x \in (X \cup Y) \leftrightarrow x \in X \lor x \in Y)$. Combining these last two statements, letting $X = {}^*A$ and $Y = {}^*B$, we get that $(\forall x \in {}^*\mathbb{R})(x \in {}^*(A \cup B) \leftrightarrow x \in ({}^*A \cup {}^*B))$, which shows that ${}^*(A \cup B) = {}^*A \cup {}^*B$.
Remark. It is worth noting that ${}^*\!\left(\bigcup_{n \in \mathbb{N}} A_n\right)$ does not need to be equal to $\bigcup_{n \in \mathbb{N}} {}^*A_n$. If $A_n = \{n\}$, then ${}^*\!\left(\bigcup_{n \in \mathbb{N}} A_n\right) = {}^*\mathbb{N}$, but $\bigcup_{n \in \mathbb{N}} {}^*A_n = \mathbb{N}$.
3 Properties of the hyperreals
3.1 Terminology and notation
At this point we introduce some terminology and notation for talking about hyperreal numbers. We say that a hyperreal number $b$ is:
• limited if $r < b < s$ for some $r, s \in \mathbb{R}$,
• positive unlimited if $r < b$ for all $r \in \mathbb{R}$,
• negative unlimited if $b < r$ for all $r \in \mathbb{R}$,
• unlimited if it is positive or negative unlimited,
• positive infinitesimal if $0 < b < r$ for all positive $r \in \mathbb{R}$,
• negative infinitesimal if $r < b < 0$ for all negative $r \in \mathbb{R}$,
• infinitesimal if it is positive infinitesimal, negative infinitesimal or 0,
• appreciable if it is limited but not infinitesimal.
We will use the terms limited and unlimited, rather than finite and infinite, when referring to individual numbers. Finite and infinite are terms we use for sets only.
For any subset $X$ of ${}^*\mathbb{R}$, we define $X_\infty = \{x \in X \mid x \text{ is unlimited}\}$, $X^+ = \{x \in X \mid x > 0\}$, and $X^- = \{x \in X \mid x < 0\}$. These notations can also be combined, so $X^+_\infty$ denotes all positive unlimited members of $X$.
3.2 Arithmetic of hyperreals
When reasoning about hyperreals, it is useful to have certain rules for computing with them, for example that the sum of two infinitesimals is itself infinitesimal. Here are some such rules. If $\varepsilon$ and $\delta$ are infinitesimals, $b$ and $c$ are appreciable, and $H$ and $K$ are unlimited, then:
• $\varepsilon + \delta$ is infinitesimal,
• $b + \varepsilon$ is appreciable,
• $H + \varepsilon$ and $H + b$ are unlimited,
• $b + c$ is limited,
• $-\varepsilon$ is infinitesimal,
• $-b$ is appreciable,
• $-H$ is unlimited,
• $\varepsilon \cdot \delta$ and $\varepsilon \cdot b$ are infinitesimal,
• $b \cdot c$ is appreciable,
• $b \cdot H$ and $H \cdot K$ are unlimited,
• $\frac{1}{\varepsilon}$ is unlimited if $\varepsilon \neq 0$,
• $\frac{1}{b}$ is appreciable,
• $\frac{1}{H}$ is infinitesimal,
• $\frac{\varepsilon}{b}$, $\frac{\varepsilon}{H}$ and $\frac{b}{H}$ are infinitesimal,
• $\frac{b}{c}$ is appreciable,
• $\frac{b}{\varepsilon}$, $\frac{H}{\varepsilon}$ and $\frac{H}{b}$ are unlimited if $\varepsilon \neq 0$.
We do not give a proof of any of these rules, but they can be proven by using the transfer principle, or by reasoning about sequences of reals. The following expressions do not have such a rule, and can all take on infinitesimal, appreciable, and unlimited values:
$$\frac{\varepsilon}{\delta}, \quad \frac{H}{K}, \quad \varepsilon \cdot H, \quad H + K.$$
3.3 Halos
A hyperreal $b$ is said to be infinitely close to a hyperreal $c$ if $b - c$ is infinitesimal, and this is denoted by $b \simeq c$. This defines an equivalence relation on ${}^*\mathbb{R}$, and we define the halo of $b$ to be the $\simeq$-equivalence class
$$\operatorname{hal}(b) = \{c \in {}^*\mathbb{R} \mid b \simeq c\}.$$
Said differently, the halo of $b$ is the set of all hyperreals which are infinitely close to $b$.
Proposition 3.1. If two real numbers $b$ and $c$ are infinitely close, that is if $b \simeq c$, then $b = c$.
Proof. Suppose that $b \simeq c$ with $b$ and $c$ real, but that $b \neq c$. Then there is a non-zero real number $r$ such that $b - c = r$. But this contradicts the assumption that $b \simeq c$, since a non-zero real number is not infinitesimal.
Proposition 3.2. Suppose that $b$ and $c$ are limited, and that $b \simeq b'$ and $c \simeq c'$. Then $b \pm c \simeq b' \pm c'$ and $b \cdot c \simeq b' \cdot c'$. Furthermore, if $c \not\simeq 0$, then $b/c \simeq b'/c'$.
Proof. From our assumptions, we have that $b - b' = \varepsilon_b$ and $c - c' = \varepsilon_c$, with $\varepsilon_b$ and $\varepsilon_c$ infinitesimal. It is also the case that both $b'$ and $c'$ are limited. We want to show that $b \pm c \simeq b' \pm c'$, and this is done by showing that $(b \pm c) - (b' \pm c')$ is infinitesimal. We have that
$$(b \pm c) - (b' \pm c') = (b - b') \pm (c - c') = \varepsilon_b \pm \varepsilon_c.$$
Since both the sum of and the difference between two infinitesimals are infinitesimal by Section 3.2, $(b \pm c) - (b' \pm c')$ is infinitesimal, and hence $b \pm c \simeq b' \pm c'$.
The case $b \cdot c \simeq b' \cdot c'$ is proven similarly. We have that
$$b \cdot c - b' \cdot c' = b \cdot c - b \cdot c' + b \cdot c' - b' \cdot c' = b \cdot (c - c') + (b - b') \cdot c' = b \cdot \varepsilon_c + \varepsilon_b \cdot c',$$
which is infinitesimal since the product of a limited number with an infinitesimal is infinitesimal and the sum of two infinitesimals is infinitesimal. Hence $b \cdot c \simeq b' \cdot c'$.
For the last case we have that
$$\frac{b}{c} - \frac{b'}{c'} = \frac{b \cdot c' - b' \cdot c}{c \cdot c'} = \frac{b \cdot c' - b \cdot c + b \cdot c - b' \cdot c}{c \cdot c'} = \frac{b \cdot (c' - c) + c \cdot (b - b')}{c \cdot c'} = \frac{c \cdot \varepsilon_b - b \cdot \varepsilon_c}{c \cdot c'}.$$
Now, if $c \not\simeq 0$, then $c' \not\simeq 0$ as well, so the denominator is the product of two appreciable numbers, which is also appreciable. Since the numerator is infinitesimal by an argument similar to the case of products, the quotient is itself infinitesimal, and hence $b/c \simeq b'/c'$.
Remark. The first part of the proposition, namely that $b \pm c \simeq b' \pm c'$, holds also for unlimited $b$ and $c$, but the other parts do not. To show this, let $H$ be some positive unlimited number, let $b'$, $c$ and $c'$ equal $H$, and let $b$ equal $H + \frac{1}{H}$. Then $b \simeq b'$ and $c \simeq c'$, but
$$b \cdot c - b' \cdot c' = \left(H + \frac{1}{H}\right) \cdot H - H \cdot H = H^2 + 1 - H^2 = 1,$$
which is not infinitesimal, and so $b \cdot c \not\simeq b' \cdot c'$. A similar counterexample can also be produced for $b/c$.
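One possible counterexample for the quotient, not taken from the original text, is sketched below. The particular choice of $b$, $b'$, $c$ and $c'$ is an assumption made for illustration; $H$ is again a positive unlimited number, so that $b$ is unlimited while $c$ is appreciable.

```latex
% Assumed choice (not from the source): H positive unlimited,
b = b' = H, \qquad c = 1, \qquad c' = 1 + \frac{1}{H}.
% Then b \simeq b', c \simeq c' and c \not\simeq 0, but b is unlimited.
\frac{b}{c} - \frac{b'}{c'}
  = H - \frac{H}{1 + \frac{1}{H}}
  = H - \frac{H^2}{H + 1}
  = \frac{H}{H + 1}
  = 1 - \frac{1}{H + 1} \simeq 1,
% which is not infinitesimal, so b/c \not\simeq b'/c'.
```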
3.4 Shadows
Theorem 3.3 (Existence of shadows). Every limited hyperreal $b$ is infinitely close to one and only one real number $s$. This real number is called the shadow of $b$, and is denoted by $\operatorname{sh}(b)$.
Proof. Let $A = \{r \in \mathbb{R} \mid r < b\}$. Since $b$ is limited, $A$ is non-empty and bounded above, so by the (Dedekind) completeness of $\mathbb{R}$ it has a least upper bound in $\mathbb{R}$. Call this real number $s$. We want to show that $b \simeq s$, and we do this by showing that $|b - s| < \varepsilon$ for all $\varepsilon \in \mathbb{R}^+$. Take any such $\varepsilon$. We show that $|b - s| < \varepsilon$ by showing that $s - \varepsilon < b < s + \varepsilon$.
First we show that $b < s + \varepsilon$. Assume for contradiction that $s + \varepsilon \leq b$. Then $s < s + \frac{\varepsilon}{2} < s + \varepsilon \leq b$. Since both $s$ and $\varepsilon$ are real, so is $s + \frac{\varepsilon}{2}$, and since $s + \frac{\varepsilon}{2} < b$, we have $s + \frac{\varepsilon}{2} \in A$. But since $s + \frac{\varepsilon}{2} > s$, $s$ is not an upper bound of A. This is a contradiction, so it must be the case that $b < s + \varepsilon$.
Next we show that $s - \varepsilon < b$. Assume for contradiction that $b \leq s - \varepsilon$. Then $b \leq s - \varepsilon < s - \frac{\varepsilon}{2} < s$. Since $s - \frac{\varepsilon}{2} \geq b$, $s - \frac{\varepsilon}{2}$ is an upper bound of $A$, but $s - \frac{\varepsilon}{2} < s$, so $s$ is not the least upper bound of $A$, which is a contradiction.
We also need to check that there cannot be more than one shadow of $b$. Assume that there are two reals $s$ and $s'$ which are both infinitely close to $b$. Thus, by definition, $b \simeq s$ and $b \simeq s'$, and so by transitivity of $\simeq$, $s \simeq s'$. But since both $s$ and $s'$ are real, by Proposition 3.1 we conclude that $s = s'$.²
Alternative proof. Watch Babylon 5.
4 Convergence
4.1 Convergence in hyperreal calculus
The standard way to define convergence in real analysis is that a sequence $\langle s_n \rangle$ converges to the limit $L \in \mathbb{R}$ if for any $\varepsilon \in \mathbb{R}^+$ there exists an $m_\varepsilon \in \mathbb{N}$ such that $|s_n - L| < \varepsilon$ for any $n > m_\varepsilon$. This can be expressed in formal logic by the sentence
$$(\forall \varepsilon \in \mathbb{R}^+)(\exists m_\varepsilon \in \mathbb{N})(\forall n \in \mathbb{N})(n > m_\varepsilon \to |s_n - L| < \varepsilon).$$
The idea that this definition formalizes is that a sequence converges to a real value $L$ if its terms get very close to $L$ when one goes very far out in the sequence. In non-standard analysis we instead say that a sequence converges to $L$ if it gets infinitely close to $L$ as one gets infinitely far out in the sequence. The original sequence $\langle s_n \rangle$ is only defined on the naturals, so one cannot go infinitely far out, but using the hypersequences defined in Section 1.6 we get a new sequence which is defined for all $n \in {}^*\mathbb{N}$, where we can go infinitely far out. We denote this sequence by $\langle s_n \rangle$ as well.
Theorem 4.1. A sequence of real numbers $\langle s_n \rangle$ converges to $L$ if and only if $s_n \simeq L$ for all unlimited $n$.
Proof. Assume that the sequence $\langle s_n \rangle$ converges to $L$. We need to show that $s_n \simeq L$ for any unlimited $n$, and we do this by proving that $|s_n - L| < \varepsilon$ for any positive real $\varepsilon$. So take an $\varepsilon \in \mathbb{R}^+$. By the definition of convergence, there exists a natural number $m_\varepsilon$ such that $|s_n - L| < \varepsilon$ whenever $n > m_\varepsilon$. Let $k$ be such a natural number. Then the formal statement $(\forall n \in \mathbb{N})(n > k \to |s_n - L| < \varepsilon)$ must hold. By the transfer principle, it must also be the case that
$$(\forall n \in {}^*\mathbb{N})(n > k \to |s_n - L| < \varepsilon) \qquad (1)$$
is true. Now, let $N$ be any unlimited hypernatural. Since $k$ is limited, we have that $N > k$, and so by (1) we can conclude that $|s_N - L| < \varepsilon$. Since this holds for any positive real $\varepsilon$, it must be the case that $s_N \simeq L$, which completes the forward direction of the proof.
² This proof, along with several other proofs we give in this article, is a modified version of a proof given in [Gol98].
For the converse, assume that $s_n \simeq L$ for all unlimited $n$. We want to show that the sequence converges. Take any $\varepsilon \in \mathbb{R}^+$, and fix an unlimited $N \in {}^*\mathbb{N}$. Now, if $n > N$, then $n$ must be unlimited, and so $s_n \simeq L$ by our assumption, from which we conclude that $|s_n - L| < \varepsilon$. Formally, this is expressible as $(\forall n \in {}^*\mathbb{N})(n > N \to |s_n - L| < \varepsilon)$. Thus, the sentence
$$(\exists m_\varepsilon \in {}^*\mathbb{N})(\forall n \in {}^*\mathbb{N})(n > m_\varepsilon \to |s_n - L| < \varepsilon)$$
must also be true. By transfer, we can conclude that
$$(\exists m_\varepsilon \in \mathbb{N})(\forall n \in \mathbb{N})(n > m_\varepsilon \to |s_n - L| < \varepsilon)$$
must hold. Since $\varepsilon$ was taken to be any positive real, the sentence
$$(\forall \varepsilon \in \mathbb{R}^+)(\exists m_\varepsilon \in \mathbb{N})(\forall n \in \mathbb{N})(n > m_\varepsilon \to |s_n - L| < \varepsilon)$$
must hold. This is exactly the formal statement that the sequence $\langle s_n \rangle$ converges to $L$, which finishes our proof.
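As a brief illustration of how Theorem 4.1 is used in practice, here are two small worked examples. They are not part of the original text; the sequences $s_n = 1/n$ and $t_n = (-1)^n$ are chosen only for illustration, and the arithmetic rules of Section 3.2 are assumed.

```latex
% Example 1: s_n = 1/n converges to 0.
% For unlimited N, 1/N is infinitesimal (Section 3.2), so
s_N = \frac{1}{N} \simeq 0 \quad \text{for all unlimited } N \in {}^*\mathbb{N}
\;\Longrightarrow\; \lim_{n \to \infty} s_n = 0.

% Example 2: t_n = (-1)^n does not converge.
% By transfer, every hypernatural is even or odd, with t_N = 1 for even N
% and t_N = -1 for odd N. If N is unlimited and even, then N + 1 is
% unlimited and odd:
t_N = 1, \qquad t_{N+1} = -1, \qquad 1 \not\simeq -1,
% so no single real L satisfies t_N \simeq L for all unlimited N.
```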
4.2 Monotone convergence
A standard theorem about convergence from calculus is the monotone convergence theorem, which can be stated as follows.
Theorem 4.2. Let $\langle s_1, s_2, \ldots \rangle$ be a sequence of real numbers which is bounded above and non-decreasing. Then $\langle s_n \rangle$ is convergent.
The standard proof works by taking the supremum of the set $\{s_n \mid n \in \mathbb{N}\}$ and showing that the sequence converges to this number. The non-standard proof also uses the supremum of that set, but in a very different way.
Proof. Let $N$ be an unlimited hypernatural, so that $s_N$ is an extended term of the sequence, and let $b$ be an upper bound of the sequence. Since the sequence is non-decreasing, $s_1 \leq s_n$ for any $n$, and $s_n \leq b$ must also hold for any $n$ since the sequence is bounded above by $b$. Thus the statement $(\forall n \in \mathbb{N})(s_1 \leq s_n \land s_n \leq b)$ must be true, and so must its $*$-transform $(\forall n \in {}^*\mathbb{N})(s_1 \leq s_n \land s_n \leq b)$. Applying this to our extended term $s_N$, it is clear that $s_N$ is limited and so has a shadow $L = \operatorname{sh}(s_N)$. What we now want to prove is that $L$ is the least upper bound of the set $\{s_n \mid n \in \mathbb{N}\}$. Since a set can only have one least upper bound, this $L$ must then be the same for all extended terms, and so all extended terms have the same shadow. Then $s_N \simeq L$ for any extended term $s_N$, and by Theorem 4.1 the sequence must be convergent.
If $m \leq n$, then $s_m \leq s_n$ since the sequence is non-decreasing. By transfer, this holds for any $m, n \in {}^*\mathbb{N}$ as well. In particular, if $m \in \mathbb{N}$ and $N$ is the index of our chosen extended term $s_N$, then $s_m \leq s_N \simeq L$, and hence $s_m \leq L$ since both $s_m$ and $L$ are real. Hence $L \geq s_i$ for any $i \in \mathbb{N}$, and so $L$ is an upper bound of our set.
Now we show that $L$ is the least upper bound. Let $r$ be any upper bound of our set. Then $(\forall n \in \mathbb{N})(s_n \leq r)$, and so by transfer we must have that $s_N \leq r$. Then $L \simeq s_N \leq r$, and so $L \leq r$, since both $L$ and $r$ are real. So $L$ is no larger than any upper bound of our set, and hence $L$ is the least upper bound, completing our proof.
5 Continuity
5.1 Continuity in hyperreal calculus
The standard definition of continuity states that a function $f$ is continuous at $c$ if for any positive real $\varepsilon$ there exists a positive real $\delta$ such that $|f(x) - f(c)| < \varepsilon$ whenever $|x - c| < \delta$, which can be expressed by the formal statement
$$(\forall \varepsilon \in \mathbb{R}^+)(\exists \delta \in \mathbb{R}^+)(\forall x \in \mathbb{R})(|x - c| < \delta \to |f(x) - f(c)| < \varepsilon).$$
The intuitive notion in this definition is that $f(x)$ gets arbitrarily close to $f(c)$ when $x$ gets arbitrarily close to $c$. What our non-standard definition formalizes is that $f(x)$ is infinitely close to $f(c)$ when $x$ is infinitely close to $c$.
Theorem 5.1. A function $f\colon \mathbb{R} \to \mathbb{R}$ is continuous at $c \in \mathbb{R}$ if and only if $f(x) \simeq f(c)$ whenever $x \simeq c$.
Proof. We start by assuming that $f$ is continuous at $c$, and that we have a hyperreal $x$ such that $x \simeq c$. From this, we want to show that $f(x) \simeq f(c)$, and we do this by showing that $|f(x) - f(c)| < \varepsilon$ for all $\varepsilon \in \mathbb{R}^+$. Take any positive real $\varepsilon$. By the definition of continuity, there exists a positive real $\delta$ such that for all real $x$, $|f(x) - f(c)| < \varepsilon$ whenever $|x - c| < \delta$. Fix such a $\delta$. Then the statement $(\forall x \in \mathbb{R})(|x - c| < \delta \to |f(x) - f(c)| < \varepsilon)$ must hold, and so by transfer its $*$-transform $(\forall x \in {}^*\mathbb{R})(|x - c| < \delta \to |f(x) - f(c)| < \varepsilon)$ must also hold. In particular, for the $x$ we assumed to be infinitely close to $c$, the statement $|x - c| < \delta \to |f(x) - f(c)| < \varepsilon$ is true. But since $\delta$ is a positive real and $x \simeq c$, it must be true that $|x - c| < \delta$, and so we can conclude that $|f(x) - f(c)| < \varepsilon$. Since this holds for any $\varepsilon \in \mathbb{R}^+$, it must be true that $f(x) \simeq f(c)$, which is what we needed to show.
For the converse, assume that $f(x) \simeq f(c)$ whenever $x \simeq c$. We want to prove that the formal statement of continuity must be true. First, let $\varepsilon$ be any positive real, and let $d$ be any positive infinitesimal. Then $x \simeq c$ whenever $|x - c| < d$. By assumption we then have that $f(x) \simeq f(c)$, and thus in particular that $|f(x) - f(c)| < \varepsilon$. From this we can conclude that if $|x - c| < d$, then $|f(x) - f(c)| < \varepsilon$. This can be expressed formally as
$$(\forall x \in {}^*\mathbb{R})(|x - c| < d \to |f(x) - f(c)| < \varepsilon).$$
Since this is true, the statement
$$(\exists \delta \in {}^*\mathbb{R}^+)(\forall x \in {}^*\mathbb{R})(|x - c| < \delta \to |f(x) - f(c)| < \varepsilon)$$
must also be true. But this is the $*$-transform of the sentence
$$(\exists \delta \in \mathbb{R}^+)(\forall x \in \mathbb{R})(|x - c| < \delta \to |f(x) - f(c)| < \varepsilon),$$
and so by transfer we can conclude that this last sentence is also true. Since $\varepsilon$ was chosen arbitrarily, with no conditions other than it being positive and real, we can conclude that the formal statement of continuity,
$$(\forall \varepsilon \in \mathbb{R}^+)(\exists \delta \in \mathbb{R}^+)(\forall x \in \mathbb{R})(|x - c| < \delta \to |f(x) - f(c)| < \varepsilon),$$
must be true, which concludes our proof.
This theorem only deals with functions which are defined on all of $\mathbb{R}$. In many circumstances it is useful to study functions which are defined only on some subset $A$ of $\mathbb{R}$. The proof of Theorem 5.1 can easily be extended to show the following theorem.
Theorem 5.2. The function $f\colon A \to \mathbb{R}$ is continuous at $c \in A$ if and only if $f(x) \simeq f(c)$ for all $x \in {}^*A$ with $x \simeq c$.
Note that here $c$ is a real point of $A$: we do not require that $f(x) \simeq f(y)$ for all $x, y \in {}^*A$ with $x \simeq y$. That turns out to be a stronger condition, and is in fact equivalent to the notion of uniform continuity, which we will discuss later in this section.
5.2 Examples
Here we give some examples of using hyperreal calculus to show that some functions are continuous or discontinuous.
Proposition 5.3. The function $f(x) = x^2$ is continuous at any $a \in \mathbb{R}$.
Proof. By Theorem 5.1, it suffices to show that $f(x) \simeq f(a)$ whenever $x \simeq a$. If $x \simeq a$, then $x = a + \varepsilon$ for some infinitesimal $\varepsilon$. Now $f(x) = f(a + \varepsilon) = a^2 + 2a\varepsilon + \varepsilon^2$. Then
$$f(x) - f(a) = a^2 + 2a\varepsilon + \varepsilon^2 - a^2 = \varepsilon(2a + \varepsilon),$$
which is infinitesimal since the product of a limited number with an infinitesimal is infinitesimal. Hence, whenever $x \simeq a$, $f(x) \simeq f(a)$, so $f$ is a continuous function.
Proposition 5.4. The function $f$ defined by
$$f(x) = \begin{cases} 1 & \text{if } x \text{ is rational,} \\ 0 & \text{if } x \text{ is irrational} \end{cases}$$
is discontinuous at all $a \in \mathbb{R}$.
Proving this with hyperreal calculus is rather straightforward, but requires establishing some propositions first.
Proposition 5.5. The extended function ${}^*f$ can be defined as
$${}^*f(x) = \begin{cases} 1 & \text{if } x \in {}^*\mathbb{Q}, \\ 0 & \text{if } x \notin {}^*\mathbb{Q}. \end{cases} \qquad (2)$$
Proof. By transfer of the true sentences
$$(\forall x \in \mathbb{R})(x \in \mathbb{Q} \to f(x) = 1), \qquad (\forall x \in \mathbb{R})(x \notin \mathbb{Q} \to f(x) = 0),$$
we can conclude that ${}^*f(x) = 1$ if $x \in {}^*\mathbb{Q}$, and that ${}^*f(x) = 0$ if $x \notin {}^*\mathbb{Q}$, which shows that (2) is a correct definition of ${}^*f$.
Proposition 5.6. Any halo contains both hyperrationals (members of ${}^*\mathbb{Q}$) and hyperirrationals (members of ${}^*\mathbb{R} \setminus {}^*\mathbb{Q}$).
Proof. Since any halo contains some hyperreal number $r$ and the hyperreal number $r + \varepsilon$, where $\varepsilon$ is some positive infinitesimal, it also contains all hyperreals between these, that is, the set $X = \{x \in {}^*\mathbb{R} \mid r < x < r + \varepsilon\}$. Now, since the sentence $(\forall x, y \in \mathbb{R})(\exists z \in \mathbb{Q})(x < y \to x < z \land z < y)$ is true, using transfer and applying the resulting statement to $r$ and $r + \varepsilon$, the statement $(\exists z \in {}^*\mathbb{Q})(r < z \land z < r + \varepsilon)$ is true, and so $X \cap {}^*\mathbb{Q} \neq \emptyset$, which means that our given halo contains at least one hyperrational number.
For the other case, since the sentence $(\forall x, y \in \mathbb{R})(\exists z \in (\mathbb{R} \setminus \mathbb{Q}))(x < y \to x < z \land z < y)$ is true, using transfer and applying the resulting statement to $r$ and $r + \varepsilon$, the statement $(\exists z \in {}^*(\mathbb{R} \setminus \mathbb{Q}))(r < z \land z < r + \varepsilon)$ is true, which means that $X \cap {}^*(\mathbb{R} \setminus \mathbb{Q}) \neq \emptyset$, so our halo contains at least one hyperreal which is a member of ${}^*(\mathbb{R} \setminus \mathbb{Q})$. But by Proposition 2.3, ${}^*(\mathbb{R} \setminus \mathbb{Q}) = {}^*\mathbb{R} \setminus {}^*\mathbb{Q}$, so our halo contains at least one member of ${}^*\mathbb{R} \setminus {}^*\mathbb{Q}$, that is, a hyperirrational.
Proof of Proposition 5.4. From these two propositions, we can show that $f$ is not continuous at any point. Let $c$ be a rational number. Then $f(c) = 1$. By Proposition 5.6, there is a hyperirrational $d \in \operatorname{hal}(c) \setminus {}^*\mathbb{Q}$, with $f(d) = 0$. Since $0 \not\simeq 1$, we have that $c \simeq d$ but $f(c) \not\simeq f(d)$, so $f$ is not continuous at $c$. Now, let $c$ be an irrational number. Then $f(c) = 0$. By Proposition 5.6, there is a hyperrational $d \in \operatorname{hal}(c) \cap {}^*\mathbb{Q}$, and so $f(d) = 1$. Again we have that $c \simeq d$ but $f(c) \not\simeq f(d)$, so $f$ is not continuous at $c$. So regardless of whether $c$ is rational or irrational, $f$ is not continuous at $c$, and therefore $f$ is discontinuous at all points of $\mathbb{R}$.
5.3 Theorems about continuity

Theorem 5.7. If $f$ and $g$ are continuous at $c$, then $f+g$, $f-g$ and $fg$ are continuous at $c$. Furthermore, if $g(c) \neq 0$, then $f/g$ is also continuous at $c$.

Proof. Assume that $f$ and $g$ are continuous at $c$. Hence when $x \simeq c$, we have that $f(x) \simeq f(c)$ and $g(x) \simeq g(c)$, and these values are all limited. It then follows from Proposition 3.2 that

• If $x \simeq c$, then $(f+g)(x) = f(x)+g(x) \simeq f(c)+g(c) = (f+g)(c)$, and so $f+g$ is continuous at $c$.
• If $x \simeq c$, then $(f-g)(x) = f(x)-g(x) \simeq f(c)-g(c) = (f-g)(c)$, and so $f-g$ is continuous at $c$.
• If $x \simeq c$, then $(fg)(x) = f(x) \cdot g(x) \simeq f(c) \cdot g(c) = (fg)(c)$, and so $fg$ is continuous at $c$.
• If $x \simeq c$, then $(f/g)(x) = f(x)/g(x) \simeq f(c)/g(c) = (f/g)(c)$. Note that we require that $g(c) \neq 0$, and so $g(x) \not\simeq 0$, and we can apply Proposition 3.2. Hence $f/g$ is continuous at $c$. □

Theorem 5.8. If $f$ is continuous at $c$, and $g$ is continuous at $f(c)$, then $g \circ f$ is continuous at $c$.

Proof. Let $x \simeq c$. Since $f$ is continuous at $c$, we have that $f(x) \simeq f(c)$. Since $g$ is continuous at $f(c)$, for any number $v$ which is infinitely close to $f(c)$, we have that $g(v) \simeq g(f(c))$. Since $f(x)$ is infinitely close to $f(c)$, we have that $(g \circ f)(x) = g(f(x)) \simeq g(f(c)) = (g \circ f)(c)$, which proves that $g \circ f$ is continuous at $c$. □

Theorem 5.9 (The Intermediate Value Theorem). Let $f \colon [a,b] \to \mathbb{R}$ be a continuous function. Then for every real number $d$ strictly between $f(a)$ and $f(b)$ there exists a real number $c \in (a,b)$ such that $f(c) = d$.

Proof. Assume that $f(a) < d < f(b)$. The case where $f(a) > d > f(b)$ is similar. For each $n \in \mathbb{N}$, we partition $[a,b]$ into $n$ subintervals of equal length $\frac{b-a}{n}$. These intervals then have the endpoints $p_k = a + k\frac{b-a}{n}$ for $0 \le k \le n$. Now, we let $s_n$ be the greatest endpoint for which $f(p_k) < d$. Then $s_n$ is the maximum of the set $\{p_k \mid f(p_k) < d\}$, which exists since the set is finite and non-empty (it contains $p_0 = a$ since $f(a) < d$ by assumption). Since $f(b) > d$, $p_n = b \notin \{p_k \mid f(p_k) < d\}$. Therefore we have that $a \le s_n < b$ for all $n \in \mathbb{N}$. By construction of $s_n$ it must be true that
\[
f(s_n) < d \le f\!\left(s_n + \frac{b-a}{n}\right)
\]
for any $n \in \mathbb{N}$. By transfer, we conclude that both of these statements also hold for any $n \in {}^*\mathbb{N}$.

Now, let $N$ be an unlimited hypernatural. We have that $a \le s_N < b$, hence $s_N$ is limited and has a shadow $c = \operatorname{sh}(s_N) \in \mathbb{R}$. Now, since $N$ is unlimited, $\frac{b-a}{N}$ is infinitesimal, and so we have that $s_N \simeq c$ and $s_N + \frac{b-a}{N} \simeq c$. Now, by the assumption that $f$ is continuous, and our equivalent formulation of continuity, we have that $f(s_N) \simeq f(c)$ and $f\!\left(s_N + \frac{b-a}{N}\right) \simeq f(c)$. Therefore, it is the case that
\[
f(c) \simeq f(s_N) < d \le f\!\left(s_N + \frac{b-a}{N}\right) \simeq f(c).
\]
Therefore $f(c) \simeq d$, but since both $f(c)$ and $d$ are real, we can conclude that $f(c) = d$, which completes the proof. □
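To see the construction in this proof at work, here is a sketch for one concrete choice: $f(x) = x^2$ on $[0,2]$ with $d = 2$. The function, the interval and the value $d$ are assumptions picked purely for illustration.

```latex
% Worked instance of the proof of Theorem 5.9 (requires amsmath).
% Assumed for illustration: f(x) = x^2, [a,b] = [0,2], d = 2.
\begin{align*}
p_k &= \frac{2k}{n} \quad (0 \le k \le n), \qquad
s_n = \max\{\, p_k \mid p_k^2 < 2 \,\}, \\
s_n^2 &< 2 \le \Bigl(s_n + \tfrac{2}{n}\Bigr)^2
\quad\Longrightarrow\quad
s_n < \sqrt{2} \le s_n + \tfrac{2}{n}
\quad\text{for all } n \in \mathbb{N}.
\end{align*}
% By transfer the same inequalities hold for an unlimited hypernatural N:
\[
0 < \sqrt{2} - s_N \le \tfrac{2}{N} \simeq 0,
\qquad\text{so}\qquad
c = \operatorname{sh}(s_N) = \sqrt{2}, \quad f(c) = 2 = d.
\]
```

In this instance the shadow of $s_N$ can be read off directly, whereas in the general proof it is continuity of $f$ that forces $f(c) = d$.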
5.4 Uniform continuity

The notion of uniform continuity is a strengthening of the ordinary notion of continuity, and can be expressed with the formal sentence
\[
(\forall \varepsilon \in \mathbb{R}^+)(\exists \delta \in \mathbb{R}^+)(\forall x,y \in A)\bigl(|x-y| < \delta \to |f(x)-f(y)| < \varepsilon\bigr).
\]
The big difference here is that for a given $\varepsilon$, the same $\delta$ should work for all $x,y \in A$, whereas in the ordinary notion of continuity, $\delta$ can depend on $x$.

Theorem 5.10. The function $f \colon A \to \mathbb{R}$ is uniformly continuous on $A$ if and only if $f(x) \simeq f(y)$ for all $x,y \in {}^*A$ with $x \simeq y$.

Proof. This can be proven in a similar manner to the theorem for standard continuity, but using the formal sentence for uniform continuity stated above. □

Theorem 5.11. If $f$ is continuous on $[a,b]$, then $f$ is uniformly continuous on $[a,b]$.

Proof. Assume that $f$ is continuous. Now, take hyperreals $x,y \in {}^*[a,b]$ with $x \simeq y$. Let $c = \operatorname{sh}(x)$. Since $a \le x \le b$ and $x \simeq c$, we have $c \in [a,b]$, and so by assumption $f$ is continuous at $c$. Since both $c \simeq x$ and $c \simeq y$, we have that $f(c) \simeq f(x)$ and $f(c) \simeq f(y)$ by the continuity of $f$. By the transitivity of $\simeq$, we conclude that $f(x) \simeq f(y)$, and hence that $f$ is uniformly continuous on $[a,b]$. □

Remark. This proof does not transfer to more general intervals (for example $(0,1)$ or $[0,\infty)$) since it is a necessary part of the proof that the shadow of $x$ exists and is contained in the original interval, but for these intervals this is not guaranteed. As an example, let $(0,1)$ be our interval and let $x = \varepsilon$ be a positive infinitesimal, which is in ${}^*(0,1)$. Then $c = \operatorname{sh}(x) = 0 \notin (0,1)$.

Proposition 5.12. $f(x) = \frac{1}{x}$ is not uniformly continuous on $(0,1)$.

Proof. Let $H$ be any positive unlimited hyperreal. Then $H+1$ is also unlimited. Hence both $\frac{1}{H}$ and $\frac{1}{H+1}$ are positive infinitesimals, and hence we have $\frac{1}{H} \simeq \frac{1}{H+1}$ and $\frac{1}{H}, \frac{1}{H+1} \in {}^*(0,1)$. However $f\!\left(\frac{1}{H}\right) = H$ and $f\!\left(\frac{1}{H+1}\right) = H+1$, but $H \not\simeq H+1$. Therefore we have $x,y \in {}^*(0,1)$ with $x \simeq y$ such that $f(x) \not\simeq f(y)$, so $f$ is not uniformly continuous. □
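The same criterion can be tried on an unbounded domain. The following sketch assumes the example $f(x) = x^2$ on $A = \mathbb{R}$, which is not discussed in the text, and uses only Theorem 5.10.

```latex
% Assumed example: f(x) = x^2 on A = R, with H a positive unlimited hyperreal.
\[
x = H, \qquad y = H + \frac{1}{H}, \qquad y - x = \frac{1}{H} \simeq 0,
\]
\[
f(y) - f(x) = \Bigl(H + \frac{1}{H}\Bigr)^2 - H^2
            = 2 + \frac{1}{H^2} \simeq 2 \not\simeq 0 .
\]
```

So $x \simeq y$ while $f(x) \not\simeq f(y)$, and by Theorem 5.10 this $f$ is not uniformly continuous on $\mathbb{R}$, even though it is continuous at every real point.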
6 Limits and derivatives

6.1 Limits in hyperreal calculus

In order to talk about derivatives of functions, we want to be able to talk about limits of functions. In standard analysis, $L$ is the limit of $f$ as $x$ goes to $c$, written $\lim_{x \to c} f(x) = L$, if for any $\varepsilon \in \mathbb{R}^+$ there exists a $\delta \in \mathbb{R}^+$ such that $|f(x)-L| < \varepsilon$ whenever $0 < |x-c| < \delta$. The intuition behind this definition is that $f$ gets very close to $L$ as $x$ gets very close to $c$. The definition using non-standard analysis formalizes the intuitive idea that $f$ is infinitely close to $L$ when $x$ is infinitely close to $c$. Given $c, L \in \mathbb{R}$ and a function $f$ defined on $A \subseteq \mathbb{R}$, we have that
\[
\lim_{x \to c} f(x) = L \iff f(x) \simeq L \text{ for all } x \in {}^*A \text{ with } x \simeq c \text{ and } x \neq c.
\]
Similarly, one can define different types of limits, both one-sided limits and limits as $x$ tends to $\infty$. We have that

• $\lim_{x \to c^+} f(x) = L$ iff $f(x) \simeq L$ for all $x \in {}^*A$ with $x \simeq c$ and $x > c$.
• $\lim_{x \to c^-} f(x) = L$ iff $f(x) \simeq L$ for all $x \in {}^*A$ with $x \simeq c$ and $x < c$.
• $\lim_{x \to +\infty} f(x) = L$ iff $f(x) \simeq L$ for all $x \in {}^*A^+_\infty$ (and ${}^*A^+_\infty \neq \emptyset$).
• $\lim_{x \to -\infty} f(x) = L$ iff $f(x) \simeq L$ for all $x \in {}^*A^-_\infty$ (and ${}^*A^-_\infty \neq \emptyset$).

These can be proved in a similar manner to the related theorems for continuity or for convergence, but we will not give the proof here.
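As a small worked instance of the two-sided characterisation, consider the function $f(x) = \frac{x^2-4}{x-2}$ at $c = 2$. This example is an assumption chosen for illustration; it shows why the condition $x \neq c$ matters, since $f$ is not defined at $2$.

```latex
% Assumed example: f(x) = (x^2 - 4)/(x - 2) on A = R \ {2}, c = 2.
% Take any x in *A with x \simeq 2 and write x = 2 + \varepsilon,
% where \varepsilon is a non-zero infinitesimal.
\[
f(x) = \frac{(2+\varepsilon)^2 - 4}{\varepsilon}
     = \frac{4\varepsilon + \varepsilon^2}{\varepsilon}
     = 4 + \varepsilon \simeq 4,
\qquad\text{hence}\qquad \lim_{x \to 2} f(x) = 4 .
\]
```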
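6.2 Differentiation in hyperreal calculus

In standard analysis, we say that $f$ is differentiable at $x$ if
\[
\lim_{h \to 0} \frac{f(x+h)-f(x)}{h}
\]
exists, and if it does, we let $f'(x)$ denote the derivative of $f$ at $x$ and $f'(x) = \lim_{h \to 0} \frac{f(x+h)-f(x)}{h}$.

Theorem 6.1. If $f$ is defined at $x \in \mathbb{R}$, then $L \in \mathbb{R}$ is the derivative of $f$ at $x$ if and only if for every non-zero infinitesimal $\varepsilon$, $f(x+\varepsilon)$ is defined, and
\[
\frac{f(x+\varepsilon)-f(x)}{\varepsilon} \simeq L.
\]

Proof. Let $g(h) = \frac{f(x+h)-f(x)}{h}$. Then the statement that $\lim_{h \to 0} g(h) = L$ is equivalent to $f$ having derivative $L$ at $x$, and so applying the characterisation of limits from Section 6.1, the theorem follows. □

This means that when $f$ is differentiable, we can find the derivative as
\[
f'(x) = \operatorname{sh}\!\left(\frac{f(x+\varepsilon)-f(x)}{\varepsilon}\right)
\]
for any non-zero infinitesimal $\varepsilon$.

6.3 Examples

Proposition 6.2. The function $f(x) = x^2$ is differentiable at any $x \in \mathbb{R}$, and $f'(x) = 2x$ for all $x \in \mathbb{R}$.

Proof. Using the definition, we want to show that $\frac{f(x+\varepsilon)-f(x)}{\varepsilon} \simeq 2x$ for any infinitesimal $\varepsilon \neq 0$ and real $x$. By straightforward calculations, we have that
\[
\frac{f(x+\varepsilon)-f(x)}{\varepsilon}
= \frac{(x+\varepsilon)^2 - x^2}{\varepsilon}
= \frac{x^2 + 2x\varepsilon + \varepsilon^2 - x^2}{\varepsilon}
= \frac{\varepsilon(2x+\varepsilon)}{\varepsilon}
= 2x + \varepsilon \simeq 2x.
\]
Since $\frac{f(x+\varepsilon)-f(x)}{\varepsilon} \simeq 2x$ for any non-zero infinitesimal $\varepsilon$, by Theorem 6.1, $f$ is differentiable at all $x \in \mathbb{R}$, and $f'(x) = 2x$, as we wanted to show. □

Proposition 6.3. The function $f(x) = |x|$ is not differentiable at $x = 0$.

Proof. Let $\varepsilon$ be some positive infinitesimal. Then
\[
\frac{f(0+\varepsilon)-f(0)}{\varepsilon} = \frac{|0+\varepsilon| - |0|}{\varepsilon} = \frac{\varepsilon}{\varepsilon} = 1.
\]
However, we also have that
\[
\frac{f(0+(-\varepsilon))-f(0)}{-\varepsilon} = \frac{|0+(-\varepsilon)| - |0|}{-\varepsilon} = \frac{\varepsilon}{-\varepsilon} = -1.
\]
Since $-1 \not\simeq 1$, we have that $\frac{f(0+\varepsilon)-f(0)}{\varepsilon} \not\simeq \frac{f(0+\delta)-f(0)}{\delta}$ for the two non-zero infinitesimals $\varepsilon$ and $\delta = -\varepsilon$, and so they cannot both be infinitely close to the same real number $L$, which means that $f$ is not differentiable at $0$. □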
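The same recipe handles other polynomials. As a further sketch, assume $f(x) = x^3$ (an example not in the text) and apply Theorem 6.1 with a real $x$ and a non-zero infinitesimal $\varepsilon$.

```latex
% Assumed example: f(x) = x^3, x real, \varepsilon a non-zero infinitesimal
% (requires amsmath).
\begin{align*}
\frac{f(x+\varepsilon) - f(x)}{\varepsilon}
 &= \frac{(x+\varepsilon)^3 - x^3}{\varepsilon}
  = \frac{3x^2\varepsilon + 3x\varepsilon^2 + \varepsilon^3}{\varepsilon} \\
 &= 3x^2 + 3x\varepsilon + \varepsilon^2 \simeq 3x^2 .
\end{align*}
```

Because $x$ is real and hence limited, $3x\varepsilon + \varepsilon^2$ is infinitesimal, so the quotient has shadow $3x^2$ for every choice of $\varepsilon$, giving $f'(x) = 3x^2$.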
6.4 Increments

We introduce some notation to simplify our arguments. Let $\Delta x$ denote some non-zero infinitesimal, representing a small change or an increment in the value of $x$. Then we let $\Delta f = f(x+\Delta x)-f(x)$ denote the corresponding increment in the value of $f$ at $x$. To be explicit, we should write this as $\Delta f(x,\Delta x)$, since this value depends on both those variables, but we will mainly use the more convenient shorthand $\Delta f$. The way we will use this shorthand is to compute $\frac{\Delta f}{\Delta x}$, and if this is always infinitely close to the same real number, then we have that $f'(x) = \operatorname{sh}\!\left(\frac{\Delta f}{\Delta x}\right)$. But since $\frac{\Delta f}{\Delta x}$ is just an ordinary fraction of hyperreal numbers, we can compute $\Delta f$ on its own, something which will be useful.

An important thing to note is that if $f$ is differentiable at $x$, then $\frac{\Delta f}{\Delta x} \simeq f'(x)$, and so $\frac{\Delta f}{\Delta x}$ is limited. Since $\Delta f = \frac{\Delta f}{\Delta x}\Delta x$, we then have that $\Delta f$ is infinitesimal, and thus $f(x+\Delta x) \simeq f(x)$ for all infinitesimal $\Delta x$. This proves that

Theorem 6.4. If a function $f \colon A \to \mathbb{R}$ is differentiable at $x$, then $f$ is continuous at $x$.

The lemma that follows is needed mainly in our proof of the chain rule.

Lemma 6.5 (Incremental Equation). If $f'(x)$ exists at real $x$ and $\Delta x$ is infinitesimal, then there exists an infinitesimal $\varepsilon$, dependent on $x$ and $\Delta x$, such that
\[
\Delta f = f'(x)\Delta x + \varepsilon\Delta x.
\]

Proof. If $\Delta x = 0$, then $\Delta f = 0$ and any infinitesimal $\varepsilon$ works, so assume $\Delta x \neq 0$. Since $f'(x)$ exists, we have that $\frac{\Delta f}{\Delta x} \simeq f'(x)$, and hence that $\frac{\Delta f}{\Delta x} - f'(x) = \varepsilon$ for some infinitesimal $\varepsilon$. Multiplying through by $\Delta x$ and rearranging, we get that $\Delta f = f'(x)\Delta x + \varepsilon\Delta x$, which is what we wanted. □
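For a concrete feel for the incremental equation, the sketch below assumes $f(x) = x^2$, whose derivative $f'(x) = 2x$ was computed in Proposition 6.2, and reads off the infinitesimal $\varepsilon$ explicitly.

```latex
% Assumed example: f(x) = x^2, so f'(x) = 2x (Proposition 6.2).
\[
\Delta f = (x+\Delta x)^2 - x^2
         = 2x\,\Delta x + (\Delta x)^2
         = f'(x)\,\Delta x + \varepsilon\,\Delta x
\qquad\text{with}\qquad \varepsilon = \Delta x .
\]
```

Here $\varepsilon$ happens to depend only on $\Delta x$; in general it depends on both $x$ and $\Delta x$, as the lemma states.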
6.5 Theorems about derivatives

Theorem 6.6. If $f$ and $g$ are differentiable at $x$, so are $f+g$ and $fg$, and we have that

• $(f+g)'(x) = f'(x)+g'(x)$
• $(fg)'(x) = f(x)g'(x)+g(x)f'(x)$.

Proof. We first take the case of addition and compute $\Delta(f+g)$. We have that
\[
\Delta(f+g) = \bigl(f(x+\Delta x)+g(x+\Delta x)\bigr) - \bigl(f(x)+g(x)\bigr)
            = \bigl(f(x+\Delta x)-f(x)\bigr) + \bigl(g(x+\Delta x)-g(x)\bigr)
            = \Delta f + \Delta g
\]
and hence that
\[
\frac{\Delta(f+g)}{\Delta x} = \frac{\Delta f}{\Delta x} + \frac{\Delta g}{\Delta x} \simeq f'(x)+g'(x)
\]
under the assumption that both $f$ and $g$ are differentiable. Since the real value $f'(x)+g'(x)$ is independent of $\Delta x$, we conclude, by Theorem 6.1, that $(f+g)'(x) = f'(x)+g'(x)$.

For our proof of the statement regarding multiplication, we need a little trick, namely that $f(x+\Delta x) = f(x) + \bigl(f(x+\Delta x)-f(x)\bigr) = f(x)+\Delta f$. Then we get that
\[
\Delta(fg) = f(x+\Delta x)g(x+\Delta x) - f(x)g(x)
           = \bigl(f(x)+\Delta f\bigr)\bigl(g(x)+\Delta g\bigr) - f(x)g(x)
           = f(x)\Delta g + g(x)\Delta f + \Delta f\,\Delta g
\]
which yields that
\[
\frac{\Delta(fg)}{\Delta x} = f(x)\frac{\Delta g}{\Delta x} + g(x)\frac{\Delta f}{\Delta x} + \frac{\Delta f}{\Delta x}\Delta g \simeq f(x)g'(x) + g(x)f'(x) + 0
\]
where we again use that $g$ and $f$ are differentiable. The last term is infinitely close to $0$ since $\frac{\Delta f}{\Delta x}$ is limited and $\Delta g$ is infinitesimal. Since the real number $f(x)g'(x)+g(x)f'(x)$ is independent of $\Delta x$, we conclude, by applying Theorem 6.1, that $(fg)'(x) = f(x)g'(x)+g(x)f'(x)$. □

Theorem 6.7 (Chain Rule). If $f$ is differentiable at $x \in \mathbb{R}$, and $g$ is differentiable at $f(x)$, then $g \circ f$ is differentiable at $x$ with derivative $g'(f(x))f'(x)$.

Proof. For any non-zero infinitesimal $\Delta x$, $f(x+\Delta x)$ is defined and $f(x+\Delta x) \simeq f(x)$. Since $g'(f(x))$ exists, $g$ is defined at all points infinitely close to $f(x)$, which means that $(g \circ f)(x+\Delta x) = g(f(x+\Delta x))$ is defined. Now, we want to express $\Delta(g \circ f)$ in other terms. Again, we use that $f(x+\Delta x) = f(x)+\Delta f$. We get that
\[
\Delta(g \circ f) = g(f(x+\Delta x)) - g(f(x)) = g(f(x)+\Delta f) - g(f(x))
\]
which shows that $\Delta(g \circ f)$ is also the increment of $g$ at $f(x)$ corresponding to $\Delta f$. Using the more explicit notation for increments, we have that
\[
\Delta(g \circ f)(x,\Delta x) = \Delta g(f(x),\Delta f).
\]
By the incremental equation applied to $g$, there exists an infinitesimal $\varepsilon$ such that
\[
\Delta(g \circ f) = g'(f(x))\Delta f + \varepsilon\Delta f
\]
and hence that
\[
\frac{\Delta(g \circ f)}{\Delta x} = g'(f(x))\frac{\Delta f}{\Delta x} + \varepsilon\frac{\Delta f}{\Delta x} \simeq g'(f(x))f'(x) + 0
\]
since $\frac{\Delta f}{\Delta x}$ is limited. This establishes our claim, namely that $g'(f(x))f'(x)$ is the derivative of $g \circ f$ at $x$. □
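The steps of the chain rule proof can be traced on a small example. The sketch below assumes $f(x) = x^2$ and $g(u) = u^3$, choices made here for illustration, so that $(g \circ f)(x) = x^6$ and $g'(f(x)) = 3x^4$.

```latex
% Assumed example: f(x) = x^2, g(u) = u^3, x real, \Delta x a non-zero
% infinitesimal (requires amsmath).
\begin{align*}
\Delta f &= 2x\,\Delta x + (\Delta x)^2, \\
\Delta(g \circ f) &= (x^2 + \Delta f)^3 - (x^2)^3
   = 3x^4\,\Delta f + 3x^2(\Delta f)^2 + (\Delta f)^3 \\
  &= g'(f(x))\,\Delta f + \varepsilon\,\Delta f,
   \qquad \varepsilon = 3x^2\,\Delta f + (\Delta f)^2 \simeq 0, \\
\frac{\Delta(g \circ f)}{\Delta x}
  &= \bigl(3x^4 + \varepsilon\bigr)\bigl(2x + \Delta x\bigr) \simeq 6x^5 .
\end{align*}
```

The result agrees with $g'(f(x))f'(x) = 3x^4 \cdot 2x = 6x^5$, which is also what differentiating $x^6$ directly gives.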
Theorem 6.8 (Critical Point Theorem). Let $f$ be defined on some open interval $(a,b)$, and have a maximum or minimum at $x \in (a,b)$. If $f$ is differentiable at $x$, then $f'(x) = 0$.

Proof. Let $f$ have a maximum at $x$. By the transfer principle, we conclude that $f(x+\Delta x) \le f(x)$ and thus that $f(x+\Delta x)-f(x) \le 0$ for all infinitesimal $\Delta x$. Hence for a positive infinitesimal $\varepsilon$ and a negative infinitesimal $\delta$, we have that
\[
f'(x) \simeq \frac{f(x+\varepsilon)-f(x)}{\varepsilon} \le 0 \le \frac{f(x+\delta)-f(x)}{\delta} \simeq f'(x).
\]
Since $f'(x)$ is real, it must be equal to $0$. The case when $f$ has a minimum is similar. □
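As a quick check of the theorem on a concrete function, assume $f(x) = x(1-x)$ on $(0,1)$, which has its maximum at $x = \tfrac12$ because $f(x) = \tfrac14 - (x-\tfrac12)^2$. This example is illustrative and not taken from the text.

```latex
% Assumed example: f(x) = x(1 - x) on (0,1), maximum at x = 1/2;
% \Delta x is a non-zero infinitesimal.
\[
f\bigl(\tfrac12 + \Delta x\bigr) - f\bigl(\tfrac12\bigr)
 = \bigl(\tfrac12 + \Delta x\bigr)\bigl(\tfrac12 - \Delta x\bigr) - \tfrac14
 = -(\Delta x)^2 \le 0,
\]
\[
\frac{f\bigl(\tfrac12 + \Delta x\bigr) - f\bigl(\tfrac12\bigr)}{\Delta x}
 = -\Delta x \simeq 0,
\qquad\text{so}\qquad f'\bigl(\tfrac12\bigr) = 0 .
\]
```

The quotient $-\Delta x$ is negative for positive $\Delta x$ and positive for negative $\Delta x$, which is exactly the sign pattern used in the proof.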
References
[Gol98] Robert Goldblatt. Lectures on the hyperreals. An introduction to nonstandard analysis. Springer-Verlag, New York, 1998.
[Kei76] H. Jerome Keisler. Foundations of infinitesimal calculus. 1976.
