FOUNDATIONS OF NONSTANDARD ANALYSIS

A Gentle Introduction to Nonstandard Extensions
BY C. WARD HENSON
1. Introduction
There are many introductions to nonstandard analysis (some of which are listed in the References), so why write another one? All of the existing introductions have one or more of the following features:
(A) heavy use of logical formalism right from the start;
(B) early introduction of set theoretic apparatus in excess of what is needed for most applications;
(C) dependence on an explicit construction of the nonstandard model, usually by means of the ultrapower construction.
All three of these features have negative consequences. The early use of logical formalism or set theoretic structures is often uncomfortable for mathematicians who do not have a background in logic, and this can effectively deter them from using nonstandard methods. The explicit use of a particular nonstandard model makes the foundations too specific and inflexible, and often inhibits the free use of the ideas of nonstandard analysis. In this exposition we intend to avoid these disadvantages. The readers for whom we have written are experienced mathematicians (including advanced students) who do not necessarily have any background in or even tolerance for symbolic logic. We hope to convince such readers that nonstandard methods are accessible and that the small amount of logical notation which turns out to be useful in applying them is actually simple and natural. Of course readers who do have a background in logic may also find this approach useful. We give a natural, geometric definition of nonstandard extension in Section 2; no logical formulas are used and there are no set theoretic structures. In Section 3 we introduce logical notation of the kind that all students of
mathematics encounter, and we carefully show how it can be used without difficulty to obtain useful facts about nonstandard extensions. In Section 4 we extend the concept of nonstandard extension to mathematical settings in which there may be several basic sets (such as the vector space setting, where there is a field F and a vector space V). In Sections 5 and 6 we show how these ideas can be used to introduce nonstandard extensions in which sets and other objects of higher type can be handled, as is certainly necessary for applications of nonstandard methods in such areas as abstract analysis and topology. However, we do this in stages; in particular, in Section 5 we indicate how to deal with nonstandard extensions in a simple setting where a limited amount of set theoretic apparatus has been introduced. Such limited frameworks for nonstandard analysis are nonetheless adequate for essentially all applications. Section 6 treats the full superstructure apparatus which has become one of the standard ways of formulating nonstandard analysis and which is frequently used in the literature. In Section 7 we briefly discuss saturation properties of nonstandard extensions.
In several places we introduce specific nonstandard extensions using the ultraproduct construction, and we explore the meaning of certain basic concepts (such as internal set) in these concrete settings. (See the last parts of Sections 2, 4, 5, and 7.) Our experience shows that it is often helpful at the beginning to have such explicit nonstandard extensions at hand. As noted above, however, we think it is limiting to become dependent on such a construction and we encourage readers to adopt the more flexible axiomatic approach as quickly as possible.
In writing this exposition we have benefitted greatly from conversations with Lou van den Dries about the best ways to present ideas from model theory to the general mathematical public.
His ideas are presented in [5] and, with Chris Miller, in [6], and our treatment obviously depends heavily on that work. We have also freely used many ideas from other expositions of nonstandard analysis (listed among the References) and from the other Chapters in this book. To all these authors we express our sincere appreciation, and we recommend their writings to the reader who finishes this exposition with a desire to learn more about how nonstandard methods can be applied.
2. Nonstandard Extensions
The starting point of nonstandard analysis is the construction and use of an ordered field ∗R which is a proper extension of the usual ordered field R of real numbers, and which satisfies all of the properties of R (in a sense that we will soon make precise). We refer to ∗R as a field of nonstandard real numbers, or as a field of hyperreal numbers. Because the ordered field
R is Dedekind complete, it follows that the extension field ∗R will necessarily have among its new elements both infinitesimal and infinite numbers. These new numbers play a fundamental role in nonstandard analysis, which was created by Abraham Robinson [14] in order to make reasoning with infinitesimals rigorous. (An element a of ∗R is finite if there exists r ∈ R such that −r ≤ a ≤ r in ∗R; otherwise a is infinite; a is infinitesimal if −r ≤ a ≤ r holds for every positive r ∈ R. In some places a finite number is called limited.) It is easily seen that a proper extension field ∗R of R cannot satisfy literally all the properties of R. For example it cannot be Dedekind complete. (The set of finite numbers in ∗R cannot have a least upper bound s, because then s − 1 would be a smaller upper bound.) The challenge was to establish a clear and consistent foundation for reasoning with infinitesimals, that captured the known heuristic arguments as much as possible. This was accomplished by Abraham Robinson in the 1960s. The purpose of this paper is to describe the essential features of the resulting frameworks without getting bogged down in technicalities of formal logic and without becoming dependent on an explicit construction of a specific field ∗R.
We usually think of R as being equipped with additional structure, in the form of distinguished sets, relations, and functions; we include whatever objects play a role in the mathematical problems we are considering. For example, these will normally include the set N of natural numbers (non-negative integers) and often such functions as the sine, the cosine, exponentiation, and the like. When we say (as we did above) that ∗R satisfies all of the properties of R, we mean (in part) that each of these additional sets, relations, and functions on R will have a counterpart on ∗R, and that the entire system of counterpart objects will satisfy an appropriate set of conditions.
For example, if we are thinking of N as one of the given sets, then ∗R contains a discrete set ∗N which is the counterpart to N. The conditions that we impose on ∗R and ∗N will imply that N is an initial segment of ∗N, and that the elements of ∗N \ N are infinite numbers in ∗R. Moreover, for every positive number r in ∗R there will be a unique N ∈ ∗N which is the hyperinteger part of r, in the sense that N ≤ r < N + 1. We will refer to ∗N as the set of nonstandard natural numbers or as the set of hypernatural numbers.
The presence of infinitesimal and infinite numbers allows us to give elegant and useful characterizations of many important mathematical concepts, and this phenomenon is the basis for a large part of the impact of nonstandard analysis. For example, suppose (s_n)_{n∈N} is a sequence of real numbers and t is a real number. Then one can prove the following characterization of the limit concept:
s_n → t as n → ∞ ⇐⇒ s_N ≈ t for all infinite N ∈ ∗N.
(For any two numbers s, t ∈ ∗R we write s ≈ t to mean that the difference s − t is infinitesimal.) Using such a characterization allows many heuristic arguments about limits to be made precise; for example, it becomes easy to give elementary algebraic proofs of the basic facts about the algebra of limits.
Note that in the characterization of the limit condition given above, we used the expression s_N where N was an element of ∗N. This requires explanation, since s_n was originally given only for n ∈ N. We think of the sequence as a function s: N → R and we regard this function as part of the basic apparatus with which R is initially equipped. Therefore, it has a counterpart on ∗R, which will be a function defined on ∗N and having values in ∗R. It is this function that we have in mind when we write s_N for N ∈ ∗N.
We are now ready to give a formal description of the properties we require our nonstandard real field ∗R to satisfy. For the moment we will only consider first order structure on R. Therefore we will not yet be considering the higher order objects of analysis, such as measures, Banach spaces, and the like. We start out in this limited way for pedagogical reasons, to make the task of mastering the fundamental language of nonstandard analysis easier for beginners. (Later on, in Section 5 and especially in Section 6, we will add the machinery of higher type objects which is needed for the full range of arguments in nonstandard analysis.)
We consider R as being equipped with all possible first order properties (i.e. sets and relations) and functions. We do this in order to have a foundation which is as flexible as possible and which provides any object we might need later in handling specific mathematical problems. In order to make our basic definition simpler technically, we handle functions by means of their graphs.
Therefore, we take the point of view that our additional structure on R consists of the collection of all possible subsets of every Cartesian power R^n, as n ranges over the integers ≥ 0.
Next we give the key definition. In it we give a precise description of the properties that must be preserved by passage to the nonstandard extension. The requirements are simple and natural, and they have a distinctly geometric character. (Strictly speaking we are defining here a first order concept of nonstandard extension; the definition will be suitably modified below when we add higher order objects to our setting.)
2.1. Definition. [Nonstandard Extension of a Set] Let X be a nonempty set. A nonstandard extension of X consists of a mapping that assigns a set ∗A to each A ⊆ X^m for all m ≥ 0, such that ∗X is non-empty and the following conditions are satisfied for all m, n ≥ 0:
(E1) The mapping preserves Boolean operations on subsets of X^m:
if A ⊆ X^m, then ∗A ⊆ (∗X)^m; if A, B ⊆ X^m, then ∗(A ∩ B) = (∗A ∩ ∗B), ∗(A ∪ B) = (∗A ∪ ∗B), and ∗(A \ B) = (∗A) \ (∗B).
(E2) The mapping preserves basic diagonals: if 1 ≤ i < j ≤ m and ∆ = {(x_1,...,x_m) ∈ X^m | x_i = x_j} then ∗∆ = {(x_1,...,x_m) ∈ (∗X)^m | x_i = x_j}.
(E3) The mapping preserves Cartesian products: if A ⊆ X^m and B ⊆ X^n, then ∗(A × B) = ∗A × ∗B. (We regard A × B as a subset of X^{m+n}.)
(E4) The mapping preserves projections that omit the final coordinate: let π denote projection of (n+1)-tuples on the first n coordinates; if A ⊆ X^{n+1} then ∗(π(A)) = π(∗A).
While this definition is reasonably elegant and can be comprehended rather easily, there is certainly some work to be done before we can exploit it. For example, suppose we have a nonstandard extension of R. How do we prove that the subset ∗N of ∗R has the properties that were claimed above? (Namely, that N is an initial segment of ∗N, that the elements of ∗N \ N are infinite numbers in ∗R, and that for every positive number r in ∗R there is a unique N ∈ ∗N which satisfies N ≤ r < N + 1.) Moreover, ∗R is supposed to be an ordered field extension of R, and even this does not seem to be directly guaranteed by the conditions in the definition.
We first turn to a series of elementary arguments which prove some of the most basic properties of nonstandard extensions, especially those having to do with the handling of functions. Not only are the results important, but the arguments illustrate how one can derive information from conditions (E1) – (E4). Near the end of the Section we continue this theme by means of a set of Exercises.
2.2. Proposition. For each n ≥ 0, ∗(X^n) = (∗X)^n and ∗∅ = ∅.
Proof. The first equation follows from repeated use of (E3) and the second equation follows from (E1); note that ∗∅ = ∗(∅ \ ∅) = ∗∅ \ ∗∅ = ∅. □
2.3. Proposition. If A ⊆ X^m is non-empty, then ∗A is also non-empty. Therefore, for any A, B ⊆ X^m, ∗A = ∗B ⇐⇒ A = B.
Proof. For ease of notation we consider only the case m = 2.
Let π_2 and π_3 be the projections defined by π_2(x,y) = x and π_3(x,y,z) = (x,y) respectively. If A ⊆ X^2 is non-empty, then X = π_2(π_3(X × A)). Using (E3) and (E4) we get ∗X = π_2(π_3(∗X × ∗A)). Since ∗X is non-empty, it follows that ∗A must also be non-empty. The second statement follows from the first and (E1). □
2.4. Proposition. For all A, B ⊆ X^m, A ⊆ B ⇐⇒ ∗A ⊆ ∗B.
Proof. Suppose A ⊆ B. Then A = A ∩ B, so by (E1) we have ∗A = ∗(A ∩ B) = ∗A ∩ ∗B and hence ∗A ⊆ ∗B. The reverse implication follows by a similar argument and Proposition 2.3. □
2.5. Proposition. For each x ∈ X, ∗{x} has exactly one element.
Proof. By Proposition 2.3, ∗{x} has at least one element. Let ∆ = {(u,u) | u ∈ X}, and note that {x} × {x} = {(x,x)} ⊆ ∆. Using (E3) and (E2) we get ∗{x} × ∗{x} ⊆ ∗∆ = {(u,u) | u ∈ ∗X}, from which it follows that ∗{x} has exactly one element. □
Propositions 2.3 and 2.5 allow us to introduce an embedding of X into ∗X which is canonically associated with the given nonstandard extension. After introducing this embedding, we prove that it is fully compatible with the operation of forming n-tuples, and hence with Cartesian products.
2.6. Notation. For each x ∈ X, we let ∗x denote the unique element of the set ∗{x}. For each x = (x_1,...,x_n) ∈ X^n we let ∗x = (∗x_1,...,∗x_n). Note that this gives two usages for an expression of the form ∗β; if β is an element of X, then ∗β is defined in this paragraph, while if β is a subset of some Cartesian power X^m, then ∗β is the subset of (∗X)^m which is provided by the given nonstandard extension.
2.7. Definition. An element of (∗X)^n is called standard if it is of the form ∗x for some x ∈ X^n. It follows that an element of (∗X)^n is standard if and only if all of its coordinates are standard elements of ∗X.
2.8. Proposition. For each x_1,...,x_n ∈ X, ∗{(x_1,...,x_n)} = {(∗x_1,...,∗x_n)}.
Proof. ∗{(x_1,...,x_n)} = ∗({x_1} × ··· × {x_n}) = ∗{x_1} × ··· × ∗{x_n} = {∗x_1} × ··· × {∗x_n} = {(∗x_1,...,∗x_n)}. □
2.9. Proposition. For each A ⊆ X^m and x_1,...,x_m ∈ X, (x_1,...,x_m) ∈ A ⇐⇒ (∗x_1,...,∗x_m) ∈ ∗A.
Proof. Using Propositions 2.4 and 2.8 note that (x_1,...,x_m) ∈ A ⇐⇒ {(x_1,...,x_m)} ⊆ A ⇐⇒ ∗{(x_1,...,x_m)} ⊆ ∗A ⇐⇒ {(∗x_1,...,∗x_m)} ⊆ ∗A ⇐⇒ (∗x_1,...,∗x_m) ∈ ∗A. □
2.10. Remark. The map taking x ∈ X to ∗x is an embedding of X into ∗X. Without loss of generality we may assume that X is a subset of ∗X and that ∗x = x for all x ∈ X. When this additional condition is satisfied, the given nonstandard extension is truly an extension mapping, in the strong sense that for all A ⊆ X^m, (∗A) ∩ X^m = A (and therefore, in particular, A ⊆ ∗A).
Justification. For x, y ∈ X we have: ∗x = ∗y ⇐⇒ ∗{x} = ∗{y} ⇐⇒ {x} = {y} ⇐⇒ x = y, so this map is an embedding. Therefore we may follow the conventional practice of “identifying” ∗x with x for all x ∈ X. The precise way to do this is to construct an isomorphic nonstandard extension as follows: let Y be a set and h: ∗X → Y a bijection, chosen together so that X ⊆ Y and x = h(∗x) for all x ∈ X. For each m ≥ 0 and each A ⊆ X^m, let Θ(A) be the subset of Y^m defined by Θ(A) = {(h(u_1),...,h(u_m)) | (u_1,...,u_m) ∈ ∗A}. It is a straightforward exercise using the previous Propositions to show that the set mapping Θ is a nonstandard extension. It is easily seen that X ⊆ Y = Θ(X) and Θ({x}) = {x} for all x ∈ X, from which it follows that Θ has the extra properties we wanted to achieve. The facts given in the second sentence of this Remark follow immediately using Proposition 2.9. Note that Θ is isomorphic to the original nonstandard extension in a natural sense. □
When we established the framework of nonstandard extensions, we stated briskly that we would handle functions by means of their graphs. Now we must show that this actually works.
2.11. Proposition. Suppose A ⊆ X^m and B ⊆ X^n, and let f: A → B be a function; take Γ ⊆ X^{m+n} to be the graph of f. Then ∗Γ is the graph of a function from ∗A to ∗B.
Proof. For ease of notation we treat only the case m = n = 1. Let π denote the projection defined by π(x,y) = x. The key properties of Γ which reflect the fact that it is the graph of a function from A to B are the following: Γ ⊆ A × B; π(Γ) = A; and
(Γ × Γ) ∩ {(x,y,u,v) ∈ X^4 | x = u} ⊆ {(x,y,u,v) ∈ X^4 | y = v}.
The first of these statements expresses the fact that the domain of the function is contained in A and the range is contained in B. The second statement expresses that the domain of the function is A. The third (displayed) statement expresses the fact that Γ is the graph of a function.
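For readers who like to experiment, the three conditions just described can be tested mechanically on small finite sets. The following is only an illustrative sketch for the case m = n = 1; the function name `is_graph_of_function` and the sample sets are hypothetical and do not come from the text.

```python
from itertools import product


def is_graph_of_function(gamma, A, B):
    """Test the three conditions from the proof (case m = n = 1) on finite sets."""
    gamma, A, B = set(gamma), set(A), set(B)
    # (1) Gamma is a subset of A x B.
    inside_product = gamma <= set(product(A, B))
    # (2) The projection of Gamma onto the first coordinate is all of A.
    full_domain = {x for (x, _) in gamma} == A
    # (3) (Gamma x Gamma) ∩ {x = u} is contained in {y = v}.
    single_valued = all(
        y == v for (x, y), (u, v) in product(gamma, gamma) if x == u
    )
    return inside_product and full_domain and single_valued


# Hypothetical sample data.
A = {0, 1, 2}
B = {"a", "b"}
good = {(0, "a"), (1, "a"), (2, "b")}      # the graph of a function A -> B
two_values = good | {(0, "b")}             # violates condition (3)
missing_point = {(0, "a"), (1, "a")}       # violates condition (2)

print(is_graph_of_function(good, A, B))
print(is_graph_of_function(two_values, A, B))
print(is_graph_of_function(missing_point, A, B))
```

Each condition is phrased purely in terms of inclusions, projections, products, and a diagonal, which is exactly why conditions (E1) – (E4) can transfer it to ∗Γ.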
Using conditions (E1) – (E4) we conclude: ∗Γ ⊆ ∗A × ∗B; π(∗Γ) = ∗A; and
(∗Γ × ∗Γ) ∩ {(x,y,u,v) ∈ (∗X)^4 | x = u} ⊆ {(x,y,u,v) ∈ (∗X)^4 | y = v}.
From these conditions the desired statements about ∗Γ follow immediately. □
2.12. Notation. Suppose A ⊆ X^m and B ⊆ X^n, and let f: A → B be a function; take Γ ⊆ X^{m+n} to be the graph of f. We denote by ∗f the function from ∗A to ∗B whose graph is ∗Γ.
2.13. Proposition. If f is the identity function on A ⊆ X^m, then ∗f is the identity function on ∗A.
Proof. If f is the identity function on A ⊆ X^m, then the graph Γ of f is given by the following definition:
Γ = {(x_1,...,x_m,y_1,...,y_m) ∈ X^{2m} | x_1 = y_1,...,x_m = y_m} ∩ (A × A).
This set is the intersection of A × A with m diagonal subsets of X^{2m},
Γ = ∆_1 ∩ ··· ∩ ∆_m ∩ (A × A),
where for each 1 ≤ j ≤ m we define ∆_j = {(x_1,...,x_m,y_1,...,y_m) ∈ X^{2m} | x_j = y_j}. Therefore, using (E1) – (E3) we have
∗Γ = {(x_1,...,x_m,y_1,...,y_m) ∈ (∗X)^{2m} | x_1 = y_1,...,x_m = y_m} ∩ (∗A × ∗A).
Since ∗Γ is the graph of ∗f, this proves the desired result. □
2.14. Proposition. Suppose A ⊆ X^m and B ⊆ X^n, and let f: A → B be a function. For all (x_1,...,x_m) ∈ A, (∗f)(∗x_1,...,∗x_m) = ∗(f(x_1,...,x_m)).
Proof. Take (x_1,...,x_m) ∈ A and let y = f(x_1,...,x_m), so (x_1,...,x_m,y) ∈ Γ where Γ is the graph of f. From Proposition 2.9 we get (∗x_1,...,∗x_m,∗y) ∈ ∗Γ, so that (∗f)(∗x_1,...,∗x_m) = ∗y. □
2.15. Proposition. [Permuting and Identifying Variables] Suppose σ is any function from {1,...,m} into {1,...,n}. Given A ⊆ X^m define B = {(x_1,...,x_n) ∈ X^n | (x_σ(1),...,x_σ(m)) ∈ A}. Then ∗B = {(x_1,...,x_n) ∈ (∗X)^n | (x_σ(1),...,x_σ(m)) ∈ ∗A}.
Proof. For ease of notation we consider the case where A ⊆ X^3 is given and B ⊆ X^2 is defined by B = {(x,y) ∈ X^2 | (y,x,y) ∈ A}. Introduce C ⊆ X^5 by the definition
C = {(x,y,u,v,w) ∈ X^5 | u = y ∧ v = x ∧ w = y ∧ (u,v,w) ∈ A}.
Evidently B is the result of projecting C onto the first two coordinates. Moreover, C is the intersection of three diagonal subsets of X^5 and the set X^2 × A. Therefore, it follows using conditions (E1) – (E4) that
∗C = {(x,y,u,v,w) ∈ (∗X)^5 | u = y ∧ v = x ∧ w = y ∧ (u,v,w) ∈ ∗A}
and that ∗B is the result of projecting ∗C onto the first two coordinates. The desired result follows immediately. □
2.16. Proposition. Condition (E4) in Definition 2.1 holds for all projections π from m-tuples to n-tuples, where n ≤ m. (By calling π a projection we mean that there exists a sequence 1 ≤ i(1) < ... < i(n) ≤ m such that π is defined by π(x_1,...,x_m) = (x_i(1),...,x_i(n)).) That is, if A ⊆ X^m, then ∗(π(A)) = π(∗A).
Proof. Let π be as described in the statement of the Proposition. Let σ be a permutation of {1,...,m} so that σ(i(j)) = j for all j = 1,...,n. Let A ⊆ X^m be given and define B = {(x_1,...,x_m) ∈ X^m | (x_σ(1),...,x_σ(m)) ∈ A}. It is routine to check that π(A) is the same as the result of projecting B onto the first n coordinates. Condition (E4) (applied m − n times) therefore implies that ∗(π(A)) is the result of projecting ∗B onto the first n coordinates. Proposition 2.15 implies that ∗B = {(x_1,...,x_m) ∈ (∗X)^m | (x_σ(1),...,x_σ(m)) ∈ ∗A}. Hence π(∗A) is the same as the result of projecting ∗B onto the first n coordinates. Therefore ∗(π(A)) = π(∗A). □
2.17. Proposition. Let A ⊆ X^{m+n} and a = (a_1,...,a_m) ∈ X^m. Define
A(a) = {(x_1,...,x_n) ∈ X^n | (a_1,...,a_m,x_1,...,x_n) ∈ A}
and similarly
(∗A)(∗a) = {(x_1,...,x_n) ∈ (∗X)^n | (∗a_1,...,∗a_m,x_1,...,x_n) ∈ ∗A}.
Then ∗(A(a)) = (∗A)(∗a).
Proof. For ease of notation we consider only the case m = n = 1. Let π denote the projection defined by π(x,y) = y. Note that A(a) = π(A ∩ ({a} × X)). Therefore, using conditions (E1) and (E3), and Proposition 2.16 we have
∗(A(a)) = π(∗A ∩ (∗{a} × ∗X)) = π(∗A ∩ ({∗a} × ∗X)) = (∗A)(∗a). □
2.18. Proposition. Suppose A ⊆ X^m, B ⊆ X^n and C ⊆ X^p; let f: A → B and g: B → C be functions. Then ∗(g ◦ f) = (∗g) ◦ (∗f).
Proof. For ease of notation we treat only the case m = n = p = 1. Let Γ_f be the graph of f and Γ_g the graph of g, and let Γ be the graph of the composition g ◦ f. Let π be the projection defined by letting π(x,y,u,v) = (x,v). Consider the set A ⊆ X^4 defined by A = {(x,y,u,v) ∈ X^4 | y = u} ∩ (Γ_f × Γ_g). Evidently Γ = π(A). The desired result follows immediately from conditions (E1) – (E3) and Proposition 2.16. □
Next we give some Exercises which continue the themes developed above. The reader is advised to solve them, as much as possible using the methods of this Section. They will be easier to solve once the machinery of logical notation is developed, as it will be in the next Section. However, especially for readers who have no previous experience with logic, working these Exercises at this point will bring significant benefits. Most of all, such effort will cause the reader to appreciate the advantages of logical notation and to understand how simple are the few technical ideas that it embodies.
2.19. Exercise. Condition (E2) holds for all diagonal sets ∆ ⊆ X^n. By a diagonal set we mean that there is an equivalence relation E on {1,...,n} such that ∆ = {(x_1,...,x_n) ∈ X^n | x_i = x_j whenever iEj}. For every such ∆, ∗∆ = {(x_1,...,x_n) ∈ (∗X)^n | x_i = x_j whenever iEj}.
If A is a subset of X^m, then {(∗a_1,...,∗a_m) | (a_1,...,a_m) ∈ A} is a subset of ∗A, by Proposition 2.9. Indeed, {(∗a_1,...,∗a_m) | (a_1,...,a_m) ∈ A} is precisely the set of standard elements of ∗A. The next two Exercises explore the extent to which {(∗a_1,...,∗a_m) | (a_1,...,a_m) ∈ A} is a proper subset of ∗A.
2.20. Exercise. If A is a finite subset of X^m, then ∗A = {(∗x_1,...,∗x_m) | (x_1,...,x_m) ∈ A}. In particular, if A is finite, then ∗A is finite and has the same cardinality as A, and all of its elements are standard.
2.21. Definition. A nonstandard extension of X is called proper if for every infinite subset A of X, ∗A contains a nonstandard element.
2.22. Exercise. Suppose our nonstandard extension is proper. Then, for any infinite set A ⊆ X^m, ∗A has a nonstandard element.
2.23. Exercise. Let A ⊆ X^m and suppose f: A → X^n is a function.
(a) If B ⊆ A, then ∗(f(B)) = (∗f)(∗B).
(b) If C ⊆ X^n, then ∗(f^{−1}(C)) = (∗f)^{−1}(∗C).
(c) If B ⊆ A, then ∗(f|B) = (∗f)|(∗B).
2.24. Exercise. For j = 1,...,n let f_j: X^m → X be a function, and let f = (f_1,...,f_n): X^m → X^n be the function with f_1,...,f_n as its coordinates. Then ∗f = (∗f_1,...,∗f_n).
2.25. Exercise. Suppose A ⊆ X^m and B ⊆ X^n, and let f: A → B be a function.
(a) f is injective ⇐⇒ ∗f is injective.
(b) f is surjective ⇐⇒ ∗f is surjective.
(c) If f is a bijection and its inverse is g, then ∗g is the inverse of ∗f.
2.26. Exercise. Consider a nonstandard extension of R. The set ∗R is equipped with binary functions ∗+ and ∗× and with a binary relation ∗<. Equipped with this additional structure, ∗R is an ordered field.
2.27. Exercise. Expand all proofs in this Section so that they are fully general and cover all cases of the results being proved.
We conclude this Section by using the ultraproduct construction to prove the existence of proper nonstandard extensions. (See Definition 2.21.) Let J be any infinite set and let U be an ultrafilter on J. Consider an indexed family (A_j | j ∈ J) of non-empty sets. We define the ultraproduct of the sets (A_j | j ∈ J) with respect to the ultrafilter U. This will be denoted Π_U(A_j | j ∈ J) or simply Π_U A_j. To define the ultraproduct, consider the ordinary Cartesian product ΠA_j of the given family of sets; this is the set of all functions α which are defined on J and which satisfy α(j) ∈ A_j for all j ∈ J.
We define a relation ∼ on ΠA_j by α ∼ β ⇐⇒ {j ∈ J | α(j) = β(j)} ∈ U.
This is an equivalence relation, as can be proved easily using basic properties of ultrafilters. For each α ∈ ΠA_j let [α] denote the equivalence class of α under ∼. The ultraproduct Π_U A_j is then defined to be the set of all equivalence classes [α] as α ranges over ΠA_j:
Π_U(A_j | j ∈ J) := {[α] | α ∈ Π(A_j | j ∈ J)}.
If the sets (A_j | j ∈ J) are all equal to the same set A, then the ultraproduct Π_U(A_j | j ∈ J) is called an ultrapower of A and it is denoted A^J/U.
2.28. Theorem. [Existence of Nonstandard Extensions] Each nonempty set X has a proper nonstandard extension, in which the set ∗X may be taken to be an ultrapower of X with respect to a countably incomplete ultrafilter.
Proof. Let J be any infinite index set and let U be any countably incomplete ultrafilter on J. This means that there exists a sequence (F_k)_{k∈N} of sets in U whose intersection ⋂{F_k | k ∈ N} is empty. There exists such an ultrafilter on each infinite index set J. Moreover if J is countable and U is any nonprincipal ultrafilter on J, then it is easy to see that U must be countably incomplete. The underlying set ∗X of our nonstandard extension will be the ultrapower X^J/U defined above. Therefore each element of ∗X is an equivalence class [α] for some function α: J → X. Given m ≥ 0 and A ⊆ X^m, we define ∗A ⊆ (∗X)^m by:
∗A = {([α_1],...,[α_m]) | {j ∈ J | (α_1(j),...,α_m(j)) ∈ A} ∈ U}.
In this definition, α_1,...,α_m range over the set X^J of all functions from J into X. We need to show this mapping satisfies conditions (E1) – (E4) in Definition 2.1.
(E1) Fix m ≥ 0 and let A, B ⊆ X^m. It is immediate that ∗A is a subset of (∗X)^m. Let α_1,...,α_m be functions from J to X and set
F = {j ∈ J | (α_1(j),...,α_m(j)) ∈ A},
G = {j ∈ J | (α_1(j),...,α_m(j)) ∈ B}.
Using properties of the ultrafilter, it is easy to prove
([α_1],...,[α_m]) ∈ ∗(A ∩ B) ⇐⇒ (F ∩ G) ∈ U ⇐⇒ F ∈ U ∧ G ∈ U ⇐⇒ ([α_1],...,[α_m]) ∈ ∗A and ([α_1],...,[α_m]) ∈ ∗B.
Similarly
([α_1],...,[α_m]) ∈ ∗(X^m \ A) ⇐⇒ J \ F ∈ U ⇐⇒ F ∉ U ⇐⇒ ([α_1],...,[α_m]) ∈ (∗X)^m \ ∗A.
This suffices to prove condition (E1).
(E2) For simplicity of notation we consider the basic diagonal subset of X^2 given by ∆ = {(x,y) ∈ X^2 | x = y}. Then
([α],[β]) ∈ ∗∆ ⇐⇒ {j ∈ J | α(j) = β(j)} ∈ U ⇐⇒ α ∼ β ⇐⇒ [α] = [β].
This shows that ∗∆ is the desired diagonal subset of (∗X)^2.
(E3) The fact that Cartesian products are preserved by this mapping is immediate from the definition and an argument similar to the proof of (E1).
(E4) For simplicity of notation we consider only a subset A of X^2 and the projection π(x,y) = x onto the first coordinate. Let B be the projection of A under π. Given functions α, β: J → X, let F = {j ∈ J | (α(j),β(j)) ∈ A} and G = {j ∈ J | α(j) ∈ B}. If ([α],[β]) ∈ ∗A then F ∈ U and F ⊆ G, so that also G ∈ U and hence [α] ∈ ∗B. Conversely, suppose [α] ∈ ∗B so that G ∈ U. Define β(j) for j ∈ G by choosing it so that (α(j),β(j)) ∈ A. For j ∉ G define β(j) arbitrarily in X. For this pair of functions α, β we have G ⊆ F so F ∈ U and therefore ([α],[β]) ∈ ∗A. This proves (E4).
To finish the proof we must prove that this nonstandard extension is proper. Let A be an infinite subset of X. Consider a sequence (F_k)_{k∈N} of sets in U whose intersection ⋂{F_k | k ∈ N} is empty. Without loss of generality we may assume F_0 = J and F_n ⊇ F_{n+1} for all n ∈ N. This allows us to define d(j) for all j ∈ J to be the largest k ∈ N for which j ∈ F_k. Note that for all n ∈ N and all j ∈ J, d(j) = n if and only if j ∈ F_n \ F_{n+1}. Choose a sequence (a_k)_{k∈N} out of A which has no repetitions. Define α: J → A by setting α(j) = a_{d(j)} for all j ∈ J. It remains to show that [α] is not standard. Indeed, if [α] were equal to the standard element ∗a for some a ∈ X, then the set F = {j ∈ J | α(j) = a} would be an element of U. (See Exercise 2.29.) However, the construction of α ensures that F is either empty or equal to F_n \ F_{n+1} for some n ∈ N. In the second case F is disjoint from F_{n+1}, which belongs to U, so in either case F is not an element of U. □
We conclude this Section with a few Exercises about ultrapower nonstandard extensions.
2.29. Exercise.
Let J be an index set and U an ultrafilter on J, and consider the ultrapower nonstandard extension of X that is constructed in the proof of Theorem 2.28.
(a) For each a ∈ X, ∗a = [α], where α: J → X is the constant function with α(j) = a for all j ∈ J.
(b) Let f: X^m → X be a function. Let α_1,...,α_m be elements of X^J and define β ∈ X^J by setting β(j) = f(α_1(j),...,α_m(j)) for all j ∈ J. Then (∗f)([α_1],...,[α_m]) = [β]. Give a similar description of ∗f where f: A → B is any function, with A ⊆ X^m and B ⊆ X^n.
2.30. Exercise. Let J be an index set and U an ultrafilter on J. Suppose A ⊆ X^m and consider the set ∗A defined in the proof of Theorem 2.28 above. There is a natural way to identify ∗A with the ultrapower A^J/U. (Hint: if α_1,...,α_m are functions from J to X, then α = (α_1,...,α_m) may be regarded as a function from J into X^m, and every such function arises in this way. If {j ∈ J | α(j) ∈ A} ∈ U, then there exist functions β_1,...,β_m from J into X such that (i) for all j ∈ J, (β_1(j),...,β_m(j)) ∈ A and (ii) β_i ∼ α_i for all i = 1,...,m.)
2.31. Exercise. Let U be a nonprincipal ultrafilter on the index set N and let ∗R be the ultrapower nonstandard extension of R that is defined in the proof of Theorem 2.28 above.
(a) Let α: N → R be a sequence which converges to +∞; the element [α] of ∗R is positive infinite. Give some specific examples of such infinite elements of ∗R and compare them with respect to the ordering ∗<.
(b) Let β: N → R be a sequence which converges to 0 from above; the element [β] is a positive infinitesimal in ∗R. Give some specific examples of such infinitesimal elements of ∗R and compare them with respect to the ordering ∗<.
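A nonprincipal ultrafilter cannot be enumerated on a computer, but the formula for ∗A in the proof of Theorem 2.28 makes sense for any ultrafilter, and for a finite index set it can be run literally. The sketch below is a toy illustration under loud assumptions: the index set J is finite, so the ultrafilter is principal and the resulting extension is not the proper extension of the Theorem. The names `X`, `J`, `J0`, `in_U`, `cls`, and `star` are hypothetical choices made for this example.

```python
from itertools import product

X = (0, 1, 2)   # toy base set (an assumption for illustration)
J = (0, 1)      # finite index set, so every ultrafilter on it is principal
J0 = 1          # U consists of exactly the subsets of J that contain J0


def in_U(F):
    """Membership in the principal ultrafilter generated by J0."""
    return J0 in F


# A function alpha: J -> X is stored as the tuple (alpha(0), alpha(1)).
FUNCS = list(product(X, repeat=len(J)))


def cls(alpha):
    """The class [alpha]: every beta that agrees with alpha on a set in U."""
    return frozenset(
        beta for beta in FUNCS
        if in_U({j for j in J if alpha[j] == beta[j]})
    )


def star(A, m):
    """*A for A ⊆ X^m, computed from the formula in the proof of Theorem 2.28."""
    result = set()
    for alphas in product(FUNCS, repeat=m):
        agree = {j for j in J if tuple(a[j] for a in alphas) in A}
        if in_U(agree):
            result.add(tuple(cls(a) for a in alphas))
    return result


star_X = {c for (c,) in star({(x,) for x in X}, 1)}
A = {(0, 1), (1, 1), (2, 0)}
B = {(1, 1), (2, 2)}
C = {(0,), (2,)}

# (E1): Boolean operations are preserved.
e1 = (star(A & B, 2) == star(A, 2) & star(B, 2)
      and star(A | B, 2) == star(A, 2) | star(B, 2)
      and star(A - B, 2) == star(A, 2) - star(B, 2))
# (E2): the diagonal of X^2 goes to the diagonal of (*X)^2.
e2 = star({(x, x) for x in X}, 2) == {(c, c) for c in star_X}
# (E3): Cartesian products are preserved (tuples are concatenated).
e3 = (star({a + c for a in A for c in C}, 3)
      == {s + t for s in star(A, 2) for t in star(C, 1)})
# (E4): projection omitting the final coordinate is preserved.
e4 = star({(x,) for (x, _) in A}, 1) == {(s,) for (s, _) in star(A, 2)}
# Exercise 2.29(a): *a is the class of the constant function with value a.
constants_ok = all(
    star({(a,)}, 1) == {(cls((a,) * len(J)),)} for a in X
)

print(e1, e2, e3, e4, constants_ok, len(star_X))
```

Because U is principal here, each class [α] is determined by the single value α(J0), so ∗X is just a copy of X and every element is standard. Nonstandard elements arise only for an infinite J with a countably incomplete ultrafilter, as in the proof of the Theorem.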
3. Logical Formulas
In this Section we discuss how to use informal and familiar logical notation to streamline our reasoning about nonstandard extensions. Logical notation is often suggestive and transparent when defining or describing sets and functions. Using logical formulas permits us to take advantage of our natural linguistic and logical abilities. Moreover, this turns out to be an ideal framework for bringing out the main properties of nonstandard extensions. To illustrate this use of formulas, let x, y be variables ranging over nonempty sets A, B respectively, and let ϕ(x,y) and ψ(x,y) denote conditions (formulas) on (x,y) defining subsets Φ and Ψ (respectively) of A × B. We consider certain logical formulas that can be built up from ϕ(x,y) and ψ(x,y) (on the left below) and the sets that are defined by them (on the right):
¬ϕ(x,y) defines the complement of Φ in A × B,
ϕ(x,y) ∨ ψ(x,y) defines the union Φ ∪ Ψ,
ϕ(x,y) ∧ ψ(x,y) defines the intersection Φ ∩ Ψ,
∃x ϕ(x,y) defines the projection π(Φ), where π(x,y) = y is the projection onto the second coordinate,
∀y ϕ(x,y) defines {x ∈ A | {x} × B ⊆ Φ}.
Here we are using familiar logical symbols, which have the following meanings:
¬ stands for the negation, “not”,
∨ stands for the disjunction, “or”,
∧ stands for the conjunction, “and”,
∃ stands for the existential quantifier, “there exists”, and
∀ stands for the universal quantifier, “for all.”
In our use of the quantifiers above, we followed the given restriction that x ranges over A and y ranges over B. This can be made explicit by writing ∃x ∈ A ϕ(x,y) or ∀y ∈ B ψ(x,y) instead of what is written above. In our use of logical formulas we will always have an explicit or implicit understanding about the set over which a given variable ranges.
To illustrate the usefulness of these simple ideas, consider a given function f: A → B. The range of f, namely the set f(A), can be defined by the equivalence
y ∈ f(A) ⇐⇒ ∃x ∈ A [f(x) = y].
Let Γ be the graph of f, which is defined as a subset of A × B by the formula f(x) = y. Therefore, this equivalence exhibits the fact that f(A) is the projection of Γ under the projection map π onto the second coordinate. This reduction of arbitrary functions to projections is used frequently; indeed, we have used it already several times in Section 2, when we used condition (E4) of Definition 2.1 to prove results about functions.
Simple and familiar logical equivalences often capture mathematical facts that seem complicated when they are viewed directly without the use of logical formulas. For example, the familiar equivalence
∀y ϕ(x,y) ⇐⇒ ¬∃y ¬ϕ(x,y)
shows that the set defined by ∀y ϕ(x,y) can be obtained from Φ by first taking the complement in A × B, then projecting onto the first coordinate, and then taking the complement of that set in A.
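The dictionary between formulas and set operations described above can be checked directly on small finite sets. The following sketch is only an illustration; the particular sets `A`, `B`, `Phi`, and `Psi` are hypothetical sample data, with `Phi` and `Psi` playing the roles of Φ and Ψ.

```python
from itertools import product

A = {1, 2, 3}
B = {"p", "q"}
AxB = set(product(A, B))

# Phi and Psi stand for the subsets of A x B defined by phi(x, y) and psi(x, y).
Phi = {(1, "p"), (1, "q"), (2, "p")}
Psi = {(2, "p"), (3, "q")}

neg_Phi = AxB - Phi                  # ¬phi(x, y): complement in A x B
or_set = Phi | Psi                   # phi(x, y) ∨ psi(x, y): union
and_set = Phi & Psi                  # phi(x, y) ∧ psi(x, y): intersection
exists_x = {y for (_, y) in Phi}     # ∃x phi(x, y): projection onto the second coordinate

# ∀y phi(x, y), read directly: those x for which {x} x B is contained in Phi.
forall_y = {x for x in A if all((x, y) in Phi for y in B)}

# The same set via ∀y phi ⇔ ¬∃y ¬phi:
# complement in A x B, project onto the first coordinate, complement in A.
via_equivalence = A - {x for (x, _) in AxB - Phi}

print(forall_y, via_equivalence, exists_x)
```

Both computations of the set defined by ∀y ϕ(x,y) use only complements and a projection, which are exactly the operations that a nonstandard extension is required to preserve.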
This technique is particularly useful when dealing with logically complicated notions, such as
continuity or differentiability, which we express in the usual way with ε’s and δ’s and quantifiers over them. In such cases we often deal with formulas having more than two variables and with repeated quantifiers.
We will use several additional notational conventions. A formula ϕ(x,y) defining a subset of A × B will also be viewed sometimes as defining a condition on triples (x,y,z), where z ranges over a non-empty set C; in that case ϕ(x,y) defines a subset of A×B×C. In such a situation we will indicate the formula also as ϕ(x,y,z) to show that we are thinking of this formula as defining a subset of A×B×C instead of just A×B. Here we are making a distinction between the appearance of the formula itself (in which the variable z does not occur) and the notation we use for referring to the formula in a proof or other discussion. This is similar to the situation in algebra where one routinely regards a polynomial p(x,y) as a polynomial in three variables x,y,z in which all monomials containing z are taken to have coefficient 0. For logical formulas the general convention is that when we use notation such as ϕ(x1,...,xn) to refer to a formula, then the variables x1,...,xn must be distinct and they must include all of the variables that occur in the formula in a way that makes them free for substitution. The other variables in the formula, all of which are bound by quantifiers, need not be included in this list. We also sometimes denote a formula by writing ϕ or ψ without any list of variables, when it is not important to name the variables that may be free for substitution. The context will determine which notation we are using. We use the implication sign →, as in ϕ(x,y) → ψ(x,y), to abbreviate the formula (¬ϕ(x,y))∨ψ(x,y). We use the equivalence symbol ↔, as in ϕ(x,y) ↔ ψ(x,y), to abbreviate [ϕ(x,y) → ψ(x,y)]∧[ψ(x,y) → ϕ(x,y)].
Now let us consider the particular logical formulas that we will use in working with nonstandard extensions.
For the moment, all of our variables will range over X. For each set A ⊆ Xm we will regard (x1,...,xm) ∈ A as a formula, in which x1,...,xm are variables ranging over X; we do not require these variables to be distinct. Sometimes we will write this formula in the equivalent form A(x1,...,xm), if this fits more smoothly with the usual mathematical role of the set A. In a few situations this formula is written in other ways: for example, if A corresponds to a linear ordering <, in the sense that A is the set of pairs (a,b) which satisfy the ordering condition a < b, then it is natural to write the formula x < y as synonymous with (x,y) ∈ A. All of this is quite familiar usage in mathematics. Moreover, in the formula (x1,...,xm) ∈ A we can replace some or all of the variables xj by specific elements of X. If f:A → B is a function, where A ⊆ Xm and B ⊆ Xn, then we will also make use of the formulas f(x1,...,xm) = (y1,...,yn). If Γ is the graph of f, then this formula is equivalent to the formula (x1,...,xm,y1,...,yn) ∈ Γ.
This corresponds to what we did in the previous Section, handling functions by means of their graphs. Building on the basic formulas discussed in the previous paragraph, we construct more complicated formulas using quantifiers (with variables ranging over the set X) and the logical connectives that are discussed above: ¬, ∨, ∧, →, and ↔. We will refer to these logical formulas as formulas over X. (To be precise, the logical formulas we are using here are first order formulas. This reflects the fact that the quantifiers we use range over elements of X, and we do not have any quantifiers ranging over subsets of X or other higher type objects based on X.) Formally, we have the following definition by induction:
3.1. Definition. [Formulas Over X] Let X be a non-empty set. The set of formulas over X is the smallest set of logical formulas which satisfies the following closure conditions. (We let x and x1,...,xm,y1,...,yn stand for arbitrary variables, which need not be distinct.)
(i) For each set A ⊆ Xm, (x1,...,xm) ∈ A is a formula over X;
(ii) for each function f:A → B, where A ⊆ Xm and B ⊆ Xn, f(x1,...,xm) = (y1,...,yn) is a formula over X;
(iii) if ϕ(x1,...,xm,y1,...,yn) is a formula over X and a1,...,an ∈ X, then ϕ(x1,...,xm,a1,...,an) is a formula over X;
(iv) if ϕ and ψ are formulas over X, then ¬ϕ, ϕ∨ψ, ϕ∧ψ, ϕ → ψ, ϕ ↔ ψ, ∃xϕ, and ∀xϕ are formulas over X.
Note that functions appear in formulas over X only through their graphs. This is less restrictive than it may seem at first. For example, suppose f, g, and h are functions from X into itself, and we want to express the condition that h is the composition of f and g. This can be done using the following formula ∀x∀y[h(x) = y ↔ ∃z(g(x) = z∧f(z) = y)] which is a formula over X. Suppose < ⊆ X2 is an ordering relation on X, and we want to express the condition that f(x) < g(x) holds for all elements x of X. This can be done using the following formula over X: ∀x∀y∀z[(f(x) = y∧g(x) = z) → y < z].
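The two formulas above can be evaluated mechanically when X is finite. The sketch below is an illustration only: the finite set X, the tuple encoding of functions, and the function names are assumptions made for the example. It checks that the first formula holds exactly when h is the composition of f and g, and that the second holds exactly when f(x) < g(x) for every x.

```python
# Hypothetical finite set X = {0, 1, 2, 3}; a function from X into X is
# encoded as a tuple t with t[x] the value at x.
X = range(4)


def composition_formula_holds(f, g, h):
    """Evaluate: for all x, y [ h(x) = y  <->  exists z (g(x) = z and f(z) = y) ]."""
    return all(
        (h[x] == y) == any(g[x] == z and f[z] == y for z in X)
        for x in X
        for y in X
    )


def pointwise_less_formula_holds(f, g):
    """Evaluate: for all x, y, z [ (f(x) = y and g(x) = z) -> y < z ]."""
    return all(
        (not (f[x] == y and g[x] == z)) or y < z
        for x in X
        for y in X
        for z in X
    )


f = (1, 2, 3, 0)
g = (2, 2, 0, 1)
h = tuple(f[g[x]] for x in X)   # the genuine composition of f and g
h_wrong = (3, 3, 1, 0)          # differs from the composition at x = 3

low = (0, 0, 1, 2)
high = (1, 2, 3, 3)             # low(x) < high(x) for every x in X
```

Here `composition_formula_holds(f, g, h)` is true and `composition_formula_holds(f, g, h_wrong)` is false, so the formula singles out the composition even though it mentions the functions only through equations of the form f(x) = y.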
In this way we see how statements involving the composition of functions and the substitution of functions in predicates can be expressed using formulas over X.
If we consider a formula over X syntactically, as a string of symbols, then it is important to distinguish two different ways in which variables can be
used. The free variables are the ones for which values can be substituted; all other occurrences of variables are bound, meaning that their use is controlled by the occurrence of quantifiers in the formula. The best way to make this precise is to give the following inductive definition of free variables in a formula ϕ:
(i) all variables occurring within a basic formula (x1,...,xm) ∈ A or f(x1,...,xm) = (y1,...,yn) are free variables;
(ii) the free variables in ¬ϕ are the same as the free variables in ϕ;
(iii) the free variables in ϕ∨ψ, ϕ∧ψ, ϕ → ψ, or ϕ ↔ ψ are the free variables in ϕ together with the free variables in ψ;
(iv) the free variables in ∃xϕ and ∀xϕ are the free variables in ϕ that are distinct from x.
If ϕ is a formula whose free variables are among x1,...,xm, we indicate this fact by writing the formula as ϕ(x1,...,xm); when establishing this notation for the first time we require that x1,...,xm be distinct. We then will indicate the result of substituting other variables or functional expressions t1,...,tm for x1,...,xm respectively by writing the result of the substitutions in the form ϕ(t1,...,tm); in such a situation we do not require that the substituted expressions t1,...,tm be distinct. A sentence is a logical formula with no free variables. It makes a definite true-or-false statement about the structures to which it refers.
Now we discuss how formulas over X can be used in connection with nonstandard extensions of X. Consider a specific nonstandard extension of X, based on the set ∗X. We will regard this nonstandard extension as fixed for the rest of this Section. Since ∗X is also a (non-empty) set, we also have the class of formulas over ∗X. We will now see that there is an important connection between the formulas over X and (some of the) formulas over ∗X. We will normally use the convention that lower case variables such as x1,...,xn range over X and upper case variables such as X1,...,Xn range over ∗X.
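The inductive definition of free variables given above translates directly into a recursive function. The following Python sketch assumes a hypothetical encoding of formulas as nested tuples; the tags "in", "fun", "not", and so on are inventions for this illustration and are not notation from the text. Variables are strings, and elements of X substituted into a basic formula are wrapped as ("const", name) so that they are not counted as variables.

```python
BINARY = {"or", "and", "imp", "iff"}
QUANTIFIERS = {"exists", "forall"}


def free_variables(formula):
    """Return the set of free variables of a formula in the tuple encoding.

    ("in", terms, set_name)                     basic formula (t1,...,tm) in A
    ("fun", name, input_terms, output_terms)    basic formula f(t...) = (s...)
    ("not", phi), (op, phi, psi) for op in BINARY
    (q, variable, phi) for q in QUANTIFIERS
    """
    tag = formula[0]
    if tag == "in":
        # Clause (i): every variable in a basic formula is free.
        return {t for t in formula[1] if isinstance(t, str)}
    if tag == "fun":
        return {t for t in formula[2] + formula[3] if isinstance(t, str)}
    if tag == "not":
        # Clause (ii): negation does not change the free variables.
        return free_variables(formula[1])
    if tag in BINARY:
        # Clause (iii): union of the free variables of both parts.
        return free_variables(formula[1]) | free_variables(formula[2])
    if tag in QUANTIFIERS:
        # Clause (iv): the quantified variable is removed.
        return free_variables(formula[2]) - {formula[1]}
    raise ValueError(f"unknown formula tag: {tag!r}")


# exists z (g(x) = z and f(z) = y): x and y remain free.
inner = ("exists", "z",
         ("and", ("fun", "g", ("x",), ("z",)), ("fun", "f", ("z",), ("y",))))

# forall x forall y [h(x) = y <-> inner]: a sentence, with no free variables.
sentence = ("forall", "x", ("forall", "y",
            ("iff", ("fun", "h", ("x",), ("y",)), inner)))

# (x, a) in A with a a fixed element of X: only x is free.
with_constant = ("in", ("x", ("const", "a")), "A")
```

A formula is a sentence in the sense just defined exactly when `free_variables` returns the empty set for it.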
We will never mix the two types of variables in the same formula. More generally, all of the formulas we consider will either be formulas over X or they will be formulas over ∗X.
3.2. Definition. [∗-Transform of a Formula Over X] Fix a nonstandard extension of the set X. Let ϕ(x1,...,xn) be a formula over X. The ∗-transform of ϕ(x1,...,xn) is a formula over ∗X, written ∗ϕ(X1,...,Xn), which is defined inductively by the following conditions:
Basis cases:
(i) Let {σ(1),...,σ(m)} ⊆ {1,...,n} and A ⊆ Xm; the ∗-transform of the basic formula (xσ(1),...,xσ(m)) ∈ A is the formula (Xσ(1),...,Xσ(m)) ∈ ∗A;
similarly, the ∗-transform of f(xσ(1),...,xσ(k)) = (xσ(k+1),...,xσ(m)) is ∗f(Xσ(1),...,Xσ(k)) = (Xσ(k+1),...,Xσ(m)); (ii) if ϕ(x1,...,xn,y1,...,yp) is a basic formula (as treated in (i)) and a1,...,ap ∈ X, then the ∗-transform of ϕ(x1,...,xn,a1,...,ap) is ∗ϕ(X1,...,Xn,∗a1,...,∗ap); Induction cases: Let ϕ and ψ be formulas over X; (iii) the ∗-transform of the negation ¬ϕ is ¬∗ϕ; (iv) the ∗-transform of the disjunction ϕ∨ψ is ∗ϕ∨∗ψ; (v) the ∗-transform of the conjunction ϕ∧ψ is ∗ϕ∧∗ψ; (vi) if x is a variable ranging over X, then the ∗-transform of the quantiﬁed formula ∃xϕ is ∃X ∗ϕ; (vii) if x is a variable ranging over X, then the ∗-transform of the quantiﬁed formula ∀xϕ is ∀X ∗ϕ. While this deﬁnition may look complicated, it is merely the precise formulation of a simple idea: constructing the ∗-transform of a formula ϕ(x1,...,xn) over X requires the following steps: (a) Find all of the sets A ⊆ Xm that occur in ϕ(x1,...,xn) in basic formulas, and replace each such set by its counterpart ∗A over ∗X; similarly, replace each function f:A → B by ∗f and replace each element a of X by ∗a, and (b) replace every variable x in ϕ(x1,...,xn), including the ones that are used with quantiﬁers, by a corresponding variable X which ranges over ∗X. For example, suppose Γ is a subset of X2. The sentence over X given by ∀x∀y∀z[[(x,y)∈Γ∧(x,z)∈Γ]→ y = z]∧∀x∃y[(x,y)∈Γ] expresses the condition that Γ is the graph of a function from X to X. The ∗-transform of this sentence is given by ∀X∀Y∀Z [[(X,Y)∈∗Γ∧(X,Z)∈∗Γ]→ Y = Z]∧∀X∃Y [(X,Y)∈∗Γ]. This is a sentence over ∗X, meaning in particular that the variables X,Y,Z range over ∗X. This sentence expresses the condition that ∗Γ is the graph of a function from ∗X to ∗X. From Proposition 2.11 we know that these two sentences are equivalent, and this is no accident. This is an instance of the Transfer Principle, which we prove next. 
The Transfer Principle is a ﬂexible and useful result which expresses nearly everything that one needs to know about nonstandard extensions. In particular, it gives precise meaning to the statement “the nonstandard extension of X possesses all of the properties that X does.”
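Steps (a) and (b) in the description of the ∗-transform amount to a purely syntactic rewriting, which can be sketched in Python. The encoding of formulas as nested tuples is a hypothetical one chosen for this illustration; a leading "*" on a name stands for the ∗ of the text, and capitalizing a variable name stands for passing from x to X. Equality is kept as equality, as in the example above where y = z becomes Y = Z. The connectives → and ↔ are transformed componentwise, which is consistent with treating them as abbreviations.

```python
BINARY = {"or", "and", "imp", "iff"}
QUANTIFIERS = {"exists", "forall"}


def star_term(term):
    """Variables x become X; constants ("const", a) become ("const", "*a")."""
    if isinstance(term, str):
        return term.upper()
    return ("const", "*" + term[1])


def star_transform(formula):
    """Return the *-transform of a formula in the tuple encoding.

    ("in", terms, set_name), ("fun", name, input_terms, output_terms),
    ("eq", s, t), ("not", phi), (op, phi, psi), (quantifier, variable, phi)
    """
    tag = formula[0]
    if tag == "in":
        # Step (a): replace the set A by *A; step (b): rename the variables.
        return ("in", tuple(star_term(t) for t in formula[1]), "*" + formula[2])
    if tag == "fun":
        return ("fun", "*" + formula[1],
                tuple(star_term(t) for t in formula[2]),
                tuple(star_term(t) for t in formula[3]))
    if tag == "eq":
        return ("eq", star_term(formula[1]), star_term(formula[2]))
    if tag == "not":
        return ("not", star_transform(formula[1]))
    if tag in BINARY:
        return (tag, star_transform(formula[1]), star_transform(formula[2]))
    if tag in QUANTIFIERS:
        # The quantified variable now ranges over *X.
        return (tag, formula[1].upper(), star_transform(formula[2]))
    raise ValueError(f"unknown formula tag: {tag!r}")


# forall x exists y [(x, y) in Gamma]
total = ("forall", "x", ("exists", "y", ("in", ("x", "y"), "Gamma")))

# (x, a) in A, with a a fixed element of X
with_constant = ("in", ("x", ("const", "a")), "A")
```

The sketch only produces the transformed formula as a piece of syntax; whether that formula is true in ∗X is a separate question, and it is the Transfer Principle that relates its truth to that of the original formula.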
3.3. Theorem. [Transfer Principle] Let X be a non-empty set and consider a ﬁxed nonstandard extension of X. (a) Let ϕ(x1,...,xm) be a formula over X and let ∗ϕ(X1,...,Xm) be its ∗-transform. Suppose B ⊆Xm is the set deﬁned by ϕ(x1,...,xm): B ={(x1,...,xm)∈Xm | ϕ(x1,...,xm) is true in X}. Then ∗B is the set deﬁned by ∗ϕ(X1,...,Xm): ∗B ={(X1,...,Xm)∈(∗X)m |∗ϕ(X1,...,Xm) is true in ∗X}. (b) Let ϕ be any sentence over X, and let ∗ϕ be its ∗-transform. Then ϕ is true in X⇐⇒∗ϕ is true in ∗X.
Proof. We prove (a) by induction on the syntactic complexity of formulas over X. In other words we structure our proof so that it follows the same path as the inductive definition of the ∗-transform. Before giving the inductive proof, we prove that (a) implies (b). Suppose ϕ is a sentence over X and ∗ϕ is its ∗-transform, a sentence over ∗X. Let A be the set defined by ϕ, so that ∗A is the set defined by ∗ϕ, according to statement (a) (which we are using in the case m = 0, where the formula treated does not have any variables that are free for substitution). Evidently A is either X^0 or ∅, according to whether ϕ is true in X or not. Similarly ∗A is either (∗X)^0 or ∅, according to whether ∗ϕ is true in ∗X or not. Proposition 2.2 implies that either A = X^0 and ∗A = (∗X)^0 must both hold, or A = ∅ and ∗A = ∅ must both hold. The equivalence of ϕ and ∗ϕ follows immediately.
Now we turn to the inductive proof of (a). For the basis step we must consider formulas ϕ(x1,...,xn) of the form (xσ(1),...,xσ(m)) ∈ A, where {σ(1),...,σ(m)} ⊆ {1,...,n} and A ⊆ Xm. Then ∗ϕ(X1,...,Xn) is (Xσ(1),...,Xσ(m)) ∈ ∗A. Let B be the set of all (x1,...,xn) ∈ Xn for which (xσ(1),...,xσ(m)) ∈ A is true. Proposition 2.15 states that ∗B is the set of all (X1,...,Xn) ∈ (∗X)n for which (Xσ(1),...,Xσ(m)) ∈ ∗A is true. This is what we needed to prove. More generally, in the basis step we must also take into account the possibility that one or more variables in (xσ(1),...,xσ(m)) ∈ A are replaced by specific elements of X. This is handled using Propositions 2.17 and 2.15.
The induction steps are handled using the conditions in Definition 2.1 directly. The logical connectives are handled using condition (E1). Existential quantifiers are handled using the strengthening of (E4) that is given in Proposition 2.16; the stronger form of (E4) is needed in case the existentially quantified variable is not the last variable in the given list. Finally, the
duality between universal and existential quantifiers means that universal quantifiers can be handled as a combination of negations and existential quantifiers. □
We remark that the Transfer Principle exactly captures the content of the definition of nonstandard extension. That is, if the Transfer Principle holds and if the equality relation = is given its usual interpretation in the nonstandard extension, then conditions (E1) – (E4) must be true. Proving this is an exercise in the use of logical formulas. Usually the Transfer Principle is explicitly included in the definition of nonstandard extension. We have delayed our discussion of the Transfer Principle in order to avoid heavy use of logical formulas at the beginning of the exposition and to permit introducing logical notation in a natural and convincing way.
To illustrate the usefulness of this result, let us treat some of the Exercises from Section 2. First consider Exercise 2.23. For ease of notation, assume m = n = 1. Let A, B, C and f be as given there. The set f(B) is defined by the equivalence x ∈ f(B) ⇐⇒ ∃y[y ∈ B∧f(y) = x]. Therefore, the Transfer Principle gives us that the equivalence X ∈ ∗(f(B)) ⇐⇒ ∃Y [Y ∈ ∗B∧∗f(Y) = X] holds in ∗X. But the formula on the right side of this equivalence defines (∗f)(∗B), so we have the equality needed for part (a) of the Exercise. For part (b) we use the equivalence x ∈ f−1(C) ⇐⇒ ∃y[y ∈ C∧f(x) = y] and for part (c) we use the equivalence (f|B)(x) = y ⇐⇒ [f(x) = y∧x ∈ B]. In both cases the Transfer Principle gives us immediately what is needed.
Now consider Exercise 2.24. For simplicity take m = n = 2. The function f is characterized by the equivalence [f(u,v) = (x,y)] ⇐⇒ [f1(u,v) = x∧f2(u,v) = y] where u,v,x,y are variables ranging over X. The Transfer Principle yields that the equivalence [(∗f)(U,V) = (X,Y)] ⇐⇒ [(∗f1)(U,V) = X∧(∗f2)(U,V) = Y]
holds in ∗X. This implies ∗f = (∗f1,∗f2) as desired.
Next we treat Exercise 2.25. For ease of notation we consider only the case m = n = 1. Suppose A ⊆ X and B ⊆ X, and let f:A → B be a function. For part (a), we note that f is injective if and only if the sentence ∀x∀y∀z[[f(x) = z∧f(y) = z] → x = y] is true in X. The ∗-transform of this sentence is ∀X∀Y∀Z [[∗f(X) = Z∧∗f(Y) = Z] → X = Y]. This sentence holds in ∗X if and only if the function ∗f is injective. Therefore the Transfer Principle gives the desired result immediately. Similar arguments using other simple sentences will easily give parts (b) and (c).
Finally we treat Exercise 2.26. Each of the axioms for ordered fields can be expressed as a first-order sentence in which the quantifiers range over the underlying set of the field. For example, the statement that every nonzero element of R has a multiplicative inverse is expressed by the following sentence over R: ∀x∃y[¬x = 0 → x×y = 1]. The ∗-transform of this sentence is ∀X∃Y [¬X = ∗0 → X ∗×Y = ∗1]. By the Transfer Principle, this sentence is true in ∗R. Since ∗0 is the additive identity and ∗1 is the multiplicative identity in ∗R, as is shown in a similar way using other sentences over R, it follows that every non-zero element of ∗R has a multiplicative inverse. Similar arguments complete the Exercise.
4. Nonstandard Extensions of Multisets
In many parts of mathematics it is customary to encounter not just a single set, but several sets which are interacting in some way. For example, a vector space over the real field consists of the set R together with the underlying set W of the vector space. Among the objects which are included in this vector space setting is the operation of scalar multiplication, which is a function from R×W to W. If W is a normed space, it is convenient to add the dual space W′ as a third set. One operation which involves all three of these sets is the pairing ⟨w,f⟩ := f(w), considered as a function from W×W′ into R. It is therefore natural to extend our concept of nonstandard extension to this kind of setting. Fortunately it is easy to do, requiring nothing more than a more elaborate notation. We lay out the details in this Section,
but we omit proofs since they are so close to the ones which we gave in Sections 2 and 3. This material will be required in the next two Sections when we develop frameworks for introducing higher type objects into the foundations of nonstandard analysis.
Fix a non-empty index set I. The objects we consider here consist of families of (non-empty) sets indexed over I. We will refer to them as multisets or as many sorted sets, when the specific reference to I is omitted, and as I-sets when I needs to be mentioned.
4.1. Definition. An I-set XI is an indexed family (Xi)i∈I of sets. A sort of the I-set XI is one of the sets Xi, where i ∈ I. We say that XI is non-empty if Xi is non-empty for every i ∈ I.
We now establish some notation for dealing with I-sets. We will let letters such as α,β,γ stand for finite sequences taken from I. Usually we will write α for the sequence α(1),...,α(m); m will be called the length of α and we will also denote the length by |α|. Similarly we will normally understand that n is the length of the sequence β and p is the length of γ. Given such a finite sequence α from I and given an I-set XI, we consider the Cartesian product of the sorts of XI that are indexed by the coordinates of α; our notation for this Cartesian product is the following: Xα = Xα(1) × ··· × Xα(m). We consider the I-set XI as equipped with all possible subsets of every Cartesian product Xα, where α ranges over all finite sequences from the index set I. In particular, this includes the graph of every function from one such Cartesian product Xα to another Cartesian product Xβ. If f:Xα → Xβ is such a function, then its graph is a subset of Xα×Xβ. Note that this product is also a Cartesian product of sorts. Indeed, Xα×Xβ = Xγ, where γ is the concatenation of α and β: γ = α(1),...,α(m),β(1),...,β(n); |γ| = m + n.
Now we are ready to give the definition of nonstandard extension for I-sets.
This results from a straightforward modification of the concept of nonstandard extension introduced in Definition 2.1 for single sets. A nonstandard extension of an I-set XI = (Xi)i∈I will be another non-empty I-set (∗Xi)i∈I. For each finite sequence α from I, we will use the notation (∗X)α for the Cartesian product ∗Xα(1) × ··· × ∗Xα(m).
4.2. Definition. [Nonstandard Extension of a Multiset] Let XI be a non-empty I-set. A nonstandard extension of XI is a mapping which assigns a set ∗A to each A ⊆ Xα for all finite sequences α from I, such that
∗Xi is non-empty for all i ∈ I and the following conditions are satisfied for all finite sequences α,β from I:
(M1) The mapping preserves Boolean operations on subsets of Xα: if A ⊆ Xα, then ∗A ⊆ (∗X)α; if A,B ⊆ Xα, then ∗(A∩B) = (∗A∩∗B), ∗(A∪B) = (∗A∪∗B), and ∗(A\B) = (∗A)\(∗B).
(M2) The mapping preserves basic diagonals: suppose 1 ≤ i < j ≤ m = |α| and suppose α(i) = α(j); if ∆ = {(x1,...,xm) ∈ Xα | xi = xj} then ∗∆ = {(x1,...,xm) ∈ (∗X)α | xi = xj}.
(M3) The mapping preserves Cartesian products: if A ⊆ Xα and B ⊆ Xβ, then ∗(A×B) = ∗A×∗B.
(M4) The mapping preserves projections that omit the final coordinate: suppose α has length n + 1 and let π be the projection of (n + 1)-tuples on the first n coordinates; if A ⊆ Xα, then ∗(π(A)) = π(∗A).
For the rest of this Section we fix a nonstandard extension of XI, based on the non-empty I-set (∗Xi)i∈I. We now follow exactly the same path of Propositions and Exercises as in Sections 2 and 3. In order to be clear about what is intended, we give the results in a precisely worded form, modified appropriately for I-sets. It is routine to modify the arguments given in Sections 2 and 3 for this new setting, and we therefore omit all proofs here.
4.3. Proposition. For each finite sequence α of elements of I, ∗(Xα) = (∗X)α and ∗∅ = ∅.
4.4. Proposition. If A ⊆ Xα is non-empty, then ∗A is also non-empty. Therefore, for any A,B ⊆ Xα, ∗A = ∗B ⇐⇒ A = B.
4.5. Proposition. For all A,B ⊆ Xα, A ⊆ B ⇐⇒ ∗A ⊆ ∗B.
4.6. Proposition. For each i ∈ I and each x ∈ Xi, ∗{x} has exactly one element.
4.7. Notation. For each i ∈ I and each x ∈ Xi, we let ∗x denote the unique element of the set ∗{x}. For each x = (x1,...,xm) ∈ Xα we let ∗x = (∗x1,...,∗xm).
4.8. Definition. An element of (∗X)α is called standard if it is of the form ∗x for some x ∈ Xα. It follows that an element of (∗X)α is standard if and only if all of its coordinates are standard elements of the appropriate sorts ∗Xα(j).
4.9. Proposition. For each (x1,...,xm) ∈ Xα, ∗{(x1,...,xm)} = {(∗x1,...,∗xm)}.
4.10. Proposition. For each A ⊆Xα and (x1,...,xm)∈Xα, (x1,...,xm)∈ A ⇐⇒(∗x1,...,∗xm)∈∗A.
4.11. Proposition. Suppose A ⊆Xα and B ⊆Xβ, and let f:A → B be a function; take Γ ⊆Xα ×Xβ to be the graph of f. Then ∗Γ is the graph of a function from ∗A to ∗B. 4.12. Notation. Suppose A ⊆ Xα and B ⊆ Xβ, and let f:A → B be a function; take Γ to be the graph of f. We denote by ∗f the function from ∗A to ∗B whose graph is ∗Γ. 4.13. Proposition. If f is the identity function on A ⊆ Xα, then ∗f is the identity function on ∗A. 4.14. Proposition. Suppose A ⊆Xα and B ⊆Xβ, and let f:A → B be a function. For all (x1,...,xm)∈ A, (∗f)(∗x1,...,∗xm) = ∗(f(x1,...,xm)).
4.15. Proposition. [Permuting and Identifying Variables] Suppose α,β are ﬁnite sequences from I, with m = |α| and n = |β|. Suppose σ is any function from {1,...,m} into {1,...,n}. Assume β(σ(j)) = α(j) for all j = 1,...,m. Given A ⊆Xα deﬁne B ={(x1,...,xn)∈Xβ |(xσ(1),...,xσ(m))∈ A}. Then ∗B ={(x1,...,xn)∈(∗X)β |(xσ(1),...,xσ(m))∈∗A}.
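The construction of B from A in Proposition 4.15 can be tried out on a small two-sorted example. The Python sketch below covers only the standard side, that is, the passage from A to B; it does not compute any nonstandard extension. The sort names "r" and "w", the particular sets, and the helper name are assumptions made for the illustration. The map σ is given as a 1-indexed tuple, and the compatibility condition β(σ(j)) = α(j) is checked before B is formed.

```python
from itertools import product

# Hypothetical two-sorted I-set with index set I = {"r", "w"}.
sorts = {"r": {0, 1}, "w": {"u", "v"}}

alpha = ("r", "w", "r")   # A is a subset of X_alpha = X_r x X_w x X_r
beta = ("w", "r")         # B will be a subset of X_beta = X_w x X_r
sigma = (2, 1, 2)         # sigma(1) = 2, sigma(2) = 1, sigma(3) = 2


def pull_back(A, sorts, alpha, beta, sigma):
    """Return B = {x in X_beta | (x_sigma(1), ..., x_sigma(m)) in A}."""
    # The sorts must match: beta(sigma(j)) = alpha(j) for every j.
    if any(beta[s - 1] != a for s, a in zip(sigma, alpha)):
        raise ValueError("sigma is not compatible with alpha and beta")
    X_beta = product(*(sorts[i] for i in beta))
    return {x for x in X_beta if tuple(x[s - 1] for s in sigma) in A}


A = {(0, "u", 0), (0, "v", 1), (1, "v", 1)}
B = pull_back(A, sorts, alpha, beta, sigma)
```

In this example σ both permutes the coordinates and identifies the first and third coordinates of α, so the tuple (0, "v", 1) in A, whose first and third entries differ, contributes nothing to B.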
4.16. Proposition. Condition (M4) in Deﬁnition 4.2 holds for all projections π. 4.17. Proposition. Let A ⊆ Xγ and a = (a1,...,am) ∈ Xα, where γ is the sequence obtained by putting β after α. Deﬁne A(a) ={(x1,...,xn)∈Xβ |(a1,...,am,x1,...,xn)∈ A} and similarly (∗A)(∗a) ={(x1,...,xn)∈(∗X)β |(∗a1,...,∗am,x1,...,xn)∈∗A} Then ∗(A(a)) = (∗A)(∗a).
4.18. Proposition. Suppose A ⊆ Xα, B ⊆ Xβ and C ⊆ Xγ; let f:A → B and g:B → C be functions. Then ∗(g◦f) = (∗g)◦(∗f).
4.19. Exercise. Condition (M2) holds for all diagonal sets ∆ ⊆ Xα.
4.20. Exercise. If A is a finite subset of Xα, then ∗A = {(∗x1,...,∗xm) | (x1,...,xm) ∈ A}. In particular, ∗A is finite and has the same cardinality as A, and all of its elements are standard.
4.21. Definition. A nonstandard extension of XI is called proper if for every i ∈ I and every infinite subset A of Xi, ∗A contains a nonstandard element.
4.22. Exercise. Suppose our nonstandard extension is proper. Then, for any infinite set A ⊆ Xα, ∗A has a nonstandard element.
4.23. Exercise. Let A ⊆ Xα and suppose f:A → Xβ is a function. (a) If B ⊆ A, then ∗(f(B)) = (∗f)(∗B). (b) If C ⊆ Xβ, then ∗(f−1(C)) = (∗f)−1(∗C). (c) If B ⊆ A, then ∗(f|B) = (∗f)|(∗B).
4.24. Exercise. For j = 1,...,n let fj:Xα → Xβ(j) be a function, and let f = (f1,...,fn):Xα → Xβ be the function with f1,...,fn as its coordinates. Then ∗f = (∗f1,...,∗fn).
4.25. Exercise. Suppose A ⊆ Xα and B ⊆ Xβ, and let f:A → B be a function. (a) f is injective ⇐⇒ ∗f is injective. (b) f is surjective ⇐⇒ ∗f is surjective. (c) If f is a bijection and its inverse is g, then ∗g is the inverse of ∗f.
Next we introduce logical formulas in order to state the Transfer Principle for nonstandard extensions of I-sets. Let XI be a fixed I-set. For each i ∈ I we will make use of variables that range over the sort Xi; no other variables will be used in formulas over the I-set XI. If necessary, we will indicate that a variable ranges over the sort Xi by including i as a superscript in the name of the variable; thus x^i, y^i, x^i_j all denote variables that range over Xi. However, we will usually omit such superscripts and let the context determine the sort over which a given variable ranges. For each finite sequence α from I and for each set A ⊆ Xα we will regard (x1,...,xm) ∈ A as a formula; x1,...,xm are variables with the property that for each j = 1,...,m the variable xj ranges over the sort Xα(j). As before, we do not require these variables to be distinct. If f:A → B is a function, where A ⊆ Xα and B ⊆ Xβ, then we also take f(x1,...,xm) = (y1,...,yn) to be a formula, where each variable xi ranges over the sort Xα(i) and each yj ranges over Xβ(j). Moreover, in these basic formulas we
can replace some or all of the variables by speciﬁc elements of the sorts over which they range. We construct more complicated formulas using quantiﬁers (with variables ranging over the sorts of XI) and the logical connectives ¬,∨,∧, →, and ↔. We will refer to these logical formulas as formulas over XI. It is left to the reader to formulate a precise deﬁnition of this set of formulas similar to Deﬁnition 3.1. Now we discuss how formulas over XI can be used in connection with nonstandard extensions of XI. Consider a speciﬁc nonstandard extension of XI, based on the I-set (∗Xi)i∈I. We will regard this nonstandard extension as ﬁxed for the rest of this Section. Since (∗Xi)i∈I is also an I-set, we also have the class of formulas over (∗Xi)i∈I. As before, we will see that there is an important connection between the formulas over XI and (some of the) formulas over (∗Xi)i∈I. We will again use the convention that lower case variables such as xi range over speciﬁc sorts Xi and the corresponding upper case variables Xi range over the corresponding sort ∗Xi of the nonstandard extension. As noted above, however, we will not always include the superscript and will let the natural context determine the role of the variables as much as possible. 4.26. Deﬁnition. [∗-Transform of a Formula Over XI] Consider a given nonstandard extension of the I-set XI. Let ϕ(x1,...,xn) be a formula over XI. The ∗-transform of ϕ(x1,...,xn) is a formula over (∗Xi)i∈I, written ∗ϕ(X1,...,Xn), which is deﬁned inductively by the following conditions: Basis cases: (i) Let {σ(1),...,σ(m)} ⊆ {1,...,n} and A ⊆ Xα, with |α| = m; the ∗-transform of the basic formula (xσ(1),...,xσ(m))∈ A is the formula (Xσ(1),...,Xσ(m))∈∗A; similarly, the ∗-transform of f(xσ(1),...,xσ(k)) = (xσ(k+1),...,xσ(m))
is ∗f(Xσ(1),...,Xσ(k)) = (Xσ(k+1),...,Xσ(m));
(ii) suppose ϕ(x1,...,xn,y1,...,yp) is a basic formula (as treated in (i)), where the variables xi range over sort Xβ(i) for all i = 1,...,n and the variables yj range over sort Xγ(j) for all j = 1,...,p; if aj ∈ Xγ(j) for all j = 1,...,p, then the ∗-transform of ϕ(x1,...,xn,a1,...,ap) is ∗ϕ(X1,...,Xn,∗a1,...,∗ap);
Induction cases: Let ϕ and ψ be formulas over XI;
(iii) the ∗-transform of the negation ¬ϕ is ¬∗ϕ;
(iv) the ∗-transform of the disjunction ϕ∨ψ is ∗ϕ∨∗ψ;
(v) the ∗-transform of the conjunction ϕ∧ψ is ∗ϕ∧∗ψ;
(vi) if x is a variable ranging over a sort Xi, then the ∗-transform of the quantified formula ∃xϕ is ∃X ∗ϕ, in which X ranges over ∗Xi;
(vii) if x is a variable ranging over a sort Xi, then the ∗-transform of the quantified formula ∀xϕ is ∀X ∗ϕ, in which X ranges over ∗Xi.
As before, this definition captures a simple idea: constructing the ∗-transform of a formula ϕ(x1,...,xn) over XI requires the following steps: (a) Find all of the sets A ⊆ Xα that occur in ϕ(x1,...,xn) in basic formulas, and replace each such set by its counterpart ∗A over (∗Xi)i∈I; similarly, replace each function f:A → B by ∗f and replace each element a of a sort Xi by ∗a, and (b) replace every variable xi in ϕ(x1,...,xn), including the ones that are used with quantifiers, by a corresponding variable Xi which ranges over ∗Xi.
4.27. Theorem. [Transfer Principle for I-sets] Let XI be an I-set and consider a fixed nonstandard extension of XI, based on the I-set (∗Xi)i∈I.
(a) Let ϕ(x1,...,xm) be a formula over XI; let α(j) be the index of the sort over which xj ranges, for each j = 1,...,m; let ∗ϕ(X1,...,Xm) be the ∗-transform of this formula. Suppose B ⊆ Xα is the set defined by ϕ(x1,...,xm): B = {(x1,...,xm) ∈ Xα | ϕ(x1,...,xm) is true in XI}. Then ∗B is the set defined by ∗ϕ(X1,...,Xm): ∗B = {(X1,...,Xm) ∈ (∗X)α | ∗ϕ(X1,...,Xm) is true in (∗Xi)i∈I}.
(b) Let ϕ be any sentence over XI, and let ∗ϕ be its ∗-transform. Then ϕ is true in XI ⇐⇒ ∗ϕ is true in (∗Xi)i∈I.
4.28. Theorem. [Existence of Nonstandard Extensions] Each nonempty I-set XI has a proper nonstandard extension, in which the sets (∗Xi)i∈I may be taken to be ultrapowers of the sets (Xi)i∈I with respect to a ﬁxed countably incomplete ultraﬁlter.
Proof. Let J be any infinite index set and let U be any countably incomplete ultrafilter on J. For each i ∈ I let ∗Xi be the ultrapower (Xi)^J/U. For each finite sequence α from I and each set A ⊆ Xα, define ∗A by ∗A = {([γ1],...,[γm]) | {j ∈ J | (γ1(j),...,γm(j)) ∈ A} ∈ U}. In this definition, for each k = 1,...,m we let γk range over the set (Xα(k))^J of all functions from J into Xα(k), so that [γk] denotes a typical element of the ultrapower (Xα(k))^J/U. The proof that this defines a proper nonstandard extension of XI is similar to the proof of Theorem 2.28, and we leave the details to the reader as an Exercise. □
5. Nonstandard Extensions of the Multiset (X,P(X))
In this Section we will use the methods developed in Section 4 to give an indication of how to introduce higher type objects into the framework of nonstandard analysis. First we consider a non-empty set X and the collection P(X) of all subsets of X. We regard this as a multiset (X0,X1) indexed over a set of two elements, with X0 = X and X1 = P(X). Consider an arbitrary nonstandard extension of (X,P(X)), which we denote as (∗X,∗P(X)). Let E be the restriction of the membership relation ∈ to X and P(X): E = {(x,A) ∈ X×P(X) | x ∈ A}. As usual, write P(∗X) for the collection of all subsets of ∗X.
5.1. Remark. Without loss of generality we may assume the given nonstandard extension satisfies the following conditions:
(a) X ⊆ ∗X and ∗x = x for all x ∈ X;
(b) ∗P(X) ⊆ P(∗X) and ∗E is the restriction of the usual membership relation to ∗X×∗P(X): ∗E = {(x,Y) ∈ ∗X×∗P(X) | x ∈ Y}.
Justification. We show that every nonstandard extension of the multiset (X,P(X)) is isomorphic to a nonstandard extension which satisfies (a) and (b). First we carry out a step like the one in the justification of Remark 2.10. As done there, let Y be a suitable set and h:∗X → Y a bijection, chosen so that X ⊆ Y and h(∗x) = x for all x ∈ X. Moreover, given Y ∈ ∗P(X), define Φ(Y) = {h(x) | x ∈ ∗X and (x,Y) ∈ ∗E},
which is a subset of Y. It is easy to check that Φ is a 1-1 map on ∗P(X). Finally, we define the new nonstandard extension of (X,P(X)) to ensure that the pair (h,Φ) of bijections is an isomorphism of nonstandard extensions. That is, in the new nonstandard extension we map each set A ⊆ Xm×P(X)n to the set {(h(x1),...,h(xm),Φ(Y1),...,Φ(Yn)) | (x1,...,xm,Y1,...,Yn) ∈ ∗A}. It is routine to check that this new mapping is a nonstandard extension of (X,P(X)) and that it satisfies conditions (a) and (b). □
In this kind of situation it is often convenient to suppress the explicit use of the maps h and Φ; rather we may follow a customary abuse of notation and “identify” ∗x with x for each x ∈ X. With this understanding, the definition of Φ(Y) for each Y ∈ ∗P(X) becomes Φ(Y) = {x ∈ ∗X | (x,Y) ∈ ∗E}. We may then identify Y with the subset Φ(Y) of ∗X defined in this way. This is particularly convenient when (as later in this Section) the nonstandard extension has been constructed in an explicit way, such as we do here using the ultrapower construction.
For the rest of this Section we assume that we have a nonstandard extension of (X,P(X)) which satisfies (a) and (b) in Remark 5.1.
Condition (b) in Remark 5.1 ensures that the elements of ∗P(X) are ordinary subsets of ∗X, and that the ∗-transform of any formula ϕ over (X,P(X)) is well behaved with respect to the membership relation. Suppose x is a variable ranging over X and y is a variable ranging over P(X), and suppose x ∈ y occurs in ϕ. Recall that x ∈ y is equivalent to the basic formula (x,y) ∈ E; the process of forming the ∗-transform will replace this basic formula by (X,Y) ∈ ∗E, which is equivalent to X ∈ Y according to condition (b). In other words, in forming the ∗-transform we may simply replace basic formulas of the form x ∈ y by X ∈ Y, when the nonstandard extension satisfies (b).
Consider a subset A of X. It can be considered either as a subset of X or as an element of P(X). Accordingly, there are two possible interpretations of the expression ∗A: let us temporarily write ∗(A) for the set which the given nonstandard extension assigns to A, and reserve the notation ∗A (as in paragraph 4.7) to denote the unique element of the set ∗({A}) which this nonstandard extension assigns to {A}. Both of these are subsets of ∗X. Fortunately they are equal when we adopt the normalization described in Remark 5.1, as we now prove.
5.2. Proposition. For each A ⊆ X, we have ∗(A) = ∗A.
Proof. Fix A ⊆ X and let E be the restriction of the membership relation as above. Evidently we have that the sentence
∀x ∈ X [(x,A) ∈ E ↔ x ∈ A]
is true in our basic structure (X,P(X)). By the Transfer Principle (Theorem 4.27), we conclude that
∀X ∈ ∗X [(X,∗A) ∈ ∗(E) ↔ X ∈ ∗(A)]
holds in the nonstandard extension. Using condition (b) in Remark 5.1, we see that
∀X ∈ ∗X [X ∈ ∗A ↔ X ∈ ∗(A)]
holds in the nonstandard extension. This proves ∗(A) = ∗A since both are subsets of ∗X. □
Next we introduce one of the key distinctions in nonstandard analysis: the distinction between internal and external subsets of ∗X.
5.3. Definition. [Internal Subset of ∗X] A subset A of ∗X is internal if it is an element of ∗P(X); A is external if it is not internal.
We note that it is only internal subsets of ∗X that are referred to within the ∗-transform of a logical formula over (X,P(X)). That is, all variables in such a formula either range over ∗X itself, or they range over ∗P(X). If ϕ is a logical sentence over (X,P(X)) and ∗ϕ is its ∗-transform, it follows that we get the same truth value for ∗ϕ in the nonstandard extension (∗X,∗P(X)) as in the multiset (∗X,P(∗X)). The same is true for logical formulas over (X,P(X)) into which we have substituted elements of ∗X for all the free first order variables and internal subsets of ∗X for all the free set variables. (This need not be true if we substitute external subsets of ∗X for one or more of the free set variables in ∗ϕ and interpret it in (∗X,P(∗X)).)
The next result is an easy consequence of the Transfer Principle, but it is a key tool for handling internal sets.
5.4. Theorem. [Internal Definition Principle] Let ϕ(x,x1,...,xm,y1,...,yn) be a formula over the multiset (X,P(X)). Suppose the variables x and xj range over X for each j and the variable yk ranges over P(X) for each k. Let a1,...,am ∈ ∗X and let A1,...,An be internal subsets of ∗X. Let B be the subset of ∗X defined by ∗ϕ(X,a1,...,am,A1,...,An):
B = {X ∈ ∗X | ∗ϕ(X,a1,...,am,A1,...,An) is true in (∗X,∗P(X))}.
Then B is internal.
Proof. Apply the Transfer Principle (Theorem 4.27) to the sentence
∀x1 ... ∀xm ∀y1 ... ∀yn ∃z ∀x [x ∈ z ↔ ϕ(x,x1,...,xm,y1,...,yn)],
which is true in (X,P(X)); therefore the sentence
∀X1 ... ∀Xm ∀Y1 ... ∀Yn ∃Z ∀X [X ∈ Z ↔ ∗ϕ(X,X1,...,Xm,Y1,...,Yn)]
is true in the nonstandard extension (∗X,∗P(X)). Note that the variables X and X1,...,Xm range over ∗X, while Y1,...,Yn and Z are restricted to range over ∗P(X). Substituting aj for Xj for each j = 1,...,m and Ak for Yk for each k = 1,...,n gives the desired result. Note that the substitution of Ak for Yk is permitted only because Ak is assumed to be an internal subset of ∗X. This is a key aspect of the Internal Definition Principle. □
5.5. Exercise. Let A,B be internal subsets of ∗X.
(i) Every Boolean combination of A,B is internal.
(ii) If f: X → X is any function, then the sets (∗f)(A) and (∗f)^{−1}(A) are internal.
(iii) Every standard element of ∗P(X) is an internal subset of ∗X.
Now we return to the setting in which X = R. Consider the linear ordering < as a subset of R^2 and the graphs Γ+ and Γ× of the functions + and × as subsets of R^3. By Proposition 4.11, ∗Γ+ and ∗Γ× are subsets of (∗R)^3 which are graphs of functions from (∗R)^2 to ∗R. For ease of notation, we will follow the customary practice of dropping the ∗ and denoting these functions as + and ×. Using condition (a) in Remark 5.1 and Proposition 4.14, it follows that + and × on (∗R)^2 are extensions of the original functions + and × on R^2. Similarly, the relation < on ∗R is an extension of the given linear ordering < on R. Also, for any A ⊆ R, A is easily seen to be a subset of ∗A using similar reasoning.
The following Exercise can be fairly easily solved using the ideas above, especially including the Transfer Principle (Theorem 4.27) and the Internal Definition Principle (Theorem 5.4).
5.6. Exercise.
(i) (∗R,+,×,<) is an ordered field extension of the ordered field (R,+,×,<).
(ii) N is an initial segment of ∗N and the elements of ∗N\N are infinite numbers in ∗R.
(iii) For every positive r ∈ ∗R there exists a unique N ∈ ∗N such that N ≤ r < N+1.
(iv) If A is a non-empty internal subset of ∗R which is bounded above in ∗R, then A has a least upper bound in ∗R; this need not be true if A is external.
(v) For each N ∈ ∗N, let {0,1,...,N} denote the set of M ∈ ∗N which satisfy 0 ≤ M ≤ N; the set {0,1,...,N} is internal.
(vi) For each r < s in ∗R let [r,s] denote the set of all t ∈ ∗R such that r ≤ t ≤ s; the set [r,s] is internal.
(vii) The set ∗N\N is not an internal subset of ∗R.
(viii) The set of infinitesimal elements of ∗R is not an internal subset of ∗R.
(ix) The set of finite elements of ∗R is not an internal subset of ∗R.
(x) (Overspill Principle) Let A be an internal subset of ∗R; if A contains arbitrarily large finite numbers, then it also contains an infinite positive number.
(xi) (Underspill Principle) Let A be an internal subset of ∗R; if A contains arbitrarily small positive infinite numbers, then it also contains a positive finite number.
We briefly indicate an expansion of this approach which allows treatment of internal functions between internal subsets of ∗X. To introduce such functions we consider the multiset (X,P(X),P(X×X)); if A,B ⊆ X and f: A → B is a function, then we regard f as an element of this multiset by considering its graph Γf, which is an element of the third sort P(X×X). Consider an arbitrary nonstandard extension of (X,P(X),P(X×X)), which we denote as (∗X,∗P(X),∗P(X×X)). Expanding on the discussion in Remark 5.1, we may pass to an isomorphic nonstandard extension which satisfies the following three conditions. We use the notation
E1 = {(x,A) ∈ X×P(X) | x ∈ A};
E2 = {(x,y,A) ∈ X×X×P(X×X) | (x,y) ∈ A}.
(a) X ⊆ ∗X and ∗x = x for all x ∈ X;
(b) ∗P(X) ⊆ P(∗X) and ∗E1 is the restriction of the usual membership relation to ∗X×∗P(X):
∗E1 = {(x,Y) ∈ ∗X×∗P(X) | x ∈ Y}.
(c) ∗P(X×X) ⊆ P(∗X×∗X) and ∗E2 is the restriction of the usual ordered pairs membership relation to ∗X×∗X×∗P(X×X):
∗E2 = {(x,y,Y) ∈ ∗X×∗X×∗P(X×X) | (x,y) ∈ Y}.
Internal subsets of ∗X are handled as was done earlier in this Section. Similarly, we call a subset of ∗X×∗X internal if it is an element of ∗P(X×X). If A,B are subsets of ∗X and f: A → B is any function, we say that f is an internal function if its graph Γf is an internal subset of ∗X×∗X.
5.7. Exercise.
Consider the multiset (R,P(R),P(R×R)) and a nonstandard extension of it (∗R,∗P(R),∗P(R×R)), which has been normalized so that conditions (a), (b), and (c) above are satisfied. We adopt the notation described just before Exercise 5.6.
(i) If f is an internal function between subsets of ∗R, then the domain and range of f are internal subsets of ∗R.
(ii) If f is an internal function between subsets of ∗R and A is an internal set contained in the domain of f, then the restriction of f to A is internal.
(iii) If f,g are internal functions between subsets of ∗R and the domain of g contains the range of f, then the composition g◦f is internal.
(iv) Suppose f: ∗N → ∗R is an internal function; there exists a unique internal function F: ∗N → ∗R such that F(0) = f(0) and for all n ∈ ∗N, F(n+1) = F(n) + f(n+1).
5.8. Remark. Consider the setting of part (iv) in the previous Exercise. The function F can be viewed as the result of summing the values of f over initial segments of ∗N, and this is a useful idea in many applications of nonstandard analysis. For obvious reasons, it is customary to denote F(n) for all n ∈ ∗N (including nonstandard n) by
Σ_{i=0}^{n} f(i).
Such hyperfinite sums appear, for example, in the nonstandard approach to measure and integration.
5.9. Exercise. Suppose f: ∗N → ∗R and g: ∗N → ∗R are internal functions, and c ∈ ∗R. Consider the notation introduced in the previous Remark.
(i) For all n ∈ ∗N, Σ_{i=0}^{n} (f(i) + g(i)) = Σ_{i=0}^{n} f(i) + Σ_{i=0}^{n} g(i).
(ii) For all n ∈ ∗N, Σ_{i=0}^{n} c·f(i) = c·Σ_{i=0}^{n} f(i).
5.10. Exercise. Consider the multiset which has three sorts, X, P(X), and P(P(X)), and develop the ideas of this Section in that context. The nonstandard extension should be modified so that not only is ∗P(X) a subset of P(∗X), but also ∗P(P(X)) is a subset of P(P(∗X)), and so that the modified nonstandard extension preserves the restriction of the membership relation between X and P(X), as well as between P(X) and P(P(X)). In this way all elements of ∗P(X) and ∗P(P(X)) can be handled as sets in a canonical way.
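The recursion in Exercise 5.7(iv) and the identities in Exercise 5.9 are ∗-transforms of familiar statements about finite sums over N. As a minimal sketch of those standard statements only (Python cannot represent nonstandard n, and the names partial_sums, f, g, c below are illustrative choices, not taken from the text), one can check them for a few standard values of n using exact rational arithmetic:

```python
from fractions import Fraction
from itertools import accumulate

def partial_sums(f, n_max):
    """Return [F(0), ..., F(n_max)] where F(0) = f(0) and F(n+1) = F(n) + f(n+1)."""
    return list(accumulate(f(i) for i in range(n_max + 1)))

# Illustrative (assumed) choices of f, g and c; any rational-valued functions would do.
f = lambda i: Fraction(1, i + 1)
g = lambda i: Fraction(i * i)
c = Fraction(3, 2)
n_max = 10

F = partial_sums(f, n_max)
G = partial_sums(g, n_max)
sum_fg = partial_sums(lambda i: f(i) + g(i), n_max)
scaled = partial_sums(lambda i: c * f(i), n_max)

# Standard-world versions of Exercise 5.9 (i) and (ii), checked for n = 0, ..., n_max.
additive = all(sum_fg[n] == F[n] + G[n] for n in range(n_max + 1))
homogeneous = all(scaled[n] == c * F[n] for n in range(n_max + 1))
```

The hyperfinite versions then follow from the Transfer Principle rather than from any computation; the sketch only makes explicit which standard sentences are being transferred.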
The setting described in Exercise 5.10 provides a framework in which nonstandard methods can be applied to subsets of X as well as to collections of subsets of X. This would be a suitable framework for applying nonstandard methods to the study of topologies on X, for example, with the collection of open sets being an element of the third sort. This would also allow us to consider “internal topologies” on ∗X. These are internal collections T
of (necessarily internal) subsets of ∗X which satisfy the ∗-transform of the formula expressing the familiar defining conditions satisfied by topologies.
If we take the point of view of this Section to its natural limit, we get the type theoretic formulation of nonstandard analysis that Abraham Robinson used in his book [14]. Although this framework did not catch on at the time, that is likely due to the heavily formal presentation in [14] rather than to any essential disadvantages of this point of view. For an example of the use of such a framework for an important application of nonstandard methods, see the nonstandard proof due to van den Dries and Wilkie of Gromov’s Theorem about groups of polynomial growth. (See [7], pages 356–363; they present their nonstandard extension explicitly as an ultrapower.)
We conclude this Section by discussing the nature of internal sets in the ultrapower nonstandard extensions that are constructed in the proofs of Theorems 2.28 and 4.28. We restrict our attention to nonstandard extensions of (X,P(X)), where X is any nonempty set. Let J be any infinite index set and let U be an ultrafilter on J. Let (∗X,∗P(X)) be the ultrapower nonstandard extension of (X,P(X)) that is constructed in the proof of Theorem 4.28.
Let a be any element of X. We follow the customary practice of identifying a with the corresponding standard element ∗a of ∗X. As discussed in Exercise 2.29, this means we are identifying a with the equivalence class [α], where α is the constant function defined by α(j) = a for all j ∈ J.
Consider the effect of the normalization that is discussed in Remark 5.1. This mainly hinges on the behavior of the mapping Φ that is defined there. Let Y be an arbitrary element of ∗P(X) and consider
Φ(Y) = {x ∈ ∗X | (x,Y) ∈ ∗E}.
In this setting Y is an equivalence class [F] where F is a function from J into P(X); in other words, F is an indexed family of subsets of X.
Taking into account the definition of ∗E leads to the equation
Φ([F]) = {[α] | {j ∈ J | α(j) ∈ F(j)} ∈ U}.
Therefore, a subset A of ∗X is internal (in the normalized version of the ultrapower nonstandard extension (∗X,∗P(X))) if and only if there is an indexed family F: J → P(X) of subsets of X such that for all [α] ∈ ∗X:
[α] ∈ A ⇐⇒ {j ∈ J | α(j) ∈ F(j)} ∈ U.
5.11. Exercise. Consider the ultrapower nonstandard extension discussed in the preceding paragraphs.
(a) Let F: J → P(X) be an indexed family of subsets of X with the property that F(j) is nonempty for each j ∈ J. The internal subset of ∗X
determined as above by F can be identified with the ultraproduct Π_U(F(j) | j ∈ J).
(b) Every non-empty internal subset of ∗X can be represented as described in (a).
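The membership criterion [α] ∈ Φ([F]) ⇐⇒ {j ∈ J | α(j) ∈ F(j)} ∈ U can be traced mechanically in a toy case. The following is only a sketch under strong simplifying assumptions that are not in the text: J is a small finite set and U is the principal ultrafilter at a chosen index J0. Such an ultrafilter yields a trivial extension in which every subset is internal, whereas the ultrafilters of real interest are nonprincipal and cannot be written down explicitly. All names (J, J0, X, in_U, is_member, same_class) are hypothetical.

```python
from itertools import combinations, product

J = range(4)   # assumed finite index set; the text takes J infinite
J0 = 2         # U is the principal ultrafilter of all subsets of J containing J0
X = range(3)   # assumed small base set

def in_U(subset):
    """Membership of a subset of J in the principal ultrafilter at J0."""
    return J0 in subset

def same_class(alpha, beta):
    """[alpha] = [beta] iff {j | alpha(j) = beta(j)} is in U."""
    return in_U(frozenset(j for j in J if alpha[j] == beta[j]))

def is_member(alpha, F):
    """[alpha] lies in the internal set given by F iff {j | alpha(j) in F(j)} is in U."""
    return in_U(frozenset(j for j in J if alpha[j] in F[j]))

# An indexed family F: J -> P(X), written as a list of subsets of X.
F = [frozenset({0}), frozenset({0, 1}), frozenset({1, 2}), frozenset()]

# Membership does not depend on the representative chosen for [alpha].
well_defined = all(
    is_member(a, F) == is_member(b, F)
    for a in product(X, repeat=len(J))
    for b in product(X, repeat=len(J))
    if same_class(a, b)
)
```

In this degenerate case membership is decided by the single coordinate J0, which is exactly why principal ultrafilters give nothing new; the well-definedness check, however, has the same shape for any ultrafilter.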
6. Superstructures
In this Section we will explain a setting for nonstandard analysis which was introduced in [13] by Robinson and Zakon. This framework gives a convenient way to apply nonstandard methods to essentially any part of mathematics. Much of the research literature of nonstandard analysis is expressed in terms of the framework that is explained here. The essential ideas in this Section are just an easy elaboration of what was done in the previous Section.
In order to use nonstandard extensions effectively, they must be applicable to mathematical systems which contain objects of higher type, such as spaces of functions, collections of sets (such as filters), systems of open sets in a topological space, and the like. Such objects occur in essentially every part of mathematics, and our framework must accommodate them in a smooth way.
Experience has shown that a convenient way to accomplish this is to introduce the superstructure based on a given set S of elementary mathematical objects. (In most applications it is natural to take S = R or S = N. We will always assume that S contains N as a subset. The choice of S is otherwise somewhat arbitrary and depends on the mathematical problems that are being considered.) The elements of this superstructure are precisely the mathematical objects that can be obtained from S in a finite number of steps, where in each step we form all sets of the previously constructed objects and add each of these sets as a new object in its own right.
If T is a set, we write P(T) for the power set of T, which is the collection of all subsets of T.
6.1. Definition. [Superstructure] Fix a set S such that N ⊆ S. The superstructure based on S is the family of sets (Vk(S))_{k∈N} defined by the following induction on k:
V0(S) = S;
Vk+1(S) = Vk(S) ∪ P(Vk(S)).
This system is an N-set in the terminology of Section 4; we denote it by V(S). An element of the union ⋃_{k=0}^{∞} Vk(S) is called an object in V(S).
The rank of an object a in V(S) is the smallest k for which a ∈ Vk(S). Note that
S = V0(S) ⊆ V1(S) ⊆ V2(S) ⊆ ...
and hence also
Vj(S) ∈ Vk(S) whenever j < k.
When we interpret the membership relation ∈ in V(S), we treat members of S as having no elements. Note that the objects of rank ≥ 1 in V(S) are precisely the sets in V(S), and the basic objects (elements of S) are the objects of rank 0. The empty set ∅ has rank 1. If b is an object in V(S) of rank ≥ 1 and a is an element of b, then a is also an object in V(S). Note also that when a,b are objects in V(S), a ∈ b always implies that the rank of a is strictly less than the rank of b.
We assume that the reader is familiar with a small amount of naive set theory. In particular, we form basic pairs within V(S) using the familiar definition ⟨x,y⟩ = {{x,y},{x}}. Note that the rank of ⟨x,y⟩ is r+2 where r is the larger of the ranks of x,y. For each n ≥ 2 we define the ordered n-tuple (x1,...,xn) to be the set {⟨i,xi⟩ | i = 1,...,n}. Recall that we require N to be a subset of the basic set S of the superstructure V(S); therefore (x1,...,xn) is an object in V(S) whenever x1,...,xn are objects in V(S). Moreover, the rank of (x1,...,xn) is r+3 if r is the maximum of the ranks of x1,...,xn. If A is a non-empty set of rank k in V(S) and n ≥ 2, then A^n, taken to be the set of ordered n-tuples of elements of A, will be a set in V(S) and its rank will be k+3. Note that this is independent of n. Similar remarks can be made about mixed Cartesian products.
We now want to develop a suitable concept of nonstandard extension for superstructures, based on regarding a superstructure as an N-set and using the tools from Section 4. However, as in Section 5 some additional considerations arise from the fact that the sorts Vk(S) are not just independent sets but rather have a high degree of interrelation. We will work with nonstandard extensions that have been normalized in a way similar to that discussed in Remark 5.1. In the superstructure setting it is convenient to change perspective slightly and to work with rank preserving embeddings between superstructures.
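The rank computations above can be checked on a tiny example. The sketch below makes assumptions not in the text: it takes a two-element set S of atoms (so S does not contain N, which the text requires), models sets as Python frozensets and atoms as integers, and uses the hypothetical helper names levels, rank, pair, ntuple and cartesian_power. The recursive rank function relies on the observation that a set lies in Vk+1(S) exactly when all of its members lie in Vk(S).

```python
from itertools import combinations, product

def powerset(xs):
    """All subsets of xs, each as a frozenset."""
    xs = list(xs)
    return frozenset(frozenset(c) for r in range(len(xs) + 1)
                     for c in combinations(xs, r))

def levels(S, depth):
    """Return [V_0(S), ..., V_depth(S)] using V_{k+1}(S) = V_k(S) ∪ P(V_k(S))."""
    V = [frozenset(S)]
    for _ in range(depth):
        V.append(V[-1] | powerset(V[-1]))
    return V

def rank(obj):
    """Atoms (non-frozensets) have rank 0; a set has rank 1 + the largest rank
    among its members, so the empty set has rank 1."""
    if not isinstance(obj, frozenset):
        return 0
    return 1 + max((rank(m) for m in obj), default=0)

def pair(x, y):
    """The basic pair <x, y> = {{x, y}, {x}}."""
    return frozenset({frozenset({x, y}), frozenset({x})})

def ntuple(*xs):
    """The ordered n-tuple (x1, ..., xn) = {<i, xi> | i = 1, ..., n}."""
    return frozenset(pair(i, x) for i, x in enumerate(xs, start=1))

def cartesian_power(A, n):
    """A^n as the set of ordered n-tuples of elements of A."""
    return frozenset(ntuple(*t) for t in product(A, repeat=n))

V = levels({1, 2}, 2)   # V_0, V_1, V_2 over the assumed atoms 1 and 2
```

Going one level further is already infeasible by enumeration, since V3 would contain every subset of V2; the recursive rank function avoids building the levels at all.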
Suppose V(S) and V(T) are superstructures and F: V(S) → V(T) is any rank preserving function. Let α be a finite sequence from N and suppose A ⊆ V(S)^α = Vα(1)(S) × ... × Vα(m)(S). For large enough k ∈ N the set A is a set in Vk(S) so F(A) is a well defined set in V(T). (Here F(A) is the value of the function F at A, not to be confused with {F(a) | a ∈ A}.) It turns out that a good approach to defining nonstandard extensions of superstructures is to define ∗A to be F(A) for every such A.
6.2. Definition. [Nonstandard Extension of a Superstructure] Let V(S),V(T) be superstructures and let F: V(S) → V(T) be a rank preserving function. Consider the mapping defined by letting ∗A = F(A) for each A ⊆ V(S)^α, where α is any finite sequence from N. We say F is a nonstandard extension of V(S) (as a superstructure) if T = ∗S and the following conditions are satisfied:
(S1) This mapping is a nonstandard extension of V(S) (considered as the multiset (Vk(S))_{k∈N} indexed by N) in the sense of Definition 4.2.
(S2) If a ∈ S, then ∗a = a; in particular S ⊆ ∗S.
(S3) This mapping preserves the membership relation: For each k ∈ N let Ek be the usual membership relation restricted to Vk(S),
Ek = {(a,b) ∈ Vk(S)^2 | a ∈ b};
we require that ∗Ek is the restriction of the usual membership relation to ∗Vk(S),
∗Ek = {(x,y) ∈ (∗Vk(S))^2 | x ∈ y}.
(S4) The nonstandard universe is transitive: For each k ∈ N, if a ∈ ∗Vk+1(S) and b ∈ a, then b ∈ ∗Vk(S).
(Note that ∗Vk(S) ⊆ Vk(∗S) for each k ∈ N, because T = ∗S and F is rank preserving.)
The extra conditions (S2)-(S4) have a normalizing effect on the nonstandard extension and make it easier to work with. Moreover, if F: V(S) → V(T) satisfies only condition (S1), then the extra conditions (S2)-(S4) can be achieved by a series of simple modifications to F which are like those used in the justification of Remark 5.1.
For the remainder of this Section assume that F: V(S) → V(∗S) satisfies the conditions in Definition 6.2. We will explore a few consequences of the Definition and then proceed to introduce some of the main ideas through which nonstandard extensions of superstructures are applied.
We will use the notation ∗V(S) for the N-set (∗Vk(S))_{k∈N}. Note that ∗V(S) is contained in the superstructure V(∗S). This means that every object in ∗V(S) is a mathematical object of the usual kind, and that it lies at some finite level of higher type objects over the set ∗S. When working with the sets in ∗V(S) from the outside so to speak, this means that we can regard them as ordinary mathematical objects, to which all of the usual mathematical concepts can be applied. In particular, we can speak of the cardinality (finite, infinite, countable, uncountable, etc.) of each set A in ∗V(S). This plays a useful role in many applications of nonstandard analysis.
When needed for clarity, we will refer to the external cardinality of A when we are making use of this point of view. The set theoretic nature of superstructures means we need to be careful when interpreting the deﬁnition of nonstandard extension and when applying the Transfer Principle. An element of Vk(S)\V0(S) is simultaneously (1) an element of the sort Vk(S) and (2) a subset of possibly many diﬀerent Cartesian products of sorts V(S)α. Under each of these interpretations there is a separate deﬁnition of the expression ∗a. In all of the cases under
(2), ∗a is taken to be the unique element F(a) by deﬁnition. In case (1), we know that ∗a is the unique element of F({a}), as deﬁned in paragraph 4.7 and justiﬁed by Proposition 4.6. But in fact there is no ambiguity, as follows from condition (S3) and a proof like that given for Proposition 5.2. All of these interpretations of the notation ∗a refer to the same object. Suppose x,y are variables ranging over Vk(S) of the kind that occur in applications of the Transfer Principle. In this context it is permissible to use x ∈ y as a basic formula, since it is equivalent to the basic formula (x,y) ∈ Ek. The ∗-transform of (x,y) ∈ Ek is deﬁned to be (X,Y) ∈ ∗Ek, where X,Y are variables ranging over ∗Vk(S). However, condition (S3) implies that this is equivalent to X ∈ Y. In other words, if we use x ∈ y in a formula ϕ over V(S), and we want to construct the ∗-transform of ϕ in order to apply the Transfer Principle, then we simply modify the formula x ∈ y to the formula X ∈ Y. Precisely the same thing is true when the variables x,y do not necessarily have the same rank. When applying the Transfer Principle to nonstandard extensions of superstructures, it is common to use bounded quantiﬁers. These are relativized quantiﬁers of the form ∀x ∈ a and ∃x ∈ a, where a is a set in V(S). Here we can take x to be a variable ranging over Vk(S), where k is chosen to be at least as large as the ranks of the elements of a. It is easy to interpret these quantiﬁers in terms of the ones we have been using, and thus determine what to do with them when applying the Transfer Principle. For example, consider a formula of the form ∀x ∈ aϕ; this is equivalent to ∀x ∈ Vk(S)[x ∈ a → ϕ]. The ∗-transform of this formula is ∀X ∈ ∗Vk(S)[X ∈ ∗a → ∗ϕ], which is in turn equivalent to ∀X ∈ ∗a[∗ϕ]. (Here we used the fact that all elements of ∗a are in ∗Vk(S).) Similarly we can take the ∗-transform of ∃x ∈ aϕ to be ∃X ∈∗a[∗ϕ]. 
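The rewriting of bounded quantifiers as relativized quantifiers over a sort can be checked directly in a small standard superstructure. This is only a sketch under assumptions not in the text: S is a four-element set of atoms, only the level V1(S) is built, and the helper names forall_bounded, forall_relativized, exists_bounded and exists_relativized are hypothetical. It says nothing about the nonstandard side, where the same equivalence is obtained from condition (S3).

```python
from itertools import combinations

def powerset(xs):
    """All subsets of xs, each as a frozenset."""
    xs = list(xs)
    return frozenset(frozenset(c) for r in range(len(xs) + 1)
                     for c in combinations(xs, r))

S = frozenset({0, 1, 2, 3})   # assumed small set of atoms
V1 = S | powerset(S)          # V_1(S) = S ∪ P(S)

def forall_bounded(a, phi):
    """The bounded form: for all x in a, phi(x)."""
    return all(phi(x) for x in a)

def forall_relativized(sort, a, phi):
    """The relativized form: for all x in the sort, (x in a) implies phi(x)."""
    return all((x not in a) or phi(x) for x in sort)

def exists_bounded(a, phi):
    """The bounded form: there is x in a with phi(x)."""
    return any(phi(x) for x in a)

def exists_relativized(sort, a, phi):
    """The relativized form: there is x in the sort with (x in a) and phi(x)."""
    return any((x in a) and phi(x) for x in sort)

# phi(x): "x is a basic object (rank 0)"; two sample sets a whose elements lie in V_1(S).
is_atom = lambda x: not isinstance(x, frozenset)
a_mixed = frozenset({0, 2, frozenset({1})})
a_atoms = frozenset({0, 2})
```

The two forms agree precisely because every element of a lies in the sort being quantified over, which is the role played by the choice of k in the text.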
It is also possible to use bounded quantifiers in which two variables appear: ∀x ∈ y and ∃x ∈ y. Recall that x and y must be variables that range over specific levels of V(S); say x ranges over Vk(S). Then ∃x ∈ y ϕ is equivalent to ∃x ∈ Vk(S)[x ∈ y ∧ ϕ], which is a formula we already know how to handle. Similarly we rewrite ∀x ∈ y ϕ as ∀x ∈ Vk(S)[x ∈ y → ϕ].
Formulas such as z = {{x,y},{x}} and z = (x1,...,xn) can easily be expressed in superstructures using simple logical formulas in which only the membership relation ∈ and bounded quantifiers occur. Therefore, when constructing the ∗-transform of a formula in which such basic formulas occur, they are unchanged except for the fact that the variables x,y,z,x1,...,xn are modified to range over the sorts ∗Vk(S) for appropriate k.
Finally, if a,b are subsets of Vk(S) and f: a → b is a function, the condition f(x) = y can also be expressed using a formula over V(S) in which only the membership relation ∈ and bounded quantifiers are used. (As in the previous paragraph, x and y can also be replaced by ordered
tuples.) Indeed, if we take x,y to be variables ranging over the sort Vk(S), then f(x) = y is expressed by ∃z ∈Vk+3(S)[z = (x,y)∧z ∈ f] which is a bounded formula over V(S). The ∗-transform of this formula is equivalent to ∃Z ∈∗Vk+3(S)[Z = (X,Y)∧Z ∈∗f] and this is a formula over ∗V(S) which expresses the condition ∗f(X) = Y. 6.3. Exercise. Let a1,...,an be in V(S). (i) ∗{a1,...,an}={∗a1,...,∗an}. (ii) ∗(a1,...,an) = (∗a1,...,∗an). 6.4. Exercise. Let a,b,c,d,a1,...,an,f,r be sets in V(S). (i) a ∈ b ⇐⇒∗a ∈∗b. (ii) a = b ⇐⇒∗a = ∗b. (iii) a ⊆ b ⇐⇒∗a ⊆∗b. (iv) ∗(a∪b) = ∗a∪∗b, ∗(a∩b) = ∗a∩∗b, and ∗(a\b) = (∗a)\(∗b). (v) ∗(a1 ×...×an) = ∗a1 ×...×∗an. (vi) f is a function from a to b ⇐⇒ ∗f is a function from ∗a to ∗b. (vii) r is a relation on a × b ⇐⇒ ∗r is a relation on ∗a ×∗b; if these conditions are true, and if c is the domain of r (projection on the ﬁrst coordinate) and d is the range of r (projection on the second coordinate), then ∗c is the domain of ∗r and ∗d is the range of ∗r. The arguments needed to solve these Exercises are simple applications of the Transfer Principle (Theorem 4.27).
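The encoding of f(x) = y as "there is z with z = (x,y) and z ∈ f" can be made concrete in the standard world by treating a function as its graph, a set of ordered tuples built from basic pairs. The sketch below is illustrative only: the atoms are the integers 0,...,4 (so k = 0), the sample function x ↦ x² mod 5 is an arbitrary choice, and the names graph, applies and is_function are hypothetical.

```python
def pair(x, y):
    """The basic pair <x, y> = {{x, y}, {x}}."""
    return frozenset({frozenset({x, y}), frozenset({x})})

def ntuple(*xs):
    """The ordered n-tuple (x1, ..., xn) = {<i, xi> | i = 1, ..., n}."""
    return frozenset(pair(i, x) for i, x in enumerate(xs, start=1))

def rank(obj):
    """Atoms have rank 0; a set has rank 1 + the largest rank of its members."""
    if not isinstance(obj, frozenset):
        return 0
    return 1 + max((rank(m) for m in obj), default=0)

def graph(func, domain):
    """The function as an object of the superstructure: the set of tuples (x, func(x))."""
    return frozenset(ntuple(x, func(x)) for x in domain)

def applies(f, x, y):
    """f(x) = y, expressed as: the tuple z = (x, y) is an element of f."""
    return ntuple(x, y) in f

def is_function(f, a, b):
    """f is a function from a to b: f consists of tuples (x, y) with x in a and y in b,
    and each x in a has exactly one such y."""
    allowed = {ntuple(x, y) for x in a for y in b}
    return f <= allowed and all(sum(applies(f, x, y) for y in b) == 1 for x in a)

a = frozenset(range(5))                 # assumed atoms, all of rank 0
f = graph(lambda x: (x * x) % 5, a)     # sample function a -> a
```

With x and y of rank k = 0, every tuple in f has rank k + 3 = 3, matching the sort Vk+3(S) used for the bounded variable z in the formula above.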
In the next definition we introduce one of the most important concepts in the superstructure framework; this is a natural extension of what was done in Section 5 (Definition 5.3):
6.5. Definition. [Internal Object in V(∗S)] An object a in V(∗S) is internal if there exists k ∈ N such that a ∈ ∗Vk(S). By condition (S4), the collection of internal objects is therefore transitive: a internal and b ∈ a implies b internal. An object in V(∗S) is external if it is not internal.
6.6. Proposition. Let a be in V(∗S); a is internal if and only if it is an element of some standard set in V(∗S). That is, a is internal if and only if there exists b in V(S) such that a ∈ ∗b.
Proof. If a is internal, then a ∈ ∗Vk(S) for some k by definition, so a is an element of a standard set. Conversely, suppose b ∈ V(S) and a ∈ ∗b. This implies b is a non-empty set so there exists k ∈ N with b ⊆ Vk(S). But then a ∈ ∗b ⊆ ∗Vk(S), and we are done. □
Suppose ϕ is a logical formula over V(S) and let ∗ϕ be its ∗-transform. Observe that each quantified variable in ∗ϕ is restricted to range over the elements of ∗Vk(S) for some k. Therefore the quantified variables in ∗ϕ range only over internal elements. This means that if the free variables of ∗ϕ are taken to stand for internal objects in V(∗S), then we will get the same truth value if we interpret ∗ϕ in the full superstructure V(∗S) as if we evaluate it in the nonstandard extension ∗V(S). External objects simply do not enter into the picture when we evaluate whether or not ∗ϕ is true in V(∗S).
The following result is an easy consequence of the Transfer Principle (Theorem 4.27) applied to nonstandard extensions of superstructures. Nonetheless, it is an important tool in applications of nonstandard methods.
6.7. Theorem. [Internal Definition Principle] Let ϕ(x1,...,xm,y1,...,yn) be a formula over V(S). Suppose the variable xj ranges over Vα(j)(S) for each j and the variable yk ranges over Vβ(k)(S) for each k. Let a1,...,an be internal objects in V(∗S), with ak ∈ ∗Vβ(k)(S) for each k. Let b be the set in V(∗S) defined by ∗ϕ(X1,...,Xm,a1,...,an):
b = {(X1,...,Xm) ∈ ∗V(S)^α | ∗ϕ(X1,...,Xm,a1,...,an) holds in ∗V(S)}.
Then b is internal.
Proof. This proof follows the same line of argument as the proof of Theorem 5.4. □
6.8. Remark. Note the requirement in the Internal Definition Principle that the objects a1,...,an are internal. This is very important. A very common mistake when using nonstandard analysis is to misapply the Internal Definition Principle in a situation where some of the objects a1,...,an are external.
6.9. Exercise. Let a1,...,an be internal objects in V(∗S).
(i) {a1,...,an} is internal.
(ii) (a1,...,an) is internal.
(iii) Every standard set in V(∗S) is internal.
6.10. Exercise. Let a,b,a1,...,an,f,r be internal sets in V(∗S).
(i) Every Boolean combination of a,b is internal.
(ii) Every element of a is internal.
(iii) a1 × ... × an is internal.
(iv) If r is a relation on a×b, then the domain of r and the range of r are internal.
(v) The union of all members of a and the intersection of all members of a are both internal.
(vi) The collection of all internal subsets of a is internal.
An important example of an internal concept is the notion of hyperfinite set. These are internal sets in V(∗S) which obey all the formally expressible properties of finite sets. As a result, they can be handled using ideas of combinatorial and discrete mathematics. However, when viewed from outside, they may be infinite sets, and may share many qualitative features of continuous objects of mathematics. Many important applications of nonstandard analysis depend on the use of hyperfinite sets. The definition of hyperfinite is also a model for the introduction of many interesting concepts for internal sets.
6.11. Definition. [Hyperfinite Sets] Let Fk be the collection of all finite sets in Vk(S). A set a in V(∗S) is hyperfinite (equivalently ∗-finite) if a ∈ ∗Fk for some k ∈ N.
6.12. Remark. Note that according to Proposition 6.6, every hyperfinite set is internal, since ∗Fk is a standard set for each k.
6.13. Notation. See the discussion before Exercise 5.6 for an explanation of the meaning of the relation < and the functions + and × on ∗R. Given N ∈ ∗N, we write {0,1,...,N} for the set {a ∈ ∗N | 0 ≤ a ≤ N}. Note that {0,1,...,N} is an infinite set if N is an infinite number in ∗N.
6.14. Exercise.
(i) Every finite set in ∗V(S) is hyperfinite.
(ii) For each N ∈ ∗N the set {0,1,...,N} is hyperfinite; ∗N is not hyperfinite.
(iii) If a is a set in V(S) and ∗a is hyperfinite, then a is a finite set and ∗a = {∗b | b ∈ a}.
6.15. Exercise. A set a in V(∗S) is hyperfinite if and only if there exists an internal bijection between a and {0,1,...,N−1} for some N ∈ ∗N. This N, if it exists, is unique.
6.16. Definition. If a is a hyperfinite set in V(∗S), the unique N ∈ ∗N such that there exists an internal bijection between a and {0,1,...,N−1} is called the internal cardinality of a.
6.17. Exercise. Let a,b,a1,...,an,f,r be hyperfinite sets in V(∗S).
(i) Every Boolean combination of a,b is hyperﬁnite. (ii) a1 ×...×an is hyperﬁnite. (iii) If r is a relation on a×b, then the domain of r and the range of r are hyperﬁnite. (iv) If every member of a is a hyperﬁnite set, then the union of all members of a and the intersection of all members of a are both hyperﬁnite.
(v) The collection of all internal subsets of a is hyperfinite.
(vi) Every internal subset of a is hyperfinite, and its internal cardinality is ≤ the internal cardinality of a.
(vii) Suppose S contains R; if a is a hyperfinite subset of ∗R and N is the internal cardinality of a, then there is an internal increasing bijection from {0,1,...,N−1} onto a.
6.18. Exercise. Suppose S contains R. Let A be the set of all (α,N) where N ∈ ∗N and α is an internal function from {0,1,...,N} into ∗R. See part (iv) of Exercise 5.7 and Exercise 5.9.
(i) A is an internal set;
(ii) there is a unique internal function Σ: A → ∗R such that for all (α,N) ∈ A
Σ(α,N) = Σ_{k=0}^{N} α(k).
6.19. Remark. If a is an internal set in ∗V(S), we let ∗P(a) denote the set of all internal subsets of a. By part (vi) of Exercise 6.10 we know that ∗P(a) is an internal set in ∗V(S); it is called the internal power set of a. Using part (i) of the same Exercise, we see that ∗P(a) is closed under finite Boolean operations. Since it is an internal set, this implies that ∗P(a) is actually closed under hyperfinite unions and intersections. Such internal Boolean algebras of sets are very important in the construction of Loeb measures.
6.20. Exercise. Let A be a set in V(S). Then (∗A,∗P(A),∗P(A×A)) is a nonstandard extension of (A,P(A),P(A×A)), and it satisfies the normalizing assumptions (b) and (c) given above just before Exercise 5.7.
For a more complete discussion of superstructures and more complete proofs of many facts about their nonstandard extensions, the reader may consult the textbook [8]; see also [1] [2] [4] [9] [11] [13] and [15].
A completely different set theoretic foundation for nonstandard analysis, Internal Set Theory (IST), was introduced by Nelson in [12]. It is based on nonstandard models for the full ZFC axioms for the foundations of mathematics. (ZFC = Zermelo Fraenkel axioms for set theory with the Axiom of Choice.)
7. Saturation For most applications, especially those in topology and abstract analysis, it is necessary to work with nonstandard extensions which satisfy richness conditions stronger than nontriviality or properness. (See Deﬁnitions 2.21
and 4.21.) The most useful of the extra hypotheses are the saturation conditions, which were carried over from model theory to nonstandard analysis by Luxemburg [11].
For this Section we fix a superstructure V(S) and a nonstandard extension ∗V(S) of it. Recall that a family F of sets is said to have the finite intersection property if each intersection of a finite subcollection of F is non-empty. We let κ stand for an uncountable cardinal number.
7.1. Definition. The given nonstandard extension is κ-saturated if it satisfies the following condition: let F be a (possibly external) family of internal sets; if F has (external) cardinality strictly less than κ and F has the finite intersection property, then the total intersection of F is non-empty. (The total intersection of F is the set {a ∈ V(∗S) | a ∈ b for all b ∈ F}. Of course this set may be external.)
The following result gives an alternate formulation of κ-saturation. It is expressed in terms of simultaneous satisfiability of conditions, each expressed by formulas over ∗V(S), in which only internal objects are allowed.
7.2. Theorem. Let ∗V(S) be a κ-saturated nonstandard extension of the superstructure V(S), where κ is an uncountable cardinal number. Let J be an index set of cardinality < κ. Let a be an internal set in ∗V(S). For each j ∈ J, let ϕj(X) be a formula over ∗V(S), so all objects mentioned in ϕj(X) are internal. Further, suppose that the set of formulas {ϕj(X) | j ∈ J} is finitely satisfied in a; this means that for every finite subset α of J there exists some c ∈ a (which may depend on α) such that ϕj(c) holds in ∗V(S) for all j ∈ α. Then there exists c ∈ a such that ϕj(c) holds in ∗V(S) simultaneously for all j ∈ J.
Proof. For each j ∈ J, let fj be the subset of a that is defined by ϕj(X); that is,
fj = {c ∈ a | ϕj(c) is true in ∗V(S)}.
The Internal Definition Principle implies that each fj is an internal subset of a. The hypotheses imply that the collection {fj | j ∈ J} has the finite intersection property.
Since the cardinality of J is < κ, the fact that our nonstandard extension is κ-saturated implies that the total intersection ⋂{f_j | j ∈ J} is nonempty. Any element of this intersection satisfies the conclusion of the Theorem. □

Of special importance for most applications is ℵ_1-saturation, where ℵ_1 denotes the first uncountable cardinal number. This means that whenever F is a countable collection of internal sets and F has the finite intersection property, then the total intersection of F is non-empty. It is customary in nearly all research articles in nonstandard analysis to assume that the
nonstandard extensions being used are at least ℵ_1-saturated. It is this hypothesis that ensures, for example, that Loeb measures are σ-additive and that nonstandard hulls of metric spaces are complete. In some areas, especially in topology, an even stronger hypothesis of κ-saturation is needed for many applications. For example, in order to give a smooth treatment of a topological space T using the methods of nonstandard analysis, it is usually necessary to assume that the nonstandard extension is κ-saturated where κ is strictly larger than the number of open subsets of T. Note that the saturation hypotheses can also be applied to the simpler nonstandard extensions treated in Section 5.

7.3. Proposition. Assume that the nonstandard extension is κ-saturated. Every infinite internal set in V(∗S) has (external) cardinality ≥ κ.

Proof. Suppose otherwise, that a is an infinite internal set of cardinality strictly less than κ. Let F be the collection of all sets of the form a \ {x} as x ranges over a. Then F is a collection of internal sets, and the cardinality of F is less than κ. Moreover, F obviously has the finite intersection property, since a is infinite. But the total intersection of F is obviously empty; this contradicts the hypothesis that the nonstandard extension is κ-saturated. □

7.4. Proposition. Assume that the nonstandard extension is κ-saturated. Let a be an internal set in V(∗S). Let A be a (possibly external) subset of a such that A has cardinality strictly less than κ. Then there exists a hyperfinite subset b of a such that b contains A as a subset.

Proof. For each x ∈ A, let F_x denote the set of all hyperfinite subsets of a which contain x as an element. The Internal Definition Principle (Theorem 6.7) yields that each F_x is an internal set in V(∗S). Let F be the collection of all the sets F_x as x ranges over A. Obviously F has cardinality strictly less than κ.
Moreover, F has the finite intersection property: given finitely many elements x_1, ..., x_n from A, the set {x_1, ..., x_n} is hyperfinite and is an element of F_{x_j} for all j = 1, ..., n. Since our nonstandard extension is κ-saturated, there exists an object b which is an element of F_x for every x ∈ A. This b is therefore the desired hyperfinite set. □

7.5. Theorem. [Saturated Extensions are Comprehensive] Assume that the nonstandard extension is κ-saturated. Let a and b be internal sets in V(∗S). Let A be a (possibly external) subset of a such that A has cardinality strictly less than κ and suppose that f: A → b is a (possibly external) function. Then there exists an internal function g: a → b such that g is an extension of f. In particular, if {c_k | k ∈ N} is a (possibly external) sequence of elements of b, then there exists an internal function g: ∗N → b such that g(k) = c_k for all k ∈ N.
Proof. For each x ∈ A let F_x be the set of all internal functions g: a → b which satisfy g(x) = f(x). The Internal Definition Principle (Theorem 6.5) implies that each F_x is internal. Let F be the collection of all F_x as x ranges over A. Obviously F has cardinality strictly less than κ. Moreover, F has the finite intersection property: given finitely many elements x_1, ..., x_n from A, consider the function g: a → b which takes x_j to f(x_j) for j = 1, ..., n and which takes all other elements of a to (say) f(x_1). The Internal Definition Principle implies that this function is internal, and it is obviously an element of F_{x_j} for all j = 1, ..., n. From the fact that our nonstandard model is assumed to be κ-saturated, it follows that there is an object g which is an element of F_x for all x ∈ A. This g is the desired internal function from a to b. □
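As a hedged illustration of how Theorem 7.2 is typically applied, the following sketch produces a positive infinitesimal from ℵ_1-saturation. It assumes (this is not stated in the present Section) that the base set S contains the real numbers, so that ∗R is an internal set; the names c_α and m are introduced only for this example.

```latex
% Assumption: S contains the reals, so a = {}^{*}\mathbb{R} is internal.
% Standard reals such as 1/n are identified with their images in {}^{*}\mathbb{R}.
\begin{align*}
  & a = {}^{*}\mathbb{R}, \qquad J = \{1, 2, 3, \dots\}, \qquad
    \varphi_n(X) :\quad 0 < X \ \wedge\ X < \tfrac{1}{n} \quad (n \in J). \\
  & \text{Finite satisfiability: for nonempty finite } \alpha \subseteq J
    \text{ with } m = \max \alpha, \quad
    c_\alpha = \tfrac{1}{m+1} \text{ satisfies } \varphi_n(c_\alpha)
    \text{ for all } n \in \alpha. \\
  & |J| = \aleph_0 < \aleph_1 \ \Longrightarrow\
    \exists\, c \in {}^{*}\mathbb{R} \ \ \forall n \in J : \ 0 < c < \tfrac{1}{n}.
\end{align*}
```

The element c obtained in the last line is positive and smaller than every standard 1/n, so under the stated assumption it is a positive infinitesimal.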
7.6. Exercise. The following conditions are equivalent:
(a) The nonstandard extension is ℵ_1-saturated.
(b) (Countable Comprehension Property) Whenever b is an internal set in V(∗S) and (c_k)_{k∈N} is a (possibly external) sequence of elements of b, then there exists an internal function g: ∗N → b such that g(k) = c_k for all k ∈ N.

7.7. Exercise. Assume the nonstandard extension is ℵ_1-saturated and let a be an internal set in V(∗S). A (possibly external) subset b of a is a Σ^0_1 set if there exists a sequence {c_k | k ∈ N} of internal subsets of a such that b = ⋃{c_k | k ∈ N}. Similarly, b is a Π^0_1 set if there exists a sequence {c_k | k ∈ N} of internal subsets of a such that b = ⋂{c_k | k ∈ N}. A Σ^0_1 set is sometimes called a galaxy, and a Π^0_1 set is sometimes called a monad or a halo.
(a) Suppose {c_k | k ∈ N} is a sequence of internal subsets of a such that b_1 is the Σ^0_1 set ⋃{c_k | k ∈ N}. If b_1 is internal, then there exists N ∈ N such that b_1 = c_1 ∪ ... ∪ c_N. (Hint: without loss of generality the sequence {c_k | k ∈ N} is the restriction to N of an internal increasing sequence {c_k | k ∈ ∗N} of subsets of a. If b_1 = ⋃{c_k | k ∈ N} is internal, then the set of N ∈ ∗N such that b_1 ⊆ c_1 ∪ ... ∪ c_N is internal and contains all infinite N. By the Underspill Principle, Exercise 5.6, there exists a finite N ∈ N such that b_1 ⊆ c_1 ∪ ... ∪ c_N.)
(b) Suppose {d_k | k ∈ N} is a sequence of internal subsets of a such that b_2 is the Π^0_1 set ⋂{d_k | k ∈ N}. If b_2 is internal, then there exists N ∈ N such that b_2 = d_1 ∩ ... ∩ d_N.
(c) Suppose b_1 is a Σ^0_1 subset of a represented as in (a) and b_2 is a Π^0_1 subset of a represented as in (b). If b_1 ⊆ b_2 then there is an internal set e such that b_1 ⊆ e ⊆ b_2. In particular, if b_1 = b_2, then b_1 (= b_2) is an internal set.
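Two familiar sets may help fix the terminology of Exercise 7.7. The sketch below assumes that S contains the real numbers and takes a = ∗R; the symbols fin(∗R) for the set of finite hyperreals and μ(0) for the set of infinitesimals are notational assumptions made here, not notation from this Section.

```latex
% Assumption: S contains the reals; a = {}^{*}\mathbb{R}.
% Each set on the right is internal by the Internal Definition Principle.
\begin{align*}
  \mathrm{fin}({}^{*}\mathbb{R})
    &= \bigcup_{k \ge 1} \{\, x \in {}^{*}\mathbb{R} : |x| \le k \,\}
    && (\text{a } \Sigma^0_1 \text{ set, i.e. a galaxy}), \\
  \mu(0)
    &= \bigcap_{k \ge 1} \{\, x \in {}^{*}\mathbb{R} : |x| < \tfrac{1}{k} \,\}
    && (\text{a } \Pi^0_1 \text{ set, i.e. a monad}).
\end{align*}
% If fin(*R) were internal, part (a) would give a standard N with
%   fin(*R) = { x : |x| <= N },
% which fails because N + 1 is finite but lies outside that set.
% If mu(0) were internal, part (b) would give a standard N with
%   mu(0) = { x : |x| < 1/N },
% which fails because 1/(2N) lies in that set but is not infinitesimal.
```

Under these assumptions both sets are external, which is the typical situation for galaxies and monads.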
7.8. Exercise. This improves on Proposition 7.3 when κ = ℵ_1. Assume the nonstandard extension is ℵ_1-saturated. Every infinite internal set has (external) cardinality ≥ 2^{ℵ_0}. (Hint: without loss of generality the infinite internal set is of the form {0, 1, ..., N} where N is an infinite element of ∗N. For each standard real number r in the interval 0 < r < 1 show that there is a smallest element k of ∗N such that N · ∗r ≤ k. Obviously 1 ≤ k ≤ N. Moreover, k is uniquely determined by r, and distinct values of r yield distinct values of k.)

It is useful to introduce two additional richness properties of nonstandard extensions:

7.9. Definition. (a) A nonstandard extension of V(S) is polysaturated if it is κ-saturated for some κ greater than or equal to the number of objects in V(S).
(b) A nonstandard extension of V(S) is an enlargement if for each set a in V(S) there exists a hyperfinite set b ⊆ ∗a such that b contains every standard element of ∗a; that is, we require {∗c | c ∈ a} ⊆ b ⊆ ∗a.

7.10. Exercise. The following conditions are equivalent:
(a) The nonstandard extension is an enlargement.
(b) If k ∈ N and F is a collection of subsets of V_k(S) with the finite intersection property, then ⋂{∗a | a ∈ F} is nonempty.
(c) If k ∈ N and (L, ≤) is a partially ordered set in V_k(S) which is directed upwards, then there exists b ∈ ∗L such that for all a ∈ L, ∗a ∗≤ b.

7.11. Exercise. Every polysaturated nonstandard extension of V(S) is an enlargement.

Next we give a proof using ultrapowers of the existence of enlargements.

7.12. Theorem. Every superstructure V(S) has a nonstandard extension which is an enlargement; it can be taken to be an ultrapower extension of V(S) with respect to a suitably chosen ultrafilter.

Proof. Let J be the collection of all nonempty finite sets of objects from V(S). Let U be an ultrafilter on J such that for each object a in V(S) the set {j | j is a finite set of objects from V(S) and a ∈ j} is in U. Such an ultrafilter exists because the collection of all these subsets of J has the finite intersection property.
Consider the nonstandard extension of V(S) constructed as an ultrapower using U as in the proof of Theorem 4.28. We will show that this is an enlargement of V(S). Fix a set a from V(S). For each j ∈ J let F(j) = a ∩ j and let b = [F] be the element of the nonstandard extension that is determined by F. Note that if a has rank k, then all of the values of F are in V_k(S) so b is an element of ∗V_k(S). Since every F(j) is a finite subset of a, it follows that b
is a hyperfinite subset of ∗a. It remains to show that every standard element of ∗a is an element of b. Let c ∈ a. As in Exercise 2.29, ∗c is the equivalence class [α] where α is the constant function with value c at each argument in J. To show that ∗c ∈ b we must show that the set {j ∈ J | c ∈ F(j)} is in U. But this set equals {j ∈ J | c ∈ j}, which is in U by construction. □

7.13. Theorem. Let V(S) be a superstructure and let κ be an uncountable cardinal number. There exists a κ-saturated nonstandard extension of V(S). In particular, there exists a polysaturated nonstandard extension of V(S).

Proof. We begin by proving the important fact that if U is any countably incomplete ultrafilter on an infinite index set, then the ultrapower nonstandard extension of V(S) constructed as in Theorem 4.28 is necessarily ℵ_1-saturated. Since U is countably incomplete, we may suppose (F_k)_{k∈N} is a decreasing sequence in U whose intersection is empty and with F_0 = J. Let {a_k | k ∈ N} be a set of internal sets in V(∗S) with the finite intersection property. We must show that ⋂{a_k | k ∈ N} is non-empty. Without loss of generality we may suppose that a_{k+1} ⊆ a_k for all k ∈ N. Therefore there exists r ∈ N so that a_k has rank at most r for all k ∈ N. Since ∗V(S) was obtained by the ultrapower construction using the ultrafilter U, for each k ∈ N there is a set function A_k: J → V_r(S) such that a_k is the equivalence class [A_k]. Since {a_k | k ∈ N} has the finite intersection property, for each k ∈ N we have {j ∈ J | A_0(j) ∩ ... ∩ A_k(j) ≠ ∅} ∈ U. Define G_k for k ∈ N as follows: G_0 = J and for k ≥ 1, G_k = F_k ∩ {j ∈ J | A_0(j) ∩ ... ∩ A_k(j) ≠ ∅}. Therefore J = G_0 ⊇ G_1 ⊇ ... ⊇ G_k ⊇ ..., G_k ∈ U for all k ∈ N, and ⋂{G_k | k ∈ N} = ∅. Therefore we may define d(j) for each j ∈ J to be the largest k ∈ N for which j ∈ G_k. Now we construct [α] in ∗V(S) which is an element of ⋂{a_k | k ∈ N}. Fix j ∈ J and define α(j) as follows. If d(j) = 0 let α(j) be an arbitrary element of V_r(S).
If d(j) ≥ 1, choose α(j) to be an element of A_0(j) ∩ ... ∩ A_{d(j)}(j), which is guaranteed to be non-empty by the definition of d(j). It is obvious that for each k ∈ N, α(j) ∈ A_k(j) holds whenever d(j) ≥ k and d(j) ≥ 1. Therefore {j ∈ J | α(j) ∈ A_k(j)} ⊇ G_k ∈ U for k ≥ 1 and {j ∈ J | α(j) ∈ A_0(j)} ⊇ G_1 ∈ U. This completes the proof that [α] is an element of ⋂{a_k | k ∈ N}.

We will not give the details of a proof of the general case. The easiest construction of a κ-saturated nonstandard extension of V(S) for κ > ℵ_1 is to take the direct limit of a well ordered chain of successive enlargements. The length of the chain should be a regular cardinal number ≥ κ; a chain
of length κ^+ will suffice, where κ^+ is the next cardinal number larger than κ. It is also possible, but rather intricate in the case where κ > ℵ_1, to construct a κ-saturated nonstandard extension in one step as an ultrapower, by choosing the ultrafilter carefully. Details may be found in [8], [9], [11] and [15].
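For readers who prefer a displayed summary, the objects used in the ℵ_1-saturated case of the proof of Theorem 7.13 can be laid out as follows. This is only a restatement of the argument in the proof, with the same notation; the parenthetical remark on why d(j) is well defined is a step spelled out here rather than in the text.

```latex
% Notation as in the proof: U is a countably incomplete ultrafilter on J,
% (F_k) is a decreasing sequence in U with F_0 = J and empty intersection,
% and a_k = [A_k] with A_k : J -> V_r(S).
\begin{align*}
  G_0 &= J, \qquad
  G_k = F_k \cap \{\, j \in J : A_0(j) \cap \dots \cap A_k(j) \neq \emptyset \,\}
        \quad (k \ge 1), \\
  d(j) &= \max \{\, k \in \mathbb{N} : j \in G_k \,\}
        \quad \text{(finite, since } j \in G_0 \text{ and }
        \textstyle\bigcap_k G_k = \emptyset\text{)}, \\
  \alpha(j) &\in A_0(j) \cap \dots \cap A_{d(j)}(j)
        \quad \text{whenever } d(j) \ge 1, \\
  j \in G_k \ (k \ge 1) \ &\Longrightarrow\ d(j) \ge k
        \ \Longrightarrow\ \alpha(j) \in A_k(j).
\end{align*}
```

The last line is what gives {j ∈ J | α(j) ∈ A_k(j)} ⊇ G_k ∈ U for each k ≥ 1, and hence [α] ∈ a_k.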
References
1. Albeverio, S., Fenstad, J-E., Høegh-Krohn, R., and Lindstrøm, T., (1986) Nonstandard Methods in Stochastic Analysis and Mathematical Physics. Academic Press, New York.
2. Capiński, M. and Cutland, N. J., (1995) Nonstandard Methods for Stochastic Fluid Mechanics. World Scientific, Singapore.
3. Cutland, N. J., Editor, (1988) Nonstandard Analysis and its Applications. Cambridge University Press, Cambridge.
4. Davis, M., (1977) Applied Nonstandard Analysis. John Wiley & Sons, New York.
5. Van den Dries, L., Tame Topology and o-minimal Structures. (monograph in preparation).
6. Van den Dries, L. and Miller, C., Geometric categories and o-minimal structures. Duke Mathematical Journal, (to appear).
7. Van den Dries, L. and Wilkie, A. J., (1984) Gromov's Theorem on groups of polynomial growth and elementary logic. Journal of Algebra. Pages 349–374.
8. Hurd, A. and Loeb, P. A., An Introduction to Nonstandard Real Analysis. Academic Press, New York.
9. Lindstrøm, T., (1988) An invitation to nonstandard analysis. In Cutland (1988). Pages 1–105.
10. Luxemburg, W. A. J., (1969a) Applications of Model Theory to Algebra, Analysis, and Probability. Holt, Rinehart, and Winston, New York.
11. Luxemburg, W. A. J., (1969b) A general theory of monads. In Luxemburg (1969a). Pages 18–86.
12. Nelson, E., (1977) Internal set theory. Bulletin of the American Mathematical Society. Pages 1165–1193.
13. Robinson, A. and Zakon, E., (1969) A set-theoretical characterization of enlargements. In Luxemburg (1969a). Pages 109–122.
14. Robinson, A., (1966) Nonstandard Analysis. North-Holland, Amsterdam. (Second, revised edition, 1974).
15. Stroyan, K. and Luxemburg, W. A. J., (1976) Introduction to the Theory of Infinitesimals. Academic Press, New York.
INDEX

∗-finite set, 42
∗-transform of a formula
  over a multiset, 27
  over a set, 18
I-set, 23
N-set, 23
Π^0_1 set, 46
Σ^0_1 set, 46

bounded quantifiers, 39

comprehensiveness, 45
countable comprehensiveness, 46
countably incomplete ultrafilter, 12

embedding of (X_i)_{i∈I} into (∗X_i)_{i∈I}, 24
embedding of X into ∗X, 6
external object over a superstructure, 40

finite number, 3
formulas over a multiset, 27
formulas over a set, 17

galaxy, 46

halo, 46
hyperfinite set, 42
hyperfinite sum, 34, 43
hypernatural number, 3
hyperreal number, 2

infinitesimal number, 3
internal cardinality of a hyperfinite set, 42
Internal Definition Principle, 31, 41
internal function, 33
internal object over a superstructure, 40
internal power set, 43
internal set, 31

limited number, 3
logical connectives, 15
logical formulas, 14
logical quantifiers, 15
logical quantifiers, bounded, 39
logical sentence, 18
logical symbols, 15

many sorted set, 23
monad, 46
multiset, 22

nonstandard extension
  ℵ_1-saturated, 46
  κ-saturated, 44
  comprehensive, 45
  enlargement, 47
  of a multiset, 23
    existence, 28
    ultrapower, 28
  of a set, 4
    existence, 12
    ultrapower, 12
  of a superstructure, 37
  of the multiset (X, P(X)), 29
  polysaturated, 47
  proper, 11, 26
nonstandard extension ∗f of a function f, 8, 25
nonstandard natural number, 3
nonstandard number, 2

object, in a superstructure, 36
ordered n-tuple, 37
Overspill Principle, 33

quantifiers, 15
quantifiers, bounded, 39

rank, of an object in a superstructure, 36

sentence, 18
sort, of a multiset, 23
standard element, 6, 24
superstructure, 36

Transfer Principle
  over a multiset, 28
  over a set, 20

ultrapower, 12
ultraproduct, 12
Underspill Principle, 33

variables, bound, 18
variables, free, 18
