
A Field Guide to Nonstandard Definitions

Nonstandard analysis trades quantifier mazes for proximity. $\forall\varepsilon>0\ \exists\delta>0\ \forall x\ldots$ becomes $x \approx y \Rightarrow f(x) \approx f(y)$. Same theorem. The transfer principle makes the two worlds equivalent for first-order statements — not cheating, just coordinates.

Standard definition, then nonstandard. The goal: see every familiar definition again for the first time.


Nonstandard analysis makes the dynamic static. Limits, convergence, continuity — all processes in standard analysis. Something approaches, something converges. The nonstandard versions replace process with snapshot. The limit of a sequence isn’t something it tends toward — it’s what it equals at an infinite index. The derivative isn’t a ratio approaching a value — it IS the ratio, at an infinitesimal, then rounded. Cauchy sequences aren’t converging — they’re already there. Take the standard part.

Robinson froze the whole thing. The infinite N is here. The infinitesimal epsilon is here. They’re not going anywhere. They’re numbers. “Infinitely many twin primes” becomes “one unlimited prime P with P+2 also prime.” The process becomes a thing.


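Notation: ${}^*X$ is the nonstandard extension of a standard set $X$ (likewise ${}^*f$ for a function $f$); $x \approx y$ means $x - y$ is infinitesimal; $\mathrm{st}(x)$ is the standard part of a limited hyperreal $x$, the unique real infinitely close to it; $\mu(x)$ is the monad of $x$; $H$ denotes an unlimited hypernatural.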


Analysis

Limit of a sequence

Standard: $\lim_{n\to\infty} a_n = L$ iff for every $\varepsilon > 0$ there exists $N$ such that $n > N$ implies $\lvert a_n - L\rvert < \varepsilon$.

Nonstandard: $a_H \approx L$ for every unlimited hypernatural $H$.
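A quick worked instance, with a sequence chosen here purely for illustration: take $a_n = (2n+1)/n$. Transfer lets the same algebra run at an unlimited index $H$:

\[{}^*a_H = \frac{2H+1}{H} = 2 + \frac{1}{H} \approx 2\]

Since $1/H$ is infinitesimal for every unlimited $H$, the limit is $2$, with no $\varepsilon$ or $N$ in sight.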


Continuity at a point

Standard: $f$ is continuous at $x$ iff for every $\varepsilon > 0$ there exists $\delta > 0$ such that $\lvert y - x\rvert < \delta$ implies $\lvert f(y) - f(x)\rvert < \varepsilon$.

Nonstandard: $y \approx x$ implies $f(y) \approx f(x)$. Equivalently: $f(\mu(x)) \subseteq \mu(f(x))$. The whole infinitesimal cloud around $x$ maps into the infinitesimal cloud around $f(x)$.


Uniform continuity

Standard: Same as continuity, but $\delta$ chosen independently of $x$ — one $\delta$ works everywhere.

Nonstandard: For all pairs $x, y \in {}^*[a,b]$ (not just standard $x$), $x \approx y$ implies ${}^*f(x) \approx {}^*f(y)$.

The quantifier ranges over the entire hyperreal interval, including nonstandard points lurking between the standard ones.
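To see why the nonstandard points matter, here is a sketch on a non-compact domain (my choice of example, not the closed interval $[a,b]$): $f(x) = 1/x$ on $(0,1]$. It is continuous at every standard point, but take $x = 1/H$ and $y = 1/(2H)$ for unlimited $H$. Both lie in ${}^*(0,1]$ and $x \approx y$, yet

\[{}^*f(y) - {}^*f(x) = 2H - H = H \not\approx 0\]

So $f$ is continuous but not uniformly continuous on $(0,1]$. The failing pair sits in the monad of $0$, which is not a point of the domain.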


Uniform convergence

Standard: $f_n \to f$ uniformly iff for every $\varepsilon > 0$ there exists $N$ such that $n > N$ implies $\sup_x \lvert f_n(x) - f(x)\rvert < \varepsilon$.

Nonstandard: For every unlimited $H$ and every $x \in {}^*[a,b]$ (standard or not): ${}^*f_H(x) \approx {}^*f(x)$.

Pointwise convergence only checks standard $x$; uniform convergence checks all of them.
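The classic counterexample makes the difference concrete (example chosen for illustration): $f_n(x) = x^n$ on $[0,1]$ converges pointwise to $f$ with $f(x) = 0$ for $x < 1$ and $f(1) = 1$. At the nonstandard point $x = 1 - 1/H$, with $H$ unlimited:

\[{}^*f_H(x) = \left(1 - \frac{1}{H}\right)^{H} \approx \frac{1}{e}, \qquad {}^*f(x) = 0\]

So ${}^*f_H(x) \not\approx {}^*f(x)$ and the convergence is not uniform. Every standard $x < 1$ passes the test; only the nonstandard point next to $1$ exposes the failure.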


Derivative

Standard: $f'(x) = \lim_{h \to 0} \dfrac{f(x+h) - f(x)}{h}$.

Nonstandard: For any nonzero infinitesimal $\varepsilon$,

\[f'(x) = \mathrm{st}\!\left(\frac{ {}^*f(x+\varepsilon) - {}^*f(x)}{\varepsilon}\right)\]

The limit is the standard part. The difference quotient exists as a hyperreal; strip off the infinitesimal error. Leibniz and Newton, made rigorous. (See also: derivative at a discontinuity, where this breaks interestingly.)
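For polynomials the recipe can be carried out exactly by treating $\varepsilon$ as a formal symbol. The sketch below is a toy model under that assumption, not a hyperreal implementation: a number is a coefficient list `[c0, c1, c2, ...]` standing for $c_0 + c_1\varepsilon + c_2\varepsilon^2 + \cdots$, and the helper names (`f_at_x_plus_eps`, `difference_quotient`, `st`) are made up for this example.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials in eps given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Add two polynomials in eps, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def f_at_x_plus_eps(coeffs, x):
    """Expand f(x + eps) for f(t) = sum(coeffs[k] * t**k) as a polynomial in eps."""
    base = [Fraction(x), Fraction(1)]   # the number x + eps
    power = [Fraction(1)]               # (x + eps)**0
    total = [Fraction(0)]
    for c in coeffs:
        total = poly_add(total, [Fraction(c) * a for a in power])
        power = poly_mul(power, base)
    return total

def difference_quotient(coeffs, x):
    """(f(x + eps) - f(x)) / eps as a polynomial in eps."""
    expanded = f_at_x_plus_eps(coeffs, x)
    # expanded[0] is exactly f(x); dropping it and dividing by eps shifts the list.
    return expanded[1:] or [Fraction(0)]

def st(eps_poly):
    """Standard part: every positive power of eps is infinitesimal, so keep the constant."""
    return eps_poly[0]

cube = [0, 0, 0, 1]                     # f(t) = t**3
quotient = difference_quotient(cube, 2)
print(quotient)                         # coefficients of 12 + 6*eps + eps**2
print(st(quotient))                     # 12, which is f'(2)
```

The quotient $12 + 6\varepsilon + \varepsilon^2$ is the hyperreal difference quotient before rounding; `st` discards the two infinitesimal terms.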


Riemann integral

Standard: $\int_a^b f(x)\,dx$ as a limit of Riemann sums as the mesh goes to zero.

Nonstandard: Fix unlimited $H$, let $\Delta x = (b-a)/H$. Then

\[\int_a^b f(x)\,dx = \mathrm{st}\!\left(\sum_{i=0}^{H-1} {}^*f\!\left(a + i\cdot\Delta x\right)\cdot \Delta x\right)\]

A hyperfinite sum — actual (hyper)finite, $H$ terms — rounded. The integral is what survives after discarding the infinitesimal error.
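As a sanity check (the example is mine), take $f(x) = x^2$ on $[0,1]$, so $\Delta x = 1/H$. The sum-of-squares identity transfers to hyperfinite length and gives a closed form:

\[\sum_{i=0}^{H-1} \left(\frac{i}{H}\right)^{2} \cdot \frac{1}{H} = \frac{(H-1)(2H-1)}{6H^{2}} = \frac{1}{3} - \frac{1}{2H} + \frac{1}{6H^{2}}\]

The last two terms are infinitesimal, so the standard part is $1/3$.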


Convergent series

Standard: $\sum_{n=0}^\infty a_n = L$ iff for every $\varepsilon > 0$ there exists $N$ such that $M > N$ implies $\left\lvert\sum_{n=0}^{M} a_n - L\right\rvert < \varepsilon$.

Nonstandard: The hyperfinite partial sum $\sum_{n=0}^{H} {}^*a_n \approx L$ for every unlimited $H$.
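For a standard $r$ with $\lvert r\rvert < 1$ (an example chosen here), the finite geometric-sum formula transfers to hyperfinite length:

\[\sum_{n=0}^{H} r^{n} = \frac{1 - r^{H+1}}{1 - r} \approx \frac{1}{1-r}\]

because $r^{H+1}$ is infinitesimal for unlimited $H$.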


Bolzano-Weierstrass

Standard: Every bounded sequence in $\mathbb{R}$ has a convergent subsequence.

Nonstandard: Every limited hyperreal is nearstandard. If $\lvert x \rvert \le C$ for some standard $C$, then $\mathrm{st}(x)$ exists. No subsequence extraction needed: for any unlimited $H$, the standard part of the $H$-th term is the limit of some subsequence.


Intermediate value theorem

Standard: $f: [a,b] \to \mathbb{R}$ continuous, $f(a) < 0 < f(b)$ implies $f(c) = 0$ for some $c \in (a,b)$.

Nonstandard: Divide $[a,b]$ into $H$ equal parts. Somewhere in this hyperfinite partition the sign flips. Call that crossing point $c_H$. Then $f(c_H) \approx 0$ (since $f$ is continuous and $c_H$ is limited), so $f(\mathrm{st}(c_H)) = 0$.

There’s a catch. The hyperfinite partition argument finds the crossing point by scanning left to right and picking the first sign flip. This isn’t constructive — you can’t compute which index flips without checking all of them. Sam Sanders has papers on this (“The Unreasonable Effectiveness of Nonstandard Analysis”) exploring exactly where NSA and constructive math agree and disagree. The hyperfinite IVT proof is short, intuitive, and nonconstructive. The standard proof is longer, less intuitive, and also nonconstructive (it uses nested intervals or bisection, which requires excluded middle for the sign check). NSA doesn’t make things more or less constructive — it just makes the nonconstructivity visible, which is arguably more honest.
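The left-to-right scan can be mimicked with a large but finite $H$. This is only a finite stand-in (floats cannot hold an unlimited $H$), and the function name `first_sign_flip` and the test function $x^2 - 2$ are my own choices:

```python
def first_sign_flip(f, a, b, H):
    """Scan the partition a, a+dx, ..., b and return the left endpoint of the
    first cell where f goes from <= 0 to > 0 (None if no such cell exists)."""
    dx = (b - a) / H
    for i in range(H):
        left = a + i * dx
        right = a + (i + 1) * dx
        if f(left) <= 0 < f(right):
            return left
    return None

H = 100_000
c = first_sign_flip(lambda x: x * x - 2, 0.0, 2.0, H)
print(c, c * c - 2)
```

The returned point is within one cell width of $\sqrt{2}$, and $f$ there is within a small multiple of $\Delta x$ of zero. With a genuinely unlimited $H$ both gaps become infinitesimal and $\mathrm{st}$ removes them. The loop also shows the nonconstructive step: it has to inspect the cells one by one.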


Topology

Monad of a point

The monad $\mu(x)$ of a standard point $x$: everything infinitely close to it, $\mu(x) = \{y \in {}^*X : y \approx x\}$. Plays the role of “all neighborhoods of $x$ simultaneously.” For every open set $U$ containing $x$, ${}^*U$ contains $\mu(x)$; for every closed set $F$, if ${}^*F$ contains a point of $\mu(x)$ then $F$ contains $x$.


Open and closed sets

Standard: Defined by neighborhoods or by taking complements.

Nonstandard: $U$ is open iff $x \in U$ implies $\mu(x) \subseteq {}^*U$. Closed iff whenever $y \in {}^*U$ and $y \approx x$ for standard $x$, we have $x \in U$. The monad fits inside an open set; a closed set captures the standard part of every approximating sequence.


Compactness

Standard: Every open cover of $X$ has a finite subcover.

Nonstandard: Every point of ${}^*X$ is nearstandard — infinitely close to some standard point in $X$. The $H$-th term of any would-be divergent sequence already has a standard shadow.


Sequential compactness

Standard: Every sequence in $X$ has a convergent subsequence.

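Nonstandard: In a metric space, every point of ${}^*X$ is nearstandard. Same condition as compactness. In metric spaces the two notions coincide, and the nonstandard formulation makes that equivalence visible: both are just “every hyperpoint has a standard shadow.” In general topological spaces sequential compactness and compactness differ, and only compactness is characterized this way.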


Hausdorff separation

Standard: Any two distinct points have disjoint open neighborhoods.

Nonstandard: Monads of distinct standard points are disjoint. $x \ne y$ (both standard) implies $\mu(x) \cap \mu(y) = \emptyset$. Two points are topologically indistinguishable iff their monads coincide.


Connectedness

Standard: $X$ cannot be partitioned into two disjoint nonempty open sets.

Nonstandard: ${}^*X$ cannot be partitioned into two internal sets that each contain the complete monad of every standard point they touch. If $x$ is in one component, its entire infinitesimal cloud is too.


Algebra

Hyperfinite fields

Pick an unlimited prime $P \in {}^*\mathbb{N}$. Then ${}^*\mathbb{Z}/P\,{}^*\mathbb{Z}$ is a hyperfinite field with $P$ elements. It satisfies every first-order field axiom by transfer (commutativity, distributivity, inverses, all of it), yet has more elements than any finite field. Standard finite field theory, lifted above all standard primes.


Ultraproducts

The hyperreals are an ultraproduct: ${}^*\mathbb{R} = \prod_\mathbb{N} \mathbb{R} \,/\, \mathcal{U}$ for a free ultrafilter $\mathcal{U}$ on $\mathbb{N}$. Łoś’s theorem: a first-order sentence holds in the ultraproduct iff it holds in $\mathcal{U}$-almost-all factors. That’s the transfer principle, stated as a construction. Ultraproduct of fields is a field. Ultraproduct of ordered fields is ordered.


Ax-Kochen theorem

Standard: For every positive integer $d$, for all but finitely many primes $p$, every homogeneous polynomial of degree $d$ in $d^2 + 1$ variables over $\mathbb{Q}_p$ has a nontrivial zero. The proof (1965) was one of the first major applications of model theory to number theory.

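Nonstandard: Pick an unlimited prime $P$. The field ${}^*\mathbb{Q}_P$ is an ultraproduct of the standard $p$-adic fields $\mathbb{Q}_p$ (it is not hyperfinite, and not an ultrapower of any single $\mathbb{Q}_p$). By Łoś’s theorem, a first-order sentence holds in ${}^*\mathbb{Q}_P$ iff it holds in $\mathcal{U}$-almost all of the $\mathbb{Q}_p$ that make it up, and a sentence holds for all but finitely many $p$ iff it holds in ${}^*\mathbb{Q}_P$ for every unlimited prime $P$. So you prove the statement for one object, the ultraproduct, and get it for almost all primes for free. The “all but finitely many” quantifier is just the free ultrafilter discarding finite sets of exceptions. Ax and Kochen proved it this way: they showed that the ultraproduct of the $\mathbb{Q}_p$ is elementarily equivalent to the corresponding ultraproduct of the fields $\mathbb{F}_p((t))$, where the statement was already known. The work won them the 1967 Cole Prize in number theory.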


Probability

Loeb measure

Start with a hyperfinite probability space: $\Omega_H = \{1, 2, \ldots, H\}$ with uniform measure $P_H(A) = \lvert A\rvert / H$. Everything internal. Apply $\mathrm{st}$ to $P_H$ and extend to the $\sigma$-algebra it generates. The result: Loeb measure, a genuine countably additive probability measure on a standard measurable space. Standard probability theory from a discrete hyperfinite coin flip.


Brownian motion

Standard: The Wiener process, defined axiomatically by independent Gaussian increments, continuity, and scaling.

Nonstandard: Take a hyperfinite random walk: $H$ steps of $\pm 1/\sqrt{H}$ each. The paths live in ${}^*\mathbb{R}$, but their standard parts (via the Loeb construction on the hyperfinite path space) are actual Brownian paths. Anderson (1976). Brownian motion isn’t the limit of random walks in some informal sense; it literally is a hyperfinite random walk, modulo rounding.
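A finite simulation gives a feel for the scaling. This is a stand-in with a large standard $H$, not Anderson's construction; the function names and the seed are arbitrary choices for the sketch. One thing it shows exactly: the sum of squared increments is $H \cdot (1/H) = 1$, matching the quadratic variation of Brownian motion on $[0,1]$.

```python
import math
import random

def scaled_random_walk(H, seed=0):
    """Return the path W_0, W_1, ..., W_H of a walk with H steps of +-1/sqrt(H)."""
    rng = random.Random(seed)
    step = 1.0 / math.sqrt(H)
    path = [0.0]
    for _ in range(H):
        path.append(path[-1] + rng.choice((-step, step)))
    return path

def quadratic_variation(path):
    """Sum of squared increments along the path."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))

path = scaled_random_walk(10_000)
print(path[-1], quadratic_variation(path))
```

The endpoint varies with the seed, while the quadratic variation stays at 1 up to floating-point rounding for every path.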


Law of large numbers

Standard: $X_1, X_2, \ldots$ i.i.d. with mean $\mu$. Then $\frac{1}{n}\sum_{i=1}^n X_i \to \mu$ almost surely.

Nonstandard: For every unlimited $H$, $\frac{1}{H}\sum_{i=1}^H {}^*X_i \approx \mu$ on a set of Loeb measure 1. The limit is already achieved — not approached — at unlimited index.


Generalized Functions

Standard distributions (Schwartz, 1950s) are linear functionals on test functions — you can’t evaluate a distribution at a point, only integrate it against a smooth bump. The Dirac delta “function” isn’t a function at all. It’s a functional. This works but it’s conceptually weird: you define the most basic object in physics (a point source) as something that can only be observed indirectly.

Nonstandard: a distribution is just a nonstandard function. The Dirac delta is an actual internal function — a spike of height $H$ and width $1/H$ for unlimited $H$, with area 1. You can evaluate it at a point. You can compose it. You can do arithmetic with it. When you want the standard distribution back, take the standard part of its integral against a test function. Loeb measure subsumes Lebesgue measure in the same way: start with hyperfinite counting measure, apply $\operatorname{st}$, get a real measure. The nonstandard object is simpler and the standard one is a shadow of it.

This is the pattern. The standard construction is an elaborate workaround for not having infinitesimals. The nonstandard construction just uses them.
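A finite caricature of the delta spike, assuming the spike sits on $[0, 1/H)$ (the text only fixes its height and width) and using a large standard $H$ in place of an unlimited one. The names `delta_H` and `pair_with` are invented for this sketch:

```python
import math

def delta_H(x, H):
    """Spike of height H on [0, 1/H), zero elsewhere; total area is 1."""
    return float(H) if 0.0 <= x < 1.0 / H else 0.0

def pair_with(phi, H, samples=1000):
    """Midpoint-rule approximation of the integral of delta_H * phi over the spike's support."""
    dx = (1.0 / H) / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * dx
        total += delta_H(x, H) * phi(x) * dx
    return total

H = 10**6
print(delta_H(0.0, H))          # the spike can be evaluated at a point
print(pair_with(math.cos, H))   # close to cos(0) = 1
```

Pointwise evaluation is an ordinary function call, and pairing with a test function returns a value close to $\varphi(0)$. With an unlimited $H$ the gap is infinitesimal, and the standard part recovers the usual distribution.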


Dynamical Systems

Recurrent point

Standard: $x$ is recurrent for $f$ iff for every $\varepsilon > 0$ and every $N$ there exists $n > N$ with $\lvert f^n(x) - x\rvert < \varepsilon$.

Nonstandard: There exists an unlimited $N$ such that ${}^*f^N(x) \approx x$. The orbit returns infinitely close to its start, at an infinite time step.


Attractor

Standard: A compact invariant set $A$ such that nearby orbits converge to $A$.

Nonstandard: $A$ is an attractor iff every point $y$ with $\mathrm{st}(y)$ near $A$ gets mapped under ${}^*f$ to a point ${}^*f(y)$ whose standard part is also near $A$.


Ergodicity

Standard: $f$ is ergodic for measure $\mu$ iff every $f$-invariant measurable set has measure $0$ or $1$.

Nonstandard (Kamae 1982): Using the hyperfinite construction with Loeb measure, $f$ is ergodic iff for any standard measurable $A$, $B$, the hyperfinite time-average of visits from $A$ to $B$ is $\approx \mu(A)\cdot\mu(B)$. Check one hyperfinite orbit of length $H$; ergodicity is equidistribution in the monad sense.


Combinatorics

Density in the integers

Standard: $d(A) = \lim_{N\to\infty} \lvert A \cap \{1,\ldots,N\}\rvert / N$.

Nonstandard: $d(A) = \mathrm{st}\bigl(\lvert {}^*A \cap \{1,\ldots,H\}\rvert / H\bigr)$ for any unlimited $H$. No limit. Count in a hyperfinite initial segment and round.
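For instance (my example), let $A$ be the multiples of $3$. The count of elements of ${}^*A$ up to $H$ is $\lfloor H/3 \rfloor$, and

\[\frac{1}{3} - \frac{1}{H} < \frac{\lfloor H/3\rfloor}{H} \le \frac{1}{3}\]

so the ratio is infinitely close to $1/3$ and $d(A) = 1/3$, whichever unlimited $H$ was picked.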


Szemerédi’s theorem

Standard: Any subset of $\mathbb{N}$ with positive upper density contains arithmetic progressions of every finite length.

Nonstandard: A hyperfinite set $A \subseteq \{1,\ldots,H\}$ with $\lvert A\rvert/H \not\approx 0$ contains an arithmetic progression $\{a, a+d, a+2d, \ldots, a+kd\}$ for any standard $k \ge 1$. The “for large enough $N$” quantifier becomes “for the actual hyperfinite $H$ you’re working in.”


Van der Waerden

Standard: For any $r$-coloring of $\mathbb{N}$, some color class contains arbitrarily long arithmetic progressions.

Nonstandard: Any $r$-coloring of $\{1, \ldots, H\}$ has a monochromatic arithmetic progression of any standard length $k$, because $H$ exceeds every standard van der Waerden number $W(k,r)$. Transfer once; done.
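The finite statement being transferred can be checked by brute force in the smallest interesting case, $W(3,2) = 9$: every 2-coloring of the integers 1 to 9 has a monochromatic 3-term progression, while 1 to 8 can still avoid one. The helper names below are made up for the sketch, and the search is exhaustive, so it only scales to tiny cases:

```python
from itertools import product

def has_mono_ap(coloring, k):
    """True if some k-term arithmetic progression in 1..n is monochromatic.
    coloring[i] is the color of the integer i + 1."""
    n = len(coloring)
    for start in range(1, n + 1):
        for step in range(1, n):
            last = start + (k - 1) * step
            if last > n:
                break
            colors = {coloring[start + j * step - 1] for j in range(k)}
            if len(colors) == 1:
                return True
    return False

def every_coloring_has_ap(n, r, k):
    """Check all r**n colorings of 1..n for a monochromatic k-term progression."""
    return all(has_mono_ap(c, k) for c in product(range(r), repeat=n))

print(every_coloring_has_ap(8, 2, 3))   # False: 8 is still below W(3,2)
print(every_coloring_has_ap(9, 2, 3))   # True:  9 = W(3,2)
```

An unlimited $H$ sits above every such threshold at once, which is all the nonstandard statement uses.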


Ramsey theory

Standard: For any $r$-coloring of $\{1,\ldots,N\}$ with $N \ge R(k,r)$, there is a monochromatic $k$-element set (or progression, depending on flavor).

Nonstandard: Transfer the finite Ramsey statement to ${}^*\mathbb{N}$: any $r$-coloring of $\{1,\ldots,H\}$ has a monochromatic set of any standard size $k$, because $H$ is larger than every standard Ramsey number $R(k,r)$. No asymptotics, just transfer applied once.


Number Theory

Twin prime conjecture

Standard: There are infinitely many primes $p$ such that $p+2$ is also prime.

Nonstandard equivalent: There exists an unlimited prime $P \in {}^*\mathbb{N}$ such that $P+2$ is also (hyperfinitely) prime. “Infinitely many” becomes “at least one unlimited one.” Unprovable by rephrasing alone, but cleaner to state.


Goldbach conjecture

Standard: Every even integer greater than 2 is the sum of two primes.

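Nonstandard equivalent: For every even $N \in {}^*\mathbb{N}$ greater than 2, limited or unlimited, there exist internal primes $P, Q$ with $P + Q = N$. By transfer, this is equivalent to the standard statement; the nonstandard version just makes the universality explicit: even the unlimited even numbers must split. Restricting to unlimited $N$ alone gives something weaker, equivalent to Goldbach for all sufficiently large even integers.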


Density of primes

Standard: $\pi(N)/N \to 0$ (primes have density zero).

Nonstandard: $\pi(H)/H \approx 0$ for every unlimited $H$, where $\pi(H)$ counts internal primes up to $H$. The prime number theorem says more: $\pi(H)\ln(H)/H \approx 1$, so $\pi(H)/H$ is not just infinitesimal but equal to $1/\ln(H)$ up to a factor infinitely close to 1.
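A finite look at both quantities, using a plain sieve (the function name `prime_count` is my own). Standard $N$ can only hint at the unlimited case: the density $\pi(N)/N$ shrinks, and the ratio of $\pi(N)$ to $N/\ln(N)$ drifts slowly toward 1.

```python
import math

def prime_count(n):
    """Count primes <= n with a sieve of Eratosthenes."""
    if n < 2:
        return 0
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if is_prime[p]:
            # Zero out p*p, p*p + p, ... in one slice assignment.
            is_prime[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return sum(is_prime)

for n in (10**3, 10**4, 10**5, 10**6):
    count = prime_count(n)
    # columns: n, density pi(n)/n, ratio pi(n) * ln(n) / n
    print(n, count / n, count * math.log(n) / n)
```

The second column keeps falling and the third stays above 1 while creeping down, consistent with the prime number theorem.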


Dirichlet’s theorem on primes in progressions

Standard: If $\gcd(a, m) = 1$, there are infinitely many primes $p \equiv a \pmod{m}$.

Nonstandard: For standard $a, m$ with $\gcd(a,m) = 1$, there exist unlimited primes $P \in {}^*\mathbb{N}$ with $P \equiv a \pmod{m}$. Transfer the statement “for every $n$ there is a prime $p > n$ with $p \equiv a \pmod{m}$” and apply it to an unlimited $n$.


Big-$O$ Notation

If you’ve read the big-$O$ post or what does big mean, you know this already. But it fits so cleanly here.

Little-$o$

Standard: $f \in o(g)$ iff $f(x)/g(x) \to 0$ as $x \to \infty$.

Nonstandard: $f(H)/g(H)$ is infinitesimal for every unlimited $H$.

Big-$O$

Standard: $f \in O(g)$ iff there exist $C, N$ such that $\lvert f(x)\rvert \le C\lvert g(x)\rvert$ for all $x > N$.

Nonstandard: $f(H)/g(H)$ is limited (finite) for every unlimited $H$ — $\mathrm{st}(f(H)/g(H))$ exists as a real number.

$\Theta$

Nonstandard: $f(H)/g(H)$ is limited and bounded away from zero — nonzero real after taking the standard part.

The hierarchy is just classifying hyperreals: infinitesimal, finite, or infinite. Pick unlimited $H$, evaluate the ratio, see what kind of number you get. No quantifier alternation.
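A quick classification, with functions picked for illustration: let $f(x) = 3x^2 + x$ and evaluate the ratio at an unlimited $H$ against three candidates for $g$.

\[\frac{f(H)}{H^{3}} = \frac{3}{H} + \frac{1}{H^{2}} \approx 0, \qquad \frac{f(H)}{H^{2}} = 3 + \frac{1}{H} \approx 3, \qquad \frac{f(H)}{H} = 3H + 1 \ \text{(unlimited)}\]

So $f \in o(x^3)$, $f \in \Theta(x^2)$, and $f \notin O(x)$.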


Standard analysis asks what happens under all sufficiently fine approximations. Nonstandard asks: what does it look like at infinite or infinitesimal scale? Same answer. Second question is easier. Not taking limits — just looking.


Further reading: Keisler’s free calculus textbook. Goldblatt’s Lectures on the Hyperreals. The NSA Wikipedia page is good.

From my notes and my existing posts on big-O notation, what does big mean, derivatives at discontinuities, and discontinuous linear functions.