The purpose of a real-valued function is to represent the way a situation changes, and the purpose of the differential calculus, the "mathematics of change", is to derive local information about (mostly gradual) changes from pointwise information about that function.
The nature of the desired information depends on the situation, if only because a real number has a sign and a size. And so, the desired information can be qualitative---is \(f\) positive/negative near \(x_{0}\)? increasing/decreasing? concave up/concave down?---or quantitative---what is the approximate value, rate of change, or acceleration of \(f\) near \(x_{0}\)?
But, qualitatively, we might also want to know whether, at \(x_{0}\), \(f\) is continuous (resp. differentiable) while, quantitatively, we might ask what the jump (resp. the slope) is.
We hope to show that studying a function by way of its local polynomial approximations is considerably more natural than, to quote Lagrange, "seeing derivatives in isolation". Specifically, we will argue that the systematic use of polynomial approximations brings to the differential study of functions of one real variable much the same advantages that the use of decimal numbers brings to the study of real numbers: it organizes, unifies, and simplifies that study «Gleason, 1967 \#34» and, moreover, extends canonically to the Fréchet derivative in multi-variable calculus «Flanigan, 1971 \#78», in Banach spaces «Dieudonne, 1960 \#112», and to jets in Differential Topology «Bröcker, 1975 \#68».
Preliminaries
The definition, if not the notion, of function is deceptively simple: a function \(f\) can be as simple as a power function or as complicated as a fractal. Thus, the general idea when discussing \(f(x_{0}+h)\), the value of \(f\) near a point \(x_{0}\), is naturally to separate the principal part, that is, the part smooth enough to be relevant to the information being sought, from the remainder, that is, the part too small to be significant in that regard. We thus distinguish \(P^{(n)}(h)\), a polynomial part of degree \(n\) in \(h = x - x_{0}\), and a remainder \(R_{(n)}(h)\) small enough that, compared to \(P^{(n)}(h)\) and for the given purpose, it can be neglected:
\[
f(x_{0}+h) = P^{(n)}(h) + R_{(n)}(h)
\]
where
\[
P^{(n)}(h) = A_{0} + A_{1}h + A_{2}h^{2} + \cdots + A_{n}h^{n}
\]
and
\[
R_{(n)}(h) = o[h^{n}],
\]
which we read as saying that \(R_{(n)}(h)\) approaches \(0\) faster than \(h^{n}\), that is,
\[
\lim_{h\to 0}\frac{R_{(n)}(h)}{h^{n}} = 0.
\]
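As a concrete check (our choice of example, not one made in the text): take \(f = \exp\) at \(x_{0} = 0\) with \(n = 2\), so that \(A_{0} = 1\), \(A_{1} = 1\), \(A_{2} = 1/2\). The ratio \(R_{(2)}(h)/h^{2}\) can then be watched shrinking numerically. A minimal Python sketch:

```python
import math

def P2(h):
    # Degree-2 polynomial part of exp at x0 = 0: A0 = 1, A1 = 1, A2 = 1/2
    return 1.0 + h + h**2 / 2.0

def R2(h):
    # Remainder: whatever the polynomial part leaves out
    return math.exp(h) - P2(h)

# R2(h)/h^2 shrinks toward 0 as h -> 0, witnessing R2(h) = o[h^2]
ratios = [R2(h) / h**2 for h in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

Each tenfold shrinking of \(h\) shrinks the ratio roughly tenfold, consistent with \(R_{(2)}(h)\) behaving like \(h^{3}/6\) near \(0\).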
Graphically, this means that, in a small enough neighborhood of \(0\), the graph of \(R_{(n)}(h)\) is trapped between the graphs of \(-\epsilon h^{n}\) and \(+\epsilon h^{n}\), no matter how small \(\epsilon > 0\) is chosen.
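The same example makes the graphical statement concrete: for a fixed tolerance \(\epsilon\) (here \(\epsilon = 0.01\), an arbitrary choice of ours), \(|R_{(2)}(h)|\) falls below \(\epsilon h^{2}\) once \(h\) is close enough to \(0\), though it need not do so farther away:

```python
import math

def remainder(h):
    # Remainder of the degree-2 polynomial part of exp at x0 = 0
    return math.exp(h) - (1.0 + h + h**2 / 2.0)

eps = 0.01  # an arbitrary positive tolerance; any eps > 0 works near 0
# Is the remainder trapped between -eps*h^2 and +eps*h^2 at this h?
inside = {h: abs(remainder(h)) <= eps * h**2 for h in (0.5, 0.01, 0.001)}
print(inside)
```

At \(h = 0.5\) the bound fails, but at \(h = 0.01\) and \(h = 0.001\) it holds; shrinking \(\epsilon\) merely shrinks the neighborhood on which it holds.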