The Exponential and Logarithmic Functions
2004-03-01 | 2025-04-19
Estimated Reading Time: 32 minutes
This is another in my series of blogs on fascinating and mathematically indispensable numbers. It follows on from blogs on zero, one, and π, and is likely to be followed by others. It happens that a single blog is sometimes too short to display the beauty of the subject, and I have had to segment the story into parts. Such will be the case here. While e is less well known to the general public than π, it is perhaps even more fundamental to all of Nature and pervades the entire realm of Mathematics. It would indeed be difficult to discover a nook or cranny of Nature that has not been penetrated by this omnipresent emissary of mathematical order.
After completing this blog, I became aware of Robin Wilson’s Euler’s Pioneering Equation: The most beautiful theorem in mathematics [1]. I was astounded to discover that parts of this blog bear a remarkable resemblance to his chapter 4 on \(e\), in content and unfoldment. This blog is based on lectures that I had originally given in 2004, and its content antedates Wilson’s book. Nevertheless, it is flattering to realize that I have come close to a seasoned professional mathematician’s conceptual exposition on \(e\).
Unfurling countless digits
Perversely, almost all the important numbers in our world, like \(\sqrt{2}\), \(\pi\), and \(e\), are irrational: one simply cannot predict their decimal digit sequences.
“What if I were the creator of such a virtual world, populated like ours, by irrational numbers with unending and unpredictable digits? How would I sustain that world without an infinite memory to hold all those countless digits?”
I would need some convenient, succinct, shorthand method by which to unfurl their countless digits, one after the other. It might be an algorithm like a convergent infinite series or a recursive definition or an infinite continued fraction1.
This thought is a preface to many of the fascinating numbers we will encounter in these blogs.
I am opening this blog with an abrupt exposure to the idea of exponentials, without any courteous introduction or gentle historical note on \(e\), which will follow soon enough though. The reason for this is that I wanted to dispel a possible confusion between \(x^n\) and \(n^x\) that often exists in the mind of the mathematical novice.
Such confusion is best dispelled using whole numbers, and ideally before \(e\) has made its august entrance, rather than afterward, when the door for even greater conceptual muddiness has been thrown wide open. In this blog, I will be zig-zagging repeatedly across the same concepts in different contexts, simply because what we are dealing with is a tad more abstract than usual.
Bases and Exponents
We have introduced the different types of numbers in the blog The Two Most Important Numbers: Zero and One. In that very same blog, we also introduced the idea of exponentiation, or raising (something) to a power, as repeated multiplication. That section is very important: do take a look at it again if it seems faint or foggy now, as some basic results from that blog are worth reviewing at this point.
Monomial power functions
At the very outset, it is important to clear up a possible source of confusion: monomial power functions and exponentials might look similar but are very different.
A monomial power function is a monomial \(ax^n\), with the coefficient \(a\) equal to one, and the value \(n\) being a non-negative integer, i.e., \[ y = x^n\; \text{where $n \in \mathbb{N}\cup\{{0}\}$ and $x \in \mathbb{R}$}. \qquad{(1)}\] Examples are \(y = 1 = x^{0}\), \(y = x\), \(y = x^2\), \(y = x^3\), etc., as shown by the graphs of these functions in Figure 1.
The following qualitative points should be noted:
In each case, \(x\) varies, but \(n\) is constant, as defined in Equation 1.
When \(n\) is even, like \(0, 2, 4\), etc., the graph of \(x^n\) is symmetrical about the \(y\)-axis. Such a function is called an even function, defined as \(f(x) = f(-x)\).
When \(n\) is odd, like \(1, 3, 5\) etc., the graph of \(x^n\) exhibits rotational symmetry about the origin \((0, 0)\), i.e., if the graph is rotated 180° about the origin, the graph remains unchanged. Such a function is called an odd function, defined as \(f(x) = -f(-x)\).
The graph of \(x^{0}\) is constant and its behaviour is anomalous when compared to others in the family, as is apparent from Figure 1.
For \(x \in [0, 1)\), the larger \(n\) is, the closer \(x^n\) is to \(0\).
For \(x \gg 1\), the larger \(n\) is, the steeper the graph climbs as \(x\) increases.
Except for \(n = 0\), the graphs of \(x^n\) pass through \((0, 0)\) for all other values of \(n\).
The monomial power functions are a subset of the polynomials.
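The even/odd symmetry claims above are easy to check numerically. Here is a minimal Python sketch (the helper names `is_even_fn` and `is_odd_fn` are mine, chosen for illustration, and integer sample points keep the arithmetic exact):

```python
# Numerically confirm the symmetry properties of x**n:
# f(x) = f(-x) when n is even; f(x) = -f(-x) when n is odd.
def is_even_fn(f, xs):
    return all(f(x) == f(-x) for x in xs)

def is_odd_fn(f, xs):
    return all(f(x) == -f(-x) for x in xs)

xs = [1, 2, 3, 5]          # integer sample points keep the arithmetic exact
for n in range(6):
    f = lambda x, n=n: x ** n
    kind = "even" if is_even_fn(f, xs) else "odd" if is_odd_fn(f, xs) else "neither"
    print(n, kind)          # even n -> "even", odd n -> "odd"
```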
As an exception, I have included in Figure 1 the special case of the positive non-integer power \(e \approx 2.71828\), which is the subject of this blog. This was simply to show that since \(e\) lies between \(2\) and \(3\) its graph is sandwiched between the curves \(x^2\) and \(x^3\). It is shown as a dashed line in Figure 1. But there ends the similarity. In fact, \(x^e\) is not a monomial power function. Negative numbers cannot be raised to non-integer powers and still remain real numbers. So, the domain for \(x^e\) alone is restricted to \([0, \infty)\). If you find all this unhelpful or confusing, simply ignore it for now.
Exponentials
We now consider the second family of functions which might look like the monomial power functions but are really a bird of a different feather. The exponentials are generally defined as: \[ y = a^{x} \; \text{ where $a \in \mathbb{R}; a > 0; a \ne 1;$ and $x \in \mathbb{R}$}. \qquad{(2)}\] Note that the value of \(a\) is constant whereas \(x\) varies. To keep matters simple, we will not consider the case of \(0 < a < 1\) here. Moreover, for our purpose of comparing the behaviour of graphs of \(n^{x}\), we have restricted the definition to be: \[ y = n^{x}; \; \text{ where $n \in \mathbb{N}; x \in \mathbb{R}$}. \qquad{(3)}\] Graphs of this family of functions are shown in Figure 2.
The following qualitative points are noteworthy:
For each graph, the exponent \(x\) varies, but the base \(n\) is held constant for that graph.
The graph for \(n = 1\) is anomalous and constant in value. It is shown only for completeness and may be excluded from the definition of exponentials as in Equation 2.
All other graphs pass through the point \((0, 1)\), which is characteristic of all exponentials.
For the base \(e\), when \(x = 1\), \(e^x = e\), i.e., the dashed graph passes through \((1, e)\).
For \(x < 0\), the values of \(n^{x}\) are greater than \(0\), but less than \(1\), and approach the asymptote \(y = 0\) as \(x \to -\infty\).
As \(x\) increases without bound, so does \(y\).
The larger \(n\) is, the steeper the rise of \(n^{x}\) for values of \(x \gg 1\).
The graph of \(e^{x} = \exp(x)\)—shown as a dashed line—legitimately belongs to this class of curves and shares the same domain as other exponentials. Even as \(2 < e < 3\), its graph is sandwiched between those of \(2^{x}\) and \(3^{x}\) as would be expected.
The exponentials are neither odd nor even functions, and their values are strictly positive.
The roles of \(n\) and \(x\) have been interchanged between the monomial power functions and the exponentials.
Note how the exponential functions increase exceedingly rapidly compared to the monomial power functions.
A tabular comparison of the values of \(x^{n}\) and \(n^{x}\) will better reveal the large-value behaviour of these two families of functions, as shown in Table 1.
| \(n\) | \(x\) | \(x^{n}\) | \(n^{x}\) |
|---|---|---|---|
| \(1\) | \(10\) | \(10\) | \(1\) |
| \(2\) | \(10\) | \(100\) | \(1,024\) |
| \(3\) | \(10\) | \(1,000\) | \(59,049\) |
| \(4\) | \(10\) | \(10,000\) | \(1,048,576\) |
| \(5\) | \(10\) | \(100,000\) | \(9,765,625\) |
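Table 1 can be recomputed with a few lines of Python (a sketch of mine, not one of the blog's scripts), which makes the contrast between polynomial and exponential growth at \(x = 10\) stand out:

```python
# Recompute Table 1: x**n grows polynomially, n**x exponentially, at x = 10.
x = 10
print(f"{'n':>3} {'x**n':>12} {'n**x':>12}")
for n in range(1, 6):
    print(f"{n:>3} {x ** n:>12,} {n ** x:>12,}")
```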
Computational complexity theory
I am belabouring this distinction between the polynomials (or monomial power functions) and the exponentials because many students, especially of computer science, are usually clueless when they encounter the rather forbidding topic called Computational complexity theory in their university studies.
The exponential functions tend to increase extremely rapidly compared to the polynomial functions. Such distinctions become vital when evaluating the efficiency and execution times of algorithms in computer science, and indeed even their solvability in finite time. Keep this difference in mind as we navigate our way through the number \(e\) in this and subsequent blogs.
Introduction to the number e
We are now ready to make our formal acquaintance with the number \(e\), which stands modestly behind \(\pi\) in fame, though not in ubiquity. It appears interwoven into the very fabric of Nature and is pivotal to mathematics, science, and engineering.
Unlike \(\pi\), though, it is relatively unknown to the public at large. Indeed, it did not have its own symbol until relatively recently, when the Swiss mathematician Leonhard Euler assigned it the letter \(e\) around 1731. In fact, I wanted to call this blog, “Euler’s number \(e\)” before I realized that it was actually discovered by Jacob Bernoulli, and that there are several other candidates for Euler’s number besides \(e\).
The number \(e\) is associated with logarithms, exponential growth, exponential decay, compound interest, the differential and integral calculus, the circular and hyperbolic functions, probability, queueing and reliability theories, the Fourier transform, and many other areas of mathematics. This linkage, across sub-disciplines, was not known initially, but only recognized gradually as “things fell into place” later on.
In this sense, the history of \(e\) is like that of, say, wavelets [3] in recent times, when it transpired that physicists, electrical engineers, and pure mathematicians had all approached the same idea from different standpoints and terminologies. A sound theory was only born after these diverse viewpoints had been integrated into a coherent body of knowledge.
Among the important numbers of mathematics, the linkage between \(\pi\), \(e\) and \(i\) is deeply entrenched. Here is an equation, which was raised to mystical status by an American professor of mathematics, Benjamin Peirce, who was photographed standing in front of a blackboard on which he had written: \[ i^{-i} = \sqrt{e^\pi} \qquad{(4)}\] He was quoted as saying, “Gentlemen, we have not the slightest idea what this equation means, but we may be sure that it means something very important [4,5].” We will re-visit this equation and de-mystify it later in another blog in this series.
While \(\pi\) is the ratio of the circumference of a circle to its diameter, what exactly is \(e\)? And, if it is so important, why is \(e\) not more widely known? What properties does \(e\) possess that make it so useful and pervasive? We shall attempt to answer these questions and more in this and related blogs.
The power of the exponent
Did you read that heading carefully? And did you get the pun in it?
We have already peeked into exponentiation in Table 1. Just as multiplication is a shorthand for repeated addition, so too is exponentiation a shorthand for repeated multiplication. It has been said that human beings are not very good when it comes to comprehending the very large and the very small.
If I gave you a stick that is one metre long and told you to divide it into one thousand equal parts, how long would each division be? If I now told you that the same stick represented one million divisions, and asked you to mark the first one thousandth part, where would you mark it? I am not going to tell you, because this one is easy enough for you to figure out for yourself. It will tell you how good or bad your ability to estimate is.
What happens if the scale is not linear but logarithmic? Let your mental cogwheels again start turning. If you find all this too exhausting, simply look at Figure 3 below.
The power of two
There is a famous story about the person who invented the game of chess.2 The monarch of the realm was so pleased with the game that he wanted to reward the inventor. Feeling very expansive, he said “Ask for anything and I will give it to you.” The inventor rather diffidently asked the king for one grain of rice on the first square of the chess board, double that number of grains on the second, double that number of grains again on the third, and so on till all the sixty four squares had their quotas filled [6].
The king laughed and said, “Ask for something more. You deserve it.” The inventor quietly but persistently said, “Sire, kindly grant me what I have asked.” The king jovially asked his ministers to fulfil the inventor’s modest request, thinking all would be well. Little did he know that the entire granary of the kingdom would be emptied before each square received its quota of rice grains. Can you explain why?
Grains of rice on a chess board
Let us number the squares on the chess board from \(1\) to \(64\). The first square has one grain, which is \(2^0\). The second has two grains, which is \(2^1 = 2^{(2-1)}\). Likewise, the \(k\)th square will have \(2^{k-1}\) grains of rice.
The total number of grains of rice will be given by the formula: \[ \begin{aligned} T = \sum_{k = 1}^{64} 2^{k-1} \end{aligned} \qquad{(5)}\]
Recognizing this as the sum of a geometric series with \(a = 1\), \(r = 2 > 1\), and \(n = 64\), the sum \(T\) is given by [7]: \[ \begin{aligned} T &= \dfrac{a(r^n - 1)}{r - 1}\\ &= \dfrac{1(2^{64} -1)}{1}\\ &= 2^{64} - 1 \\ &\approx 2^{64} \end{aligned} \qquad{(6)}\]
Assuming that 50 grains of rice have a mass of one gram, the total mass of \(2^{64}\) grains of rice in metric tonnes would be \(\tfrac{2^{64}}{50\times10^6} \approx 3.7 \times 10^{11}\) metric tonnes. India’s total annual rice production in 2023–2024 was \(1378.25 \times 10^5 \approx 1.38 \times 10^{8}\) metric tonnes. The inventor of chess in the seventh century asked for more than \(2,500\) times the rice produced in India in 2023–2024! He certainly knew about the power of the exponent.
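The chessboard arithmetic can be verified with a short Python sketch, using the same 50-grains-per-gram assumption as the text:

```python
# Verify the chessboard arithmetic: T = 2^64 - 1 grains, converted to tonnes
# under the text's assumption of 50 grains of rice per gram.
total_grains = sum(2 ** (k - 1) for k in range(1, 65))   # one term per square
assert total_grains == 2 ** 64 - 1                       # geometric-series sum

tonnes = total_grains / 50 / 1e6   # grains -> grams -> tonnes
print(f"{tonnes:.2e} tonnes")      # about 3.7e11 tonnes
print(f"about {tonnes / 1.38e8:.0f} times India's 2023-24 rice output")
```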
The moral of this story is that exponentials are beguilingly difficult for human beings to grasp. That is why logarithms and logarithmic scales, which linearize exponentials, were invented.
Napier and logarithms
Logarithms were developed by an eccentric3 Scottish laird called John Napier around 1614. He devoted twenty years of his life to achieve this. In these days of mobile phones with calculators, and computational packages on laptops, it is difficult to imagine a time when the tedium of calculations impelled people to seek methods to ease the burden.
It has been suggested that Napier got the idea for performing additions in place of multiplications from trigonometric identities such as \[ \sin A\cos B = \dfrac{1}{2}\left[ \sin(A+B) + \sin(A-B) \right] \] He might just as well have gotten the idea from the geometric progression \(1, r, r^2, r^3, \dots r^n\), where each successive term is obtained by multiplying the previous one by \(r\): something which could equally well have been accomplished by adding the exponent of \(r\)—which is 1—to that of the previous term. This idea, which may seem blasé to us now, was profoundly significant in Napier’s time. The laws of indices, which we now know, form the basis of the idea for logarithms.
Therefore, logarithms eventually reduced multiplications to additions and exponentiations to multiplications.4 Likewise, divisions became subtractions, and taking roots was replaced by divisions. This reduction in the hierarchy of the arithmetic operations came with a commensurate reduction in computational complexity. Logarithms were indeed a great labour saving device for arithmetic operations.
Where does \(e\) fit into all this? In quite a roundabout way, really.
Napier coined the word logarithm which means “ratio number”. The scheme he devised was to produce a table of numbers \(N\) against \(L\) [8] where \[ N = 10^7(1-10^{-7})^L \qquad{(7)}\] Comparing this with the modern notation introduced by the prolific Euler, that \[ N = b^L \] we find that what might correspond to the base in Napier’s logarithms was \(b = (1-10^{-7}) = 0.9999999\), which because it is less than 1 means that his logarithms decreased with increasing numbers. Moreover, because of the factor \(10^7\), setting \(L=0\) gives \(10^7\) in Napier’s scheme whereas in modern notation, \(L=0\) gives \(1\) regardless of the base.
The strange thing is that logarithms to the base \(e\), now called natural logarithms, used to be called Napierian logarithms, although he did not use \(e\) as the base. How did this association then arise?
Let us manipulate Equation 7 step-by-step as shown below to achieve the form \(N = b^L\): \[ \begin{aligned} N &= 10^7(1-10^{-7})^L\; ; \; \text{ divide both sides by $10^7$}\\ \dfrac{N}{10^7} = N' &= (1-10^{-7})^L\; ; \; \text{ set $L = 10^7L'$}\\ &= (1-10^{-7})^{10^{7}L'}\\ &= [(1-10^{-7})^{10^{7}}]^{L'}\\ &= [(1-\dfrac{1}{10^{7}})^{10^{7}}]^{L'}\\ &= b^{L'} \end{aligned} \qquad{(8)}\] The interesting point of the above derivation is that the number \(b = [(1-\tfrac{1}{10^{7}})^{10^{7}}]\), which we may now associate with the base of Napier’s logarithms using modern convention, works out to \(0.36787942297110\), which is very close to \(\tfrac{1}{e} \approx 0.36787944117144\) [9]. This means that Napier had unwittingly used \(\tfrac{1}{e}\) as the base of his logarithms and was tantalizingly close to discovering \(e\), or its reciprocal.5 The fact that \(e\) may be the result of a limiting process sets the scene for the next stage in the dénouement.
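This near-miss is worth checking numerically. A quick Python sketch of mine (not part of Napier's construction):

```python
# Napier's effective base b = (1 - 10^-7)^(10^7), compared with 1/e.
import math

b = (1 - 1e-7) ** 10**7
print(f"b   = {b:.14f}")           # ≈ 0.3678794229711
print(f"1/e = {1 / math.e:.14f}")  # ≈ 0.3678794411714
print(f"difference ≈ {abs(b - 1 / math.e):.1e}")
```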
Compounding of interest
Banks charge or pay compound interest on money borrowed or invested with them. Let us assume that a sum \(P\) is invested with a bank that pays compound interest at the rate of \(r\) per annum, where \(r\) is expressed, not as a percentage, but as a fraction between zero and one. Let this interest be paid annually. Then at the end of one year, the money would have grown to \(P(1+r)\). At the end of two years, the money would have grown to \([P(1+r)](1+r)=P(1+r)^2\). Thus after \(t\) years, the money would have grown to \(P(1+r)^t\).
In point of fact, nowadays, banks do not compute interest on an annual basis. They do so on a daily basis. Let us assume that there are \(n\) days in a year. Then, the interest rate per period, which in this case is the rate per day, is \(\tfrac{r}{n}\) and there are \(n\) periods of compounding in one year giving a sum at the end of the year of \(P(1 + \tfrac{r}{n})^n\). Likewise, in \(t\) years, there are \(nt\) periods of compounding and the sum \(S\) at the end of \(t\) years will be: \[ S = P\left[ 1+\dfrac{r}{n}\right] ^{nt} \]
Now, what happens when the number of compounding periods grows? What happens if banks do not compute interest daily but every hour, or every minute, or every second? Is there a possible “get rich quick scheme” that involves getting paid interest every millisecond, say, or every nanosecond?
Change in compounding period
We will write a simple program to investigate how money grows as the frequency of compounding keeps increasing. The equation we will use is \[ S = P\left[ 1+\dfrac{r}{n}\right] ^{nt} \qquad{(9)}\] where \(P\) is the principal, \(r\) is the annual interest rate expressed as a fraction, \(n\) is the number of compounding periods per annum, \(t\) is the number of years and \(S\) is the sum or amount at the end of \(t\) years.
We assign \(P = 100\), \(t = 1\), \(r = 0.05\), and allow \(n\) to vary across annual, semi-annual, quarterly, monthly, weekly, daily, and hourly compounding periods. These correspond to values of \(n\) equal to \(1, 2, 4, 12, 52, 365, 8760\) respectively.
\(S\) is computed using Equation 9 and the values of \(n\) and \(S\) are tabulated in Table 2. Two scripts are provided, one in Julia, and the other in Python 3, that accomplish this. The results are shown below in Table 2 where the last row has been added manually, as explained later on.
| \(n\) | \(S\) |
|---|---|
| \(1\) | \(105.000000\) |
| \(2\) | \(105.062500\) |
| \(4\) | \(105.094534\) |
| \(12\) | \(105.116190\) |
| \(52\) | \(105.124584\) |
| \(365\) | \(105.126750\) |
| \(8760\) | \(105.127095\) |
| \(\infty\) | \(105.127109\) |
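The original Julia and Python 3 scripts are not reproduced here, but a minimal Python sketch along the same lines generates the tabulated values:

```python
# A sketch reproducing Table 2 (not the blog's original scripts):
# S = P (1 + r/n)^(n t) with P = 100, r = 0.05, t = 1.
P, r, t = 100, 0.05, 1
for n in (1, 2, 4, 12, 52, 365, 8760):
    S = P * (1 + r / n) ** (n * t)
    print(f"{n:>5} {S:.6f}")
```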
What do you find noteworthy about this? Regardless of how frequently the interest is compounded, the amount or sum \(S\) is solidly stuck around \(105.127\) or thereabouts. One might be forgiven for thinking that if the interest were added with breathtaking rapidity, the sum would somehow multiply astronomically. But alas, that is not how it works.
There is one trend that is apparent from the figures in the above table, though. The digits after the decimal point do increase, albeit very modestly, even though they seem to be bounded from above by some number. One way to find that number is to progress from periodic compounding to instantaneous compounding. We derive the exact value of \(S\) for instantaneous compounding later in this blog in What is the amount with instantaneous interest?.
With the word instantaneous, we are on thin ice. Instantaneous velocity gave us calculus, with its inbuilt inconsistencies of dividing by something that is close to but not quite zero. So, we may expect something along those lines here also. Whenever instantaneous makes its presence onstage, zero and infinity cannot be far away. 😉
The road to \(e\)
There are three variables apart from \(n\) in Equation 9 for \(S\). Let us simplify it by setting \(P = 1\), \(t = 1\), and \(r = 1\). Note that the last assignment means that the bank pays \(100\%\) interest per annum: something that is very unlikely, but mathematically expedient for us! The equation for \(S\) now becomes \[ S = \left[ 1+\dfrac{1}{n}\right] ^{n} \qquad{(10)}\] A Python 3 script called steps_to_e.py evaluates Equation 10 at logarithmic intervals and its results are tabulated below:
| \(n\) | \(\left(1+\frac{1}{n}\right)^n\) |
|---|---|
| \(1\) | \(2.00000000000000000\) |
| \(10\) | \(2.59374246010000231\) |
| \(100\) | \(2.70481382942152848\) |
| \(1000\) | \(2.71692393223559359\) |
| \(10000\) | \(2.71814592682492551\) |
| \(100000\) | \(2.71826823719229749\) |
| \(1000000\) | \(2.71828046909575338\) |
| \(10000000\) | \(2.71828169413208176\) |
| \(100000000\) | \(2.71828179834735773\) |
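The steps_to_e.py script itself is not shown here, but a minimal sketch of what it might look like is:

```python
# A sketch of what steps_to_e.py might do (the original is not shown here):
# evaluate (1 + 1/n)^n at logarithmically spaced values of n.
for k in range(9):
    n = 10 ** k
    print(f"{n:>10} {(1 + 1 / n) ** n:.17f}")
```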
The values are suggestive of convergence, but it is not rapid. The limit is the historically named number \(e\). A check with Wolfram Alpha gives the value of \(e\) as \(2.71828182845904524\) to seventeen decimal places.
We can also countercheck with SymPy, the Python library for symbolic mathematics, by running the script below:

from sympy import *

n = symbols("n")
S = limit((1 + 1 / n) ** n, n, oo)
print(S)

to get the result E, which attests to the validity of the limit. The script is at limit_e.py.
The expression \[ \lim_{n \to \infty}\left[1+\frac{1}{n}\right]^n \qquad{(11)}\] does converge to a finite non-zero value, which is its limit. And the value of this limit is the profoundly important mathematical constant \(e\): \[ e \triangleq \lim_{n \to \infty}\left[1+\frac{1}{n}\right]^n \qquad{(12)}\]
What is the amount with instantaneous interest?
Instantaneous compounding does not lead to unlimited growth. We have guessed as much from the results of evaluating Equation 9 for different values of \(n\), as shown in Table 2.
Now that we have defined \(e\), we may obtain a closed form solution for the amount from instantaneous compounding of interest [10]. In Equation 9, we retain \(P\), \(r\) and \(t\) and let \(n\) approach infinity. \[ \begin{aligned} S &= \lim_{n \to \infty} P\left[ 1+\dfrac{r}{n}\right] ^{nt}\\ &= P \lim_{n \to \infty}\left[ 1+\dfrac{r}{n}\right] ^{nt} \end{aligned} \] A purist would use \(x\) rather than \(n\) when moving from countable intervals to continuous compounding. So, let us re-state the equation with \(x\): \[ S = P \lim_{x \to \infty}\left[ 1+\dfrac{r}{x}\right] ^{xt} \] A magician’s distraction is called for here. We want the expression within the parentheses to have a second term with one as the numerator so that it looks like the second term in Equation 12. Let \(\frac{r}{x} = \frac{1}{u}\), so that \(x = ur\) and \(u \to \infty\) as \(x \to \infty\). Then, the above equation becomes \[ \begin{aligned} S &= P \lim_{x \to \infty}\left[\left( 1+\dfrac{r}{x}\right)^x\right]^t\\ &= P \lim_{u \to \infty}\left[\left( 1+\dfrac{1}{u}\right)^{ur}\right]^t\\ &= P \lim_{u \to \infty}\left[\left( 1+\dfrac{1}{u}\right)^u\right]^{rt}\\ &= P\left[ \lim_{u \to \infty}\left( 1+\dfrac{1}{u}\right)^u\right]^{rt}\\ &= Pe^{rt} \end{aligned} \] We can now confidently augment Table 2 by adding the last row with a value of \(\infty\) for \(n\) and an upper bound of \(S = Pe^{rt} = 100e^{0.05} = 105.1271096\).
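The closed form is easily checked in Python against the values in Table 2:

```python
# Closed-form check: instantaneous compounding gives S = P * e^(r*t),
# the upper bound on the discretely compounded sums in Table 2.
import math

P, r, t = 100, 0.05, 1
S = P * math.exp(r * t)
print(f"{S:.7f}")   # ≈ 105.1271096
```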
Thus far, we have distinguished between \(x^n\) and \(n^x\), emphasized that exponential growth is truly phenomenal, and considered compound interest at ever decreasing intervals between interest payments, which finally led to the definition of \(e\).
We have also glanced at logarithms and contrasted linear and logarithmic scales. Central to all this is the rather diminutive number \(e\), lying between \(2\) and \(3\), that occupies a pivotal place in much of mathematics.
Hereafter, we will continue exploring \(e\) and slowly invest it with mathematical trappings that go beyond mere numberhood and allow fascinating insights to emerge between seemingly unrelated fields.
Logarithms and the hyperbola
Limits are at the heart of both the differential and integral calculus. You have just seen one application of limits in defining the important number \(e\). We will now take a look at the use of limits in integral calculus and the use of the logarithm as a function rather than as a mere computational aid. Our journey takes us through the history of finding areas under curves before the calculus had been fully fleshed out.
The procedure of finding the area under a closed planar curve is called quadrature or squaring. This is because the area may be thought of as being composed of little squares, which when assembled together and summed, equal the area under the curve.
Pierre de Fermat in France had achieved great success in computing the areas under curves of the form \(y = x^n\).6 His method was to use a series of rectangles whose bases formed a geometric progression with common ratio \(r\) less than one, and which therefore converged to a finite sum which could be calculated. The one curve, though, that he could not handle was the rectangular hyperbola, which is really a pair of curves defined by \(y = \frac{1}{x}\). If Fermat applied his formula \[ \int x^n dx = \dfrac{x^{n+1}}{n+1} + C \] he faced the problem of division by zero when \(n = -1\), and the method failed.
Computing the area
It was one of Fermat’s contemporaries, Grégoire Saint-Vincent, who was known as the “circle-squarer”, who found a way to solve this problem. He also used intervals that were in a geometric progression, but he made an important discovery in the case of a hyperbola like \(y = \frac{1}{x}\).
Saint-Vincent started his integration at \(x = r^0 = 1\) and divided the area under the curve into intervals along the \(x\)-axis that were in a geometric progression as shown in Figure 5. He estimated the areas of the differently coloured strips, and found that they were equal in area to each other. This was Saint-Vincent’s profound and original contribution. How did he do this?
The account of Saint-Vincent’s method, as described below, has been drawn from several sources [8,9,11,12]. It has been simplified to use modern methods and terminology, while remaining faithful to the original in spirit and conception.
Consider Figure 6 which is Figure 5 redrawn to show how the unknown areas \(A_1\), \(A_2\), etc., may be approximated by the known areas \(T_1\), \(T_2\), etc., of the respective trapeziums that are shown. Note that the areas \(A_1\), \(A_2\) etc., are contiguous and non-overlapping.
Dashed lines like \(PQ\) connecting the points \(P\) and \(Q\) on the arc of the hyperbola, are drawn corresponding to \(x = 1\) and \(x = r\) respectively, to get a trapezium whose area, \(T_1\) is known exactly. The area of that trapezium is used to estimate the area \(A_1\), as explained below.
The point \(P(1, 1)\) lies on every rectangular hyperbola. Its \(x\)-coordinate represents the start of both the geometric progression and the interval of integration. In our case, the common ratio \(r > 1\) because we do not seek convergence. The initial \(x\)-value is shown as \(r^0 = 1\) on Figure 6.
\(Q(r, \frac{1}{r})\) also lies on the hyperbola. The straight line \(PQ\) is an approximation to the arc \(PQ\) on the hyperbola. The trapezium with heights of \(1\) and \(\frac{1}{r}\) and width \((r -1)\) represents a first approximation to the unknown area \(A_1\) shown in Figure 5. The known area of the trapezium, \(T_1\), is \[ \begin{aligned} T_1 &= \frac{1}{2}\left[\frac{1}{1} + \frac{1}{r}\right]\left[r - 1\right]\\ &= \frac{1}{2r}\left[ r^2 - 1\right]\\ &\approx A_1. \end{aligned} \]
Moving to the next trapezium with base between \(x = r\) and \(x = r^2\), we have \[ \begin{aligned} T_2 &= \frac{1}{2}\left[\frac{1}{r} + \frac{1}{r^2}\right]\left[r^2 - r\right]\\ &= \frac{r}{2r^2}\left[r + 1\right]\left[r - 1\right]\\ &= \frac{1}{2r}\left[ r^2 - 1\right]\\ &\approx A_2. \end{aligned} \]
This pattern of all the trapezium areas being the same was the remarkable observation of Saint-Vincent.
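Saint-Vincent's observation can be confirmed numerically. In the sketch below, the ratio \(r = 1.5\) is an arbitrary choice of mine; any \(r > 1\) gives the same pattern:

```python
# Numerical check of Saint-Vincent's observation: over a geometric partition
# of the x-axis, the trapeziums under y = 1/x all have the same area.
def trapezium_area(a, b):
    # parallel sides 1/a and 1/b, width b - a
    return 0.5 * (1 / a + 1 / b) * (b - a)

r = 1.5   # arbitrary common ratio, r > 1
areas = [trapezium_area(r ** k, r ** (k + 1)) for k in range(5)]
print(areas)   # all (essentially) equal to (r^2 - 1)/(2r)
```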
By repeatedly subdividing the intervals it may be shown that in the limit, the values of each of the \(T_i\) and \(A_i\) will become equal. We will henceforth use \(A\) to denote the single value shown as \(A_1\), \(A_2\), \(A_3\), etc., in Figure 5, Figure 6. Note that the lower limit of area summation is \(1\) in all cases. We may then tabulate the respective integrals, intervals of summation, and areas so [12]:
| Integral | Upper limit | Area |
|---|---|---|
| \(\displaystyle\int_1^{r^0} \frac{1}{x}\mathrm{d}x\) | \(r^0\) | 0 |
| \(\displaystyle\int_1^{r^1} \frac{1}{x}\mathrm{d}x\) | \(r\) | \(A\) |
| \(\displaystyle\int_1^{r^2} \frac{1}{x}\mathrm{d}x\) | \(r^2\) | \(2A\) |
| \(\displaystyle\int_1^{r^3} \frac{1}{x}\mathrm{d}x\) | \(r^3\) | \(3A\) |
| \(\displaystyle\int_1^{r^4} \frac{1}{x}\mathrm{d}x\) | \(r^4\) | \(4A\) |
And this is where the matter rested, until Alphonse Antonio de Sarasa—a student and later a colleague of Grégoire Saint-Vincent—took a look at the results, and realized that it was a mapping between a geometric and an arithmetic series, which meant that logarithms were involved.
A logarithm is a continuous real-valued function with the following two properties [14]:
\(\log(1) = 0\); and
\(\log(ab) = \log(a) + \log(b)\).
Let us see if the function \[ \displaystyle\int_1^{t} \frac{1}{x}\mathrm{d}x = \lambda(t) \] satisfies these two properties. By the first row of Table 3, property (a) is satisfied. Again, we have from Table 3 that \[ \begin{aligned} \int_1^{r^2} \frac{1}{x}\mathrm{d}x &= \int_1^{r} \frac{1}{x}\mathrm{d}x + \int_{r}^{r^2}\frac{1}{x}\mathrm{d}x\\ &= \int_1^{r} \frac{1}{x}\mathrm{d}x + \int_{1}^{r}\frac{1}{x}\mathrm{d}x\\ &= 2\int_1^{r}\frac{1}{x}\mathrm{d}x \end{aligned} \] where the second step uses Saint-Vincent’s result that the areas over successive geometric intervals are equal. In other words, \(\lambda(r^2) = 2\lambda(r)\). So, we may assert that the area under a hyperbola gives rise to a logarithm function: \[ \int_1^{t}\frac{1}{x} \mathrm{d}x = \lambda(t) = \log(t). \qquad{(13)}\] The only question now is, what is the base of the logarithm?
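The logarithm property of \(\lambda\) can be verified numerically. The sketch below uses a simple midpoint-rule integrator of my own, and compares the result against the natural logarithm:

```python
# Verify numerically that the area function λ(t) = ∫_1^t dx/x behaves like
# a logarithm: λ(ab) = λ(a) + λ(b).
import math

def hyperbola_area(t, steps=100_000):
    # midpoint-rule approximation of the area under y = 1/x from 1 to t
    h = (t - 1) / steps
    return sum(h / (1 + (i + 0.5) * h) for i in range(steps))

a, b = 2.0, 3.0
lhs = hyperbola_area(a * b)
rhs = hyperbola_area(a) + hyperbola_area(b)
print(lhs, rhs, math.log(a * b))   # all three agree closely
```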
The function that equals its own derivative7
An exponential function \(f\) to a base \(b\) is defined as \[ y = f(x) = b^x,\; x \in \mathbb{R} \]
Let us investigate the derivative of \(y = b^x\) using the definition so: \[ \begin{aligned} \dfrac{\mathrm{d}y}{\mathrm{d}x} &= \lim_{h \to 0}\dfrac{b^{(x+h)} - b^{x}}{h}\\ &= \lim_{h \to 0}\dfrac{b^{x}(b^{h} - 1)}{h}\\ &= b^{x} \lim_{h \to 0}\dfrac{b^{h} - 1}{h} \end{aligned} \] The limit on the right hand side may or may not exist. Let us assume for now that it does. Then we may set it to a value \(k\) and we then have the important relationship \[ \begin{aligned} \dfrac{\mathrm{d}y}{\mathrm{d}x} &= b^{x} \lim_{h \to 0}\dfrac{b^{h} - 1}{h}\\ &= kb^x \end{aligned} \qquad{(14)}\] which means that the derivative of an exponential function at any point is proportional to the value of the function itself, at that point.
The next question is this: is there any value of \(b\) for which the constant of proportionality equals one? That would give us a function whose value at any point equals its derivative at that point. Let us investigate.
For finite \(h\), we set the expression inside the limit on the RHS of Equation 14 to 1, i.e., \[ \dfrac{b^{h} - 1}{h} = 1 \qquad{(15)}\] If this expression were identically equal to 1, then we may assert that \[ \displaystyle \lim_{h \to 0}\dfrac{b^{h} - 1}{h} = 1 \]
Solving Equation 15 for \(b\), we get \[ b^{h} = 1 + h \] and taking “roots” on either side, \[ b = \left( 1 + h \right)^{\frac{1}{h}} \qquad{(16)}\] Since Equation 16 has been “derived” from Equation 14, taking the limit as \(h\) tends to zero for either should give equivalent results. That is, the value of \(b\) that makes an exponential function its own derivative is also the value of \(b\) that results from \[ b = \displaystyle \lim_{h \to 0}\left( 1 + h \right) ^{\frac{1}{h}} \qquad{(17)}\] If in Equation 17 we replace \(\frac{1}{h}\) by \(m\), and note that \(h \to 0\) is equivalent to \(m \to \infty\), we may re-write Equation 17 as \[ b = \lim_{m \to \infty}\left( 1 + \dfrac{1}{m} \right)^{m} \qquad{(18)}\] But we know from Equation 12 that the limit in Equation 18 is by definition equal to \(e\).
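A quick numerical check of Equation 18 (my own sketch, not part of the original derivation):

```python
import math

approx = 0.0
for m in (10, 1_000, 100_000, 10_000_000):
    approx = (1 + 1 / m) ** m
    print(m, approx)
# the printed values climb toward e = 2.718281828...
```

The convergence is slow, which is why \(e\) is computed in practice from its series rather than from this limit, but the trend toward `math.e` is unmistakable.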
It has been a bit of a hard slog, but we can now confidently say that the unique function that is its own derivative and anti-derivative is the exponential function with base \(e\). Indeed, this function is sufficiently important for it to be called the natural exponential function or the exponential function, as we have already seen.
So, when we talk of the exponential function, we mean \[ \exp({x}) = e^x, \quad x \in \mathbb{R} \]
Let us see where the foregoing leads to. Let \(y = e^x\). Then \[ \begin{aligned} \dfrac{\mathrm{d}y}{\mathrm{d}x} &= \dfrac{\mathrm{d}}{\mathrm{d}x}\left( e^x \right)\\ &= e^x\\ &= y \end{aligned} \qquad{(19)}\] If one takes reciprocals on either side of Equation 19, one gets \[ \begin{aligned} \dfrac{\mathrm{d}x}{\mathrm{d}y} &=\dfrac{1}{e^x} \\ &= \dfrac{1}{y},\; \text{ i.e.,} \\ \mathrm{d}x &= \dfrac{\mathrm{d}y}{y},\; \text{ leading to} \\ x &= \int\dfrac{1}{y}\mathrm{d}y \end{aligned} \qquad{(20)}\] This appears similar to the equation for the area under the rectangular hyperbola given in Equation 13. But what is \(x\) in terms of \(y\)?
The natural exponential and logarithmic functions
The natural logarithm function is that logarithm function that has \(e\) as its base. In a generic fashion, one may write it as \(\log_{e}\) but the accepted convention is to refer to it as \(\ln\).8 Now \(\ln\) is the inverse of the exponential function, \(\exp\), which means, \[ \begin{aligned} \ln{(\exp{(x)})} &= \ln{e^x} = x,\quad x \in \mathbb{R}\; \text{ and conversely}\\ \exp(\ln(x)) &= e^{\ln(x)} = x,\quad x \in (0, \infty). \end{aligned} \] Note carefully that because \(\exp{(x)}\) is strictly greater than zero for real \(x\), the domain of the natural logarithm function (and indeed of all logarithm functions) is \((0, \infty)\). Inverse functions are reflections of each other on the line \(y = x\) on the Cartesian plane. This is illustrated for \(\exp{(x)}\) and \(\ln{(x)}\) in Figure 7.
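As a small illustration (my code, not the blog’s), Python’s `math` module spells the natural logarithm `log`, exactly the convention footnoted above, and the inverse relationships hold on the stated domains:

```python
import math

# In Python, math.log is the natural logarithm ln, and math.exp its inverse.
for x in (-2.0, 0.5, 3.0):
    assert math.isclose(math.log(math.exp(x)), x)   # ln(exp(x)) = x for all real x
for x in (0.1, 1.0, 10.0):
    assert math.isclose(math.exp(math.log(x)), x)   # exp(ln(x)) = x on (0, inf)
print("ln and exp verified as an inverse pair")
```

Note that `math.log(0.0)` raises an error, mirroring the fact that the domain of the logarithm is \((0, \infty)\).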
We are now in a position to answer the question asked at the end of the section Computing the area: what is the base of the logarithm that measures the area under the hyperbola? The base is \(e\), and we may write: \[ \int_{1}^{t} \dfrac{1}{x} \mathrm{d}x = \ln t \qquad{(21)}\]
One may then use Equation 12 to define the \(\exp\) function and Equation 21 to define the \(\ln\) function, with the knowledge that they are an inverse function pair.
One might wonder if there is a geometrical significance to the number \(e\) like there is for \(\pi\) as the ratio of the circumference to the diameter of a circle. Think about this for a while before reading on. As before, the conics hold the answer.
Substituting \(t = e\) in Equation 21, we get \[ \int_{1}^{e} \dfrac{1}{x} \mathrm{d}x = \ln(e) = 1. \qquad{(22)}\] This equation comes closest to stitching \(e\) to geometry, but it uses the thread of calculus! Mark how the number \(1\) also plays a prominent role here.
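One can even turn Equation 22 around and recover \(e\) by pure area-hunting: find the upper limit \(t\) for which the area under \(1/x\) equals exactly 1. A sketch of mine (midpoint rule plus bisection; the step counts are arbitrary choices):

```python
import math

def area(t, n=50_000):
    """Midpoint-rule estimate of the area under 1/x from 1 to t."""
    h = (t - 1) / n
    return sum(1.0 / (1 + (i + 0.5) * h) for i in range(n)) * h

# Bisection: the area is 0.693... at t = 2 and 1.098... at t = 3,
# so the t with area exactly 1 lies in between.
lo, hi = 2.0, 3.0
for _ in range(40):
    mid = (lo + hi) / 2
    if area(mid) < 1.0:
        lo = mid
    else:
        hi = mid
print(lo)  # ~2.718281..., i.e., e emerges from area-finding alone
```

No exponentials or logarithms were called anywhere; \(e\) falls out of the geometry of the hyperbola, which is as close to a “geometric definition” of \(e\) as one gets.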
Logarithms and dynamic range compression
Our human senses of sight and hearing each have enormous dynamic ranges. The eye can respond to light intensities across 13 orders of magnitude.9 Likewise our ears can hear sound intensities ranging from whispers to explosions, across 12 orders of magnitude.
Think of a weighing scale: its dial typically ranges from, say, 0 kg to perhaps 150 kg. Most instruments only have a limited range over which they can measure. To increase the range, you may have to switch the input to another scale before making the measurement. How then do our ears and eyes accommodate such large dynamic ranges without the need for any form of switching?
The answer lies with logarithms. Logarithms naturally compress a large linear range to a more compact one. This would be clear from the graph of the logarithm function plotted in Figure 7.
There is a “law”, first propounded by the German physiologist Ernst Heinrich Weber, that the “just noticeable difference” (JND) that human beings experience for any physiological stimulus is related to the existing stimulus by the differential equation \[ \mathrm{d}s = k \dfrac{\mathrm{d}W}{W} \] where \(\mathrm{d}s\) is the JND, \(W\) the stimulus already present, and \(\mathrm{d}W\) the stimulus increase. The German physicist Gustav Theodor Fechner popularized Weber’s hypothesis, which leads to the solution \[ s = k \ln W + C \] This is referred to as the Weber-Fechner law, but it is really only a hypothesis that has not achieved the status of a theory, much less a law, especially because it deals with subjective sensation and perception.
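To see the compression at work, note that the solution \(s = k \ln W + C\) turns equal *ratios* of stimulus into equal *increments* of sensation. A small check (the constants \(k\) and \(C\) here are hypothetical placeholders):

```python
import math

k, C = 1.0, 0.0  # hypothetical Weber-Fechner constants

def sensation(W):
    """Perceived sensation for stimulus W per s = k*ln(W) + C."""
    return k * math.log(W) + C

# Doubling the stimulus adds the same sensation increment,
# whether the starting stimulus is tiny or enormous:
increments = [sensation(2 * W) - sensation(W) for W in (1.0, 10.0, 500.0)]
print(increments)  # every entry is k*ln(2) ~ 0.693
```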
The important lesson for us is that logarithmic compression allows very large dynamic ranges to be accommodated, without input sensor switching. Logarithmic scales abound in the natural sciences and engineering: the pH scale for acidity, the Richter scale for measuring earthquake intensities, and the decibel scale for sound intensity, or for signal voltage, and power in electrical engineering, to name just a few.
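For instance, the decibel scale mentioned above compresses the ear’s twelve orders of magnitude of intensity into a convenient 0 to 120 range; a one-line sketch:

```python
import math

def decibels(intensity_ratio):
    """Sound intensity level in dB, relative to a reference intensity."""
    return 10 * math.log10(intensity_ratio)

# Twelve orders of magnitude collapse onto a 0-120 dB scale:
for ratio in (1, 10**6, 10**12):
    print(ratio, decibels(ratio))  # 0, 60, and 120 dB respectively
```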
Why is \(e\) important?
We have now reached the stage where we can answer the question, “Why is \(e\) important?”
The number \(e\)’s claim to fame rests on the remarkable properties of the exponential function \(\exp(x)\), which has the unique distinction of being its own derivative and anti-derivative. Stated formally, \[ \dfrac{\mathrm{d}}{\mathrm{d}x}\exp(x) = \exp(x) \qquad{(23)}\] and, \[ \int\exp(x) \mathrm{d}x = \exp(x) + C \qquad{(24)}\] where \(C\) is an arbitrary constant of integration. Nature is full of systems that can be modelled using this property of exponentials. In addition, the exponential and logarithmic functions are a formidable inverse mathematical pair, as we have seen in this blog.
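Equation 23 is easy to check numerically with a symmetric finite difference (a sketch; the step size is an arbitrary choice of mine):

```python
import math

def central_diff(f, x, h=1e-6):
    """Symmetric finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (-1.0, 0.0, 2.0):
    d = central_diff(math.exp, x)
    print(x, d, math.exp(x))  # the derivative matches the function value
```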
In a succeeding blog, we will see that \(e\) is the natural bridge between the real and complex domains—a connection that has given rise to some very powerful mathematics.
Acknowledgements
Thanks are due to Wolfram Alpha, Desmos, and the various AI bots, too numerous to mention, for assistance at various stages of preparation of this blog.
Feedback
Since I work independently and alone, there is every chance that unintentional mistakes have crept into this blog, due to ignorance or carelessness. Therefore, I especially appreciate your corrective and constructive feedback.
Please email me your comments and corrections.
A PDF version of this article is available for download here:
References
I later found that this link is a chapter from a draft of the book with the charmingly alliterative title Amazing and Aesthetic Aspects of Analysis [2] where it is now chapter 8.↩︎
The precursor called chaturanga was invented in India around the 600s.↩︎
This word has both a common and a mathematical meaning. Can you reconcile the two?↩︎
We touched upon this idea in the blog Varieties of Multiplication.↩︎
It is often erroneously believed that Napier used \(e\) as the base of his logarithms, but we know that his “base” was less than 1 and was indeed \(\frac{1}{e}\).↩︎
These are our monomial power functions.↩︎
Beginning with the heading, this section, more than others, is heavily borrowed from Eli Maor’s excellent text e: The Story of a Number [9].↩︎
Mathematical conventions and practice might change. Programming languages might use \(\log\) instead of \(\ln\). Beware! You have been forewarned.↩︎
An order of magnitude conventionally means a power of ten. Two orders of magnitude thus refers to a ratio between two quantities that is either one hundred or one hundredth.↩︎