Eigenvalues and Eigenvectors: Why are they important?
2015-12-13 | 2020-12-01
Estimated Reading Time: 16 minutes
Stimulating interest in an arcane topic
A university academic friend of mine recently remarked that it was not easy to motivate students to study eigenvalues and eigenvectors, let alone appreciate their importance: the subject itself was abstract, and the applications tended to be domain-specific and somewhat arcane.
A cursory Web search turned up results that confirmed his assertions and concerns. Opinions differ on how best to motivate students to learn about eigenvalues and eigenvectors, and especially on how to convey their importance intuitively.
I then asked, "Can I explain to myself what eigenvalues and eigenvectors are, and why they are important?". It also occurred to me that the harried and hurried students of today might derive some benefit from my efforts; hence this blog. It is a brief, largely qualitative, and mathematically non-rigorous article on eigenvalues and eigenvectors that aims to provide meaning and motivation for their study. Corrections and suggestions for improvement are most welcome. 🙂
Eigenvalues and eigenvectors
As a general rule, the more powerful an idea, the more prevalent it becomes. Think about words and numbers, and you will see what I mean.
Eigenvalues and eigenvectors are one such powerful idea. It is no surprise that they appear in different guises in different contexts: in oscillating electronic circuits, in dynamical systems, in computer games, in the spectra of atoms, and in Google searches, to name just a few.
The word eigen is German in origin and means "inherent, characteristic, natural, own, or peculiar (to)". So the prefix "eigen" captures the natural essence of the noun it qualifies. Perhaps the word "idiosyncratic" comes closest to conveying its import.
Matrices
Eigenvalues and eigenvectors are associated traditionally with matrices. If numbers are like tea-leaves, matrices are like tea-bags. They are rectangular arrays of numbers, whether real or complex, that have been hijacked by mathematicians to serve as a shorthand in a variety of contexts. What they mean depends on context and level of abstraction. They can represent geometric transformations in Euclidean space, or systems of linear equations, or systems of linear differential equations with constant coefficients, or linear transformations in vector spaces. Note the recurrence of the word linear here.
Invariance and identity elements
Invariance is a central concept in mathematics and physics. Adding zero to a number leaves it unchanged. Multiplying a number by one again leaves it unchanged. And zero and one are important numbers, usually called the additive and multiplicative identity elements respectively. Consider now the matrix equivalent of multiplying by one. The identity matrix $I$ leaves any vector $\mathbf{x}$ unchanged:

$$I\mathbf{x} = 1\,\mathbf{x} \tag{2}$$

Equation 2 is a particular case of the general equation for eigenvalues and eigenvectors, which is written:

$$A\mathbf{x} = \lambda\mathbf{x}$$

where $A$ is a square matrix, $\mathbf{x}$ is a non-zero vector called an eigenvector of $A$, and the scalar $\lambda$ is its associated eigenvalue.
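To make the defining relation $A\mathbf{x} = \lambda\mathbf{x}$ concrete, here is a minimal numerical check in pure Python. The matrix and eigenpairs below are my own illustrative choices, not taken from the article:

```python
# Check A v = lambda v for a small illustrative matrix.
# A = [[2, 1], [1, 2]] has eigenvalues 3 and 1, with
# eigenvectors [1, 1] and [1, -1] respectively.

def matvec(A, v):
    """Multiply a matrix (given as a list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[2, 1], [1, 2]]

v1, lam1 = [1, 1], 3
v2, lam2 = [1, -1], 1

assert matvec(A, v1) == [lam1 * x for x in v1]   # A v1 = 3 v1
assert matvec(A, v2) == [lam2 * x for x in v2]   # A v2 = 1 v2
print("both eigenpairs verified")
```

Note that multiplying by $A$ merely stretches each eigenvector; it never rotates it.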
Calculus
The operation of taking a derivative may be denoted by the differential operator, $D$. Note that $D\,e^{\lambda t} = \lambda\,e^{\lambda t}$: the exponential $e^{\lambda t}$ passes through the operator unchanged except for a scalar factor, so it is an eigenfunction of $D$ with eigenvalue $\lambda$.
Differential Equations
Linear homogeneous differential equations with constant coefficients may be written using the operator $D$ as a polynomial in $D$ acting on the unknown function: $p(D)\,y = 0$. Substituting the trial solution $y = e^{\lambda t}$ converts the operator polynomial $p(D)$ into the characteristic polynomial $p(\lambda)$.
The roots of this characteristic polynomial give us the eigenvalues of the system. Perhaps the prefix "eigen" came to be used because of the adjective "characteristic".
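As a worked instance of my own (not from the original text), consider $y'' - 3y' + 2y = 0$, i.e., $(D^2 - 3D + 2)\,y = 0$. Substituting $y = e^{\lambda t}$:

```latex
(D^2 - 3D + 2)\,e^{\lambda t}
  = (\lambda^2 - 3\lambda + 2)\,e^{\lambda t} = 0
  \;\Longrightarrow\; (\lambda - 1)(\lambda - 2) = 0
  \;\Longrightarrow\; \lambda \in \{1, 2\}
```

so the general solution is $y = c_1 e^{t} + c_2 e^{2t}$: the eigenvalues $1$ and $2$ set the natural growth rates of the system.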
These ideas masquerade under different terminology in linear system and control theory, where transfer functions, poles and zeros, natural frequency and resonance, and stability are encountered.
Characteristic polynomial of a square matrix
The characteristic polynomial of a square matrix is obtained likewise. The equation $\det(A - \lambda I) = 0$, called the characteristic equation, has the eigenvalues of $A$ as its roots; for each root $\lambda$, the non-zero solutions of $(A - \lambda I)\mathbf{x} = \mathbf{0}$ are the corresponding eigenvectors.
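For the 2×2 case the expansion can be carried out by hand: the trace and determinant of the matrix appear as the coefficients of the characteristic polynomial. The sketch below (helper name and example matrix are my own) solves the resulting quadratic:

```python
import math

# For a 2x2 matrix A = [[a, b], [c, d]], det(A - lambda*I) expands to
# lambda^2 - (a + d)*lambda + (a*d - b*c): the trace and determinant
# are the coefficients of the characteristic polynomial.

def char_poly_roots_2x2(A):
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = math.sqrt(trace ** 2 - 4 * det)    # assumes real eigenvalues
    return (trace + disc) / 2, (trace - disc) / 2

A = [[2, 1], [1, 2]]
print(char_poly_roots_2x2(A))    # the two eigenvalues: (3.0, 1.0)
```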
Linear transformations and vector spaces
A vector space is a powerful mathematical abstraction that allows us to unify many disparate branches of mathematics under a uniform taxonomy. Linear transformations are a particular type of mapping between two vector spaces over a scalar field, satisfying:

$$T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v}) \qquad\text{and}\qquad T(\alpha\mathbf{u}) = \alpha\,T(\mathbf{u})$$

As a case in point, let us say $T$ is a linear transformation from a vector space $V$ into itself. If, in addition, there is a non-zero vector $\mathbf{v}$ in $V$ and a scalar $\lambda$ such that $T(\mathbf{v}) = \lambda\mathbf{v}$, then $\mathbf{v}$ is an eigenvector of $T$ and $\lambda$ is its associated eigenvalue.
The applications of eigenvalues and eigenvectors in linear algebra run far and deep. Suffice it here to merely mention that an extension, fortuitously called spectral theory, even explains the observed spectra of atoms in quantum theory!
A property of eigenvectors
I will here belabour a point that might seem blindingly obvious to some but frustratingly obscure to others. Let $\mathbf{v}$ be an eigenvector of a matrix $A$ with eigenvalue $\lambda$. Then any non-zero scalar multiple $c\mathbf{v}$ is also an eigenvector with the same eigenvalue, since $A(c\mathbf{v}) = c\,A\mathbf{v} = c\lambda\mathbf{v} = \lambda(c\mathbf{v})$. An eigenvector is therefore determined only up to scale: it is the direction that is characteristic, not the length.
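The point in question: any non-zero scalar multiple of an eigenvector is again an eigenvector with the same eigenvalue. A small mechanical check (the matrix and vector are my own illustrative choices):

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[2, 1], [1, 2]]
v = [1, 1]            # eigenvector of A with eigenvalue 3

# Any non-zero scalar multiple c*v is also an eigenvector with the
# same eigenvalue: A(cv) = c(Av) = c(3v) = 3(cv).
for c in (2, -1, 0.5):
    cv = [c * x for x in v]
    assert matvec(A, cv) == [3 * x for x in cv]
print("scalar multiples remain eigenvectors")
```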
Worked example
A worked example would normally have made its way here at this point in the article. But because the example is long and might not interest everyone, I have relegated it to the end of the article. Stay tuned if you are enthused.
Resources
I hope that this article has not been so brief as to be cryptic and off-putting. To those in search of greater rigour or a more formal exposition, I would recommend a good linear algebra textbook. The venerable tome that I used at university went by the acronym "KKOP" after the initials of the surnames of the four authors, Kreider, Kuller, Ostberg, and Perkins [1]. Unfortunately, it is out of print, but as a consolation, Figure 1 is an image of my copy. 🙂
For something more contemporary, I would recommend the textbooks and lectures of Professor Gilbert Strang of MIT. They are attuned to those who apply mathematics, like engineers and scientists. There is an archived video of his lecture on eigenvalues and eigenvectors. There are also links to his MIT OpenCourseWare (OCW) page for Course 18.06 of Spring 2010, his linear algebra textbook home page [2], and his academic home page.
Many academics make their lecture notes freely available online: search for them. YouTube videos of lectures are another source of information and knowledge, offering the immediacy of a classroom lecture with the convenience of instant rewind in case you need to catch something you missed.
Online forums offer a slightly more interactive learning experience but, again, their depth and quality vary. Mathematics Stack Exchange and Quora are two sites that you might explore.
Examples of all the above types of resources have been tucked away within the various links in this article: try them out to get a flavour of what is available.
Importance and applications
If, after all this, you are still unconvinced about the utility of eigenvalues and eigenvectors, think of this analogy. Crystals have natural cleavage planes that allow them to be fractured easily along specific directions. This exploits the symmetry in the crystals. Likewise, eigenvalues and eigenvectors exploit the naturally occurring symmetries of mathematical structures and transformations to allow us to view them more simply and insightfully. Without eigenvalues and eigenvectors, we would have neither radios nor lasers.
To get an idea of the broad sweep of eigenvalues and their applicability, I strongly recommend that you read a charming article entitled "Favourite Eigenvalue Problems". Another article that takes a breezy look at the subject of this writeup is "What the Heck are Eigenvalues and Eigenvectors?". It has a disputed explanation (see comments on the article) of how a bridge collapsed, so take that cum grano salis. It also contains a link to a PDF paper interestingly entitled "The 25,000,000,000.00 Dollar Eigenvector: The Linear Algebra Behind Google", which, in good faith, I think is not a spoof! Indeed, the citation to the original Stanford InfoLab technical report and the actual report are both available online.
Worked example: modelling weather with a transition matrix
Now for the promised example of eigenvalues at work, in a simplified real-life situation: modelling the weather. Let us assume that yesterday's weather influences the probability of today's weather, and today's weather influences the probability of tomorrow's weather. Each day's weather depends only on the previous day's weather, i.e., the weather has a "memory" of one day.
To keep it simple, let us have only three weather states: sunny,
cloudy, and rainy, with the stipulation that each day can only be
one of these three. Further, in our matrix, let the ordering be
sunny, cloudy, and rainy, both left to right, and top to bottom. Then,
the column headings represent today's weather and the row headings represent tomorrow's weather. We then have the state-transition matrix, or Markov matrix, for this model.
Note that each column of the transition matrix sums to one: whatever today's weather, tomorrow's weather must be one of the three states, so each column is a probability distribution.
Let the column-vector whose entries are the probabilities of today's weather being sunny, cloudy, and rainy, in that order, be the initial weather vector.
We want to know whether, for this model, there will be an equilibrium or steady state in the weather, represented by a probability vector whose values remain steady with temporal evolution. The question is: how do we find that out?
One obvious way is to compute the downstream weather one day at a
time: think of forging a chain one link at a time because the weather
has a memory of only one day. From Equation 9 we can compute each successive day's weather vector from the previous one. By induction, the weather vector after $n$ days is the initial weather vector multiplied by the $n$th power of the transition matrix.
In this manner, we can trace the time evolution of the weather and,
if desired, draw a three-dimensional parametric plot of the successive
weather vectors in three-dimensional space.
A rough and ready method would be to evaluate Equation 11 with a suitably large exponent, say fifty or one hundred, and check whether the resulting weather vector has stopped changing.
But computing the fiftieth or one-hundredth power of a matrix is tedious and error-prone if done by hand, and computationally expensive if done by machine, especially if the matrix in question is large.
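The day-by-day iteration itself is easy to sketch. The 3×3 column-stochastic matrix below is a hypothetical stand-in of my own (the article's actual numbers are not reproduced here); each column sums to one, as any Markov matrix requires:

```python
# Hypothetical Markov matrix: entry P[i][j] is the probability that
# tomorrow's weather is state i given that today's is state j
# (states ordered sunny, cloudy, rainy). Each column sums to 1.
P = [[0.6, 0.3, 0.2],
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.5]]

def step(P, w):
    """One day of evolution: multiply the weather vector by P."""
    return [sum(P[i][j] * w[j] for j in range(3)) for i in range(3)]

w = [1.0, 0.0, 0.0]          # start from a certainly-sunny day
for _ in range(50):          # fifty links in the chain
    w = step(P, w)

print([round(x, 4) for x in w])    # → [0.3889, 0.3333, 0.2778]
assert all(abs(a - b) < 1e-9 for a, b in zip(w, step(P, w)))  # fixed point
```

For this particular matrix the vector settles to roughly (7/18, 1/3, 5/18) regardless of fluctuations early in the chain.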
To devise a better solution, we need to digress briefly to examine diagonal matrices and the diagonalization of square matrices.
Diagonal matrix raised to a power
Suppose that $\Lambda$ is a diagonal matrix. Observe that $\Lambda^n$ is again diagonal, and its diagonal entries are simply the $n$th powers of the corresponding entries of $\Lambda$: powering a diagonal matrix is easy. If we could somehow decompose our transition matrix into a product in which the only matrix that must be raised to a power is diagonal, we would have a computational shortcut.
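A quick check of this fact, in a pure-Python sketch (the matrices are my own examples):

```python
# Raising a diagonal matrix to the n-th power only requires powering
# its diagonal entries, versus n-1 full matrix multiplications.

def diag_power(d, n):
    """n-th power of diag(d), returned as a full matrix."""
    size = len(d)
    return [[d[i] ** n if i == j else 0 for j in range(size)]
            for i in range(size)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

D = [[2, 0], [0, 3]]
brute = D
for _ in range(4):            # D^5 by repeated multiplication
    brute = matmul(brute, D)

assert brute == diag_power([2, 3], 5)     # [[32, 0], [0, 243]]
print(diag_power([2, 3], 5))
```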
Matrix diagonalization or eigen decomposition
We need to diagonalize the transition matrix, a procedure called eigen decomposition. A square matrix with non-repeating eigenvalues, and therefore linearly independent eigenvectors, can be diagonalized. We demonstrate how this is done for the $3 \times 3$ transition matrix of our weather model.
Call the transition matrix $A$. Collect its three eigenvectors as the columns of a matrix $S$, and its three eigenvalues along the diagonal of a matrix $\Lambda$. Then $A = S \Lambda S^{-1}$, and therefore $A^2 = S \Lambda S^{-1} S \Lambda S^{-1} = S \Lambda^2 S^{-1}$.
By induction, $A^n = S \Lambda^n S^{-1}$ for every positive integer $n$: only the diagonal matrix $\Lambda$ need be raised to the power $n$.
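Here is a 2×2 sanity check of the identity that only the diagonal factor need be powered. The matrix, its eigenvectors, and the inverse below are my own worked example, not the weather matrix:

```python
# Eigen decomposition A = S * L * S^{-1} turns powers of A into powers
# of the diagonal matrix L: A^n = S * L^n * S^{-1}.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 1], [1, 2]]                  # eigenvalues 3 and 1
S = [[1, 1], [1, -1]]                 # columns are the eigenvectors
S_inv = [[0.5, 0.5], [0.5, -0.5]]     # inverse of S
n = 5

Ln = [[3 ** n, 0], [0, 1 ** n]]       # powering the diagonal is cheap
An_fast = matmul(matmul(S, Ln), S_inv)

An_slow = A                           # brute force for comparison
for _ in range(n - 1):
    An_slow = matmul(An_slow, A)

assert An_fast == An_slow
print(An_fast)                        # → [[122.0, 121.0], [121.0, 122.0]]
```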
Software Implementation
To get numerical results, I initially tried implementing the above steps with the free open-source mathematics software system SageMath, but found it less than convenient for my purpose.
I then experimented with GNU Octave, which is a free, platform-neutral, open source, high-level interpreted language, primarily intended for numerical computations. It was better suited to the task at hand, and I easily obtained the results discussed below.
The self-explanatory file, weather.m, may be downloaded and executed on the command line in the Octave command window. The discussion below will make better sense after you have thus executed the file weather.m. Instructions on how to download and set up Octave are given here.
Discussion of results from weather.m
The roots of the characteristic polynomial of the transition matrix are its eigenvalues. There are three distinct eigenvalues for the transition matrix: one of them is exactly $1$, as must be the case for a Markov matrix, and the other two have magnitudes less than one. From Equation 16 we may surmise that the contributions from the two smaller eigenvalues decay towards zero as the number of iterations grows, leaving only the component associated with the eigenvalue $1$.
The eigenvectors associated respectively with these eigenvalues, as spewed out by Octave, may be read off from the program's output.
None of the columns of these eigenvectors sums to one. Indeed, Octave returns eigenvectors normalized to unit Euclidean length, whereas a vector of probabilities must have entries that sum to one.
Assembling the eigenvector matrix, the diagonal matrix of eigenvalues, and the inverse of the eigenvector matrix, we obtain the eigen decomposition of the transition matrix.
The time evolution of the initial weather vector is then tracked with
1, 10, 20, 50, and 100 iterations of Equation 9. In this case, the weather vector stabilizes
after about twenty iterations to a steady-state vector whose entries thereafter remain unchanged.
When we track the same temporal evolution starting from the eigenvector associated with the eigenvalue $1$, it too remains fixed from one iteration of Equation 9 to the next.
What may be disconcerting, though, is that we now seem to have two steady-state vectors: the one reached by iteration and the eigenvector itself.
Observe, however, that if we rescale the eigenvector so that its entries sum to one, we recover the iterated steady-state vector exactly. Lo and behold! The two differ only by a scalar multiple, which, as we saw earlier, leaves an eigenvector essentially unchanged.
We do not bother normalizing the eigenvectors associated with the other two eigenvalues, since their contributions decay away under repeated iteration.
To round things off, we substitute a random initial weather vector in place of the original one, and find that the iteration converges to the very same steady-state vector in about two weeks of simulated days.
This means that regardless of what initial weather vector we start with, in about two weeks we will end up with a vector that represents the steady-state.
Observations like these suggest that our inferences are only as good as our assumptions and models. Oversimplification could lead to absurd results, and weather prediction over time is a seriously non-trivial problem.
One general hypothesis that we could examine is whether the eigenvector associated with an eigenvalue of one, normalized so that its entries sum to one, always represents the steady state of a Markov transition matrix.
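One way to probe this hypothesis numerically is to iterate from a random probability vector and confirm that the limit is a fixed point (an eigenvector of eigenvalue one) whose entries still sum to one. The matrix here is the same hypothetical stand-in used earlier, not the article's:

```python
import random

# Hypothesis check (illustrative matrix, not the article's): iterating
# a column-stochastic matrix from any probability vector should
# converge to the eigenvalue-1 eigenvector, normalized so that its
# entries sum to one.
P = [[0.6, 0.3, 0.2],
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.5]]

def step(P, w):
    return [sum(P[i][j] * w[j] for j in range(3)) for i in range(3)]

random.seed(0)
raw = [random.random() for _ in range(3)]
w = [x / sum(raw) for x in raw]       # random initial probability vector

for _ in range(100):
    w = step(P, w)

assert all(abs(a - b) < 1e-9 for a, b in zip(w, step(P, w)))   # P w = w
assert abs(sum(w) - 1.0) < 1e-9       # still a probability vector
print([round(x, 4) for x in w])       # → [0.3889, 0.3333, 0.2778]
```

A single numerical experiment is of course suggestive rather than conclusive; the general statement is the province of the Perron-Frobenius theorem.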
Feedback
Please email me your comments and corrections.
A PDF version of this article is available for download here: