Why should we study probability?
Probability lies at the heart of nature, and the inner workings of a myriad
of phenomena and processes are unlocked to the human mind only by understanding
the fundamental principles of probability. In particular, the
entire fields of thermodynamics, statistical mechanics and quantum
theory are described by concepts and ideas from probability.
Probability is that part of mathematics which gives a precise
meaning to the idea of uncertainty, of not fully knowing the
occurrence of some event. We often hear that there is a good chance of a
shower; people bet with different odds, say 1 to 5, that
a football match will be won by some team and so on. In each case,
we are making a guess as to what will be the outcome of some
event. Probability theory is the science that quantifies our ignorance,
and spells out the likelihood of our guess being realized for a
particular case. Probability theory cannot address every category
of uncertainty. In particular, probability theory can say nothing
about uncertainties in events that cannot in principle be repeated.
For example, questions such as whether miracles are possible
are not amenable to quantitative study.
Rather, probability theory can only analyze those events which can
be repeated many times, and in such cases it can make
predictions such as what is the expected average value of an outcome, the
possible deviations from the expected average value and so on.
Probability theory is ideally suited, for
example, to address
questions related to games of chance - be it games
based on drawing from a pack of cards or on rolling a die - since
in such games the same
procedure is followed time and again. Predictions can be made on,
for example, how many times, on average, one will obtain an ace in a hundred draws.
It should come as no surprise that the theory of probability,
developed in its modern form by Pierre-Simon de Laplace, originated in his
analysis of games of chance.
Ideas from probability theory will turn out to be fundamental
in understanding concepts such as entropy and quantum theory.
In physics, there is a two-fold source of probability.
The most universal and important
definition of entropy is based on our understanding of the microscopic nature of matter.
Given that any piece of macroscopic matter is made out of around
$10^{23}$ atoms, a detailed knowledge of what each
atom is doing is no longer possible, and instead one needs to resort to a statistical
and probabilistic description.
Note that we need to resort to probability in understanding entropy
due to our ignorance,
or equivalently, due to our incomplete knowledge, of what is happening at the
microscopic level. Probability in
physics originating from ignorance is called classical probability.
In quantum theory,
the situation is quite different. Even when we describe a single quantum
particle, we need to resort, of necessity, to a probabilistic description
since it turns out that nature is inherently probabilistic. We
call such intrinsic uncertainty quantum probability.
To prepare ourselves to engage with the ideas of entropy and quantum theory,
we briefly review the underlying ideas of probability. The review
is rudimentary, and is focused on the concepts which will be
directly employed in later discussions.
Consider the simplest possible example, namely the tossing of a coin.
The outcome of a throw is uncertain, and can come up H (heads) or
T (tails). From the point of view of physics, this (classical) uncertainty is
due to our ignorance, since in principle we can
predict the outcome of the toss once we know exactly how the coin is tossed.
Since the outcome of a throw is uncertain, we say that the outcome is random.
A random variable is defined to be a quantity, such as the
position of a particle, which has many possible outcomes, together with the
likelihood of each outcome being specified. If an event is certain
to occur, its probability is taken to
be 1, and if an event is impossible, its probability is taken to
be 0. The likelihood of any event occurring, namely $P$, is always
positive and must lie between 0 (impossibility) and 1
(certainty), that is
\[ 0 \le P \le 1. \]
A random variable whose outcomes are discrete, such as that of
tossing a coin, is a discrete random variable.
The random outcome of tossing a coin is an example of a Bernoulli random
variable, which is defined to be a random variable with only two
outcomes, namely either H or T. Let $p$ be the likelihood that
H will appear, and let $q$ be the likelihood that T will occur.
We also have that the
result of every throw must either be H or T, and hence
\[ p + q = 1. \]
In summary, for the general case of
a biased coin we have the following:
\[ P(\mathrm{H}) = p, \qquad P(\mathrm{T}) = q = 1 - p. \]
We all know that if we have a fair coin, it is equally likely
that H or T may result from each throw. Hence, for a fair coin,
\[ p = q = \frac{1}{2}. \]
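To make this concrete, here is a minimal sketch in Python (our illustration; the function name `toss` and the chosen bias are not from the text) that simulates throws of a biased coin and checks that the observed frequencies approach $p$ and $q = 1 - p$:

```python
import random

def toss(p):
    """One Bernoulli trial: 'H' with probability p, 'T' with probability q = 1 - p."""
    return 'H' if random.random() < p else 'T'

p, N = 0.7, 100_000                      # a biased coin, tossed many times
tosses = [toss(p) for _ in range(N)]
freq_H = tosses.count('H') / N
print(f"observed H frequency: {freq_H:.3f}   (p     = {p})")
print(f"observed T frequency: {1 - freq_H:.3f}   (q=1-p = {1 - p:.1f})")
```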
Suppose we throw a coin $N$ times - or equivalently, throw $N$ identical coins once -
and ask: what is the
likelihood of obtaining $n$ heads,
regardless of the sequence in which they appear? We denote this probability by
$P_N(n)$. The probability of a particular sequence of
$n$ heads and $N - n$ tails - for example, the first $n$ outcomes being
heads and the rest being tails - is given by multiplying
the likelihood of every throw in the sequence, and yields
\[ p^n q^{N-n}. \qquad (4.3) \]
Recall we are interested in obtaining $n$ heads in $N$ throws
regardless of the sequence in which they appear. For example, all the
heads can occur in the last $n$ throws, as well as in
any other arrangement of $n$ heads among the $N$
throws. Hence, we need to find out how many
different ways $n$ heads can occur in $N$ throws, namely $\binom{N}{n}$,
and then multiply this by the probability
of obtaining a particular sequence, which is given by (4.3).
Example: Suppose we throw the coin $N = 3$
times, and we want to know, for example, how many times a single head
occurs. We have the following possible outcomes with exactly one head:
\[ \mathrm{HTT}, \qquad \mathrm{THT}, \qquad \mathrm{TTH}. \]
Clearly $\binom{3}{1} = 3$. In general, we can
make a Pascal triangle, as given in Figure 4.1, where the total number of boxes in
a given row denotes the total number of throws $N$. As one goes along a row of Pascal's
triangle, the entries count the number of ways of obtaining $n$ heads, which is given by $\binom{N}{n}$.
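The counting in this example can be verified by brute force. The following sketch (our own; it assumes Python 3.8+ for `math.comb`) enumerates all $2^N$ sequences of $N = 3$ tosses and counts those with a given number of heads, reproducing the entries of Pascal's triangle:

```python
from itertools import product
from math import comb

N = 3
outcomes = list(product('HT', repeat=N))      # all 2**N sequences of throws
for n in range(N + 1):
    count = sum(1 for seq in outcomes if seq.count('H') == n)
    print(f"{n} heads: {count} sequences;  C({N},{n}) = {comb(N, n)}")
```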
The result for the general case is given by the well-known binomial coefficient
\[ \binom{N}{n} = \frac{N!}{n!\,(N-n)!}. \]
Hence the probability of obtaining $n$ heads in $N$ throws is given by
\[ P_N(n) = \binom{N}{n}\, p^n q^{N-n}. \qquad (4.6) \]
A random variable having $n = 0, 1, 2, \ldots, N$ as its possible outcomes, with
the probability for
the outcomes given by eq. (4.6) above, is called a binomial random variable.
We have, using the binomial theorem, that
\[ \sum_{n=0}^{N} P_N(n) = \sum_{n=0}^{N} \binom{N}{n} p^n q^{N-n} = (p + q)^N = 1. \]
The result above is simply a statement that when we throw a coin $N$
times, we are certain that the outcome will either be 0 (no)
heads, or 1 head, or 2 heads, all the way to all heads, that is, $N$ heads.
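As a quick numerical check of eq. (4.6) and of the normalization just stated, this sketch (the function name `binomial_pmf` is our own) evaluates $P_N(n)$ for all $n$ and confirms that the probabilities sum to 1:

```python
from math import comb

def binomial_pmf(n, N, p):
    """Eq. (4.6): P_N(n) = C(N,n) * p**n * q**(N-n), with q = 1 - p."""
    return comb(N, n) * p ** n * (1 - p) ** (N - n)

N, p = 10, 0.5
probs = [binomial_pmf(n, N, p) for n in range(N + 1)]
print("P_N(n) for n = 0..N:", [f"{P:.4f}" for P in probs])
print("sum over all n:", sum(probs))   # equals (p + q)**N = 1
```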
For our example of tossing a fair coin ($p = q = 1/2$) three times, eq. (4.6) yields
\[ P_3(0) = P_3(3) = \frac{1}{8}, \qquad P_3(1) = P_3(2) = \frac{3}{8}. \]
Consider the important case of a fair coin, that is, when
$p = q = 1/2$; in this case, we have
\[ P_N(n) = \binom{N}{n} \left(\frac{1}{2}\right)^N \propto \binom{N}{n}. \qquad (4.11) \]
The crucial point to note above is that the proportionality
constant $2^{-N}$ is independent of $n$.
The formula (4.11) shows that when all of the elementary outcomes (the
individual sequences of throws) are equally likely, the probability that a
certain configuration will occur - in this case, the number of heads being equal to
$n$ - is proportional to the number of ways this configuration can occur,
namely $\binom{N}{n}$. This result will be very important later
in our understanding of the concept of entropy.
Throwing a coin $N$ times has a physical interpretation. Consider
a particle that can move in only one dimension.
Let us also assume the particle can move only a fixed distance $a$ at each step,
and can only move either to the right or to the left. To decide which way
the particle will move, we toss a coin. If the toss comes up with a
head, the particle moves to the right; if the toss comes up with a tail,
the particle moves to the left. In other words, the outcome is $+a$
with probability $p$, and the outcome is $-a$ with probability $q$.
In effect, the particle moves on a lattice
of equally spaced points. This process that the particle is
undergoing is called a random walk - also called
Brownian motion. A random walk is precisely
the way the molecules from an open bottle of perfume spread their smell
into the entire room, and is called a diffusion process.
Throwing the coin $N$ times
corresponds to taking $N$ steps. Suppose the particle starts
the random walk at the origin,
takes $n$ steps to the right, and the remaining $N - n$ steps to the left.
If $n$ is the number of heads and $N - n$ is the number of tails,
then clearly, the distance from the origin is
\[ x = \big[\,n - (N - n)\,\big]\,a = (2n - N)\,a. \]
There are many
paths which lead to the final position $x$, and these correspond
to the number of different ways that $n$ heads can come up in $N$ throws,
namely $\binom{N}{n}$.
[Figure 4.2: Different paths leading to a final position $x$]
For a particle undergoing a random walk, its position at every
point in its
$N$ steps is a random variable. An important tool for studying the
behavior of random variables is to compute the average values of
quantities of interest. For a function of the random variable $n$, say
$f(n)$, let us denote its average value by $E[f]$. We then have
\[ E[f] = \sum_{n=0}^{N} f(n)\, P_N(n). \]
The above expression means that the average value of $f$ is given
by summing the value of $f(n)$ for every outcome $n$, weighted
by the likelihood of that value occurring.
We have the following natural definition for the standard deviation $\sigma$:
\[ \sigma^2 = E\big[(n - E[n])^2\big] = E[n^2] - E[n]^2. \]
The two most important properties of any random variable are its average
and its standard deviation.
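A minimal sketch of this weighted-sum definition of the average, applied to the binomial random variable (the function names are our own); the comparison values $E[n] = Np$ and $\sigma = \sqrt{Npq}$ are the standard binomial results:

```python
from math import comb, sqrt

def binomial_pmf(n, N, p):
    """Eq. (4.6): probability of exactly n heads in N throws."""
    return comb(N, n) * p ** n * (1 - p) ** (N - n)

def average(f, N, p):
    """E[f] = sum over all outcomes n of f(n), weighted by P_N(n)."""
    return sum(f(n) * binomial_pmf(n, N, p) for n in range(N + 1))

N, p = 100, 0.5
mean = average(lambda n: n, N, p)
variance = average(lambda n: (n - mean) ** 2, N, p)
print(f"E[n]  = {mean:.4f}   (standard result N*p       = {N * p})")
print(f"sigma = {sqrt(variance):.4f}   (standard result sqrt(Npq) = {sqrt(N * p * (1 - p))})")
```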
Let us return to the random walk with $p = q = 1/2$.
The average position of the particle after $N$ steps is given by
\[ E[x] = a\,\big(2E[n] - N\big) = a\,\Big(2 \cdot \frac{N}{2} - N\Big) = 0, \qquad (4.21) \]
since $E[n] = Np = N/2$ for a fair coin. The reason we get zero is
because we have assumed equal probability to step to the right or
to the left. Hence, on average, the particle's steps on either side of the origin
cancel, with the average position being at the starting point.
However, we intuitively know that even though the average position
of the particle undergoing the random walk is zero, it will deviate
more and more from the origin as it takes more and more steps. The
reason is that every step the particle takes is random, and it is highly
unlikely that the rightward and leftward steps will exactly cancel
each other. The
importance of the paths that are far from the origin is measured
by the average value of the square of the position of the particle.
The reason this measure is useful is that
every time the particle deviates from the origin, be it in the right or
left direction, the square of the deviation is always positive. We hence
have the standard deviation given by
\[ \sigma^2 = E[x^2] - E[x]^2 = E[x^2] = N a^2, \qquad \text{that is,} \quad \sigma = a\sqrt{N}, \qquad (4.24) \]
since $E[x^2] = a^2 E[(2n - N)^2] = 4a^2\,\mathrm{Var}(n) = N a^2$, using
$\mathrm{Var}(n) = Npq = N/4$ for a fair coin.
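These results are easy to test by simulation. The sketch below (our own illustration; the step length $a$ and the number of walks are arbitrary choices) generates many $N$-step random walks and compares the measured $E[x]$ and $E[x^2]$ with $0$ and $Na^2$:

```python
import random

a = 1.0            # step length (arbitrary choice)
N = 400            # steps per walk
walks = 10_000     # number of independent walks

positions = []
for _ in range(walks):
    # each step is +a or -a with equal probability (fair coin)
    positions.append(sum(random.choice((+a, -a)) for _ in range(N)))

mean_x = sum(positions) / walks
mean_x2 = sum(x * x for x in positions) / walks
print(f"E[x]   = {mean_x:8.3f}   (expected 0)")
print(f"E[x^2] = {mean_x2:8.1f}   (expected N*a^2 = {N * a ** 2})")
```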
We have the important result from (4.21) and (4.24) that, since, for $p = q = 1/2$,
these results translate, via $x = (2n - N)a$, into $E[n] = N/2$ and
$\sigma_n = \sqrt{N}/2$, we have the following:
\[ \frac{\sigma_n}{N} = \frac{1}{2\sqrt{N}}. \qquad (4.26) \]
The equation above has an important interpretation. In any particular experiment, all
we can obtain is $n$ = the number of heads in $N$ trials. So how do we compute
$p$? We would like to set $p = n/N$, but there are errors
inherent in this estimate, since in any particular set of $N$ throws, we can get
any value of $n$,
which need not be equal to $Np$. In other words, what is the error
we make if we set $p = n/N$?
Eq. (4.26) tells us that for a fair coin, with
$p = 1/2$, if we compute
$n/N$, we have
\[ \frac{n}{N} \simeq \frac{1}{2} \pm \frac{1}{2\sqrt{N}}. \]
In other words, the estimate that we obtain for $p$ from our experiment
is, to within errors which are approximately
$1/\sqrt{N}$, equal to the actual value. The
point to note is that the errors that are inherent in any estimate
are quantified above, and go down as the square root of the number of throws.
In general, for any random variable with standard deviation given
by $\sigma$, the estimate for the probability $P$ of an outcome, where $n$ is the
number of times that the outcome has occurred in $N$ trials, is given by
\[ P \simeq \frac{n}{N} \pm \frac{\sigma}{\sqrt{N}}. \qquad (4.29) \]
In general, let $\bar{x}$
be an estimate of some quantity $\mu$
that has a standard deviation given by $\sigma$, and that is
derived from a sample of size $N$. The generalization of eq. (4.29) states
\[ \bar{x} \simeq \mu \pm \frac{\sigma}{\sqrt{N}}. \]
The relation of $\bar{x}$
with what it is estimating, namely $\mu$,
is graphically shown in Figure 4.3.
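The $1/\sqrt{N}$ decrease of the error can be seen numerically. This sketch (our illustration) estimates $p$ of a fair coin from samples of increasing size $N$ and compares the spread of the estimates with $\sigma/\sqrt{N} = 1/(2\sqrt{N})$:

```python
import random
from math import sqrt

p = 0.5
repeats = 100        # independent repetitions of the whole experiment
for N in (100, 10_000, 100_000):
    estimates = []
    for _ in range(repeats):
        n = sum(random.random() < p for _ in range(N))   # number of heads
        estimates.append(n / N)
    rms_error = sqrt(sum((e - p) ** 2 for e in estimates) / repeats)
    print(f"N = {N:>7}: rms error = {rms_error:.5f},  1/(2*sqrt(N)) = {0.5 / sqrt(N):.5f}")
```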
In addition to random variables taking discrete values, as has been the case with
the Bernoulli and binomial random variables, there are also random variables
that can take continuous values. A simple example of this would
be the height of an individual. If we go to a street and measure the
heights of pedestrians, we will find that the heights can take any value
from, say, 1 m to 2 m. In other words, the heights vary continuously from
person to person. Not knowing any better, we would assume that the heights of the
pedestrians are samples of a continuous random variable.
Let $x$ be a continuous variable that takes values in some
continuous interval with a minimum value of $a$ and a maximum value of
$b$, that is, $a \le x \le b$. The probability distribution function $P(x)$
is defined by the following: for a small interval $\epsilon$, we have
\[ \text{Probability that } x \text{ lies in } [x, x + \epsilon] \simeq P(x)\,\epsilon. \]
[Figure 4.3: Estimate $\bar{x}$ lies in a range around $\mu$]
The simplest continuous random variable, namely the uniform random variable
$u$, called $U(0,1)$, takes all values in
the continuous interval $[0, 1]$ with equal likelihood. Hence the probability
$P(u)$ for some event cannot depend on $u$, which leads to $P(u) = \text{constant}$.
We hence have, imposing the normalization $\int_0^1 P(u)\,du = 1$,
\[ P(u) = 1, \qquad 0 \le u \le 1. \]
The humble uniform random variable $u$, it turns out surprisingly, is one
of the most important random variables. The reason is that one can prove a
theorem that all random variables, be they discrete or
continuous, can be mapped into the uniform random variable. Hence,
in all numerical simulations, the computer has built-in algorithms
for generating $U(0,1)$, and one is then faced with the prospect of
generating the random variables of interest starting from $U(0,1)$.
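One standard way to realize this mapping is inverse transform sampling: if $u$ is distributed as $U(0,1)$ and $F$ is the cumulative distribution function of the target variable, then $F^{-1}(u)$ has the target distribution. A sketch for the exponential distribution (our choice of target; the text does not single one out):

```python
import random
from math import log

def exponential_from_uniform(rate):
    """Map a U(0,1) sample to an exponential sample via the inverse CDF.

    The exponential CDF is F(x) = 1 - exp(-rate*x), so its inverse is
    F^{-1}(u) = -ln(1 - u)/rate.
    """
    u = random.random()              # a sample of U(0,1)
    return -log(1.0 - u) / rate

rate = 2.0
samples = [exponential_from_uniform(rate) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(f"sample mean = {mean:.4f}   (exact mean of the exponential: 1/rate = {1 / rate})")
```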
A random variable which has wide applications in diverse fields
such as physics, statistics, finance, engineering and so on is the normal or
Gaussian random variable. A Gaussian random variable $x$ can have
any value on the real line, that is,
$-\infty < x < +\infty$. For the case
of a Gaussian random variable with mean $\mu$ and standard deviation $\sigma$,
its probability distribution
function, displayed in Figure 4.4, is given by
\[ P(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x - \mu)^2 / 2\sigma^2}. \]
[Figure 4.4: Probability distribution of the Gaussian random variable]
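A direct transcription of this distribution function, checked against samples drawn with Python's built-in `random.gauss` (the parameter values are our own):

```python
import random
from math import exp, pi, sqrt

def gaussian_pdf(x, mu, sigma):
    """P(x) = exp(-(x - mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2)."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / sqrt(2 * pi * sigma ** 2)

mu, sigma = 0.0, 1.0
samples = [random.gauss(mu, sigma) for _ in range(200_000)]
# fraction of samples within one standard deviation of the mean;
# the exact value is about 0.6827 for any Gaussian
inside = sum(abs(s - mu) < sigma for s in samples) / len(samples)
print(f"fraction within 1 sigma: {inside:.4f}   (exact: 0.6827)")
print(f"P(mu) = {gaussian_pdf(mu, mu, sigma):.4f}   (exact: 1/sqrt(2*pi) = {1 / sqrt(2 * pi):.4f})")
```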
As mentioned earlier, a model for diffusion is to consider a particle
doing a random walk in a (continuous) medium.
It can be shown that its probability
distribution is then given by the normal distribution. Suppose the particle starts
its random walk at time $t = 0$ from the point $x = 0$; then at time $t$
its position can be anywhere in space, that is, $-\infty < x < +\infty$.
The probability for it to be at different values of $x$ is given by
\[ P(x, t) = \frac{1}{\sqrt{4\pi D t}}\, e^{-x^2 / 4Dt}, \]
with $D$ being the diffusion constant of the medium in
which the particle is doing its random walk.
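The link between the discrete walk and this distribution can be checked numerically. In the sketch below we take steps of length $a$ at time intervals $\tau$ and use the standard identification $t = N\tau$, $D = a^2/(2\tau)$ (stated here as our assumption); the root-mean-square displacement of the simulated walks then matches the width $\sqrt{2Dt}$ of $P(x, t)$:

```python
import random
from math import sqrt

a, tau = 1.0, 1.0        # step length and time per step (arbitrary choices)
N = 400                  # number of steps, so the elapsed time is t = N*tau
D = a ** 2 / (2 * tau)   # standard identification of the diffusion constant
t = N * tau

walks = 10_000
positions = [sum(random.choice((+a, -a)) for _ in range(N)) for _ in range(walks)]
rms = sqrt(sum(x * x for x in positions) / walks)
print(f"rms displacement of simulated walks: {rms:.2f}")
print(f"width sqrt(2*D*t) of P(x,t):         {sqrt(2 * D * t):.2f}")
```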