Some time ago, I wrote a short unpublished note (mostly for my own benefit) when I was trying to understand the derivation of the Black-Scholes equation in financial mathematics, which computes the price of various options under some assumptions on the underlying financial model.  In order to avoid issues relating to stochastic calculus, Itō’s formula, etc. I only considered a discrete model rather than a continuous one, which makes the mathematics much more elementary.  I was recently asked about this note, and decided that it would be worthwhile to expand it into a blog article here.  The emphasis here will be on the simplest models rather than the most realistic models, in order to emphasise the beautifully simple basic idea behind the derivation of this formula.

The basic type of problem that the Black-Scholes equation solves (in particular models) is the following.  One has an underlying financial instrument S, which represents some asset which can be bought and sold at various times t, with the per-unit price S_t of the instrument varying with t.  (For the mathematical model, it is not relevant what type of asset S actually is, but one could imagine for instance that S is a stock, a commodity, a currency, or a bond.)  Given such an underlying instrument S, one can create options based on S and on some future time t_1, which give the buyer and seller of the options certain rights and obligations regarding S at an expiration time t_1.  For instance,

  1. A call option for S at time t_1 and at a strike price P gives the buyer of the option the right (but not the obligation) to buy a unit of S from the seller of the option at price P at time t_1 (conversely, the seller of the option has the obligation but not the right to sell a unit of S to the buyer of the option at time t_1, if the buyer so requests).
  2. A put option for S at time t_1 and at a strike price P gives the buyer of the option the right (but not the obligation) to sell a unit of S to the seller of the option at price P at time t_1 (and conversely, the seller of the option has the obligation but not the right to buy a unit of S from the buyer of the option at time t_1, if the buyer so requests).
  3. More complicated options, such as straddles and collars, can be formed by taking linear combinations of call and put options, e.g. simultaneously buying or selling a call and a put option.  One can also consider “American options” which offer rights and obligations for an interval of time, rather than the “European options” described above which only apply at a fixed time t_1.  The Black-Scholes formula applies only to European options, though extensions of this theory have been applied to American options.

The problem is this: what is the “correct” price, at time t_0, to assign to a European option (such as a put or call option) at a future expiration time t_1?  Of course, due to the volatility of the underlying instrument S, the future price S_{t_1} of this instrument is not known at time t_0.  Nevertheless – and this is really quite a remarkable fact – it is still possible to compute deterministically, at time t_0, the price of an option that depends on that unknown price S_{t_1}, under certain assumptions (one of which is that one knows exactly how volatile the underlying instrument is).

— How to compute price —

Before we do any mathematics, we must first settle a fundamental financial question – how can one compute the price of some asset A?  In most economic situations, such a price would depend on many factors, such as the supply and demand of A, transaction costs in buying or selling A, legal regulations concerning A, or more intangible factors such as the current market sentiment regarding A.  Any model that attempted to accurately describe all of these features would be hideously complicated and involve a large number of parameters that would be nearly impossible to measure directly.  So, in general, one cannot hope to compute such prices mathematically.

But the situation is much simpler for purely financial products, such as options, at least when one has a very deep and liquid market for the underlying instrument S.  More precisely, we will make the following (unrealistic) assumptions:

  • Infinite liquidity. Market participants can buy or sell a unit of the underlying instrument S at any time.  [In principle, the participant would need a certain amount of cash, or a certain amount of S, in order to buy or sell S, but see the infinite credit and short selling assumptions below.]
  • Infinite depth. Each sale of a unit of S does not affect the price of further sales of units of S.
  • No transaction costs. The purchase price and sale price of an asset are the same: in other words, the money spent by a buyer in a sale is exactly equal to the money earned by the seller.
  • No arbitrage. There do not exist risk-free opportunities for market participants to instantaneously make money.

With these assumptions, the supply situation is simplified enormously, because any participant in this market can, in principle, use cash to create an option to sell to others (for instance one can sell a call option for S and cover it by buying a unit of S at any time before the expiration time), in contrast to physical assets (e.g. barrels of oil) which cannot be created purely from market transactions.  This freedom of supply leads to upper bounds on the price of a financial asset A; if any market participant can instantaneously create a unit of A at time t_0 from market transactions using an amount X (or less) of cash, then clearly one should not assign such a unit of A a price greater than X at time t_0, otherwise there would exist an arbitrage opportunity.

As a simple example of such an upper bound, if a deep and liquid market allows one to repeatedly buy individual units of A at a price of X per unit, then for any integer k \geq 1, the price of k units of A has an upper bound of kX.  (The true price may be lower, due for instance to volume discounts, but in general the price of k units of A will be a subadditive function of k.  Note though that if the market is not infinitely deep, then each purchase of a unit may increase the price of the next unit, leading to superadditive behaviour instead.)

As another example, the price at time t_0 of a put option for a unit of S at time t_1 at strike price P cannot exceed P, because any market participant can create (and then sell) such an option simply by setting aside P units of cash to cover the future expense of buying a unit of S.  (This is an extremely crude upper bound, of course, as the option buyer might not exercise the option, in which case the P units of cash are recovered, or the option buyer does exercise the option, in which case the seller is compensated for the P units of cash by a unit of S.  Also, we are assuming here that there are no costs (e.g. security costs) associated with holding on to an asset over time.)  For similar reasons, the price at time t_0 of a call option for a unit of S at time t_1 cannot exceed S_{t_0}.

Dually to the above freedom of supply, there is also a freedom of demand: any participant can, in principle, purchase a financial asset and convert it into cash by combining the rights offered by that asset with other purchases.  For instance, one could attempt to profit from a put option by buying a unit of the underlying instrument S and then (if the price is favourable) exercising the right to sell that unit to the option seller.  This freedom of demand leads to lower bounds on the price of an asset: if any market participant can instantaneously convert a unit of A using market transactions into an amount X of cash, then clearly one should not assign a unit of A any price lower than X, otherwise there would be an arbitrage opportunity.

To give a trivial example: any option has a lower bound of zero for its price, since one can convert an option into zero units of cash simply by refusing to exercise it.  (Note that some financial assets can have a negative cash value – mortgages being a good example.)

To summarise so far: freedoms of supply give upper bounds on the price of an asset A, and freedoms of demand give lower bounds on the price of an asset A.  The lower bounds cannot exceed the upper bounds, as this would provide an arbitrage opportunity.  But if the lower bounds and upper bounds happen to be equal, then one can compute the price of A exactly.  This is a rare occurrence – one almost never expects the upper and lower bounds to be so tight.  But, amazingly, this will turn out to be the case for options in the Black-Scholes model.

To give a simple example of a situation in which upper and lower bounds match, let us make another assumption:

  • Infinite credit. Market participants can borrow or lend arbitrary amounts of money at a risk-free interest rate of r.  Thus, for instance, participants can deposit (or lend) X amount of cash at time t_0 and be guaranteed to receive \exp( r(t_1-t_0) ) X cash at time t_1, and conversely can borrow X amount of cash at time t_0 but pay back \exp( r(t_1-t_0) ) X cash at time t_1.

Remark. One can renormalise r to be zero, basically by using real units of currency instead of nominal units, but we will not do so here. \diamond

With this assumption one can now compute the time value of money.  Suppose one has a risk-free government bond A which is guaranteed to pay out X amount of cash at the maturity time t_1 of the bond.  Then, at any time t_0 prior to the maturity time, one can convert A to an amount \exp(-r(t_1-t_0) ) X of cash, by borrowing this amount of cash at time t_0, and using the proceeds of the bond A to pay off the debt from this borrowing at time t_1.  Thus there is a lower bound of \exp(-r(t_1-t_0)) X to the price of the bond A.  Conversely, given an amount \exp(-r(t_1-t_0)) X of cash at time t_0, one can create the equivalent of the bond A simply by depositing or lending out this cash to obtain X amount of cash at time t_1.  Thus, in this case the lower and upper bounds match exactly, and the price of the bond can be computed at time t_0 to be \exp(-r(t_1-t_0)) X.  (Because of this fact, the quantity r in the Black-Scholes model is usually set equal to the interest rate of an essentially risk-free asset, such as short-term Treasury bonds.)
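The matching of the two bounds for the bond can be checked numerically.  The following is only a minimal sketch: the function names and the sample numbers (a payout of 100, a rate of 5% per year, one year to maturity) are my own illustrative choices, not part of the model.

```python
import math

def present_value(payout, r, t0, t1):
    """Price at time t0 of a risk-free payout received at time t1 >= t0."""
    return math.exp(-r * (t1 - t0)) * payout

def future_value(cash, r, t0, t1):
    """Cash obtained at time t1 by depositing `cash` at time t0."""
    return math.exp(r * (t1 - t0)) * cash

# Illustrative numbers only: a bond paying 100 in one year, with r = 0.05.
price = present_value(100.0, r=0.05, t0=0.0, t1=1.0)
print(price)                                # both the lower and the upper bound
print(future_value(price, 0.05, 0.0, 1.0))  # recovers the payout, up to rounding
```

Borrowing `price` against the bond gives the lower bound, and depositing `price` to recreate the bond gives the upper bound; the two functions are inverse to each other, which is why the bounds agree.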

One can use the time value of money to produce further upper and lower bounds on options.  For instance, the price at time t_0 of a put option for a unit of S at time t_1 at strike price P cannot be lower than \exp(-r(t_1-t_0)) P-S_{t_0}, since one can always convert the put option into this amount of cash by buying a unit of S at price S_{t_0} at time t_0, holding on to this unit until time t_1, and selling at price P at time t_1, which has the equivalent cash value of \exp(-r(t_1-t_0)) P at time t_0.  However, in order to make the lower and upper bounds match, we will need some additional assumptions on how the price S_t of the underlying stock evolves with time.
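To see how far apart the bounds obtained so far are, here is a small sketch that collects them for a put option: the crude upper bound P, the trivial lower bound of zero, and the lower bound \exp(-r(t_1-t_0)) P - S_{t_0}.  The function name and the sample numbers are hypothetical.

```python
import math

def put_price_bounds(s0, strike, r, t0, t1):
    """Model-independent (lower, upper) bounds on a put price at time t0."""
    discounted_strike = math.exp(-r * (t1 - t0)) * strike
    lower = max(0.0, discounted_strike - s0)  # freedom of demand
    upper = strike                            # freedom of supply (crude)
    return lower, upper

# Illustrative numbers only: stock at 90, strike 100, r = 0.05, one year to expiry.
print(put_price_bounds(90.0, 100.0, 0.05, 0.0, 1.0))
# For a stock well above the strike, only the trivial lower bound of zero survives.
print(put_price_bounds(120.0, 100.0, 0.05, 0.0, 1.0))
```

The gap between the two numbers is large in both cases, which is the point: without a model for how S_t evolves, these arbitrage arguments alone do not pin down the price.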

— The Black-Scholes model —

To simplify the computations, we shall assume

  • Discrete time. The time variable t increases in discrete steps of some time unit dt.  (At each time t, one can make an arbitrary number of purchases and sales of assets, but the price S_t of the underlying instrument stays constant for each fixed t, as guaranteed by the infinite depth hypothesis.)

For instance, one could imagine a market in which the price S_t only changes once a day, so in this case dt would be a day in length. Similarly if S_t only changes once a minute or once a second.

The Black-Scholes model then describes how the next price S_{t+dt} of the underlying instrument depends on the current price S_t.   The whole point, of course, is that there is to be some randomness (or risk) involved in this process.  The simplest such model would be that of a simple random walk

S_{t+dt} = S_t + \epsilon_t \sigma (dt)^{1/2}

where \sigma > 0 is a constant (representing volatility) and \epsilon_t = \pm 1 is a random variable, equal to +1 or -1 with equal probability; thus in this model the price either jumps up or jumps down by \sigma (dt)^{1/2} for each time step dt.  (The factor of (dt)^{1/2} is a natural normalisation, required for this model to converge to Brownian motion in the continuous time limit dt \to 0; with this normalisation, \sigma^2 basically becomes the amount of variance produced in S_t per unit time.)  One can assume that the random variables \epsilon_t are jointly independent as t varies, but remarkably we will not need to use such an independence hypothesis in our analysis.  Similarly, we will not use the fact that the probabilities of going up or down are both equal to 1/2; it will turn out, unintuitively enough, that these probabilities are irrelevant to the final option price.

This simple model has a number of deficiencies.  Firstly, it does not reflect the fact that many assets, while risky, will tend to grow in value over time.  Secondly, the model allows for the possibility that the price S_t becomes negative, which is clearly unrealistic.  (A third deficiency, that it only allows two outcomes at each time step, is more serious, and will be discussed later.)

To address the first deficiency, one can add a drift term, thus leading to the model

S_{t+dt} = S_t + \mu dt + \sigma \epsilon_t (dt)^{1/2}

for some fixed \mu \in {\Bbb R} (which could be positive, zero, or negative), representing the expected rate of appreciation of a unit of S per unit time.  A remarkable (and highly unintuitive) consequence of Black-Scholes theory is that the exact value of \mu will in fact have no impact on the final formula for the value of an option: an underlying instrument which is rising in value on average will have the same option pricing as one which is steady or even falling on the average!

To address the second deficiency, we work with the logarithm \log S_t of the price of S, rather than the price itself, since this will make the price positive no matter how we move the logarithm up and down (as long as we only move the logarithm a finite amount, of course).  More precisely, we adopt the model

\log S_{t+dt} = \log S_t + \mu dt + \sigma \epsilon_t (dt)^{1/2} (1)

and so \mu now measures the expected relative increase in value per unit time (as opposed to the expected absolute increase), and similarly \sigma^2 measures the relative increase in variance per unit time.  This model may seem complicated, but the key point is that, given S_t, there are only two possible values of S_{t+dt}.
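To make the model (1) concrete, here is a minimal simulation sketch.  The function names, the seed, and the parameter values (in particular the choice dt = 1/252, loosely thought of as one trading day in units of years) are assumptions made purely for illustration.

```python
import math
import random

def next_prices(s, mu, sigma, dt):
    """The two possible values (s_minus, s_plus) of S_{t+dt} given S_t = s, per (1)."""
    drift = mu * dt
    jump = sigma * math.sqrt(dt)
    return s * math.exp(drift - jump), s * math.exp(drift + jump)

def simulate_path(s0, mu, sigma, dt, n_steps, rng):
    """Sample S_t at n_steps successive times, with epsilon_t = +1 or -1 equally likely."""
    path = [s0]
    for _ in range(n_steps):
        s_minus, s_plus = next_prices(path[-1], mu, sigma, dt)
        path.append(s_plus if rng.random() < 0.5 else s_minus)
    return path

rng = random.Random(0)  # fixed seed so that the run is reproducible
path = simulate_path(100.0, mu=0.05, sigma=0.2, dt=1 / 252, n_steps=5, rng=rng)
print(path)
print(next_prices(100.0, 0.05, 0.2, 1 / 252))
```

Every entry of the path is positive by construction, and each step lands on one of exactly two values, which is the only feature of the model that the pricing argument will actually use.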

— Pricing options —

Now we begin the task of pricing an option with expiry date t_1 at time t_0.  The interesting case is of course when t_0 is less than t_1, but to begin with let us first check what happens when t_0=t_1, so that we are pricing an option that is expiring immediately.

Consider first a call option.  If one has the option to buy a unit of S at price P at time t_1, and S_{t_1} was greater than or equal to P, then it is clear that this option could be converted into S_{t_1}-P units of cash, simply by exercising the option and then immediately selling the stock that was bought.  Conversely, given S_{t_1}-P units of cash, one could create such an option (and might even recover this money if the bearer of the option forgets to exercise it).  So we see that when S_{t_1} \geq P, the price of this option is S_{t_1}-P.

On the other hand, if S_{t_1} is less than P (in the jargon, the option is “underwater” or “out of the money”), then it is intuitively clear that the call option is worthless (i.e. has a price of zero).  To see this more rigorously, recall that any option has a lower bound of zero for its price.  To get the upper bound, one can issue an underwater call option at no cost, since if someone is foolish enough to exercise that option, one can simply buy the stock from the open market at S_{t_1} and sell it for P, and pocket or discard the difference.  Putting all this together, we see that the price V_{t_1} of the call option at time t_1 is a function of the price S_{t_1} of the underlying instrument at that time, and is given by the formula

V_{t_1}(S_{t_1}) := \max( S_{t_1} - P, 0 ). (2)

For similar reasons, the price V_{t_1} at time t_1 of a put option for a unit of S at expiry time t_1 and strike price P is given by the formula

V_{t_1}(S_{t_1}) := \max( P - S_{t_1}, 0 ). (3)
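In code, the expiry prices (2) and (3) are one-liners.  The sketch below (with hypothetical function names and sample prices) also checks the elementary identity that, at expiry, a call minus a put at the same strike is worth exactly S_{t_1} - P.

```python
def call_payoff(s, strike):
    """Formula (2): price of a call at expiry when the underlying is at s."""
    return max(s - strike, 0.0)

def put_payoff(s, strike):
    """Formula (3): price of a put at expiry when the underlying is at s."""
    return max(strike - s, 0.0)

strike = 100.0
for s in (80.0, 100.0, 125.0):  # illustrative prices at expiry
    c, p = call_payoff(s, strike), put_payoff(s, strike)
    print(s, c, p, c - p == s - strike)
```

At most one of the two options is in the money at any given price, so linear combinations of these two functions already produce a fairly rich family of piecewise linear payoffs.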

Thus we have worked out the price of both put and call options at the time of expiry.  To handle the general case, we have to move backwards in time.  For reasons that will become clearer shortly, we shall also need three final assumptions:

  • Infinite divisibility.  Stock can be sold in arbitrary non-integer amounts.
  • Short selling. Market participants can borrow arbitrary amounts of stock, at no interest, for arbitrary amounts of time.
  • No storage costs. Market participants can hold arbitrary amounts of stock at no cost for arbitrary amounts of time.

The fundamental lemma here is the following:

Lemma. If a financial asset A has a price at time t that is a function V_t(S_t) that depends only on the price S_t of S at time t, then the same asset has a price at time t-dt that is a function V_{t-dt}(S_{t-dt}) of the price S_{t-dt} of S at time t-dt, where V_{t-dt} is given from V_t by an explicit formula (see (5) below).

Iterating this lemma, starting from (2) and (3), and taking the limit as dt \to 0, will ultimately lead to the Black-Scholes formula for the price of such options.

Let’s see how this lemma is proven.  Suppose we are at time t-dt, and the price of S is currently s := S_{t-dt}.  We do not know what the price S_t of S at the next time step will be exactly, but thanks to (1), we know that it is one of two values, say s_- and s_+ with s_+ > s_-. From (1) we have the explicit formula

s_\pm = s \exp( \mu dt \pm \sigma (dt)^{1/2} ). (4)

By hypothesis, we know that the instrument A has a price of V_t(s_+) or V_t(s_-) at time t, depending on whether S has a price of s_+ or s_- at this time t.  Our task is now to show that A has a price at time t-dt that depends only on s.

Let us first consider the easy case when V_t(s_+) and V_t(s_-) are both equal to the same value, say X.  In this case, the instrument A is (for the purposes of pricing) identical to a bond which matures at time t with a value of X.  By the previous discussion, we thus see that the price of A at time t-dt is equal to \exp(-r dt) X.

Now consider the case when V_t(s_+) and V_t(s_-) are unequal.  Then there is some risk in the value of A at time t.  But – and this is the key point – one can hedge this risk by buying or selling some units of S.  Suppose for instance one owns one unit of A at time t-dt, and then buys k units of S at this time at the price s.  At time t, one sells the k units of S, earning k s_+ units of cash at time t if the price is s_+, and k s_- units if the price is s_-.  In effect, this hedging strategy adjusts V_t(s_+) and V_t(s_-) to V_t(s_+)+ks_+ and V_t(s_-)+ks_- respectively, at the cost of paying ks at time t-dt.  If V_t(s_+) < V_t(s_-), then one can find a positive k so that the adjusted values V_t(s_+)+ks_+ and V_t(s_-)+ks_- of the instrument are equal (indeed, k is simply k = (V_t(s_-)-V_t(s_+))/(s_+-s_-)).  We have thus effectively converted A, at the cost of ks units of cash at time t-dt, into a bond that matures at time t with a value of

\displaystyle V_t(s_+)+ks_+=V_t(s_-)+ks_- = \frac{s_+ V_t(s_-) - s_- V_t(s_+)}{s_+-s_-}.
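Here is a small numerical sketch of this hedge, assuming for illustration that A is a put with strike 100 that expires at time t (so that V_t(s_+) < V_t(s_-)); the function name and all the numbers are hypothetical.

```python
import math

def hedge(v_minus, v_plus, s_minus, s_plus):
    """Units k of S to buy, and the common value of A plus k units of S at time t."""
    k = (v_minus - v_plus) / (s_plus - s_minus)
    bond_value = (s_plus * v_minus - s_minus * v_plus) / (s_plus - s_minus)
    return k, bond_value

# Illustrative numbers only: current price s = 100, one time step of length 0.01.
s, mu, sigma, dt = 100.0, 0.0, 0.2, 0.01
s_minus = s * math.exp(mu * dt - sigma * math.sqrt(dt))
s_plus = s * math.exp(mu * dt + sigma * math.sqrt(dt))
v_minus = max(100.0 - s_minus, 0.0)  # value of the put if the price falls
v_plus = max(100.0 - s_plus, 0.0)    # value of the put if the price rises

k, bond_value = hedge(v_minus, v_plus, s_minus, s_plus)
print(k)  # positive here, since the put gains when the stock falls
print(v_plus + k * s_plus, v_minus + k * s_minus, bond_value)
```

The three printed values on the last line should agree up to rounding: whichever way the price moves, the hedged position is worth the same amount, so it behaves like a risk-free bond.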

Conversely, we can convert such a bond into one unit of A and ks units of cash at time t-dt by reversing the above procedure.  Namely, instead of buying k units of S at time t-dt to sell at time t, one instead short sells k units of S at time t-dt to buy back at time t.  More precisely, one borrows k units of stock at time t-dt to sell immediately, and then at time t buys them back again to repay the stock loan.  (Mathematically, this is equivalent to buying -k units of stock at time t-dt to sell at time t; thus short selling effectively allows one to buy negative units of stock, in much the same way that divisibility allows one to buy fractional units of stock.)  We thus conclude that in this case, A has a value of

\displaystyle \exp(-r dt) \frac{s_+ V_t(s_-) - s_- V_t(s_+)}{s_+-s_-} - ks

\displaystyle = \frac{(\exp(-rdt) s_+-s) V_t(s_-) - (\exp(-rdt) s_- - s) V_t(s_+)}{s_+-s_-}

This analysis was conducted in the case V_t(s_+) < V_t(s_-), but one can get the same formula at the end in the opposite case V_t(s_+) > V_t(s_-); k is now negative in this case, but since buying a negative amount of stock is equivalent to short-selling a positive amount of stock (and vice versa), the arguments go through as before.  Substituting the formula for k, we have thus proven the lemma, with

\displaystyle V_{t-dt}(s) := \frac{(\exp(-r dt) s_+-s) V_t(s_-) - (\exp(-rdt) s_- - s) V_t(s_+)}{s_+-s_-}. (5)

This is a somewhat complicated formula, but it can be simplified by means of Taylor expansion (assuming for the moment that V_t is smooth).  To illustrate the idea, let us make the simplifying assumption that r=0.  If we then Taylor expand

\displaystyle V_t(s_\pm)=V_t(s)+(s_\pm-s)\partial_s V_t(s)+\frac{1}{2} (s_\pm-s)^2 \partial_{ss} {V_t(s)}+O((dt)^{3/2})

(cautioning here that the implied constants in the O() notation depend on all sorts of things, such as the third derivative of V_t) and note that s_+-s_- is comparable to (dt)^{1/2} in magnitude, then the right-hand side of (5) simplifies to

V_t(s) - \frac{1}{2} \partial_{ss} V_t(s) (s_+-s)(s_- - s) + O( (dt)^{3/2} ).

Since

(s_+ - s) (s_- - s) = - s^2 \sigma^2 dt + O( (dt)^{3/2} )

we thus obtain

V_{t-dt}(s) = V_t(s) + \frac{1}{2} s^2 \sigma^2 \partial_{ss} V_t(s) dt + O( (dt)^{3/2} ).
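One can sanity-check this expansion numerically.  The sketch below evaluates the right-hand side of (5) with r=0 for a hypothetical smooth test function V_t(s) = s^3 (chosen only because its derivatives are easy to write down) and compares it with the predicted second-order correction; the parameter values are likewise arbitrary.

```python
import math

def one_step_price_r0(v, s, mu, sigma, dt):
    """Right-hand side of (5) with r = 0, for a price function v at time t."""
    s_minus = s * math.exp(mu * dt - sigma * math.sqrt(dt))
    s_plus = s * math.exp(mu * dt + sigma * math.sqrt(dt))
    return ((s_plus - s) * v(s_minus) - (s_minus - s) * v(s_plus)) / (s_plus - s_minus)

def v(s):      # hypothetical smooth test function
    return s ** 3

def v_ss(s):   # its second derivative in s
    return 6.0 * s

s, mu, sigma = 1.0, 0.05, 0.2
for dt in (1e-2, 1e-3, 1e-4):
    exact = one_step_price_r0(v, s, mu, sigma, dt)
    predicted = v(s) + 0.5 * s ** 2 * sigma ** 2 * v_ss(s) * dt
    # Columns: dt, actual change, predicted change, discrepancy.
    print(dt, exact - v(s), predicted - v(s), exact - predicted)
```

The discrepancy in the last column should shrink much faster than the correction itself as dt decreases, in line with the error term in the expansion.  Note that \mu enters the computation of s_\pm but has disappeared from the leading-order correction.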

Performing Taylor expansion in t, we thus conclude

\partial_t V_t(s) = - \frac{1}{2} s^2 \sigma^2 \partial_{ss} V_t(s) + O( (dt)^{1/2} )

and so in the continuum limit dt \to 0 one (formally, at least) obtains the backwards heat equation

\partial_t V = - \frac{1}{2} s^2 \sigma^2 \partial_{ss} V.

A similar (but more complicated) computation can be made in the r \neq 0 case (or one can renormalise using real currency units, as remarked earlier), obtaining the Black-Scholes PDE

\partial_t V = - \frac{1}{2} s^2 \sigma^2 \partial_{ss} V - r s \partial_s V + rV.
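For the reader who wants to see where the two extra terms come from, here is a sketch of the r \neq 0 computation, at the same formal level as the r=0 case above (in particular, the implied constants in the error terms are not tracked).  One expands \exp(-r dt) to first order in (5) and reuses the r=0 expansion for the leading part.

```latex
\begin{align*}
e^{-r\,dt} s_\pm - s
  &= (s_\pm - s) - r s_\pm \, dt + O((dt)^2), \\
V_{t-dt}(s)
  &= \frac{(s_+ - s) V_t(s_-) - (s_- - s) V_t(s_+)}{s_+ - s_-}
     - r \, dt \, \frac{s_+ V_t(s_-) - s_- V_t(s_+)}{s_+ - s_-}
     + O((dt)^{3/2}), \\
\frac{s_+ V_t(s_-) - s_- V_t(s_+)}{s_+ - s_-}
  &= V_t(s) - s \, \partial_s V_t(s) + O((dt)^{1/2}), \\
V_{t-dt}(s)
  &= V_t(s) + \Big( \tfrac{1}{2} s^2 \sigma^2 \partial_{ss} V_t(s)
     + r s \, \partial_s V_t(s) - r V_t(s) \Big) dt + O((dt)^{3/2}).
\end{align*}
```

The first term on the second line is exactly the r=0 expression treated earlier, and the third line follows from a first-order Taylor expansion of V_t(s_\pm).  Performing Taylor expansion in t in the last line and sending dt \to 0 then formally recovers the PDE displayed above.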

Using (2) or (3) as an initial condition, one can then solve for V at time t_0; the quantity V_{t_0}(S_{t_0}) is then the price of the option at time t_0.  (V can be computed explicitly in terms of the error function, leading to the Black-Scholes formula.)
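As a rough numerical illustration, one can iterate formula (5) backwards from (2) on the tree of prices generated by (1) and compare with the closed form.  This is only a sketch under stated assumptions: the closed form used below is the standard textbook Black-Scholes call formula (it is not derived in this post), and the function names, the parameter values, and the choice of 500 time steps are all hypothetical.

```python
import math

def one_step_price(v_minus, v_plus, s, s_minus, s_plus, r, dt):
    """Formula (5): price at time t - dt from the two possible prices at time t."""
    disc = math.exp(-r * dt)
    return ((disc * s_plus - s) * v_minus
            - (disc * s_minus - s) * v_plus) / (s_plus - s_minus)

def tree_price(payoff, s0, r, mu, sigma, t0, t1, n_steps):
    """Iterate the lemma n_steps times, backwards from the payoff at time t1."""
    dt = (t1 - t0) / n_steps
    up = math.exp(mu * dt + sigma * math.sqrt(dt))
    down = math.exp(mu * dt - sigma * math.sqrt(dt))
    # After n steps with j up-moves, the price is s0 * up**j * down**(n - j).
    values = [payoff(s0 * up ** j * down ** (n_steps - j)) for j in range(n_steps + 1)]
    for n in range(n_steps - 1, -1, -1):
        stepped = []
        for j in range(n + 1):
            s = s0 * up ** j * down ** (n - j)
            stepped.append(one_step_price(values[j], values[j + 1],
                                          s, s * down, s * up, r, dt))
        values = stepped
    return values[0]

def norm_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(s0, strike, r, sigma, t0, t1):
    """Standard textbook closed form for a European call (assumed, not derived here)."""
    tau = t1 - t0
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s0 * norm_cdf(d1) - strike * math.exp(-r * tau) * norm_cdf(d2)

strike = 100.0
call = lambda s: max(s - strike, 0.0)  # formula (2)
closed_form = black_scholes_call(100.0, strike, 0.05, 0.2, 0.0, 1.0)
tree_prices = {mu: tree_price(call, 100.0, 0.05, mu, 0.2, 0.0, 1.0, 500)
               for mu in (-0.1, 0.0, 0.1)}
print("closed form:", closed_form)
for mu, price in tree_prices.items():
    print("mu =", mu, "tree price:", price)
```

The tree prices for the three different drifts should all be close to the closed form, illustrating the earlier remark that \mu has no impact on the final price.  Note also that the up and down probabilities never appear anywhere in the code.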

The above analysis was not rigorous because the error terms were not properly estimated when taking the continuum limit dt \to 0, and also because the initial conditions (2), (3) were not smooth.  The latter turns out to be a very minor difficulty, due to the smoothing nature of the Black-Scholes PDE (which is a parabolic equation) and also because one can use the comparison principle (which formalises the intuitively obvious fact that if a financial asset A is always worth more than an asset B at time t, then this is also the case at time t-dt) to approximate the non-smooth options (2), (3) by smooth ones.  The former difficulty does require a certain amount of non-trivial analysis (e.g. Fourier analysis or Itō’s formula) but I will not discuss this here.

There is an enormous amount of literature aimed at relaxing the idealised hypotheses in the above analysis, for instance adding transaction costs, fluctuations in volatility, or more complicated financial features such as dividends.  In some of these more general models, the upper and lower bounds for the prices of options cease to match perfectly, due to transaction costs or the inability to perfectly hedge away the risk; this for instance starts occurring when the underlying price S_t can fluctuate to three or more values from a fixed value of S_{t-dt}, as it then becomes impossible in general to make V constant for all of these values at once purely by buying and selling S.  In particular, the reliability of the Black-Scholes model becomes suspect when the price movements of S differ significantly from the model (1), for instance if there are occasional very large price swings.

The other major issue with the Black-Scholes formula is that it requires one to compute the volatility \sigma, which is difficult to do in practice.  In fact, the formula is sometimes used in reverse, using the actual prices in option markets to deduce an implied volatility for an underlying instrument.
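To illustrate the reverse use of the formula, here is a minimal bisection sketch that recovers \sigma from a call price.  It again assumes the standard textbook closed form for a European call, and relies on the fact that this price is increasing in \sigma; the function names, the search bracket [10^{-6}, 5], and the sample numbers are hypothetical choices.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(s0, strike, r, sigma, t0, t1):
    """Standard textbook closed form for a European call (assumed, not derived here)."""
    tau = t1 - t0
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s0 * norm_cdf(d1) - strike * math.exp(-r * tau) * norm_cdf(d2)

def implied_volatility(price, s0, strike, r, t0, t1, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisect on sigma until the model price matches the observed price."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_scholes_call(s0, strike, r, mid, t0, t1) < price:
            lo = mid  # model price too low: volatility must be higher
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Round trip with illustrative numbers: price a call at sigma = 0.2, then recover sigma.
observed = black_scholes_call(100.0, 100.0, 0.05, 0.2, 0.0, 1.0)
print(implied_volatility(observed, 100.0, 100.0, 0.05, 0.0, 1.0))
```

In practice the "observed" price would come from an option market rather than from the formula itself, and there is no guarantee that a market price lies in the range attainable within the bracket; the sketch does not attempt to handle that case.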