When teaching mathematics, the traditional method of lecturing in front of a blackboard is still hard to improve upon, despite all the advances in modern technology.  However, there are some nice things one can do in an electronic medium, such as this blog.  Here, I would like to experiment with the ability to animate images, which I think can convey some mathematical concepts in ways that cannot be easily replicated by traditional static text and images. Given that many readers may find these animations annoying, I am placing the rest of the post below the fold.

Suppose we are in the classical (Kolmogorov) framework of probability theory, in which one has a probability space (\Omega, {\mathcal F}, {\mathbb P}) representing all possible states \omega \in \Omega.  One can make a distinction between deterministic quantities x that do not depend on the state \omega, and random variables (or stochastic variables) X = X(\omega) that do depend (in some measurable fashion) on the state \omega.  (As discussed in this previous post, it is often helpful to adopt a perspective that suppresses the sample space \Omega as much as possible, but we will not do so for the current discussion.)

One can visualise the distinction as follows.  If I pick a deterministic integer x between 1 and 6, say 3, then this fixes the value of x for the rest of the discussion:


However, if I pick a random integer X uniformly from \{1,\dots,6\} (e.g. by rolling a fair die), one can think of X as a quantity that keeps changing as one flips from one state to the next:


Here, I have “faked” the randomness by looping together a finite number of images, each of which is depicting one of the possible values X could take.  As such, one may notice that the above image eventually repeats in an endless loop.  One could presumably write some more advanced code to render a more random-looking sequence of X's, but the above imperfect rendering should hopefully suffice for the sake of illustration.
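For readers who would like to generate a genuinely random-looking sequence rather than a fixed loop, here is a minimal Python sketch (the names `roll_die` and the choice of seed are of course just illustrative, not part of the animation above):

```python
import random

def roll_die(n, seed=None):
    """Simulate n independent rolls of a fair six-sided die,
    i.e. n independent samples of X uniform on {1,...,6}."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(n)]

rolls = roll_die(10, seed=0)
print(rolls)  # ten values, each drawn uniformly from {1,...,6}
```

Rendering each successive value of `rolls` as a frame would then give an (arbitrarily long) animation of the random variable X.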

Here is a (“faked” rendering of a) random variable Y that also takes values in \{1,\dots,6\}, but is non-uniformly distributed, being more biased towards smaller values than larger values:


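One could simulate such a biased variable by weighted sampling.  The particular weights used for Y above are not specified, so the weights in this sketch (value k getting weight proportional to 7-k) are purely a hypothetical choice to illustrate the bias towards smaller values:

```python
import random
from collections import Counter

values = [1, 2, 3, 4, 5, 6]
# Hypothetical bias: weight of value k proportional to 7 - k,
# so smaller values are sampled more often than larger ones.
weights = [7 - k for k in values]

rng = random.Random(0)
samples = rng.choices(values, weights=weights, k=100_000)
freq = Counter(samples)  # empirical counts, decreasing from 1 to 6
```

With 100,000 samples the empirical frequencies closely track the chosen weights, so the count for 1 is roughly six times that for 6.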
For continuous random variables, taking values for instance in {\bf R}^2 with some distribution (e.g. uniform in a square, multivariate Gaussian, etc.), one could display these random variables as a rapidly changing dot wandering over {\bf R}^2; if one lets some “afterimages” of previous dots linger for some time on the screen, one can begin to see the probability density function emerge in the animation.  This is unfortunately beyond my ability to quickly whip up as an image; but if someone with a bit more programming skill is willing to do so, I would be very happy to see the result :).
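The same “afterimage” idea can be captured numerically rather than visually: accumulating many samples into bins is exactly what the lingering dots do on screen.  Here is a small sketch for a standard 2D Gaussian (the coarse integer-grid binning is just one convenient choice):

```python
import random
from collections import Counter

rng = random.Random(0)

# Draw many samples of a standard 2D Gaussian and bin them on a
# coarse grid; the bin counts play the role of the accumulated
# "afterimages" sketching out the probability density function.
counts = Counter()
for _ in range(50_000):
    x, y = rng.gauss(0, 1), rng.gauss(0, 1)
    counts[(round(x), round(y))] += 1

# The central bin (0, 0) should collect by far the most samples,
# reflecting the peak of the Gaussian density at the origin.
```

An animated version would simply plot each sample as it arrives, fading older points gradually instead of binning them.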

The operation of conditioning to an event corresponds to ignoring all states in the sample space outside of the event.  For instance, if one takes the previous random variable X, and conditions to the event X > 3, one gets the conditioned random variable


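In simulation terms, conditioning on an event amounts to rejection sampling: one generates states as before, and simply discards any state lying outside the event.  A minimal sketch for the event X > 3:

```python
import random

rng = random.Random(0)

def conditioned_roll():
    """Sample X uniform on {1,...,6}, conditioned on the event X > 3,
    by discarding (rejecting) all states outside the event."""
    while True:
        x = rng.randint(1, 6)
        if x > 3:
            return x

samples = [conditioned_roll() for _ in range(10_000)]
# The surviving samples are uniformly distributed on {4, 5, 6}.
```

Note that this is exactly the animation above: the frames showing values 1, 2, 3 have been deleted, and the remaining frames play on.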
One can use the animation to help illustrate concepts such as independence or correlation.  If we revert to the unconditioned random variable


and let Z be an independently sampled uniform random variable from \{1,\dots,6\}, one can sum the variables together to create a new random variable X+Z, ranging in \{2,\dots,12\}:



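One can check the effect of summing two independent uniform variables numerically; the familiar triangular distribution on \{2,\dots,12\}, peaked at 7, emerges from the empirical counts:

```python
import random
from collections import Counter

rng = random.Random(0)

# X and Z are sampled independently, each uniform on {1,...,6};
# their sum X + Z ranges over {2,...,12} with a triangular
# distribution peaked at 7.
n = 100_000
sums = Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n))
```

In particular, 7 is the most frequent sum (probability 6/36), while 2 and 12 are the rarest (probability 1/36 each).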
(In principle, the above images should be synchronised, so that the value of X stays the same from line to line at any given point in time.  Unfortunately, due to internet lag, caching, and other web artefacts, you may experience an unpleasant delay between the two.  Closing the page, clearing your cache and returning to the page may help.)

If on the other hand one defines the random variable Z' to be Z' := 7-X, then Z' has the same distribution as Z (they are both uniformly distributed on \{1,\dots,6\}), but now there is a very strong correlation between X and Z', leading to completely different behaviour for X+Z':
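This contrast is easy to verify in a simulation: Z' passes any test of its marginal distribution (it is uniform on \{1,\dots,6\}, just like Z), yet the sum X+Z' is not random at all:

```python
import random
from collections import Counter

rng = random.Random(0)
xs = [rng.randint(1, 6) for _ in range(100_000)]
zs = [7 - x for x in xs]  # Z' := 7 - X, perfectly anticorrelated with X

# Z' has the same uniform marginal distribution as Z...
dist = Counter(zs)

# ...but X + Z' is deterministic: every sample of the sum equals 7.
sums = {x + z for x, z in zip(xs, zs)}
print(sums)  # {7}
```

This illustrates why the joint distribution of (X, Z'), and not just the individual marginal distributions, is needed to determine the behaviour of X+Z'.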