Recently I got to thinking about my grad school course on system performance. The instructor had a day job at Bell Labs where he did the math that he tried to teach to us. It was all about things like queuing theory, think time and all sorts of statistical and stochastic analysis of information systems. (It was pretty overwhelming at the time but I’d like to take the class again.)

Anyway…one of the concepts that I took away from the class was about Poisson distributions. It got me wondering about how I would explain this concept to my College Mathematics students.

Let me explain.

College Mathematics at my school is a 100-level class for non-technical students. Most of my students are in programs like Medical Assisting, Graphic Design and Business Administration. This is the course description:

*This course develops problem-solving and decision-making strategies using mathematical tools from arithmetic, algebra, geometry, and statistics. Topics include consumer mathematics, key concepts in statistics and probability, sets of numbers, and geometry. Upon successful completion of this course, students will be able to apply mathematical tools and methods to solve real-world problems.*

Basically, it’s a functional numeracy class. Don’t get me wrong: it’s fun to teach and I enjoy the challenge of presenting math concepts in an interesting and accessible way. One of the ways I do this is to start each class session with a ‘Math Minute’ where I run through a quick math concept to give my students a little something to ponder.

The Poisson distribution was developed by Siméon Denis Poisson in 1837. According to StatTrek:

*A Poisson distribution is the probability distribution that results from a Poisson experiment.*

Okay, a Poisson something-or-other comes from a Poisson something-else. Could you vague that up for me?:

*A Poisson experiment is a statistical experiment that has the following properties:*

- *The experiment results in outcomes that can be classified as successes or failures.*
- *The average number of successes (μ) that occurs in a specified region is known.*
- *The probability that a success will occur is proportional to the size of the region.*
- *The probability that a success will occur in an extremely small region is virtually zero.*

Note that the specified region could take many forms. For instance, it could be a length, an area, a volume, a period of time, etc.

That’s a little better. Let’s look at the classic example of a Poisson distribution: bus arrivals.

On your route, a bus comes by, on average, every ten minutes. On this particular day, you arrive at the bus stop and there is no bus to be seen. How long, on average, do you think you’ll have to wait for the next bus?

At this point, your audience is expected to say “Why, five minutes, of course.” The standard answer is actually ten minutes. The problem with this example is that it doesn’t give the right picture so the actual answer doesn’t fit our mathematical intuition. That’s why at this point the explainer hauls out the equation:

*Poisson Formula: Suppose we conduct a Poisson experiment, in which the average number of successes within a given region is μ. Then, the Poisson probability is:*

$$P(x; \mu) = \frac{e^{-\mu} \, \mu^{x}}{x!}$$

*where x is the actual number of successes that result from the experiment, and e is approximately equal to 2.71828.*

And everyone’s eyes glaze over.
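
For anyone whose eyes haven’t glazed over, the formula looks less scary as a few lines of code. Here is a minimal Python sketch. The function name `poisson_pmf` and the figure of six buses per hour (one bus every ten minutes, on average) are my own illustration, not part of the StatTrek text.

```python
import math

def poisson_pmf(x, mu):
    """Probability of exactly x successes when the average number is mu."""
    return math.exp(-mu) * mu ** x / math.factorial(x)

# Assumption for illustration: a bus every ten minutes, on average,
# works out to mu = 6 buses per hour.
mu = 6
for x in (0, 3, 6, 9):
    print(f"P({x} buses in an hour) = {poisson_pmf(x, mu):.4f}")
```

Under that assumption, seeing exactly six buses in an hour has only about a 16% chance, even though six is the average.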

I’m not complaining, but I’m looking for a way to explain this to a non-technical crowd. The bus stop example can be much more effective if we just tweak a few parameters. A Poisson distribution describes data we could collect by running an experiment, so let’s adjust our mental picture.

Instead of being the person waiting for the bus, let’s picture ourselves sitting on a park bench across the street from the bus stop. In one hand we have a stopwatch and in the other a spreadsheet. We can start our experiment any time but let’s say that we begin when a bus arrives. As we observe buses and passengers, we record two things:

- The time the bus arrives
- The time that each person arrives at the bus stop.

We need to gather enough data for a useful answer, so we’re going to be here for a while. It could be an hour, two hours, twelve, whatever. At some point we get bored and go home to crunch our data. Lo and behold, we discover that the bus interarrival time does in fact average ten minutes but the average wait time is greater than five minutes.
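
If we don’t have a spare afternoon for the park bench, we can fake the experiment on a computer. The sketch below rests on two assumptions of mine: the gaps between buses are random with an average of ten minutes (an exponential distribution, which is what a Poisson process produces), and passengers show up at completely random moments. The function name `simulate_bench` and its parameters are made up for this illustration.

```python
import bisect
import random
from itertools import accumulate

def simulate_bench(num_buses=100_000, mean_gap=10.0,
                   num_passengers=100_000, seed=1):
    """Return (average gap between buses, average passenger wait) in minutes."""
    rng = random.Random(seed)
    # Assumption: gaps between buses are exponential with the given mean.
    gaps = [rng.expovariate(1.0 / mean_gap) for _ in range(num_buses)]
    # The stopwatch starts at time 0; each bus arrives at a running total.
    bus_times = list(accumulate(gaps))

    waits = []
    for _ in range(num_passengers):
        # Assumption: a passenger is equally likely to show up at any moment.
        arrival = rng.uniform(0.0, bus_times[-1])
        # The first bus at or after the passenger's arrival time.
        next_bus = bus_times[bisect.bisect_left(bus_times, arrival)]
        waits.append(next_bus - arrival)

    return sum(gaps) / len(gaps), sum(waits) / len(waits)

avg_gap, avg_wait = simulate_bench()
print(f"average gap between buses: {avg_gap:.2f} minutes")
print(f"average passenger wait:    {avg_wait:.2f} minutes")
```

Under these assumptions, both numbers should land close to ten minutes, so the average wait is well above the five minutes our intuition suggested.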

Why is that? Well, from our perspective as an outside observer, this phenomenon makes a lot more sense. Let’s look at our bus arrivals over a timeline:

**bus 1** – *5 mins.* – **bus 2** – *15 mins.* – **bus 3** – *10 mins.* – **bus 4** – *20 mins.* – **bus 5** and so on…

As an individual waiting at the bus stop we didn’t have this perspective. But when we look at it from outside, we have a better picture of what’s going on. More passengers arrive during the longer intervals than during the short ones. As a result, the wait times for these passengers are going to skew our average wait time, giving us our previously unintuitive result of greater than five minutes.
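
We can check this with the four gaps from the timeline. This is only a sketch, and it assumes passengers arrive at a steady rate, so a gap that is twice as long collects twice as many passengers, each of whom waits half the gap on average. The helper name `average_wait` is mine.

```python
# Gaps between buses, in minutes, copied from the timeline.
gaps = [5, 15, 10, 20]

def average_wait(gaps):
    """Average wait when passengers arrive at a steady rate.

    A gap of g minutes collects passengers in proportion to g,
    and those passengers wait g / 2 minutes on average.
    """
    total_time = sum(gaps)
    return sum(g * (g / 2) for g in gaps) / total_time

mean_gap = sum(gaps) / len(gaps)
print("average gap:", mean_gap)               # 12.5
print("half the average gap:", mean_gap / 2)  # 6.25
print("average wait:", average_wait(gaps))    # 7.5
```

These four gaps alone average 12.5 minutes, so the "half the gap" guess would be 6.25 minutes. The actual average wait is 7.5 minutes, because the 15- and 20-minute gaps carry most of the passengers.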

Uncertainty is part of our lives, so it makes sense that it would show up in math as well. Probability (and statistics) don’t eliminate uncertainty but rather give us a useful way to acknowledge and honor uncertainty.