The sum of the squares of the first ten natural numbers is,
1² + 2² + … + 10² = 385
The square of the sum of the first ten natural numbers is,
(1 + 2 + … + 10)² = 55² = 3025
Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 – 385 = 2640.
Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.
I’m very well acquainted too with matters mathematical,
I understand equations, both the simple and quadratical,
About binomial theorem I’m teeming with a lot of news–
With many cheerful facts about the square of the hypotenuse.
Gilbert & Sullivan, “The Pirates of Penzance”
At first glance this seems to be just arithmetic, which is kind of insulting. As we’ve discussed, arithmetic is only about getting to an answer, and that’s the kind of boring stuff that is only fit for machine work. Clearly there must be something else going on here or I’m just going to be terribly annoyed. So let’s take a look at the math (which is much more interesting) behind the sum of squares and then see if we can figure out a cunning way of getting our answer without all of that boring arithmetic. It turns out that there are at least two interesting things about the sum of squares.
The most interesting (to me, at least, dork that I am) has to do with storing cannonballs.
Let me explain.
Imagine you’re on vacation and you visit an historical site and you come across a cannon, complete with ammunition:
Pretty standard stuff, but what if you wanted to figure out how many cannonballs are in that pile? Better still, could you do it without actually counting any cannonballs? It turns out that you can. Note that the top of the pile has one cannonball, the layer underneath has four and so on. That is, if we count the layers from the top down, our four-layer pile here has:
1² + 2² + 3² + 4² or 30 cannonballs (1 + 4 + 9 + 16). In other words, our pile of cannonballs can be expressed as a sum of squares. So we don’t need X-ray vision; we just count the number of layers in the pile and do a little arithmetic. Not bad! Note that this only works if the pile is in the shape of a square-based pyramid, hence the name square pyramidal numbers for the number sequence. Of course, if there were an easier way to calculate the sum of squares than doing all that boring arithmetic, that would be better, but we’ll talk more about that in a bit.
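If you'd rather let the machine count the cannonballs, here is a minimal sketch of the layer-by-layer count. The names pileSize and firstPiles are made up for this example, and it assumes the pile really is a square-based pyramid.

```haskell
-- Number of cannonballs in a square-based pyramid with the given number
-- of layers: layer k (counting from the top) holds k * k balls.
-- (pileSize is a hypothetical name chosen for this illustration.)
pileSize :: Integer -> Integer
pileSize layers = sum [k * k | k <- [1 .. layers]]

-- The first few square pyramidal numbers, one per pile height.
firstPiles :: [Integer]
firstPiles = map pileSize [1 .. 5]
```

Loaded into GHCi, pileSize 4 gives 30 for the four-layer pile, and firstPiles gives [1,5,14,30,55].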
Okay, let’s look at an example that’s a bit more real world and not so much ‘Battle of Liechtenstein’. Sum of squares is more often used in statistics to analyze a set of data points. First let’s get the boring arithmetic over with so we can get on with our lives. Fortunately I recently learned about a programming language called Haskell that lets me do stuff like this pretty easily.
    let x = sum (map (^2) [1..100])    -- sum of squares of 1 to 100
    print x
    338350
    let y = foldl (+) 0 [1..100]       -- sum of 1 to 100
    print y
    5050
    5050^2                             -- square of the sum of 1 to 100
    25502500
When mathematician Carl Friedrich Gauss was eight years old (in 1785, by the way), his teacher asked the class to give the sum of the numbers 1 to 100. The idea was to keep the little buggers busy and quiet for at least a few minutes but Carl piped up almost immediately with the answer 5050. Not only did he do the sum in his head, he was able to prove his answer was correct.
Here’s how he did it (in his head, by the way):
Picture the numbers 1 to 100 in a line:
1 2 3 4 5 6 7 … 100
Now picture another row, with the same numbers arranged in the opposite order:
100 … 7 6 5 4 3 2 1
Add the two rows together:
1 + 100 = 101, 2 + 99 = 101, 3 + 98 = 101 ...
Since there are 100 pairs of numbers, each adding up to 101, the total is 101 × 100, or 10100. But that counts every number twice (once in each row), so the sum of the original sequence (which is what we wanted all along) is half of that, or 5050. Sometimes you don’t need a computer if you look at the problem creatively enough.
So it turns out that the sum of the first n natural numbers (for any n) is:
1 + 2 + … + n = n(n + 1)/2
Let’s apply that to our first 100 numbers and see if we really needed to write a computer program to sum 1 to 100. Plugging 100 in for n we get (100(100 + 1))/2 or 5050.
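For the skeptics, here is a small sketch that puts Gauss's trick next to the brute-force sum so you can check that they agree. The function names gaussSum and bruteSum are my own inventions for this example.

```haskell
-- Gauss's pairing trick: n pairs that each add up to (n + 1), then halved.
-- (gaussSum and bruteSum are hypothetical names for this illustration.)
gaussSum :: Integer -> Integer
gaussSum n = n * (n + 1) `div` 2

-- The "boring arithmetic" version, adding the numbers one by one.
bruteSum :: Integer -> Integer
bruteSum n = sum [1 .. n]
```

In GHCi, gaussSum 100 and bruteSum 100 both come out to 5050, but only one of them has to visit every number.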
So is there an easy way to get the sum of squares? It turns out that there is indeed:
1² + 2² + … + n² = n(n + 1)(2n + 1)/6
Plugging 100 in for n we get (100 × 101 × 201)/6 or 338,350.
Okay, so the sum of the squares of 1 to 100 is 338,350 and the square of the sum is 25,502,500. The difference, therefore, is 25,164,150. At this point you could reasonably ask, “So what?” (and who could blame you?)
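Here is a sketch of the whole problem done without any loops, using the standard closed forms n(n + 1)/2 for the sum and n(n + 1)(2n + 1)/6 for the sum of squares. The names sumTo, sumOfSquares and difference are hypothetical, picked just for this example.

```haskell
-- Closed form for 1 + 2 + ... + n.
sumTo :: Integer -> Integer
sumTo n = n * (n + 1) `div` 2

-- Closed form for 1^2 + 2^2 + ... + n^2 (the square pyramidal numbers).
sumOfSquares :: Integer -> Integer
sumOfSquares n = n * (n + 1) * (2 * n + 1) `div` 6

-- Square of the sum minus the sum of the squares.
difference :: Integer -> Integer
difference n = sumTo n ^ 2 - sumOfSquares n
```

In GHCi, difference 10 gives 2640, matching the worked example at the top, and difference 100 gives 25164150.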
It turns out that the sum of squares is an important technique in statistics for determining sample variance, that is to say, how much a set of data values varies from its average value. The bigger the variance, the more likely you’re going to have to go back to the old drawing board and re-think your experiment. The traditional way to calculate variance is:
1. Find the average (mean) of the sample numbers.
2. Subtract the mean from each value and square the difference.
3. Add up all of the squared differences.
4. Divide by the number of data values.
So if all of your data were perfect then the differences from the mean would be zero, which sums up to a total variance of zero. Such is the way of a perfect world. But the more your actual data points vary (see what I did there?), the bigger the difference from the mean and the bigger your variance.
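As a rough sketch, the traditional recipe looks like this in Haskell. The name variance is my own, the function assumes a non-empty list, and it divides the total by the number of values (many statistics texts divide by one less than that for a sample, so treat the divisor as an assumption).

```haskell
-- Variance the traditional way: the average squared distance from the mean.
-- Assumes a non-empty list; dividing by n (not n - 1) is a choice made here.
-- (variance is a hypothetical name for this illustration.)
variance :: [Double] -> Double
variance xs = sum [(x - mean) ^ 2 | x <- xs] / n
  where
    n    = fromIntegral (length xs)
    mean = sum xs / n
```

For example, variance [2, 4, 4, 4, 5, 5, 7, 9] is 4.0 (the mean is 5 and the squared differences add up to 32), while a "perfect" data set like [3, 3, 3] gives 0.0.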
That’s all well and good but where am I going with this? It turns out that there’s a shortcut way to get sample variance:
1. Sum up your sample data values, square the sum, and divide that by the number of data values
2. Sum the squares of your sample data
3. Subtract the result of step 1 from the sum of the squares
4. Divide by the number of data values
You don’t have to calculate the mean or the squares of the differences from the mean, AND we’re back to the two ingredients of our original problem: the square of the sum and the sum of the squares. As your sample size gets bigger, this saves you a TON of work.
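Here is a sketch of the shortcut. It leans on the identity Σ(x − mean)² = Σx² − (Σx)²/n, so note that the square of the sum gets divided by the number of values before it is subtracted. The name shortcutVariance is hypothetical, and as before the function assumes a non-empty list and divides by the number of values.

```haskell
-- Shortcut variance: only the sum and the sum of squares are needed,
-- so the mean and the individual differences are never computed.
-- Assumes a non-empty list and divides by n, as in the steps above.
-- (shortcutVariance is a hypothetical name for this illustration.)
shortcutVariance :: [Double] -> Double
shortcutVariance xs = (sumOfSquares - squareOfSum / n) / n
  where
    n            = fromIntegral (length xs)
    sumOfSquares = sum (map (^ 2) xs)
    squareOfSum  = sum xs ^ 2
```

Feeding it the numbers 1 to 100 reuses both results from the original problem: (338350 − 25502500/100)/100, which works out to 833.25.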
Thus endeth the lesson….