Card Probability

Suppose that a deck of 52 cards has been shuffled and placed face down. You draw 1 card but do not look at it. You place it face down and off to the side. What is the probability that you drew the Ace of Hearts?

Since there is 1 Ace of Hearts in the deck and 52 cards total, the probability is 1/52.

Now you prepare to draw a second card. What is the probably that the new card is an Ace of Hearts? Is it more likely than before? Less likely? The same? Impossible to know?

To determine the answer, we need to use a probability tree. Here are all the different options that could occur:

pic 1

If we first drew the Ace of Hearts and placed it face down, there is no chance of drawing the Ace of Hearts for our second card. However, if we didn’t draw the Ace of Hearts, then our second card could or could not be the Ace of hearts.

To compute probabilities, we multiply. We want to determine the probability of drawing an Ace of Hearts on the second card. Hence, we multiply the probability of “Not Ace” by the probability of “Ace.”

pic 2

The above fraction reduces to 1/52. This is the same probability as before! We could apply this logic to drawing a third card (assuming we didn’t peak at the second card). We would find that the probability of drawing the Ace of Hearts is still only 1 in 52. No matter how many cards you remove, the probability doesn’t change. Why is this?

One way of understanding this enigma is to consider the information you have. At the beginning, you know that there are 52 cards and 1 specific card you are looking for. Once you start drawing cards, although you keep removing the cards from the deck, you don’t know which cards you are removing. You could be removing the Ace of Hearts, or you could be removing random cards. You don’t gain any new information with each card you remove. So each time you have the same probability as in the beginning.

Or consider another scenario. Suppose I draw a card from the bottom of a full deck. What is the probability that it is an Ace of Hearts? Since the deck is shuffled, the probability should be 1/52. But to get to this card, I could have physically removed all 51 cards above it (without looking at them), placed them in a pile, and then selected the bottom card. Having a pile of unknown cards does not change the probability of the remaining card being an Ace of Hearts.



Filed under Uncategorized

Noodles Part 1

A dish contains 100 strands of cooked spaghetti. See the picture below:Capture

Two ends of spaghetti are chosen at random and tied together. For example:


If this process continues until there are no more loose ends, what is the probability that all strands will form one big loop? In other words, what are the chances that this big mess is actually just a circle?


Seriously, take a guess. When I first read a version of this problem,1 I thought the answer must surely be less than 1%. Watch this:

Now you may think that I took a particularly well behaved group of noodles for the above animation. However, mathematically speaking, the probability of getting a single big loop is 9%, much higher that you probably guessed.

I want to walk you through some of the interesting twists and turns that it took for me to get the answer. First, I had to find a formula for calculating the above probability. My first method was to try less noodles. With 2 noodles, it seemed like I could figure out all the possibilities.

Capture 5

For the first noodle, I started with the left end. I could tie it to itself:Capture6

Then I would have to tie the other noodle to itself and I would not have a big loop:


Another option would be to tie the left end to the end below it:


Then I would tie the remaining 2 ends together to make a loop!


My last option would be to tie the top left end to the right bottom end:


Then I would have to tie the remaining 2 ends together:


Do some untwisting, and make a loop!


To recap, with 2 noodles, we have a 2/3 probability of getting a one big loop. How about 3 noodles?


Instead of going through all the probabilities, we can use a trick. If I tie the top left end of the noodle to the right end of the same noodle, I create an isolated loop like before. That is not good. So instead, consider tying the top left end to any other noodle end:


By doing this, we create one long noodle, which we can move around to form:


Voila! We are back to the 2-noodle case. Since there are 4 places I could tie the top left end (green), and 1 place I cannot (red, since it would make an isolated loop) I have a 4/5 chance.


We multiply this by the 2-noodle case to get

\frac{2}{3} \times \frac{4}{5}

You can continue this logic and find the probability for 5 noodles is:

\frac{2}{3} \times \frac{4}{5} \times \frac{6}{7} \times \frac{8}{9}

Now I noticed a pattern. Combine the top and bottom numbers:

\frac{2 \times 4 \times 6 \times 8}{3 \times 5 \times 7 \times 9}

We are actually just multiplying even numbers and dividing them by odd numbers! I was so excited! I went into excel, punched in the formula, and was thoroughly disappointed.


Apparently, multiplying 100 numbers got excessively big for my computer to handle. I needed a new approach. The noodle probability formula can be written as follows:

f(n) = \cfrac{2 \cdot 4 \cdot 6 \cdot 8 \dots (2n-2)}{ 3 \cdot 5 \cdot 7 \cdot 9 \dots (2n-1)}

Using double factorial notation, we can rewrite it as:

f(n) = \cfrac{(2n-2)!!}{(2n-1)!!}

Using basic double factorial identities2, we can convert the double factorials into regular factorials:

f(n) = \cfrac{2^n n!}{2n} \times \cfrac{2^n n!}{(2n)!}

And use some algebra to simplify:

f(n) = \cfrac{2^{2n} n! n!}{2n (2n)!}

Enter, the Stirling Approximation!3

n! \approx \sqrt{2n\pi} \left( \cfrac{n^n}{e^n} \right)

f(n) \approx \cfrac{2^{2n}}{2n} \times \cfrac{\sqrt{2n\pi} \left( \frac{n^n}{e^n} \right) \times \sqrt{2n\pi} \left( \frac{n^n}{e^n} \right)}{\sqrt{4n\pi} \left( \cfrac{(2n)^{2n}}{e^{2n}} \right)}

We can cancel a bunch of terms to get:

f(n) \approx \cfrac{\sqrt{\pi n}}{2n} = \sqrt{\cfrac{\pi }{4n}}

Finally, a formula my computer can handle! This formula will work for any number of noodles. Maybe you were wondering what number of noodles you would need for the probability to be below 1%. The formula states that we would need 7854 noodles! Personally, if you would have told me that before analyzing the problem, I would have thought you were out to lunch 😛

1The Mathematics Teacher, Volume 109, Number 3, October 2015, page 201.




Leave a comment

Filed under Uncategorized

Matching Tests Proof

Warning, this post uses some fierce math to prove why the pattern continues in the last post. If you want to know why the pattern continues, then welcome to the explanation.

In the last post we counted how many ways to hand student back their tests so that no student had his or her own test. This type of matching has a special name, a derangement. For a group of n students we denote the derangement as


To determine this quantity we need a formula. From before, we know that !3 = 2 and !4 = 9. Now consider the general case. How many ways are there of handing back a test to the first student? We certainly can’t give him his own test. However, we could give him any of the other n – 1 tests. Let’s say we give him test i.

We could hand out the next test to anyone. Let’s consider handing it out to student i (the student with test i). We have two options. First, student i gets student 1’s test. Here is a picture:

pic 1

Once these 2 students have been assigned their tests, we have n – 2 students who still need tests. However, assigned them tests is analogous to our original problem, and we can denote this as !(n – 2).

Second, student i does not get student 1’s test. Thus, student i has 1 forbidden test. All the other students also have 1 forbidden test (namely their own test). Again, this is analogous to our original problem with n – 1 students (since we have not given student i a test yet). We can denote this as !(n – 1).

To summarize, to hand out n tests, we have the following recursive formula:

!n = (n-1)(!(n-1) + !(n-2))

We can use this formula to calculate derangements as high as we like, provided we proceed in order.

!5 = 4(!4 + !3) = 4(9 + 2) = 44

!6 = 5(!5 + !4) = 5(44 + 9) = 265

These are exactly the numerators we found when computing the fractions in the previous post!

To determine what would happen in a large class, we need a closed form for derangements. To simplify the notation for the next section, we will denote !n as D(n).

Observe the following pattern:



D(4) = 9

9 – 4*2 = 1

D(5) = 44

44 – 5*9 = -1

D(6) = 265

265 – 6*44 = 1

D(7) = 1854

1854 – 7*265 = -1

We can prove the above pattern continues using induction. The base case has been dealt with above. Our inductive hypothesis is:

D(n) - nD(n-1) = (-1)^n

Consider n+1:

D(n+1) - (n+1)D(n) = n(D(n)+D(n-1)) - (n+1)D(n) using the recurrence relation

= nD(n) + nD(n-1) - nD(n) - D(n)

= nD(n-1) - D(n)

= -(D(n) - nD(n-1))

= -(-1)^n by the inductive hypothesis

= (-1)^{n+1}

Using this information, we have a new recurrence relation for D(n):

D(n) = nD(n-1) + (-1)^n

Unfortunately, this is still not a closed form solution. However, we will use induction again to get us the rest of the way. The following is the closed form solution for D(n):

D(n)=n!\sum_{k=0}^n\frac{(-1)^k}{k!} = n! \left( \frac{1}{0!} -\frac{1}{1!} +\frac{1}{2!} - \frac{1}{3!} + \dots + \frac{(-1)^n}{n!} \right)

Checking the OEIS, we find that D(0) = 1 and D(1) = 0. Using the formula for D(3), we can see that the base case will hold:

D(3) = 3! \left( \frac{1}{0!} -\frac{1}{1!} +\frac{1}{2!} - \frac{1}{3!} \right)

= 6 \left( 1 -1 + \frac{1}{2} - \frac{1}{6} \right)

= 6 \left(\frac{3}{6} - \frac{1}{6} \right)

= 6 \left(\frac{2}{6} \right)

D(3) = 2 as expected.

Suppose the formula is true for D(n). Consider D(n+1):

D(n+1)= (n+1)D(n) + (-1)^{n+1} from the new recursive formula

= (n+1)n!\sum_{k=0}^n\frac{(-1)^k}{k!} + (-1)^{n+1} using the inductive hypothesis

= (n+1)!\sum_{k=0}^n\frac{(-1)^k}{k!} + (-1)^{n+1} definition of factorial

= (n+1)!\sum_{k=0}^n\frac{(-1)^k}{k!} + (-1)^{n+1} \frac{(n+1)!}{(n+1)!}

= (n+1)! \left[ \sum_{k=0}^n\frac{(-1)^k}{k!} + \frac{(-1)^{n+1}}{(n+1)!} \right]

= (n+1)! \sum_{k=0}^{n+1}\frac{(-1)^k}{k!}

Thus, the formula is true for any n. This formula is closed, but complicated. The last step in the process is to relate derangements to our original problem. Since we were looking for a probability, we were really interested in \frac{D(n)}{n!}. Using the Taylor series approximation for e-1 we get the following:

\frac{D(n)}{n!} = n! \left( \frac{1}{0!} -\frac{1}{1!} +\frac{1}{2!} - \frac{1}{3!} + \dots + \frac{(-1)^n}{n!} \right) \frac{1}{n!}

= \left( \frac{1}{0!} -\frac{1}{1!} +\frac{1}{2!} - \frac{1}{3!} + \dots + \frac{(-1)^n}{n!} \right)

\approx e^{-1}

Finally! We have our answer! Since the number of students determines how many terms of the sequence for e we use, the approximation is valid for even small classes of only 10 or 15 students. The probability of handing out tests properly is approximately e^{-1} = \frac{1}{e} = \frac{1}{2.71828} \approx 37\%

1 Comment

Filed under Uncategorized

Matching Tests

An interesting coincidence happened today in class. My co-teacher was handing the students back their tests. She wanted each student to mark another’s test. However, she accidentally handed a student back her own test. She casually said to me “weird, what are the chances of that?” Her question implied that this was a rare event. The answer is quite the opposite.

My co-teacher had a 63%, or \frac{e-1}{e} to be precise, chance of handing a student back his or her own test. If you are wondering, e stands for a special number:

e = 2.71828…

This percentage is not good. It means that, if you randomly distribute tests to students to mark, about two thirds of the time at least 1 student will end up with his or her own test!

How in the world did I come up with that answer? Let me back up a bit. Suppose we had 3 students and we wanted to give them back their tests to mark so that no student had his or her own test. What options do we have? One option is:

Student A: test B

Student B: test C

Student C: test A

Another option is:

Student A: test C

Student B: test A

Student C: test B

Hence, we have 2 options where we will hand back the tests with no matches. How many options do we have in total?

3! = 3*2*1 = 6

Thus, there is a 2/6 = 33.3% chance that everyone has a different test. In other words, there is a 66.6% chance that at least one student receives his or her own test to mark (which would not be good for academic standards).

What if we had 4 students? You can do the math, and you will find that there are 9 options where the students have different tests. For example:

Student A: test B

Student B: test C

Student C: test D

Student D: test A

Again, basic counting principles dictate that there will be 4! = 24 different possible combinations. Hence, there is a 9/24 = 38% chance each student will have a proper test to mark. We could continue on like this; painstakingly grinding out the probabilities for each situation until we reached a class size of 25. However, there is a pattern we can exploit. Consider the following table:

Number of Students

Fraction of proper distributions


2/6 = 33%


9/24 = 38%


44/120 = 37%


265/720 = 37%


1854/5040 = 37%


14833/40320 = 37%

Do you see the pattern? Once the class size gets to 5, the percentage becomes stable. Based on the above data, we could predict that a class of 25 students will have roughly the same percentage, 37%. Indeed, one can prove mathematically that this percentage will be true for any size class, large or small.1 Thus, my advice to teachers is as follows. Be careful distributing tests to students to mark. You are playing a game with a 37% chance of winning. And any gambler knows, those are not good odds.


1See next blog post.

1 Comment

Filed under Uncategorized


The other day my friend impressed me with her knowledge of perfect squares. She had all the squares up to 30 committed to memory, whereas I struggle to calculate 162! As she listed the squares, I started to notice a pattern:


202 400
212 441
222 484
232 529
242 576


Can you see it? The last digit of the answer can be found by squaring the last digit of the question. For example, 22 = 4, so the last digit of 222 must be 4. This trick works for even large numbers. You may not know what 6232 is, but I guarantee you the last digit is 9.1

However, I found another pattern that allowed me to calculate squares in my head. Here is how it works (using 22).

First, subtract 10: 22 – 10 = 12

Second, multiply the number by 4: 12 x 4 = 48

Third, add a zero: 48 → 480

Fourth, add the square of the last digit of the question: 480 + 4 = 484

Cool eh?2 After practicing a few times I was able to perform this process in a few seconds. However, we need to be careful as we extend the strategy. Let’s do 29:

First, subtract 10: 29 – 10 = 19

Second, multiply the number by 4: 19 x 4 = 76

Third, add a zero: 76 → 760

Fourth, add the square of the last digit of the question: 760 + 81 = 841

In this case, 19 x 4 is a bit trickier to do mentally. In addition, 760 + 81 requires a mental carry, which takes some practice. After thinking about it for a while, I realized that this pattern only applies to the numbers 20-30. However, there is a way to extend the method to the numbers 10-20. Consider the following example 17:

First, subtract 10: 17 – 10 = 7

Second, multiply the number by 4: 7 x 4 = 28

Third, add a zero: 28 → 280

Fourth, add the square of the difference from 20 (20 – 17 = 3): 280 + 9 = 289

To recap, here is my method for calculating squares to 30.

  • For the squares from 1-10, simply remember your multiplication facts.
  • For the squares from 10-20, use the process, remembering to add the square of the difference from 20.
  • For the squares from 20-30, use the process, remembering to add the square of the last digit.3

My friend criticized my method; calling it overly complicated. She said that memorizing the squares was easy and I should do the same. While I agree that memorizing the squares is faster and more reliable than working through my, somewhat tedious method, I quite enjoyed finding the pattern. I hope the pattern will aid in your memorization. I know it has aided in mine.


1The reason that the trick works can be explained as follows.

Break the number up into two pieces (the last digit and the remaining number): 623 = 620 + 3

Square the number: (620 + 3)2 = (620 + 3)(620 + 3)

Expand the brackets using FOIL: (620)(620) + (620)(3) + (3)(620) + (3)(3)

Calculate each term: 384400 + 1860 + 1860 + 9

Observe that every term has a 0 in the 1’s place expect the last term. This will always be the case. Hence, the last digit of the final answer is always determined by the last term of FOIL. The last term of FOIL is calculated by square the last digit of the original question (3×3).


2Similar to the note above, the pattern works because of the distributive property. Consider 222. The multiply by 4 and add a zero step can be combined by simply multiplying by 40.

(22 – 10) x 40 + 22

= 12 x 40 + 22                                                  (simplifying inside the brackets)

= 10 x 40 + 2 x 40 + 22                                     (splitting off 2 groups of 40)

= 20 x 20 + 2 x 40 + 22                                     (rewriting the first multiplication)

= 20 x 20 + 20 x 4 + 22                                     (rewriting the second multiplication)

= 20 x 20 + 20 x 2 + 20 x 2 + 22                                 (splitting off 2 groups of 20)

= (20 + 2)2                                                       (backwards foil)

= (22)2                                                             (simplifying inside the brackets)


3I leave it as an exercise for the reader to show why the trick works from 10 – 20.

1 Comment

Filed under Uncategorized

Averages of Averages

Averages of Averages

My friend recently graduated from high school. After the ceremony, he was comparing his transcript with a fellow student and notice something strange.

Below are the results he saw (names changed):

Bob Eve
Sem 1 83 81
Sem 2 93 92.8


Both Bob and Eve struggled in the first semester but Bob came out with a 2 percent lead. In the second semester, they stepped up their game and scored in the 90s. Again, Bob beat Eve. From these results, one would expect Bob to have the higher overall average for the year, no questions asked. Here are the final marks:

Bob Eve
Final Average 88 89.8


What!? He had a higher average than her in both the first and second semester. However, her overall average for the year was higher than his. How could this possibly happen?? I’ll give you a minute to speculate…

Here are the marks for the individual courses Bob and Eve completed:

Bob Eve
English 83 80
Biology 85 83
History 81 80
Chemistry 80 99
Physics 86 94
Math 83 99
Band 100 90
Gym 93 93
Art 91 91
IT 94 81
Programming 95 95
Design 85 93


The bold marks are first semester and the non-bold marks are second semester. As you can see, Eve took fewer courses in the first semester. Hence, her poor initial performance, was not weighted as heavily in her overall average. On the other hand, Bob took equal semesters. Hence, his poor performance had a greater negative influence on his final average. This is called Simpson’s Paradox1. The paradox is just what my friend experienced. He “won” both the first and second semester individually. However, he “lost” in the aggregate; the overall year average.

This paradox teaches a valuable lesson: you cannot blindly average averages. Eve cannot calculate her final mark by averaging 81 and 92.8 to get 86.9. Instead, she must go back to her individual marks and calculate her average from the original data.


Leave a comment

Filed under Uncategorized

Polynomial Area

I was doodling the other day when a thought occurred to me: “what is the area of a polynomial?” In particular, consider the following two graphs:

poly1  poly3

In the first graph, the blue line is straight. Algebraically, this is a first degree polynomial.

In the second graph, the blue line is curved. Algebraically, this is a parabola; a second degree polynomial.

The area of the first shape is common knowledge:

\frac{1}{2} * base * height = \frac{1}{2} * 1 * 1 = \frac{1}{2}

The area of the second shape can be calucated using calculus and is:

\frac{2}{3} * base * height = \frac{2}{3} * 1 * 1 = \frac{2}{3}

What if we considered a more complicated polynomial? A third degree polynomial looks like this:


The new curve bends even more and takes up more area. Using calculus, we find that the area is  \frac{3}{4}

Below are the graphs of 10th, 20th, and 100th degree polynomials:




The larger and larger degree polynomials start to look like a square! This means, that we would expect the area to get closer and closer to 1. This is exactly what I found. The area of a polynomial with degree 100 is 0.99. In general, the area of a polynomial with degree n is:


Taking the limit, we find that:

\lim_{x \to \infty} \frac{n}{n+1} = 1

I found it very interesting that a large degree polynomial becomes so similar to a square. Click the link to view an animation displaying this fact

Leave a comment

Filed under Uncategorized