Introduction
In introductory data science, we introduce students to a wide range of topics as a gentle introduction to computational statistics. One of the topics we introduce students to is combinatorics, an area of mathematics that explores counting. Sounds easy, right? How hard can counting be?
Unfortunately, it’s not all that simple. What if I were to ask you to count the number of unique passwords there are where a password contains exactly 4 digits, and each digit could be the number 0 through 9? Maybe you could feasibly imagine sitting there and writing down every single password combination. Well, what if instead, I told you that the password contains exactly 10 digits? That will be a lot of writing down! In fact, if you were to attempt to write down all possible passwords at a rate of 1 password each second, it would take you just over 317 years to write down all the combinations…
Clearly, trying to brute force every single possible combination is not feasible when our questions involve counting a massive number of outcomes. Hence, combinatorics involves leveraging mathematical principles to easily work out these problems.
In this post we will be talking specifically about counting with and without replacement. The aim of this post will not necessarily be developing an in-depth intuition of how this all works (although, we will attempt to do this casually). The aim is that I want you to identify the differences behind the calculations when we do questions with replacement and without replacement.
This topic can be quite confusing to students when they first see it. I hope this guide/post will provide some useful tools for students to understand how to approach combinatorics topics. In particular, I did write this post with my DATA1001 students that I teach at the University of Sydney in mind. Hopefully this resource will be of benefit in preparing for their end of semester exam!
Counting With Replacement
Question: iPhones typically contain 4-digit passwords, where each digit can be a number from 0 through 9. How many unique passwords are there?
For those who have already seen this topic before, this question might sound familiar. It is one of the stock-standard questions that is used when first introducing this topic, however, I do think it has a lot of merit, and it is hence why we will start our discussions with it here.
So, breaking down the question, we want to see how many passwords with a length of 4 can be made, where for each digit, we have 10 choices -> i.e. 0,1,2,3,4,…,9.
Now, we’ve already mentioned that you could start by writing every combination that there is, and counting up how many that you write. For example, you could start with “0000”, then “0001”, then “0002”, and so on. This would take a very long time!
Luckily for us, there is a very quick mathematical way to accomplish this. We can use the following calculation to work out the answer:
Where does this come from? There are 10 ways to choose the first digit, 10 ways to choose the second digit, 10 ways to choose the third, and 10 ways to choose the fourth. We multiply together these four numbers to give all the different possible combinations.
The heading of this blog post mentions “permutations”, but we are yet to introduce this term. The dictionary (Oxford) defines permutations as “[the] several possible ways in which a set or number of things can be ordered or arranged.” Hence, this question is a permutation question, as we are interested in the different ways that the numbers can be arranged to create unique passwords.
As an aside, this value of 10000 should line up with what we would expect. If instead of thinking about the passwords as passwords, but instead thought about them as numbers, you can clearly see that using the 4 digits, we can start at the number 0 (password “0000”), and work our way up to the number 9999 (password “9999”). In this regard, there are 10000 unique numbers, meaning there are 10000 unique passcodes.
If instead we were to say that we had a n-digit password, the number of unique passcodes is:
$$10^n$$For example, in the case where we have passwords of size 4:
$$10^4=10\times10\times10\times10=1000$$Counting Without Replacement
Now, what if we were to change our question so that we were still interested in 4-digit passcodes, but we had the additional requirement that each digit can only appear in the passcode once? Let’s check out the answer to the question:
Question: iPhones typically contain 4-digit passwords, where each digit can be a number from 0 through 9. How many unique passwords are there, with the restriction that each digit can appear at most 1 time in the passcode?
This question is different from the one before. Previously, we could have the password “0000”, however, we cannot now, as the digit “0” is only allowed to occur a single time and here it appears 4 times! Also, it is important to recognise that despite adding this restriction, this is still a permutation question as we are interested in the ways things can be ordered or arranged.
Even though this question seems a little bit more complicated, solving it is still relatively straightforward:
$$10\times9\times8\times7 = 5040$$We can see that now there are much less possible passcodes. In fact, the number of passcodes has gone from 10000 before the restriction, to 5040 after the restriction. This is an almost 50% decrease!
How can we think conceptually about this formula? At first, we have 10 digits available. After picking one of the digits, we now have 9 available, followed by 8, and then followed by 7.
What are the Chances? – Expanding to Probability
A natural extension of counting problems is to ask probability questions. For example:
Question: A person randomly enters a 4-digit iPhone password. What is the probability that for the password entered, each digit is unique?
In general, to find the probability, we want to find the number of combinations for the restricted case, and the number of combinations for the unrestricted case. In general, the probability will then be:
$$P = \frac{\text{Restricted Count}}{\text{Unrestricted Count}}$$
The numerator (or “Restricted Count”) represents the 4-digit passwords where each digit is unique (that is, each digit can only occur a maximum of 1 time in the password). Conveniently, we already calculated this in the “Counting without Replacement” section, where we found the value is 5040.
For this question, the denominator (or “Unrestricted Count”) represents all valid 4-digit iPhone passwords. We have already found this in the “Counting With Replacement” section, where we found the value is 10000.
Now, we can substitute in these values to easily find the probability:
Main Update: 28 October 2024
Thumbnail created using Gemini.