##### Introduction

In introductory data science, we introduce students to a wide range of different topics as a gentle introduction into the realm of computational statistics. One of the topics we introduce students to is combinatorics, which is an area of mathematics which explores counting. Sounds easy right? How hard can counting be?

Unforunately, it’s not all that simple. What if I was to ask you to count the number of unique passwords there are where a password contains exactly 4 digits, and each digit could be the number 0 through 9? Maybe you could feasibly imagine sitting there and writing down every single password combination. Well, what if instead I told you that the passwords contains exactly 10 digits? That will be a lot of writing down! In fact, if you were to attempt to write down all possible passwords at a rate of 1 password each second, it would take you just over 317 years to write down all the combinations…

Clearly, trying to brute force every single possible combination is not feasible when our questions involve counting a massive number of outcomes. Hence, combinatorics involves leveraging mathematical principles in order to easily work out these problems.

In particular, in this post we will be talking specifically about counting with and without replacement – more on this soon. The aim of this post will not necessarily be developing an in-depth intuition of how this all works (although, we will attempt to do this in a more casual fashion). The aim is that I want you to identify the differences behind the calculations when we do questions with replacement and without replacement.

This topic can be quite confusing to students when they first see it. To be clear, I am by no means an expert at combinatorics – it is a massive field of mathematics. I am however quite confident with the material that is required for first year data science. I hope this guide/post will provide some useful tools for students to understand how to approach combinatoric topics. In particular, I did write this post with my DATA1001 students that I teach at the University of Sydney in mind. Hopefully this resource will be of benefit in preparing for the end of semester exam!

##### Counting With Replacement

Question: iPhone’s typically contain 4-digit passwords, where each digit can be a number from 0 through 9. How many unique passwords are there?

For those that have already seen this topic before, this question might sound familiar. It is one of the stock-standard questions that is used when first introducing this topic, however, I do think it has a lot of merit, and it is hence why we will start our discussions with it here.

So, breaking down the question, we want to see how many passwords with a length of 4 can be made, where for each digit, we have 10 choices -> i.e. 0,1,2,3,4,…,9.

Now, we’ve already mentioned that you could start by writing every combination that there is, and counting up how many that you write. For example, you could start with “0000”, then “0001”, then “0002”, and so on. This would take a very long time!

Luckily for us, there is a very quick mathematical way to accomplish this. We can use the following calculation to work out the answer:

Where does this come from? There are 10 ways to choose the first digit, 10 ways to choose the second digit, 10 ways to choose the third, and 10 ways to choose the fourth. We multiply together these four numbers to give all the different possible combinations.

As an aside, this value 10000 should line up with what we would expect. If instead of thinking about the passwords as passwords, but instead thought about them as numbers, you can clearly see that using the 4 digits, we can start at the number 0 (password “0000”), and work our way up to the number 9999 (password “9999”). In this regards, there are 10000 unique numbers, meaning there are 10000 unique passcodes.

If instead we were to say that we had a n-digit password, the number of unique passcodes is:

For example, in the case where we hade passwords of size 4:

##### Counting with Replacement

Now, what if we were to change our question so that we were still interested in 4-digit passcodes, but we had the additional requirement that each digit can only appear in the passcode once? Let’s check out the answer to the question:

Question: iPhone’s typically contain 4-digit passwords, where each digit can be a number from 0 through 9. How many unique passwords are there,

with the restriction that each digit can appear at most 1 time in the passcode?

This question is different to the one before. Previously, we could have the password “0000”, however we cannot now, as the digit “0” is only allowed to occur a single time and here it appears 4 times!

Even though this question seems a little bit more complicated, solving it is still relatively straightforward:

We can see that now there are much less possible passcodes. In fact, the number of passcodes has gone from 10000 before the restriction, to 5040 after the restriction. This is an almost 50% decrease!

How can we think conceptually about this formula? At first, we have 10 digits available. After picking one of the digits, we now have 9 available, followed by 8, and then followed by 7.

And that’s that!

**More Coming Soon…**

*I’ll update this post at a later date. I am currently *publishing a more simpler one so that my students who are about to sit a university data exam can benefit from this resource now before it is polished and expanded!