Generating events to count probabilities with Python
In the previous article we have talked about confirming solution for simple probabilistic problems. What if we have slightly complex environment and we cannot calculate all the possible outcomes? We can still make a bunch of experiments and by the law of large numbers the average of the results will be close to expected value. As usually Python will help us generate vast amount of possible events and we will compare that result to our initial guess. Let’s try to solve some examples using this approach!
Example 1
A baseball team wins each game with probability 40% and the games are all independent. What is the probability that they get their first win of the season in their fifth game?
Solution
Losing probability is a complement probability so it’s equal to 60%. Therefore by multiplication rule for independent events we need to count losing 4 games in a row and then winning fifth game. 0.6*0.6*0.6*0.6*0.4 = 0.05184
which is a 5% chance for such an outcome.
Confirmation
First let’s define a function which will tell us whether we win a game.
1 | win_probability = 0.4 |
Next we introduce a concept of Bernoulli trial - it’s just a random experiment with exactly two possible outcomes. Our experiment is about winning in the fifth game only.
1 | games = 5 |
Then we conduct a lot of experiments to see whether our empirical probability will eventually converge to correct solution.
1 | trials = [100, 10**3, 10**6] |
Number of events; success outcomes; probability
- 100; 9; 9.0
- 1000; 47; 4.7
- 1000000; 51711; 5.1711
To visually confirm convergence we can build a plot showing relation between probability deviation and number of events.
That’s exactly what we have expected, let’s confirm other problems as well.
Example 2
In a certain game of tennis, Alex has a 60% probability to win any given point against Blake. The player who gets to 4 points first wins the game, and points cannot end in a tie. What is Alex’s probability to win the game?
Solution
The are four different possible outcomes for Alex to win:
- Alex wins
4-0
. It’s a0.6^4
probability of winning. - Alex wins
4-1
. Blake can win any of the 5 games except the last one. So it’s a combination of 1 from 4 possible multiplied by winning 4 games and losing one.
- Alex wins
4-2
. The same logic applied here: we multiply combinations of 2 out of 5 by winning 4 games probability and losing 2 of them.
- Alex wins
4-3
. Instead of writing the same formula I provide a generalization for arbitraryN
(number of points to win) andp
(probability of having a point)
Adding up all of this partial probabilities we end up with 71% result.
Confirmation
Following the same approach we have
1 | win_probability = 0.6 |
and our trial
1 | def trial(): |
Code for repeating experiments is the same as above
1 | trials = [100, 10**3, 10**6] |
Number of events; success outcomes; probability
- 100; 73; 73.0
- 1000; 715; 71.5
- 1000000; 710356; 71.0356
Chart shows the same probability as we calculated before
Example 3
You and a friend take turns rolling a fair six-sided die, and the first person to roll a 6 wins. What is the probability that the person who makes the first roll wins the game?
Solution
Let’s start by iterating on possible outcomes. Suppose you win in a first roll. Then the probability of winning is 1/6
. In order to win next time (5/6
for losing in a first roll) you need your friend to roll something other than 6 (it’s a 5/6
probability) and then to have a winning roll (which is a 1/6
).
Following the same logic we encounter infinite sum 1/6 + 5/6*5/6*1/6 + 5/6*5/6*5/6*5/6*1/6 + ... +
. We can calculate it easily using a formula for geometric series. 1/6 / (1 - 25/36) = 0.545
which is 54.5% in result.
Confirmation
At this point you can try and implement a solution by yourself because everything is the same as previously
1 | def winroll(): |
Number of events; success outcomes; probability
- 100; 52; 52.0
- 1000; 543; 54.30
- 1000000; 545099; 54.5099
Additional a chart confirming convergence to the same value
Example 4
Two people, Alice and Bill, each roll a fair 20-sided die. What is the probability that Alice’s roll is higher than Bill’s roll?
Solution
For each number a player has equal 1/20
probability. It’s impossible for Alice to win with 1
(it’s not higher than any other number). For 2
there are one possible outcome from Bill’s side which brings a victory to Alice. For 3
there are two possible outcomes and the same goes up to 20
which has 19 positive responses.
The only thing we need is to add all the probabilities together 1/20 * (1/20 + 2/20 + ... + 19/20) = 1/20 * 190/20 = 19/40 = 0.475
. To quickly calculate the sum within brackets we can use sum of arithmetic sequence. This gives us 47.5% as a result.
Confirmation
Code should look very familiar
1 | def roll(): |
Number of events; success outcomes; probability
- 100; 44; 44.0
- 1000; 461; 46.1
- 1000000; 474146; 47.4146
Resulting plot is also behaves as expected
Summary
Probabilities might be tricky but following simple principles and confirming our guesses with a code we can increase confidence for our solution and empirically verify its correctness. You can apply same technique to almost any probabilistic problem, so do not stop here and go for new adventures. See you soon.