This is a Jupyter Notebook with my notes on randomness, probability, and the sensation of "coincidence." The notebook has written notes and Python code that you can execute. (You don't need to be able to read or write Python code to follow along with the notebook.)

These notes are still a work in progress! I originally wrote these notes to go along with an in-class presentation, and they're not quite "freestanding" at this point.

From *Does your iPod play favorites?*:

Last spring it dawned on Apple CEO Steve Jobs that the heart of his hit iPod digital music player was the "shuffle." [...] [Jobs] used the idea as the design principle of the new low-cost iPod Shuffle. Its ad slogan celebrates the serendipity music lovers embrace when their songs are reordered by chance--"Life is random." [...] But just about everyone who has an iPod has wondered how random the iPod shuffle function really is. [...] More than a year ago, I outlined these concerns to Jobs; he dialed up an engineer who insisted that shuffle played no favorites. Since then, however, millions of new Podders have started shuffling, and the question has been discussed in newspapers, blogs and countless conversations. It's taking on Oliver Stone-like conspiracy buzz. [...] Apple execs profess amusement. "It's part of the magic of shuffle," says Greg Joswiak, the VP for iPod products. Still, I asked him last week to double-check with the engineers. They flatly assured him that "Random is random," and the algorithm that does the shuffling has been tested and reverified.

Compare with, e.g., *Divination: Shufflemancy*:

Shufflemancy is a type of divination that works by receiving messages in the forms of songs in a playlist. In this form of divination the practitioner will ask a question, or for guidance, and then activate shuffle in a playlist of some type. This playlist will then begin to play a song and from the song the practitioner will get a message....

From *Sid Meier and Rob Pardo on Probability and Player Psychology*:

When designing the combat system in Civilization: Revolution, Sid Meier found himself up against some interesting design problems. [...] In Civ Rev, the strength of units was displayed up front to players before battle to show the odds of victory. For example, an attacking unit might be rated at 1.5 with the defending unit at 0.5. This is a 3-to-1 situation. [...] [T]he testers expected to win this battle every time despite there being a 25% chance of losing each time. [...] When the player was presented with 3-to-1 or 4-to-1 odds, they expected to win. With 2-to-1 odds, the player will accept losing some of the time, but expects to win at 20-to-10, which is just a larger expression of 2-to-1. When the numbers get larger, the perceived advantage grows.

To adjust for this, Sid actually changed the math again so that the outcomes of previous battles are taken into account. He found that if a player lost too many 2-to-1 battles in a row, they would get frustrated. Instead of risking a player shutting the game down, Sid changed the math.

[Blizzard Designer] Rob Pardo discussed ... the perception of bonuses versus punishments. [...] [P]layers would get into cold streaks where they went a long time without getting a quest item drop.

Instead of correctly attributing this to randomness, they would blame the math or the random number generator. To get around this, Rob and his team actually changed the code to increase the drop rate after each kill until it hits 100% and then reset it for the next one.

It's interesting that players don't want to accept math when it doesn't work out for them, but are more than happy to accept it when it rewards them. Purists might dislike Sid Meier changing the math to appease players, but the game became more fun. Blizzard opts to package systems as rewards rather than penalties and takes a little randomness out of the equation.
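The loot mechanic Pardo describes is concrete enough to sketch in code. Below is a minimal simulation of that kind of "pity timer"; the base drop rate and per-kill increment are made-up values for illustration, not Blizzard's actual numbers:

```
import random

def kills_until_drop(base_rate=0.1, increment=0.05):
    # the drop rate rises after each miss and resets after a hit
    # (base_rate and increment are hypothetical values)
    rate = base_rate
    kills = 0
    while True:
        kills += 1
        if random.random() < rate:
            return kills
        rate = min(1.0, rate + increment)

# how long was the longest cold streak in ten thousand drops?
streaks = [kills_until_drop() for i in range(10000)]
print("longest cold streak:", max(streaks), "kills")
```

With these numbers, the rate reaches 100% after eighteen misses, so no cold streak can last longer than nineteen kills; a fixed drop rate with the same long-run average would occasionally produce much longer droughts.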

Below, I use Tarot in order to demonstrate some elementary principles of probability—and use elementary principles of probability in order to talk about Tarot. I'm especially concerned with the idea of *coincidence* in Tarot. Often in my own Tarot practice, I see patterns emerge—certain cards that come up over and over, for example, or certain suits that seem to be overrepresented—and attribute meaning to those patterns. My hope is that in understanding the nature of how likely these patterns are, I can be better at assigning them meaning.

Terminology note: In the text below, I refer to a sequence of cards drawn from a Tarot deck as a "spread." This is just a convenience—spreads are, of course, more complicated than that, and their meaning derives not just from the sequence of cards but also from their spatial relationships to one another, etc.

In [1]:

```
from tqdm.notebook import tqdm
import json
```

In [2]:

```
cards_raw = json.load(open("tarot_interpretations.json"))['tarot_interpretations']
```

In [3]:

```
cards = [{k: item[k] for k in ['name', 'rank', 'suit']} for item in cards_raw]
```

In [4]:

```
cards
```

Out[4]:

[{'name': 'The Fool', 'rank': 0, 'suit': 'major'}, {'name': 'The Magician', 'rank': 1, 'suit': 'major'}, {'name': 'The Papess/High Priestess', 'rank': 2, 'suit': 'major'}, {'name': 'The Empress', 'rank': 3, 'suit': 'major'}, {'name': 'The Emperor', 'rank': 4, 'suit': 'major'}, {'name': 'The Pope/Hierophant', 'rank': 5, 'suit': 'major'}, {'name': 'The Lovers', 'rank': 6, 'suit': 'major'}, {'name': 'The Chariot', 'rank': 7, 'suit': 'major'}, {'name': 'Strength', 'rank': 8, 'suit': 'major'}, {'name': 'The Hermit', 'rank': 9, 'suit': 'major'}, {'name': 'The Wheel', 'rank': 10, 'suit': 'major'}, {'name': 'Justice', 'rank': 11, 'suit': 'major'}, {'name': 'The Hanged Man', 'rank': 12, 'suit': 'major'}, {'name': 'Death', 'rank': 13, 'suit': 'major'}, {'name': 'Temperance', 'rank': 14, 'suit': 'major'}, {'name': 'The Devil', 'rank': 15, 'suit': 'major'}, {'name': 'The Tower', 'rank': 16, 'suit': 'major'}, {'name': 'The Star', 'rank': 17, 'suit': 'major'}, {'name': 'The Moon', 'rank': 18, 'suit': 'major'}, {'name': 'The Sun', 'rank': 19, 'suit': 'major'}, {'name': 'Judgement', 'rank': 20, 'suit': 'major'}, {'name': 'The World', 'rank': 21, 'suit': 'major'}, {'name': 'ace of wands', 'rank': 1, 'suit': 'wands'}, {'name': 'two of wands', 'rank': 2, 'suit': 'wands'}, {'name': 'three of wands', 'rank': 3, 'suit': 'wands'}, {'name': 'four of wands', 'rank': 4, 'suit': 'wands'}, {'name': 'five of wands', 'rank': 5, 'suit': 'wands'}, {'name': 'six of wands', 'rank': 6, 'suit': 'wands'}, {'name': 'seven of wands', 'rank': 7, 'suit': 'wands'}, {'name': 'eight of wands', 'rank': 8, 'suit': 'wands'}, {'name': 'nine of wands', 'rank': 9, 'suit': 'wands'}, {'name': 'ten of wands', 'rank': 10, 'suit': 'wands'}, {'name': 'page of wands', 'rank': 'page', 'suit': 'wands'}, {'name': 'knight of wands', 'rank': 'knight', 'suit': 'wands'}, {'name': 'queen of wands', 'rank': 'queen', 'suit': 'wands'}, {'name': 'king of wands', 'rank': 'king', 'suit': 'wands'}, {'name': 'ace of cups', 'rank': 1, 'suit': 'cups'}, {'name': 'two of cups', 'rank': 2, 'suit': 'cups'}, {'name': 'three of cups', 'rank': 3, 'suit': 'cups'}, {'name': 'four of cups', 'rank': 4, 'suit': 'cups'}, {'name': 'five of cups', 'rank': 5, 'suit': 'cups'}, {'name': 'six of cups', 'rank': 6, 'suit': 'cups'}, {'name': 'seven of cups', 'rank': 7, 'suit': 'cups'}, {'name': 'eight of cups', 'rank': 8, 'suit': 'cups'}, {'name': 'nine of cups', 'rank': 9, 'suit': 'cups'}, {'name': 'ten of cups', 'rank': 10, 'suit': 'cups'}, {'name': 'page of cups', 'rank': 'page', 'suit': 'cups'}, {'name': 'knight of cups', 'rank': 'knight', 'suit': 'cups'}, {'name': 'queen of cups', 'rank': 'queen', 'suit': 'cups'}, {'name': 'king of cups', 'rank': 'king', 'suit': 'cups'}, {'name': 'ace of swords', 'rank': 1, 'suit': 'swords'}, {'name': 'two of swords', 'rank': 2, 'suit': 'swords'}, {'name': 'three of swords', 'rank': 3, 'suit': 'swords'}, {'name': 'four of swords', 'rank': 4, 'suit': 'swords'}, {'name': 'five of swords', 'rank': 5, 'suit': 'swords'}, {'name': 'six of swords', 'rank': 6, 'suit': 'swords'}, {'name': 'seven of swords', 'rank': 7, 'suit': 'swords'}, {'name': 'eight of swords', 'rank': 8, 'suit': 'swords'}, {'name': 'nine of swords', 'rank': 9, 'suit': 'swords'}, {'name': 'ten of swords', 'rank': 10, 'suit': 'swords'}, {'name': 'page of swords', 'rank': 'page', 'suit': 'swords'}, {'name': 'knight of swords', 'rank': 'knight', 'suit': 'swords'}, {'name': 'queen of swords', 'rank': 'queen', 'suit': 'swords'}, {'name': 'king of swords', 'rank': 'king', 'suit': 'swords'}, 
{'name': 'ace of coins', 'rank': 1, 'suit': 'coins'}, {'name': 'two of coins', 'rank': 2, 'suit': 'coins'}, {'name': 'three of coins', 'rank': 3, 'suit': 'coins'}, {'name': 'four of coins', 'rank': 4, 'suit': 'coins'}, {'name': 'five of coins', 'rank': 5, 'suit': 'coins'}, {'name': 'six of coins', 'rank': 6, 'suit': 'coins'}, {'name': 'seven of coins', 'rank': 7, 'suit': 'coins'}, {'name': 'eight of coins', 'rank': 8, 'suit': 'coins'}, {'name': 'nine of coins', 'rank': 9, 'suit': 'coins'}, {'name': 'ten of coins', 'rank': 10, 'suit': 'coins'}, {'name': 'page of coins', 'rank': 'page', 'suit': 'coins'}, {'name': 'knight of coins', 'rank': 'knight', 'suit': 'coins'}, {'name': 'queen of coins', 'rank': 'queen', 'suit': 'coins'}, {'name': 'king of coins', 'rank': 'king', 'suit': 'coins'}]

The `len()` function tells us how many cards are in the list:

In [5]:

```
len(cards)
```

Out[5]:

78

That's a value I'll assign to a variable `n` that I can reuse below. Whenever you see `n` in the cells below, it's referring back to the number of cards in the deck.

In [6]:

```
n = len(cards)
```

Python lets us pick things at random from a list in a few ways. All of them require the `random` module, which we can make available in the notebook like so:

In [7]:

```
import random
```

To pick one card, use `random.choice()`:

In [8]:

```
random.choice(cards)
```

Out[8]:

{'name': 'king of cups', 'rank': 'king', 'suit': 'cups'}

Now think of a particular Tarot card. The probability that the card on top of a shuffled deck is your card is $\frac{1}{n}$, where `n` is the number of cards in the deck. Here's how to calculate this in Python:

In [9]:

```
1 / n
```

Out[9]:

0.01282051282051282

In other words, there's a slightly better than 1% chance that the card you're thinking of will be on the top of the deck. (Assuming that the deck has been shuffled fairly, etc.)

To make sure this math is right, I like to run "simulations" in Python. In a simulation, I perform the task whose probability I want to calculate over and over, incrementing a counter whenever the event in question happens. Afterward, I can divide the number of successful trials by the total number of trials to get the frequency of successes. The code below does this for our simple one-card spread:

In [10]:

```
trials = 100000
matches = 0
# do the following 100k times...
for i in tqdm(range(trials)):
    # pick a card
    drawn = random.choice(cards)
    # increment matches if name matches the card we're
    # looking for
    if drawn['name'] == 'The Tower':
        matches += 1
print(matches / trials)
```


0.01325

When you run this cell, you should see a number more or less similar to the one we calculated above (~0.0128). As you increase the number of trials, the resulting frequency should converge on the calculated probability. Try increasing `trials` and re-running the cell to see for yourself.
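Here's a quick sketch of that experiment, re-running the same simulation at several trial counts (each run will come out a little different):

```
# re-run the one-card simulation at increasing trial counts
for trials in [100, 1000, 10000, 100000]:
    matches = 0
    for i in range(trials):
        if random.choice(cards)['name'] == 'The Tower':
            matches += 1
    print(trials, "trials ->", matches / trials)
```

The frequencies at the larger trial counts should land much closer to 0.0128 than the ones at the smaller counts.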

Python syntax notes: Comments begin with `#`. Code blocks are indicated not with curly braces (`{` and `}`) as in JavaScript, C, etc., but with indentation.

The chance of drawing a card that belongs to a particular category can be calculated in a similar way. If `c` is the number of cards in that category, the probability that a single card you draw will belong to that category is $\frac{c}{n}$.

To demonstrate, the code below makes a new list of cards that contains *only* the major arcana:

In [11]:

```
major_arcana = [item for item in cards if item['suit'] == 'major']
```

There are this many major arcana:

In [12]:

```
len(major_arcana)
```

Out[12]:

22

A value I'll assign to the variable `n_major`:

In [13]:

```
n_major = len(major_arcana)
```

The proportion of the deck made up of major arcana:

In [14]:

```
n_major / n
```

Out[14]:

0.28205128205128205

Python has a function `random.sample()` that lets us pick items at random from a list *without replacement*. "Without replacement" here means that once an item has been picked, it won't be picked again in the same sampling process, so we won't see any duplicate cards. This is a good way of simulating a simple Tarot spread:

In [15]:

```
random.sample(cards, 3)
```

Out[15]:

[{'name': 'queen of coins', 'rank': 'queen', 'suit': 'coins'}, {'name': 'ace of wands', 'rank': 1, 'suit': 'wands'}, {'name': 'four of wands', 'rank': 4, 'suit': 'wands'}]

(Change the number after the comma to pick a different number of cards.)

Now think of *two* Tarot cards. The chance that a two-card spread will consist of exactly those two cards, in that order, is calculated as the chance of drawing one of the cards from the full deck (i.e., $\frac{1}{n}$), multiplied by the chance of drawing the other card from a deck that is missing one card (because, in this scenario, the first card is no longer in the deck), i.e., $\frac{1}{n-1}$. Multiplying across gives us $\frac{1}{n(n-1)}$, which can be expressed in Python like so:

In [16]:

```
1 / (n * (n - 1))
```

Out[16]:

0.0001665001665001665

In [17]:

```
trials = 100000
target = ['The Star', 'The Pope/Hierophant']
matches = 0
for i in tqdm(range(trials)):
    if [item['name'] for item in random.sample(cards, 2)] == target:
        matches += 1
print(matches / trials)
```


0.00014

A Tarot spread is an example of a *permutation without repetition*: a sequence of items drawn at random from a discrete set, where the order in which the items are drawn matters. Python comes with a function called `permutations` that produces all of the possible permutations for a given list of things.

In [18]:

```
from itertools import permutations
```

To make this demonstration a bit cleaner, I'm going to make a list of all of the card names:

In [19]:

```
card_names = [item['name'] for item in cards]
```

There is exactly one permutation of zero cards (i.e., no cards at all):

In [20]:

```
list(permutations(card_names, 0))
```

Out[20]:

[()]

And there are *n* permutations of one card (i.e., every card is its own single-card spread):

In [21]:

```
print(list(permutations(card_names, 1)))
```

But there are 6006 permutations of *two* cards, of which I show just the first 100 below:

In [22]:

```
print(list(permutations(card_names, 2))[:100])
```

The number 6006 results from multiplying our `n` by `n - 1`:

In [23]:

```
n * (n - 1)
```

Out[23]:

6006

Here's a chunk of the permutations of three cards:

In [24]:

```
print(list(permutations(card_names, 3))[:100])
```

Using the `permutations` function, we can find the total number of permutations for spreads of any size:

In [25]:

```
for i in range(4):
    print(i, "cards ->", len(list(permutations(cards, i))), "permutations")
```

The `permutations()` function gives an exhaustive list of all possible permutations, and with 78 cards in the deck, the number of permutations in a spread of five cards is so large that my computer hangs just generating the list. If we're only interested in the *number* of permutations, though, we don't have to list them. Instead, we can calculate the count with the following formula:

$$\frac{n!}{(n-k)!}$$

Here, `!` is the *factorial* operator, which evaluates to its operand multiplied by every integer smaller than it, down to one (e.g., 4! = 4 × 3 × 2 × 1 = 24). The following code implements a function `npr()` that takes the number of items in the deck and the number to sample at once, and returns the total possible number of permutations:

In [26]:

```
from math import factorial
def npr(n, r):
    return factorial(n) // factorial(n - r)
```

In [27]:

```
for i in range(10):
    print(i, "cards ->", npr(n, i), "permutations")
```

So the probability of drawing one *particular* three-card spread is:

In [28]:

```
1 / npr(n, 3)
```

Out[28]:

2.190791664475875e-06

Or without scientific notation:

In [29]:

```
print(f'{1/npr(n,3):.10f}')
```

0.0000021908

That's roughly a one-in-half-a-million chance. Sometimes it *feels* like my spreads tell me the same thing over and over. But it's unlikely you'll see the exact same three-card spread twice, even if you did half a million of them.

In [32]:

```
trials = 2000000
target = ['The Star', 'Temperance', 'The Devil']
matches = 0
for i in tqdm(range(trials)):
    if [c['name'] for c in random.sample(cards, 3)] == target:
        matches += 1
print(matches, "match(es);", matches / trials, "frequency")
```


1 match(es); 5e-07 frequency

There are this many ten-card spreads:

In [33]:

```
npr(n, 10)
```

Out[33]:

4566176969818464000

And *this* is the probability of drawing any particular sequence of ten Tarot cards:

In [34]:

```
print(f'{1/npr(n,10):.25f}')
```

0.0000000000000000002190016

Meaning that any given ten-card spread is almost certainly unique.

There are some patterns in Tarot spreads that aren't about the order of the cards. We might, for example, observe that there are an unusual number of cards of a particular suit in a spread, or that major arcana have been showing up with unusual frequency in daily readings over the course of a week. But how do we know if these frequencies are actually unusual?

Let's start with the following question: How likely is it that *every card* in a three-card spread belongs to the suit of cups? The probability that a single card belongs to cups is the number of cards in the suit of cups divided by the number of cards total ($\frac{14}{78}$):

In [35]:

```
14 / n
```

Out[35]:

0.1794871794871795

(Meaning that about 18% of cards you draw will be cups.)

The probability of drawing *two* cards that are cups is $\frac{14}{78} \times \frac{13}{77}$ (because after having drawn the first card, there are only 13 cups and 77 cards left in the deck):

In [36]:

```
(14 / n) * (13 / (n - 1))
```

Out[36]:

0.030303030303030304

And the probability that all *three* cards drawn are cups:

In [37]:

```
(14 / n) * (13 / (n-1)) * (12 / (n-2))
```

Out[37]:

0.004784688995215311

In [38]:

```
trials = 100000
all_cup_count = 0
for i in tqdm(range(trials)):
    # get just the suit for three random cards
    c_suits = [c['suit'] for c in random.sample(cards, 3)]
    # if all are cups, up the count
    if c_suits == ['cups', 'cups', 'cups']:
        all_cup_count += 1
print(all_cup_count / trials)
```


0.00464

If you dealt one three-card spread every day for a year, you'd expect to see an all-cups spread about this many times:

In [39]:

```
(all_cup_count / trials) * 365
```

Out[39]:

1.6936

Another way to state this problem is this: there are a certain number of different ways that three cards can be drawn from a Tarot deck. In how many of those are all of the cards cups? In this case, we don't care about the *order* of the cards, so we're talking about *combinations* (rather than permutations). The formula for calculating the number of combinations of `r` items drawn from a set of `n` is:

$$\frac{n!}{r!(n-r)!}$$

The following code implements this as a function `ncr()` (short for "given *n* items, choose *r*"):

In [40]:

```
import operator as op
from functools import reduce
def ncr(n, r):
    r = min(r, n - r)
    numer = reduce(op.mul, range(n, n - r, -1), 1)
    denom = reduce(op.mul, range(1, r + 1), 1)
    return numer // denom
```

The function tells us how many combinations of two cards there are:

In [41]:

```
ncr(n, 2)
```

Out[41]:

3003

This is, notably, exactly half of the number of permutations:

In [42]:

```
npr(n, 2)
```

Out[42]:

6006

... which makes sense. If you aren't worrying about the order of cards, then each pair of cards corresponds to exactly two permutations (one for each order), and those two count as a single combination.
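We can check this relationship directly. In general, each combination of $r$ cards corresponds to $r!$ permutations (one for each possible ordering):

```
# each two-card combination corresponds to 2 orderings...
print(ncr(n, 2) * 2 == npr(n, 2))
# ...and each three-card combination to 3! = 6 orderings
print(ncr(n, 3) * factorial(3) == npr(n, 3))
```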

The number of three-card combinations:

In [43]:

```
ncr(n, 3)
```

Out[43]:

76076

The `ncr()` function also lets us calculate how many combinations of three cards there are among the fourteen cards of the suit of cups:

In [44]:

```
ncr(14, 3)
```

Out[44]:

364

Dividing the number of all-cups combinations by the total number of three-card combinations gives the same probability we calculated earlier:

In [45]:

```
ncr(14, 3) / ncr(n, 3)
```

Out[45]:

0.004784688995215311

The probability of dealing a five-card spread with all major arcana:

In [46]:

```
ncr(22, 5) / ncr(n, 5)
```

Out[46]:

0.0012474012474012475

An empirical trial of same:

In [47]:

```
trials = 100000
matches = 0
for i in tqdm(range(trials)):
    c_suits = [c['suit'] for c in random.sample(cards, 5)]
    if c_suits.count('major') == 5:
        matches += 1
print(matches / trials)
```


0.00133

There are a certain number of three-card spreads:

In [48]:

```
ncr(n, 3)
```

Out[48]:

76076

If the deck had one fewer card, there would be fewer possible spreads. (Equivalently: this is the number of three-card spreads that avoid one particular card.)

In [49]:

```
ncr(n-1, 3)
```

Out[49]:

73150

In [50]:

```
ncr(n-1, 3) / ncr(n, 3)
```

Out[50]:

0.9615384615384616

This ratio is the probability that any given card will *not* show up in a three-card spread. Subtract it from one to get the probability that the opposite occurs, i.e., the probability that any given card *will* show up in a three-card spread:

In [51]:

```
1 - (ncr(n-1, 3) / ncr(n, 3))
```

Out[51]:

0.038461538461538436

An empirical check of same:

In [52]:

```
trials = 100000
star_count = 0
for i in tqdm(range(trials)):
    c_names = [c['name'] for c in random.sample(cards, 3)]
    if 'The Star' in c_names:
        star_count += 1
print(star_count / trials)
```


0.03918

We can generalize this into a function that returns the chance of a particular card showing up in a spread of `r` cards drawn from a deck of `n`:

In [53]:

```
def chance_in_spread(n, r):
    return 1 - (ncr(n - 1, r) / ncr(n, r))
```

The chance of a particular card showing up in a ten-card spread:

In [54]:

```
chance_in_spread(n, 10)
```

Out[54]:

0.1282051282051282

Because two separate spreads are independent events, the chance that a particular card shows up in two three-card spreads in a row is the product of the individual chances:

In [55]:

```
chance_in_spread(n, 3) * chance_in_spread(n, 3)
```

Out[55]:

0.0014792899408284004

And there's about a 1.6% chance that the same card will show up in two 10-card spreads in a row:

In [57]:

```
chance_in_spread(n, 10) * chance_in_spread(n, 10)
```

Out[57]:

0.016436554898093356

Empirical test of same:

In [58]:

```
trials = 100000
tower_count = 0
for i in tqdm(range(trials)):
    c_names1 = [c['name'] for c in random.sample(cards, 10)]
    c_names2 = [c['name'] for c in random.sample(cards, 10)]
    if 'The Tower' in c_names1 and 'The Tower' in c_names2:
        tower_count += 1
print(tower_count / trials)
```


0.01677

Often when we think of random numbers, we think of rolling a die. Each side has the same probability of coming up. If you rolled a six-sided die a million times, for example, you'd expect each side to come up roughly the same number of times ($\frac{1000000}{6}$, or about 166667). The following cell performs this experiment, then counts up which sides of the die came up most frequently:

In [59]:

```
from collections import Counter
rolls = [random.randrange(6)+1 for i in range(1000000)]
Counter(rolls).most_common()
```

Out[59]:

[(5, 166937), (3, 166833), (4, 166828), (2, 166534), (6, 166531), (1, 166337)]

As you can see, even with a million rolls, the numbers don't quite match our estimate, but they're pretty close.
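How close should we expect them to be? The count for each side is a binomial random variable, so its typical fluctuation around the expected value is about $\sqrt{Np(1-p)}$. A quick back-of-the-envelope calculation:

```
from math import sqrt

N = 1000000  # number of rolls
p = 1 / 6    # probability of any one side
print("expected count:", N * p)
print("typical fluctuation:", sqrt(N * p * (1 - p)))
```

That works out to a fluctuation of a few hundred, which matches the counts above: each one lands within a few hundred of 166667.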

It turns out that this is just one kind of randomness. Other phenomena in the world also produce random outcomes, but they don't always look like the outcomes of rolling dice.

The following cell has some code to help display graphs, which I'm going to do fairly frequently in the rest of the notebook.

In [60]:

```
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')
plt.rcParams["figure.figsize"] = (10, 4)
```

The kind of randomness produced by rolling dice is called *uniform randomness*. It's called uniform because no outcome is more likely than any other. The following function produces ten uniformly distributed random numbers between 0 and 1:

In [61]:

```
np.random.uniform(size=(10,))
```

Out[61]:

array([0.80959912, 0.38372388, 0.24769758, 0.17516367, 0.96174321, 0.46559622, 0.76599657, 0.09233533, 0.19830138, 0.54953235])

The following cell plots histograms of uniform samples at increasing sample sizes, from $10^1$ to $10^6$:

In [63]:

```
for i in range(1, 7):
    plt.figure(figsize=(6, 2))
    plt.title("with 10^%d samples" % i)
    plt.hist(np.random.uniform(size=(10**i,)), bins=20)
    plt.show()
```

By the time we've generated $10^6$ samples, the histogram looks flat.
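One way to quantify "looks flat" is to compare the emptiest and fullest bins. Here's a sketch using `np.histogram`, which returns the bin counts without drawing anything:

```
# bin a million uniform samples and inspect the extremes
counts, edges = np.histogram(np.random.uniform(size=(10**6,)), bins=20)
print("smallest bin:", counts.min())
print("largest bin: ", counts.max())
```

With 20 bins, each bin expects 50000 samples, and both extremes should land within a few hundred of that.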

Notably, the *sum* of two dice does *not* follow a uniform distribution. The code in the following cell rolls two dice one thousand times, then counts up the most common numbers that the values of the dice add up to:

In [64]:

```
rolls = [sum([random.randrange(6)+1 for j in range(2)]) for i in range(10000)]
Counter(rolls).most_common()
```

Out[64]:

[(7, 1632), (8, 1406), (6, 1390), (5, 1140), (9, 1068), (10, 874), (4, 871), (3, 561), (11, 487), (12, 290), (2, 281)]

Analogous histograms for the *sum* of two uniform samples:

In [65]:

```
for i in range(1, 6):
    plt.figure(figsize=(6, 2))
    plt.title("with 10^%d samples" % i)
    plt.hist(np.random.uniform(size=(10**i,)) + np.random.uniform(size=(10**i,)),
             bins=20)
    plt.show()
```

This distribution begins to resemble a pyramid, with the most likely values in the center of the range. This should be intuitive to anyone who has played a board game that uses the combined values of two dice as a gameplay mechanic. In games like this, you'll often find that dice rolls of 2 or 12 have special rules attached to them (because they're so rare).
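With two dice, we don't even need a simulation: there are only 36 equally likely ordered outcomes, so we can enumerate them exhaustively and count how many produce each sum:

```
from itertools import product

# tally the sum of every possible (die1, die2) outcome
sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))
for total in sorted(sums):
    print(total, "->", sums[total], "out of 36")
```

A 7 can be made six different ways, but a 2 or a 12 only one way each, which is exactly the pyramid shape in the histograms above.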

We don't have to stop at summing the rolls of two dice. The code in the following cell sums up the values of an arbitrary number of dice with an arbitrary number of sides, and then calculates the most common outcomes for the given number of trials:

In [66]:

```
def get_rolls(sides, dice, trials):
    rolls = [sum(random.randrange(sides) + 1 for j in range(dice)) for i in range(trials)]
    return Counter(rolls).most_common()
```

So, for example, the most common sums of three six-sided dice over a few thousand rolls:

In [67]:

```
get_rolls(6, 3, 10000)
```

Out[67]:

[(10, 1245), (11, 1243), (12, 1172), (9, 1160), (13, 996), (8, 974), (14, 684), (7, 683), (6, 453), (15, 434), (16, 296), (5, 275), (4, 138), (17, 135), (18, 67), (3, 45)]

The following function plots counts like these as a bar graph, with the outcome on the x-axis and the count on the y-axis:

In [68]:

```
def plot_int_counts(counts):
    nums = np.zeros(shape=(max([item[0] for item in counts]) + 1,))
    for k, v in counts:
        nums[k] = v
    plt.bar(range(nums.shape[0]), nums)
    plt.show()
```

The distribution of the sums of three six-sided dice, plotted:

In [69]:

```
plot_int_counts(get_rolls(6, 3, 100000))
```

Plotting with even more dice gives us an even curvier curve:

In [70]:

```
plot_int_counts(get_rolls(6, 6, 100000))
```

The sums of uniform random numbers approximate another distribution, called the *normal* (or Gaussian) distribution. Random numbers with a normal distribution cluster around a particular value (the *mean*, or center, of the distribution), and that cluster has a particular spread (the *standard deviation* of the distribution).
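One quick check of this resemblance: the sum of $k$ uniform random numbers on $[0, 1]$ has mean $\frac{k}{2}$ and variance $\frac{k}{12}$, so a big batch of sampled sums should match those values. A sketch (the sampled numbers will vary slightly from run to run):

```
k = 6
# 100k sums, each of k uniform samples
sums = np.random.uniform(size=(100000, k)).sum(axis=1)
print("sample mean:", sums.mean(), "vs. theory:", k / 2)
print("sample std: ", sums.std(), "vs. theory:", (k / 12) ** 0.5)
```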

There's a function in Python to generate normal random numbers:

In [71]:

```
np.random.normal(0, 1, size=24)
```

Out[71]:

array([-7.03719486e-01, 1.39347461e+00, -1.26143104e-01, 3.70384056e-01, 1.09026272e+00, 1.47959873e-01, 5.16520702e-01, -8.75107472e-01, 8.03322998e-02, -8.44208525e-01, 4.90862107e-01, -5.11388946e-01, 2.78629045e-01, -5.23632508e-01, 1.99569565e-03, 1.45798574e+00, -1.65684094e+00, 8.55570749e-01, -6.13381022e-01, -9.53401332e-02, 9.96937665e-01, 2.08361804e+00, 8.77619973e-01, 3.05505836e-01])

The first parameter is the center (the mean) and the second is the spread (the standard deviation). You can change these:

In [72]:

```
np.random.normal(6, 6, size=24)
```

Out[72]:

array([ 6.88660325, 4.68496737, 5.55270995, 18.851693 , 18.38041328, 2.39389233, 11.68465526, 0.98454816, 3.97634027, 0.39761589, 5.04222476, 9.18881071, 11.66353467, 9.7790319 , 12.72331521, 3.23051402, -0.75138677, 1.46676276, 9.21546184, 14.95504318, 12.95271009, 7.15533676, 8.75825494, 6.17434599])

Uniform randomness also has a characteristic look in two dimensions. Here's a scatter plot of uniformly distributed points:

In [73]:

```
num_points = 10000
plt.figure(figsize=(5,5))
plt.scatter(np.random.uniform(size=(num_points,)), np.random.uniform(size=(num_points,)))
plt.show()
```

Compare with normally distributed points, which cluster around the center:

In [74]:

```
num_points = 10000
plt.figure(figsize=(5,5))
plt.scatter(np.random.normal(size=(num_points,)), np.random.normal(size=(num_points,)))
plt.show()
```

Distributions like these show up in real-world data, too. The following cells load a list of first names and plot the distribution of their lengths:

In [75]:

```
name_data = json.load(open("firstNames.json"))['firstNames']
```

In [76]:

```
name_lengths = [len(name) for name in name_data]
```

In [77]:

```
plot_int_counts(Counter(name_lengths).most_common())
```

The same experiment, with the lengths of the words in Frankenstein:

In [78]:

```
words = open("frankenstein.txt").read().split()
```

In [79]:

```
word_lengths = [len(word) for word in words]
```

In [80]:

```
plot_int_counts(Counter(word_lengths).most_common())
```

Word *frequencies* follow a different pattern: a handful of words are extremely common, and most words are rare. The following cells count each word's occurrences, then plot the counts of the 250 most common words by rank:

In [81]:

```
word_counts = Counter(words)
```

In [82]:

```
word_counts_indexed = [(i, count) for i, (word, count) in enumerate(word_counts.most_common(250))]
```

In [83]:

```
plot_int_counts(word_counts_indexed)
```

Class activities: lengths of names, last digit of date of birth. For example, the length of my own name:

In [84]:

```
len("AllisonParrish")
```

Out[84]:

14

In [85]:

```
name_lengths = [14, 14, 20, 12, 12, 11, 15, 12, 14, 10, 7, 11, 14, 10, 7]
```

In [86]:

```
plot_int_counts(Counter(name_lengths).most_common())
```

In [87]:

```
digits = [8, 1, 8, 7, 3, 8, 4, 9, 4, 5, 0, 7, 4, 8, 8]
```

In [88]:

```
plot_int_counts(Counter(digits).most_common())
```

From Gilovich, Thomas, et al. "The Hot Hand in Basketball: On the Misperception of Random Sequences." Cognitive Psychology, vol. 17, no. 3, 1985, p. 311:

[T]he tendency to perceive a sequence as streak shooting decreases with the probability of alternation. [...] The sequences selected as best examples of chance shooting had probabilities of alternation of 0.7 and 0.8 rather than 0.5. Furthermore, the sequence with the probability of alternation of 0.5 (the proper example of chance shooting) was classified as chance shooting only by 32% of subjects, whereas 62% identified it as an example of streak shooting. Evidently, people tend to perceive chance shooting as streak shooting, and they expect sequences exemplifying chance shooting to contain many more alternations than would actually be produced by a random (chance) process. Thus, people "see" a positive serial correlation in independent sequences, and they fail to detect a negative serial correlation in alternating sequences. Hence, people not only perceive random sequences as positively correlated, they also perceive negatively correlated sequences as random.
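The "probability of alternation" is easy to compute for a simulated sequence of independent shots. In a truly random sequence, consecutive outcomes differ about half the time, no matter how streaky the sequence looks. A minimal sketch, assuming a 50% shooter:

```
# ten thousand independent hit-or-miss shots
shots = [random.randrange(2) for i in range(10000)]
# count how often consecutive shots differ
alternations = sum(1 for a, b in zip(shots, shots[1:]) if a != b)
print(alternations / (len(shots) - 1))
```

The result hovers around 0.5, even though any such sequence will contain plenty of runs that look like "streaks."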