Drawing Statistical Conclusions¶

Download¶

Case Study A: Motivation for creative writers¶

Creative writing students randomly assigned to intrinsic vs. extrinsic priming questionnaires.

In [1]:

require(ggplot2)
creativity = read.csv('https://raw.githubusercontent.com/StanfordStatistics/stats191-data/main/Sleuth3/creativity.csv', header=TRUE)
salaries = read.csv('https://raw.githubusercontent.com/StanfordStatistics/stats191-data/main/Sleuth3/salaries.csv', header=TRUE)
set.seed(0)

Loading required package: ggplot2

In [2]:

head(creativity)

A data.frame: 6 × 2
	Score	Treatment
	<dbl>	<chr>
1	5.0	Extrinsic
2	5.4	Extrinsic
3	6.1	Extrinsic
4	10.9	Extrinsic
5	11.8	Extrinsic
6	12.0	Extrinsic

Summarizing the groups¶

Extrinsic Group¶

In [3]:

extrinsic = creativity$Score[creativity$Treatment == 'Extrinsic']
summary(extrinsic)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   5.00   12.15   17.20   15.74   18.95   24.00

Intrinsic Group¶

In [4]:

intrinsic = creativity$Score[creativity$Treatment == 'Intrinsic']
summary(intrinsic)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  12.00   17.43   20.40   19.88   22.30   29.70

Histogram of `Score` stratified by `Sex`¶

In [5]:

# this plot is for visualization only
# students not expected to reproduce
fig <- (ggplot(creativity, aes(x=Score, fill=Treatment)) +
        geom_histogram(aes(y=after_stat(density)),
	color="#e9ecef",
	alpha=0.6,
	position='identity',
	bins=10) +
        labs(fill=""))
fig

Case Study B: Difference in salaries between male and female employees¶

Salaries from Harris Trust and Bank over years 1969-1977

In [6]:

head(salaries)

A data.frame: 6 × 2
	Salary	Sex
	<int>	<chr>
1	3900	Female
2	4020	Female
3	4290	Female
4	4380	Female
5	4380	Female
6	4380	Female

Females¶

In [7]:

female = salaries$Salary[salaries$Sex == 'Female']
summary(female)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   3900    4800    5220    5139    5400    6300

Males¶

In [8]:

male = salaries$Salary[salaries$Sex == 'Male']
summary(male)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   4620    5400    6000    5957    6075    8100

Histogram of `Salary` stratified by `Sex`¶

In [9]:

# this plot is for visualization only
# students not expected to reproduce

fig <- (ggplot(salaries, aes(x=Salary, fill=Sex)) +
        geom_histogram(aes(y=after_stat(density)),
                       color="#e9ecef",
                       alpha=0.6,
                       position='identity',
                       bins=10) +
        labs(fill=""))
fig

Boxplot of `Salary` stratified by `Sex`¶

In [10]:

boxplot(Salary ~ Sex, data=salaries, col='orange')

Key differences between the studies¶

Creative writing study was a randomized experiment.
Salary dataset was an observational study.

Implications¶

Differences in strength of conclusions: randomized experiments like creativity can admit causal conclusions
Generalizability: what population are the data from?
1. If we consider Harris a typical bank, then salaries represents a sample of starting salaries.

Making statistical inferences¶

{width=600 fig-align="center"}

What sort of conclusions are we entitled to make?

Modelling uncertainty in `creativity`¶

The observed difference (i.e. treatment_effect) is not 0. Is there a real difference?
We need a (statistical) model to draw statistical inferences!

Mental model: the world before randomization¶

{width=600 fig-align="center"}

Potential outcomes before randomization

Mental model: the null hypothesis¶

{width=400 fig-align="center"}

$H_0$: green outcome identical to red

Mental model: the world after randomization¶

{width=800 fig-align="center"}

Observed outcomes after randomization

Computing difference via `t.test`¶

In [11]:

estimates = t.test(Score ~ Treatment, data=creativity)$estimate
estimates

mean in group Extrinsic: 15.7391304347826
mean in group Intrinsic: 19.8833333333333

Effect¶

In [12]:

treatment_effect = treatment_effect=estimates[2] - estimates[1]
treatment_effect

mean in group Intrinsic: 4.14420289855072

Null hypothesis: no difference in `Score` between the groups¶

In [13]:

null_treatment = sample(creativity$Treatment, 47, replace=FALSE)
null_data = data.frame(Score=creativity$Score,
                       Treatment=null_treatment)
null_estimates = t.test(Score ~ Treatment, data=null_data)$estimate
null_estimates

mean in group Extrinsic: 17.3782608695652
mean in group Intrinsic: 18.3125

In [14]:

null_treatment_effect = null_estimates[2] - null_estimates[1]
null_treatment_effect

mean in group Intrinsic: 0.934239130434783

Repeated 10000 times¶

In [15]:

# this code/plot is for visualization only
# students not expected to reproduce

estimates = t.test(Score ~ Treatment, data=creativity)$estimate
treatment_effect = treatment_effect=estimates[2] - estimates[1]

null_treatment_effect = rep(NA, 10000)
for (i in 1:10000) {
    null_treatment = sample(creativity$Treatment, 47, replace=FALSE)
    null_data = data.frame(Score=creativity$Score,
                           Treatment=null_treatment)
    null_estimates = t.test(Score ~ Treatment, data=null_data)$estimate
    null_treatment_effect[i] = null_estimates[2] - null_estimates[1]
}

treatment_effect = data.frame(treatment_effect=treatment_effect)
fig <- (ggplot(data.frame(null_treatment_effect),
               aes(x=null_treatment_effect)) +
    geom_histogram(aes(y=after_stat(density)),
                   color="#e9ecef",
                   alpha=0.6, bins=30) +
    geom_vline(aes(xintercept=treatment_effect), treatment_effect,
               color='red', linewidth=2) +
    geom_vline(aes(xintercept=-treatment_effect), treatment_effect,
               color='red', alpha=0.5, linewidth=1) +
    geom_density() +  
    labs(fill=""))
fig
treatment_effect = treatment_effect[1,]

In [16]:

length(null_treatment_effect)
p_value = mean(abs(null_treatment_effect) > treatment_effect)
p_value

10000

0.006

Modelling uncertainty in `salaries`¶

The difference is not 0. Is the difference real?
We need a model to draw statistical inferences!

Mental model: `Male` and `Female` salaries¶

{width=600 fig-align="center"}

There are two populations of salaries

Mental model: `Male` and `Female` salaries¶

{width=600 fig-align="center"}

$H_0$: distribution of orange box identical to

purple

Computing difference via `t.test`¶

In [17]:

sex_estimates = t.test(Salary ~ Sex, data=salaries)$estimate
sex_effect = sex_estimates[2] - sex_estimates[1]
sex_effect

mean in group Male: 818.022540983607

Repeated 10000 times¶

In [18]:

# this code/plot is for visualization only
# students not expected to reproduce

sex_estimates = t.test(Salary ~ Sex, data=salaries)$estimate
sex_effect = sex_estimates[2] - sex_estimates[1]

null_sex_effect = rep(NA, 10000)
for (i in 1:10000) {
    null_sex = sample(salaries$Sex, length(salaries$Sex), replace=FALSE)
    null_data = data.frame(Salary=salaries$Salary,
                           Sex=null_sex)
    null_estimates = t.test(Salary ~ Sex, data=null_data)$estimate
    null_sex_effect[i] = null_estimates[2] - null_estimates[1]
}

sex_effect = data.frame(sex_effect=sex_effect)
fig <- (ggplot(data.frame(null_sex_effect),
               aes(x=null_sex_effect)) +
        geom_histogram(aes(y=after_stat(density)),
                       color="#e9ecef",
	               alpha=0.6, bins=30) +
    geom_vline(aes(xintercept=sex_effect), sex_effect,
               color='red', linewidth=2) +
    geom_vline(aes(xintercept=-sex_effect), sex_effect,
               color='red', alpha=0.5, linewidth=1) +
    geom_density() +  
    labs(fill=""))
fig
sex_effect = sex_effect[1,]

In [19]:

length(null_sex_effect)
p_value = mean(abs(null_sex_effect) > sex_effect)
p_value

10000

0

Other issues¶

We used the same method even for these different studies... does this make sense?
Terminology:
1. Parameter: a property of the probability model (often written $\theta$)
2. Estimate: a function of the sample data (often written $\hat{\theta}$)
3. Goal of statistical inference is to learn about the parameter $\theta$ from the estimate $\hat{\theta}$

Other issues¶

Experimental design:
1. Randomization: individuals were randomly assigned Treatment in creative study
2. Simple random sample: a way of sampling $n$ from a population such that every $n$ points are equally likely.
3. Other sampling mechanisms: systematic sampling, cluster sampling.

Drawing Statistical Conclusions¶

Download¶

Case Study A: Motivation for creative writers¶

Summarizing the groups¶

Extrinsic Group¶

Intrinsic Group¶

Histogram of Score stratified by Sex¶

Case Study B: Difference in salaries between male and female employees¶

Females¶

Males¶

Histogram of Salary stratified by Sex¶

Boxplot of Salary stratified by Sex¶

Key differences between the studies¶

Implications¶

Making statistical inferences¶

Modelling uncertainty in creativity¶

Mental model: the world before randomization¶

Mental model: the null hypothesis¶

Mental model: the world after randomization¶

Computing difference via t.test¶

Effect¶

Null hypothesis: no difference in Score between the groups¶

Repeated 10000 times¶

Modelling uncertainty in salaries¶

Mental model: Male and Female salaries¶

Mental model: Male and Female salaries¶

Computing difference via t.test¶

Repeated 10000 times¶

Other issues¶

Other issues¶

Histogram of `Score` stratified by `Sex`¶

Histogram of `Salary` stratified by `Sex`¶

Boxplot of `Salary` stratified by `Sex`¶

Modelling uncertainty in `creativity`¶

Computing difference via `t.test`¶

Null hypothesis: no difference in `Score` between the groups¶

Modelling uncertainty in `salaries`¶

Mental model: `Male` and `Female` salaries¶

Mental model: `Male` and `Female` salaries¶

Computing difference via `t.test`¶