
In [28]:

%matplotlib inline
import pylab as pl

import warnings
warnings.filterwarnings(action = "ignore", category=DeprecationWarning)

pl.xkcd()

Out[28]:

<matplotlib.rc_context at 0x7f751b15c790>

t-test for comparison of sample means

source
steps:
1. get a hunch from plotting
2. use statistics to quantify the evidence backing the hunch

In [3]:

import numpy as np
from scipy import stats

## data for comparing the average heights in two towns
town1_hts = [5, 6, 7, 6, 7.1, 6, 4]
town2_hts = [5.5, 6.5, 7, 6, 7.1, 6]

In [29]:

## plotting - use boxplot to show the difference between two samples
fig, ax = pl.subplots(1, 1)
_ = ax.boxplot([town1_hts, town2_hts])
_ = ax.set_xticklabels(["town1", "town2"])
_ = ax.set_ylim((3.5, 7.5))

**Observations:**

- Most statistical tests (the t-test included) assume normally distributed data, so we need to check whether each sample looks roughly normal.
- Generally, if the boxes of the two distributions overlap, and you haven't taken something on the order of a buttload (metric units) of measurements, you should doubt that the averages really differ, as is the case here.
- Both of the points above are sensitive to outliers, so take any outliers into account before drawing conclusions.
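The "do the boxes overlap, and are there outliers" reading of the plot can also be checked numerically. The following is a minimal sketch, assuming the usual boxplot convention that points more than 1.5 times the interquartile range beyond the box count as outliers; `box_summary` is a hypothetical helper name, not part of any library.

```python
import numpy as np

town1_hts = [5, 6, 7, 6, 7.1, 6, 4]
town2_hts = [5.5, 6.5, 7, 6, 7.1, 6]

def box_summary(values, whisker=1.5):
    """Return the quartiles of a sample and the points outside the whiskers."""
    data = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1
    # Assumed convention: anything beyond whisker * IQR from the box is an outlier.
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    outliers = data[(data < low) | (data > high)]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers.tolist()}

summary1 = box_summary(town1_hts)
summary2 = box_summary(town2_hts)

# Two boxes overlap when each one starts below the point where the other ends.
boxes_overlap = summary1["q1"] <= summary2["q3"] and summary2["q1"] <= summary1["q3"]

print(summary1)
print(summary2)
print("boxes overlap: %s" % boxes_overlap)
```

For these two samples the boxes overlap and no point falls outside the whiskers, which matches what the plot suggests.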

**Run a statistical test**

In statistics, the question is whether the differences we observed are reliable indicators of some trend or just happened by chance.
Statisticians usually answer it by comparing a sample statistic (the effect size, here the difference between the two averages) with its standard error, and then using a model such as a t-test to estimate the probability of seeing a difference this large by pure chance.
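To make "the probability of this happening by pure chance" concrete, here is a rough sketch of a permutation test. Note that this is a different technique from the t-test used in this notebook: instead of a model, it shuffles the town labels many times and counts how often a shuffled difference in means is at least as large as the observed one. The function name, the number of shuffles and the seed are arbitrary choices made for illustration.

```python
import numpy as np

town1_hts = [5, 6, 7, 6, 7.1, 6, 4]
town2_hts = [5.5, 6.5, 7, 6, 7.1, 6]

def permutation_p_value(a, b, n_shuffles=10000, seed=0):
    """Estimate a two-sided p-value for the difference in means by shuffling labels."""
    rng = np.random.RandomState(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_shuffles):
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:len(a)].mean() - shuffled[len(a):].mean())
        # Small tolerance so that ties with the observed difference are counted.
        if diff >= observed - 1e-12:
            hits += 1
    return hits / float(n_shuffles)

p_perm = permutation_p_value(town1_hts, town2_hts)
print("permutation p-value: %.3f" % p_perm)
```

If a large share of the shuffles produce a difference as big as the observed one, the observed difference is not good evidence of a real trend.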
t-tests come in different flavors, depending on three properties of the data:
- Paired vs. unpaired
  1. Paired (also called dependent) datasets: e.g. the same set of items measured twice, usually before and after some change.
  2. Unpaired datasets: e.g. random samples drawn independently from different (or the same) populations.
- Equal vs. unequal size: whether the two datasets contain the same number of measurements.
- Equal vs. unequal variances: whether the two samples under comparison can be assumed to have equal variances.
- In this case the samples are unpaired, and we make no assumption about their sizes or variances being equal, so we run an unpaired, unequal-size, unequal-variance test. That's Welch's t-test.
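As a sketch of what Welch's t-test computes, the statistic can be written out by hand: the difference in means divided by the combined standard error, with the degrees of freedom given by the Welch-Satterthwaite approximation. `welch_t_test` is a hypothetical helper written for this illustration; the result is compared against `stats.ttest_ind(..., equal_var=False)`.

```python
import numpy as np
from scipy import stats

town1_hts = [5, 6, 7, 6, 7.1, 6, 4]
town2_hts = [5.5, 6.5, 7, 6, 7.1, 6]

def welch_t_test(a, b):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Squared standard error of each sample mean: sample variance / sample size.
    se2_a = a.var(ddof=1) / len(a)
    se2_b = b.var(ddof=1) / len(b)
    t_stat = (a.mean() - b.mean()) / np.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    p_value = 2 * stats.t.sf(abs(t_stat), dof)
    return t_stat, dof, p_value

t_manual, dof_manual, p_manual = welch_t_test(town1_hts, town2_hts)
t_scipy, p_scipy = stats.ttest_ind(town1_hts, town2_hts, equal_var=False)

print("manual: t=%.4f, dof=%.2f, p=%.4f" % (t_manual, dof_manual, p_manual))
print("scipy:  t=%.4f, p=%.4f" % (t_scipy, p_scipy))
```

The unequal sample sizes and variances enter only through the two per-sample standard errors, which is why this flavor needs no assumption about either.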

In [32]:

## scipy implements both the paired and the unpaired test:
## scipy.stats.ttest_rel implements the paired test
## scipy.stats.ttest_ind implements the unpaired test
%pdef stats.ttest_rel
%pdef stats.ttest_ind
## equal_var=False selects Welch's t-test; [1] picks the p-value
print("Welch's T-test p-value %s" % stats.ttest_ind(town1_hts, town2_hts, equal_var=False)[1])

stats.ttest_rel(a, b, axis=0)
stats.ttest_ind(a, b, axis=0, equal_var=True)
Welch's T-test p-value 0.347028503558
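The paired flavor is not exercised by the town data, so here is a minimal sketch of `stats.ttest_rel` on made-up numbers. The `before` and `after` arrays are hypothetical measurements invented purely for illustration. The sketch also shows why pairing matters: a paired test is equivalent to a one-sample t-test on the per-item differences.

```python
import numpy as np
from scipy import stats

# Hypothetical data: the same five items measured before and after some change.
before = np.array([72.1, 68.4, 80.0, 75.3, 69.9])
after = np.array([70.8, 68.0, 78.1, 74.9, 68.2])

# Paired test on the two measurements of the same items.
t_rel, p_rel = stats.ttest_rel(before, after)

# Equivalent formulation: test whether the mean per-item difference is zero.
t_one, p_one = stats.ttest_1samp(before - after, 0.0)

print("paired t-test:          t=%.4f, p=%.4f" % (t_rel, p_rel))
print("one-sample on the diff: t=%.4f, p=%.4f" % (t_one, p_one))
```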

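Check the assumptions of the t-test

The big assumption behind most t-tests is that the data come from a normal distribution.
So the thing we should actually do before the t-test is check the normality of the data, for example with the Shapiro-Wilk test, which scipy implements as stats.shapiro. Its null hypothesis is that the sample is normally distributed, so a small p-value is evidence against normality.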

In [34]:

%pdef stats.shapiro
print('Town1 Shapiro-Wilks p-value %s' % stats.shapiro(town1_hts)[1])
print('Town2 Shapiro-Wilks p-value %s' % stats.shapiro(town2_hts)[1])

stats.shapiro(x, a=None, reta=False)
Town1 Shapiro-Wilks p-value 0.380458295345
Town2 Shapiro-Wilks p-value 0.562481462955
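One way to read these p-values is to compare them against a significance level. The sketch below assumes the conventional threshold of 0.05, which is a choice rather than a rule, and `looks_normal` is a hypothetical helper name. Keep in mind that with only six or seven points per town the test has little power, so a large p-value means "no evidence against normality", not proof of it.

```python
from scipy import stats

town1_hts = [5, 6, 7, 6, 7.1, 6, 4]
town2_hts = [5.5, 6.5, 7, 6, 7.1, 6]

def looks_normal(values, alpha=0.05):
    """Return (p_value, verdict); verdict is False when Shapiro-Wilk rejects normality."""
    _, p_value = stats.shapiro(values)
    return p_value, bool(p_value >= alpha)

for name, sample in [("town1", town1_hts), ("town2", town2_hts)]:
    p_value, verdict = looks_normal(sample)
    print("%s: p=%.3f, consistent with normality: %s" % (name, p_value, verdict))
```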

But what if the data sets are NOT normally distributed?

Simple answer: it often does not matter much. t-tests are fairly robust to moderate violations of the normality assumption.
Less simple answer: use a nonparametric equivalent that makes no normality assumption. But be aware that nonparametric methods generally need more data than their parametric counterparts to detect the same effect.
The nonparametric counterpart of the unpaired t-test is the Mann-Whitney U test.

In [35]:

%pdef stats.mannwhitneyu
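## note: newer SciPy versions default to a two-sided alternative, so they may report a different p-value than the one shown below
print('Mann-Whitney U p-value %s' % stats.mannwhitneyu(town1_hts, town2_hts)[1])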

stats.mannwhitneyu(x, y, use_continuity=True)
Mann-Whitney U p-value 0.253597522173
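To show why the Mann-Whitney U test needs no normality assumption, here is a sketch that computes the U statistic by hand. It only uses the ranks of the pooled measurements, never the values themselves. `mann_whitney_u` is a hypothetical helper written for this illustration; depending on the SciPy version, `stats.mannwhitneyu` reports either the U value of the first sample or the smaller of the two, so the comparison below accepts both.

```python
import numpy as np
from scipy import stats

town1_hts = [5, 6, 7, 6, 7.1, 6, 4]
town2_hts = [5.5, 6.5, 7, 6, 7.1, 6]

def mann_whitney_u(a, b):
    """Return (U_a, U_b) computed from the ranks of the pooled data."""
    n_a, n_b = len(a), len(b)
    # Rank all measurements together; tied values share their average rank.
    ranks = stats.rankdata(np.concatenate([a, b]))
    rank_sum_a = ranks[:n_a].sum()
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a
    return u_a, u_b

u1, u2 = mann_whitney_u(town1_hts, town2_hts)
u_scipy = stats.mannwhitneyu(town1_hts, town2_hts)[0]

print("U for town1: %s, U for town2: %s" % (u1, u2))
print("U reported by scipy: %s" % u_scipy)
```

The two U values always add up to the product of the sample sizes, so either one carries the same information.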
