An easy way to perform a one-sample t-test with Python


Sometimes you want to know how your sample mean compares to a given population mean. In that case you need a one-sample t-test, a statistical hypothesis test used to determine whether an unknown population mean differs from a specific value. You usually run a one-sample t-test when you don't know the population standard deviation or when you have a small sample size.

For this test, there should be no significant outliers in the data, and the dependent variable should be approximately normally distributed. The observations must also be independent (i.e., not correlated), which means that there is no relationship between them.
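Before running the test, it can help to take a quick look at these assumptions. The snippet below is only a rough sketch, not part of the original recipe: it uses the Shapiro-Wilk test (stats.shapiro) as one possible normality check and a simple |z| > 3 rule as one possible outlier screen. The sample, the seed, and the threshold of 3 are assumptions chosen for illustration.

```python
import numpy as np
import scipy.stats as stats

# illustrative sample (assumption: seed and parameters are arbitrary)
rng = np.random.default_rng(0)
sample = rng.standard_normal(20) + 0.5

# Shapiro-Wilk test: H0 is that the sample comes from a normal distribution,
# so a small p-value suggests a departure from normality
w_stat, p_normal = stats.shapiro(sample)

# simple outlier screen: count points more than 3 sample standard
# deviations away from the sample mean (the cutoff 3 is a convention)
z_scores = (sample - sample.mean()) / sample.std(ddof=1)
n_outliers = int(np.sum(np.abs(z_scores) > 3))

print(f"Shapiro-Wilk W = {w_stat:.3f}, p = {p_normal:.3f}")
print(f"Points with |z| > 3: {n_outliers}")
```

With small samples these checks have little power, so treat them as a sanity check rather than proof that the assumptions hold.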



Generate the data for a one-sample t-test:



import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats

# parameters
N = 20  # sample size
popMu = .5 # true population mean
data  = np.random.randn(N) + popMu

# let's see what the data look like
plt.plot(data,'ko',markerfacecolor='w',markersize=10)
plt.xlabel('Data index')
plt.ylabel('Data value')
plt.show()


Manual t-test, the long way:



# the null-hypothesis value
H0val = 0

# compute the t-value
t_num = np.mean(data) - H0val
t_den = np.std(data, ddof=1) / np.sqrt(N)
tval = t_num / t_den

# degrees of freedom
df = N-1

# p-value (one-tailed: area of the t distribution beyond |t|;
# multiply by 2 to get the two-tailed p-value)
pval = 1-stats.t.cdf(abs(tval),df)

# show the H0 parameter distribution and observed t-value
x = np.linspace(-4,4,1001)
tdist = stats.t.pdf(x,df) * np.mean(np.diff(x))

plt.plot(x,tdist,linewidth=2)
plt.plot([tval,tval],[0,max(tdist)],'r--')
plt.legend(('H_0 distribution','Observed t-value'))
plt.xlabel('t-value')
plt.ylabel('pdf(t)')
plt.title('t(%g) = %g, p=%g'%(df,tval,pval))
plt.show()

Manual t-test results


T-test using the SciPy function, the best way:


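IMPORTANT: note that stats.ttest_1samp returns the p-value for a two-tailed test by default, whereas the manual method above computes a one-tailed p-value. The manual p-value is therefore half of the one reported by the function.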



t,p = stats.ttest_1samp(data,H0val)

print(t,p)

OUT: 1.334861245367219 0.19769758234172718
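To see how the two approaches line up, here is a small self-contained check. It is a sketch under assumptions: the data are regenerated with an arbitrary seed, so the numbers will differ from the output above. It doubles the manual one-tailed p-value and compares it with what stats.ttest_1samp returns.

```python
import numpy as np
import scipy.stats as stats

# regenerate illustrative data (assumption: the seed is arbitrary)
rng = np.random.default_rng(1)
N = 20
data = rng.standard_normal(N) + 0.5
H0val = 0

# manual t-value, computed the same way as in the long version
tval = (np.mean(data) - H0val) / (np.std(data, ddof=1) / np.sqrt(N))
df = N - 1

# stats.t.sf is the survival function, i.e. 1 - cdf
p_one_tailed = stats.t.sf(abs(tval), df)
p_two_tailed = 2 * p_one_tailed

# library version (two-tailed by default)
t, p = stats.ttest_1samp(data, H0val)

print("t-values match:", np.isclose(t, tval))
print("two-tailed p-values match:", np.isclose(p, p_two_tailed))
```

If you need a one-tailed test directly from the function, SciPy 1.6 and later accept an alternative argument in ttest_1samp ('two-sided', 'less', or 'greater'); check the documentation for the version you have installed.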


