# An easy way to demonstrate the central limit theorem with Python

The Central Limit Theorem (CLT) states that the sampling distribution of the sample mean approaches a normal distribution (the "bell curve") as the sample size gets larger, regardless of the shape of the population distribution, provided the population has finite variance.
In other words, the larger the samples you take, the more the histogram of their means will look like a normal distribution.
A sample size of 30 or more is often considered sufficient for the CLT to hold, although strongly skewed or heavy-tailed populations may need more. A sufficiently large sample also estimates the characteristics of the population more accurately.
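
The effect of sample size can be checked numerically. The following is a minimal sketch, not part of the original walkthrough: it assumes an exponential population (chosen only because it is clearly skewed), a fixed random seed, and a hypothetical helper `sample_means`. It compares the observed spread of the sample means with the value σ/√n that the theory predicts.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption, for reproducibility)

# Assumed skewed population: exponential with mean 1 and standard deviation 1
population = rng.exponential(scale=1.0, size=200_000)

def sample_means(population, sample_size, n_experiments, rng):
    """Draw n_experiments random samples (with replacement) and return their means."""
    idx = rng.integers(0, len(population), size=(n_experiments, sample_size))
    return population[idx].mean(axis=1)

for n in (5, 30, 200):
    means = sample_means(population, n, 5000, rng)
    predicted = population.std() / np.sqrt(n)  # sigma / sqrt(n)
    print(f"n={n:4d}  observed SD of means={means.std():.4f}  "
          f"sigma/sqrt(n)={predicted:.4f}")
```

The two columns should agree closely, and both shrink as `n` grows, which is why larger samples give a tighter estimate of the population mean.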

## Creating data with a non-Gaussian (skewed) distribution:

``````
import matplotlib.pyplot as plt
import numpy as np

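# data: squared standard-normal values (strongly right-skewed, not Gaussian)
N = 1000000
data = np.random.randn(N)**2
# alternative data: a sine wave, also non-Gaussian
# data = np.sin(np.linspace(0,10*np.pi,N))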

# show the distribution
plt.plot(data,'.')
plt.show()

plt.hist(data,40)
plt.show()
``````

## Distribution of sample means:

``````
# repeatedly draw samples and compute the mean of each

samplesize   = 30
numberOfExps = 500
samplemeans  = np.zeros(numberOfExps)

for expi in range(numberOfExps):
    # get a random sample (indices drawn with replacement) and compute its mean
    sampleidx = np.random.randint(0,N,samplesize)
    samplemeans[expi] = np.mean(data[ sampleidx ])

# and show its distribution
plt.hist(samplemeans,30)
plt.xlabel('Mean estimate')
plt.ylabel('Count')
plt.show()
``````
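
To see how the histogram changes with the sample size, the experiment can be repeated for several sizes and summarized with a single number. This is a rough sketch under assumptions that are not in the original: a fixed seed, 20,000 repetitions per size, and two hypothetical helpers, `skewness` and `draw_sample_means`. Skewness is 0 for a symmetric distribution such as the normal, so it should move toward 0 as the sample size grows.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed (an assumption, for reproducibility)

N = 1_000_000
data = rng.standard_normal(N) ** 2  # same kind of skewed population as above

def skewness(values):
    """Third standardized moment; 0 for a symmetric distribution."""
    centered = values - values.mean()
    return np.mean(centered**3) / np.mean(centered**2) ** 1.5

def draw_sample_means(data, samplesize, number_of_exps, rng):
    """Means of number_of_exps random samples, each of size samplesize."""
    idx = rng.integers(0, len(data), size=(number_of_exps, samplesize))
    return data[idx].mean(axis=1)

skews = {}
for samplesize in (2, 30, 300):
    means = draw_sample_means(data, samplesize, 20_000, rng)
    skews[samplesize] = skewness(means)
    print(f"samplesize={samplesize:4d}  skewness of sample means={skews[samplesize]:.3f}")
```

With a sample size of 30 the means are still visibly right-skewed for this population, which illustrates why the "30 or more" rule is only a rule of thumb.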

## Mixing two non-Gaussian datasets to get a more Gaussian combined signal:

IMPORTANT: the two datasets should be on comparable scales; otherwise the one with the larger amplitude dominates the sum.

``````
# create two datasets with non-Gaussian distributions
x = np.linspace(0,6*np.pi,10001)
s = np.sin(x)
u = 2*np.random.rand(len(x))-1

fig,ax = plt.subplots(2,3,figsize=(10,6))
ax[0,0].plot(x,s,'b')
ax[0,0].set_title('Signal')

y,xx = np.histogram(s,200)
ax[1,0].plot(y,'b')
ax[1,0].set_title('Distribution')

ax[0,1].plot(x,u,'m')
ax[0,1].set_title('Signal')

y,xx = np.histogram(u,200)
ax[1,1].plot(y,'m')
ax[1,1].set_title('Distribution')

ax[0,2].plot(x,s+u,'k')
ax[0,2].set_title('Combined signal')

y,xx = np.histogram(s+u,200)
ax[1,2].plot(y,'k')
ax[1,2].set_title('Combined distribution')

plt.show()
``````
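
The scaling requirement can be illustrated with a short sketch. It assumes that "properly scaled" means comparable variance, uses z-scoring as one possible way to get there, and introduces two hypothetical helpers, `zscore` and `excess_kurtosis`. Excess kurtosis is 0 for a Gaussian, so values closer to 0 indicate a more Gaussian-looking distribution (by this one measure only).

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed (an assumption, for reproducibility)

# the same two non-Gaussian signals as above
x = np.linspace(0, 6 * np.pi, 10001)
s = np.sin(x)
u = 2 * rng.random(len(x)) - 1

def zscore(values):
    """Rescale to zero mean and unit standard deviation."""
    return (values - values.mean()) / values.std()

def excess_kurtosis(values):
    """Fourth standardized moment minus 3; 0 for a Gaussian."""
    z = zscore(values)
    return np.mean(z**4) - 3.0

balanced = zscore(s) + zscore(u)  # both signals contribute equally
unbalanced = 100 * s + u          # the sine wave dominates the sum

print(f"sine alone:      {excess_kurtosis(s):.3f}")
print(f"uniform alone:   {excess_kurtosis(u):.3f}")
print(f"balanced sum:    {excess_kurtosis(balanced):.3f}")
print(f"unbalanced sum:  {excess_kurtosis(unbalanced):.3f}")
```

The balanced sum lands closer to 0 than either input, while the unbalanced sum stays almost identical to the sine wave alone. Summing only two signals does not produce a truly Gaussian result; adding more independent, comparably scaled signals brings it closer.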