Easy way to calculate conditional probability with Python.


Conditional probability is the likelihood of an event or outcome occurring, given that another event or outcome has already occurred.
Conditional probability has applications in data science, insurance, politics, and many fields of mathematics.

Conditional probability can be contrasted with unconditional probability. Unconditional probability refers to the likelihood that an event will take place irrespective of whether any other events have taken place or any other conditions are present.

Probabilities are classified as conditional, marginal, or joint:
Conditional probability: p(A|B) is the probability of event A occurring, given that event B occurs.
Marginal probability: p(A) is the probability of an event irrespective of the outcome of another variable.
Joint probability: p(A and B), also written p(A ∩ B), is the probability of event A and event B both occurring, that is, the probability of the intersection of the two events.
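
As a small illustration of how the three kinds of probability relate, the sketch below reads them off a joint probability table for two binary events. The table values are made up for the example and are not taken from the time series used later.

```python
import numpy as np

# Hypothetical joint distribution of two binary events A and B.
# Rows: A = 0, 1; columns: B = 0, 1. The entries are made up and sum to 1.
joint = np.array([[0.50, 0.20],
                  [0.10, 0.20]])

# Joint probability p(A and B): the cell where both events occur.
p_a_and_b = joint[1, 1]

# Marginal probabilities: sum the joint table over the other variable.
p_a = joint[1, :].sum()   # p(A), regardless of B
p_b = joint[:, 1].sum()   # p(B), regardless of A

# Conditional probability: the joint divided by the marginal of the condition.
p_a_given_b = p_a_and_b / p_b

print(p_a_and_b, p_a, p_b, p_a_given_b)
```

With these numbers, p(A) is about 0.3 while p(A|B) is 0.5, so knowing that B occurred makes A more likely.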



Generating two time series:



import matplotlib.pyplot as plt
import numpy as np

N = 10000
spikeDur  = 10  # a.u. but must be an even number
spikeNumA = .01 # in proportion of total number of points
spikeNumB = .05 # in proportion of total number of points

# initialize to zeros
spike_tsA = np.zeros(N)
spike_tsB = np.zeros(N)


### populate time series A
spiketimesA = np.random.randint(0,N,int(N*spikeNumA))

# flesh out spikes (loop per spike)
for spikei in range(len(spiketimesA)):
    
    # find boundaries
    bnd_pre = int( max(0,spiketimesA[spikei]-spikeDur/2) )
    bnd_pst = int( min(N,spiketimesA[spikei]+spikeDur/2) )
    
    # fill in with ones
    spike_tsA[bnd_pre:bnd_pst] = 1


### repeat for time series B
spiketimesB = np.random.randint(0,N,int(N*spikeNumB))
spiketimesB[:len(spiketimesA)] = spiketimesA # induce strong conditional probability

# flesh out spikes (loop per spike)
for spikei in range(len(spiketimesB)):
    
    # find boundaries
    bnd_pre = int( max(0,spiketimesB[spikei]-spikeDur/2) )
    bnd_pst = int( min(N,spiketimesB[spikei]+spikeDur/2) )
    
    # fill in with ones
    spike_tsB[bnd_pre:bnd_pst] = 1

plt.plot(range(N),spike_tsA, range(N),spike_tsB)
plt.ylim([0,1.2])
# plt.xlim([2000,2500])
plt.show()

Figure: the two time series used to compute conditional probability
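
The two loops above repeat the same steps for series A and series B. One possible way to avoid the duplication is a small helper function. This is only a sketch: the name `make_spike_series`, the snake_case variable names, and the seeded generator `rng` are assumptions introduced here, not part of the original code.

```python
import numpy as np

def make_spike_series(spike_times, n_points, spike_dur):
    """Return a 0/1 array with a window of ones centered on each spike time."""
    ts = np.zeros(n_points)
    half = spike_dur // 2  # spike_dur is assumed to be even, as in the original
    for t in spike_times:
        start = max(0, t - half)         # clip the window at the left edge
        stop = min(n_points, t + half)   # clip the window at the right edge
        ts[start:stop] = 1
    return ts

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
N = 10000
spike_dur = 10

times_a = rng.integers(0, N, int(N * 0.01))
times_b = rng.integers(0, N, int(N * 0.05))
times_b[:len(times_a)] = times_a  # copy A's spikes into B, as in the original code

ts_a = make_spike_series(times_a, N, spike_dur)
ts_b = make_spike_series(times_b, N, spike_dur)
```

Because every spike time of A is also a spike time of B and both use the same window width, every point where `ts_a` is 1 is also 1 in `ts_b`.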

Compute series probabilities and intersection:


Conditional probability is often stated as the probability of B given A, written P(B|A): the probability of B occurring, given that A has occurred.


P(B|A) = P(A∩B) / P(A)

Where:

P = Probability
A = Event A
B = Event B
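
To make the formula concrete, here is a minimal worked example with a fair six-sided die. The events chosen (A = "the roll is even", B = "the roll is greater than 3") and the helper `prob` are illustrative assumptions, unrelated to the time series in this post.

```python
from fractions import Fraction

outcomes = range(1, 7)  # faces of a fair six-sided die

A = {x for x in outcomes if x % 2 == 0}   # event A: roll is even -> {2, 4, 6}
B = {x for x in outcomes if x > 3}        # event B: roll is greater than 3 -> {4, 5, 6}

def prob(event):
    """Probability of an event when all six faces are equally likely."""
    return Fraction(len(event), 6)

p_a = prob(A)            # 3 of 6 faces -> 1/2
p_a_and_b = prob(A & B)  # intersection {4, 6} -> 1/3

# P(B|A) = P(A∩B) / P(A)
p_b_given_a = p_a_and_b / p_a

print(p_b_given_a)  # 2/3
```

Of the three even faces, two are greater than 3, which matches the result of 2/3.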



# probabilities
probA = np.mean(spike_tsA)
probB = np.mean(spike_tsB)

# joint probability
probAB = np.mean(spike_tsA+spike_tsB==2)

print(probA,probB,probAB)

OUT (exact values vary from run to run because no random seed is set): 0.0927 0.3958 0.0927


Compute the conditional probabilities:



# p(A|B)
pAgivenB = probAB/probB

# p(B|A)
pBgivenA = probAB/probA

# print a little report
print('P(A)   = %g'%probA)
print('P(A|B) = %g'%pAgivenB)
print('P(B)   = %g'%probB)
print('P(B|A) = %g'%pBgivenA)

OUT:
P(A) = 0.0927
P(A|B) = 0.234209
P(B) = 0.3958
P(B|A) = 1
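
P(B|A) comes out as 1 because every spike time of A was copied into B, so B is active whenever A is. P(A|B) is about 0.234, well above P(A) at about 0.0927, which means that knowing B is active makes A more likely. The same calculation can be wrapped in a small function, sketched below. The name `conditional_prob` and the tiny hand-made arrays are assumptions for illustration, and the function returns NaN when the conditioning event never occurs, which is one possible convention.

```python
import numpy as np

def conditional_prob(event, given):
    """Estimate P(event | given) from two equal-length 0/1 arrays."""
    event = np.asarray(event, dtype=bool)
    given = np.asarray(given, dtype=bool)
    p_given = given.mean()
    if p_given == 0:
        return float("nan")  # the condition never occurs, so the ratio is undefined
    return (event & given).mean() / p_given

# Tiny hand-made series (hypothetical data): every A spike is also a B spike.
a = np.array([0, 1, 1, 0, 0, 0, 0, 0])
b = np.array([0, 1, 1, 1, 1, 0, 0, 0])

p_a_given_b = conditional_prob(a, b)   # 2 of B's 4 active points are also A
p_b_given_a = conditional_prob(b, a)   # all of A's active points are also B

print(p_a_given_b, p_b_given_a)
```

Applied to the series from this post, `conditional_prob(spike_tsA, spike_tsB)` would play the role of `pAgivenB`.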



