
Bayes' Theorem

Description

Bayes' Theorem gives the probability of an event conditioned on prior knowledge of related conditions. It tells you how to revise an initial probability (the prior) into an updated one (the posterior) as new evidence becomes available.
Formula:
$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$
Where:
P(A∣B): Posterior (probability of A given B)
P(B∣A): Likelihood
P(A): Prior probability of A
P(B): Marginal probability of B
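
As a minimal sketch of the formula above, the helper below computes the posterior from a prior and two likelihoods. It expands the marginal with the law of total probability, P(B) = P(B∣A)·P(A) + P(B∣¬A)·P(¬A). The function name `bayes_posterior` and the input values are illustrative assumptions, not part of the original material.

```python
def bayes_posterior(prior, likelihood, likelihood_given_not):
    """Return P(A|B) given P(A), P(B|A) and P(B|not A)."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability in [0, 1]")
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    marginal = likelihood * prior + likelihood_given_not * (1.0 - prior)
    if marginal == 0.0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return likelihood * prior / marginal


# Hypothetical inputs: P(A) = 0.3, P(B|A) = 0.9, P(B|not A) = 0.2
posterior = bayes_posterior(0.3, 0.9, 0.2)
print(f"P(A|B) = {posterior:.3f}")
```

With these inputs the evidence B raises the probability of A from the prior of 0.3 to roughly 0.66, because B is much more likely when A holds than when it does not.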

Prerequisites

Basic probability
Conditional probability
Concept of prior and posterior
Python basics (for implementation)

Examples

The two Python examples below apply Bayes' Theorem: first directly, with fixed probabilities, and then through a Naive Bayes classifier.


# Basic Example (Email Spam Detection)
# P(Spam) = 0.2, P(Having 'buy now' | Spam) = 0.8, P(Having 'buy now') = 0.32

P_spam = 0.2
P_word_given_spam = 0.8
P_word = 0.32

# Bayes' Theorem: P(Spam | word) = P(word | Spam) * P(Spam) / P(word)
P_spam_given_word = (P_word_given_spam * P_spam) / P_word
print(f"Probability the email is spam: {P_spam_given_word:.2f}")

# Advanced Example (Using Scikit-learn for Naive Bayes Classifier)
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

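# Load the Iris dataset and hold out 30% of it for testing
# (random_state is fixed so the reported accuracy is reproducible)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)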

# Create model
model = GaussianNB()
model.fit(X_train, y_train)

# Predict
y_pred = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred) * 100:.2f}%")
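
The basic example takes P('buy now') = 0.32 as given. As a consistency check, the sketch below rebuilds that marginal with the law of total probability. The value P('buy now' | not spam) is not stated in the original; here it is derived as the value implied by the other three numbers, and the extra variable names are illustrative.

```python
P_spam = 0.2
P_word_given_spam = 0.8
P_word = 0.32

# Solve P(word) = P(word|spam)P(spam) + P(word|ham)P(ham) for P(word|ham)
P_ham = 1 - P_spam
P_word_given_ham = (P_word - P_word_given_spam * P_spam) / P_ham

# Rebuild the marginal and the posterior from the decomposed form
P_word_rebuilt = P_word_given_spam * P_spam + P_word_given_ham * P_ham
P_spam_given_word = (P_word_given_spam * P_spam) / P_word_rebuilt

print(f"P(word | not spam) = {P_word_given_ham:.2f}")
print(f"P(word) rebuilt    = {P_word_rebuilt:.2f}")
print(f"P(spam | word)     = {P_spam_given_word:.2f}")
```

Under these numbers the phrase appears in about 20% of non-spam emails, and seeing it moves the spam probability from 0.2 to 0.5.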

Real-World Applications

Email Filtering

Classifying spam emails using word likelihoods

Healthcare

Diagnosing diseases from symptoms (medical tests)

Finance

Credit scoring and loan risk evaluation
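
To make the healthcare case concrete, here is a small sketch of a diagnostic test evaluated with Bayes' Theorem. All three input values (prevalence, sensitivity, false-positive rate) are hypothetical numbers chosen for illustration and do not come from any real test.

```python
prevalence = 0.01      # P(disease), hypothetical
sensitivity = 0.95     # P(positive | disease), hypothetical
false_positive = 0.05  # P(positive | no disease), hypothetical

# Marginal probability of a positive result (law of total probability)
p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)

# Posterior: probability of disease given a positive result
p_disease_given_positive = sensitivity * prevalence / p_positive
print(f"P(disease | positive test) = {p_disease_given_positive:.3f}")
```

Even with a fairly accurate test, the posterior stays near 16% under these assumptions, because the low prior (1% prevalence) means most positive results come from the much larger healthy group.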

Where Bayes' Theorem Is Applied

Machine Learning: Naive Bayes classifier

AI Decision-Making: Bayesian Networks

Medical Testing: Diagnostic test accuracy

Resources

Data Science topic PDF


Harvard Data Science Course

Free online course from Harvard covering data science foundations


Interview Questions

➤ Bayes' Theorem updates the probability of a hypothesis as more evidence becomes available.

➤ The prior is the initial belief before seeing evidence; the posterior is the updated belief after seeing evidence.

➤ Naive Bayes is a classification algorithm based on Bayes' Theorem that assumes the features are independent of one another given the class.

➤ Naive Bayes is called "naive" because it assumes all features are independent, which is rarely true in real data.

➤ In medical testing, Bayes' Theorem helps compute the probability of a disease given a test result, which improves diagnostic accuracy.
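
The independence assumption mentioned above can be sketched in a few lines: the likelihood of several words is taken to be the product of the per-word likelihoods within each class. The class priors, the per-word likelihoods, and the function name `naive_bayes_posterior` are hypothetical values and names chosen for illustration.

```python
import math

prior = {"spam": 0.2, "ham": 0.8}
# Hypothetical per-word likelihoods P(word | class)
likelihood = {
    "spam": {"buy": 0.8, "now": 0.6},
    "ham": {"buy": 0.2, "now": 0.3},
}


def naive_bayes_posterior(words):
    # Unnormalized score per class: P(class) * product of P(word | class)
    scores = {
        label: prior[label] * math.prod(likelihood[label][w] for w in words)
        for label in prior
    }
    total = sum(scores.values())  # marginal probability of the words
    return {label: score / total for label, score in scores.items()}


posterior = naive_bayes_posterior(["buy", "now"])
print({label: round(p, 3) for label, p in posterior.items()})
```

Multiplying per-word likelihoods is only exact when the words really are independent within each class; when they are correlated, the resulting probabilities tend to be overconfident even if the predicted class is still correct.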
