Linear Regression (Simple & Multiple)
Description
Linear Regression is one of the simplest and most widely used algorithms in supervised machine learning. It models the relationship between a dependent variable (target) and one or more independent variables (features) by fitting a linear equation to observed data.
Simple Linear Regression
Simple linear regression involves modeling the relationship between one independent variable (X) and one dependent variable (Y) using a straight line:
Y = β₀ + β₁X + ε
- β₀: Intercept (value of Y when X=0)
- β₁: Slope (change in Y for a one-unit change in X)
- ε: Error term (random variation in Y not captured by the linear relationship)
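For a single feature, the least-squares estimates have a simple closed form: β₁ = Σ(xᵢ - x̄)(yᵢ - ȳ) / Σ(xᵢ - x̄)² and β₀ = ȳ - β₁x̄. The short NumPy sketch below illustrates this; the data values are purely illustrative (the same ones used in the scikit-learn example further down).
import numpy as np
# Illustrative data (same values as the scikit-learn example below)
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([3, 4, 2, 5, 6], dtype=float)
# Closed-form least-squares estimates
x_mean, y_mean = x.mean(), y.mean()
beta_1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
beta_0 = y_mean - beta_1 * x_mean
print("Slope (beta_1):", beta_1)      # change in Y per one-unit change in X
print("Intercept (beta_0):", beta_0)  # value of Y when X = 0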
Multiple Linear Regression
Multiple linear regression extends simple linear regression by using multiple independent variables to predict the dependent variable:
Y = β₀ + β₁X₁ + β₂X₂ + ... + βₙXₙ + ε
- Allows modeling more complex relationships involving multiple features.
- Each coefficient represents the expected change in the target for a one-unit change in the corresponding feature, holding the other features constant.
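In matrix form, the least-squares coefficients solve the normal equation β = (XᵀX)⁻¹Xᵀy. The sketch below shows this with NumPy; it is illustrative only and reuses the same sample data as the scikit-learn example further down (np.linalg.lstsq is used instead of an explicit matrix inverse for numerical stability).
import numpy as np
# Illustrative data (same values as the scikit-learn example below)
X = np.array([[1, 2, 3],
              [2, 0, 1],
              [3, 5, 2],
              [4, 3, 3],
              [5, 4, 5]], dtype=float)
y = np.array([14, 8, 19, 18, 25], dtype=float)
# Prepend a column of ones so the first coefficient acts as the intercept
X_design = np.column_stack([np.ones(len(X)), X])
# Solve the least-squares problem (equivalent to the normal equation)
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print("Intercept:", beta[0])
print("Coefficients:", beta[1:])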
Examples
Simple Linear Regression Example in Python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
# Sample data
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([3, 4, 2, 5, 6])
# Create and train model
model = LinearRegression()
model.fit(X, y)
# Predictions
y_pred = model.predict(X)
# Plot
plt.scatter(X, y, color='blue')
plt.plot(X, y_pred, color='red')
plt.title('Simple Linear Regression')
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
Multiple Linear Regression Example in Python
import numpy as np
from sklearn.linear_model import LinearRegression
# Sample data (3 features)
X = np.array([
[1, 2, 3],
[2, 0, 1],
[3, 5, 2],
[4, 3, 3],
[5, 4, 5]
])
y = np.array([14, 8, 19, 18, 25])
# Create and train model
model = LinearRegression()
model.fit(X, y)
# Coefficients
print("Intercept:", model.intercept_)
print("Coefficients:", model.coef_)
# Predict for new data
new_data = np.array([[6, 3, 4]])
prediction = model.predict(new_data)
print("Prediction for", new_data, "is", prediction)
Real-World Applications
Linear Regression Applications
- Economics: Predicting consumer spending, forecasting sales, or analyzing economic growth.
- Healthcare: Estimating patient risk scores, predicting disease progression.
- Real Estate: Predicting housing prices based on features like size, location, and age.
- Marketing: Analyzing the impact of advertising spend on sales.
- Manufacturing: Predicting equipment failure or maintenance needs based on sensor data.

Resources
The following resources will be manually added later:
Video Tutorials
PDF/DOC Materials
Interview Questions
1. What is the difference between simple and multiple linear regression?
Simple linear regression uses a single independent variable to predict the dependent variable, while multiple linear regression uses two or more independent variables.
2. What assumptions does linear regression make?
- Linearity: Relationship between features and target is linear.
- Independence: Observations are independent.
- Homoscedasticity: Constant variance of errors.
- Normality: Residuals are normally distributed.
- No multicollinearity: in multiple regression, the independent variables should not be highly correlated with each other.
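In practice these assumptions are usually checked with residual diagnostics after fitting. Below is a minimal sketch, reusing the small sample data from the simple regression example above; with only five points it is purely illustrative.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
# Sample data from the simple regression example above
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([3, 4, 2, 5, 6])
model = LinearRegression().fit(X, y)
fitted = model.predict(X)
residuals = y - fitted
# Linearity / homoscedasticity: residuals vs. fitted values should show no pattern
plt.scatter(fitted, residuals)
plt.axhline(0, color='red')
plt.xlabel('Fitted values')
plt.ylabel('Residuals')
plt.title('Residuals vs. Fitted')
plt.show()
# Normality: the residual distribution should look roughly bell-shaped
plt.hist(residuals, bins=5)
plt.title('Residual Distribution')
plt.show()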
3. How do you evaluate the performance of a linear regression model?
Common evaluation metrics include Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and R-squared (coefficient of determination).
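A minimal sketch computing these metrics with scikit-learn, again reusing the sample data from the simple regression example above:
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([3, 4, 2, 5, 6])
model = LinearRegression().fit(X, y)
y_pred = model.predict(X)
mse = mean_squared_error(y, y_pred)
print("MSE :", mse)
print("RMSE:", np.sqrt(mse))
print("MAE :", mean_absolute_error(y, y_pred))
print("R^2 :", r2_score(y, y_pred))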
4. What is multicollinearity and why is it a problem in multiple linear regression?
Multicollinearity occurs when independent variables are highly correlated, which can make it difficult to estimate the coefficients accurately and affect model stability.
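Multicollinearity is often quantified with the variance inflation factor (VIF); as a rough rule of thumb, values well above 5 to 10 are treated as a warning sign. A minimal sketch using statsmodels, reusing the feature matrix from the multiple regression example above:
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
# Feature matrix from the multiple regression example above
X = np.array([[1, 2, 3],
              [2, 0, 1],
              [3, 5, 2],
              [4, 3, 3],
              [5, 4, 5]], dtype=float)
# Add a constant column so VIFs are computed for a model with an intercept
X_const = sm.add_constant(X)
# Column 0 is the constant, so report VIFs for the original features only
for i in range(1, X_const.shape[1]):
    print(f"Feature {i}: VIF = {variance_inflation_factor(X_const, i):.2f}")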
5. Can linear regression be used for classification problems?
Linear regression is not well suited to classification because it predicts unbounded continuous values rather than class probabilities. Logistic regression or other classification algorithms are used instead.
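For classification tasks, the standard alternative is logistic regression, which passes the linear combination of features through a sigmoid so the output can be read as a class probability. A minimal sketch with scikit-learn on toy binary data (the values are purely illustrative):
import numpy as np
from sklearn.linear_model import LogisticRegression
# Toy binary data: one feature, labels 0/1
X = np.array([1, 2, 3, 4, 5, 6]).reshape(-1, 1)
y = np.array([0, 0, 0, 1, 1, 1])
clf = LogisticRegression()
clf.fit(X, y)
print("Predicted classes:", clf.predict(X))
print("P(class=1 | X=3.5):", clf.predict_proba([[3.5]])[0, 1])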