Metrics: MAE, MSE, RMSE, R² Score

Description

Evaluation metrics are crucial for measuring the performance of regression models. Each metric provides a different perspective on the errors or accuracy of the predictions made by the model.

Mean Absolute Error (MAE)

MAE measures the average absolute difference between the actual and predicted values. It gives an idea of how much the predictions deviate from the true values, on average.

  • MAE = (1/n) ∑ |yᵢ - ŷᵢ|
  • Simple to interpret, same units as target variable
  • Less sensitive to outliers than MSE and RMSE
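
The MAE formula above can be traced by hand. The following is a minimal sketch in plain Python, using the same four value pairs as the Examples section; the variable names are illustrative only.

```python
# Actual and predicted values (same pairs as in the Examples section)
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

# Absolute error for each pair of actual and predicted values
abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]

# MAE is the mean of those absolute errors
mae = sum(abs_errors) / len(abs_errors)

print(abs_errors)  # [0.5, 0.5, 0.0, 1.0]
print(mae)         # 0.5
```

An MAE of 0.5 means the predictions are off by half a unit of the target variable on average.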

Mean Squared Error (MSE)

MSE calculates the average squared difference between actual and predicted values. It penalizes larger errors more heavily because of squaring.

  • MSE = (1/n) ∑ (yᵢ - ŷᵢ)²
  • Useful for emphasizing larger errors
  • Units are squared compared to the target variable, which can make interpretation harder
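
To see how squaring emphasizes larger errors, the sketch below compares two made-up error lists with the same total absolute error. The lists and the helper names `mae_of` and `mse_of` are assumptions chosen for illustration.

```python
# Two hypothetical sets of prediction errors with the same total absolute error
errors_even = [1.0, 1.0, 1.0, 1.0]   # four small misses
errors_spiky = [0.0, 0.0, 0.0, 4.0]  # one large miss

def mae_of(errors):
    """Mean of the absolute errors."""
    return sum(abs(e) for e in errors) / len(errors)

def mse_of(errors):
    """Mean of the squared errors."""
    return sum(e ** 2 for e in errors) / len(errors)

print(mae_of(errors_even), mae_of(errors_spiky))  # 1.0 1.0
print(mse_of(errors_even), mse_of(errors_spiky))  # 1.0 4.0
```

MAE rates both sets equally, while MSE rates the set with the single large miss four times worse.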

Root Mean Squared Error (RMSE)

RMSE is the square root of MSE and provides error in the same units as the target variable, making it more interpretable.

  • RMSE = √MSE
  • Heavily penalizes large errors
  • Commonly used metric for regression problems
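
One way to check the units claim is to rescale the target and watch how each metric responds. This is a rough sketch: the factor of 1000 stands in for a hypothetical unit change (for example kilometres to metres), and the helper names are illustrative.

```python
import math

def mse_of(y_true, y_pred):
    """Mean of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse_of(y_true, y_pred):
    """Square root of MSE."""
    return math.sqrt(mse_of(y_true, y_pred))

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

# Hypothetical unit change: multiply every value by 1000
scale = 1000.0
y_true_scaled = [v * scale for v in y_true]
y_pred_scaled = [v * scale for v in y_pred]

mse_ratio = mse_of(y_true_scaled, y_pred_scaled) / mse_of(y_true, y_pred)
rmse_ratio = rmse_of(y_true_scaled, y_pred_scaled) / rmse_of(y_true, y_pred)

print(mse_ratio)   # grows with the square of the scale factor
print(rmse_ratio)  # grows linearly with the scale factor
```

RMSE follows the target's units (a factor of about 1000), while MSE follows the squared units (a factor of about 1,000,000).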

R² Score (Coefficient of Determination)

R² Score measures the proportion of variance in the dependent variable that is predictable from the independent variables.

  • At most 1; equals 0 for a model that always predicts the mean of the target, and is negative for models that do worse than that
  • R² = 1 - (SS_res / SS_tot), where SS_res is residual sum of squares, and SS_tot is total sum of squares
  • Closer to 1 indicates a better fit
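
The R² formula above can be computed step by step. The sketch below uses plain Python and the same value pairs as the Examples section; it is meant to show where SS_res and SS_tot come from, not to replace a library implementation.

```python
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

# Mean of the actual values, used as the baseline prediction
y_mean = sum(y_true) / len(y_true)

# Residual sum of squares: squared error of the model's predictions
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

# Total sum of squares: squared error of always predicting the mean
ss_tot = sum((t - y_mean) ** 2 for t in y_true)

r2 = 1 - ss_res / ss_tot

print(ss_res)        # 1.5
print(ss_tot)        # 29.1875
print(round(r2, 4))  # 0.9486
```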

Examples

Python Code to Calculate MAE, MSE, RMSE, and R²

from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
import numpy as np

# True and predicted values
y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])

# Calculate MAE
mae = mean_absolute_error(y_true, y_pred)
print(f"MAE: {mae}")

# Calculate MSE
mse = mean_squared_error(y_true, y_pred)
print(f"MSE: {mse}")

# Calculate RMSE
rmse = np.sqrt(mse)
print(f"RMSE: {rmse}")

# Calculate R² Score
r2 = r2_score(y_true, y_pred)
print(f"R² Score: {r2}")

Real-World Applications

Regression Model Evaluation

  • Healthcare: Predicting patient outcomes where accuracy and error rates are critical.
  • Finance: Stock price forecasting, where error metrics help in tuning predictive models.
  • Manufacturing: Predicting equipment failure and maintenance needs based on sensor data.
  • Energy: Forecasting energy consumption and load balancing.

Interview Questions

1. What is the difference between MAE and MSE?

MAE measures average absolute errors and is less sensitive to outliers, while MSE squares the errors, giving more weight to larger errors.

2. Why is RMSE often preferred over MSE?

RMSE is in the same unit as the target variable, making it easier to interpret compared to MSE, which is in squared units.

3. What does an R² score of 0.8 indicate?

It indicates that 80% of the variance in the dependent variable is explained by the model. Whether that counts as a good fit depends on the domain and on how noisy the data is.

4. Can R² be negative? What does it mean?

Yes, a negative R² indicates that the model performs worse than a horizontal line (mean prediction), suggesting a poor fit.
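
A small made-up case shows how this happens. In the sketch below, the predictions run in the opposite direction to the actual values; the data and the helper name `r2_of` are hypothetical.

```python
def r2_of(y_true, y_pred):
    """R² = 1 - SS_res / SS_tot, computed directly from the definition."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0]

# Baseline: always predict the mean of the actual values
r2_mean = r2_of(y_true, [2.0, 2.0, 2.0])

# Poor model: predictions trend in the wrong direction
r2_reversed = r2_of(y_true, [3.0, 2.0, 1.0])

print(r2_mean)      # 0.0
print(r2_reversed)  # -3.0
```

The reversed predictions have four times the squared error of the mean baseline, so R² drops to -3.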

5. When should you prefer MAE over RMSE?

MAE is preferred when the data contains outliers that should not dominate the score, or when every unit of error should count equally regardless of how large the individual error is.