Table of contents
- 1. Mean Squared Error (MSE)
- 2. Root Mean Squared Error (RMSE)
- 3. Mean Absolute Error (MAE)
- 4. Mean Absolute Percentage Error (MAPE)
- 5. R-squared (Coefficient of Determination)
- 6. Adjusted R-squared
- 7. Root Mean Squared Logarithmic Error (RMSLE)
- 8. Explained Variance Score
- 9. Mean Squared Logarithmic Error (MSLE)
- 10. Log-Cosh Loss
- 11. Quantile Loss
- 12. Relative Absolute Error (RAE)
- 13. Relative Squared Error (RSE)
- 14. Symmetric Mean Absolute Percentage Error (SMAPE)
- 15. Mean Bias Deviation (MBD)
- 16. Mean Directional Accuracy (MDA)
- 17. Huber Loss
1. Mean Squared Error (MSE)
Explanation
Mean Squared Error (MSE) measures the average squared difference between estimated and actual values. It gives higher weight to larger errors due to squaring.
Mathematical Formula
MSE = (1/n) * ∑[(y_i - ŷ_i)^2]
Where:
n is the number of data points.
y_i is the actual value.
ŷ_i is the predicted value.
When to Use
When large errors should be penalized more heavily than small ones
General regression model performance evaluation
When errors are expected to be roughly normally distributed
When outliers should strongly influence the evaluation
Python Implementation
from sklearn.metrics import mean_squared_error
import numpy as np
def calculate_mse(y_true, y_pred):
    return mean_squared_error(y_true, y_pred)
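A minimal usage sketch, continuing from the implementation above; the sample arrays are illustrative, not taken from any real dataset:
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])
# (0.5^2 + 0^2 + 1.5^2 + 1^2) / 4 = 0.875
print(calculate_mse(y_true, y_pred))  # 0.875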
Pros
Simple to understand
Mathematically convenient
Differentiable
Symmetric error measurement
Cons
Sensitive to outliers
Squared values make direct interpretation difficult
Reported in the square of the target variable's units
Fun Facts
Commonly used in machine learning optimization
Fundamental in least squares regression
Minimized by linear regression
2. Root Mean Squared Error (RMSE)
Explanation
RMSE is the square root of MSE, bringing the metric back to the original data scale.
Mathematical Formula
RMSE = √[(1/n) * ∑[(y_i - ŷ_i)^2]]
When to Use
Need error in original units
Want to penalize large errors
Regression model comparison
Python Implementation
from sklearn.metrics import mean_squared_error
import numpy as np
def calculate_rmse(y_true, y_pred):
    return np.sqrt(mean_squared_error(y_true, y_pred))
Pros
Interpretable in original units
Sensitive to large errors
Symmetric error measurement
Cons
Sensitive to outliers
Squares errors before taking root
Less intuitive for non-technical audiences
Fun Facts
Equals the standard deviation of the residuals when their mean is zero
Often preferred over MSE for reporting
3. Mean Absolute Error (MAE)
Explanation
MAE calculates the average absolute difference between predicted and actual values.
Mathematical Formula
MAE = (1/n) * ∑|y_i - ŷ_i|
When to Use
When outliers should not dominate the evaluation
When a linear (unweighted) error measure is preferred
Robust regression evaluation
Python Implementation
from sklearn.metrics import mean_absolute_error
def calculate_mae(y_true, y_pred):
    return mean_absolute_error(y_true, y_pred)
Pros
Less sensitive to outliers
Interpretable
Linear error measurement
Errors are in the same units as the target variable
Cons
Doesn't differentiate overestimation vs underestimation
Less mathematically convenient for optimization
Less penalization of large errors
Fun Facts
Also known as L1 loss
More robust for non-Gaussian error distributions
4. Mean Absolute Percentage Error (MAPE)
Explanation
MAPE represents the average percentage difference between predicted and actual values.
Mathematical Formula
MAPE = (1/n) * ∑|(y_i - ŷ_i) / y_i| * 100
When to Use
Percentage-based error comparison
Data where actual values stay well away from zero
Forecasting and time series
Python Implementation
import numpy as np
def calculate_mape(y_true, y_pred):
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100
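A brief usage sketch, continuing from the implementation above; the arrays are illustrative, and the epsilon-guarded variant is one common workaround (an assumption here, not part of the standard definition) for actual values of zero:
y_true = np.array([100.0, 200.0, 50.0])
y_pred = np.array([110.0, 190.0, 60.0])
print(calculate_mape(y_true, y_pred))  # (10/100 + 10/200 + 10/50) / 3 * 100 ≈ 11.67

def calculate_mape_safe(y_true, y_pred, eps=1e-8):
    # Hypothetical helper: clip the denominator to avoid division by zero
    return np.mean(np.abs((y_true - y_pred) / np.maximum(np.abs(y_true), eps))) * 100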
Pros
Scale-independent
Easy percentage interpretation
Comparable across different scales
Cons
Undefined when true value is zero
Biased towards smaller values
Asymmetric error treatment
Fun Facts
Commonly used in financial forecasting
Can be problematic with small true values
5. R-squared (Coefficient of Determination)
Explanation
R-squared represents the proportion of variance in the dependent variable predictable from independent variables.
Mathematical Formula
R² = 1 - (SS_res / SS_tot)
Where:
SS_res is the sum of squares of residuals
SS_tot is the total sum of squares
When to Use
Model goodness-of-fit assessment
Linear regression evaluation
Comparing model predictive power
Python Implementation
from sklearn.metrics import r2_score
def calculate_r2(y_true, y_pred):
    return r2_score(y_true, y_pred)
Pros
Easy interpretation (typically between 0 and 1)
Indicates model explanatory power
Normalized measure
Cons
Doesn't indicate model accuracy
Can be misleading with non-linear relationships
Increases with more predictors
Fun Facts
Usually between 0 and 1, but can be negative when the model fits worse than simply predicting the mean
Not always best for model selection
6. Adjusted R-squared
Explanation
Adjusted R-squared penalizes adding unnecessary predictors to the model.
Mathematical Formula
Adjusted R² = 1 - [(1 - R²) * (n - 1) / (n - p - 1)]
Where:
n is the number of data points
p is the number of predictors
When to Use
Complex models with multiple predictors
Preventing overfitting
Model complexity comparison
Python Implementation
import numpy as np
from sklearn.linear_model import LinearRegression
def calculate_adjusted_r2(X, y):
    model = LinearRegression()
    model.fit(X, y)
    r2 = model.score(X, y)      # ordinary R² on the training data
    n, p = X.shape              # n samples, p predictors
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return adj_r2
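A minimal usage sketch with an assumed toy feature matrix (two predictors, five samples):
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = np.array([2.0, 3.0, 6.0, 7.0, 10.0])
print(calculate_adjusted_r2(X, y))  # close to 1 for this nearly linear toy data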
Pros
Prevents overfitting
Accounts for model complexity
More reliable for complex models
Cons
Only corrects for the number of predictors, not for other forms of overfitting
Assumes linear relationship
Not suitable for non-linear models
Fun Facts
Used in feature selection
Developed to improve R-squared limitations
7. Root Mean Squared Logarithmic Error (RMSLE)
Explanation
RMSLE calculates root mean squared error after log transformation, reducing impact of large errors.
Mathematical Formula
RMSLE = √[(1/n) * ∑[(log(ŷ_i + 1) - log(y_i + 1))^2]]
When to Use
Percentage errors matter more than absolute errors
Skewed data
Predicting exponential growth
Python Implementation
import numpy as np
def calculate_rmsle(y_true, y_pred):
    return np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true))**2))
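A small illustrative sketch (assumed values) of the relative-error behavior: being off by a factor of two costs roughly the same whether the true value is 10 or 10,000:
print(calculate_rmsle(np.array([10.0]), np.array([20.0])))        # ≈ 0.65
print(calculate_rmsle(np.array([10000.0]), np.array([20000.0])))  # ≈ 0.69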
Pros
Less sensitive to outliers
Handles exponential growth
Logarithmic scale benefits
Cons
Complex interpretation
Less intuitive
Requires log transformation
Fun Facts
Popular in Kaggle competitions
Useful for price and volume predictions
8. Explained Variance Score
Explanation
Measures the proportion of variance explained by the model compared to total variance.
Mathematical Formula
Explained Variance = 1 - (Var(y - ŷ) / Var(y))
When to Use
Model performance assessment
Variance explanation
Prediction quality evaluation
Python Implementation
from sklearn.metrics import explained_variance_score
def calculate_explained_variance(y_true, y_pred):
    return explained_variance_score(y_true, y_pred)
Pros
Provides variance explanation
Sensitive to prediction errors
Normalized score
Cons
Nearly identical to R-squared unless predictions are systematically biased
Assumes linear relationships
Limited interpretability
Fun Facts
Used in signal processing
Indicates model's explanatory power
9. Mean Squared Logarithmic Error (MSLE)
Explanation
Mean Squared Logarithmic Error (MSLE) is a loss function that applies logarithmic scaling to reduce the impact of large errors while preserving relative differences.
Mathematical Formula
MSLE = (1/n) * ∑[(log(y_i + 1) - log(ŷ_i + 1))^2]
When to Use
When dealing with data with exponential growth
Useful for metrics where relative errors are more important than absolute errors
Recommended for scenarios with wide range of target values
Particularly effective for financial, economic, or scientific data with exponential characteristics
Python Implementation
import numpy as np
def mean_squared_log_error(y_true, y_pred):
    return np.mean(np.square(np.log1p(y_true) - np.log1p(y_pred)))
Pros
Reduces impact of large outliers
Handles wide range of scales effectively
Emphasizes relative prediction accuracy
Cons
Not suitable for negative predictions
Can be sensitive to small changes in log-scaled values
Less interpretable compared to MSE
Fun Facts
Equivalent to computing MSE on log1p-transformed targets and predictions
Often used in competitions like Kaggle for certain prediction tasks
Closely related to log transformation in statistical modeling
10. Log-Cosh Loss
Explanation
Log-Cosh Loss is a smooth loss function that behaves like MSE for small errors and like MAE for large ones, combining robustness to outliers with full differentiability.
Mathematical Formula
Log-Cosh Loss = (1/n) * ∑[log(cosh(y_i - ŷ_i))]
When to Use
When you want a smooth loss function
Suitable for regression problems with potential outliers
Provides a balance between MSE and MAE
Python Implementation
import numpy as np
def log_cosh_loss(y_true, y_pred):
    return np.mean(np.log(np.cosh(y_pred - y_true)))
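A quick numerical sketch (assumed values) of the two regimes: log(cosh(x)) is approximately x²/2 for small errors and approximately |x| - log(2) for large ones:
small, large = 0.1, 10.0
print(np.log(np.cosh(small)), 0.5 * small**2)           # ≈ 0.0050 vs 0.0050
print(np.log(np.cosh(large)), abs(large) - np.log(2))   # ≈ 9.3069 vs 9.3069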
Pros
Smooth and differentiable
Less sensitive to outliers compared to MSE
Computationally efficient
Cons
Can be less interpretable
Performance depends on specific dataset characteristics
Might not capture all error nuances
Fun Facts
Mathematically similar to Huber loss
Provides a good compromise between MAE and MSE
Commonly used in deep learning optimization
11. Quantile Loss
Explanation
Quantile Loss allows prediction of specific quantiles of the target variable, providing more flexible regression modeling.
Mathematical Formula
Quantile Loss = (1/n) * ∑[max(q * (y_i - ŷ_i), (q - 1) * (y_i - ŷ_i))]
Where:
- q is the quantile
When to Use
Predicting specific percentiles of target variable
Asymmetric prediction scenarios
Risk assessment and financial modeling
Capturing uncertainty in predictions
Python Implementation
import numpy as np
def quantile_loss(y_true, y_pred, quantile=0.5):
    errors = y_true - y_pred
    # Under-predictions are weighted by q, over-predictions by (1 - q)
    return np.mean(np.maximum(quantile * errors, (quantile - 1) * errors))
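A short sketch (assumed values) of the asymmetry: with quantile=0.9, under-predicting by 2 costs far more than over-predicting by 2:
y_true = np.array([10.0, 10.0, 10.0])
under = np.array([8.0, 8.0, 8.0])    # model predicts too low
over = np.array([12.0, 12.0, 12.0])  # model predicts too high
print(quantile_loss(y_true, under, quantile=0.9))  # 0.9 * 2 = 1.8
print(quantile_loss(y_true, over, quantile=0.9))   # 0.1 * 2 = 0.2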
Pros
Allows flexible probabilistic predictions
Can model different parts of the distribution
Useful for risk-aware modeling
Cons
More complex to interpret
Computationally more expensive
Requires careful quantile selection
Fun Facts
Used in financial risk modeling
Enables prediction of confidence intervals
Powerful technique in machine learning uncertainty estimation
12. Relative Absolute Error (RAE)
Explanation
Relative Absolute Error (RAE) measures total absolute error relative to that of a naive baseline that always predicts the mean of the actual values.
Mathematical Formula
RAE = ∑|y_i - ŷ_i| / ∑|y_i - mean(y)|
When to Use
Comparing different model performances
Normalizing error across different scales
Assessing relative prediction quality
Python Implementation
import numpy as np
def relative_absolute_error(y_true, y_pred):
    numerator = np.sum(np.abs(y_true - y_pred))
    # Denominator: total absolute error of the mean-prediction baseline
    denominator = np.sum(np.abs(y_true - np.mean(y_true)))
    return numerator / denominator
Pros
Scale-independent metric
Easy to interpret
Provides relative performance assessment
Cons
Sensitive to extreme values
Can be misleading with small datasets
Doesn't provide absolute error magnitude
Fun Facts
Part of the family of relative error metrics
Used in academic and research model evaluations
Helps compare models across different domains
13. Relative Squared Error (RSE)
Explanation
Relative Squared Error (RSE) compares the model's total squared error to that of a baseline that always predicts the mean of the actual values.
Mathematical Formula
RSE = ∑[(y_i - ŷ_i)^2] / ∑[(y_i - mean(y))^2]
When to Use
Comparing model performances
Normalizing error across different datasets
Providing relative error assessment
Python Implementation
import numpy as np
def relative_squared_error(y_true, y_pred):
    numerator = np.sum((y_true - y_pred)**2)
    denominator = np.sum((y_true - np.mean(y_true))**2)
    return numerator / denominator
Pros
Provides normalized error metric
Penalizes larger errors more
Useful for model comparison
Cons
Squares errors, amplifying outlier impact
Less interpretable than absolute metrics
Can be misleading with small datasets
Fun Facts
Equal to 1 minus R-squared
Common in statistical modeling
Helps assess model improvement over baseline
14. Symmetric Mean Absolute Percentage Error (SMAPE)
Explanation
SMAPE is a bounded percentage error metric whose denominator averages the actual and predicted magnitudes, so swapping actuals and predictions leaves the score unchanged.
Mathematical Formula
SMAPE = (1/n) * ∑[|y_i - ŷ_i| / ((|y_i| + |ŷ_i|) / 2)] * 100
When to Use
Time series forecasting
Comparing models with different scales
Handling both positive and negative predictions
Python Implementation
import numpy as np
def symmetric_mean_absolute_percentage_error(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred) / ((np.abs(y_true) + np.abs(y_pred)) / 2)) * 100
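A brief sketch (assumed values, reusing the calculate_mape helper defined earlier) of the symmetry: swapping actuals and predictions changes MAPE but leaves SMAPE unchanged:
a, b = np.array([100.0]), np.array([110.0])
print(calculate_mape(a, b), calculate_mape(b, a))  # 10.0 vs ≈ 9.09
print(symmetric_mean_absolute_percentage_error(a, b),
      symmetric_mean_absolute_percentage_error(b, a))  # ≈ 9.52 for both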
Pros
Symmetric handling of errors
Percentage-based, easy to interpret
Works with both positive and negative values
Cons
Can be unstable with values close to zero
Sensitive to small absolute differences
Might not be suitable for all datasets
Fun Facts
Recommended by many forecasting competitions
More robust than traditional MAPE
Widely used in demand forecasting
15. Mean Bias Deviation (MBD)
Explanation
Mean Bias Deviation measures the average bias in predictions, indicating systematic over- or under-estimation.
Mathematical Formula
MBD = (1/n) * ∑(ŷ_i - y_i)
When to Use
Checking systematic model biases
Quality control in predictive models
Understanding model prediction tendencies
Python Implementation
import numpy as np
def mean_bias_deviation(y_true, y_pred):
    return np.mean(y_pred - y_true)
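A tiny sketch (assumed values) of the cancellation issue listed in the Cons below: large errors in opposite directions can produce a bias of zero:
y_true = np.array([10.0, 20.0, 30.0])
y_pred = np.array([20.0, 10.0, 30.0])  # one large over- and one large under-prediction
print(mean_bias_deviation(y_true, y_pred))  # 0.0 despite clearly poor predictions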
Pros
Simple to calculate
Directly shows model bias direction
Helps identify systematic errors
Cons
Positive and negative errors can cancel out
Less informative for complex models
Doesn't capture error magnitude
Fun Facts
Important in scientific and engineering modeling
Helps improve model calibration
Commonly used in climate and environmental prediction
16. Mean Directional Accuracy (MDA)
Explanation
Mean Directional Accuracy measures the model's ability to predict the correct direction of change.
Mathematical Formula
MDA = (1/(n-1)) * ∑[sign(y_i - y_{i-1}) == sign(ŷ_i - ŷ_{i-1})]
When to Use
Time series and sequential predictions
Financial market forecasting
Trend prediction models
Python Implementation
import numpy as np
def mean_directional_accuracy(y_true, y_pred):
    direction_true = np.diff(y_true)   # actual period-to-period changes
    direction_pred = np.diff(y_pred)   # predicted period-to-period changes
    return np.mean(np.sign(direction_true) == np.sign(direction_pred))
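A quick usage sketch on an assumed toy series: the predictions track three of the four directional moves, so MDA is 0.75:
y_true = np.array([1.0, 2.0, 1.5, 2.5, 3.0])
y_pred = np.array([1.2, 1.8, 1.6, 2.4, 2.2])
print(mean_directional_accuracy(y_true, y_pred))  # 0.75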
Pros
Focuses on directional prediction
Useful for trend-based forecasting
Simple to understand and implement
Cons
Ignores magnitude of predictions
Less informative for non-sequential data
Can be misleading with noisy data
Fun Facts
Critical in financial and economic modeling
Used in technical analysis of stock markets
Complements other accuracy metrics
17. Huber Loss
Explanation
Huber Loss combines MSE and MAE: it is quadratic for small errors and linear for large ones, making it robust to outliers while remaining smooth near zero.
Mathematical Formula
Huber Loss = (1/n) * ∑[0.5 * (y_i - ŷ_i)^2 if |y_i - ŷ_i| <= δ, else δ * (|y_i - ŷ_i| - 0.5 * δ)]
Where:
- δ is a threshold parameter
When to Use
Robust regression
Outlier-prone datasets
Machine learning optimization
Python Implementation
import numpy as np
def huber_loss(y_true, y_pred, delta=1.0):
    error = y_true - y_pred
    is_small_error = np.abs(error) <= delta
    squared_loss = 0.5 * error**2                           # quadratic region
    linear_loss = delta * np.abs(error) - 0.5 * delta**2    # linear region
    return np.mean(np.where(is_small_error, squared_loss, linear_loss))
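A short sketch (assumed values) of how δ caps an outlier's influence: the point with error 10 contributes only linearly here, while the comparable 0.5 * MSE is dominated by it:
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 3.5, 14.0])   # last point is an outlier
print(huber_loss(y_true, y_pred, delta=1.0))  # ≈ 2.44
print(np.mean((y_true - y_pred)**2) / 2)      # ≈ 12.56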
Pros
Robust to outliers
Combines quadratic and linear loss
Smooth transition
Cons
Requires hyperparameter tuning
More complex implementation
Slightly more computation per point than MSE or MAE
Fun Facts
Named after Peter Huber
Used in robust statistics