Regression Models
Regression models are supervised learning algorithms that predict continuous numerical values. They are used when the output variable is a continuous real value, such as a price or a weight.
What is Regression?
Regression is a supervised learning technique where the algorithm learns from labeled training data to predict a continuous output variable. The goal is to find the relationship between independent variables (features) and a dependent variable (target) by estimating how the target changes as the features change.
Unlike classification models that predict discrete categories, regression models predict continuous values. This makes them suitable for problems where the output is a quantity rather than a category.
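As a minimal sketch of the idea, the snippet below fits a straight line to a handful of made-up (feature, target) pairs using NumPy's least-squares solver; all data values here are purely illustrative.

```python
import numpy as np

# Hypothetical training data: one feature (say, square footage in
# hundreds) and a continuous target (say, price); values are invented.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

# Add an intercept column and solve the ordinary least-squares problem.
X_design = np.hstack([np.ones((X.shape[0], 1)), X])
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
intercept, slope = coef

# Predict a continuous value for an unseen input.
print(intercept + slope * 6.0)
```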
Key Characteristics
- Predicts continuous numerical values
- Requires labeled training data with numerical target values
- Models the relationship between independent and dependent variables
- Evaluated using metrics like MSE, RMSE, MAE, and R-squared
- Can handle simple linear relationships or complex non-linear patterns
Common Regression Algorithms
Linear Regression: A fundamental supervised learning algorithm for predicting continuous values based on linear relationships between variables.
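A minimal sketch of fitting a linear regression, assuming scikit-learn is available; the training data is invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data: the underlying relationship is roughly y = 3x + 1.
X = np.array([[1], [2], [3], [4]])
y = np.array([4.1, 6.9, 10.2, 12.8])

model = LinearRegression()
model.fit(X, y)
print(model.coef_, model.intercept_)  # learned slope and intercept
print(model.predict([[5]]))           # predicted continuous value
```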
Polynomial Regression: An extension of linear regression that can model non-linear relationships by adding polynomial terms to the regression equation.
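One common way to set this up is to expand the features with polynomial terms and then fit an ordinary linear model, sketched here with scikit-learn's PolynomialFeatures; the roughly quadratic data is invented.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Illustrative non-linear data: y is roughly quadratic in x.
X = np.array([[-2], [-1], [0], [1], [2], [3]])
y = np.array([4.2, 1.1, 0.1, 0.9, 4.1, 8.8])

# degree=2 adds an x^2 column, letting a linear model fit a curve.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
print(model.predict([[4]]))
```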
Ridge and Lasso Regression: Regularization methods that add penalty terms to the linear regression cost function to reduce model complexity and prevent overfitting.
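A brief sketch contrasting the two penalties with scikit-learn; the alpha values and data are illustrative choices, not recommendations.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Illustrative data with two features.
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]])
y = np.array([3.1, 2.9, 7.2, 6.8, 10.1])

# alpha controls penalty strength; larger values shrink coefficients more.
ridge = Ridge(alpha=1.0).fit(X, y)  # L2 penalty: shrinks coefficients
lasso = Lasso(alpha=0.1).fit(X, y)  # L1 penalty: can zero out coefficients
print(ridge.coef_, lasso.coef_)
```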
Common Applications
Price Prediction: Predicting house prices, stock prices, or product prices based on various features and historical data.
Sales Forecasting: Estimating future sales based on historical sales data, marketing spend, seasonality, and other factors.
Risk Assessment: Predicting the likelihood of loan defaults or insurance claims based on customer attributes and behavior.
Evaluation Metrics
Regression models are evaluated using different metrics from those used for classification models. Common evaluation metrics include the following; a short computed example appears after the list:
- Mean Squared Error (MSE): The average of the squared differences between predicted and actual values.
- Root Mean Squared Error (RMSE): The square root of MSE, which provides an error measure in the same units as the target variable.
- Mean Absolute Error (MAE): The average of the absolute differences between predicted and actual values.
- R-squared (R²): The proportion of the variance in the dependent variable that is predictable from the independent variables.
- Adjusted R-squared: A modified version of R-squared that adjusts for the number of predictors in the model.
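As a worked example with invented actual and predicted values, the metrics above can be computed with NumPy and scikit-learn; the adjusted R-squared is derived from R-squared using the standard formula with n samples and p predictors (p is assumed to be 1 here).

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Hypothetical actual vs. predicted values from some regression model.
y_true = np.array([3.0, 5.0, 7.5, 9.0])
y_pred = np.array([2.8, 5.4, 7.1, 9.6])

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)  # same units as the target variable
mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)

# Adjusted R-squared penalizes additional predictors:
# 1 - (1 - R^2) * (n - 1) / (n - p - 1), with n samples and p predictors.
n, p = len(y_true), 1
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(mse, rmse, mae, r2, adj_r2)
```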