What are linear models in machine learning?
Linear models are widely used in machine learning. They make predictions using a linear function of the input features. Today, we will talk about linear models in machine learning. Are you ready? Let's get into it!
1 - Linear regression
Linear regression is one of the most popular supervised learning algorithms. It is also simple, and among the best understood algorithms in statistics and machine learning.
Linear regression is a basic technique used for predictive analysis. The general aim of regression is to study two questions:
- Does a set of predictor variables predict an outcome variable?
- Which variables are the most significant and have the most impact on the outcome variable?
Generally, predictive regression analysis is used to explain the relationship between independent variables (one or more) and a dependent variable. The simplest form of the linear regression equation, with one independent variable and one dependent variable, can be written as:
y = c + b * x
Here,
y = dependent variable,
c = constant (the intercept),
b = regression coefficient,
x = independent variable
This is the simple formula of linear regression. For multiple linear regression, the formula looks like this:
y = c + b1 * x1 + … + bn * xn
Here, x1, …, xn are the independent variables, and b1, …, bn are their regression coefficients.
Linear regression is one of the most popular modeling techniques and is usually one of the first topics people pick up when learning predictive modeling. In this technique, the dependent variable is continuous, the independent variable(s) can be discrete or continuous, and the regression line is linear. It builds a relationship between the independent variables X (one or more) and the dependent variable Y using a regression line.
It is represented by the equation Y = a + b * X + e, where a is the intercept, b is the slope of the line, and e is an error term. This equation can be used to predict the value of the target variable from given predictor variables.
The difference between simple linear regression and multiple linear regression is that multiple linear regression has several independent variables, whereas simple linear regression has only one. We can write the multiple linear regression equation as follows: Y = a + b1 * X1 + b2 * X2 + … + bn * Xn + e.
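To make this concrete, here is a minimal sketch using scikit-learn's LinearRegression on synthetic data (the feature values and true coefficients are invented for illustration). The fitted intercept plays the role of a, and the fitted coefficients the roles of b1 and b2:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 2))        # two independent variables X1, X2
e = rng.normal(0, 0.5, size=100)             # error term
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + e  # Y = a + b1*X1 + b2*X2 + e

model = LinearRegression().fit(X, y)
print("intercept a:", model.intercept_)      # close to 3.0
print("coefficients b1, b2:", model.coef_)   # close to [2.0, -1.5]
print("prediction at X1=4, X2=2:", model.predict([[4.0, 2.0]]))
```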
Some key points about linear regression:
- Fast and easy to model
- Very intuitive and easy to understand and interpret
- Linear regression is very sensitive to outliers (see the sketch below)
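The last point is easy to demonstrate. In the following sketch (synthetic data, with one deliberately corrupted observation), a single outlier pulls the fitted slope well away from its true value:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * x.ravel() + 1.0                    # a perfect line: slope 2, intercept 1

clean = LinearRegression().fit(x, y)
y_outlier = y.copy()
y_outlier[-1] = 100.0                        # corrupt a single observation
dirty = LinearRegression().fit(x, y_outlier)

print("slope without outlier:", clean.coef_[0])   # 2.0
print("slope with one outlier:", dirty.coef_[0])  # pulled far away from 2.0
```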
2 - Logistic regression
The predicted values of linear regression are continuous (for instance, temperatures in degrees), while the predicted values of logistic regression are discrete (for instance, true or false). For binary classification, logistic regression is more suitable than linear regression. For example, consider a dataset where y = 1 or 0, with 1 denoting the default class.
To illustrate, imagine that we want to predict whether it will rain or not. We will have y = 1 if it rains and y = 0 otherwise.
Unlike linear regression, logistic regression provides its result in the form of probabilities. The result, therefore, belongs to the interval [0, 1].
Logistic regression is used to find the probability of an event, that is, to determine the success or failure of an event. It is used when the dependent variable takes a finite set of values (0/1, True/False, Yes/No). Here, the predicted value of Y lies between 0 and 1.
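Here is a minimal sketch of the rain example, assuming scikit-learn; the humidity and pressure features and all their values are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Features: [humidity %, pressure hPa]; target: 1 = rain, 0 = no rain
X = np.array([[90, 1002], [85, 1005], [40, 1020], [35, 1025],
              [80, 1008], [30, 1022], [88, 1000], [45, 1018]], dtype=float)
y = np.array([1, 1, 0, 0, 1, 0, 1, 0])

model = LogisticRegression(max_iter=1000).fit(X, y)
new_day = [[87.0, 1003.0]]
print("P(rain):", model.predict_proba(new_day)[0, 1])  # a probability in [0, 1]
print("predicted class:", model.predict(new_day)[0])   # 1 (rain) or 0 (no rain)
```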
The important points:
- Is widely used for classification problems
- Does not require a linear relationship between the dependent and independent variables
- Can handle different types of relationships
- To avoid overfitting and underfitting, it is best to include all significant variables.
- Requires large sample sizes because maximum likelihood estimates are less powerful for small sample sizes
- Independent variables should not be correlated with each other
- If the value of the dependent variable is ordinal, it is called ordinal logistic regression.
- If the dependent variable is multi-class, it is called multinomial logistic regression.
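For the multi-class case mentioned in the last point, here is a minimal sketch of multinomial logistic regression, assuming scikit-learn and its bundled iris dataset (three classes):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # three classes, so this is a multi-class problem
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# With the default lbfgs solver, scikit-learn fits a multinomial (softmax) model
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
print("class probabilities for one flower:", model.predict_proba(X_test[:1]))
```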
3 - Ridge regression (L2 regularization)
Ridge regression is also a linear regression, and its prediction formula is the same as that of linear regression. However, this model places an extra constraint on the coefficients w: their magnitudes should be small, so all the entries of w should be close to zero. Each feature then has a minimal effect on the overall result. This constraint is known as L2 regularization. A ridge model's training score is lower than that of plain linear regression, but its test score is higher: a less complex model performs worse on the training set but generalizes better. The regularization strength alpha has a default value of 1.0; increasing alpha forces the coefficients to move towards zero.
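A minimal sketch of this effect, assuming scikit-learn's Ridge on synthetic data: as alpha grows, the overall magnitude of the coefficients shrinks towards zero.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
true_w = np.array([3.0, -2.0, 0.5, 0.0, 1.0])
y = X @ true_w + rng.normal(0, 0.1, size=50)

for alpha in [0.1, 1.0, 10.0, 100.0]:        # alpha=1.0 is the default
    ridge = Ridge(alpha=alpha).fit(X, y)
    print(f"alpha={alpha:>5}: sum of |w| = {np.abs(ridge.coef_).sum():.3f}")
```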
4 - Lasso (L1 regularization)
Like ridge regression, lasso regularizes the coefficients of a linear model, but it uses L1 regularization. L1 regularization sets some coefficients exactly to zero, so the model ignores some features entirely. With the default regularization parameter, a lasso model can perform badly because it uses only a few of the available features (in one example, 4 of 105). Here, alpha controls how strongly the coefficients are forced towards zero. When alpha is decreased, the number of iterations should be increased.
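A minimal sketch of this behavior, assuming scikit-learn's Lasso on synthetic data where most features are irrelevant: the larger alpha is, the more coefficients are set exactly to zero.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))               # 20 features, most of them irrelevant
true_w = np.concatenate([np.array([3.0, -2.0, 1.5]), np.zeros(17)])
y = X @ true_w + rng.normal(0, 0.1, size=100)

for alpha in [1.0, 0.1, 0.01]:               # alpha=1.0 is the default
    # smaller alpha needs more iterations to converge, hence max_iter
    lasso = Lasso(alpha=alpha, max_iter=10_000).fit(X, y)
    print(f"alpha={alpha}: features used = {np.sum(lasso.coef_ != 0)} of {X.shape[1]}")
```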
Author: Vicki Lezama