Logistic Regression
Buildup
Let’s think about performing regression analysis when the dependent variable $y$ is a categorical variable, particularly one with only two classes, such as male and female, success and failure, or positive and negative. For convenience, let’s just call the classes $0$ and $1$. In cases where the dependent variable is binary, the interest is ‘what is $P(y = 1)$ when we look at independent variables $x_1, x_2, \dots, x_p$’.
However, since $y$ is a qualitative variable, it cannot be expressed by a linear combination of regression coefficients and variables as in ordinary regression analysis. Therefore, we aim to approach it by modeling the probability that $y = 1$.
Given $x_1, \dots, x_p$, let’s set the probability $p$ that $y = 1$ as follows:
$$p = P(y = 1 \mid x_1, \dots, x_p) = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p}}$$
- (i) The exponential function is always greater than $0$, and since the denominator is greater than the numerator, it follows that $0 < p < 1$.
- (ii) Naturally, the probability that $y = 0$ is $1 - p$, and thus $\dfrac{p}{1 - p} = e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p}$. Taking the natural log of both sides gives us $\ln \dfrac{p}{1 - p} = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p$.
Taking the log in this way is called a Logit Transformation, and $\ln \dfrac{p}{1 - p}$ is referred to as the Logit.
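As a sanity check on the algebra above, here is a minimal Python sketch of the logit and its inverse (the function names `logit` and `inv_logit` are my own, not from any source):

```python
import math

def logit(p):
    """Logit transformation: ln(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1 - p))

def inv_logit(z):
    """Inverse of the logit (the logistic/sigmoid function): maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# The logit maps probabilities onto the whole real line, so a linear
# predictor beta_0 + beta_1*x_1 + ... + beta_p*x_p can target it
# without ever leaving the valid range.
print(logit(0.5))                        # 0.0
print(round(inv_logit(logit(0.73)), 6))  # 0.73
```

The two functions undo each other (up to floating-point rounding), which is exactly why the inverse transformation in the next section recovers the probabilities we wanted.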
Model 1
A multiple regression analysis that takes the logit as the dependent variable is called Logistic Regression.
By applying the inverse transformation of the logit transformation to the values obtained from the logistic model, we can obtain the original probabilities we wanted to know. When a coefficient $\beta_i$ is positive, it means that as $x_i$ increases, the probability that $y = 1$ increases; a negative coefficient means that as $x_i$ increases, that probability decreases.
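To make the sign interpretation concrete, the following sketch fits a one-variable logistic regression by gradient ascent on the log-likelihood. The toy data, learning rate, and iteration count are all invented for illustration; real analyses would use a fitted package instead:

```python
import math

def sigmoid(z):
    """Inverse logit: maps a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data: larger x tends to go with y = 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0,   0,   0,   1,   1,   1]

b0, b1 = 0.0, 0.0   # intercept and slope, started at zero
lr = 0.1            # learning rate (arbitrary choice)
for _ in range(2000):
    # Gradient of the log-likelihood: sum of (y - p) and (y - p) * x.
    g0 = sum(y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys))
    g1 = sum((y - sigmoid(b0 + b1 * x)) * x for x, y in zip(xs, ys))
    b0 += lr * g0
    b1 += lr * g1

# b1 comes out positive: as x grows, the fitted probability of y = 1 grows.
print(b1 > 0)                              # True
print(sigmoid(b0 + b1 * 5) > sigmoid(b0))  # True
```

Because the fitted slope is positive, applying the inverse logit to the linear predictor yields probabilities that rise with $x$, matching the interpretation stated above.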
Moreover, logistic regression is a prediction technique, since it tells us the probability of an outcome given certain conditions, but it can also serve as a classification technique by choosing an appropriate Threshold for that probability.
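As a sketch of the thresholding idea (the cutoffs and probability values here are arbitrary examples, not from the source):

```python
def classify(p, threshold=0.5):
    """Assign class 1 when the predicted probability reaches the threshold."""
    return 1 if p >= threshold else 0

# Hypothetical predicted probabilities from a fitted logistic model.
probs = [0.12, 0.47, 0.55, 0.91]

print([classify(p) for p in probs])                 # [0, 0, 1, 1]
# Raising the threshold makes the classifier stricter about predicting 1.
print([classify(p, threshold=0.9) for p in probs])  # [0, 0, 0, 1]
```

The choice of threshold trades one kind of misclassification for another, which is why 0.5 is only a default rather than a rule.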
See Also
- The reason it is called logistic is due to the use of the logistic function.
- Results of logistic regression analysis in R
- Hosmer-Lemeshow goodness-of-fit test
- Multiple regression analysis
Hadi. (2006). *Regression Analysis by Example* (4th Edition), pp. 318–320. ↩︎