Logistic regression is a statistical model used to predict a binary dependent variable, i.e. an outcome with only two possible values such as pass/fail, win/lose, or alive/dead. It models the probability of one outcome as a function of one or more independent (predictor) variables, using the logistic (sigmoid) function to map a linear combination of those predictors to a probability between 0 and 1.
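The logistic function mentioned above can be sketched in a few lines of Python; this is a minimal illustration, not a full model:

```python
import math

def logistic(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# z plays the role of the linear combination of predictors, w·x + b.
print(logistic(0.0))   # 0.5: a score of zero means both outcomes are equally likely
print(logistic(4.0))   # ≈ 0.982: large positive scores push the probability toward 1
```

Because the output always lies strictly between 0 and 1, it can be read directly as the probability of the positive class.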
Many fields rely on logistic regression to carry out their work effectively: machine learning, medical research, engineering (for example, to predict the probability that a given system fails), and the social sciences.
Pros of logistic regression
Let's discuss some of the pros that come along with logistic regression.
1. Simple to implement- logistic regression is one of the easiest supervised machine learning algorithms to implement. It requires far less computational power than more complex approaches, which makes it a convenient baseline for classification tasks in machine learning.
2. Easy to update- the model can easily be updated to reflect new data, unlike many other approaches. In logistic regression, these updates are usually performed with stochastic gradient descent.
3. Well-calibrated outputs- the predicted probabilities from this approach are well-calibrated. This makes it more informative than models that return only a final class label.
4. Less prone to over-fitting- on low-dimensional datasets, logistic regression is less prone to over-fitting. It can still over-fit in high-dimensional settings, but this can be controlled with a technique called regularization.
5. Accurate on simple data- it gives accurate results on many simple datasets, and it performs especially well when the dataset has linearly separable features.
6. Easily extended- logistic regression extends naturally to multiple classes (multinomial logistic regression) and provides a natural probabilistic view of class membership.
Cons of logistic regression
1. Over-fitting- high-dimensional datasets can lead to an over-fit model that gives inaccurate results on the test set. Regularization is used to curb this defect, but very strong regularization may instead under-fit the model, again producing inaccurate results.
2. Limited to linear problems- non-linear problems cannot be solved directly with logistic regression, since it assumes a linear decision boundary. Transforming non-linear problems into linear ones can be challenging and time-consuming.
3. Struggles with complex relationships- because logistic regression is less expressive than algorithms such as neural networks, it has difficulty capturing complex relationships between variables.
4. Requires many observations- this technique works best when the number of observations is larger than the number of features. When the number of observations is smaller, the model is likely to over-fit.
5. High data-preparation effort- logistic regression demands tedious data preparation, since features typically need to be scaled and normalized before training.