Multivariate logistic regression
Multivariate logistic regression is a type of data analysis that predicts a categorical outcome, with two or more possible values, from multiple independent variables. It assumes that the natural logarithm of the odds of the outcome has a linear relationship with the independent variables.
Procedure
First, the baseline odds of a specific outcome, compared with not having that outcome, are calculated, giving a constant. Next, the independent variables are incorporated into the model, giving a regression coefficient and a P value for each independent variable. The P value indicates how statistically significant the independent variable's effect on the odds of the outcome is. It is desirable to use as few variables as necessary, and to have at least 10 to 20 times as many observations as independent variables.
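The first step and the sample-size guideline can be illustrated with a minimal Python sketch. It assumes the outcome is coded as 0 or 1, and it treats the constant as the natural logarithm of the baseline odds, which is how an intercept-only logistic model expresses it. The function names and the example data are hypothetical, chosen only for illustration.

```python
import math


def baseline_log_odds(outcomes):
    """Return the log of the baseline odds for a list of 0/1 outcomes.

    This is the constant of a model with no independent variables.
    """
    events = sum(outcomes)
    non_events = len(outcomes) - events
    if events == 0 or non_events == 0:
        raise ValueError("need at least one event and one non-event")
    return math.log(events / non_events)


def max_predictors(n_observations, observations_per_variable=10):
    """Rule-of-thumb ceiling on the number of independent variables."""
    return n_observations // observations_per_variable


# Hypothetical data: 20 events among 100 observations.
outcomes = [1] * 20 + [0] * 80
constant = baseline_log_odds(outcomes)

print("baseline odds:", round(math.exp(constant), 2))
print("predictors at 10 per variable:", max_predictors(len(outcomes)))
print("predictors at 20 per variable:", max_predictors(len(outcomes), 20))
```

Estimating the regression coefficients and their P values requires an iterative fitting routine, which is normally left to a statistics package and is not shown here.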
Formula
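Multivariate logistic regression uses a formula similar to univariate logistic regression, but with multiple independent variables:
$$\pi(x) = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_v x_v}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_v x_v}}$$
where $\pi(x)$ is the probability of the outcome, $\beta_0$ is the constant, $\beta_1, \ldots, \beta_v$ are the regression coefficients, and $v$ is the number of independent variables. Taking the natural logarithm of the odds shows that the logit of the probability is a linear function of the independent variables, with the same form as a standard linear regression model:
$$\ln\left(\frac{\pi(x)}{1 - \pi(x)}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_v x_v$$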
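As a rough illustration of how the constant and coefficients combine, the following Python sketch computes a predicted probability and then checks that its log-odds equal the linear combination of the independent variables. The coefficient values, variable values, and function names are assumptions made up for this example, not results from any fitted model.

```python
import math


def linear_predictor(intercept, coefficients, values):
    """Constant plus the weighted sum of the independent variables."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))


def probability(intercept, coefficients, values):
    """Predicted probability of the outcome (logistic function)."""
    z = linear_predictor(intercept, coefficients, values)
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Natural logarithm of the odds for a probability p."""
    return math.log(p / (1.0 - p))


# Hypothetical model with two independent variables.
intercept = -1.5
coefficients = [0.8, -0.4]
values = [2.0, 1.0]

p = probability(intercept, coefficients, values)
print("predicted probability:", round(p, 4))
print("log-odds:", round(logit(p), 4))
print("linear predictor:", round(linear_predictor(intercept, coefficients, values), 4))
```

The last two printed values agree, which is the linear relationship between the log-odds and the independent variables that the model assumes.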
Types
The two types of regression compared here are linear regression and logistic regression.
Linear regression
Simple linear regression models a linear relationship between the outcome and a single independent variable, which can be plotted on a graph as a straight line.
Logistic regression
In contrast, logistic regression models a nonlinear relationship between the independent variables and the probability of the outcome. As a result, plotting the predicted probability on a graph produces an S-shaped curve called a sigmoid. Multivariate logistic regression is based on two or more independent variables.
The odds ratio associated with a single independent variable can change when other independent variables are accounted for as well. These changes are usually small, but they can indicate errors.
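The sigmoid shape and the meaning of an odds ratio can be sketched in a few lines of Python. The example assumes a hypothetical model with a single independent variable and made-up values for the constant (b0) and coefficient (b1). It does not refit a model, so it does not show how an odds ratio shifts when further variables are added.

```python
import math


def sigmoid(z):
    """Logistic function: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical model: log-odds = b0 + b1 * x
b0, b1 = -3.0, 1.0

xs = [0, 1, 2, 3, 4, 5, 6]
probs = [sigmoid(b0 + b1 * x) for x in xs]

# Equal steps in x give unequal steps in probability (the S-shaped curve).
steps = [probs[i + 1] - probs[i] for i in range(len(probs) - 1)]

# Each unit step in x multiplies the odds by the same factor, exp(b1).
odds = [p / (1.0 - p) for p in probs]
ratios = [odds[i + 1] / odds[i] for i in range(len(odds) - 1)]

for x, p in zip(xs, probs):
    print(f"x={x}  probability={p:.3f}")
print("odds ratio per unit of x:", round(math.exp(b1), 3))
```

The probability changes slowly at the extremes and quickly in the middle, while the ratio of successive odds stays constant at exp(b1), which is the odds ratio for that variable.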
Assumptions
Multivariate logistic regression assumes that the different observations are independent. It also assumes that the natural logarithm of the odds and the independent variables show a linear relationship. However, it does not assume that the variables follow a normal distribution.
Null hypothesis
A null hypothesis is an assumption that the independent variables do not have any impact on the dependent variable.
Dependent variables
There are three main types of dependent variables in logistic regression: binary, multi-class, and ordinal.
Binary
A binary dependent variable is a variable with only two possible outcomes, which must be mutually exclusive, such as yes or no.
Multi-class
A multi-class dependent variable is a variable with at least three qualitative outcomes, each usually represented by a fixed numerical code.
Ordinal
An ordinal dependent variable is a variable with at least three possible outcomes that can be ranked in order.
Models
Multivariate logistic regression produces the following models:
Logit models
Logit models distinguish between independent and dependent variables.
Log-linear models
Unlike logit models, log-linear models do not distinguish between independent and dependent variables.
Probit models
Probit models function similarly to logit models because the normal and logistic distributions are similar. However, because the coefficients are interpreted in terms of standard deviations instead of odds ratios, probit models are more similar to linear models than logit models are.
Uses
Scientists
When scientists use logistic regression, they usually include as many independent variables as necessary.
Doctors and physicians
Multivariate logistic regression is used by physicians to:
- associate certain characteristics with certain outcomes
- determine the effects of certain techniques
- give people with certain conditions appropriate treatments
- develop appropriate models
Market