Variable selection in logistic regression models through the application of exact mathematical programming
Abstract
A linearised approximation of the log-likelihood objective function is presented as a potential alternative to the iterative fitting methods employed by logistic regression. The linearised log-likelihood objective is optimised using linear programming, and a modified version of the linearised logistic regression model is presented which facilitates best subset variable selection. The resulting model is a mixed integer linear programming problem that incorporates a cardinality constraint on the number of variables. The suggested approach has several attractive properties: it can quantify the quality of the resulting variable selection solution, it does not depend on the subjective choice of p-values inherent to typical stepwise variable selection approaches, and, when appropriate settings are applied, it approaches optimality in progressively shorter computing times, even for large input datasets.
Computational results are presented to demonstrate the advantages of employing an exact mathematical programming approach to variable selection in logistic regression applications.
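The abstract does not state which linearisation the authors use, so the following is only an illustrative sketch of the general idea, not the paper's formulation. It assumes a tangent-line (outer) approximation of the logistic loss log(1 + exp(-v)) at a fixed set of knots, and a big-M link between each coefficient and a binary selection variable so that at most k coefficients are nonzero. The function name `best_subset_logistic`, the knot grid, the bound `big_m` and the synthetic data are all hypothetical choices made for the example; the solver is SciPy's `milp`.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds


def best_subset_logistic(X, y, k, big_m=10.0, knots=(-4, -3, -2, -1, 0, 1, 2, 3, 4)):
    """Sketch: cardinality-constrained logistic fit via a piecewise-linear loss.

    X: (n, p) feature matrix; y: labels in {-1, +1}; k: maximum number of
    selected features. big_m and knots are assumed tuning choices.
    """
    n, p = X.shape
    knots = np.asarray(knots, dtype=float)
    # Variable layout: [b0 | b_1..b_p | t_1..t_n | z_1..z_p]
    n_var = 1 + p + n + p
    i_b, i_t, i_z = 1, 1 + p, 1 + p + n

    # Logistic loss f(v) = log(1 + exp(-v)) and its derivative at each knot.
    f = np.log1p(np.exp(-knots))
    g = -1.0 / (1.0 + np.exp(knots))

    rows, lower, upper = [], [], []
    # Tangent cuts: t_i >= f_k + g_k * (y_i * (b0 + x_i . b) - v_k)
    for i in range(n):
        for fk, gk, vk in zip(f, g, knots):
            row = np.zeros(n_var)
            row[0] = -gk * y[i]
            row[i_b:i_b + p] = -gk * y[i] * X[i]
            row[i_t + i] = 1.0
            rows.append(row)
            lower.append(fk - gk * vk)
            upper.append(np.inf)
    # Big-M links: -M * z_j <= b_j <= M * z_j, so z_j = 0 forces b_j = 0.
    for j in range(p):
        for sign, lo, hi in ((-big_m, -np.inf, 0.0), (big_m, 0.0, np.inf)):
            row = np.zeros(n_var)
            row[i_b + j] = 1.0
            row[i_z + j] = sign
            rows.append(row)
            lower.append(lo)
            upper.append(hi)
    # Cardinality constraint: at most k features may be selected.
    row = np.zeros(n_var)
    row[i_z:] = 1.0
    rows.append(row)
    lower.append(0.0)
    upper.append(float(k))

    # Objective: minimise the sum of the piecewise-linear loss values t_i.
    c = np.zeros(n_var)
    c[i_t:i_t + n] = 1.0
    integrality = np.zeros(n_var)
    integrality[i_z:] = 1  # z_j are integer; bounds [0, 1] make them binary
    lb = np.concatenate(([-big_m], np.full(p, -big_m), np.zeros(n), np.zeros(p)))
    ub = np.concatenate(([big_m], np.full(p, big_m), np.full(n, np.inf), np.ones(p)))

    res = milp(
        c,
        constraints=LinearConstraint(np.array(rows), lower, upper),
        integrality=integrality,
        bounds=Bounds(lb, ub),
    )
    if not res.success:
        raise RuntimeError(res.message)
    selected = np.flatnonzero(res.x[i_z:] > 0.5)
    return selected, res.x[0], res.x[i_b:i_b + p]


# Synthetic data (assumed for illustration): only features 0 and 1 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
logits = 3.0 * X[:, 0] - 3.0 * X[:, 1]
y = np.where(rng.random(120) < 1.0 / (1.0 + np.exp(-logits)), 1.0, -1.0)

selected, intercept, coef = best_subset_logistic(X, y, k=2)
print("selected features:", selected)
print("coefficients:", np.round(coef, 3))
```

Because the tangent lines lie below the true loss, the optimal objective of this sketch underestimates the exact negative log-likelihood; adding knots tightens the approximation at the cost of more constraints. The bound `big_m` must be large enough not to cut off the true coefficients, which is a modelling assumption rather than something the abstract specifies.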
URI
http://hdl.handle.net/10394/34569
https://hdl.handle.net/10520/EJC-1c22eea916
https://doi.org/10.37920/sasj.2020.54.1.6