A Feature Selection Approach to Predictive Modeling in Lumbar Fusion Surgery

Friday, February 21, 2025

Presenting Author(s)

Aman Singh, BS

Medical Student
University of Rochester School of Medicine and Dentistry
Rochester, NY, US

Disclosure(s):

Aman Singh, BS: No financial relationships to disclose

Introduction: Predictive modeling in lumbar fusion surgery can assist in clinical decision making. Herein, we identified 20 key variables that were used to train machine learning (ML) models to identify patients at risk of a longer length of stay (LOS) after lumbar fusion.

Methods: The American College of Surgeons database was queried for all lumbar fusions from 2012 to 2022. Anterior Lumbar Interbody Fusion (ALIF), Posterolateral Interbody Fusion (PlatIF), Posterior Lumbar Interbody Fusion (PLIF), and combined PLIF+PlatIF (Combo) codes were used to identify patients. Multivariate methods with Unbiased Variable selection in R (MUVR) and Boruta were used to select the 20 variables of greatest importance in predicting hospital LOS. Hierarchical clustering and 5-fold cross-validation of the methods, including our selected variables, were used to assess the robustness and reliability of our findings. These 20 features were used to train the following machine learning classifiers in order to compare and optimize predictive performance: tree-based (random forest, XGBoost, CatBoost, LightGBM), kernel-based (SVM), neural networks, and ensemble methods (voting, stacking), along with a logistic regression model.
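The select-then-compare workflow described above can be illustrated with a minimal Python sketch. It is not the study's pipeline: MUVR and Boruta are replaced here by a simple random-forest importance ranking, the data are synthetic (generated with scikit-learn's make_classification), only two of the listed model families are shown, and the helper name top_k_selector is hypothetical. The selector is placed inside the cross-validated pipeline so that it is refit on each training fold, which is an assumption of this sketch rather than a detail reported in the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): NOT the registry data used in the study.
X, y = make_classification(
    n_samples=600, n_features=60, n_informative=15, random_state=0
)


def top_k_selector(k=20):
    """Hypothetical helper: keep the k features a random forest ranks highest.

    This is a stand-in for MUVR/Boruta, not an implementation of either.
    threshold=-np.inf makes max_features the only selection criterion.
    """
    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    return SelectFromModel(forest, threshold=-np.inf, max_features=k)


# Check that the selector keeps exactly 20 features on the full data.
n_selected = int(top_k_selector(20).fit(X, y).get_support().sum())

# Two of the model families named in the Methods, as examples.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# 5-fold stratified cross-validation, scored by AUC.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    pipeline = make_pipeline(top_k_selector(20), model)
    scores = cross_val_score(pipeline, X, y, cv=cv, scoring="roc_auc")
    results[name] = float(scores.mean())
    print(f"{name}: mean AUC = {results[name]:.3f}")
```

Any of the other classifiers listed above could be added to the models dictionary, provided it exposes the scikit-learn estimator interface.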

Results: A total of 114,892 patients were included. ALIF was used in 26,244 cases (19.8%), PlatIF in 45,631 (34.4%), PLIF in 21,973 (15.9%), and Combo in 39,756 (30.0%). The neural network marginally outperformed all other models, with an accuracy of 71.2% and a discriminative ability (AUC) of 71.2%. All models achieved an accuracy and AUC within 0.6% of the logistic regression model. The neural network also outperformed all other models in recall and F1 score, with all models within 5.0% and 2.0% of each other, respectively. However, the neural network had the lowest precision of all models, although all models achieved precision within 1.2% of each other.
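As a worked illustration of how one model can lead in recall and F1 score while trailing in precision at the same accuracy, the sketch below computes the four metrics from confusion-matrix counts. The counts and the function name classification_metrics are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)  # share of flagged patients who truly had long LOS
    recall = tp / (tp + fn)     # share of long-LOS patients who were flagged
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts (assumption): model_a flags more patients than model_b.
model_a = classification_metrics(tp=80, fp=40, fn=20, tn=60)
model_b = classification_metrics(tp=70, fp=30, fn=30, tn=70)

for name, metrics in (("model_a", model_a), ("model_b", model_b)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

With these counts both models classify 140 of 200 patients correctly, but model_a trades precision for recall by flagging more patients as at risk.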

Conclusion: These results demonstrate that careful feature selection is the key to predictive modeling, regardless of the ML model or approach used. We show that when the variables of greatest importance in predicting hospital LOS are selected, all models approach a similar level of accuracy and discriminative ability.