Medical Student, Columbia University Vagelos College of Physicians and Surgeons
Introduction: Pseudarthrosis of the spine is poorly understood and difficult to predict. Machine learning (ML) has the potential to aid clinical risk assessment for pseudarthrosis, but no established ML tools are currently used in clinical practice.
Methods: We retrospectively reviewed 336 adult patients undergoing spinal deformity surgery. More than 100 variables were included (medical history, operative factors, labs, preoperative x-rays). A preoperative risk calculator to predict pseudarthrosis was developed using a stepwise ML approach. First, a random forest classifier was trained on all variables. Then Boruta, a feature selection algorithm, was used to select the most important variables. Finally, a multivariate logistic regression model was trained and evaluated on the variables selected by Boruta. Model performance was evaluated using accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC). Comparative statistics were used to evaluate differences between cohorts, with significance set at p < 0.05. Analysis was performed using scikit-learn (v.1.5.1) in Python (v.3.9.5).
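The stepwise approach described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic (generated with make_classification, not the study cohort), the 80/20 stratified split, the hyperparameters, and the 0.5 decision threshold are all assumptions, and the helper shadow_feature_selection is a hypothetical, simplified stand-in for Boruta. It keeps only Boruta's core idea (comparing each real feature's random forest importance against shuffled "shadow" copies) and omits its statistical test and iterative rejection of features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study cohort): 336 rows, roughly 13% positives.
X, y = make_classification(
    n_samples=336, n_features=30, n_informative=6, n_redundant=2,
    weights=[0.87, 0.13], random_state=0,
)
# Assumed 80/20 stratified split; the abstract does not state the split used.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def shadow_feature_selection(X, y, n_rounds=15, min_hit_rate=0.5):
    """Simplified Boruta-style selection (hypothetical helper, not Boruta itself).

    Each round appends shuffled "shadow" copies of every column, fits a random
    forest, and counts a hit for each real feature whose importance exceeds
    that of the best shadow feature.
    """
    n_features = X.shape[1]
    hits = np.zeros(n_features, dtype=int)
    for i in range(n_rounds):
        # Shuffling each column independently breaks any link to the outcome.
        shadows = np.column_stack([rng.permutation(col) for col in X.T])
        forest = RandomForestClassifier(
            n_estimators=100, class_weight="balanced", random_state=i
        )
        forest.fit(np.hstack([X, shadows]), y)
        importances = forest.feature_importances_
        best_shadow = importances[n_features:].max()
        hits += importances[:n_features] > best_shadow
    return np.flatnonzero(hits / n_rounds >= min_hit_rate)


# Steps 1 and 2: random forest importances drive the feature selection.
selected = shadow_feature_selection(X_train, y_train)
if selected.size == 0:  # safeguard so the sketch always runs
    selected = np.arange(X_train.shape[1])

# Step 3: logistic regression on the selected features only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train[:, selected], y_train)

# Evaluation on the held-out test set.
prob = model.predict_proba(X_test[:, selected])[:, 1]
pred = (prob >= 0.5).astype(int)  # assumed decision threshold
tn, fp, fn, tp = confusion_matrix(y_test, pred, labels=[0, 1]).ravel()
metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "sensitivity": tp / (tp + fn),
    "specificity": tn / (tn + fp),
    "auroc": roc_auc_score(y_test, prob),
}
print("selected feature indices:", selected.tolist())
print({name: round(float(value), 3) for name, value in metrics.items()})
```

Feature selection is run on the training split only, so the test metrics are not influenced by information from the held-out patients. The numbers this sketch prints come from synthetic data and have no bearing on the study's results.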
Results: Of the 336 patients, 45 (13.4%) developed pseudarthrosis. Traditional comparative statistics determined that BMI, age, baseline Oswestry Disability Index (ODI) score, number of rods, amount of bone morphogenetic protein (BMP), decompression (yes/no), posterior cranial vertical line (PCVL) to sacrum, sacral slope, pelvic tilt, and fatty atrophy differed significantly between the union and nonunion cohorts.
The Boruta feature selection method selected BMI, age, smoking history, osteopenia/osteoporosis, anti-inflammatory use, bone mineral density, BMP use, number of posterior column osteotomy levels, number of decompression levels, and number of transforaminal interbody fusion levels as important variables to be included in the final regression model. On the separate test cohort, the final model achieved an accuracy of 91.1%, a sensitivity of 60.0%, a specificity of 96.6%, and an AUROC of 0.86.
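For readers less familiar with these metrics, the short sketch below shows how accuracy, sensitivity, and specificity are derived from confusion-matrix counts. The counts are hypothetical: the abstract does not report the test cohort's confusion matrix, and these values were chosen only to give figures of similar magnitude. The function name classification_metrics is likewise illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of nonunion cases correctly flagged
        "specificity": tn / (tn + fp),  # share of union cases correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts (NOT from the study): 10 positive and 58 negative cases.
example = classification_metrics(tp=6, fn=4, tn=56, fp=2)

# Baseline that predicts "no pseudarthrosis" for every patient in the same set.
always_negative = classification_metrics(tp=0, fn=10, tn=58, fp=0)

for name, result in [("example", example), ("always negative", always_negative)]:
    print(name, {key: round(value, 3) for key, value in result.items()})
```

When positive cases are rare, accuracy is weighted heavily toward specificity, which is why sensitivity and AUROC are reported alongside it.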
Conclusion: To our knowledge, this is the first study to demonstrate that ML can be used to predict pseudarthrosis in the preoperative period. Despite not including many of the “significant” variables identified by traditional statistics, the ML model maintained strong predictive performance, highlighting its ability to look beyond traditional statistics to fine-tune decision-making and contribute meaningful information that humans may not be able to detect at face value.