A comprehensive investigation of morphological features responsible for cerebral aneurysm rupture using machine … – Nature.com

In this section, we discuss the outcomes produced by different machine learning models, aiming to determine the most effective model for predicting cerebral aneurysm rupture based on 35 morphological and 3 clinical inputs. The evaluation criteria are accuracy on the train and test datasets; recall, precision, and accuracy on the test dataset; and the receiver operating characteristic (ROC) curve. Following these evaluations, we discuss the most significant features identified by the models and the correlation between each parameter and the rupture status of cerebral aneurysms. This analysis clarifies which factors contribute most to the accurate prediction of aneurysm rupture.

The main metric for evaluating model performance and comparing models is accuracy, measured on both the train and test datasets. Accuracy is defined as the ratio of correctly predicted cases to all cases. While high accuracy is desirable, achieving 100% accuracy is not optimal, as it may indicate overfitting and a lack of generalization to unseen data. Ideally, the train and test accuracies should be similar, with a recommended maximum difference of 10%. Fig. 5 presents the accuracy results for all models. All models achieve an accuracy exceeding 0.70. XGB demonstrates the highest accuracy at 0.91, while KNN exhibits the lowest at 0.74. In terms of generalizability to new data, MLP and SVM perform best, each achieving an accuracy of 0.82 on the test dataset. This indicates that MLP and SVM outperform the other models in predictive accuracy for unseen data.

Fig. 5: Accuracy of train and test datasets.
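The accuracy definition and the train/test comparison described above can be illustrated with a minimal sketch. It is not the study's code: the labels are hypothetical (1 = ruptured, 0 = unruptured), the helper names `accuracy` and `generalization_gap` are invented for illustration, and the 10% limit is assumed to mean an absolute difference of 0.10 between the two accuracies.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one case is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def generalization_gap(train_acc, test_acc, max_gap=0.10):
    """Return the absolute train/test gap and whether it stays within max_gap.

    Assumption: the recommended 10% limit is read as 0.10 in absolute accuracy.
    """
    gap = abs(train_acc - test_acc)
    return gap, gap <= max_gap


# Hypothetical labels for illustration only, not the study's data.
y_train_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
y_train_pred = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
y_test_true = [1, 0, 1, 0, 1, 0]
y_test_pred = [1, 0, 0, 0, 1, 0]

train_acc = accuracy(y_train_true, y_train_pred)
test_acc = accuracy(y_test_true, y_test_pred)
gap, acceptable = generalization_gap(train_acc, test_acc)
print(f"train={train_acc:.2f} test={test_acc:.2f} gap={gap:.2f} ok={acceptable}")
```

A large gap with near-perfect train accuracy would be the overfitting pattern the text warns about.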

In addition to accuracy, we included precision and recall to evaluate model performance more comprehensively. We made this decision because of the sensitivity of the medical data under consideration, where timely disease recognition is essential. In simple terms, recall measures the model's ability to correctly identify the presence of a disease. Recall is defined as the ratio of true positive predictions to the total number of actual positive cases. Similarly, precision reflects the model's ability to accurately predict positive occurrences. Precision is defined as the ratio of true positive predictions to the total number of predicted positive cases.
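As a small illustration of these two definitions, the following sketch counts true positives, false positives, and false negatives for the positive class and derives precision and recall from them. The labels are hypothetical (1 = ruptured), and the function name `precision_recall` is an assumption made for this example, not something taken from the study.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Return 0.0 when a denominator is empty to avoid division by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for illustration only, not the study's data.
y_true = [1, 1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 1, 1]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case one ruptured aneurysm is missed (a false negative, which lowers recall) and two unruptured ones are flagged (false positives, which lower precision).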

In the medical context, recall holds particular significance, but accuracy and precision should not be overlooked, as together they determine overall model efficacy. Figure 6 presents all three metrics (accuracy, precision, and recall) for the test dataset, with a specific focus on the ruptured class, which is the occurrence (positive) class in our study. SVM and MLP are once again the top-performing models, with high recall of 0.92 and 0.90, respectively, in predicting cerebral aneurysm rupture. SVM also has an accuracy and a precision of 0.82, whereas MLP has a precision of 0.83 and an accuracy of 0.82. In contrast, RF performed relatively poorly on all three criteria. Even for RF, however, all performance metrics for the test dataset exceeded 0.75, indicating a high level of predictive capability.

Fig. 6: Accuracy, precision, and recall for the test dataset.

Another metric used for evaluation is the ROC curve, which plots the true positive rate against the false positive rate. A diagonal line, where the true and false positive rates are equal, represents a random classifier. As the model improves, the curve shifts toward the upper-left corner. An ideal model would have a true positive rate of 1 and a false positive rate of 0. The area under the curve (AUC) summarizes the model's performance, with an AUC of 0.5 indicating a random classifier and an AUC of 1 indicating an ideal classifier. Figure 7 presents the ROC curve for each model, along with the corresponding AUC. Based on these criteria, SVM and MLP are the top-performing models and are in close competition. Their ROC curves follow a favorable trajectory, and their AUC values confirm their strong performance. Conversely, RF performs comparatively worse than the other models. In summary, all models demonstrate acceptable performance and scores. Optimizing these models to improve their reliability and effectiveness in predicting cerebral aneurysm rupture is a valuable direction for further work.

Fig. 7: Receiver operating characteristic (ROC) curve for all models.
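To make the ROC and AUC description concrete, the sketch below builds ROC points by lowering a decision threshold over predicted scores and integrates the curve with the trapezoidal rule. It is an illustrative implementation under stated assumptions, not the procedure used in the study: the scores and labels are hypothetical, and `roc_points` and `auc` are names invented here.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) points of the ROC curve."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    # Lower the threshold step by step, from the highest score to the lowest.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Cases with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical scores for illustration only (higher = more likely ruptured).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
print(f"AUC = {auc(points):.3f}")
```

A classifier that gives every case the same score yields only the points (0, 0) and (1, 1), which is the diagonal with an AUC of 0.5.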

Given that each machine learning model employs a unique set of algorithms and mathematical relations, a difference in the weight assigned to each parameter for the final classification decision is expected. Figure 8 displays the weights of each parameter for the two top-performing models in this study. The SVM model identifies the five dominant features as EI (Ellipticity Index), SR (Size Ratio), I (Irregularity), UI (Undulation Index), and IR (Ideal Roundness), a new parameter introduced in this study. The MLP model, on the other hand, prioritizes EI, I, Location, NA (Neck Area), and IR, with IR once again demonstrating a significant impact.

Fig. 8: Dominant features for the two top-performing models.

Other novel parameters introduced in this study include NC, IS, ON, IRR, COD, ISR, and IOR, which occupy positions 6, 9, 13, 19, 27, 30, and 36, respectively, for SVM. For the MLP model, the order of the new parameters is IR (5), NC (7), ON (18), IS (24), ISR (27), COD (32), IOR (34), and IRR (38). Notably, some parameters for the MLP model exhibit negative values, indicating an inverse effect on the model's prediction and an inverse correlation with the output. It is important to acknowledge that this pattern may vary depending on the architecture used for the MLP model.
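One way to read such a ranking is sketched below: features are ordered by the magnitude of their weight, and the sign is reported separately as the direction of the effect. This is an assumption for illustration; the source does not state how the weights were extracted or whether positions follow absolute or signed values. The weight values are hypothetical, and only the feature abbreviations come from the text.

```python
def rank_features(weights):
    """Rank features by absolute weight; the sign gives the direction of effect."""
    ranked = sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)
    return [
        (rank, name, weight, "inverse" if weight < 0 else "direct")
        for rank, (name, weight) in enumerate(ranked, start=1)
    ]


# Hypothetical weights for illustration only, not values from the study.
hypothetical_weights = {"UI": 0.19, "EI": 0.42, "IR": -0.15, "SR": 0.31, "I": -0.28}

ranking = rank_features(hypothetical_weights)
for rank, name, weight, direction in ranking:
    print(f"{rank}. {name:<3} weight={weight:+.2f} ({direction})")
```

Under this reading, a negative weight lowers the predicted likelihood of rupture as the feature grows, yet the feature can still rank highly because its magnitude is large.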

One question that may arise is whether bifurcation aneurysms are more prone to rupture than lateral aneurysms, as physicians' experience suggests. Our study does not show a significant contribution from this factor. This does not imply that bifurcation or lateral status is insignificant. Instead, it indicates that when other features are considered alongside this parameter, the output correlates more strongly with those other parameters than with this one. Essentially, by expanding our input variables and making decisions based on more comprehensive information, we uncover the significance of parameters that may not have been considered previously. Modern machine learning models make it possible to compare several parameters simultaneously and discern the contribution of each relative to the others. This approach allows for more reliable decision-making by considering a broader set of factors and the complex interplay of variables that contribute to the prediction of cerebral aneurysm rupture.

We now briefly compare prior research with the current study, focusing specifically on the testing datasets used across all studies. To facilitate this analysis, we refer to Table 3, which presents the outcomes of six comparable studies alongside those of our own investigation. As previously indicated, we endeavored to incorporate a comprehensive array of morphological parameters to ensure the robustness of our findings.

As the scope of parameters considered expands, shifts in the relative importance assigned to each parameter are anticipated. Furthermore, increasing the size of the dataset can enhance the reliability of the results. Among the significant parameters, the size ratio emerges as a recurrent focal point, underscoring its importance in assessing the risk of rupture. Once more, we underscore the significance of the recall score, given the sensitivity inherent in medical data. Our study achieves an outstanding recall score, a metric that is unfortunately absent from prior studies, which limits direct comparison.
