Prediction of hepatic metastasis in esophageal cancer based on machine learning | Scientific Reports – Nature.com

Esophageal cancer is a highly fatal malignancy. Distant metastases are present in up to 42% of newly diagnosed patients, and the liver is the most frequently involved organ [26,27,28]. Effective treatment and management of metastatic esophageal cancer require a multimodal strategy and remain challenging. It is therefore important for clinical decision-making to identify high-risk factors and to predict accurately, from each patient's clinical and pathological characteristics, whether liver metastasis will develop.

Hepatic metastasis (HM) in advanced esophageal cancer remains understudied. Two gaps in prognostic research stand out. First, few exploratory studies have examined the high-risk prognostic factors associated with esophageal cancer, and the interrelationships among these independent prognostic factors have received even less attention. Second, few HM prediction models for advanced esophageal cancer draw on large-scale data. Comprehensive studies in both areas are needed to improve the understanding and prognostication of advanced esophageal cancer.

Some studies report that smoking and drinking are the most common risk factors for esophageal cancer in men [29]. Previous studies [30] have also shown that the degree of tissue differentiation, pathological N stage, vascular invasion, and neural invasion affect the prognosis of patients with esophageal cancer [31,32,33,34]. These studies were not supported by large datasets and did not address the prediction of HM in advanced esophageal cancer. Using the SEER database, our study screened for independent high-risk factors associated with HM by logistic regression analysis. We included 15 clinically common factors relevant to advanced esophageal cancer with liver metastasis: age, sex, marital status, race, primary site, tumor histology, tumor grade, T stage, N stage, surgery, radiation, chemotherapy, brain metastasis, bone metastasis, and lung metastasis. To check whether the features were independent of one another, we computed a Spearman correlation heat map, which showed no strong correlation among the 15 features (Fig. 2A). Logistic regression analysis then identified 11 independent high-risk factors for liver metastasis: age, primary site, tumor histology, tumor grade, T stage, N stage, surgery, radiation, chemotherapy, bone metastasis, and lung metastasis.
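
The Spearman correlation check described above can be illustrated with a minimal sketch. It is not the authors' code: the feature values are made-up ordinal codes, the 0.7 cutoff for "strong" correlation is an assumption (the text does not state one), and the helper names are hypothetical. Only the Python standard library is used so the sketch runs anywhere; in practice a routine such as scipy.stats.spearmanr would normally be used.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def strongly_correlated_pairs(columns, threshold=0.7):
    """List feature pairs with |rho| >= threshold (threshold is an assumption)."""
    flagged = []
    for a, b in combinations(columns, 2):
        rho = spearman(columns[a], columns[b])
        if abs(rho) >= threshold:
            flagged.append((a, b, round(rho, 3)))
    return flagged


# Hypothetical ordinal codes for six patients; not taken from SEER.
toy_features = {
    "t_stage": [1, 2, 3, 4, 4, 2],
    "n_stage": [0, 1, 2, 3, 3, 1],
    "grade": [2, 1, 3, 1, 2, 3],
}
print(strongly_correlated_pairs(toy_features))
```

An empty list from strongly_correlated_pairs would correspond to the situation the text reports, where no pair of features is strongly correlated. In the toy data the first two columns are deliberately made monotonically related so that the check has something to flag.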

Constructing prediction models for HM in advanced esophageal cancer is as important as identifying independent high-risk factors. Few studies have focused on risk factors in esophageal cancer patients with distant organ metastases [35]. For instance, Tang et al. constructed a nomogram to predict the survival of patients with metastatic esophageal cancer; however, that study pooled metastases to all anatomical sites and did not develop a model for predicting the risk of distant metastasis [36]. Similarly, Cheng et al. established models for predicting both risk and survival in esophageal cancer patients, but only for brain metastasis [37]. Guo et al. described detailed characteristics and explored risk and prognostic factors for patients with liver metastasis, yet they did not develop any predictive tool [38]. Because the liver is the most common site of distant spread, a dedicated investigation of esophageal cancer patients with liver metastasis is of considerable clinical importance.

Previous studies have constructed nomograms to predict esophageal cancer (EC) metastasis based on traditional logistic models. However, the limited prediction accuracy of this method and its difficulty in handling large datasets have hindered breakthroughs in precision medicine [9,10]. Moreover, traditional approaches cannot explore the interactions between different independent high-risk factors [18,19]. In contrast, our study can better capture complex associations between these factors, thereby improving the accuracy of the model [20]. Earlier nomogram studies built models for predicting metastasis from SEER data on patients with esophageal cancer, but none used machine learning (ML) to establish a prediction model for HM in advanced metastatic esophageal cancer [21].

We then constructed six prediction models using ML. Internal ten-fold cross-validation (Fig. 3A) showed that the GBM model performed best among the six. Based on the GBM model, we built a freely accessible online calculator (https://project2-dngisws9d7xkygjcvnue8u.streamlit.app/). The model accurately predicts a patient's risk of HM from clinical indicators. Clinicians can enter patient information on the website and obtain the predicted risk of hepatic metastasis to support clinical decision-making.
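
The ten-fold cross-validation used to compare candidate models can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the comparison metric (area under the ROC curve) is assumed because the text above does not name one, and the two scorers are simple hypothetical stand-ins rather than the six ML models or the GBM. The sketch uses only the Python standard library; a real analysis would more likely rely on a library such as scikit-learn.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when a fold contains a single class
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_auc(fit, X, y, k=10, seed=0):
    """Train on k-1 folds, score the held-out fold, and average the fold AUCs."""
    fold_aucs = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        fold_auc = auc([y[i] for i in held_out], [predict(X[i]) for i in held_out])
        if fold_auc is not None:
            fold_aucs.append(fold_auc)
    return mean(fold_aucs)


def fit_constant(X_train, y_train):
    """Stand-in baseline: always predict the training prevalence."""
    prevalence = sum(y_train) / len(y_train)
    return lambda x: prevalence


def fit_centroid(X_train, y_train):
    """Stand-in scorer: project onto the difference of the two class means."""
    def class_mean(label):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        return [sum(col) / len(rows) for col in zip(*rows)]
    direction = [p - n for p, n in zip(class_mean(1), class_mean(0))]
    return lambda x: sum(d * v for d, v in zip(direction, x))


# Synthetic data: the first feature carries signal, the second is pure noise.
rng = random.Random(42)
y = [i % 2 for i in range(200)]
X = [[rng.gauss(2.0 * label, 1.0), rng.gauss(0.0, 1.0)] for label in y]

for name, fit in [("constant baseline", fit_constant), ("centroid scorer", fit_centroid)]:
    print(name, round(cross_validated_auc(fit, X, y), 3))
```

Because every candidate is scored on the same held-out folds, the averaged values are directly comparable, which is the sense in which one model can be said to perform best among several.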

Our research has the following advantages. First, this study established an ML-based statistical model that can predict HM in patients with EC. To the best of our knowledge, we are the first to use ML to construct a prediction model for HM in EC. This model is more reliable than the traditional nomogram prediction model, and the work expands current knowledge of advanced EC. Second, our study further explores the relationships between different independent high-risk factors, which suggests a new direction for clinical research. Future clinical research should examine not only whether patients develop metastasis but also how the independent high-risk factors relate to one another, so that factors promoting metastasis can be better identified and mitigated during the perioperative period.

This study also has some limitations. First, current machine learning is almost entirely statistical or black-box, which places severe theoretical limits on its performance [23]. Second, this is a single-center study with a limited number of patients, and machine learning models yield more stable results when applied to large datasets [22]. Subsequent studies can therefore add multi-center data for training and external validation to obtain a more reliable prediction model. Third, this study did not include neoadjuvant therapy, surgical methods, circulating tumor DNA, or other factors that may affect the long-term prognosis of patients with esophageal cancer. As the database continues to improve, we will incorporate more parameters associated with HM in EC into the web predictor to improve its adaptability.
