
Seismologists use deep learning to forecast earthquakes – University of California

For more than 30 years, the models that researchers and government agencies use to forecast earthquake aftershocks have remained largely unchanged. While these older models work well with limited data, they struggle with the huge seismology datasets that are now available.

To address this limitation, a team of researchers at the University of California, Santa Cruz, and the Technical University of Munich created a new model that uses deep learning to forecast aftershocks: the Recurrent Earthquake foreCAST (RECAST). In a paper published in Geophysical Research Letters, the scientists show how the deep learning model is more flexible and scalable than the earthquake forecasting models currently used.

The new model outperformed the current model, known as the Epidemic Type Aftershock Sequence (ETAS) model, for earthquake catalogs of about 10,000 events and greater.

"The ETAS model approach was designed for the observations that we had in the '80s and '90s when we were trying to build reliable forecasts based on very few observations," said Kelian Dascher-Cousineau, the lead author of the paper, who recently completed his Ph.D. at UC Santa Cruz. "It's a very different landscape today." Now, with more sensitive equipment and larger data storage capabilities, earthquake catalogs are much larger and more detailed.

"We've started to have million-earthquake catalogs, and the old model simply couldn't handle that amount of data," said Emily Brodsky, a professor of earth and planetary sciences at UC Santa Cruz and co-author on the paper. In fact, one of the main challenges of the study was not designing the new RECAST model itself but getting the older ETAS model to work on huge data sets in order to compare the two.

"The ETAS model is kind of brittle, and it has a lot of very subtle and finicky ways in which it can fail," said Dascher-Cousineau. "So, we spent a lot of time making sure we weren't messing up our benchmark compared to actual model development."

To continue applying deep learning models to aftershock forecasting, Dascher-Cousineau says the field needs a better system for benchmarking. In order to demonstrate the capabilities of the RECAST model, the group first used an ETAS model to simulate an earthquake catalog. After working with the synthetic data, the researchers tested the RECAST model using real data from the Southern California earthquake catalog.

They found that the RECAST model, which can essentially learn how to learn, performed slightly better than the ETAS model at forecasting aftershocks, particularly as the amount of data increased. The computational effort and time were also significantly better for larger catalogs.

This is not the first time scientists have tried using machine learning to forecast earthquakes, but until recently, the technology was not quite ready, said Dascher-Cousineau. New advances in machine learning make the RECAST model more accurate and easily adaptable to different earthquake catalogs.

The model's flexibility could open up new possibilities for earthquake forecasting. With the ability to adapt to large amounts of new data, models that use deep learning could potentially incorporate information from multiple regions at once to make better forecasts about poorly studied areas.

"We might be able to train on New Zealand, Japan, California and have a model that's actually quite good for forecasting somewhere where the data might not be as abundant," said Dascher-Cousineau.

Using deep-learning models will also eventually allow researchers to expand the type of data they use to forecast seismicity.

"We're recording ground motion all the time," said Brodsky. "So the next level is to actually use all of that information, not worry about whether we're calling it an earthquake or not an earthquake, but to use everything."

In the meantime, the researchers hope the model sparks discussions about the possibilities of the new technology.

"It has all of this potential associated with it," said Dascher-Cousineau. "Because it is designed that way."


Prediction of lung papillary adenocarcinoma-specific survival using … – Nature.com

The accurate prediction of survival in patients with LPADC is essential for patient counseling, follow-up, and treatment planning. Previous studies have revealed multiple prognostic factors that affect the survival time of patients with pulmonary papillary carcinoma, including patient age, grade classification, lymph node status, tumor size, distant metastases, and surgical treatment9, 11. Machine learning is increasingly utilized in research for the prediction of survival of patients with cancer25,26,27, with relatively favorable results. Although CoxPH is the classical method utilized for the analysis of survival data, the use of this method requires linear relationships between variables. As a result of the continuous advances achieved in recent years, machine learning is widely applied to the medical field28,29,30. In this study, we used ensemble machine learning models to accurately predict CSS in patients with LPADC, and obtained satisfactory results.

Consistent with the findings reported by You et al., the four models developed in this study confirmed that surgery is an important prognostic factor for patients with lung adenocarcinoma3. Similarly, distant metastases have an important impact on the prognosis of patients with LPADC. In conjunction with previous analyses, the findings demonstrate that patients who developed distant metastases had poorer survival rates than other patients26, 27. A higher N-stage also plays a crucial role in the model, indicating poor prognosis28. Other characteristics (e.g., tumor size, grade, sex, chemotherapy, primary site, etc.) have different degrees of importance in various models11, 23, 27. These results suggest that the selection of appropriate treatment modalities (e.g., surgery, radiotherapy, and chemotherapy) may be more important for predicting CSS in patients with LPADC than TNM staging alone.

Interestingly, the ensemble models (i.e., GBS, EST, and RSF) did not demonstrate a markedly better ability to predict CSS in LPADC in the validation cohort compared with the CoxPH model. This indicates that the machine learning approach may only offer advantages when traditional models are limited. There are several possible explanations for the comparable predictive performance observed between the ensemble and CoxPH models in this study. Firstly, the number of predictors used to construct the model was not sufficiently large, so the advantages of machine learning in analyzing large samples and multivariate data were not fully realized. Secondly, the SEER database collects variables derived from clinical experience, and many of these variables are linearly correlated with outcomes; the data may therefore be better suited to a parametric (CoxPH) model. The GBS, EST, and RSF models developed in this study achieved predictive efficacy comparable to that of the CoxPH model under broader conditions. The web calculator constructed for the study is based on the training dataset, and care should be taken when applying the EST model, which may be overconfident; hence, this algorithm is not recommended for the prediction of survival. In this study, the CoxPH model had poorer long-term predictive power than the ensemble models. Therefore, use of the RSF model is recommended for the prediction of LPADC CSS beyond 10 years.

This study had several limitations. Firstly, in the SEER database, there was a lack of data regarding established predictors of survival in patients with LPADC (e.g., chemotherapy regimens and biological markers). Secondly, due to the retrospective nature of this study and data processing, samples with missing information were excluded; this may have led to considerable bias. Thirdly, the work related to the measurement of prediction model errors in the study is not yet complete. Finally, the results of this study were not externally validated; although we randomly split the study sample during the development of the models, the generalizability and reliability of this approach should be further validated with external datasets. The prognostic value of this approach should be improved in the future by adding more predictors, increasing external validation, and conducting prospective studies.

In conclusion, three ensemble models and a CoxPH model were developed and evaluated for the prediction of CSS in patients with LPADC. Overall, all four models showed excellent discriminative and calibration capabilities; in particular, the RSF and GBS models showed excellent consistency for long-term forecasting. The integrated web-based calculator offers the possibility to easily calculate the CSS of patients with LPADC, providing clinicians with a user-friendly risk stratification tool.


Stay ahead of the game: The promise of AI for supply chain … – Washington State Hospital Association

Everybody is talking about artificial intelligence and machine learning lately, but is the hype real? Finding supplies and keeping them stocked to be easily accessible can be daunting, but when minutes count, it becomes even more crucial. There are AI and machine learning tools designed to ease the workload. Join us for a webinar from 11 a.m. to 12 p.m. Thursday, Oct. 12 to learn about this growing technology and a practical example of how AI and machine learning have impacted access to lifesaving equipment and supplies at a Washington hospital, enabling more patient-facing time. Register here.

This program is the first in a series of hospital supply chain webinars and roundtables where you will hear from hospital supply chain leaders and peers as they share practical tips and tools on creating supply chain efficiencies, cost savings and innovations. Feel free to share this invitation with your internal hospital networks. Unable to attend? Register anyway and we will send you the slides and webinar recording.

Learning objectives:

The hospital supply chain tips and tools programs are provided for WSHA hospital members in collaboration with the Western States Healthcare Materials Management Association (WSHMMA), the regional chapter of AHRMM.

For more information about this and future supply chain programs, please contact Cynthia Hay, cynthiah@wsha.org, (206) 216-2526. (Cynthia Hay)


Predicting Stone-Free Status of Percutaneous Nephrolithotomy … – Dove Medical Press

Introduction

Urolithiasis (or nephrolithiasis) is a relatively common disease affecting 1–13% of the global population and is more common in Jordan, affecting 5.95% of the Jordanian population.1,2 It has a predilection for obese Caucasian men and carries significant morbidity, and its prevalence has been on the rise over the last four decades. Several procedures are currently in use for the management of kidney stones, including extracorporeal shockwave lithotripsy (ESWL), ureteroscopic lithotripsy (URSL), and percutaneous nephrolithotomy (PCNL). While each has its own indications, PCNL remains the gold standard for large renal stones measuring greater than 2 cm, staghorn stones, and partial staghorn stones.3

PCNL is not free of complications and its efficacy can be variable; therefore, a few pre-operative nomograms are in place to help predict success rates, namely stone-free status, and possible complications, and these nomograms help to systematize the reporting and interpretation of the surgery's outcomes.4 Examples of such nomograms are the S.T.O.N.E score, S-ReSC score, Guy's Stone Score (GSS), and CROES nephrolithometry score. The GSS comprises 4 grades that rate the complexity of the future PCNL based on renal anatomy and stone location; the score is based on all stones detected and not only those amenable to PCNL; higher grades in GSS correlate with a lower chance of stone-free status (SFS).5 On the other hand, the S.T.O.N.E score is obtained from pre-operative radiological characteristics that include stone size, the topography of the stone, obstruction with respect to the degree of hydronephrosis, number of stones, and the Hounsfield Unit (HU) value of the stone.6 Studies have shown that these stone scoring systems (SSS) have similar accuracy in predicting the SFS of PCNL patients, although GSS shows a slight superiority in complication prediction.7

Recently, machine learning (ML) has been trialed as a possible alternative to traditional SSS in predicting the sequelae of PCNL, with five studies in the literature documenting the endeavor.8–12 All five studies described ML as promising, as it showed high sensitivity and accuracy along with efficiency in comparison to SSS. Each study used different ML methods, from artificial neural network (ANN) systems9 to support vector machine (SVM) models. However, none of the aforementioned studies were externally validated, and they all had smaller sample sizes compared to our study. In our study, we utilized the following three ML methods to predict the SFS of 320 PCNL patients: Random Forest (RF), Support Vector Machine (SVM), and eXtreme Gradient Boosting (XGBoost). Using our results, we compared ML's performance to two stone scoring systems: Guy's Stone Score and the S.T.O.N.E score. We then externally validated our model, becoming the first study in the literature to do so, as well as the first study in Jordan that aims to create a machine learning model (MLM) for predicting the SFS of PCNL.

We conducted a retrospective, observational, single-center cohort study at King Abdullah University Hospital (KAUH), the main tertiary hospital in North Jordan. The Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement, which provides guidelines for reporting and developing predictive models, was followed in this study.13 The Research Committee of the Faculty of Medicine and the Institutional Review Board at Jordan University of Science and Technology (JUST) approved the study, and the Institutional Review Board provided the ethical approval (840-2022). The ethics committee approved a waiver of consent from the patients because the study did not include any therapeutic intervention and the outcomes planned are routinely registered in patients with nephrolithiasis. All patients diagnosed with nephrolithiasis, confirmed by computed tomography (CT) scans, who had undergone percutaneous nephrolithotomy (PCNL) between January 2017 and September 2022 at KAUH were included. A standard diagnostic and preoperative evaluation was performed on all patients, which included a complete blood count, full coagulation profile, urine culture, kidney function tests, and second-generation prophylactic antibiotics.

The study included a total of three surgeons (A, B, and C) who performed PCNL procedures on patients with renal calculi. The assignment process involved a random allocation method, which allowed for an unbiased distribution of surgeons across the various groups. This approach aimed to explore potential patterns or trends in PCNL outcomes without intentional surgeon classification. The study aimed to analyze the outcomes within this random assignment to gain insights that could contribute to the broader understanding of PCNL efficacy and surgeon impact on treatment success. All procedures were performed under general anesthesia. The patients were placed in a prone position, and a small skin incision was made around the nephrostomy tract; under fluoroscopy guidance, a guidewire was inserted down to the urinary bladder. Dilatation was then performed up to 11 Fr; using a double-lumen catheter, a safety guidewire was then inserted. A balloon dilator (NephroMax) was used to achieve maximum dilation (reaching 12 atm), and the working sheath was then inserted. A rigid 26 Fr nephroscope was used in all patients, and stone fragmentation was performed using different methods depending on the preference of the treating urologist; ultrasonic lithotripsy was the most common method in this regard. A nephrostomy tube was placed in almost all cases. If necessary, the nephrostomy tube was left in the renal pelvis for decompression and/or easy access. Plain radiography of the kidneys, ureters, and bladder (KUB) was obtained on postoperative day 1 according to the state of the patient.

The nephrostomy tubes were removed on postoperative day 1 or 2 when the radiological images showed signs of SFS. SFS means either the absence of stones or clinically insignificant residual fragments (diameter less than 4 mm) in the kidney after the procedure. Various methods were used to determine whether stone-free status had been achieved, including imaging studies such as X-rays or CT scans, as well as direct inspection of the kidney using the nephroscope. Stone-free status is typically assessed immediately following the procedure, but in some cases, a follow-up evaluation may be required to confirm that no residual stones remain.

A set of input variables was collected from the hospital records at KAUH for all patients, including preoperative and postoperative variables. The preoperative variables were age, gender, hypertension, diabetes, hyperlipidemia, preoperative hemoglobin, renal insufficiency, recent urinary tract infections, previous surgeries on the target kidney, stone burden, stone location, and hydronephrosis. Postoperative variables included fever, septicemia, need for transfusion, length of hospital stay, ancillary procedures, and stone-free status. SFS was defined as either no residual stone fragments on a CT scan or X-ray, as well as direct inspection of the kidney using the nephroscope, or clinically insignificant residual fragments <4 mm. The outcome was entered as a binary number: 1 (residual stone, i.e., yes) or 0 (clinically insignificant residual fragments or no residual stone fragments).

Three ML classifiers were employed in this study: the Random Forest Classifier (RFC), Support Vector Classifier (SVC), and Extreme Gradient Boosting (XGBoost). These algorithms were selected due to their effectiveness in handling complex, multidimensional datasets and their capacity to model nonlinear relationships. The RFC model is a decision tree-based machine learning model. Each node of the decision tree divides the data into two groups using a cutoff value within one of the features. By creating an ensemble of randomized decision trees, each of which overfits the data, and aggregating their results to achieve improved classification, the RFC technique mitigates the impact of the overfitting problem.14 SVC is a powerful supervised machine learning technique that aims to find the optimal hyperplane to separate data into different classes. It is well suited for both classification and regression tasks. XGBoost was also used; it is constructed from a decision tree-based gradient boosting regression method.15 In this approach, trees for prediction are built sequentially, with each subsequent tree designed to reduce the errors of its predecessors.
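As a rough illustration of how these three classifiers might be set up (a minimal sketch using scikit-learn and the xgboost package; the synthetic data and hyperparameters below are placeholders, not the values used in the study):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier

# Synthetic stand-in for the 26-feature PCNL training set (illustration only).
X_train, y_train = make_classification(
    n_samples=224, n_features=26, weights=[0.7, 0.3], random_state=0
)

models = {
    "RFC": RandomForestClassifier(n_estimators=500, random_state=42),
    # probability=True enables predict_proba, which the ROC analysis relies on
    "SVC": SVC(kernel="rbf", probability=True, random_state=42),
    "XGBoost": XGBClassifier(n_estimators=300, learning_rate=0.1,
                             eval_metric="logloss", random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
```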

After that, the machine learning models were trained on a dataset with a binary classification output predicting the target Stone-Free Status (SFS) using 26 features, including demographic, clinical, renal, preoperative, and postoperative surgical variables. The dataset was then split randomly in a 7:3 ratio into a training set (n = 224) and a testing set (n = 96). The contribution of each feature to predicting SFS was calculated using the permutation importance method, in which a larger decrease in mean accuracy represents higher importance in the model's predictions. Receiver Operating Characteristic (ROC) curves and Area Under the Curve (AUC) scores were calculated to evaluate the discriminatory power of the different models. The roc_curve function from the sklearn.metrics module was employed to compute the False Positive Rate (FPR) and True Positive Rate (TPR) for each model. The predicted probabilities of the positive class were obtained using the predict_proba method of each model. The AUC scores were calculated using the roc_auc_score function. A custom plotting function, plot_roc_curve, was defined to visualize the ROC curves of multiple models. The models were also evaluated using the mean bootstrap estimate with a 95% confidence interval, 10-fold cross-validation, and a classification report for precision, recall, and F1-score.
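A minimal sketch of that evaluation step, assuming the fitted models dictionary and a held-out X_test/y_test split; this is an illustrative version of the kind of plot_roc_curve helper described, not the authors' exact code:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

def plot_roc_curves(models, X_test, y_test):
    """Plot ROC curves and AUC scores for several fitted binary classifiers."""
    for name, model in models.items():
        proba = model.predict_proba(X_test)[:, 1]      # probability of the positive class
        fpr, tpr, _ = roc_curve(y_test, proba)
        auc = roc_auc_score(y_test, proba)
        plt.plot(fpr, tpr, label=f"{name} (AUC = {auc:.3f})")
    plt.plot([0, 1], [0, 1], linestyle="--", color="grey")  # chance line
    plt.xlabel("False positive rate")
    plt.ylabel("True positive rate")
    plt.legend()
    plt.show()
```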

All three models were externally validated by data extracted from a previous similar study by Zhao et al with compatible variables that included 224 patients.8 The algorithm generated predictions for the instances in the validation dataset, and these predictions were compared to the actual outcomes to assess the model's accuracy, mean bootstrap estimate, and AUC. The results obtained from this evaluation provide an estimate of the model's generalizability to unseen data, thus helping to validate its effectiveness and applicability in real-world scenarios. All ML implementations were processed using the scikit-learn 0.18 package in Python.

All data analyses were performed using the IBM Statistical Package for the Social Sciences (SPSS) software for Windows, version 26.0. Descriptive measures included means ± standard deviations for continuous data if the normality assumption was not violated, according to the Shapiro–Wilk test, and medians with first and third quartiles (Q1–Q3) if the assumption was violated. Categorical data were presented as frequencies and percentages (%). Continuous data were compared using the Student t-test for normally distributed variables and the Mann–Whitney U-test if not normally distributed. Categorical data were compared using the χ2 test, or Fisher's exact test if at least one cell had an expected count of less than 5. Variables included in the model were chosen based on a separate bivariate analysis, including all variables yielding a P value of <0.1. Nagelkerke R2 was used as a measure of goodness-of-fit. The variables in the model were checked for multicollinearity using the variance inflation factor. Statistical significance was considered at a 2-sided P value of <0.05.
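The analyses were run in SPSS; purely as an illustration, the same two-group comparisons could be sketched in Python with scipy (the arrays and counts below are hypothetical, not the study data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
stone_free = rng.normal(183, 90, size=228)        # hypothetical stone-burden values (mm^2)
non_stone_free = rng.normal(320, 150, size=92)

# Normality check, then the appropriate two-group test for a continuous variable
if stats.shapiro(stone_free).pvalue > 0.05 and stats.shapiro(non_stone_free).pvalue > 0.05:
    stat, p = stats.ttest_ind(stone_free, non_stone_free)
else:
    stat, p = stats.mannwhitneyu(stone_free, non_stone_free)

# Categorical comparison: chi-squared, or Fisher's exact test for sparse 2x2 tables
table = np.array([[65, 24], [75, 32]])            # hypothetical 2x2 counts
chi2, p_cat, dof, expected = stats.chi2_contingency(table)
if (expected < 5).any():
    odds_ratio, p_cat = stats.fisher_exact(table)
```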

A total of 320 patients (222 males, 69.4%) were enrolled. The mean age was 46.03 ± 14.7 years, and the median (IQR) stone burden was 208.1 (231) mm2. Table 1 shows the preoperative variables including individual variables, renal and stone data. The patients comprised 92 non-stone-free cases and 228 stone-free cases. The distribution of GSS categories differed significantly between the Non-Stone-Free and Stone-Free groups (p < 0.0001). The Stone-Free group had higher proportions of patients in the GSS I, GSS II, and GSS III categories compared to the Non-Stone-Free group, which had a higher proportion of patients in the GSS IV category. The S.T.O.N.E score is another scoring system used to evaluate stone characteristics. Similar to GSS, the distribution of S.T.O.N.E score categories varied between the non-stone-free and stone-free groups. Higher S.T.O.N.E scores (9 and above) had a higher percentage in the non-stone-free group, compared to the stone-free group. The non-stone-free group had a higher median stone burden of 319.6 mm2, compared to 182.9 mm2 in the stone-free group. The stone burden within the non-stone-free group was nearly twice as large as that within the stone-free group, and this difference is statistically significant (p < 0.0001). Stone location accounted for statistically significant differences in the upper calyx, middle calyx, and lower calyx showing a higher percentage in non-stone-free compared with stone-free. Preoperative UTI had a higher percentage of 37% in the stone-free group compared with 22.4% in the non-stone-free group. No statistically significant differences were observed between the non-stone-free and stone-free groups for variables such as diabetes, hypertension, hyperlipidemia, unilateral kidney, renal insufficiency, anemia, and previous surgery on the target kidney (p > 0.05). Table 2 shows the postoperative data for these patients. The overall SFS was 71.3% (228/320). Table 3 presents the analysis of the impact of surgeon expertise on SFS in PCNL. The table provides a comparison of SFS across the three surgeon groups (A, B, and C). Among patients operated on by surgeon A, 65 (73%) were stone-free, indicating successful complete stone clearance. In contrast, 24 patients (27%) in this group were classified as non-stone-free, indicating the presence of residual stones >4 mm post-PCNL. In surgeon B, 75 (70%) of the patients were stone-free, while 32 (30%) were classified as non-stone-free. Surgeon C had 88 patients (71%) classified as stone-free, and 36 patients (29%) classified as non-stone-free. Figure 1 shows the stone-free rate in each subgroup of GSS grades and the S.T.O.N.E score systems.

Table 1 Preoperative Factors Including Individual Variables and Renal Stone Factors

Table 2 Postoperative Outcome Variable (n = 320)

Table 3 Analyzing the Impact of Surgeon Expertise on SFS in PCNL

Figure 1 The stone-free rate in each subgroup of GSS grades and the S.T.O.N.E score systems.

The RFC model was performed on the testing set with a mean bootstrap estimate of 0.75 and 95% CI: [0.65–0.85], 10-fold cross-validation of 0.744, an accuracy of 0.74, and an AUC of 0.761, while the XGBoost model predicted on the testing set with a mean bootstrap estimate of 0.74 and 95% CI: [0.63–0.85], 10-fold cross-validation of 0.759, the accuracy of 0.72, and AUC of 0.769. The SVM model performed with a mean bootstrap estimate of 0.70 and 95% CI: [0.60–0.79], 10-fold cross-validation of 0.725, an accuracy of 0.74, and an AUC of 0.751. On the other hand, Guy's Score and the S.T.O.N.E Score had an AUC of 0.666 and 0.71, respectively. The RFC model performed on the external validation set with a mean bootstrap estimate of 0.87 and 95% CI: [0.81–0.92], an accuracy of 0.70, and an AUC of 0.795, while the XGBoost model predicted on the external validation set with a mean bootstrap estimate of 0.84 and 95% CI: [0.78–0.91], an accuracy of 0.74, and an AUC of 0.84. The SVM model performed on the external validation set with a mean bootstrap estimate of 0.86 and 95% CI: [0.80–0.91], an accuracy of 0.79, and an AUC of 0.858. ROC curves of all MLMs are displayed in Figure 2. The most contributing features in predicting SFS in the RFC model are displayed in Figure 3. The highest contributing factor was stone burden, followed by the length of stay and age.

Figure 2 The ROC curves of the three MLMs including the externally validated set and the GSS and S.T.O.N.E score system.

Figure 3 Results of feature importance analysis in the RFC model for predicting SFS of PCNL patient.

Nephrolithiasis is a common kidney disease with a prevalence of 1%–5% in Asia and 7%–13% in America, with a male predominance at a ratio of 1.5–2.5:1 [13–15]. Urinary lithiasis imposes diagnostic, prognostic, and financial burdens, especially when a patient needs multiple imaging and surgical procedures.16 Therefore, we aimed to develop an MLM that can predict the postoperative outcome, namely SFS, in patients who underwent PCNL. Our models predicted the SFS with high accuracy and certainty using pre- and post-operative variables, marking the stone burden as the highest contributing predictor of SFS.

Among the factors considered in our predictive models, the length of hospital stay and age stand out as noteworthy contributors to the predictive capacity of the model. These findings underscore the dynamic nature of predicting SFS, revealing that beyond preoperative variables, factors associated with the postoperative trajectory can substantially influence outcome predictions. The inclusion of the length of stay, a postoperative parameter, lends valuable insights into the nuances of stone clearance efficacy and the recovery process. Our models demonstrate that patients who required a longer hospital stay were more likely to exhibit distinct stone burdens and procedural complexities, aligning with the clinical intuition that these patients may require additional care to achieve optimal outcomes. The predictive power of the length of hospital stay offers clinicians an early indicator of potential stone-related challenges, allowing for more targeted interventions and follow-up strategies.

Upon comparison of the stone burden between non-stone-free and stone-free groups, the non-stone-free group showed nearly double the stone burden, which was statistically significant. This disparity in stone burden between both groups could be due to the multifaceted challenges these larger stone burdens pose on PCNL procedures. These large burdens often indicate complex or multiple stone formations that hinder full access to the collecting system, therefore, preventing complete fragmentation which renders a lower rate of stone-free status.17

Upon closer examination, we observed a counterintuitive trend between age and SFS. Interestingly, the mean age of non-stone-free patients was lower than that of stone-free patients. It is important to note that the non-stone-free group consisted of 93 patients, whereas the stone-free group comprised 228 patients. This disparity in sample sizes could potentially influence the observed relationship, prompting us to consider the role of sample distribution in drawing conclusive insights. This observation prompts further investigation into the complex relationship between age, stone characteristics, and treatment outcomes. While age emerged as a contributing factor to our predictive models, the inverse relationship between age and SFS in our study warrants a deeper exploration of the underlying mechanisms. The relationship between age and SFS could potentially be attributed to a variety of factors. One possible explanation could be the differential distribution of stone types among different age groups. Age-related variations in stone composition, density, or structure might influence fragmentation behavior and clearance rates, subsequently affecting SFS outcomes. Moreover, physiological differences related to bone density, urinary dynamics, and kidney function across various age cohorts could contribute to the observed trend.

When compared to the conventional scoring systems, our model showed superior performance to Guy's stone score and the S.T.O.N.E score. The Guy's stone score was first developed by Thomas et al and consists of four grades based on the location and number of stones.18 This score has been validated in many prospective PCNL procedures and was significantly correlated with the stone-free rate, whereas stone burden and patients' demographic and clinical factors did not show any correlation;19 external validation has reported an AUC of 0.69 for Guy's stone score.20

A study by Zhao et al also assessed the predictive effect of demographic and pre- and post-operative renal variables on SFS using ML, with similar performance to our model. The stone burden was also observed to be the highest contributing feature in their logistic regression model, in addition to stone location.8 However, their model did not show superior performance to the S.T.O.N.E score but only to Guy's score (RFC: 0.80, Guy's score: 0.84, S.T.O.N.E score: 0.78). Our data were externally validated using this study, helping to ensure that our results generalize to the population. However, the calculation of stone burden is not consistent across studies, and there is no standard formula for its calculation; this introduces heterogeneity and inconsistency in predicting SFS. In our study, the stone burden was calculated by the following formula: maximum length × maximum width × 0.25, which was also used by Smith et al.21 This, in turn, raises the need for a clearly defined model that considers interindividual and operative variables in addition to ethnic and racial factors. Previous studies have externally evaluated stone scoring systems in Eastern and Western societies; here we present the first cohort to evaluate stone scores in a Middle Eastern society. Srivastava et al also evaluated inter-observer variability between surgeons and radiologists for Guy's and S.T.O.N.E scores and studied the agreement using Fleiss coefficients. The overall S.T.O.N.E score showed good agreement between surgeons and radiologists (Fleiss κ = 0.79); the same applies to Guy's score for all grades, with moderate to very good agreement (Fleiss κ: Grade 1 = 0.91; Grade 2 = 0.53; Grade 3 = 0.61; Grade 4 = 0.84).22

We were the only publication in our field to externally validate our data, supporting the accuracy and reliability of our findings. This process involved an independent third-party review to verify the methodology and results, and by taking this extra step, we were able to lend further credibility to our conclusions. Overall, the use of statistical methods and Python programming in external validation helps to ensure the robustness and generalizability of the data and model. A strength of the study is that it considered additional variables such as age and pre-operative UTI, which were not included in GSS and the S.T.O.N.E score.

The ML techniques employed in this study were able to predict the rate of successful stone removal with higher accuracy than the established GSS and S.T.O.N.E score systems; moreover, this study considered other variables that were not included in the aforementioned scoring systems. When evaluated, all three MLMs were externally validated and showed high accuracy rates. The accuracy of the system for predicting the stone-free rate was found to be between 70% and 79%, with an AUC between 0.751 and 0.858, compared to the AUCs of GSS and S.T.O.N.E, which were 0.67 and 0.71, respectively. All ML methods found that the factors with the greatest impact on stone-free status were the initial stone burden, length of stay, and patient age.

In accordance with the ethical guidelines and standards outlined in the Declaration of Helsinki, we hereby confirm that our study fully complies with these principles. The Research Committee of the Faculty of Medicine and the Institutional Review Board at Jordan University of Science and Technology (JUST) approved the study, and the Institutional Review Board provided the ethical approval (840-2022). The ethics committee approved a waiver of consent from the patients because the study did not include any therapeutic intervention and the outcomes planned are routinely registered in patients with nephrolithiasis.

The authors declare that they have followed all ethical and scientific standards when conducting their research and writing the manuscript and that all authors have approved the final version of the manuscript for submission.

We would like to thank Editage for English language editing.

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

This research received no external funding.

The authors of this article have carefully considered any potential conflicts of interest and have found none to report. They have no relevant financial or non-financial interests that could impact the articles content, and they have no affiliations or involvement with any organizations with a financial or proprietary interest in the material discussed. The authors declare that they have no competing interests related to the manuscripts subject matter and certify that they have no ties to any entity that could present a conflict.

1. Sorokin I, Mamoulakis C, Miyazawa K, Rodgers A, Talati J, Lotan Y. Epidemiology of stone disease across the world. World J Urol. 2017;35(9):1301–1320. doi:10.1007/s00345-017-2008-6

2. Abboud IA. Prevalence of Urolithiasis in Adults due to Environmental Influences: a Case Study from Northern and Central Jordan. Jordan J Earth Environ Sci. 2018;9(1):29–38.

3. Ganpule AP, Vijayakumar M, Malpani A, Desai MR. Percutaneous nephrolithotomy (PCNL) – a critical review. Int J Surgery. 2016;36(PD):660–664. doi:10.1016/j.ijsu.2016.11.028

4. Kumar U, Tomar V, Yadav SS, et al. STONE score versus Guy's Stone Score - Prospective comparative evaluation for success rate and complications in percutaneous nephrolithotomy. Urol Ann. 2018;10(1):76–81. doi:10.4103/UA.UA_119_17

5. Wu WJ, Okeke Z. Current clinical scoring systems of percutaneous nephrolithotomy outcomes. Nat Rev Urol. 2017;14(8):459–469. doi:10.1038/nrurol.2017.71

6. Zhernovoi I, Shchukin D, Jundi M, Grabs D, Maranzano J, Nayouf A. Comparison of four transdiaphragmatic approaches to remove cavoatrial tumor thrombi: a pilot study. Cent European J Urol. 2022;75(2):145–152. doi:10.5173/ceju.2022.0277.R1

7. Jiang K, Sun F, Zhu J, et al. Evaluation of three stone-scoring systems for predicting SFR and complications after percutaneous nephrolithotomy: a systematic review and meta-analysis. BMC Urol. 2019;19:1. doi:10.1186/s12894-019-0488-y

8. Zhao H, Li W, Li J, Li L, Wang H, Guo J. Predicting the Stone-Free Status of Percutaneous Nephrolithotomy With the Machine Learning System: comparative Analysis With Guy's Stone Score and the S.T.O.N.E Score System. Front Mol Biosci. 2022;9. doi:10.3389/fmolb.2022.880291

9. Aminsharifi A, Irani D, Pooyesh S, et al. Artificial Neural Network System to Predict the Postoperative Outcome of Percutaneous Nephrolithotomy. J Endourol. 2017;31(5):461–467. doi:10.1089/end.2016.0791

10. Shabaniyan T, Parsaei H, Aminsharifi A, et al. An artificial intelligence-based clinical decision support system for large kidney stone treatment. Australas Phys Eng Sci Med. 2019;42(3):771–779. doi:10.1007/s13246-019-00780-3

11. Aminsharifi A, Irani D, Tayebi S, Jafari Kafash T, Shabanian T, Parsaei H. Predicting the Postoperative Outcome of Percutaneous Nephrolithotomy with Machine Learning System: software Validation and Comparative Analysis with Guy's Stone Score and the CROES Nomogram. J Endourol. 2020;34(6):692–699. doi:10.1089/end.2019.0475

12. Hameed BMZ, Shah M, Naik N, Singh Khanuja H, Paul R, Somani BK. Application of Artificial Intelligence-Based Classifiers to Predict the Outcome Measures and Stone-Free Status Following Percutaneous Nephrolithotomy for Staghorn Calculi: cross-Validation of Data and Estimation of Accuracy. J Endourol. 2021;35(9):1307–1313. doi:10.1089/end.2020.1136

13. Collins GS, Reitsma JB, Altman DG, Moons K. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement. BMC Med. 2015;13(1):1. doi:10.1186/s12916-014-0241-z

14. TIBCO Software. What is a Random Forest? Available from: https://www.tibco.com/reference-center/what-is-a-random-forest. Accessed August 19, 2023.

15. Pushkar Mandot. How exactly XGBoost Works? 2019. Available from: https://medium.com/@pushkarmandot/how-exactly-xgboost-works-a320d9b8aeef. Accessed August 19, 2023.

16. Ziemba JB, Matlaga BR. Epidemiology and economics of nephrolithiasis. Investig Clin Urol. 2017;58(5):299–306. doi:10.4111/icu.2017.58.5.299

17. Rais-Bahrami S, Friedlander JI, Duty BD, Okeke Z, Smith AD. Difficulties with access in percutaneous renal surgery. Ther Adv Urol. 2011;3(2):59–68. doi:10.1177/1756287211400661

18. Thomas K, Smith NC, Hegarty N, Glass JM. The Guy's Stone Score – Grading the Complexity of Percutaneous Nephrolithotomy Procedures. Urology. 2011;78(2):277–281. doi:10.1016/j.urology.2010.12.026

19. Noureldin YA, Elkoushy MA, Andonian S. External validation of the S.T.O.N.E. nephrolithometry scoring system. J Canadian Urol Assoc. 2015;9(6):190–195. doi:10.5489/cuaj.2652

20. Ingimarsson JP, Dagrosa LM, Hyams ES, Pais VM. External validation of a preoperative renal stone grading system: reproducibility and inter-rater concordance of the Guy's stone score using preoperative computed tomography and rigorous postoperative stone-free criteria. Urology. 2014;83(1):45–49. doi:10.1016/j.urology.2013.09.008

21. Smith A, Averch TD, Shahrour K, et al. A nephrolithometric nomogram to predict treatment success of percutaneous nephrolithotomy. J Urol. 2013;190(1):149–156. doi:10.1016/j.juro.2013.01.047

22. Srivastava A, Yadav P, Madhavan K, et al. Inter-observer variability amongst surgeons and radiologists in assessment of Guy's Stone Score and S.T.O.N.E. nephrolithometry score: a prospective evaluation. Arab J Urol. 2020;18(2):118–123. doi:10.1080/2090598X.2019.1703278


What is Image Annotation, and Why is it Important in Machine … – Ground Report

Image annotation is the key to enabling machines to understand the language of pixels in a world where visual data is abundant, and images offer tales, information, and insights. In machine learning, particularly in computer vision, image annotation, a laborious effort comprising the categorization and contextualization of visual features inside images, has a significant impact.

Artificial visual intelligence is based on the accurate interpretation of images, which is necessary for machines to detect, understand, and respond to the visual world. Image annotation acts as the conduit between the two. This fundamental procedure closes the gap between human perception and machine cognition, and it is more than just a technical step. Let's understand how.

Image annotation is the process of classifying or labeling an image for deep learning and machine learning. Image annotation services classify images using annotation tools, text labels, or both. Annotation presents the data in a form your model can learn to recognize, and metadata is added to the dataset during the process.

Image annotation is also referred to as tagging or transcribing. Today, besides images, videos can also be annotated easily. Image annotation is typically done to train your ML models to identify and recognize images.

Once your machine learning model is deployed, you want it to recognize features in images that aren't annotated, so that it can decide what to do or take the appropriate action.

A lot of data is used to train, test, and validate a machine learning model. Image annotation is primarily done so that models can recognize boundaries and objects; once these are recognized, the ML models segment the images to derive complete understanding or meaning.

Simple image annotation is the process of labeling an image using terms that describe its object. For example, an image of a cat can be annotated as a domestic house cat. This is also known as tagging or image classification.

This type of annotation is used for identifying, counting, and tracking multiple objects in an image. Here the ML model is trained on multiple datasets to distinguish between objects in an image.

For instance, you may have an image of products in your warehouse, and you want the ML model to identify those products or machinery and label them accordingly. For this, you will need the help of data entry services, where all the data will be stored for training your ML model. Your ML model will use this data to understand the various names and classify them accordingly.
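As a purely illustrative example (the field names below are hypothetical and do not follow any specific tool's format), a single annotated warehouse image might be represented as a record like this:

```python
# One hypothetical annotation record: image-level tags plus bounding boxes
# for individual objects, with coordinates in pixels.
annotation = {
    "image": "warehouse_0001.jpg",
    "tags": ["warehouse", "indoor"],                         # image classification labels
    "objects": [
        {"label": "forklift", "bbox": [120, 85, 310, 260]},  # [x_min, y_min, x_max, y_max]
        {"label": "pallet", "bbox": [330, 200, 470, 300]},
    ],
}
```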

You can annotate standard and multi-frame images and videos for your machine-learning model. Below are the two types of data used in image annotation:

The first type uses images, videos, and data from cameras or other technical devices such as single-lens reflex (SLR) cameras or optical microscopes.

The second type uses data from cameras and other technical instruments, such as ion, electron, or scanning probe microscopes, for annotation.

Here are some of the reasons why image annotation is needed for ML models:

If you want your ML model to be effective in areas such as robotics, drones, and autonomous vehicles, you want it to identify the desired objects. Identifying them helps the ML model make decisions and take the necessary actions.

Image annotation helps ML models to categorize and recognize the different objects in an image. Without image annotation, it can be difficult for the computer to identify and label many objects in a single image.

Hence, deep learning, a part of ML, is used to annotate the different images. The annotations are then used to identify these objects and make it easier for the computer to understand, locate, and categorize them. This is especially needed when an image contains both living and nonliving objects.

Image annotation is needed to integrate ML models and raw visual data efficiently. It gives companies the understanding that their models require for accurately predicting and making decisions. It is an essential part of the development of computer vision as it influences how well your ML model will perform and develop.


Detection of diabetic patients in people with normal fasting glucose … – BMC Medicine

Data collection and processing

The physical examination data were derived from three hospitals: the First Affiliated Hospital of Wannan Medical College, Beijing Luhe Hospital of Capital Medical University, and Daqing Oilfield General Hospital. The three datasets were named D1, D2, and D3, respectively. The first step was data cleaning, in which samples with missing values and abnormal values were excluded. According to the WHO criteria for diagnosing prediabetes and diabetes, we screened the samples with normal fasting glucose (<6.1 mmol/L) and classified these samples into two groups by HbA1c level with a threshold of 6.5%: diabetes patients (HbA1c ≥ 6.5%) and normal/healthy samples. After preprocessing, 61,059, 369, and 3247 samples were retained in D1, D2, and D3, which contained 603, 3, and 21 subjects with diabetes, respectively, that is, the positive samples. Then, we split D1 into training, validation, and test sets at a ratio of 6:1:3 using stratified random sampling. D2 and D3 were used as newly recruited independent test sets.
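A minimal sketch of this screening and splitting step (the file name and the column names "FBG" and "HbA1c" are hypothetical placeholders for the actual variables in the hospital datasets):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Load and clean D1; drop samples with missing values (abnormal-value checks omitted here).
df = pd.read_csv("D1_physical_exam.csv").dropna()

# Keep normal-fasting-glucose samples and label diabetes by HbA1c >= 6.5%.
nfg = df[df["FBG"] < 6.1].copy()
nfg["diabetes"] = (nfg["HbA1c"] >= 6.5).astype(int)

# 6:1:3 stratified split: hold out 30% as the test set, then 1/7 of the rest as validation.
train_val, test = train_test_split(nfg, test_size=0.3, stratify=nfg["diabetes"], random_state=0)
train, val = train_test_split(train_val, test_size=1/7, stratify=train_val["diabetes"], random_state=0)
```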

All datasets contained 27 physical examination characteristics, including sex, age, height, body mass index (BMI), fasting blood glucose (FBG), white blood cell count (WBC), neutrophil (NEU), absolute neutrophil count (ANC), lymphocyte (LYM), absolute lymphocyte count (ALC), monocyte (MONO), absolute monocyte count (AMC), eosinophil (EOS), absolute eosinophil count (AEC), basophil (BASO), absolute basophil count (ABC), hemoglobin (HGB), hematocrit (HCT), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), red cell distribution width (RDW), platelets (PLT), mean platelet volume (MPV), platelet distribution width (PDW), thrombocytopenia (PCT), red blood cell count (RBC), and mean corpuscular hemoglobin concentration (MCHC).

Given the severe class imbalance of all datasets, the SMOTE (synthetic minority over-sampling technique) method was employed on the training set to oversample the positive samples. SMOTE generates new samples for the minority class by interpolation based on k-nearest neighbors [22], making the number of positive samples equal to the number of negative samples in the training set. The process was implemented with the imblearn package in Python. Finally, we conducted Z-score normalization on all datasets, in which the mean and standard deviation values were calculated from the training set.
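A minimal sketch of these two steps, assuming X_train, y_train, X_val, and X_test hold the feature matrices and labels from the split above (whether scaling is fitted before or after oversampling is a detail not specified in the text):

```python
from imblearn.over_sampling import SMOTE
from sklearn.preprocessing import StandardScaler

# Oversample the minority (diabetes) class on the training set only.
X_train_res, y_train_res = SMOTE(random_state=0).fit_resample(X_train, y_train)

# Z-score normalization using the mean and standard deviation of the training data.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train_res)
X_val_std = scaler.transform(X_val)
X_test_std = scaler.transform(X_test)
```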

With the physical examination data, we presented a computational framework for identifying diabetic patients with NFG, as shown in Fig. 1. First, we preprocessed the three datasets D1, D2, and D3 as introduced above, in which D1 was divided into training, validation, and test sets at a ratio of 6:1:3, while D2 and D3 were used as independent test sets for the evaluation of the final model. In view of the class imbalance of the datasets, we used an oversampling method on the training set. Then, multiple widely used machine learning methods, including logistic regression (LR), random forest (RF), support vector machine (SVM), and deep neural network (DNN), were exploited to construct the predictor. Next, we applied feature selection methods to the best-performing of the four predictors to improve the feasibility of the tool and assessed the performance with the independent test sets. Finally, feature importance analysis was used to screen variables relevant to the incidence of diabetes. We also devised a framework for identifying the risk factors of diabetes at the individual level and developed an online tool to facilitate its use in clinical practice.

Fig. 1 Overview of the DRING approach

In the preliminary stage, four machine learning methods were employed to build the predictive model: LR, RF, SVM, and DNN. LR is a variation of linear regression prominently used in classification tasks [23]; it finds the best fit to describe the linear relationship between the response variable and the input features and then converts the output to a probability with a sigmoid function. RF is composed of numerous decision trees, which are essentially collections of if–then conditions [24]. Each decision tree recursively splits the data into subsets based on the best feature and criterion until the stopping criterion is met. In RF, each decision tree is independently trained on a random subset of samples and features, which reduces the risk of overfitting. The final decision is made by a vote of all trees, improving the overall accuracy and the robustness of the model. SVM, one of the most popular machine learning methods, classifies samples by finding a hyperplane in the feature space that maximizes the margin between points from different classes [25]. It can handle non-linearly separable data by using various kernels, such as linear, polynomial, and radial basis function kernels, which map the original feature space into a high-dimensional space. The LR, RF, and SVM models were constructed with the scikit-learn package in Python 3.8, and default parameters were used when training these models. A DNN [26] contains an input layer, hidden layers, and an output layer, with many neurons in each layer and connections between neurons of adjacent layers. In a DNN, each connection is generally a linear transformation followed by an activation function. Here, we used the ReLU function to activate the linear neurons and a softmax function to output the prediction result. In addition, we used dropout and L2 regularization in the hidden layers to prevent overfitting. Residual blocks were also added to the DNN to simplify the training process. The DNN was implemented with the PyTorch package. In this study, the DNN model achieved the best performance with 6 layers and an initial learning rate of 0.0018. Loss on the training set and validation set is depicted in Additional file 1: Fig. S1. We chose the model with the best performance on the validation set for further optimization.
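For illustration, a DNN of the kind described (ReLU activations, dropout, residual blocks, softmax output, L2 regularization via weight decay) could be sketched in PyTorch as follows; the hidden width, dropout rate, and weight decay are placeholders, and only the learning rate of 0.0018 comes from the text:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Fully connected layer with a skip connection, ReLU activation, and dropout."""
    def __init__(self, dim, p_drop=0.3):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        self.act = nn.ReLU()
        self.drop = nn.Dropout(p_drop)

    def forward(self, x):
        return self.drop(self.act(self.fc(x)) + x)  # residual connection

class DNN(nn.Module):
    def __init__(self, n_features=27, hidden=128, n_blocks=4, n_classes=2):
        super().__init__()
        self.input = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        self.blocks = nn.Sequential(*[ResidualBlock(hidden) for _ in range(n_blocks)])
        self.output = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # Raw logits are returned; CrossEntropyLoss applies log-softmax internally,
        # and softmax is applied explicitly when class probabilities are needed.
        return self.output(self.blocks(self.input(x)))

model = DNN()
criterion = nn.CrossEntropyLoss()
# weight_decay provides the L2 regularization mentioned in the text.
optimizer = torch.optim.Adam(model.parameters(), lr=0.0018, weight_decay=1e-4)
```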

Machine learning models for classification tasks are commonly evaluated with multiple well-established metrics, for example, sensitivity, accuracy, and the area under the receiver operating characteristic curve (AUC). Given the seriously unbalanced classes of the validation set and test set, here we used sensitivity, specificity, balanced accuracy, AUC, and the area under the precision-recall curve (PR-AUC) to evaluate the models, which were calculated with the following formulas.

$$\mathrm{Sensitivity}=\mathrm{TPR}=\frac{TP}{TP+FN}$$

(1)

$$\mathrm{Specificity}=\mathrm{TNR}=\frac{TN}{TN+FP}$$

(2)

$$\mathrm{Balanced\ accuracy}=\frac{TPR+TNR}{2}$$

(3)

TP (true positive) is the number of correctly classified diabetes patients. FP (false positive) denotes the number of normal subjects who were predicted as diabetic. TN (true negative) represents the number of correctly classified healthy subjects. FN (false negative) is the number of diabetes patients who were classified as healthy individuals. All of the above metrics range from 0 to 1.
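A small sketch of how these metrics, together with AUC and PR-AUC, might be computed with scikit-learn (y_true and y_prob are assumed to be the labels and predicted probabilities for one evaluation set):

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score, average_precision_score

def evaluate(y_true, y_prob, threshold=0.5):
    """Compute sensitivity, specificity, balanced accuracy (formulas 1-3), AUC, and PR-AUC."""
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "AUC": roc_auc_score(y_true, y_prob),
        "PR-AUC": average_precision_score(y_true, y_prob),  # a common estimator of PR-AUC
    }
```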

Although the predictive model based on 27 features performed well, possible redundant or noisy features could still affect decision making. To maximize the effective information of the features and simplify the model, we used manual curation and max-relevance and min-redundancy (mRMR) [27] to extract key features for the final model. For manual curation, we first selected the features with significant differences between the positive samples and the negative samples. To enhance the stability of the predictive model, we removed the features resulting in severe collinearity. As a result, 13 features were retained. For consistency, the size of the feature subset was also set to 13 when performing the mRMR analysis. In addition, feature selection was executed on the training set to reduce the risk of overfitting. Analysis of feature importance can help interpret the prediction model and discover the features most relevant to diabetes. Here, the importance of each feature was measured by its corresponding weight coefficient in the LR model.
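The exact mRMR implementation is not shown in this excerpt; as a rough, self-contained illustration (not the authors' code), a greedy difference-criterion mRMR selection could look like this:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr_select(X, y, k=13, seed=0):
    """Greedy mRMR: at each step, pick the feature with maximal relevance to y
    minus its mean redundancy with the already-selected features."""
    X = np.asarray(X)
    n = X.shape[1]
    relevance = mutual_info_classif(X, y, random_state=seed)
    # Pairwise feature-feature mutual information (redundancy matrix).
    redundancy = np.column_stack(
        [mutual_info_regression(X, X[:, j], random_state=seed) for j in range(n)]
    )
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        remaining = [f for f in range(n) if f not in selected]
        scores = [relevance[f] - redundancy[f, selected].mean() for f in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected
```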

We developed an online tool, DRING (http://www.cuilab.cn/dring), based on the predictive models with 13 features filtered by manual curation and mRMR, with the former being the preferred option. The backend of the website was implemented in Python 2.7, and the interactive pages were built with a combination of HTML5, Bootstrap 4, and JavaScript.

Feature importance analysis can help to explain the model; however, it cannot explore the risk factors for incident diabetes at the individual level. To find the potential risk factors for a specific individual, we drew on the permutation feature importance (PFI) algorithm [24, 28], which is designed to quantify the importance of each variable in a dataset. Here, we adapted PFI to assess the contributions of the features of an individual. Specifically, it involves the following 4 steps: (1) given a feature vector, we first create a series of random permutations for one of the features based on the input dataset; (2) we then calculate the prediction result for each new feature vector; (3) the contribution of the permutated feature is defined as in formula 4:

$$P=\left|P_{r}-\frac{1}{k}\sum_{i=1}^{k}P_{i}\right|$$

(4)

\(P_{r}\) is the risk score for diabetes calculated with the initial feature vector, here referring to the predicted probability of diabetes; \(P_{i}\) is the prediction result of the ith permutation; and \(k\) is the number of permutations; (4) the above steps are performed iteratively for each of the features. Here, we set k to 100,000. A feature with a higher value contributes more to the risk of diabetes.
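A minimal sketch of this individual-level procedure (formula 4), assuming a fitted model with predict_proba, one individual's feature vector x, and a reference feature matrix X_ref from which permuted values are drawn; k is reduced here for illustration (the study used 100,000):

```python
import numpy as np

def individual_feature_contributions(model, x, X_ref, k=1000, seed=0):
    """For one individual x, estimate each feature's contribution to the predicted
    diabetes risk as |P_r - mean(P_i)| over k random replacements of that feature."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    X_ref = np.asarray(X_ref, dtype=float)
    p_r = model.predict_proba(x.reshape(1, -1))[0, 1]   # original risk score P_r
    contributions = np.zeros(x.shape[0])
    for j in range(x.shape[0]):
        perturbed = np.tile(x, (k, 1))
        # Replace feature j with values randomly drawn from the reference dataset.
        perturbed[:, j] = rng.choice(X_ref[:, j], size=k, replace=True)
        p_i = model.predict_proba(perturbed)[:, 1]
        contributions[j] = abs(p_r - p_i.mean())
    return contributions
```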


Addressing gaps in data on drinking water quality through data … – Nature.com

Input data

The analytical framework begins with input data and continues to data preparation, modeling, and application (Fig. 5). The study uses the Ethiopia Socioeconomic Survey (ESS). The survey is a collaboration between the Central Statistics Agency and the World Bank under the Living Standards Measurement Study – Integrated Surveys on Agriculture (LSMS-ISA) project. ESS began in 2011/12, and the first wave, ESS1, covered rural and small-town areas. The survey was expanded to include medium and large towns in 2013/14 (ESS2). The 2013/2014 sample households were visited again in 2015/16 (ESS3), during which the water quality module was implemented. The survey was fielded again in 2018/19 (ESS4) with a refreshed sample. This study is primarily based on the 2016 survey (ESS3) and the associated water quality survey18,28. In this study, ESS2 is the Earlier Survey, ESS3 is the Reference Survey, and ESS4 is the Latest Survey. ESS1 was not used because that survey did not cover medium and large towns. See the Data Availability section for further information on these data sources, including metadata.

Fig. 5 Methodological workflow from input data to model application.

ESS is a multi-topic household survey with a range of individual- and household-level socioeconomic and demographic information. This includes basic individual-level demographic information on household structure, education, health, and labor market outcomes, as well as household-level information such as household assets, consumption expenditure, dwelling characteristics, and access to electricity, water, and sanitation facilities. ESS data also come with a range of geospatial variables that are constructed by mapping the household's location to other data available for the area. These include, among other things, rainfall, temperature, greenness, wetness, altitude, population density, and the household's proximity to the nearest major road, urban center, and market center. In addition, the 2015/16 survey (ESS3), which is the main focus of this study, implemented a water quality module that included microbial and chemical tests to measure water quality. The microbial test included the presence of E. coli, WHO's preferred indicator of fecal contamination5.

The response variable in this study is the presence of E. coli contamination at the point of collection. Contaminated drinking water refers to the detection of E. coli in water samples collected from the household's drinking water source.

The objective of this study was to develop a predictive model for drinking water contamination from minimal socioeconomic information. Therefore, only features that are often included in household surveys are considered. For example, the 2015/16 water quality module has some information on the chemical and physical characteristics of the water. These variables were not included in the training dataset because they are not usually available in other surveys. Therefore, the data preparation for this study considered only selected variables.

Data preparation activities included pre-processing, data splitting, and dimension reduction. The pre-processing step involved constructing some variables from existing variables, variable transformation, and treating missing values by imputation or dropping them from the analysis. Constructed variables included wealth index and open defecation in the area. The wealth index was constructed from selected assets using principal component analysis. Open defecation in the area is an enumeration area (EA) level variable and indicates the proportion of households in the EA who do not have a toilet facility. Variables that were transformed include the water source type. For example, we combined boreholes, protected springs and wells into a single category given the comparatively low number of respondents and in order to harmonize responses across the three waves of the survey. Similarly, unprotected springs and wells were combined. Consequently, the water source type list included in the model selection analysis had fewer categories than in the raw data.
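As a rough illustration of this preprocessing step (the variable names and category labels are assumptions, not taken from the ESS codebooks), a wealth index can be derived as the first principal component of household asset indicators, and water source categories recoded before modeling:

import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def add_wealth_index(df: pd.DataFrame, asset_cols: list) -> pd.DataFrame:
    """Append a PCA-based wealth index built from household asset indicators."""
    assets = StandardScaler().fit_transform(df[asset_cols])
    out = df.copy()
    out["wealth_index"] = PCA(n_components=1).fit_transform(assets).ravel()
    return out

# Harmonize water source categories across waves (hypothetical labels).
recode = {"borehole": "protected well/spring",
          "protected spring": "protected well/spring",
          "protected well": "protected well/spring",
          "unprotected spring": "unprotected well/spring",
          "unprotected well": "unprotected well/spring"}
# df["water_source"] = df["water_source"].replace(recode)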

To assess how the classifiers generalize to unseen data, the pre-processed data was split into training and test datasets stratified by the distribution of the response variable. Accordingly, 80% of the data is assigned to the training dataset and the remaining 20% is assigned to the test dataset. The training dataset was used to train the classifiers and estimate the hyperparameters, and the test dataset was used to evaluate the performance of the classifiers and get an independent assessment of how well the classifiers performed in predicting the positive class (contaminated drinking water source). To reduce the dimension of the processed data, the Boruta feature selection algorithm was used. The final list of features used in the analysis is presented in Supplementary Table 1.
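The split and feature selection could be sketched as follows in Python; the authors worked in R with the Boruta package, so BorutaPy is used here as an assumed stand-in, and the column names are hypothetical.

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from boruta import BorutaPy

# df: the pre-processed household-level DataFrame from the previous step.
X = df.drop(columns="ecoli_detected")
y = df["ecoli_detected"]

# Stratified 80/20 split preserving the class distribution.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42)

# Boruta-style all-relevant feature selection on the training data only.
selector = BorutaPy(
    RandomForestClassifier(n_jobs=-1, class_weight="balanced", max_depth=5),
    n_estimators="auto", random_state=42)
selector.fit(X_train.values, y_train.values)
selected = X_train.columns[selector.support_].tolist()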

We examined a few commonly used classification algorithms, including GLM, GLMNET, KNN, SVM, and two decision tree-based classifiers: RF and XGBoost. To obtain the values of the classifiers' hyperparameters that maximize the area under the ROC curve, we tuned the non-linear classifiers using a regular grid search method.
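A generic tuning skeleton for this step, sketched with scikit-learn's grid search and cross-validation rather than the R tools the authors used, could look like this:

from sklearn.model_selection import GridSearchCV, StratifiedKFold

def tune(estimator, param_grid, X, y):
    """Grid search that picks the hyperparameters maximizing ROC AUC."""
    search = GridSearchCV(
        estimator, param_grid, scoring="roc_auc",
        cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
        n_jobs=-1)
    search.fit(X, y)
    return search.best_estimator_, search.best_score_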

The GLM is a parametric model that allows different link functions for the response variable. For classification, the response values are categorical; in this study we have a binary classification problem, i.e., contaminated versus non-contaminated, so logistic regression is used as the reference model. The R glm implementation was used in this study30.
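In scikit-learn terms (a stand-in for the R glm fit, not the authors' code), the unpenalized logistic regression baseline is simply:

from sklearn.linear_model import LogisticRegression

# Plain logistic regression with no regularization (scikit-learn >= 1.2).
glm_baseline = LogisticRegression(penalty=None, max_iter=1000)
glm_baseline.fit(X_train[selected], y_train)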

The GLMNET classifier fits GLMs via penalized maximum likelihood. The lasso and elastic net are popular types of penalized (regularized) linear regression that add penalties to the loss function during training. Regularization promotes simpler models with better generalization and can shrink the coefficients of redundant or highly correlated features toward zero. We used the glmnet R package for the GLMNET classifier and tuned two hyperparameters: penalty (the regularization strength) and mixture (the relative weight of the lasso and ridge penalties).
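An elastic-net logistic regression can serve as a rough Python equivalent, with C playing the role of the penalty and l1_ratio the role of the mixture; this sketch reuses the tune helper and split from the earlier sketches.

from sklearn.linear_model import LogisticRegression

glmnet_model, glmnet_auc = tune(
    LogisticRegression(penalty="elasticnet", solver="saga", max_iter=5000),
    {"C": [0.01, 0.1, 1.0, 10.0], "l1_ratio": [0.0, 0.25, 0.5, 0.75, 1.0]},
    X_train[selected], y_train)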

KNN is one of the most widely used non-parametric classifiers. It treats similarity as closeness in feature space: a new data point is assigned the class held by the majority of its k nearest neighbors in the training set. We used the kknn package in R and tuned two hyperparameters: neighbors (the number of nearest neighbors) and weight_func (the distance weighting function).
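A comparable scikit-learn sketch, where n_neighbors and weights loosely mirror kknn's neighbors and weight_func:

from sklearn.neighbors import KNeighborsClassifier

knn_model, knn_auc = tune(
    KNeighborsClassifier(),
    {"n_neighbors": [5, 11, 21, 41], "weights": ["uniform", "distance"]},
    X_train[selected], y_train)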

SVM is another classification method based on the training points closest to the decision boundary. It separates classes with the hyperplane that has the maximum margin between them in a high-dimensional feature space31 and, with non-linear kernels, also works for cases that are not linearly separable. In this study, we used a non-linear (polynomial) kernel via the kernlab package in R and tuned two hyperparameters: cost and degree (the polynomial degree).
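A polynomial-kernel SVM sketch in the same style, with C standing in for cost:

from sklearn.svm import SVC

svm_model, svm_auc = tune(
    SVC(kernel="poly", probability=True),
    {"C": [0.1, 1.0, 10.0], "degree": [1, 2, 3]},
    X_train[selected], y_train)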

RF is an ensemble method that builds multiple decision trees by sampling the original dataset multiple times with replacement32. Each tree is therefore trained on a bootstrap sample of the data and grown to separate the classes as well as possible; RF then combines the trees by majority vote. Although a large number of trees slows the process, more trees generally improve overall accuracy and help prevent overfitting. We used the ranger package in R, which also provides feature importances, and tuned the following three hyperparameters: mtry (the number of randomly selected predictors), min_n (the minimal node size), and trees (1000).
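A random forest sketch with 1000 trees, where max_features and min_samples_leaf loosely correspond to ranger's mtry and min_n:

from sklearn.ensemble import RandomForestClassifier

rf_model, rf_auc = tune(
    RandomForestClassifier(n_estimators=1000, n_jobs=-1, random_state=42),
    {"max_features": ["sqrt", 0.3, 0.5], "min_samples_leaf": [1, 5, 10]},
    X_train[selected], y_train)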

XGBoost is another machine learning ensemble method, which uses the gradient of a loss function that measures performance33. Unlike ensemble methods that train models independently of one another, XGBoost (boosting) trains models sequentially, with each new model correcting the errors made by the previous ones; this continues until no further improvement is achieved. XGBoost is generally fast to execute and gives good accuracy. In this study, we used the xgboost package in R and tuned two of its parameters: trees (the number of trees) and tree_depth (the maximum tree depth).
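An XGBoost sketch, with n_estimators and max_depth corresponding to the trees and tree_depth parameters described above (Python xgboost API, not the R package the authors used):

from xgboost import XGBClassifier

xgb_model, xgb_auc = tune(
    XGBClassifier(eval_metric="logloss", n_jobs=-1, random_state=42),
    {"n_estimators": [200, 500, 1000], "max_depth": [3, 5, 7]},
    X_train[selected], y_train)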

The classification algorithms are evaluated using metrics calculated from the four outcomes of the confusion matrix: (i) true positives (TP), correctly predicted as contaminated; (ii) true negatives (TN), correctly predicted as not contaminated; (iii) false positives (FP), wrongly predicted as contaminated; and (iv) false negatives (FN), wrongly predicted as not contaminated. Because our data are class-imbalanced, we used a combination of metrics to evaluate the models: accuracy, sensitivity (also known as recall or the true positive rate, TPR), specificity (the true negative rate, TNR), the F1 score, and the area under the curve (AUC) of the receiver operating characteristic (ROC). The positive cases are more important than the negative cases, and the goal is to ensure that the best performing model maximizes the TPR. Finally, given the class imbalance, we also implemented resampling techniques17, namely upsampling the minority class and downsampling the majority class (see Supplementary Tables 3 and 4). However, these did not significantly improve the prediction results. The AUC for the RF model using the upsampling and downsampling techniques is 0.90 (95% CI 0.88, 0.93). Similarly, the AUC for the XGBoost model is 0.90 (95% CI 0.87, 0.92) for upsampling and 0.89 (95% CI 0.86, 0.92) for downsampling. These are similar to the main results reported in Table 2.
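These metrics can be computed from the test-set confusion matrix roughly as follows (positive class = contaminated; a sketch, not the authors' code):

from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             recall_score, roc_auc_score)

def evaluate(model, X_test, y_test):
    pred = model.predict(X_test)
    tn, fp, fn, tp = confusion_matrix(y_test, pred).ravel()
    return {
        "accuracy": accuracy_score(y_test, pred),
        "sensitivity (TPR)": recall_score(y_test, pred),   # tp / (tp + fn)
        "specificity (TNR)": tn / (tn + fp),
        "F1": f1_score(y_test, pred),
        "AUC": roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]),
    }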

The analyses were conducted with the R programming language.

See the rest here:
Addressing gaps in data on drinking water quality through data ... - Nature.com

Read More..

D-Wave Quantum Featured in The New Stack Article Discussing Potential Impact of Quantum Computing on AI – Yahoo Finance

LOS ANGELES, CA - (NewMediaWire) - September 8, 2023 - (InvestorBrandNetwork via NewMediaWire) - IBN, a multifaceted financial news, content creation and publishing company, is utilized by both public and private companies to optimize investor awareness and recognition.

D-Wave Quantum (NYSE: QBTS), a leader in quantum computing systems, software and services, was spotlighted in a recent article from The New Stack titled "D-Wave Suggests Quantum Annealing Could Help AI." The article notes that the effect of quantum computing on artificial intelligence ("AI") could be as understated as it is profound. According to the article, some say quantum computing is necessary to achieve general artificial intelligence, and "certain expressions of this paradigm, such as quantum annealing, are inherently probabilistic and optimal for machine learning."

The article points out that the most pervasive quantum annealing use cases center on optimization and constraints, which are challenges that traditionally involve nonstatistical AI approaches such as rules, symbols and reasoning. "With quantum computing, a lot of times we're talking about what will it be able to do in the future," observed Mark Johnson, D-Wave SVP of Quantum Technologies and Systems Products. "But no, you can do things with it today."

To view the full article, visit https://ibn.fm/cX09d

About D-Wave Quantum Inc.

D-Wave is a leader in the development and delivery of quantum computing systems, software and services, and is the world's first commercial supplier of quantum computers and the only company building both annealing quantum computers and gate-model quantum computers. The company's mission is to unlock the power of quantum computing today to benefit business and society. D-Wave does this by delivering customer value with practical quantum applications for problems as diverse as logistics, artificial intelligence, materials sciences, drug discovery, scheduling, cybersecurity, fault detection and financial modeling. D-Wave's technology is being used by some of the world's most advanced organizations, including Volkswagen, Mastercard, Deloitte, Davidson Technologies, ArcelorMittal, Siemens Healthineers, Unisys, NEC Corporation, Pattison Food Group Ltd., DENSO, Lockheed Martin, Forschungszentrum Jülich, University of Southern California and Los Alamos National Laboratory. For more information about the company, please visit http://www.DWaveSys.com.


Forward-Looking Statements

Certain statements in this press release are forward-looking, as defined in the Private Securities Litigation Reform Act of 1995. These statements involve risks, uncertainties, and other factors that may cause actual results to differ materially from the information expressed or implied by these forward-looking statements and may not be indicative of future results. Forward-looking statements in this press release include, but are not limited to, statements regarding the relationship between quantum computing and AI, and annealing quantum computing and machine learning. These forward-looking statements are subject to a number of risks and uncertainties, including, among others, various factors beyond management's control, including general economic conditions and other risks; our ability to expand our customer base and the customer adoption of our solutions; risks within D-Wave's industry, including anticipated trends, growth rates, and challenges for companies engaged in the business of quantum computing and the markets in which they operate; the outcome of any legal proceedings that may be instituted against us; risks related to the performance of our business and the timing of expected business or financial milestones; unanticipated technological or project development challenges, including with respect to the cost and/or timing thereof; the performance of our products; the effects of competition on our business; the risk that we will need to raise additional capital to execute our business plan, which may not be available on acceptable terms or at all; the risk that we may never achieve or sustain profitability; the risk that we are unable to secure or protect our intellectual property; volatility in the price of our securities; the risk that our securities will not maintain the listing on the NYSE; and the numerous other factors set forth in D-Wave's Annual Report on Form 10-K for its fiscal year ended December 31, 2022 and other filings with the Securities and Exchange Commission. Undue reliance should not be placed on the forward-looking statements in this press release in making an investment decision, which are based on information available to us on the date hereof. We undertake no duty to update this information unless required by law.

About IBN

IBN has introduced 50+ distinct investor-focused brands over the last 15+ years and has leveraged these unique brands to amass a collective audience of millions of social media followers. In conjunction with 5,000+ syndication partnerships, IBN's broad-based investor-facing brands help fulfill the diverse demands and needs of our rapidly growing base of client-partners. IBN will continue to expand our branded network of highly influential properties, harnessing the energy and experience of our team of specialized experts to exceed the expectations of each of our client-partners.

Please see full terms of use and disclaimers on the InvestorBrandNetwork website applicable to all content provided by IBN, wherever published or re-published: http://IBN.fm/Disclaimer

Corporate Communications

IBN (InvestorBrandNetwork)
Los Angeles, California

http://www.InvestorBrandNetwork.com
310.299.1717 Office
Editor@InvestorBrandNetwork.com

Go here to see the original:
D-Wave Quantum Featured in The New Stack Article Discussing Potential Impact of Quantum Computing on AI - Yahoo Finance

Read More..

Scientists used machine learning to perform quantum error correction – Tech Explorist

The qubits that make up quantum computers can assume any superposition of the computational basis states. Combined with quantum entanglement, another quantum property that links several qubits in ways that go beyond what is possible with classical connections, this allows quantum computers to carry out entirely new kinds of tasks.

The extraordinary fragility of quantum superpositions is the primary obstacle to the practical implementation of quantum computers. Minor disturbances, such as those arising from the environment's pervasive presence, produce errors that quickly destroy quantum superpositions. As a result, quantum computers lose their competitive advantage.

To overcome this obstacle, sophisticated methods for quantum error correction have been developed. While they can neutralize the effect of errors, they often come with a massive overhead in device complexity.

In a new study, scientists from the RIKEN Center for Quantum Computing used machine learning to perform error correction for quantum computers. Through this, they took a step forward in making these devices practical.

In particular, scientists used an autonomous correction system that, despite being approximate, can efficiently determine how best to make the necessary corrections.

The study used machine learning to find error correction methods with low device overhead and high error-correcting performance. To do this, the researchers focused on an autonomous approach to quantum error correction, in which a carefully engineered artificial environment replaces the need for frequent error-detecting measurements. They also studied bosonic qubit encodings, which are used, for example, in some of the most promising and widespread quantum computing devices currently available, built on superconducting circuits.

The vast search space of bosonic qubit encodings poses a challenging optimization problem, which the scientists tackled with reinforcement learning. In this machine learning technique, an agent explores a possibly abstract environment in order to learn and improve its action policy. With this approach, the team discovered that a surprisingly simple, approximate qubit encoding could not only significantly reduce device complexity compared with previously proposed encodings, but also outperform its rivals in its capacity to correct errors.
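For readers unfamiliar with the technique, the agent-environment loop of reinforcement learning can be illustrated with a toy tabular Q-learning example; this generic sketch is not the RIKEN method, whose agent instead searches over bosonic qubit encodings with error-correcting performance as the reward.

import random

n_states, n_actions, goal = 5, 2, 4              # toy chain: move left/right, reach state 4
Q = [[0.0] * n_actions for _ in range(n_states)]  # action-value table
alpha, gamma, eps = 0.1, 0.95, 0.1                # learning rate, discount, exploration

for episode in range(2000):
    s = 0
    while s != goal:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda x: Q[s][x])
        s_next = max(0, min(n_states - 1, s + (1 if a == 1 else -1)))
        r = 1.0 if s_next == goal else -0.01      # reward shapes the learned policy
        Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
        s = s_next

policy = [max(range(n_actions), key=lambda a: Q[s][a]) for s in range(n_states)]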

Yexiong Zeng, the first author of the paper, says, "Our work not only demonstrates the potential for deploying machine learning towards quantum error correction, but it may also bring us a step closer to the successful implementation of quantum error correction in experiments."

According to Franco Nori, "Machine learning can play a pivotal role in addressing large-scale quantum computation and optimization challenges. Currently, we are actively involved in several projects that integrate machine learning, artificial neural networks, quantum error correction, and quantum fault tolerance."

Journal Reference: Yexiong Zeng et al., Approximate Autonomous Quantum Error Correction with Reinforcement Learning, Physical Review Letters (2023). DOI: 10.1103/PhysRevLett.131.050601

Read the original here:
Scientists used machine learning to perform quantum error correction - Tech Explorist

Read More..

Machine learning contributes to better quantum error correction – Phys.org


Researchers from the RIKEN Center for Quantum Computing have used machine learning to perform error correction for quantum computers, a crucial step for making these devices practical, using an autonomous correction system that, despite being approximate, can efficiently determine how best to make the necessary corrections.

The research is published in the journal Physical Review Letters.

In contrast to classical computers, which operate on bits that can only take the basic values 0 and 1, quantum computers operate on "qubits", which can assume any superposition of the computational basis states. In combination with quantum entanglement, another quantum characteristic that connects different qubits beyond classical means, this enables quantum computers to perform entirely new operations, giving rise to potential advantages in some computational tasks, such as large-scale searches, optimization problems, and cryptography.

The main challenge towards putting quantum computers into practice stems from the extremely fragile nature of quantum superpositions. Indeed, tiny perturbations induced, for instance, by the ubiquitous presence of an environment give rise to errors that rapidly destroy quantum superpositions and, as a consequence, quantum computers lose their edge.

To overcome this obstacle, sophisticated methods for quantum error correction have been developed. While they can, in theory, successfully neutralize the effect of errors, they often come with a massive overhead in device complexity, which itself is error-prone and thus potentially even increases the exposure to errors. As a consequence, full-fledged error correction has remained elusive.

In this work, the researchers leveraged machine learning in a search for error correction schemes that minimize the device overhead while maintaining good error correcting performance. To this end, they focused on an autonomous approach to quantum error correction, where a cleverly designed, artificial environment replaces the necessity to perform frequent error-detecting measurements.

They also looked at "bosonic qubit encodings", which are, for instance, available and utilized in some of the currently most promising and widespread quantum computing machines based on superconducting circuits.

Finding high-performing candidates in the vast search space of bosonic qubit encodings represents a complex optimization task, which the researchers address with reinforcement learning, an advanced machine learning method, where an agent explores a possibly abstract environment to learn and optimize its action policy.

With this, the group found that a surprisingly simple, approximate qubit encoding could not only greatly reduce the device complexity compared to other proposed encodings, but also outperformed its competitors in terms of its capability to correct errors.

Yexiong Zeng, the first author of the paper, says, "Our work not only demonstrates the potential for deploying machine learning towards quantum error correction, but it may also bring us a step closer to the successful implementation of quantum error correction in experiments."

According to Franco Nori, "Machine learning can play a pivotal role in addressing large-scale quantum computation and optimization challenges. Currently, we are actively involved in a number of projects that integrate machine learning, artificial neural networks, quantum error correction, and quantum fault tolerance."

More information: Yexiong Zeng et al, Approximate Autonomous Quantum Error Correction with Reinforcement Learning, Physical Review Letters (2023). DOI: 10.1103/PhysRevLett.131.050601

Journal information: Physical Review Letters

Read more from the original source:
Machine learning contributes to better quantum error correction - Phys.org

Read More..