
3 Up-and-Coming Machine Learning Stocks to Put on Your Must … – InvestorPlace


Stocks connected to machine learning are synonymous with those connected to artificial intelligence. Machine learning falls under the umbrella of AI and relates to the use of data and algorithms to imitate human learning to improve accuracy. Kinda scary? Sure. However, machine learning is also proving to be revolutionary in 2023. The emergence of generative AI and its promise to improve our world has created a lot of value. This has led to the rise of machine learning stocks to buy.

While the companies discussed in this article might not be truly up-and-coming as they are established, they certainly are improving. That makes them must-buy stocks that any investor ought to consider.


There are 13.5 billion reasons why Nvidia (NASDAQ:NVDA) stock should be on every investor's list. I'm of course referring to Nvidia's $13.5 billion in second-quarter revenues. That far exceeded the $11 billion mark, perceived as incredibly ambitious, that Nvidia had given as guidance.

Those blowout earnings lend credence to the notion that AI and machine learning will be much more than a bubble. Instead, it is crystal clear that companies are clamoring for Nvidia's leading AI chips and that the pace of things is increasing, not slowing.

Nvidia's data center revenues alone, at $10.32 billion, nearly reached that $11 billion figure. Cloud firms are scrambling to secure their supply of chips that are used for machine learning purposes, among other things.

NVDA shares can absolutely run higher from their current position. The stock's price-to-earnings ratio has temporarily fallen given how unexpectedly high earnings were. Nvidia is predicting $16 billion in revenues for the coming quarter. I don't believe there's any real reason to back off from its shares currently.


Crowdstrike (NASDAQ:CRWD) is another machine learning stock to consider. The company utilizes machine learning to help it better understand how to stop breaches before they can occur. It's an AI-powered cybersecurity firm that is strongly rated on Wall Street and offers a lot of upside on that basis.

Crowdstrike is getting better and better at thwarting cyber attacks probably by the second. Machine learning allows the company to more intelligently prevent cyber attacks with each piece of data it gathers from an attack.

The company has been growing at a rapid pace over the last few years and has seen year-over-year increases above 40% in each of those periods. However, it has simultaneously struggled to reach profitability, which likely explains the disconnect between its current share price and analysts' expected prices.

Crowdstrike has several opportunities in front of it. First, if it can address profitability concerns, it's certain to appreciate in price. Second, there's a general rush toward securing systems that also benefits the company and should provide it fertile ground for future gains.


AMD (NASDAQ:AMD) is the runner-up in the battle for machine learning supremacy at this point.

The stock has boomed in 2023 alongside Nvidia but not to the same degree. It is going to continue to crop up in the machine learning/AI conversation and absolutely makes sense as an investment now.

Let's try to understand AMD in relation to machine learning and its strengths and weaknesses vis-a-vis Nvidia. By now, everyone knows that Nvidia wins the overall battle hands down. When it comes to CPUs, AMD has a lot to offer. Its CPUs, along with those from Intel (NASDAQ:INTC), are the highest rated for machine learning purposes.

However, GPUs outperform CPUs when it comes to machine learning, and Nvidia is the king of GPUs. Its machine learning GPUs occupy at least the top five spots in published rankings.

As bad as that sounds, AMD is roughly 80% as capable as Nvidia overall in relation to AI and machine learning. Therefore, it has a massive opportunity at hand in closing that gap. It's also one of those machine learning stocks to buy.

On the date of publication, Alex Sirois did not have (either directly or indirectly) any positions in the securities mentioned in this article. The opinions expressed in this article are those of the writer, subject to the InvestorPlace.com Publishing Guidelines.

Alex Sirois is a freelance contributor to InvestorPlace whose personal stock investing style is focused on long-term, buy-and-hold, wealth-building stock picks. Having worked in several industries from e-commerce to translation to education and utilizing his MBA from George Washington University, he brings a diverse set of skills through which he filters his writing.


Seismologists use deep learning to forecast earthquakes – University of California, Santa Cruz

For more than 30 years, the models that researchers and government agencies use to forecast earthquake aftershocks have remained largely unchanged. While these older models work well with limited data, they struggle with the huge seismology datasets that are now available.

To address this limitation, a team of researchers at the University of California, Santa Cruz and the Technical University of Munich created a new model that uses deep learning to forecast aftershocks: the Recurrent Earthquake foreCAST (RECAST). In a paper published today in Geophysical Research Letters, the scientists show how the deep learning model is more flexible and scalable than the earthquake forecasting models currently used.

The new model outperformed the current model, known as the Epidemic Type Aftershock Sequence (ETAS) model, for earthquake catalogs of about 10,000 events and greater.

"The ETAS model approach was designed for the observations that we had in the '80s and '90s when we were trying to build reliable forecasts based on very few observations," said Kelian Dascher-Cousineau, the lead author of the paper who recently completed his PhD at UC Santa Cruz. "It's a very different landscape today. Now, with more sensitive equipment and larger data storage capabilities, earthquake catalogs are much larger and more detailed."

"We've started to have million-earthquake catalogs, and the old model simply couldn't handle that amount of data," said Emily Brodsky, a professor of earth and planetary sciences at UC Santa Cruz and co-author on the paper. In fact, one of the main challenges of the study was not designing the new RECAST model itself but getting the older ETAS model to work on huge data sets in order to compare the two.

"The ETAS model is kind of brittle, and it has a lot of very subtle and finicky ways in which it can fail," said Dascher-Cousineau. "So, we spent a lot of time making sure we weren't messing up our benchmark compared to actual model development."

To continue applying deep learning models to aftershock forecasting, Dascher-Cousineau says the field needs a better system for benchmarking. In order to demonstrate the capabilities of the RECAST model, the group first used an ETAS model to simulate an earthquake catalog. After working with the synthetic data, the researchers tested the RECAST model using real data from the Southern California earthquake catalog.
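
To make the benchmarking idea concrete, here is a minimal, hedged sketch of a recurrent network that learns to predict the time to the next event from an earthquake catalog. It is not the RECAST implementation: the catalog below is a crude synthetic stand-in generated from a decaying (Omori-like) rate rather than an ETAS simulation, and the GRU architecture and loss are illustrative assumptions only.

```python
import torch
import torch.nn as nn

# Toy catalog: inter-event times sampled from a decaying (Omori-like) rate.
# This is a synthetic stand-in, not an ETAS simulation or a real catalog.
torch.manual_seed(0)
rates = 1.0 / (torch.arange(1, 501, dtype=torch.float32) + 10.0)
dt = torch.distributions.Exponential(rates).sample()        # inter-event times
x = torch.log1p(dt).unsqueeze(0).unsqueeze(-1)               # shape (1, seq, 1)

class NextEventGRU(nn.Module):
    """Predict the (log) inter-event time of the next earthquake from history."""
    def __init__(self, hidden=32):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, seq):
        out, _ = self.gru(seq)
        return self.head(out)  # one prediction per time step

model = NextEventGRU()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
inputs, targets = x[:, :-1], x[:, 1:]

for epoch in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(inputs), targets)
    loss.backward()
    opt.step()

print(f"final training loss: {loss.item():.4f}")
```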

They found that the RECAST model, which can essentially learn how to learn, performed slightly better than the ETAS model at forecasting aftershocks, particularly as the amount of data increased. The computational effort and time were also significantly better for larger catalogs.

This is not the first time scientists have tried using machine learning to forecast earthquakes, but until recently, the technology was not quite ready, said Dascher-Cousineau. New advances in machine learning make the RECAST model more accurate and easily adaptable to different earthquake catalogs.

The models flexibility could open up new possibilities for earthquake forecasting. With the ability to adapt to large amounts of new data, models that use deep learning could potentially incorporate information from multiple regions at once to make better forecasts about poorly studied areas.

"We might be able to train on New Zealand, Japan, California and have a model that's actually quite good for forecasting somewhere where the data might not be as abundant," said Dascher-Cousineau.

Using deep-learning models will also eventually allow researchers to expand the type of data they use to forecast seismicity.

"We're recording ground motion all the time," said Brodsky. "So the next level is to actually use all of that information, not worry about whether we're calling it an earthquake or not an earthquake, but to use everything."

In the meantime, the researchers hope the model sparks discussions about the possibilities of the new technology.

"It has all of this potential associated with it," said Dascher-Cousineau. "Because it is designed that way."


UW-Madison: Cancer diagnosis and treatment could get a boost … – University of Wisconsin System

Thanks to machine learning algorithms, short pieces of DNA floating in the bloodstream of cancer patients can help doctors diagnose specific types of cancer and choose the most effective treatment for a patient.

The new analysis technique, created by University of Wisconsin–Madison researchers and published recently in Annals of Oncology, is compatible with liquid biopsy testing equipment already approved in the United States and in use in cancer clinics. This could speed the new method's path to helping patients.

Liquid biopsies rely on simple blood draws instead of taking a piece of cancerous tissue from a tumor with a needle.


"Liquid biopsies are much less invasive than a tissue biopsy, which may even be impossible to do in some cases, depending on where a patient's tumor is," says Marina Sharifi, a professor of medicine and oncologist in UW–Madison's School of Medicine and Public Health. "It's much easier to do them multiple times over the course of a patient's disease to monitor the status of cancer and its response to treatment."

Cancerous tumors shed genetic material, called cell-free DNA, into the bloodstream as they grow. But not all parts of a cancer cell's DNA are likely to tumble away. Cells store some of their DNA by coiling it up in protective balls called histones. They unwrap sections to access parts of the genetic code as needed.

Kyle Helzer, a UW–Madison bioinformatics scientist, says that the parts of the DNA containing genes that cancer cells use are often uncoiled more frequently and are thus more likely to fragment.

"We're exploiting that larger distribution of those regions among cell-free DNA to identify cancer types," adds Helzer, who is also a co-lead author of the study along with Sharifi and scientist Jamie Sperger.


The research team, led by UW–Madison senior authors Shuang (George) Zhao, professor of human oncology, and Joshua Lang, professor of medicine, used DNA fragments found in blood samples from a past study of nearly 200 patients (some with, some without cancer), and new samples collected from more than 300 patients treated for breast, lung, prostate or bladder cancers at UW–Madison and other research hospitals in the Big Ten Cancer Research Consortium.

The scientists divided each group of samples into two. One portion was used to train a machine-learning algorithm to identify patterns among the fragments of cell-free DNA, relatively unique fingerprints specific to different types of cancers. They used the other portion to test the trained algorithm. The algorithm topped 80 percent accuracy in translating the results of a liquid biopsy into both a cancer diagnosis and the specific type of cancer afflicting a patient.
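
As a rough illustration of the train/test workflow described above, the sketch below fits an off-the-shelf classifier to synthetic "fragment profile" features and scores it on a held-out split. The feature matrix, labels, and choice of random forest are assumptions made for illustration; they are not the team's data, features, or algorithm.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Hypothetical stand-in data: each row is a patient's cell-free DNA fragment
# profile over targeted regions; labels 0-3 are cancer types. Not patient data.
rng = np.random.default_rng(0)
X = rng.poisson(lam=20, size=(500, 40)).astype(float)
y = rng.integers(0, 4, size=500)
X += y[:, None] * rng.normal(1.0, 0.2, size=(500, 40))  # inject a weak class signal

# Hold out a test portion, mirroring the split described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=300, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```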

In addition, the machine learning approach was able to tell apart two subtypes of prostate cancer: the most common version, adenocarcinoma, and a swift-progressing variant called neuroendocrine prostate cancer (NEPC) that is resistant to standard treatment approaches. Because NEPC is often difficult to distinguish from adenocarcinoma, but requires aggressive action, it puts oncologists like Lang and Sharifi in a bind.


"Currently, the only way to diagnose NEPC is via a needle biopsy of a tumor site, and it can be difficult to get a conclusive answer from this approach, even if we have a high clinical suspicion for NEPC," Sharifi says.

Liquid biopsies have advantages, Sperger adds, "in that you don't have to know which tumor site to biopsy at, and it is much easier for the patient to get a standard blood draw."

The blood samples were processed using cell-free DNA sequencing technology marketed by Iowa-based Integrated DNA Technologies. Using standard panels like those currently in the clinic is a departure from other methods of fragmentomic analysis of cancer DNA in blood samples, one that can reduce the time and cost of testing.

"Most commercial panels have been developed around the most important cancer genes that indicate certain drugs for treatment, and they sequence those select genes," says Zhao. "What we've shown is that we can use those same panels and same targeted genes to look at the fragmentomics of the cell-free DNA in a blood sample and identify the type of cancer a patient has."

The UW Carbone Cancer Center's Circulating Biomarker Core and Biospecimen Disease-Oriented Team contributed to the collection of the study's hundreds of patient samples.

This research was funded in part by grants from the National Institutes of Health (DP2 OD030734, 1UH2CA260389 and R01CA247479) and the Department of Defense (PC190039, PC200334 and PC180469).

Written by Chris Barncard

Link to original story: https://news.wisc.edu/algorithmic-blood-test-analysis-will-ease-diagnosis-of-cancer-types-guide-treatment/


Smarter AI: Choosing the Best Path to Optimal Deep Learning – SciTechDaily

Researchers have improved deep learning by selecting the most efficient overall path to the output, leading to a more effective AI without added layers.

Like climbing a mountain via the shortest possible path, improving classification tasks can be achieved by choosing the most influential path to the output, and not just by learning with deeper networks.

Deep Learning (DL) performs classification tasks using a series of layers. To effectively execute these tasks, local decisions are performed progressively along the layers. But can we perform an all-encompassing decision by choosing the most influential path to the output rather than performing these decisions locally?

In an article published today (August 31) in the journal Scientific Reports, researchers from Bar-Ilan University in Israel answer this question with a resounding yes. Pre-existing deep architectures have been improved by updating the most influential paths to the output.

Like climbing a mountain via the shortest possible path, improving classification tasks can be achieved by training the most influential path to the output, and not just by learning with deeper networks. Credit: Prof. Ido Kanter, Bar-Ilan University

"One can think of it as two children who wish to climb a mountain with many twists and turns. One of them chooses the fastest local route at every intersection while the other uses binoculars to see the entire path ahead and picks the shortest and most significant route, just like Google Maps or Waze. The first child might get a head start, but the second will end up winning," said Prof. Ido Kanter, of Bar-Ilan's Department of Physics and Gonda (Goldschmied) Multidisciplinary Brain Research Center, who led the research.

"This discovery can pave the way for better enhanced AI learning, by choosing the most significant route to the top," added Yarden Tzach, a PhD student and one of the key contributors to this work.

This exploration of a deeper comprehension of AI systems by Prof. Kanter and his experimental research team, led by Dr. Roni Vardi, aims to bridge between the biological world and machine learning, thereby creating an improved, advanced AI system. To date they have discovered evidence for efficient dendritic adaptation using neuronal cultures, as well as how to implement those findings in machine learning, showing how shallow networks can compete with deep ones, and finding the mechanism underlying successful deep learning.

Enhancing existing architectures using global decisions can pave the way for improved AI, which can improve its classification tasks without the need for additional layers.

Reference: "Enhancing the accuracies by performing pooling decisions adjacent to the output layer," 31 August 2023, Scientific Reports. DOI: 10.1038/s41598-023-40566-y


How Can Hybrid Machine Learning Techniques Help With Effective … – Dataconomy

As in many other areas of our lives, hybrid machine learning techniques can help us with effective heart disease prediction. So how can the technology of our time, machine learning, be used to improve the quality and length of human life?

Heart disease stands as one of the foremost global causes of mortality today, presenting a critical challenge in clinical data analysis. Hybrid machine learning techniques, which are highly effective at processing vast volumes of healthcare data, are increasingly promising for effective heart disease prediction.

According to the World Health Organization, heart disease takes an estimated 17.9 million lives each year. Although many developments in the field of medicine have succeeded in reducing the death rate of heart diseases in recent years, we are failing in the early diagnosis of these diseases. The time has come for us to treat ML and AI algorithms as more than simple trends.

However, effective heart disease prediction proves complex due to various contributing risk factors such as diabetes, high blood pressure, and abnormal pulse rates. Several data mining and neural network techniques have been employed to gauge the severity of heart disease, but predicting it ahead of time is a different subject.

This ailment is subclinical, and that's why experts recommend check-ups twice a year for anyone over the age of 30. But let's face it: human beings are lazy and look for the simplest way to do things. How hard can it be to accept an effective, technological medical innovation into our lives at a time when we can do our weekly shopping at home with a single voice command?

Heart disease is one of the leading causes of death worldwide and is a significant public health concern. The deadliness of heart disease depends on various factors, including the type of heart disease, its severity, and the individual's overall health. But does that mean we are left without any preventative method? Is there any way to find it out before it happens to us?

The speed of technological development has reached a peak that we never could have imagined, especially in the last three years. This technological journey of humanity, which started with the slow integration of IoT systems such as Alexa into our lives, peaked in the last quarter of 2022 with the growing prevalence and use of ChatGPT and other large language models (LLMs). We are no longer far from the concepts of AI and ML, and these products are preparing to become the hidden power behind medical prediction and diagnostics.

Hybrid machine learning techniques can help with effective heart disease prediction by combining the strengths of different machine learning algorithms and utilizing them in a way that maximizes their predictive power.

Hybrid techniques can help in feature engineering, which is an essential step in machine learning-based predictive modeling. Feature engineering involves selecting and transforming relevant variables from raw data into features that can be used by machine learning algorithms. By combining different techniques, such as feature selection, feature extraction, and feature transformation, hybrid machine learning techniques can help identify the most informative features that contribute to effective heart disease prediction.
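
A minimal sketch of that idea, using scikit-learn and synthetic data as a stand-in for clinical risk factors: univariate feature selection and PCA-based feature extraction are combined into one hybrid feature set before classification. Every dataset and parameter choice here is an assumption for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for tabular risk-factor data (age, blood pressure, etc.).
X, y = make_classification(n_samples=600, n_features=20, n_informative=6, random_state=0)

# Hybrid feature engineering: selected raw features plus extracted components.
features = FeatureUnion([
    ("select", SelectKBest(f_classif, k=6)),   # feature selection
    ("pca", PCA(n_components=4)),              # feature extraction/transformation
])
model = Pipeline([
    ("scale", StandardScaler()),
    ("features", features),
    ("clf", LogisticRegression(max_iter=1000)),
])
print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
```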

The choice of an appropriate model is critical in predictive modeling. Hybrid machine learning techniques excel in model selection by amalgamating the strengths of multiple models. By combining, for example, a decision tree with a support vector machine (SVM), these hybrid models leverage the interpretability of decision trees and the robustness of SVMs to yield superior predictions in medicine.
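
One common way to realize such a hybrid, sketched below on synthetic data, is stacking: a decision tree and an SVM serve as base learners and a simple meta-model blends their outputs. This is an illustrative pattern, not a prescribed implementation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=600, n_features=13, n_informative=5, random_state=1)

# Stack an interpretable decision tree with a margin-based SVM; a logistic
# regression meta-model combines their predicted probabilities.
hybrid = StackingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=4, random_state=1)),
        ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=1))),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)
print("CV accuracy:", cross_val_score(hybrid, X, y, cv=5).mean())
```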

Model ensembles, formed by merging predictions from multiple models, are another avenue where hybrid techniques shine. The synergy of diverse models often surpasses individual model performance, resulting in more accurate heart disease predictions. For instance, a hybrid ensemble uniting a random forest with a gradient-boosting machine leverages both models' strengths to increase the prediction accuracy of heart diseases.
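
The random-forest and gradient-boosting pairing mentioned above can be expressed as a soft-voting ensemble, as in this brief sketch on synthetic data, where the two models' class probabilities are simply averaged.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (VotingClassifier, RandomForestClassifier,
                              GradientBoostingClassifier)
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=600, n_features=13, n_informative=5, random_state=2)

# Soft voting: average the predicted class probabilities of both models.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=2)),
        ("gbm", GradientBoostingClassifier(random_state=2)),
    ],
    voting="soft",
)
print("CV accuracy:", cross_val_score(ensemble, X, y, cv=5).mean())
```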

Dealing with missing values is a common challenge in medical data analysis. Hybrid machine learning techniques prove beneficial by combining imputation strategies like mean imputation, median imputation, and statistical model-based imputation. This amalgamation helps mitigate the impact of missing values on predictive accuracy.
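
For example, scikit-learn exposes all three strategies named above; the toy matrix here merely stands in for clinical measurements with gaps.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import SimpleImputer, IterativeImputer

# Toy matrix of risk factors with missing entries (np.nan); not clinical data.
X = np.array([[63., 145., np.nan],
              [41., np.nan, 198.],
              [np.nan, 130., 250.],
              [55., 120., 220.]])

mean_imp = SimpleImputer(strategy="mean").fit_transform(X)
median_imp = SimpleImputer(strategy="median").fit_transform(X)
model_imp = IterativeImputer(random_state=0).fit_transform(X)  # model-based imputation

print(mean_imp, median_imp, model_imp, sep="\n\n")
```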

The proliferation of large datasets poses challenges related to high-dimensional data. Hybrid approaches address this challenge by fusing dimensionality reduction techniques like principal component analysis (PCA), independent component analysis (ICA), and singular value decomposition (SVD) with machine learning algorithms. This results in reduced data dimensionality, enhancing model interpretability and prediction accuracy.

Traditional machine learning algorithms may falter when dealing with non-linear relationships between variables. Hybrid techniques tackle this issue effectively by amalgamating methods such as polynomial feature engineering, interaction term generation, and the application of recursive neural networks. This amalgamation captures non-linear relationships, thus improving predictive accuracy.
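
A hedged sketch of the polynomial route only: a degree-2 expansion adds squared and interaction terms so that even a linear classifier can capture some non-linear structure. (The recursive neural network variant mentioned above is omitted here for brevity.)

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=600, n_features=8, n_informative=4, random_state=3)

# Degree-2 polynomial expansion: squared terms plus pairwise interaction terms.
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    StandardScaler(),
    LogisticRegression(max_iter=2000),
)
print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
```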

Hybrid machine learning techniques enhance model interpretability by combining methodologies that shed light on the models decision-making process. For example, a hybrid model coupling a decision tree with a linear model offers interpretability akin to decision trees alongside the statistical significance provided by linear models. This comprehensive insight aids in better understanding and trustworthiness of heart disease predictions.

Multiple studies have explored heart disease prediction using hybrid machine learning techniques. One such novel method, designed to enhance prediction accuracy, incorporates a combination of hybrid machine learning techniques to identify significant features for cardiovascular disease prediction.

Mohan, Thirumalai, and Srivastava propose a novel method for heart disease prediction that uses a hybrid of machine learning techniques. The method first uses a decision tree algorithm to select the most significant features from a set of patient data.

The researchers compared their method to other machine learning methods for heart disease prediction, such as logistic regression and naive Bayes. They found that their method outperformed these other methods in terms of accuracy.

The decision tree algorithm used to select features is called the C4.5 algorithm. This algorithm is a popular choice for feature selection because it is relatively simple to understand and implement, and it has been shown to be effective in a variety of applications including effective heart disease prediction.

The SVM classifier used to predict heart disease is a type of machine learning algorithm that is known for its accuracy and robustness. SVM classifiers work by finding a hyperplane that separates the data points into two classes. In the case of heart disease prediction, the two classes are patients with heart disease and patients without heart disease.
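
The sketch below mirrors that two-step pattern, tree-based feature selection followed by an SVM, but only approximately: scikit-learn does not implement C4.5 itself, so an entropy-criterion decision tree is used as a stand-in, and the data are synthetic rather than real patient records.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.feature_selection import SelectFromModel
from sklearn.tree import DecisionTreeClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=600, n_features=13, n_informative=5, random_state=4)

# Step 1: rank features with an entropy-based decision tree (rough C4.5 stand-in).
# Step 2: train an SVM on the selected features only.
pipe = make_pipeline(
    SelectFromModel(DecisionTreeClassifier(criterion="entropy", random_state=4)),
    StandardScaler(),
    SVC(kernel="rbf"),
)
print("CV accuracy:", cross_val_score(pipe, X, y, cv=5).mean())
```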


The researchers suggest that their method could be used to develop a clinical decision support system for the early detection of heart disease. Such a system could help doctors to identify patients who are at high risk of heart disease and to provide them with preventive care.

The authors' method has several advantages over other machine learning methods for effective heart disease prediction. First, it is more accurate. Second, it is more robust to noise in the data. Third, it is more efficient to train and deploy.

The authors' method is still under development, but it has the potential to be a valuable tool for the early detection of heart disease. The authors plan to further evaluate their method on larger datasets and to explore ways to improve its accuracy.

In addition to the advantages mentioned by the authors, their method also has the following advantages:

The authors evaluated their method on a dataset of 13,000 patients. The dataset included information about the patients' age, sex, race, smoking status, blood pressure, cholesterol levels, and other medical history. The authors found that their method was able to predict heart disease with an accuracy of 87.2%.

In another study, by Bhatt, Patel, Ghetia, and Mazzero, which investigated the use of machine learning (ML) techniques to effectively predict heart disease in 2023, the researchers used a dataset of 1000 patients with heart disease and 1000 patients without heart disease. They used four different ML techniques: decision trees, support vector machines, random forests, and neural networks.

The researchers found that all four ML techniques were able to predict heart disease with a high degree of accuracy. The decision tree algorithm had the highest accuracy, followed by the support vector machines, random forests, and neural networks.

The researchers also found that the accuracy of the ML techniques was improved when they were used in combination with each other. For example, the decision tree algorithm combined with the support vector machines had the highest accuracy of all the models.

The study's findings suggest that ML techniques can be used as an effective tool for predicting heart disease. The researchers believe that these techniques could be used to develop early detection and prevention strategies for heart disease.

In addition to the findings mentioned above, the study also found that the following factors were associated with an increased risk of heart disease:

The study's findings highlight the importance of early detection and prevention of heart disease. By identifying people who are at risk for heart disease, we can take steps to prevent them from developing the disease.

The study is limited by its small sample size. However, the findings are promising and warrant further research. Future studies should be conducted with larger sample sizes to confirm the findings of this study.

Predicting heart disease using hybrid machine learning techniques is an evolving field with several challenges and promising future directions.

One of the primary challenges is obtaining high-quality and sufficiently large datasets for training hybrid models. This involves collecting diverse patient data, including clinical, genetic, and lifestyle factors. Choosing the most relevant features from a large pool is crucial. Hybrid techniques aim to combine different feature selection methods to enhance prediction accuracy.

Deciding which machine learning algorithms to use in hybrid models is critical. Researchers often experiment with various algorithms like random forest, K-nearest neighbor, and logistic regression to find the best combination. Interpreting hybrid model predictions can be challenging due to their complexity. Ensuring transparency and interpretability is essential for clinical acceptance.

The class distribution in heart disease datasets can be imbalanced, with fewer positive cases. Addressing this imbalance is vital for accurate predictions. Ensuring that hybrid models also generalize well to unseen data is a constant concern. Techniques like cross-validation and robust evaluation methods are crucial.

Future directions in effective heart disease prediction using hybrid machine learning techniques encompass several key areas.

A prominent trajectory in the field involves the customization of treatment plans based on individual patient profiles, a trend that continues to gain momentum. Hybrid machine learning models are poised to play a pivotal role in this endeavor by furnishing personalized risk assessments. This approach holds great promise for tailoring interventions to patients' unique needs and characteristics, potentially improving treatment outcomes.

The integration of multi-omics data, including genomics, proteomics, and metabolomics, with clinical information represents a compelling avenue for advancing effective heart disease prediction. By amalgamating these diverse data sources, hybrid model techniques can generate more accurate predictions. This holistic approach has the potential to provide deeper insights into the underlying mechanisms of heart disease and enhance predictive accuracy.

As the complexity of hybrid machine learning model techniques increases, ensuring that these models are interpretable and provide transparent explanations for their predictions becomes paramount. The development of hybrid models that offer interpretable explanations can significantly enhance their clinical utility. Healthcare professionals can better trust and utilize these models in decision-making processes, ultimately benefiting patient care.

Another promising direction involves the integration of real-time patient data streams with hybrid models. This approach enables continuous monitoring of patients, facilitating early detection and intervention in cases of heart disease. By leveraging real-time data, hybrid models can provide timely insights, potentially preventing adverse cardiac events and improving patient outcomes.

Collaboration stands as a cornerstone for future progress in effective heart disease prediction using hybrid machine learning techniques. Effective collaboration between medical experts, data scientists, and machine learning researchers is instrumental in driving innovation. Combining domain expertise with advanced computational methods can lead to breakthroughs in hybrid models accuracy and clinical applicability for heart disease prediction.

While heart disease prediction using hybrid machine learning techniques faces data, model complexity, and interpretability challenges, it holds promise for personalized medicine and improving patient outcomes through early detection and intervention. Collaboration and advancements in data collection and analysis methods will continue to shape the future of this field and perhaps humanity.



Electronic health records and stratified psychiatry: bridge to … – Nature.com

Development of an ML prediction model involves a multi-step process [11]. Briefly, labeled data are partitioned into training and test subsets. The data subsets undergo preprocessing to minimize the impact of dataset anomalies (e.g., missing values, outliers, redundant features) on the algorithm's learning process. The algorithm is applied to the training data, learning the relationship between the features and predictive target. Performance is typically evaluated via cross-validation to estimate the model's performance on new observations (internal validation). However, this only approximates a model's ability to generalize to unseen data. Prediction models must demonstrate the ability to generalize to independent datasets (external validation) [12]. Ideally, external validation should occur in a separate study by a different analytic team [13]. Clinical validation involves assessing a model's generalization to real world data as well as potential clinical utility and impact. Randomized cluster trials, for instance, evaluate groups of patients randomly assigned to receive care based on a model's prediction versus care-as-usual.

Few examples exist of predictive ML models advancing to clinical validation in psychiatry, indicative of a sizeable translational gap. Delgadillo et al. compared the efficacy and cost of stratified care versus stepped care for a psychological intervention for depression (n=951 patients) in a cluster randomized trial [14]. The investigators previously developed an ML prediction model to classify patients as standard or complex cases using self-reported measures and sociodemographic information extracted from clinical records (n=1512 patients) [15]. In the prospective trial, complex cases were matched to high-intensity treatment and standard cases to low-intensity treatment. Stratified care was associated with a 7% increase in the probability of improvement in depressive symptoms at a modest ~$140 increase in cost per patient [14].

What is driving this translational gap? Much of it may relate to challenges in generalizing models beyond their initial training data. There are no silver bullets in the development of ML prediction models and many potential pitfalls. The most common are overfitting and over-optimism due to insufficient training data, excess complexity, improper (or absent) cross-validation, and/or data leakage [16,17,18].

Most published ML studies in psychiatry suffer these methodological flaws [3,4,5]. Tornero-Costa et al. reviewed 153 ML applications in mental health and found only one study to be at low risk of bias by the Prediction model Risk Of Bias ASsessment Tool (PROBAST) criteria [3]. Approximately 37.3% of studies used a sample size of 150 or less to train models. Details on preprocessing were completely absent in 36.6% of studies and 47.7% lacked a description of data missingness. Only 13.7% of studies attempted external validation. Flaws in the analysis domain (e.g., attempts to control overfitting and optimism) contributed significantly to bias risk in most applications (90.8%). Furthermore, in 82.3% of the studies, data and developed model were not publicly accessible. Two other systematic reviews also found overall high risk of bias (>90%) among ML prediction studies, including poor reporting of preprocessing steps as well as low rates of internal and external validation [4, 5]. Meehan et al. additionally reported that only 22.7% of studies (of those meeting statistical standards) appropriately embedded feature selection within cross-validation to avoid data leakage [5].
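
To see why embedding feature selection within cross-validation matters, the following minimal sketch (not drawn from any of the reviewed studies) runs both procedures on pure noise: the leaky version, which selects features on the full dataset before cross-validation, typically reports well-above-chance accuracy, while the pipeline version stays near 0.5.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Pure-noise data: any apparent predictive skill is an artifact.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2000))
y = rng.integers(0, 2, size=100)

# Leaky: features chosen on the FULL dataset, so test-fold information
# influences feature selection before cross-validation ever starts.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
leaky = cross_val_score(LogisticRegression(max_iter=1000), X_leaky, y, cv=5).mean()

# Correct: selection is embedded in the pipeline and refit within each fold.
pipe = make_pipeline(SelectKBest(f_classif, k=20), LogisticRegression(max_iter=1000))
clean = cross_val_score(pipe, X, y, cv=5).mean()

print(f"leaky CV accuracy: {leaky:.2f}, leakage-free CV accuracy: {clean:.2f}")
```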

The precise degree to which published ML prediction models overestimate their ability to generalize is difficult to estimate. In the area of prognosis prediction, Rosen et al. assessed 22 published prediction models of transition to psychosis in individuals at clinical high-risk [19]. Models were assessed for external validation from a multisite, naturalistic study. Only two models demonstrated good (AUC >= 0.7) performance and 9 models failed to achieve better than chance (AUC = 0.5) prediction. None of the models outperformed the clinician raters (AUC = 0.75) [19].

The model development process is vulnerable to human inductive biases, which can inflate model performance estimates due to unintentional errors or deliberate gaming for publication [17, 20]. Performance scores have become inappropriately prioritized in peer review due to erroneous "higher = better" assumptions. Most studies employ a single algorithm without justifying its selection or compare multiple algorithms' performance on the same dataset, then select the best performing one (multiple testing issue) [17, 21]. Software packages like PyCaret (Python) offer the ability to screen the performance of a dozen or more algorithms on a dataset in a single step. This analytic flexibility creates risk, because even random data can be tuned to significance solely through manipulation of hyperparameters [17].

Methodological shortcomings offer only partial explanation for the observed translational gap. As the saying goes, "garbage in, garbage out." Low quality, small, or biased training data can generate unreliable models with poor generalization to new observations, or worse, make unfair predictions that adversely impact patients. Ideal ML training data is large, representative of the population of interest, complete (low missingness), balanced, and possesses accurate and consistent feature and predictive target labels or values (low noise). Per the systematic reviews above, these data quality criteria have often been neglected [3,4,5].

EHR data share many of the same quality issues impacting data collected explicitly for research, as well as some unique challenges that have deterred its use for ML in the past [22,23,24]. EHR data are highly heterogenous, encompassing both structured and unstructured elements. Structured data is collected through predefined fields (e.g., demographics, diagnoses, lab results, medications, sensor readings). Unstructured data is effectively everything else, including imaging and text. Extracting meaningful features from unstructured EHR data is non-trivial and often requires supervised and unsupervised ML techniques.

The quality of EHR data can vary by physician and clinical site. Quality challenges with EHR data that can adversely impact ML models for stratified psychiatry include:

EHR populations are non-random samples, which may create differences between the training data population and the target population [25]. Patients with more severe symptoms or treatment resistance may be frequently referred. Factors other than need for treatment (e.g., insurance status, referral, specialty clinics) can lead to systematic overrepresentation or underrepresentation of certain groups or disorders in the data. Marginalized populations, such as racial and ethnic minorities, for example, face barriers to accessing care and may be absent in the data [26]. When an algorithm trains on data that is not diverse, the certainty of the models predictions is questionable for unrepresented groups (high epistemic uncertainty) [27]. This may lead to unfair predictions (algorithmic bias) [28].

Missing data are common in EHRs. The impacts of missing data on model performance can be severe, especially when the data are missing not at random or missing at random but with a high proportion of missing values [29]. Furthermore, the frequency of records can vary substantially by patient. One individual may have multiple records in a period, others may have none [30]. Does absence of a diagnosis indicate true lack of a disorder or simply reflect that the patient received care elsewhere during a given interval? Structured self-reported patient outcome measures (e.g., psychometric measures) are often missing or incomplete [31].

Feature and target labels or values provide the ground truth for learning. Inaccuracies and missingness generate noise, which can hinder effective learning. The lineage of a given data element is important in considering its reliability and validity. For example, a patients diagnoses may be extracted from clinical notes, encounter/billing data, or problem lists (often not dated or updated) [32]. In some cases, the evaluating practitioner enters the encounter-associated diagnostic codes; in other instances, these are abstracted by a medical billing agent, creating uncertainty.

Imaging and sensor-based data may be collected using different acquisition parameters and equipment, leading to variability in measurements across EHRs and over time [33]. Data may be collected using different coding systems (e.g., DSM, ICD), the criteria for which also change over time. These issues can hinder external validation as well as contribute to data drift with the potential for deterioration in model performance [34].

When data are imbalanced, ML classification models may be more likely to predict the majority class, resulting in a high accuracy but low sensitivity or specificity for the minority class [35]. The consequences of data imbalance can be severe, particularly when the minority class is the most clinically relevant (e.g., patients with suicidal ideation who go on to attempt, adverse drug reactions).
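
A small sketch of that failure mode, using synthetic data with a 5% minority class: overall accuracy looks acceptable while sensitivity for the minority class collapses, and simple class reweighting recovers some of it. The remedy and numbers are illustrative assumptions, not recommendations for any specific clinical task.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic data with a rare (5%) positive class.
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.95, 0.05],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

for label, clf in [
    ("unweighted", LogisticRegression(max_iter=1000)),
    ("class_weight='balanced'", LogisticRegression(max_iter=1000, class_weight="balanced")),
]:
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    print(f"{label}: accuracy={accuracy_score(y_te, pred):.2f}, "
          f"minority-class sensitivity={recall_score(y_te, pred):.2f}")
```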

Patient records represent a sequence of events over time [36]. Diagnostic clarification may create conflicts (e.g., depression later revealed to be bipolar disorder), depending on the forward and lookback windows used to create a dataset. Failure to appropriately account for the longitudinal nature of a patients clinical course can contribute to data leakage. Temporal data leakage occurs when future information is inadvertently used to make predictions for past events (e.g., including a future co-morbidity when predicting response to past treatment). Feature leakage occurs when variables expose information about the prediction target.
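
One minimal guard against temporal leakage, sketched below with hypothetical encounter records, is to fix an index date and build features only from encounters that precede it, so that later information (such as a subsequent diagnostic revision to bipolar disorder) never enters the feature set.

```python
import pandas as pd

# Hypothetical longitudinal records: one row per patient encounter.
records = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2, 3],
    "date": pd.to_datetime(["2019-01-05", "2020-03-10", "2022-06-01",
                            "2020-02-14", "2022-09-30", "2020-11-20"]),
    "diagnosis": ["MDD", "MDD", "bipolar", "MDD", "MDD", "GAD"],
})

index_date = pd.Timestamp("2021-01-01")  # predictions are made as of this date

# Lookback window: only encounters strictly before the index date feed the
# features; patient 1's later bipolar diagnosis stays out of the training data.
history = records[records["date"] < index_date]
features = history.groupby("patient_id")["diagnosis"].agg(
    lambda s: s.value_counts().idxmax()
)
print(features)
```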

Empirical evidence indicates that preprocessing techniques can just as easily mitigate as exacerbate underlying data quality and bias issues. For example, missing data may be handled by complete case analysis (i.e., removal of observations with missing features) or imputation [37]. If data are not missing completely at random, deletion may eliminate key individuals [29]. Fernando et al. found that records containing missing data tended to be fairer than complete records and that their removal could contribute to algorithmic bias [38]. In the case of imputation, if the estimated values do not accurately represent the true underlying data, replacing missing values may inject error (e.g., imputing scores for psychometric scale items absent due to skip logic) and impact feature selection [39].

EHR data often require the creation of proxy features and outcomes to capture concepts (e.g., continuous prescription refills as an indicator of treatment effectiveness) or to reduce feature and label noise [40, 41]. No standards currently exist to guide such decisions or their reporting, creating high risk for bias. For example, if attempting to determine cannabis use when a patient was treated with a given antidepressant, one could check for a DSM/ICD diagnosis in their encounters or problem list, mine clinical notes to see whether use was endorsed/denied, or examine urine toxicology for positive/negative results. Each choice carries a different degree of uncertainty. Absence of evidence does not indicate evidence of absence [42], although studies often make that assumption.


Working with Undirected graphs in Machine Learning part2 – Medium

Author: Shyan Akmal, Virginia Vassilevska Williams, Ryan Williams, Zixuan Xu

Abstract: The k-Detour problem is a basic path-finding problem: given a graph G on n vertices, with specified nodes s and t, and a positive integer k, the goal is to determine if G has an s-t path of length exactly dist(s,t) + k, where dist(s,t) is the length of a shortest path from s to t. The k-Detour problem is NP-hard when k is part of the input, so researchers have sought efficient parameterized algorithms for this task, running in f(k) · poly(n) time, for f as slow-growing as possible. We present faster algorithms for k-Detour in undirected graphs, running in 1.853^k · poly(n) randomized and 4.082^k · poly(n) deterministic time. The previous fastest algorithms for this problem took 2.746^k · poly(n) randomized and 6.523^k · poly(n) deterministic time [Bezáková-Curticapean-Dell-Fomin, ICALP 2017]. Our algorithms use the fact that detecting a path of a given length in an undirected graph is easier if we are promised that the path belongs to what we call a bipartitioned subgraph, where the nodes are split into two parts and the path must satisfy constraints on those parts. Previously, this idea was used to obtain the fastest known algorithm for finding paths of length k in undirected graphs [Björklund-Husfeldt-Kaski-Koivisto, JCSS 2017]. Our work has direct implications for the k-Longest Detour problem: in this problem, we are given the same input as in k-Detour, but are now tasked with determining if G has an s-t path of length at least dist(s,t) + k. Our results for k-Detour imply that we can solve k-Longest Detour in 3.432^k · poly(n) randomized and 16.661^k · poly(n) deterministic time. The previous fastest algorithms for this problem took 7.539^k · poly(n) randomized and 42.549^k · poly(n) deterministic time [Fomin et al., STACS 2022].
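
To make the problem statement concrete, here is a small sketch that computes dist(s,t) by breadth-first search and then checks by brute force whether a simple s-t path of length exactly dist(s,t) + k exists in a toy graph. It only illustrates the definition; it bears no relation to the parameterized algorithms of the paper.

```python
from collections import deque
from itertools import permutations

# Tiny undirected graph as an adjacency list.
G = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}

def dist(g, s, t):
    """Shortest s-t path length via breadth-first search."""
    seen, q = {s: 0}, deque([s])
    while q:
        u = q.popleft()
        for v in g[u]:
            if v not in seen:
                seen[v] = seen[u] + 1
                q.append(v)
    return seen.get(t)

def has_detour(g, s, t, k):
    """Brute force: is there a simple s-t path of length exactly dist(s,t)+k?
    Exponential time; only meant to make the definition concrete."""
    target = dist(g, s, t) + k
    inner = [v for v in g if v not in (s, t)]
    for r in range(len(inner) + 1):
        for mid in permutations(inner, r):
            path = (s, *mid, t)
            if len(path) - 1 == target and all(b in g[a] for a, b in zip(path, path[1:])):
                return True
    return False

# Shortest 0-5 path has length 4; 0-1-2-3-4-5 is a detour with k = 1,
# and no simple path of length 6 (k = 2) exists in this graph.
print(dist(G, 0, 5), has_detour(G, 0, 5, 1), has_detour(G, 0, 5, 2))
```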

2. Learning Spanning Forests Optimally using CUT Queries in Weighted Undirected Graphs (arXiv)

Author: Hang Liao, Deeparnab Chakrabarty

Abstract: In this paper we describe a randomized algorithm which returns a maximal spanning forest of an unknown weighted undirected graph making O(n) CUT queries in expectation. For weighted graphs, this is optimal due to a result in [Auza and Lee, 2021] which shows an Ω(n) lower bound for zero-error randomized algorithms. To our knowledge, it is the only regime of this problem where we have upper and lower bounds tight up to constants. These questions have been extensively studied in the past few years, especially due to the problem's connections to symmetric submodular function minimization. We also describe a simple polynomial time deterministic algorithm that makes O(n log n log log n) queries on undirected unweighted graphs and returns a maximal spanning forest, thereby (slightly) improving upon the state-of-the-art.


Generative AI at an inflection point: What’s next for real-world … – VentureBeat


Generative AI is gaining wider adoption, particularly in business.

Most recently, for instance, Walmart announced that it is rolling out a gen AI app to 50,000 non-store employees. As reported by Axios, the app combines data from Walmart with third-party large language models (LLMs) and can help employees with a range of tasks, from speeding up the drafting process, to serving as a creative partner, to summarizing large documents and more.

Deployments such as this are helping to drive demand for graphical processing units (GPUs) needed to train powerful deep learning models. GPUs are specialized computing processors that execute programming instructions in parallel instead of sequentially as do traditional central processing units (CPUs).

According to the Wall Street Journal, training these models can cost companies billions of dollars, thanks to the large volumes of data they need to ingest and analyze. This includes all deep learning and foundational LLMs, from GPT-4 to LaMDA, which power the ChatGPT and Bard chatbot applications, respectively.


The gen AI trend is providing powerful momentum for Nvidia, the dominant supplier of these GPUs: The company announced eye-popping earnings for their most recent quarter. At least for Nvidia, it is a time of exuberance, as it seems nearly everyone is trying to get ahold of their GPUs.

Erin Griffiths wrote in the New York Times that start-ups and investors are taking extraordinary measures to obtain these chips: "More than money, engineering talent, hype or even profits, tech companies this year are desperate for GPUs."

In his Stratechery newsletter this week, Ben Thompson refers to this as "Nvidia on the Mountaintop." Adding to the momentum, Google and Nvidia announced a partnership whereby Google's cloud customers will have greater access to technology powered by Nvidia's GPUs. All of this points to the current scarcity of these chips in the face of surging demand.

Does this current demand mark the peak moment for gen AI, or might it instead point to the beginning of the next wave of its development?

Nvidia CEO Jensen Huang said on the company's most recent earnings call that this demand marks the dawn of accelerated computing. He added that it would be wise for companies to divert the capital investment from general purpose computing and focus it on generative AI and accelerated computing.

General purpose computing is a reference to CPUs that have been designed for a broad range of tasks, from spreadsheets to relational databases to ERP. Nvidia is arguing that CPUs are now legacy infrastructure, and that developers should instead optimize their code for GPUs to perform tasks more efficiently than traditional CPUs.

GPUs can execute many calculations simultaneously, making them perfectly suited for tasks like machine learning (ML), where millions of calculations are performed in parallel. GPUs are also particularly adept at certain types of mathematical calculations such as linear algebra and matrix manipulation tasks that are fundamental to deep learning and gen AI.
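
As a trivial illustration of that difference (assuming PyTorch is installed), the matrix multiplication below runs on a GPU when one is available and falls back to the CPU otherwise; on a GPU its many multiply-accumulate operations execute in parallel.

```python
import torch

# Arbitrary large matrices; sizes chosen only to make the parallelism visible.
device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)
c = a @ b  # thousands of multiply-accumulates execute in parallel on a GPU
print(device, c.shape)
```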

However, other classes of software (including most existing business applications), are optimized to run on CPUs and would see little benefit from the parallel instruction execution of GPUs.

Thompson appears to hold a similar view: "My interpretation of Huang's outlook is that all of these GPUs will be used for a lot of the same activities that are currently run on CPUs; that is certainly a bullish view for Nvidia, because it means the capacity overhang that may come from pursuing generative AI will be backfilled by current cloud computing workloads."

He continued: "That noted, I'm skeptical: Humans and companies are lazy, and not only are CPU-based applications easier to develop, they are also mostly already built. I have a hard time seeing what companies are going to go through the time and effort to port things that already run on CPUs to GPUs."

Matt Asay of InfoWorld reminds us that we have seen this before. When machine learning first arrived, data scientists applied it to everything, even when there were far simpler tools. As data scientist Noah Lorang once argued, "There is a very small subset of business problems that are best solved by machine learning; most of them just need good data and an understanding of what it means."

The point is, accelerated computing and GPUs are not the answer for every software need.

Nvidia had a great quarter, boosted by the current gold-rush to develop gen AI applications. The company is naturally ebullient as a result. However, as we have seen from the recent Gartner emerging technology hype cycle, gen AI is having a moment and is at the peak of inflated expectations.

According to Singularity University and XPRIZE founder Peter Diamandis, these expectations are about seeing future potential with few of the downsides. At that moment, hype starts to build an unfounded excitement and inflated expectations.

To this very point, we could soon reach the limits of the current gen AI boom. As venture capitalists Paul Kedrosky and Eric Norlin of SK Ventures wrote on their firm's Substack: "Our view is that we are at the tail end of the first wave of large language model-based AI. That wave started in 2017, with the release of the [Google] transformers paper ('Attention is All You Need'), and ends somewhere in the next year or two with the kinds of limits people are running up against."

Those limitations include the tendency toward hallucinations, inadequate training data in narrow fields, sunsetted training corpora from years ago, and myriad other reasons. They add: "Contrary to hyperbole, we are already at the tail end of the current wave of AI."

To be clear, Kedrosky and Norlin are not arguing that gen AI is at a dead-end. Instead, they believe there needs to be substantial technological improvement to achieve anything better than "so-so automation" and limited productivity growth. The next wave, they argue, will include new models, more open source, and notably ubiquitous and cheap GPUs, which, if correct, may not bode well for Nvidia but would benefit those needing the technology.

As Fortune noted, Amazon has made clear its intentions to directly challenge Nvidias dominant position in chip manufacturing. They are not alone, as numerous startups are also vying for market share as are chip stalwarts including AMD. Challenging a dominant incumbent is exceedingly difficult. In this case, at least, broadening sources for these chips and reducing prices of a scarce technology will be key to developing and disseminating the next wave of gen AI innovation.

The future for gen AI appears bright despite the existing limitations of the current generation of models and applications, even as it hits a peak of expectations. The reasons behind this promise are likely several, but perhaps foremost is a generational shortage of workers across the economy that will continue to drive the need for greater automation.

Although AI and automation have historically been viewed as separate, this point of view is changing with the advent of gen AI. The technology is increasingly becoming a driver for automation and resulting productivity. Workflow company Zapier co-founder Mike Knoop referred to this phenomenon on a recent Eye on AI podcast when he said: "AI and automation are mode collapsing into the same thing."

Certainly, McKinsey believes this. In a recent report, they stated: "Generative AI is poised to unleash the next wave of productivity." They are hardly alone. For example, Goldman Sachs stated that gen AI could raise global GDP by 7%.

Whether or not we are at the zenith of the current gen AI, it is clearly an area that will continue to evolve and catalyze debates across business. While the challenges are significant, so are the opportunities especially in a world hungry for innovation and efficiency. The race for GPU domination is but a snapshot in this unfolding narrative, a prologue to the future chapters of AI and computing.

Gary Grossman is senior VP of the technology practice at Edelman and global lead of the Edelman AI Center of Excellence.


CyberConnect Rejects Proposal CP-1 After CYBER Coin Dropped … – Business Blockchain HQ

CyberConnect, a blockchain project focused on a decentralized social protocol championing identity sovereignty for mass adoption and network effects, announced the rejection of its much-anticipated snapshot proposal CP-1. The official Twitter account of CyberConnect @CyberConnectHQ stated,

"There was a mistake in the snapshot proposal CP-1 and so it was rejected. The intended usage of Community Treasury for providing liquidity was 1,088,000 CYBER which was unlocked already."

What The Proposal CP-1 Says

The primary objective of proposal CP-1 was to enhance the liquidity of the CYBER token across Ethereum (ETH), Binance Smart Chain (BSC), and Optimism networks. It states:

"To optimize CYBER liquidity across ETH, BSC, Optimism networks, we propose a series of active balancing strategies for CYBER token on these networks."

The proposal outlined a series of active balancing strategies, including:

Deployment of Bridges: The plan was to deploy CYBER-ETH, CYBER-BSC, and CYBER-OP bridges, powered by LayerZero's ProxyOFT. This would have allowed users to bridge CYBER tokens from any chain to another via Stargate.

Utilization of Community Treasury: The CyberConnect foundation intended to use 1,088,000 unlocked CYBER tokens from the Community Treasury to provide liquidity for these bridges. The foundation aimed to maintain 25,000 CYBER in each of the CYBER-ETH, CYBER-BSC, and CYBER-OP bridges.

Supply Maintenance: In case of liquidity issues on any of the networks, the foundation would burn and mint CYBER tokens across chains to maintain a balanced supply. The total supply of CYBER tokens across all chains would remain constant at 100,000,000.

Market Reaction: A 40% Drop in CYBER Token Value

Following the announcement of the proposal, the value of CyberConnect's native token, CYBER, experienced a sharp decline, plummeting over 40% within hours. This raised concerns among traders that the CyberConnect project owners and early investors would like to dump CYBER. However, upon the subsequent rejection of the proposal, the CYBER token experienced a slight rebound in price.



Introduction to TLifeCoin: Redefining Finance in a Digital Era – webindia123

ATK

New Delhi [India], September 2: TLifeCoin operates on the Binance Smart Chain (BSC), ensuring secure and decentralized transactions. With a focus on financial inclusivity, TLifeCoin provides users with a versatile digital asset for seamless transactions, borderless payments, and participation in a thriving ecosystem.

Roadmap to Success: TLifeCoin's Vision and Achievements

TLifeCoin's roadmap outlines a transformative journey:

1. Blockchain Advancements: TLifeCoin is dedicated to enhancing its presence on the Binance Smart Chain (BSC) for even faster, secure, and transparent transactions.

2. Tallwin Exchange Launch: Our efficient trading platform on BSC is set to launch, providing users a robust marketplace for digital assets.

3. Tallwin Metaverse Integration: TLifeCoin is stepping into the virtual universe on BSC, offering immersive experiences within a decentralized metaverse.

4. NFT Marketplace Debut: We're introducing a vibrant NFT marketplace on BSC, enabling users to create, buy, and trade unique digital assets.

5. Gaming Zone Excitement: Engage in innovative gaming experiences on BSC fueled by TLifeCoin's integration.

6. Digital Payment Revolution: TLifeCoin's digital payment solutions on BSC are bridging convenience and innovation.

Listing Milestones: TLifeCoin on Coinsbit and Azbit

TLifeCoin's journey is marked by significant milestones:

Coinsbit Listing: TLifeCoin was listed on Coinsbit on August 9, 2023. Deposits, trading, and withdrawals are available for the TLIFE/USDT trading pair.

Azbit Listing: TLifeCoin is also listed on Azbit Exchange, extending its reach and providing users with new avenues for participation.

Security Validation: Cyberscope & Hacken Audit Achievement

TLifeCoin's commitment to security is validated by its recent audits by Cyberscope and Hacken. These audits further solidify our dedication to providing a safe and secure environment for users' assets.

Tokenomics: A Sustainable Ecosystem

TLifeCoin's tokenomics ensure a balanced and sustainable ecosystem on the Binance Smart Chain (BSC):

- Total Supply: 21,000,000 TLIFE

- Staking Rewards: 60%

- Team: 10%

- Marketing & Development: 10%

- Treasury: 10%

- For Staking: 10%

A Visionary Future on Binance Smart Chain (BSC): TLifeCoin's Commitment

With its visionary roadmap, impressive listing achievements, robust tokenomics, and security validation through the Cyberscope audit, TLifeCoin is poised to redefine the future of finance on the Binance Smart Chain (BSC).

For more information about TLifeCoin, visit and explore our social channels:

Twitter: @TLifeCoin

Telegram: @tlife_coin

Medium: @tlifecoin

TLifeCoin is a secure and transparent cryptocurrency project focused on revolutionizing the digital finance landscape on the Binance Smart Chain (BSC). With innovative solutions, a commitment to empowerment, and a robust ecosystem on BSC, TLifeCoin aims to reshape the future of finance for individuals worldwide.

(Disclaimer: The above press release has been provided by ATK. ANI will not be responsible in any way for the content of the same)
