
Creation of a machine learning-based prognostic prediction model for various subtypes of laryngeal cancer | Scientific … – Nature.com

Data collection

The study's data were gathered from the Surveillance, Epidemiology, and End Results (SEER) database (Incidence - SEER Research Plus Data, 17 Registries, Nov 2021 Sub). Using the SEER*Stat software (version 8.4.2), we retrieved individuals who had been diagnosed with laryngeal carcinoma according to the third edition of the International Classification of Diseases for Oncology (ICD-O-3). The time frame covers cases diagnosed between 2000 and 2019. The inclusion requirements were as follows: the behavior was identified as malignant, and the site and morphology codes corresponded to "larynx".

In total, 54,613 patients with primary laryngeal malignant tumors were included. The median follow-up duration of the sample in this study is 38 months. We used the following exclusion criteria to clean up the data: (1) patients with incomplete follow-up information; (2) patients without T stage (AJCC 7), N stage (AJCC 7), M stage (AJCC 7), or AJCC stage grade information.

Based on clinical experience, we selected variables directly relevant to the clinic, such as age, race, and gender. We chose T stage, N stage, M stage, AJCC stage (7th edition), tumor size, and pathological classification to assess the patient's disease status. Finally, to evaluate the patient's treatment plans, we also included radiation therapy, surgery, and chemotherapy.

A classic model for survival analysis, the Cox proportional hazards (CoxPH) model has been the most commonly applied multifactor analysis technique in survival analysis to date [18,19].

CoxPH is a statistical technique for survival analysis, mainly used to study the relationship between survival time and one or more predictors. The core of the model is the proportional hazards assumption.

It is expressed as h(t|x) = h0(t)·exp(βx), where h(t|x) is the instantaneous hazard function given the covariates x, h0(t) is the baseline hazard function, and exp(βx) represents the multiplicative effect of the covariates on the hazard, with β the vector of regression coefficients.
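The proportional hazards relation above can be sketched numerically. In the minimal illustration below, the coefficients β and the constant baseline hazard are hypothetical values (not fitted from the paper's data); the point is that the hazard ratio between two patients does not depend on t:

```python
import numpy as np

# Minimal sketch of the Cox proportional hazards relation
# h(t|x) = h0(t) * exp(beta . x).  The baseline hazard h0 and the
# coefficients beta below are illustrative, not fitted values.

def hazard(t, x, beta, h0=lambda t: 0.01):
    """Instantaneous hazard for covariates x at time t."""
    return h0(t) * np.exp(np.dot(beta, x))

beta = np.array([0.5, -0.3])   # hypothetical coefficients
x_a = np.array([1.0, 0.0])     # e.g. a patient with one risk factor
x_b = np.array([0.0, 0.0])     # reference patient

# The hazard ratio between two patients is the same at every t --
# this is exactly the proportional hazards assumption.
hr_t1 = hazard(1.0, x_a, beta) / hazard(1.0, x_b, beta)
hr_t5 = hazard(5.0, x_a, beta) / hazard(5.0, x_b, beta)
print(round(hr_t1, 4), round(hr_t5, 4))  # both equal exp(0.5) ~ 1.6487
```
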

The random survival forest (RSF) model is a highly efficient ensemble learning model, made up of numerous decision trees, that can handle complex relationships in the data [20].

RSF can improve the accuracy and robustness of the prediction, but it has no single closed-form expression because it is an ensemble of multiple decision trees [21]. RSF constructs 1000 trees and calculates the importance of variables. To find the optimal model parameters, we tune three key parameters: the maximum number of features per tree (mtry), the minimum sample size of each node (nodesize), and the maximum depth of the tree (nodedepth). These parameters are searched over mtry from 1 to 10, nodesize from 3 to 30, and nodedepth from 3 to 6. We use a random search strategy to optimize the parameters. To evaluate the performance of the model under different parameter configurations, we use tenfold cross-validation with the concordance index (C-index) as the evaluation metric. The purpose of this process is to find, through many iterations, the parameter configuration that maximizes the prediction accuracy of the model.
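The tuning loop described above can be sketched as follows. This is an illustrative skeleton only: a stand-in scoring function replaces the actual RSF fit with tenfold cross-validation, and Harrell's C-index is implemented in pure Python:

```python
import random

# Sketch of the random-search tuning loop: sample mtry, nodesize and
# nodedepth in the stated ranges and keep the configuration with the best
# concordance index (C-index).  The RSF fit itself is replaced by a
# stand-in scorer, since the point here is the search logic.

def c_index(times, events, risks):
    """Harrell's C: fraction of comparable pairs ordered correctly.
    A pair (i, j) is comparable if the subject with the shorter time had
    an observed event; higher risk should mean shorter survival."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

def toy_score(params):
    # Stand-in for "fit RSF with params, return cross-validated C-index".
    t, e = [5, 3, 8, 1], [1, 1, 0, 1]
    risks = [1.0 / (ti + params["mtry"]) for ti in t]
    return c_index(t, e, risks)

random.seed(0)
best = max(
    ({"mtry": random.randint(1, 10),
      "nodesize": random.randint(3, 30),
      "nodedepth": random.randint(3, 6)} for _ in range(50)),
    key=toy_score)
print(best)
```
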

The gradient boosting machine (GBM) model belongs to the family of ensemble learning methods called boosting; it constructs a strong prediction model by combining several weak prediction models (usually decision trees). At each step, GBM adds a new weak learner by minimizing the loss function: the newly added model is trained to reduce the residual from the previous step, with the direction determined by gradient descent. It can be expressed as Fm+1(x) = Fm(x) + νm·hm(x), where hm(x) is the newly added weak model and νm is the learning rate.
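A toy version of this boosting loop, assuming squared loss and one-split decision stumps as the weak learners (the data and parameters are illustrative, not the paper's setup):

```python
import numpy as np

# Toy illustration of the boosting update F_{m+1}(x) = F_m(x) + nu * h_m(x)
# for squared loss, where each weak learner h_m is a one-split decision
# stump fitted to the current residuals.  Purely didactic -- GBM/XGBoost
# add shrinkage schedules, subsampling and regularization on top.

def fit_stump(x, r):
    """Best single-threshold split of x minimizing squared error on r."""
    best = None
    for thr in x:
        left, right = r[x <= thr], r[x > thr]
        if len(left) == 0 or len(right) == 0:
            continue
        err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or err < best[0]:
            best = (err, thr, left.mean(), right.mean())
    _, thr, lv, rv = best
    return lambda q: np.where(q <= thr, lv, rv)

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = np.sin(x) + rng.normal(0, 0.1, 200)

nu, F = 0.3, np.full_like(y, y.mean())   # learning rate, initial model
losses = []
for m in range(50):
    h = fit_stump(x, y - F)              # fit weak learner to residuals
    F = F + nu * h(x)                    # boosting update
    losses.append(((y - F) ** 2).mean())

print(losses[0] > losses[-1])  # True: training loss decreases
```
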

XGBoost is an efficient implementation of GBM, optimized in particular for computing speed and efficiency. To reuse the best-performing learners, it linearly combines the base learners with various weights [22]. eXtreme Gradient Boosting (XGBoost) is an optimization of the gradient boosting decision tree (GBDT) that improves the algorithm's speed and effectiveness [23]. DeepSurv, a neural-network-based survival model, outperforms the conventional linear survival model in terms of performance [24]. DeepSurv uses a deep neural network to generalize the Cox proportional hazards model; it can therefore be expressed as h(t|x) = h0(t)·exp(g(x)), where g(x), the output of the neural network, replaces the linear combination of the covariates x [8].

To accommodate the variable screening technique suited to each model, we treat the five models separately: variables for the RSF, GBM, and XGBoost models are screened using least absolute shrinkage and selection operator (LASSO) regression analysis, while the CoxPH model uses traditional univariate and multivariate Cox regression analysis [25,26,27].

In contrast, the DeepSurv model can automatically extract features and handle high-dimensional data and nonlinear relationships, so variable screening is not necessary [28]. To further demonstrate the model's reliability, we randomly split the data set into a combined training/validation set and a test set in a 9:1 ratio using SPSS (version 26); the randomly selected 10% serves as external validation. The training/validation data are then divided again in a 7:3 ratio into training and validation sets, and for both splits the log-rank test is used to evaluate any differences between the two cohorts. After the variables have been filtered as described above, the mlr3 package of R (version 4.2.2) applies a grid search to fine-tune the hyperparameters of the RSF, GBM, and XGBoost models on the validation set and selects the best hyperparameters to build the survival model. Finally, the DeepSurv model is constructed using the Python (version 3.9) sksurv package and is likewise optimized by grid search.
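The two-stage split can be sketched in a few lines. The paper performed this step in SPSS, so the seeded shuffle below is only an equivalent illustration:

```python
import random

# Sketch of the two-stage split: hold out 10% as an external test set,
# then divide the remainder 7:3 into training and validation sets.

def two_stage_split(ids, seed=42):
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    n_test = round(0.1 * len(ids))
    test, rest = ids[:n_test], ids[n_test:]
    n_train = round(0.7 * len(rest))
    return rest[:n_train], rest[n_train:], test

# Applied to the cohort size reported above (54,613 patients)
train, val, test = two_stage_split(range(54613))
print(len(train), len(val), len(test))  # 34406 14746 5461
```
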

We used the integrated Brier score (IBS), evaluated at the 1-year, 3-year, and 5-year time points, as the main assessment metric for the prediction performance of the model on the test set. In addition, calibration curves are drawn, and the conventional time-dependent receiver operating characteristic (ROC) curves and areas under the curve (AUC) at 1, 3, and 5 years are compared. Decision curve analysis (DCA), a clinical evaluation method for prediction models, incorporates the preferences of patients or decision-makers by calculating the clinical net benefit, addressing the actual needs of clinical decisions. The contribution of the various clinicopathological characteristics to prognosis is also calculated. We visualized the survival contribution of several clinicopathological characteristics at 1, 3, and 5 years using Shapley additive explanations (SHAP) plots.
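A minimal sketch of the Brier score at a single horizon, which the IBS integrates over time. The inverse-probability-of-censoring weighting used in the full definition is omitted here, and all patient values are hypothetical:

```python
# Minimal sketch of the Brier score at a fixed horizon t: the mean squared
# difference between the predicted survival probability S(t|x) and the
# observed status (1 if still alive at t, else 0).  The integrated Brier
# score (IBS) averages this over time and, in full generality, adds
# inverse-probability-of-censoring weights, which are omitted here.

def brier_at(t, surv_probs, event_times, events):
    total, n = 0.0, 0
    for p, T, d in zip(surv_probs, event_times, events):
        if T > t:                 # known to be alive at t
            total += (1.0 - p) ** 2; n += 1
        elif d:                   # event observed before t
            total += (0.0 - p) ** 2; n += 1
        # censored before t: skipped in this unweighted sketch
    return total / n

# Predictions for 4 patients at t = 36 months (all values hypothetical)
probs  = [0.9, 0.2, 0.7, 0.4]
times  = [60, 12, 48, 20]
events = [0, 1, 0, 1]
print(brier_at(36, probs, times, events))  # 0.075
```

Lower values are better; 0.25 corresponds to an uninformative constant prediction of 0.5.
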

Clinically speaking, different individuals require personalized care, so it is crucial to estimate the survival probability of an individual patient. The survival probability of a given patient is predicted using the ggh4x package of R (version 4.2.2), along with the contribution of several clinicopathological characteristics to survival. This has major implications for clinical work.


Optimize price-performance of LLM inference on NVIDIA GPUs using the Amazon SageMaker integration with NVIDIA … – AWS Blog

NVIDIA NIM microservices now integrate with Amazon SageMaker, allowing you to deploy industry-leading large language models (LLMs) and optimize model performance and cost. You can deploy state-of-the-art LLMs in minutes instead of days using technologies such as NVIDIA TensorRT, NVIDIA TensorRT-LLM, and NVIDIA Triton Inference Server on NVIDIA accelerated instances hosted by SageMaker.

NIM, part of the NVIDIA AI Enterprise software platform listed on AWS Marketplace, is a set of inference microservices that bring the power of state-of-the-art LLMs to your applications, providing natural language processing (NLP) and understanding capabilities, whether you're developing chatbots, summarizing documents, or implementing other NLP-powered applications. You can use pre-built NVIDIA containers to host popular LLMs that are optimized for specific NVIDIA GPUs for quick deployment or use NIM tools to create your own containers.

In this post, we provide a high-level introduction to NIM and show how you can use it with SageMaker.

NIM provides optimized and pre-generated engines for a variety of popular models for inference. These microservices support a variety of LLMs, such as Llama 2 (7B, 13B, and 70B), Mistral-7B-Instruct, Mixtral-8x7B, NVIDIA Nemotron-3 22B Persona, and Code Llama 70B, out of the box using pre-built NVIDIA TensorRT engines tailored for specific NVIDIA GPUs for maximum performance and utilization. These models are curated with the optimal hyperparameters for model-hosting performance for deploying applications with ease.

If your model is not in NVIDIA's set of curated models, NIM offers essential utilities such as the Model Repo Generator, which facilitates the creation of a TensorRT-LLM-accelerated engine and a NIM-format model directory through a straightforward YAML file. Furthermore, an integrated community backend of vLLM provides support for cutting-edge models and emerging features that may not have been seamlessly integrated into the TensorRT-LLM-optimized stack.

In addition to creating optimized LLMs for inference, NIM provides advanced hosting technologies such as optimized scheduling techniques like in-flight batching, which can break down the overall text generation process for an LLM into multiple iterations on the model. With in-flight batching, rather than waiting for the whole batch to finish before moving on to the next set of requests, the NIM runtime immediately evicts finished sequences from the batch. The runtime then begins running new requests while other requests are still in flight, making the best use of your compute instances and GPUs.
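The scheduling idea can be illustrated with a toy simulation: finished sequences are evicted each iteration and queued requests immediately take their slots. This sketches only the batching logic, not NIM's actual runtime:

```python
from collections import deque

# Toy simulation of in-flight (continuous) batching: rather than waiting
# for the whole batch to finish, finished sequences are evicted every
# iteration and queued requests take their slots immediately.
# Purely illustrative of the scheduling idea.

def continuous_batching(request_lengths, max_batch=4):
    queue = deque(enumerate(request_lengths))  # (request id, tokens left)
    batch, steps, finished = [], 0, []
    while queue or batch:
        while queue and len(batch) < max_batch:   # admit new requests
            batch.append(list(queue.popleft()))
        steps += 1
        for req in batch:
            req[1] -= 1                           # generate one token each
        done = [r for r in batch if r[1] == 0]
        for r in done:                            # evict finished sequences
            finished.append(r[0])
            batch.remove(r)
    return steps, finished

# Five requests needing 2, 5, 1, 3 and 2 more tokens respectively
steps, order = continuous_batching([2, 5, 1, 3, 2])
print(steps, order)  # 5 [2, 0, 3, 4, 1]
```

With static batching, the first batch of four would block until its longest member (5 tokens) finished; here the fifth request starts as soon as a slot frees up, so all 13 tokens are generated in 5 steps.
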

NIM integrates with SageMaker, allowing you to host your LLMs with performance and cost optimization while benefiting from the capabilities of SageMaker. When you use NIM on SageMaker, you can use capabilities such as scaling out the number of instances to host your model, performing blue/green deployments, and evaluating workloads using shadow testing, all with best-in-class observability and monitoring with Amazon CloudWatch.

Using NIM to deploy optimized LLMs can be a great option for both performance and cost, and it helps make deploying LLMs effortless. In the future, NIM will also allow for Parameter-Efficient Fine-Tuning (PEFT) customization methods like LoRA and P-tuning. NIM also plans to broaden its LLM support through Triton Inference Server, TensorRT-LLM, and vLLM backends.

We encourage you to learn more about NVIDIA microservices and how to deploy your LLMs using SageMaker and try out the benefits available to you. NIM is available as a paid offering as part of the NVIDIA AI Enterprise software subscription available on AWS Marketplace.

In the near future, we will post an in-depth guide for NIM on SageMaker.

James Park is a Solutions Architect at Amazon Web Services. He works with Amazon.com to design, build, and deploy technology solutions on AWS, and has a particular interest in AI and machine learning. In his spare time he enjoys seeking out new cultures, new experiences, and staying up to date with the latest technology trends. You can find him on LinkedIn.

Saurabh Trikande is a Senior Product Manager for Amazon SageMaker Inference. He is passionate about working with customers and is motivated by the goal of democratizing machine learning. He focuses on core challenges related to deploying complex ML applications, multi-tenant ML models, cost optimizations, and making deployment of deep learning models more accessible. In his spare time, Saurabh enjoys hiking, learning about innovative technologies, following TechCrunch, and spending time with his family.

Qing Lan is a Software Development Engineer at AWS. He has worked on several challenging products at Amazon, including high-performance ML inference solutions and a high-performance logging system. Qing's team successfully launched the first billion-parameter model in Amazon Advertising under very low latency requirements. Qing has in-depth knowledge of infrastructure optimization and deep learning acceleration.

Nikhil Kulkarni is a software developer with AWS Machine Learning, focusing on making machine learning workloads more performant on the cloud, and is a co-creator of AWS Deep Learning Containers for training and inference. He's passionate about distributed deep learning systems. Outside of work, he enjoys reading books, fiddling with the guitar, and making pizza.

Harish Tummalacherla is a Software Engineer with the Deep Learning Performance team at SageMaker. He works on performance engineering for serving large language models efficiently on SageMaker. In his spare time, he enjoys running, cycling and ski mountaineering.

Eliuth Triana Isaza is a Developer Relations Manager at NVIDIA, empowering Amazon's AI MLOps, DevOps, scientists, and AWS technical experts to master the NVIDIA computing stack for accelerating and optimizing generative AI foundation models, spanning data curation, GPU training, model inference, and production deployment on AWS GPU instances. In addition, Eliuth is a passionate mountain biker, skier, tennis and poker player.

Jiahong Liu is a Solutions Architect on the Cloud Service Provider team at NVIDIA. He assists clients in adopting machine learning and AI solutions that leverage NVIDIA accelerated computing to address their training and inference challenges. In his leisure time, he enjoys origami, DIY projects, and playing basketball.

Kshitiz Gupta is a Solutions Architect at NVIDIA. He enjoys educating cloud customers about the GPU AI technologies NVIDIA has to offer and assisting them with accelerating their machine learning and deep learning applications. Outside of work, he enjoys running, hiking and wildlife watching.


Applying machine learning algorithms to predict the stock price trend in the stock market The case of Vietnam … – Nature.com

Foundation theory

When discussing the stock market, with its inherent complexity, the predictability of stock returns has always been a subject of debate that attracts much research. Fama (1970) postulated the efficient market hypothesis, which holds that the current price of an asset always and immediately reflects all prior information available about it. In addition, the random walk hypothesis states that a stock's price changes independently of its history; in other words, tomorrow's price will depend only on tomorrow's information, regardless of today's price (Burton, 2018). These two hypotheses imply that there is no means of accurately predicting stock prices.

On the other hand, other authors argue that stock prices can in fact be predicted, at least to some extent, and a variety of methods for predicting and modeling stock behavior have been the subject of research in many different disciplines, such as economics, statistics, physics, and computer science (Lo and MacKinlay, 1999).

A popular method for modeling and predicting the stock market is technical analysis, a method based on historical data from the market, primarily price and volume. Technical analysis follows several assumptions: (1) prices are determined exclusively by supply and demand relationships; (2) prices change with the trend; (3) changes in supply and demand cause the trend to reverse; (4) changes in supply and demand can be identified on the chart; and (5) patterns on the chart tend to repeat. In other words, technical analysis does not take into account any external factors such as political, social, or macroeconomic ones (Kirkpatrick & Dahlquist, 2010). Research by Biondo et al. (2013) shows that short-term trading strategies based on technical analysis indicators, such as the moving average convergence divergence (MACD) and the relative strength index (RSI), can work better than some traditional methods.

Technical analysis is a well-established method of forecasting future market trends by generating buy or sell signals based on specific information obtained from prices. Its popularity and continued application have become widely recognized, with techniques for uncovering hidden patterns ranging from the very rudimentary analysis of moving averages to the recognition of rather complex time series patterns. Brock et al. (1992) show that simple trading rules based on the movement of short-term and long-term moving average returns have significant predictive power on more than a century of daily data for the Dow Jones Industrial Average. Fifield et al. (2005) went on to investigate the predictive power of the filter rule and the moving average oscillator rule in 11 European stock markets, covering the period from January 1991 to December 2000. Their key findings indicate that four emerging markets (Greece, Hungary, Portugal, and Turkey) are informationally inefficient compared with the seven more advanced markets. Past empirical results support technical analysis (Fifield et al. 2005); however, such evidence can be criticized because of data bias (Brock et al. 1992).
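A moving-average trading rule of the kind studied by Brock et al. (1992) can be sketched in a few lines; the window lengths and price series below are illustrative only:

```python
# Sketch of a simple moving-average crossover rule: signal "buy" when the
# short-term average crosses above the long-term average, "sell" when it
# crosses below.  Window lengths and prices here are illustrative only.

def moving_average(prices, window):
    return [sum(prices[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(prices))]

def crossover_signals(prices, short=3, long=5):
    s, l = moving_average(prices, short), moving_average(prices, long)
    s = s[len(s) - len(l):]              # align the two series
    signals = []
    for prev_s, prev_l, cur_s, cur_l in zip(s, l, s[1:], l[1:]):
        if prev_s <= prev_l and cur_s > cur_l:
            signals.append("buy")
        elif prev_s >= prev_l and cur_s < cur_l:
            signals.append("sell")
        else:
            signals.append("hold")
    return signals

prices = [10, 10, 10, 10, 10, 11, 12, 13, 12, 11, 10, 9]
print(crossover_signals(prices))  # buy on the rally, sell on the decline
```
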

Elman (1990) proposed the Recurrent Neural Network (RNN). Fundamentally, the RNN addresses the problem of processing sequence data, such as text, voice, and video. There is a sequential relationship between samples of this data type, and each sample is associated with the sample that precedes it. For example, in text, a word is related to the word before it; in meteorological data, the temperature of one day is linked to the temperatures of the previous few days. A set of observations is defined as a sequence, from which multiple sequences can be observed. This feature of the RNN algorithm is very well suited to the properties of time series data in stock analysis, as shown in Fig. 1.

Source: Lai et al. (2019).

Figure 1 shows the structure of an RNN, in which the output of the hidden layer is stored in memory; memory can be thought of as another input. The main reason RNN training is difficult is the repeated application of the hidden-layer weight matrix W. Because error propagation in the RNN reuses the same W at every step, its value is multiplied repeatedly during both forward and backward propagation. (1) The gradient vanishing problem: when the gradient is small, shrinking exponentially, it has almost no effect on the output. (2) The gradient exploding problem: conversely, if the gradient is large, exponential multiplication leads to a gradient explosion. This problem exists in any deep neural network, of course, but it is especially evident in the recursive structure of the RNN. Furthermore, RNNs differ from traditional feedforward networks in that their neural connections do not run in one direction only; in other words, neurons can transmit data to a previous layer or to the same layer. Not passing information in a single direction is a practical feature that provides short-term memory, in addition to the long-term memory that neural networks acquire through training.
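The exponential shrinkage and growth described above can be shown with a one-line recurrence; the scalar weight w below stands in for (the spectral norm of) the recurrent weight matrix:

```python
# Numerical illustration of the vanishing/exploding gradient problem:
# backpropagating through T timesteps multiplies the gradient by the
# recurrent weight once per step, so a scalar weight w < 1 drives it
# toward zero and w > 1 blows it up.

def gradient_magnitude(w, timesteps):
    grad = 1.0
    for _ in range(timesteps):
        grad *= w          # one factor of w per unrolled step
    return grad

print(gradient_magnitude(0.9, 100))   # ~2.66e-05: vanishing
print(gradient_magnitude(1.1, 100))   # ~1.38e+04: exploding
```
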

The Long Short-Term Memory (LSTM) algorithm, introduced by Hochreiter and Schmidhuber (1997), aims to provide better performance by solving the gradient vanishing problem that recurrent networks suffer when dealing with long sequences of data. In the LSTM, each neuron is a memory cell that connects previous information to the current task. An LSTM network is a special type of RNN. The LSTM can capture the error and carry it back through the layers over time; it keeps the error below a certain constant maximum, so the LSTM network can be trained over long time spans, opening the door to tuning the parameters of the algorithm (Liu et al. 2018). The LSTM is a special network topology with three gate structures (shown in Fig. 2). Three gates are placed in an LSTM unit, called the input, forget, and output gates. As information enters the LSTM network, it is selected according to the rules: only information that matches the algorithm is forwarded, and information that does not match is forgotten through the forget gate.
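The three-gate structure described above can be sketched as a scalar LSTM cell; real implementations use weight matrices and biases, and the weights below are arbitrary illustrative values:

```python
import math

# Minimal LSTM cell following the gate structure described above (input,
# forget, and output gates around a memory cell).  Scalar weights for
# readability; real layers use matrices.  All weight values are arbitrary.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w):
    i = sigmoid(w["i"] * x + w["ri"] * h_prev)    # input gate
    f = sigmoid(w["f"] * x + w["rf"] * h_prev)    # forget gate
    o = sigmoid(w["o"] * x + w["ro"] * h_prev)    # output gate
    g = math.tanh(w["g"] * x + w["rg"] * h_prev)  # candidate memory
    c = f * c_prev + i * g                        # cell state update
    h = o * math.tanh(c)                          # hidden state output
    return h, c

w = {"i": 0.5, "ri": 0.1, "f": 0.9, "rf": 0.1,
     "o": 0.8, "ro": 0.1, "g": 1.0, "rg": 0.1}
h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.2]:                        # a short input sequence
    h, c = lstm_step(x, h, c, w)
print(-1.0 < h < 1.0)  # True: hidden state is bounded by tanh
```

The forget gate f scales the previous cell state c_prev, which is the mechanism that lets the error signal persist across many steps instead of being multiplied away.
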

Source: Ding et al. (2015).

This gate-based architecture allows information to be selectively forwarded to the next unit according to the activation function of the LSTM network. LSTM networks are widely used and have achieved positive results compared with other methods (Graves, 2012), especially in natural language processing and particularly in handwriting recognition (Graves et al. 2008). The LSTM algorithm has branched out into a number of variations, but compared to the original they do not seem to have made any significant improvements to date (Greff et al. 2016).

Data on the stock market is very large and non-linear in nature. To model this type of data, it is necessary to use models that can analyze the patterns on the chart. Deep learning algorithms are capable of identifying and exploiting information hidden within data through the process of self-learning. Unlike other algorithms, deep learning models can model this type of data efficiently (Agrawal et al. 2019).

Research studies analyzing financial time series data with neural network models use many different types of input variables to predict stock returns. In some studies, the input data used to build the model include only a single time series (Jia, 2016); others include both indicators reflecting market information and macroeconomic variables (White, 1988). In addition, there are many variations in the application of neural network models to time series analysis: Ding et al. (2015) combine financial time series analysis with natural language processing, while Roman and Jameel (1996) and Heaton et al. (2016) use deep learning architectures to model multivariate financial time series. The study of Chan et al. (2000) introduces a neural network model using technical analysis variables to predict the Shanghai stock market, comparing the performance of two algorithms and two different weight initialization methods. The results show that the efficiency of back-propagation can be increased by conjugate gradient learning with multiple linear regression weight initialization.

Given the suitability and high performance of the recurrent neural network (RNN) model, much research has been done on applying RNNs to stock analysis and forecasting. Roman and Jameel (1996) used backpropagation networks and RNNs to predict stock indexes for five different stock markets. Saad, Prokhorov, and Wunsch (1998) applied time-delay, recurrent, and probabilistic neural network models to predict daily stock data. Hegazy et al. (2014) applied machine learning algorithms such as PSO and LS-SVM to forecast the S&P 500 stock market. With the advent of the LSTM, the analysis of time-dependent data became more efficient. The LSTM algorithm has the ability to store historical information and is widely used in stock price prediction (Heaton et al. 2016).

For stock price prediction, LSTM network performance has been greatly appreciated when combined with NLP, which uses news text data as input to predict price trends. In addition, there are also a number of studies that use price data to predict price movements (Chen et al. 2015), using historical price data in addition to stock indices to predict whether stock prices will increase, decrease or stay the same during the day (Di Persio and Honchar, 2016), or compare the performance of the LSTM with its own proposed method based on a combination of different algorithms (Pahwa et al. 2017).

Zhuge et al. (2017) combined the LSTM with a Naive Bayes method to extract market sentiment factors and improve predictive performance. This method can be used to predict financial markets on completely different time scales from other variables. The sentiment analysis model is integrated with the LSTM time series model to predict the stock's opening price, and the results show that this combined model can improve prediction accuracy.

Jia (2016) discussed the effectiveness of the LSTM in stock price prediction research and showed that the LSTM is an effective method for predicting stock returns. The real-time wavelet transform has also been combined with the LSTM network to predict the East Asian stock index, correcting some logical defects in previous studies; compared with a model using only the LSTM, the combined model greatly improves prediction accuracy with a small regression error. In addition, Gülmez (2023) argued that the LSTM model is well suited to time series data on financial markets, where stock prices are established by supply and demand relationships. Studying the Dow Jones index, a market for stocks, bonds, and other securities in the USA, the author also produced stock forecasts for the period 2019 to 2023. Another study, by Usmani and Shamsi (2023) on the Pakistan stock market, examined general market, industry, and stock-related news categories and their influence on stock price forecasts. This confirms that the LSTM model has recently been used more and more widely in stock price forecasting.


Nacha Rules Illustrate Need for AI and ML as Banks Battle Fraud – PYMNTS.com

For financial institutions and stakeholders, the twin threats of business email compromise (BEC) and credit-push fraud demand more monitoring and advanced technologies to do so.

Nacha introduced new rules Monday (March 18) that create a base level of ACH monitoring for all parties in the ACH Network with the exception of consumers.

The rules are designed to promote the detection of fraud through the credit-push payment flow, from the point of origination through the point of receipt at an account at the [receiving depository financial institution (RDFI)], PYMNTS reported Monday.

In terms of the mechanics of the new rules, when fraud is detected, the originating depository financial institution (ODFI) can request the return of the payment, the RDFI can delay funds availability (within limits), and the RDFI can return a suspicious transaction, without waiting for a request or a customer claim.

The rules are a bid to reduce BEC and vendor impersonation, among other scams.

PYMNTS Intelligence found last year that 43% of U.S. banks saw a rise in fraudulent transactions. Twelve percent of scams came from impersonation schemes. From 2022 to 2023, the share of firms that experienced an increase in payments made via same-day ACH increased by about 35 percentage points, surging from nearly 11% to nearly 46%.

The financial impact from that fraud has been considerable, as the price tag, as it were, due to fraudulent transactions came in at $3.2 million in 2023, up from $2.3 million the year before.

The share of fraudulent transactions due to schemes that impersonated authorized parties grew over the same timeframe from 11.1% in 2022 to 13.9% last year.

Separately, the FBI reported that BEC scams grew by double-digit percentage points. Between December 2021 and December 2022, there was a 17% increase in identified global exposed losses. There were more than 277,900 reported incidents, resulting in more than $50.8 billion in reported losses.

The Federal Trade Commission reported earlier this year that impostor scams totaled $2.7 billion, with $800 as the median amount lost when consumers were targeted.

In the United Kingdom, draft legislation would mandate that banks and payment firms would have to reimburse victims of authorized push payment fraud up to 415,000 pounds (about $528,000) per incident, as well as implement a policy delaying payments for up to four days if fraud is suspected.

PYMNTS Intelligence found that rules-based algorithms, artificial intelligence and machine learning are the technologies most used to combat fraud, especially among larger banks. Sixty percent of financial institutions reported using rules-based algorithms to combat fraud, up from 50% in 2022.

PYMNTS Intelligence also found that at least 66% of financial institutions with more than $5 billion in assets use AI and ML, exceeding the 44% of smaller banks that do the same.

Elsewhere, 56% of financial institutions overall plan to increase their use of AI and ML models to combat fraud.


Speaking without vocal folds using a machine-learning-assisted wearable sensing-actuation system – Nature.com

Design of the wearable sensing-actuation system

A thin, flexible, and adhesive wearable sensing-actuation system was attached to the throat surface, as shown in Fig. 1a, for speaking without vocal folds. This system comprises two symmetrical components: a sensing component (located at the bottom of the device) converting biomechanical muscle activity into high-fidelity electrical signals, and an actuation component (located at the top of the device) using electrical signals to produce sound, as shown in Fig. 1b. Both components consist of a polydimethylsiloxane (PDMS) layer (~200 μm thick) and a magnetic induction (MI) layer made of a serpentine copper coil (with 20 turns and a wire diameter of ~67 μm). The serpentine configuration of the coil ensures the flexibility of the device while maintaining its performance, as discussed in Supplementary Note 1. The symmetrical design of the device enhances its user-friendliness. The middle layer of the device is the shared magnetomechanical coupling (MC) layer, made of magnetoelastic material consisting of PDMS mixed with micromagnets. The MC layer, with a thickness of approximately 1 mm, is fabricated with a kirigami structure to enhance the device's sensitivity and stretchability (see Fig. S1). The entire system is small and thin (~1.35 cm³, with a width and length of ~30 mm and a thickness of ~1.5 mm) and lightweight (~7.2268 g) (see Fig. S2 and Supplementary Table S1).

a Illustration of the wearable sensing and phonation system attached to the throat. b Exploded diagram exhibiting each layer of the device design. c Two modes of muscle movement: expansion induces elongation in the x- and y-axes, while contraction induces elongation in the z-axis. Kirigami-structured device response to muscle movement patterns in the x, y (d), and z (e) directions: expansion results in x- and y-axis expansion and less deformation in the z-axis; contraction results in less deformation in the x and y directions and expansion along the z-axis. f Detailed illustration of the magnetic field change caused by the magnetic particles. For one part, the angle change between each single unit of the kirigami structure is represented by θ. For the other part, each magnetic particle itself undergoes a torque caused by the deformation applied to the polymer (g), thus generating a change of magnetic flux and, subsequently, a current in the coil. Photos of the device in the muscle expansion state are shown in h (x-, y-axis) and i (z-axis), and in the muscle contraction state in j (x-, y-axis) and k (z-axis). Scale bars, 1 cm.

Multidirectional movement of the laryngeal muscles makes it essential to capture laryngeal muscle movement signals in a three-dimensional manner. Moreover, the learning process of phonation may be heterogeneous across populations: different people may adopt a variety of muscle patterns to achieve identical vocal movements [45,53]. Such complexity of muscle movement requires the device to capture the deformation of the muscles not horizontally or vertically alone, but rather in a three-dimensional way. Figure 1c illustrates the movement of the muscle fiber during two stages, i.e., expansion and contraction. During the expansion phase, the muscle relaxes and elongates in the x- and y-axes. During the contraction phase, on the other hand, the muscle shortens in the x- and y-axes while thickening in the z-axis through an increase in muscle fiber bundle diameters. Figure 1d, e demonstrates the device's response in the x-, y-, and z-axes, respectively. During the expansion phase, the kirigami-structured device expands in surface area with slight deformation in the z-axis. Conversely, during the contraction phase, the device opposes deformation in the x- and y-axes and deforms in the z-axis. Thus, the device captures muscle movement across all three dimensions by measuring the corresponding deformation, which generates a change of magnetic flux density followed by the induction of an electrical signal in the MI layer. Supplementary Note 2 further demonstrates the response of the device to omnidirectional laryngeal movements and how the kirigami structure ensures the sensing performance.

The key defining characteristic of this system (MC layer) is based on the magnetoelastic effect, which refers to a change in the magnetic flux density of a ferromagnetic material in response to an externally applied mechanical stress, and which was discovered in the mid-19th century54. It has been observed in rigid metals and metal alloys such as Fe₁₋ₓCoₓ54, TbₓDy₁₋ₓFe₂ (Terfenol-D)55, and GaₓFe₁₋ₓ (Galfenol)56. Historically, these materials received limited attention within the bioelectronics domain for several reasons: the magnetization variation of magnetic alloys within biomechanical stress ranges is limited; the necessity for an external magnetic field introduces structural intricacies; and a significant mechanical modulus mismatch exists between magnetic alloys and human tissue, differing by six orders of magnitude. However, a breakthrough occurred in 2021 when a pronounced magnetoelastic effect was observed in a soft matter system57. This system exhibited a peak magnetomechanical coupling factor of 7.17 × 10⁻⁸ T Pa⁻¹, representing an enhancement of up to fourfold compared to traditional rigid metal alloys, underscoring its potential in soft bioelectronics. Functionally, the MC layer converts the mechanical movement of the extrinsic laryngeal muscles into magnetic field variation, and the copper coils transfer the magnetic change into electrical signals based on electromagnetic induction, operating in a self-powered manner. While additional power management circuits are essential for processing and filtering the signals, the initial sensing phase is autonomous and does not rely on an external power supply. After recognition through the machine learning model, the voice signal is output through the actuation system (Fig.1a).

The signal conversion through the giant magnetoelastic effect in soft elastomers can be explained at both the micro and atomic scales. At the microscale, compressive stress applied to the soft polymer composite causes a corresponding shape deformation, leading to magnetic particle-particle interactions (MPPI), including changes in the distance and orientation of the inter-particle connections. The horizontal rotation of each subunit in the kirigami structure (Fig.1d) and the vertical bending deformation (Fig.1e) create a micro change of magnetic density. In detail, as shown in Fig.1f, in a subunit of the kirigami structure, the deformation-induced angle shift generates a concentration of stress and MPPI between the single units of the kirigami structure. At the atomic scale, mechanical stress also induces magnetic dipole-dipole interactions (MDDI), which result in the rotation and movement of magnetic domains within the particles. As shown in Fig.1g, a torque is exerted on each magnetic nanoparticle, and the shift of angle generates the change in magnetic flux density. Photos of the device design are presented in Fig.1h, i for the x-, y-axis and z-axis response in the expansion phase, and in Fig.1j, k for the contraction phase. Fig.1h, j describe the expansion and contraction in the x-y plane, and Fig.1i, k describe the corresponding z-axis contraction and expansion. Such a structural design also displays a series of appealing features, including high current generation, low inner impedance, and intrinsic waterproofness, which will be presented in the following sections.
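The sensing chain described above (deformation → flux change → induced coil voltage) follows Faraday's law of induction. A minimal numerical sketch of that last step; the coil turn count, flux amplitude, and 5 Hz movement frequency are purely illustrative assumptions, not values from this work:

```python
import numpy as np

# Hypothetical parameters: a 50-turn pickup coil and a sinusoidal flux
# variation driven by a 5 Hz muscle movement.
N_TURNS = 50
f = 5.0            # deformation frequency, Hz
phi_amp = 2e-6     # peak magnetic flux change through one turn, Wb

t = np.linspace(0.0, 1.0, 10_000)
phi = phi_amp * np.sin(2 * np.pi * f * t)   # flux through one turn over time

# Faraday's law: induced EMF = -N * dPhi/dt
emf = -N_TURNS * np.gradient(phi, t)

# The peak EMF should match the analytic value N * phi_amp * 2*pi*f.
peak_analytic = N_TURNS * phi_amp * 2 * np.pi * f
print(emf.max(), peak_analytic)
```

The point of the sketch is that the sensor generates its signal from motion alone, which is why the initial sensing phase can be self-powered.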

Our present work is compared with previous approaches based on PVDF and graphene for flexible voice monitoring and emitting, as shown in Fig.2a and Supplementary Table S335,36,37,58,59,60,61,62. The device developed in this work has a similar acoustic performance, with a frequency range covering the entire human hearing range. However, it has a much lower driving voltage (1.95 V) and a Young's modulus of 7.83 × 10⁵ Pa. Fig.S3 exhibits the stress-strain curve and a testing photo of the material with and without the kirigami structure, which lowered the Young's modulus from 2.59 × 10⁷ Pa to 7.83 × 10⁵ Pa. This result ensures a higher comfort level while wearing, as the modulus of the device is very close to that of human skin. Notably, the device we developed has two unique features, stretchability and water resistance, which ensure the detection of horizontal movements, wearing comfort, and resistance to respiration. Additionally, the device does not have the issue of temperature rising during use, preventing unexpected low-temperature scalding of users. Subsequently, several standard tests establish the sensing features of the device and its efficacy in outputting voice signals. To enhance the stretchability of the device, a kirigami structure was fabricated onto the MC layer of the device. The unit design of the structure is shown in Fig.S4, and the stretchability with regard to the parameters of the kirigami unit design is exhibited in Fig.S5. Such an approach not only enhances the stretchability of the device to a maximum of 164% with a Young's modulus at the level of 100 kPa but also realizes isotropy. Furthermore, the structure enlarges the horizontal deformation of the device under unit pressure, generating a higher current output and enhanced detectable signals of extrinsic muscle contraction and relaxation, as shown in Figs. S6-S7.
The change in sensitivity brought by the structure on the vertical axis was also tested, and an elevation can be observed, as shown in Figs. S8-S9. Moreover, isotropy prevents the device from being disturbed by random and uneven body movements in use. Thus, there are no requirements on wearing orientation, which elevates user-friendliness, as revealed in Figs. S10-S11.

a Performance comparison of different flexible throat sensors in terms of Young's modulus, stretchability, underwater sound pressure level, temperature rise, driving voltage, and working frequency range. b Pressure-sensitivity response of the device at varied degrees of stretching under different amplification levels; (arb. units) refers to arbitrary units. c Response time and signal-to-noise ratio of the device. d Variation of sound pressure level with distance from the device at different amplification levels. e Sound pressure level of the device, with the resonance point highlighted, in the human hearing frequency range compared to the SPL threshold of normal human speech. f The right shift of the first resonance point towards high frequency with increasing strains. The test of performance is repeated three times at each condition. Data are presented as mean values ± SD. g Relationship between kirigami structure parameters and actuating (first resonance point and sound pressure level)/sensing properties (response time and signal-to-noise ratio). The test of performance is repeated three times at each condition. Data are presented as mean values ± SD. Waveform (h) and spectrum (i) comparison of commercial loudspeaker (red) and the device (yellow) sound output at 900 Hz and maximum strain (164%).

The stretchable structure of the device was leveraged to examine its sensitivity with respect to deformation degrees, as depicted in Fig.2b. The sensitivity curve demonstrated consistency under varying strains, with a minor change observed under maximum strain (164%). This change could be attributed to the reduction in the MC layer's thickness due to deformation, which in turn decreases the magnetic flux density under the same pressure level, resulting in lower current generation. The device's response curve under different frequencies and forces of the shaker was tested, as shown in Fig.S12. We have also validated that the electric output of the device is not due to triboelectricity in Supplementary Note 363. The device's inherent flexibility and stretchability facilitate tight adherence to the throat, yielding a high signal-to-noise ratio (SNR) and swift response time (Fig.2c). In addition to the kirigami structure design parameters, other factors influencing the device's sensitivity, response time, and SNR were also evaluated. Fig.S13 illustrates that an increase in coil turns results in longer response times and lower SNR due to the increased total thickness of the copper coils. This thickness impedes the membrane's deformation during vibrations, leading to longer response times and lower signal quality. We have further investigated the increase of thickness with the coil turn ratios in Supplementary Table S2. As the number of coil turns escalates, there's a direct correlation with the likelihood of copper wires stacking. Consequently, a significant number of samples exhibit thicknesses approximating 2 or 3 layers of copper (134 μm and 201 μm, respectively). This stacking effect amplifies the average coil thickness as the number of turns increases. However, this augmentation isn't strictly linear. For instance, the propensity for overlapping is less pronounced for turn ratios of 20 and 40.
In contrast, for turn ratios exceeding 60, a clear trend emerges where the likelihood of overlapping increases with the number of turns. The relationship between the sensing performance and nanomagnetic powder concentrations of the MC layer is presented in Fig.S14. A semi-linear relationship was observed, with higher magnetic nanoparticle concentration generating a stronger magnetic field and, consequently, higher current output. The influence of varying PDMS ratios in the sensing membrane on the performance of the sensor is delineated in Fig.S15. An increase in the PDMS ratios was found to extend the response times and decrease the SNR while having a negligible effect on the sensitivity curve. The augmentation in PDMS ratios leads to a softer membrane, which is prone to deforming at a slower rate. Consequently, devices with higher PDMS ratios exhibit heightened sensitivity to noise-generating deformations, albeit with slower response times. The influence of thickness on sensing performance was tested in Fig.S16, with thicker membranes resulting in quicker response times and a fluctuating SNR. Lastly, the impact of the MC layer's thickness was tested in Fig.S17. A thicker MC layer had no influence on response time but reduced the SNR. We've consolidated the results of each optimization factor in Fig.S18, providing a clear overview of the primary variables influencing each performance metric. After considering the sensing performance, weight, and flexibility of the device, the current parameters were determined. The device's durability with these parameters was evaluated in Fig.S19, where the device underwent continuous working for 24,000 cycles with a shaker at a frequency of 5 Hz, with no observable degradation in current generation.
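As a rough illustration of how an SNR figure like those quoted above can be computed from a raw trace, the sketch below compares the RMS of a movement burst against a quiet baseline window. The sampling rate, burst timing, and noise level are arbitrary assumptions for synthetic data, not the device's values:

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 1000                          # assumed sampling rate, Hz
t = np.arange(0, 2.0, 1 / fs)

# Synthetic sensor trace: a 5 Hz "muscle movement" burst on top of noise.
signal = np.where((t > 0.5) & (t < 1.5), np.sin(2 * np.pi * 5 * t), 0.0)
trace = signal + 0.05 * rng.standard_normal(t.size)

# SNR in dB from the RMS of the active window vs. a quiet window.
sig_rms = np.sqrt(np.mean(trace[(t > 0.5) & (t < 1.5)] ** 2))
noise_rms = np.sqrt(np.mean(trace[t < 0.5] ** 2))
snr_db = 20 * np.log10(sig_rms / noise_rms)
print(round(snr_db, 1))
```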

The acoustic performance of the actuation system of the device was examined first, with a focus on its sound pressure level (SPL) at different distances. The results, presented in Fig.2d, show that larger output magnification led to a higher SPL at all tested positions. Even at a distance of 1 meter, the typical distance during normal conversations, the device provided an SPL of over 40 dB, which is above the lower limit of normal speaking SPL (40-60 dB)64. We also tested the device's SPL at different angles and compared its performance with those of previous works on acoustic devices (Fig.S20, Supplementary Table S3). The device's performance across various frequencies was tested and presented in Fig.2e, which indicates that it could provide sound with an SPL louder than normal speaking loudness across the entire human hearing range64. The resonance point in the figure indicates the frequency at which the device has the relatively largest loudness output under the same signal strength as other adjacent frequencies. Further investigation into the SPL with regard to frequency under different strains revealed that the first few resonance points tended to have the largest acoustic output across the frequency range (Fig.S21). Since the device under one strain has multiple resonance points that change non-linearly with deformation, investigating the change of every resonance point is complicated. Therefore, because of this complexity and our interest in the highest output, we investigated only the first resonance point (FRP) in Fig.2f. According to Fig.2e and Fig.S22, the voice output at each strain was above the normal talking threshold across the whole human hearing range. Figure 2f revealed a right shift of the FRP of the device as the deformation gets larger, enabling the device to adjust its best output performance under different usage scenarios.
Our device can adjust its best output performance by simply changing the deformation degree, thus creating a unique output setting for each individual and realizing user adaptability. More details about the right shift of FRP are shown in Fig.S23.
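The distance dependence described above is consistent with simple free-field spherical spreading, under which the SPL falls by about 6 dB for each doubling of distance. A sketch of the rule; the 60 dB reference level at 10 cm is an assumed example, not a measured value from this work:

```python
import math

def spl_at(distance_m: float, spl_ref_db: float, ref_m: float = 0.1) -> float:
    """Free-field point-source estimate: SPL drops 20*log10(d2/d1) dB."""
    return spl_ref_db - 20 * math.log10(distance_m / ref_m)

# If a source delivers ~60 dB at 10 cm, spherical spreading alone
# predicts ~40 dB at 1 m (a tenfold distance -> 20 dB drop).
print(round(spl_at(1.0, 60.0), 1))   # -> 40.0
```

Real wearables deviate from this idealization (directivity, reflections, near-field effects), which is why the angle- and distance-resolved measurements in Fig.2d and Fig.S20 matter.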

We also tested the influence of introducing the kirigami design into the device, as presented in Fig.2g. The results show that the parameters of the kirigami design had a negligible impact on the sensing and acoustic performance, further supporting the decision to use this design for its benefit to flexibility (Fig.S5). Additional factors influencing the acoustic performance of the actuation system were evaluated, and the final parameters were determined based on both performance and the device's mass and flexibility. Fig.S24 explores the impact of coil turn ratios on the SPL produced by the device. It was observed that an increase in coil turns led to a decrease in SPL, likely due to the weight of the additional coil impeding membrane vibration and subsequently reducing the SPL. The relationship between SPL and the PDMS ratio of the actuator membrane was examined in Fig.S25. As the ratio increased, the membrane softened, leading to a decrease in the generated SPL. The dampening effect of a softer membrane hindered vibration and sound generation, resulting in a semi-linear decrease. Fig.S26 presents the relationship between SPL and magnetic powder concentrations. The device's SPL increased with the addition of higher amounts of magnetic powder in the MC layer, plateauing after a ratio of 4:1. The effect of varying MC layer thickness on SPL is shown in Fig.S27. A sharp increase in the device's SPL was observed as the MC layer's thickness increased from 0.5 mm to 1 mm. However, the increase slowed and eventually plateaued as the MC layer became thicker. Finally, the SPL under different actuator membrane thicknesses was tested in Fig.S28. The device's SPL increased as the PDMS membrane (vibrating membrane) thickness increased from 100 to 200 μm but decreased when the membrane became thicker. The weight of thicker membranes may dampen the vibration and reduce the loudness produced by the device.
Regarding the acoustic output quality of the device, Fig.2h displays the waveforms of the commercial loudspeaker and our device at the maximum (164%) strain at a frequency of 1100 Hz. The device reproduced the voice signal accurately, even under maximum deformation, with only slight distortion. The distortion is further explained in the spectrogram of Fig.2i, which shows that a noise of around 1400 Hz was generated in the output of our device but was not strong enough to significantly distort the signal. The output at other strains was tested in Fig.S29; a similar distortion of lesser extent can be observed at lower strains. In the final phase of our study, we evaluated the water resistance of our device. The waveform of the device outputting an identical voice signal segment under water and in air is depicted in Fig.S30. The waveforms are notably similar, with no significant signal distortion observed. A slight loss of the high-frequency component, without major signal attenuation, is evident in the frequency domain (Fig.S31). The device demonstrated consistent performance even after being submerged in water for an accelerated aging test with a duration of 7 days (Fig.S32). The sound pressure level (SPL) in relation to distance underwater is presented in Fig.S33. A correlation was observed between the depth of the device underwater and the sound output, with deeper submersion resulting in lower output. However, the device could produce an output exceeding 60 dB when placed 2 cm underwater at a distance of 20 cm. The SPL of the device in relation to frequency underwater is illustrated in Fig.S34. Despite the attenuation of high-frequency components underwater, the device consistently delivered an SPL above the normal speaking range (60 dB) across the entire human hearing range. These results suggest that our device, as a wearable, can effectively withstand conditions of perspiration, damp environments, and rain exposure.

After obtaining the preliminary standard test results, we focused on collecting laryngeal muscle movement signals using our wearable sensing component. The experiment is schematically illustrated in Fig.3a. The analog signal generated by the vibration of the extrinsic laryngeal muscles (the sternothyroid muscle, as shown in Fig.3a) was collected by the sensor and then passed through an amplifier and a low-pass filter, exhibited in Fig.3b. The digital signal of the laryngeal muscle movements was output and collected for further analysis. The sensitivity and repeatability of the device were tested in Fig.3c with two successive repetitions of different throat movements. The device was able to generate distinguishable and unique signals for each different throat movement, indicating its feasibility for detecting and analyzing different laryngeal movement properties. Furthermore, the device responded consistently to a given throat movement, as demonstrated by the participant's two successive throat movements. In addition, larger throat muscle movements, such as coughing or yawning, generated larger peaks, while longer movements, such as swallowing, generated longer signals. We also conducted experiments to test the device's functionality under different conditions. In Fig.3d, we asked the participant to voicelessly pronounce the same word ("UCLA") under different conditions, including standing still, walking, running, and jumping. The device was able to discern the unique and repeatable syllable waveform of each word, with only slight differences arising from the participant's varying pronunciation pace each time. Thus, the wearable device was able to function without being influenced by the user's body movements, even during strenuous exercise. Finally, to test the quality and accuracy of the signal acquired purely from laryngeal muscle movement, we performed examinations to compare normal speaking and voiceless speaking, as shown in Fig.3e.
The five successive signals of the participant saying "Go Bruins" with and without vocal fold vibration were compared in Fig.3f and g, respectively. Both tests generated consistent signals, and the syllables of each word were represented with distinguishable waveforms. Comparing the test results of normal speaking and speaking voicelessly, we observed only a slight loss of maximum amplitude in the signal of speaking voicelessly. This could be explained by the fact that the vibration of the vocal folds requires more and stronger muscle movements, thus generating stronger signals. Furthermore, a clear loss of high-frequency components in the voiceless signals compared to the signals with vocal fold vibration was observed in Fig.3h, i after the Fourier transform of both signals across frequencies. This finding was consistent with our hypothesis that the high-frequency part of the vibration, generated by the intrinsic muscles and vocal folds, is absent in voiceless signals, leaving a smoother yet distinguishable waveform. Hence, the device was proven to capture recognizable and unique signals of laryngeal muscle movements for further analysis.
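The voiced-versus-voiceless comparison above can be sketched numerically: a toy "voiced" trace adds a high-frequency component (standing in for vocal-fold vibration) on top of a slow muscle-movement envelope, and the Fourier transform reveals the extra high-band energy. All frequencies and amplitudes here are arbitrary stand-ins, not measured values:

```python
import numpy as np

fs = 2000                          # assumed sampling rate, Hz
t = np.arange(0, 1.0, 1 / fs)

# Toy stand-ins: both traces share the slow 4 Hz muscle envelope;
# only the "voiced" one carries a 150 Hz vibration component.
muscle = np.sin(2 * np.pi * 4 * t)
voiceless = muscle
voiced = muscle + 0.4 * np.sin(2 * np.pi * 150 * t)

def band_energy(x, lo, hi):
    """Spectral energy of x between lo and hi Hz."""
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, 1 / fs)
    mask = (freqs >= lo) & (freqs <= hi)
    return float(np.sum(spec[mask] ** 2))

# The voiced signal carries far more energy above 100 Hz,
# mirroring the high-frequency loss seen in the voiceless spectra.
hi_voiced = band_energy(voiced, 100, 500)
hi_voiceless = band_energy(voiceless, 100, 500)
print(hi_voiced > 10 * hi_voiceless)   # -> True
```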

a Schematic illustration of the extrinsic muscle and its vibration. Created with BioRender.com. b Circuit diagram of the system for collecting the extrinsic muscle movement signal. c Sensor output for different throat movements: coughing, humming, nodding, swallowing, and yawning. d Device signal output for the participant pronouncing "UCLA" under different body movements. e Sensor output for the participant pronouncing "Go Bruins!" with vocal fold vibration (upper, gray) and voiceless (lower, red). Enlarged waveform of the participant pronouncing "Go Bruins!" with vocal fold vibration (f) and voiceless (g). Amplitude-frequency spectrum of the signal with vocal fold vibration (h) and voiceless (i).

With the generated data of laryngeal muscle movement, a machine-learning algorithm was employed to classify the semantic meaning of the signal and select a corresponding voice signal for output through the actuation component of the system. A schematic flow chart of the machine-learning algorithm is presented in Fig.4a. The algorithm consists of two steps, training and classifying, for a set of n sentences for which assisted speaking is required. First, the filtered training data were fed to the algorithm for model training. The electrical signal of each of the n sentences was compacted into an Nth-order matrix for feature extraction with principal component analysis (PCA) (Fig.4b). N is determined by the sampling window, which is the length of the longest sentence's signal. PCA is applied to remove redundancy and prepare the signal for classification. Multi-class support vector classification (SVC) was chosen as the classification algorithm, with the decision function shape of one vs. rest: for each sentence to be classified, the remaining n-1 sentences were considered as a whole to generate a binary classification boundary discriminating the target sentence. A brief illustration of the support vector machine (SVM) process is depicted in Fig.4c. The margin of the linear boundary between the two target data groups undergoes a series of optimization steps and is set to the largest with support vectors. Details of PCA and multi-class SVC are discussed in Methods. After the classifier was trained with the pre-fed training data, it was used to classify newly collected laryngeal muscle movement signals. The real-time data were fed to the classifier, and the class (which sentence) of the signal was output for voice signal selection. Subsequently, the corresponding pre-recorded voice signal was played by the actuation component, realizing assisted speaking.
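A minimal sketch of this PCA-plus-one-vs-rest-SVC pipeline using scikit-learn on synthetic data; the class count, window length, repeat count, and number of retained components are placeholders that loosely mirror the setup described, not the paper's actual signals or parameters:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy stand-in data: n_sentences classes, each repeat flattened into a
# fixed-length feature vector whose length is set by the sampling window.
n_sentences, repeats, window = 5, 100, 200
X = np.vstack([
    rng.normal(loc=k, scale=1.0, size=(repeats, window))
    for k in range(n_sentences)
])
y = np.repeat(np.arange(n_sentences), repeats)

# PCA removes redundancy; SVC with one-vs-rest separates each sentence
# from the remaining n-1 sentences treated as a single negative class.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),
    SVC(kernel="linear", decision_function_shape="ovr"),
)
model.fit(X, y)
acc = model.score(X, y)
print(round(acc, 2))
```

A new real-time signal would then be classified with `model.predict`, and the predicted class index used to select the pre-recorded voice clip to play.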

a Flow chart of the machine-learning-assisted wearable sensing-actuation system. b Illustration depicting the process of data segmentation and principal component analysis (PCA) applied to the muscle movement signal captured by the sensor. Yellow indicates one sentence, and red indicates another. c Optimization process of data classification after PCA with the support vector machine (SVM) algorithm. d Contour plot of the classification results with SVM; class 1 indicates 100% probability of the target sentence, and dotted lines are the probability boundaries between the target sentence and the others. e Bar chart exhibiting 7 participants' accuracy on both the validation set and the testing set. f Confusion matrix of the 8th participant's validation set, with an overall accuracy of 98%. g Confusion matrix of the 8th participant's testing set, with an overall accuracy of 96.5%. h Demonstration of the machine-learning-assisted wearable sensing-actuation system in assisted speaking. The left panel shows the muscle movement signal captured by the sensor as the participant pronounces the sentence voicelessly, while the right panel shows the corresponding output waveform produced by the system's actuation component. i The SPL and temperature trends over time while the device is worn by participants; no notable temperature increase or SPL decrease was seen for up to 40 min. j The device's SPL while outputting participant-specific sound signals, both with and without sweat present. Each participant was asked to repeat the test N = 3 times for both scenarios. Data are presented as mean values ± SD. The p-value between the dry and sweaty states is calculated to be 0.818, indicating no significant difference in the device's performance in the two cases. k The device's SPL across various conversation angles while worn by the participant. Created with BioRender.com.

A brief demonstration was made with five sentences that we had selected for training the algorithm (S1: "Hi Rachel, how you are doing today?", S2: "Hope your experiments are going well!", S3: "Merry Christmas!", S4: "I love you!", S5: "I don't trust you."). Each participant repeated each sentence 100 times for data collection. The resulting contour plot in Fig.4d shows an example of the classification result, with the red dots indicating the target sentence and the yellow dots indicating the others. A probability contour was drawn to classify whether a newly input sentence point belonged to the target sentence or not. With the trained classifier, the laryngeal movement signal was recognized as the corresponding sentence that the participant wished to express. To test the robustness and user-adaptability of the algorithm, the device was tested with eight participants, each repeating each sentence 120 times in total, with 100 repeats selected for the training set and 20 separated as the testing set. Of the 100 repeats, 20 were selected as the validation set. Figure 4e shows the validation and testing results of seven of the eight participants, while Fig.4f, g presents a detailed illustration of the confusion matrices of the 8th participant for the validation and testing sets, respectively. Although slightly lower than on the validation set, each participant's testing-set accuracy exceeded 93%. Figure S35 shows the detailed confusion matrices of both the validation and testing sets and the accuracy for every other participant. The overall prediction accuracy of the model was 94.68%, and it worked well with different participants. Each participant's voice signal was played by the actuation component, realizing the demonstration in Fig.4h. The left panel shows the muscle movement signal transferred into the correct voice signal, with the waveform shown in the right panel.
Further, we extended our analysis to validate the practical usability of the device for vocal output after the selection of the accurate voice signal by the algorithm. As demonstrated in Fig.4i, an evaluation of the SPL and temperature of the device during use by the participant revealed no significant drop in SPL or rise in temperature, even after an extended working period of 40 min. This suggests the device's durability in voice output and its safe usage. In Fig.4j, we display the SPL of the device as it produces voice signals for seven participants, both with and without sweat. We noted consistent performance by the device across different participants, with no evident signal attenuation despite the presence of perspiration. Finally, Fig.4k illustrates the device's SPL during voice output at various normal conversation angles while worn by the participant. The device demonstrated reliable sound performance across all angles, thereby enabling assisted speaking in multiple real-life scenarios. In conclusion, the device can convert laryngeal muscle movement into voice signals, providing patients with voice disorders a feasible method to communicate during the recovery process.
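The 120-repeat protocol above (100 training repeats, 20 of which validate, and 20 held out for testing) can be sketched with stratified splits and a confusion matrix. The synthetic features below are placeholders, not real sensor data, and the classifier settings are assumptions:

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# 120 repeats of each of 5 sentences; 50-dimensional toy feature vectors.
n_sentences, repeats, dim = 5, 120, 50
X = np.vstack([rng.normal(k, 1.0, size=(repeats, dim)) for k in range(n_sentences)])
y = np.repeat(np.arange(n_sentences), repeats)

# Hold out 20 repeats per sentence for testing, then 20 more for validation.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=20 * n_sentences, stratify=y, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=20 * n_sentences,
    stratify=y_trainval, random_state=0)

clf = SVC(decision_function_shape="ovr").fit(X_train, y_train)
cm = confusion_matrix(y_test, clf.predict(X_test))   # 5 x 5 sentence matrix
test_acc = accuracy_score(y_test, clf.predict(X_test))
print(cm.shape, round(test_acc, 2))
```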

Link:
Speaking without vocal folds using a machine-learning-assisted wearable sensing-actuation system - Nature.com


DARPA and IBM Secure AI Systems from Hackers – AiThority

The US Department of Defense's (DoD) research and development arm, DARPA, and IBM have been collaborating on several projects related to adversarial AI for the past four years. The team from IBM has been working on a project called GARD, which aims to construct defenses that can handle new threats, provide theory to make systems provably robust, and create tools to reliably evaluate the defenses of algorithms. The project is led by Principal Investigator (PI) Nathalie Baracaldo and co-PI Mark Purcell. To make the Adversarial Robustness Toolbox (ART) more applicable to potential use cases encountered by the US military and other organizations creating AI systems, researchers have upgraded it as part of the project.


With the hope of inspiring other AI experts to collaborate on developing tools to safeguard real-world AI deployments, IBM gave ART to the Linux Foundation in 2020. In addition to supporting numerous prominent machine-learning frameworks, like TensorFlow and PyTorch, ART also has its own GitHub repository. To continue meeting AI practitioners where they are, IBM has now added the updated toolkit to Hugging Face. When it comes to finding and using AI models, Hugging Face has swiftly become one of the leading platforms on the internet. The recent geospatial model developed with NASA is one of many IBM projects that have been made publicly available on Hugging Face. The ART toolset on Hugging Face is intended for models from the AI repository. It demonstrates how to integrate the toolbox into timm, a library utilized to construct Hugging Face models, and provides instances of attacks and defenses for evasion and poisoning threats.


The researchers in this dispersed group would use their standards to assess the efficacy of the defenses they constructed. ART has amassed hundreds of stars on GitHub and was the first to provide a single toolset for many practical attacks. This exemplifies the community's cooperative spirit as it strives toward the common objective of protecting AI pipelines. Although they have come a long way, machine-learning models are still fragile and vulnerable to both targeted attacks and random noise from the real world.

A disorganized and immature adversarial AI community existed before GARD. Researchers mainly focused on digital attacks, such as introducing small perturbations to images, although these weren't the most pressing issues. Physical attacks, such as covering a stop sign with a sticker to trick an autonomous vehicle's AI model, and attacks where training data is poisoned are the main concerns in the real world.


Researchers and practitioners in the field of artificial intelligence security lacked a central hub for exchanging attack and defense code before the advent of ART. To address this, ART offers a platform that enables teams to concentrate on more particular tasks. As part of GARD, the group has created resources that blue and red teams can use to evaluate and compare various machine-learning models' performance in the face of various threats, such as poisoning and evasion. ART includes practical countermeasures against those attacks. Although the project is coming to a close this spring after four years, the work is far from over.


Excerpt from:
DARPA and IBM Secure AI Systems from Hackers - AiThority


Identifying microRNAs associated with tumor immunotherapy response using an interpretable machine learning model … – Nature.com

ICB response prediction using miRNA expression profiles

We compiled predictive ICB responses using the TIDE model and miRNA expression profiles from 7721 samples across 19 different tumor types within The Cancer Genome Atlas (TCGA) dataset (Table 1). To predict immunotherapy response using miRNA expression profiles, we first developed a random forest classifier to determine CTL levels. The optimal parameters for the random forest classifier were determined through a grid search with tenfold cross-validation (Table 2). Using the identified optimal parameters, we trained random forest classifiers on the designated training data and rigorously assessed the predictive performance on the independent test data. The results showed that the random forest classifier predicted the CTL levels well, with an AUC of 0.9400 (Fig.2A). Furthermore, when evaluating the performance using the F1 score and Balanced AUC indicators, high performance was confirmed, with an F1 score of 0.9849 and a Balanced AUC of 0.7182.
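A schematic version of this tuning-and-evaluation step (grid search with tenfold cross-validation, then AUC, F1, and balanced accuracy on an independent test set) using scikit-learn on synthetic data. The parameter grid and dataset below are illustrative placeholders, not the grid reported in the paper's Table 2 or the TCGA miRNA profiles:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Toy stand-in for the miRNA expression matrix (samples x miRNAs)
# with imbalanced CTL-level labels.
X, y = make_classification(n_samples=600, n_features=50, n_informative=10,
                           weights=[0.3, 0.7], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Hypothetical parameter grid, tuned with tenfold cross-validation.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 10]},
    cv=10, scoring="roc_auc", n_jobs=-1,
)
grid.fit(X_train, y_train)

# Evaluate the refit best model on the held-out test set.
proba = grid.predict_proba(X_test)[:, 1]
pred = grid.predict(X_test)
auc = roc_auc_score(y_test, proba)
print(round(auc, 3), round(f1_score(y_test, pred), 3),
      round(balanced_accuracy_score(y_test, pred), 3))
```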

Predicted results for each model learned using miRNA expression profiles. (A) ROC AUC of the random forest classifier that predicts the CTL level. The class True signifies the high group and False signifies the low group. (B) Scatterplot of the random forest regression model for predicting the dysfunction score. The red line indicates the regression line. (C) Scatterplot of the random forest regression model for predicting the exclusion score. The red line indicates the regression line. (D) Scatterplot of the stepwise prediction model predicting ICB response based on the TIDE score. The red line indicates the regression line.

Next, we predicted the dysfunction and exclusion scores based on random forest regression. A grid search with tenfold cross-validation was performed to determine the optimal parameters for random forest regression (Table 2). Employing the optimal parameters, two random forest regression models, one for the dysfunction score and one for the exclusion score, were independently learned from the training data. On the independent test datasets, the MSEs of the regression models for the dysfunction and exclusion scores were both 0.0361. The Pearson correlation coefficient (PCC) between the observed and predicted values was also calculated. The PCC for the dysfunction score prediction model was 0.8158 and that for the exclusion model was 0.8704, indicating a strong positive correlation between the predicted and actual values in both models (Fig. 2B,C).
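A comparable sketch for the score-regression step, again on synthetic data (the rescaling of the target to a TIDE-like range is an assumption for illustration): fit on training data, then summarize the held-out predictions by MSE and the Pearson correlation coefficient.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for one of the two score-regression tasks
# (dysfunction or exclusion); targets are scaled to roughly [-1, 1].
X, y = make_regression(n_samples=400, n_features=50, n_informative=10,
                       noise=5.0, random_state=0)
y = y / np.abs(y).max()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)

reg = RandomForestRegressor(n_estimators=200, random_state=0)
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
pcc = np.corrcoef(y_test, y_pred)[0, 1]  # Pearson correlation coefficient
print(f"MSE: {mse:.4f}  PCC: {pcc:.4f}")
```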

Finally, we predicted the ICB responses based on the TIDE score by combining the two-step machine learning model: the random forest classifier for CTL prediction and the random forest regression models for the dysfunction and exclusion scores. The MSE of the combined stepwise model was 0.0360. Furthermore, the PCC between the observed and predicted values exhibited a strong positive correlation of 0.9270 (Fig. 2D).
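The combination rule itself is simple. As described in the TIDE framework, a tumor predicted to be CTL-high is scored by its predicted dysfunction score, and a CTL-low tumor by its predicted exclusion score. A sketch of that dispatch, with hypothetical model outputs:

```python
import numpy as np

def stepwise_tide(ctl_high, dysfunction, exclusion):
    """Combine the three upstream model outputs into a TIDE-style score.

    CTL-high tumors are scored by predicted T-cell dysfunction and
    CTL-low tumors by predicted T-cell exclusion.
    """
    return np.where(ctl_high, dysfunction, exclusion)

# Toy outputs from the three upstream models (hypothetical values).
ctl_high = np.array([True, False, True, False])
dysfunction = np.array([0.8, 0.1, -0.3, 0.5])
exclusion = np.array([-0.2, 0.6, 0.4, -0.7])
print(stepwise_tide(ctl_high, dysfunction, exclusion))  # [ 0.8  0.6 -0.3 -0.7]
```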

Thereafter, we used SHAP, an interpretable machine learning approach, to analyze the results of our machine learning models. Using SHAP analysis, we identified informative miRNAs that contributed to the prediction of target values. Figure 3 shows the top 20 miRNAs ranked according to their feature importance scores in each model.
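The importance ranking in Figure 3 is the mean of the absolute Shapley values per feature. A minimal sketch of that ranking step, assuming a Shapley-value matrix has already been computed (e.g., by a SHAP tree explainer run on the fitted forest); the matrix and the assignment of miRNA names below are synthetic placeholders:

```python
import numpy as np

# Placeholder Shapley-value matrix (samples x features), standing in for
# the output of a SHAP explainer applied to the trained random forest.
rng = np.random.default_rng(0)
scales = np.array([0.05, 0.002, 0.02, 0.008])
shap_values = rng.normal(scale=scales, size=(100, 4))
feature_names = ["hsa-miR-155", "hsa-miR-150", "hsa-miR-21", "hsa-miR-142"]

# Feature importance = mean absolute Shapley value per feature,
# sorted in descending order (the ordering used in Fig. 3A,D,F).
importance = np.abs(shap_values).mean(axis=0)
for i in np.argsort(importance)[::-1]:
    print(f"{feature_names[i]}: {importance[i]:.4f}")
```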

Shapley value plot for exhibiting feature importance. (A) SHAP feature importance for the random forest classifier to predict CTL level, (B) summary plot for the random forest classifier when the CTL prediction model predicts the CTL level is high, (C) summary plot for the random forest classifier when the CTL prediction model predicts the CTL level is low, (D) SHAP feature importance for random forest regression to predict dysfunction score, (E) summary plot for random forest regression to predict dysfunction score, (F) SHAP feature importance for random forest regression to predict exclusion score, and (G) summary plot for random forest regression to predict exclusion score. (A, D, F) are plots that arrange features based on the average of the absolute Shapley values, which serve as indicators of feature importance. (B, C, E, G) are summary plots that depict feature importance and feature effects simultaneously. Each point signifies the Shapley value of the feature and instance. The x-axis represents the Shapley value, and the y-axis represents each feature. The color of each point corresponds to the high and low feature values (i.e., miRNA expression values).

For CTL-level prediction based on the random forest classifier, hsa-miR-155 was the most informative feature, with the highest Shapley value. In particular, focusing on high and low CTL predictions, the expression of hsa-miR-155 was positively associated with CTL-level prediction (Fig. 3B,C). Notably, miR-155 is an essential factor orchestrating the CD8+ T cell response in cancer, and its overexpression has been associated with enhancement of the anti-tumor response22,23. hsa-miR-150, which had the second-highest impact on model predictions, exhibited a similar trend. miR-150 also plays a crucial role in the differentiation and functional regulation of CD8+ T cells24. The absence of miR-150 leads to a decline in the killing ability of CD8+ T cells24. In addition, hsa-miR-4772, hsa-miR-21, hsa-miR-142, and hsa-miR-10a were also identified with notably high Shapley values.

In the random forest regression model used to predict the dysfunction score, the miRNA with the highest Shapley value was hsa-miR-10b. The Shapley value of hsa-miR-10b was negative when its expression was low and positive when its expression was high (Fig. 3E), indicating a positive correlation between hsa-miR-10b expression and dysfunction prediction. In contrast, hsa-miR-183 negatively correlated with dysfunction prediction. Both miR-150 and miR-155 showed positive correlations in dysfunction predictions and played an important role in dysfunction mechanisms, as well as in CTL level predictions. Furthermore, miR-151a and miR-210 exhibited negative correlations, similar to miR-183.

In the random forest regression model predicting the exclusion score, hsa-miR-10b also showed the largest Shapley value (Fig. 3G); however, it was negatively correlated with the exclusion prediction, in contrast to the dysfunction prediction model. This observation illustrates how exclusion prediction, whose mechanism opposes that of dysfunction, is negatively correlated with dysfunction prediction. Likewise, hsa-miR-150 and hsa-miR-155 demonstrated the opposite behavior in exclusion prediction compared with the dysfunction results. Additionally, hsa-miR-10a, which was also identified in the CTL-level prediction, showed a positive correlation with exclusion prediction and played an important role in model prediction. Furthermore, the expression levels of miR-194-1 and miR-194-2 were negatively correlated with the exclusion prediction.

Next, we verified whether ICB response could be predicted using a small number of informative miRNAs. We selected miRNAs with an average absolute Shapley value of 0.01 or higher (SHAP ≥ 0.01). Using this criterion, three miRNAs were identified in the CTL model, five miRNAs in the dysfunction prediction model, and 12 in the exclusion prediction model (Fig. 3A,D,F). Because only a limited number of features were used to construct the models, we employed a simple algorithm to predict immunotherapy response.
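The selection step then reduces to thresholding the importance scores. A sketch with hypothetical mean |SHAP| values (the real values come from the SHAP analysis of each trained model):

```python
# Mean absolute Shapley value per miRNA (hypothetical numbers for
# illustration; the study's values underlie Fig. 3A,D,F).
mean_abs_shap = {"hsa-miR-155": 0.035, "hsa-miR-150": 0.024,
                 "hsa-miR-21": 0.012, "hsa-miR-142": 0.004}

def select_informative(scores, threshold):
    """Keep features whose mean |SHAP| meets the threshold (e.g. 0.01 or 0.02)."""
    return [m for m, v in scores.items() if v >= threshold]

print(select_informative(mean_abs_shap, 0.01))  # three miRNAs pass
print(select_informative(mean_abs_shap, 0.02))  # two miRNAs pass
```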

To predict the CTL level, we applied logistic regression25 and determined the optimal parameters by conducting a grid search with tenfold cross-validation (Table 2). The model using the three informative miRNAs achieved an F1 score of 0.9805, a balanced accuracy of 0.7249, and an AUC value of 0.9300 (Fig. S1A). This analysis confirmed that a small subset of highly informative miRNAs displayed a similar performance in predicting CTL levels, even when a logistic regression model was utilized.

Subsequently, dysfunction and exclusion scores were predicted using a small number of informative miRNAs based on multiple linear regression. The results showed that the MSE for the dysfunction model using the top miRNAs (SHAP ≥ 0.01) was 0.0754 and that for the exclusion prediction model was 0.0840. The PCCs between the predicted and actual values were 0.5707 and 0.6638 for the dysfunction and exclusion prediction models, respectively (Figs. S1B,C). From these results, we confirmed that the performance was slightly degraded with a reduced number of features; however, the models still demonstrated comparable performance with only a small number of selected miRNAs.

Finally, to predict the ICB responses based on the TIDE scores, we applied a stepwise machine learning model by combining the logistic regression classifier for the CTL level and the linear regression models for the dysfunction and exclusion scores. The MSE of the model that used the most informative miRNAs (SHAP ≥ 0.01) was 0.0690. We also observed a strong positive correlation with informative miRNAs; the PCC of the top-miRNA model (SHAP ≥ 0.01) was 0.8457 (Fig. S1D).

Similarly, we applied more stringent criteria for the identification of informative miRNAs and verified whether even fewer miRNAs could accurately predict immunotherapy response. We selected miRNAs with an average absolute Shapley value of 0.02 or higher (SHAP ≥ 0.02); two miRNAs were identified in the CTL model, four miRNAs in the dysfunction prediction model, and five miRNAs in the exclusion prediction model (Fig. 3A,D,F).

For CTL level prediction using logistic regression, the model obtained an F1 score of 0.9800, a balanced accuracy of 0.7459, and an AUC value of 0.91 (Fig. S2A). In addition, the models showed good performance for dysfunction and exclusion score prediction using linear regression. The MSE for dysfunction prediction was 0.0810 and that for exclusion prediction was 0.0984. The PCCs were 0.5220 and 0.5900 for the dysfunction score and exclusion score prediction models, respectively (Fig. S2B,C). Furthermore, for the ICB response prediction based on the TIDE scores using the two-step machine learning model combining logistic regression and linear regression, the MSE was 0.0753 and the PCC was 0.8595 (Fig. S2D). Although the performance was slightly lower than that of the model using all miRNAs for predicting the ICB response, these results suggest that informative miRNAs based on Shapley values still exhibit strong predictive capability, even with a limited number of miRNAs and relatively simple classification and regression models.

To examine the biological roles of the informative miRNAs, we predicted the target genes of the informative miRNAs selected by Shapley values using miRDB and TargetScan. A list of the genes targeted by the top miRNAs from each model is shown in Tables S1 and S2. We investigated the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways enriched in the target genes. Tables 3, 4, 5 and Tables S3–S5 show the results of enrichment analyses using the informative miRNAs (SHAP ≥ 0.01) of each model. The top 20 pathways are listed in Tables 3, 4, 5 in ascending order of P-values, and all KEGG pathways satisfying statistical significance (adjusted P value < 0.05) are shown in Tables S3–S5.
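Enrichment of a target-gene list in a KEGG pathway is typically assessed with a hypergeometric (over-representation) test, with the raw P-values then adjusted for multiple testing. A minimal sketch of the underlying test with hypothetical counts (the study's actual inputs were the miRDB/TargetScan target lists):

```python
from scipy.stats import hypergeom

# Over-representation test behind a typical KEGG enrichment analysis.
# Hypothetical counts: a background of 20,000 genes, a pathway with 200
# members, 300 predicted miRNA target genes, 15 of which are in the pathway.
M, K, n, k = 20000, 200, 300, 15

# P(overlap >= k) under random sampling without replacement.
p_value = hypergeom.sf(k - 1, M, K, n)
print(f"enrichment P = {p_value:.2e}")
```

In practice each pathway's raw P-value would then be corrected (e.g., Benjamini-Hochberg) to obtain the adjusted P < 0.05 threshold applied in Tables S3–S5.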

The first-ranked KEGG pathway in the CTL-level prediction model was the TNF signaling pathway (Table 3). The next-ranked pathways were Hepatitis B and the IL-17 signaling pathway. Hepatitis B is a significant contributor to hepatocellular carcinoma (HCC)26. Additionally, immune-related pathways such as the Fc epsilon RI signaling pathway and the T cell and B cell receptor signaling pathways were observed at the top. Enrichment analysis also revealed several other cancer-related terms, including Pathways in cancer, PI3K-Akt signaling pathway, Prostate cancer, Renal cell carcinoma, and Pancreatic cancer (Table S3). These results suggest a significant role for these miRNAs and their target genes in cancer and immunotherapy.

Tables 4 and S4 present the informative pathways identified using the dysfunction score prediction model. One of the most significantly enriched pathways was melanogenesis, which produces mutagenic intermediates that induce immunosuppression. The next-ranked terms were the Wnt signaling pathway and the ErbB signaling pathway. Moreover, our analysis identified various cancer-related terms, including Hepatocellular carcinoma, Prostate cancer, Breast cancer, and Gastric cancer, as well as Pathways in cancer (Table S4).

Tables 5 and S5 present the pathways identified using the exclusion score prediction model. The first pathway is Proteoglycans in cancer, which plays a significant role in regulating cytokine and chemokine expression on the cell surface. Moreover, various cancer-related pathways and terms, such as the MAPK, PI3K-Akt, and Rap1 signaling pathways, Pathways in cancer, Prostate cancer, Renal cell carcinoma, Lung cancer, and Breast cancer, were also identified, along with immune-related pathways such as the T cell and B cell receptor signaling pathways and helper T cell differentiation. Furthermore, the presence of PD-L1 expression and the PD-1 checkpoint pathway in cancer indicates that the genes targeted by these miRNAs are directly associated with immunotherapy.

Additionally, it was noted that several pathways related to the brain and neurons were observed, including Axon guidance27, a subfield of neurodevelopment associated with the process of neurons sending axons to reach accurate targets; Neurotrophin signaling pathway28, a protein that supports the survival, development, and function of neurons; Long-term potentiation29, a process that strengthens signal transmission between neurons; as well as Dopaminergic synapse and Cholinergic synapse (Table S5). This could be because the majority of CTL-low (exclusion) samples were involved in the TCGA LGG tumor type (Table S6).

Enrichment analysis using the top miRNAs (SHAP ≥ 0.02) of each model also identified diverse pathways related to cancer and immunity (Tables S7–S9). These findings provide valuable insights into the molecular mechanisms underlying exclusion and immune response regulation in cancer.

We proceeded to validate the stepwise machine learning model based on a random forest trained on all miRNAs using data from 12 distinct tumor types not included in the previous training and test phases (Fig. 1F,I and Table S10). For the random forest classifier predicting CTL levels, we achieved an F1 score of 0.9912 and an AUC value of 0.9400 (Fig. S3A). When predicting the dysfunction and exclusion scores via random forest regression models, the MSE for the dysfunction score prediction model was 0.0478, and that for the exclusion score prediction model was 0.0641. The MSE value of the stepwise machine learning model for predicting the ICB response based on the TIDE score was 0.0475. Moreover, the predicted and actual values showed a positive correlation (PCC = 0.8698) (Fig. S3B–D).

Furthermore, we validated the predictive potential of our immunotherapy response prediction model using small subsets comprising informative miRNAs (SHAP ≥ 0.01 and SHAP ≥ 0.02) by applying the same approaches to the 12 tumor types (Fig. 1F,I). The models employing informative miRNAs (SHAP ≥ 0.01) to predict CTL levels using logistic regression showed an F1 score of 0.9901 and an AUC of 0.9300 (Fig. S4A). In the dysfunction and exclusion score predictions using linear regression, the MSEs were 0.0660 and 0.0677, respectively. Moreover, a positive correlation was observed between the predicted and actual values (PCC = 0.2899 and 0.4198, respectively) (Fig. S4B,C). Lastly, the stepwise model used to predict the ICB response based on the TIDE score with informative miRNAs (SHAP ≥ 0.01) yielded an MSE of 0.0661 and a PCC of 0.8335 (Fig. S4D).

Additionally, the models with a smaller number of informative miRNAs under the stricter criterion (SHAP ≥ 0.02) revealed compelling outcomes. CTL-level prediction using the logistic regression classifier showed an F1 score of 0.9904 and an AUC of 0.9300 (Fig. S5A). The linear regression models for the dysfunction and exclusion scores also achieved good performance, with the dysfunction score prediction model showing an MSE of 0.0585 and a PCC of 0.3822 and the exclusion score prediction model displaying an MSE of 0.0797 and a PCC of 0.2816 (Fig. S5B,C). In addition, for the prediction of the ICB response using the combined stepwise machine learning model with SHAP ≥ 0.02, the MSE was 0.0594 and the PCC was 0.8538 (Fig. S5D). Notably, the experimental results from the external validation datasets confirmed that not only did our model exhibit robust predictive performance regardless of tumor type, but the informative miRNAs were also useful for tumor immunotherapy response prediction.

We further validated the stepwise machine learning model trained on all miRNAs using novel external independent data from PCAWG (Pan-Cancer Analysis of Whole Genomes). The parameters of each model were set through grid search with tenfold cross-validation (Table S11). For the random forest classifier predicting CTL levels, we achieved an F1 score of 0.9589 and an AUC value of 0.9226 (Table S12). Regarding the prediction of dysfunction and exclusion scores through random forest regression models, the MSE for the dysfunction score prediction model was 0.0245 and that for the exclusion score prediction model was 0.0251 (Table S12). The MSE value of the stepwise machine learning model for predicting ICB response based on the TIDE score was 0.0248 (Table S12).

Furthermore, we identified informative miRNAs using SHAP analysis in the PCAWG cohort (Fig. S6). In addition, we investigated which miRNAs were informative in each tumor type using SHAP (Table S13). Notably, the informative miRNAs identified in the TCGA cohort were similarly identified in the PCAWG datasets, although direct comparison of the miRNAs is difficult because TCGA represents precursor miRNA expression whereas PCAWG provides the mature forms. For instance, miR-150 was significant in the CTL and dysfunction models, and miR-155 was also ranked highly.

We also validated the predictability of the ICB response prediction models in the PCAWG cohort using the informative miRNAs (SHAP ≥ 0.01 and SHAP ≥ 0.02) extracted from the TCGA cohort (Table S12). The model employing informative miRNAs (SHAP ≥ 0.01) achieved an F1 score of 0.9556 and an AUC of 0.9161 for predicting CTL levels via logistic regression. For dysfunction and exclusion score predictions using linear regression, the MSEs were 0.0371 and 0.0528, respectively. The stepwise model for predicting ICB response based on the TIDE score with informative miRNAs (SHAP ≥ 0.01) yielded an MSE of 0.0376.

Similarly, the model utilizing informative miRNAs (SHAP ≥ 0.02) extracted from the TCGA cohort attained an F1 score of 0.9527 and an AUC of 0.9097 for predicting CTL levels via logistic regression. For dysfunction and exclusion score predictions using linear regression, the MSEs were 0.0364 and 0.0798, respectively. Finally, the stepwise model for predicting ICB response based on the TIDE score with informative miRNAs (SHAP ≥ 0.02) yielded an MSE of 0.0364. The results with the external datasets from PCAWG further affirmed the effectiveness of the informative miRNAs in predicting ICB responses.

Next, we employed the random forest-based ICB response prediction model on the TCGA cohort, stratified by tumor type, to investigate variations in the efficacy of ICI treatment in each tumor type. The parameters of each model were set through grid search with tenfold cross-validation (Table S11). The MSE values of the combined stepwise models for each tumor type ranged from 0.0093 to 0.0494 (Table S12). Notably, these results closely resembled the predictive performance derived from the entire tumor cohort, suggesting that the differences in ICI treatment response among cancer types are minimal.

In addition, we investigated which miRNAs were informative in each tumor type using SHAP (Table S13). Although there were some differences between tumor types, informative miRNAs such as miR-150 and miR-155 were frequently observed among the highly ranked miRNAs. This result indicates that these miRNAs are closely related to ICB responses across tumor types.

Moreover, we also evaluated how well the stepwise model pre-trained on all 19 TCGA cohorts predicted the test data (20%) for each tumor type (Table S14). Using all miRNAs, the MSE ranged from 0.0113 to 0.1824. Using the informative miRNAs (SHAP ≥ 0.01), the MSE ranged from 0.0166 to 0.5530; similarly, with SHAP ≥ 0.02, it ranged from 0.0159 to 0.5562. These results showed that the informative miRNAs were useful for predicting ICB treatment responses across a variety of cancer types.

Read more here:
Identifying microRNAs associated with tumor immunotherapy response using an interpretable machine learning model ... - Nature.com


Will we ever be able to accurately predict solubility? | Scientific Data – Nature.com

Follow this link:
Will we ever be able to accurately predict solubility? | Scientific Data - Nature.com


Weekly AiThority Roundup: Biggest ML and Automation Updates – AiThority

This is your AI Weekly Roundup, covering the top updates from around the world. The updates feature state-of-the-art capabilities in artificial intelligence (AI), machine learning, robotic process automation, fintech, and human-system interactions, along with AI's application in various industries and daily life.

Oracle announced new generative AI capabilities within the Oracle Fusion Cloud Applications Suite that will help customers improve decision making and enhance the employee and customer experience. The latest AI additions include new generative AI capabilities embedded in existing business workflows across finance, supply chain, HR, sales, marketing, and service, as well as an expansion of the Oracle Guided Journeys extensibility framework to enable customers and partners to incorporate more generative AI capabilities to support their unique industry and competitive needs.

Box, the leading Content Cloud, announced a new integration with Microsoft Azure OpenAI Service to bring its advanced large language models to Box AI. The integration of Azure OpenAI Service enables Box customers to benefit from the most advanced AI models in the world, while bringing Box and Microsoft's enterprise-grade standards for security, privacy, and compliance to this groundbreaking technology.

Healthcare leaders are continuing to partner with Qualtrics to make stressful and emotional care situations easier for their patients. Qualtrics helps healthcare organizations and healthcare providers better understand patient needs and respond to patients in the moment. In a recent Qualtrics report, consumers reported that 16% of their recent experiences with hospitals and medical centers were very poor.

Royal Philips, a global leader in health technology, announced an expanded collaboration with Amazon Web Services (AWS) to address the growing need for secure, scalable digital pathology solutions in the cloud. The collaboration unites Philips' leadership and expertise in the digitization of pathology to optimize clinical workflows with AWS' leadership in scalable, secure cloud solutions.

Salt Security, the leading API security company, released new threat research from Salt Labs highlighting critical security flaws within ChatGPT plugins, a new risk for enterprises. Plugins give AI chatbots like ChatGPT access and permissions to perform tasks on behalf of users within third-party websites, for example committing code to GitHub repositories or retrieving data from an organization's Google Drive. These security flaws introduce a new attack vector that bad actors could exploit.

Link:
Weekly AiThority Roundup: Biggest ML and Automation Updates - AiThority

Read More..

Machine Learning in Fraud Prevention: Exploring how machine learning boosts fraud prevention capabilities – CXOToday.com

By Mr. Jinendra Khobare

Online fraud is a significant issue in India, with various scams such as phishing attacks, identity theft, and counterfeit e-commerce sites. Cybercrime in India has been on the rise, with the country recording over five thousand cases of online identity theft in 2022. Phishing attacks have also seen a surge, with around 83% of IT teams in Indian organisations reporting an increase in phishing emails targeting their employees in 2020. Furthermore, about 38% of consumers have received a counterfeit product from an e-commerce site in the past year.

According to a report, a significant portion of fraudulent transactions occur between 10 PM and 4 AM, with credit card holders over 60 years of age being the primary victims. Of the cybercrime cases reported in India, 77.4% were recorded between January 2020 and June 2023, and the number of cases in one Indian city rose from 2,888 in 2020 to over 6,000 in 2023.

Machine learning is instrumental in fraud prevention, enabling organisations to detect and prevent suspicious activities in real-time. Traditional fraud prevention methods often struggle to keep up with the evolving tactics of scammers. Machine learning algorithms can quickly analyse vast amounts of data, helping organisations identify patterns and anomalies that may indicate suspicious behaviour. These algorithms learn from past fraud cases, continually enhancing their ability to detect suspicious activities. By integrating machine learning into their fraud prevention strategies, organisations can stay ahead of scams and safeguard their assets effectively.

A key advantage of machine learning in fraud prevention is its ability to detect suspicious activities at an early stage. By analysing historical data and identifying patterns of dubious behaviour, machine learning algorithms can spot suspicious transactions in real-time, enabling organisations to act swiftly and prevent financial losses.

Graph databases, used alongside machine learning, have emerged as a powerful tool in fraud detection. They record and analyse network interactions at high rates, making them useful for a variety of applications, including fraud detection. By identifying patterns and relationships in large datasets, they reduce complexity so that detection algorithms can effectively discover fraud attempts within a network.
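One concrete graph technique behind this idea is entity linking: accounts that share identifying attributes (a device fingerprint, a card number) are connected, and unusually large connected clusters are flagged as possible fraud rings. The sketch below shows the principle with an in-memory graph; the account names, attributes, and cluster-size threshold are all invented for illustration, and a real deployment would run an equivalent traversal inside a graph database.

```python
# Illustrative fraud-ring detection via connected components on a
# bipartite graph of accounts and shared attributes.
from collections import defaultdict

# (account, shared_attribute) observations, e.g. device fingerprints, cards
edges = [
    ("acct1", "device_A"), ("acct2", "device_A"), ("acct3", "device_A"),
    ("acct3", "card_X"),   ("acct4", "card_X"),
    ("acct5", "device_B"),  # an isolated, ordinary account
]

# Undirected bipartite adjacency: accounts <-> attributes
adj = defaultdict(set)
for acct, attr in edges:
    adj[acct].add(attr)
    adj[attr].add(acct)

def connected_accounts(start):
    """Depth-first search returning all accounts reachable from `start`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adj[node])
    return {n for n in seen if n.startswith("acct")}

# Many accounts tied to the same devices/cards form one suspicious cluster.
ring = connected_accounts("acct1")
print(sorted(ring))      # acct1..acct4 are linked through device_A and card_X
print(len(ring) >= 3)    # flag: cluster exceeds an illustrative threshold
```

The payoff of the graph view is that acct4 never shares a device with acct1, yet the two are linked through intermediate attributes, a relationship that row-by-row transaction scoring would miss.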

In conclusion, as scammers evolve their tactics, organisations must adapt their fraud prevention strategies to counter these threats effectively. Machine learning and graph databases are powerful weapons in this ongoing battle. With their ability to analyse countless data points rapidly, these technologies can detect suspicious activities accurately, surpassing human capabilities. It's akin to having a team of superhuman fraud detectives working tirelessly around the clock. Yet scammers devise new deception methods as quickly as organisations detect and prevent existing ones.

(The author is Mr. Jinendra Khobare, Solution Architect, Sensfrx, Secure Layer7, and the views expressed in this article are his own)

Visit link:
Machine Learning in Fraud Prevention: Exploring how machine learning boosts fraud prevention capabilities - CXOToday.com

Read More..