Category Archives: Machine Learning

Google’s AI correctly predicts floods up to seven days early – Android Authority

C. Scott Brown / Android Authority

You may have heard plenty about Google's various generative AI products like Circle to Search, its AI wallpaper tool, Search Generative Experience (SGE), and more. But you may not have heard nearly as much about the other ways it is using AI, such as predicting floods. Recently, Google was able to accurately predict flooding up to seven days in advance thanks to machine learning (ML).

Today, Google announced it was able to significantly improve global-scale flood forecasting with the help of machine-learning technologies. According to the firm, it was able to improve the reliability of global nowcasts, on average, from zero to five days. In some cases, it was even able to predict floods a full week out before they happened.

One of the reasons why it can be difficult to predict floods ahead of time is that most rivers don't have what's called a streamflow gauge. These gauges help provide relevant data, like precipitation and physical watershed information. However, the tech giant was able to get around this problem by training its ML technology on all available river data and applying the ML model to basins where no data was available.

Lending further credence to how impressive an accomplishment this is, the company's findings were published in Nature. For context, Nature is a leading multidisciplinary science journal that publishes peer-reviewed research.

The ultimate goal of the technology is to scale the accuracy of flood forecasting to a global level, even in areas where local data is not available. Google has been able to provide forecasts to over 80 countries through its Flood Hub. It also delivers alerts on Google Search, Maps, and Android notifications.


HEAL: A framework for health equity assessment of machine learning performance – Google Research

Posted by Mike Schaekermann, Research Scientist, Google Research, and Ivor Horn, Chief Health Equity Officer & Director, Google Core

Health equity is a major societal concern worldwide with disparities having many causes. These sources include limitations in access to healthcare, differences in clinical treatment, and even fundamental differences in the diagnostic technology. In dermatology for example, skin cancer outcomes are worse for populations such as minorities, those with lower socioeconomic status, or individuals with limited healthcare access. While there is great promise in recent advances in machine learning (ML) and artificial intelligence (AI) to help improve healthcare, this transition from research to bedside must be accompanied by a careful understanding of whether and how they impact health equity.

Health equity is defined by public health organizations as fairness of opportunity for everyone to be as healthy as possible. Importantly, equity may be different from equality. For example, people with greater barriers to improving their health may require more or different effort to experience this fair opportunity. Similarly, equity is not fairness as defined in the AI for healthcare literature. Whereas AI fairness often strives for equal performance of the AI technology across different patient populations, this does not center the goal of prioritizing performance with respect to pre-existing health disparities.

In Health Equity Assessment of machine Learning performance (HEAL): a framework and dermatology AI model case study, published in The Lancet eClinicalMedicine, we propose a methodology to quantitatively assess whether ML-based health technologies perform equitably. In other words, does the ML model perform well for those with the worst health outcomes for the condition(s) the model is meant to address? This goal anchors on the principle that health equity should prioritize and measure model performance with respect to disparate health outcomes, which may be due to a number of factors that include structural inequities (e.g., demographic, social, cultural, political, economic, environmental and geographic).

The HEAL framework proposes a 4-step process to estimate the likelihood that an ML-based health technology performs equitably:

The final step's output is termed the HEAL metric, which quantifies how anticorrelated the ML model's performance is with health disparities. In other words, does the model perform better with populations that have the worse health outcomes?

This 4-step process is designed to inform improvements for making ML model performance more equitable, and is meant to be iterative and re-evaluated on a regular basis. For example, the availability of health outcomes data in step (2) can inform the choice of demographic factors and brackets in step (1), and the framework can be applied again with new datasets, models and populations.

With this work, we take a step towards encouraging explicit assessment of the health equity considerations of AI technologies, and encourage prioritization of efforts during model development to reduce health inequities for subpopulations exposed to structural inequities that can precipitate disparate outcomes. We should note that the present framework does not model causal relationships and, therefore, cannot quantify the actual impact a new technology will have on reducing health outcome disparities. However, the HEAL metric may help identify opportunities for improvement, where the current performance is not prioritized with respect to pre-existing health disparities.

As an illustrative case study, we applied the framework to a dermatology model, which utilizes a convolutional neural network similar to that described in prior work. This example dermatology model was trained to classify 288 skin conditions using a development dataset of 29k cases. The input to the model consists of three photos of a skin concern along with demographic information and a brief structured medical history. The output consists of a ranked list of possible matching skin conditions.

Using the HEAL framework, we evaluated this model by assessing whether it prioritized performance with respect to pre-existing health outcomes. The model was designed to predict possible dermatologic conditions (from a list of hundreds) based on photos of a skin concern and patient metadata. Evaluation of the model is done using a top-3 agreement metric, which quantifies how often the top 3 output conditions match the most likely condition as suggested by a dermatologist panel. The HEAL metric is computed via the anticorrelation of this top-3 agreement with health outcome rankings.
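The anticorrelation idea described above can be sketched in a few lines. The example below is only a rough illustration: it computes the negative Spearman rank correlation between per-subgroup top-3 agreement and a per-subgroup health-outcome score (higher meaning better outcomes). The function names, the subgroup values, and the choice of Spearman correlation are assumptions made for illustration; the published HEAL metric estimates a likelihood, which this sketch does not attempt.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


def anticorrelation(performance, outcome_quality):
    """Negative Spearman correlation between performance and outcome quality.

    A positive value means the model tends to perform better for subgroups
    with worse health outcomes.
    """
    return -pearson(ranks(performance), ranks(outcome_quality))


# Hypothetical subgroups: top-3 agreement and a burden measure (higher = worse).
top3_agreement = [0.80, 0.75, 0.70]
burden = [300, 200, 100]
outcome_quality = [-b for b in burden]  # flip sign so higher = better outcomes
print(anticorrelation(top3_agreement, outcome_quality))
```

In this toy example the subgroup with the highest burden also has the highest top-3 agreement, so the anticorrelation is at its maximum.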

We used a dataset of 5,420 teledermatology cases, enriched for diversity in age, sex and race/ethnicity, to retrospectively evaluate the model's HEAL metric. The dataset consisted of store-and-forward cases from patients aged 20 years or older from primary care providers in the USA and skin cancer clinics in Australia. Based on a review of the literature, we decided to explore race/ethnicity, sex and age as potential factors of inequity, and used sampling techniques to ensure that our evaluation dataset had sufficient representation of all race/ethnicity, sex and age groups. To quantify pre-existing health outcomes for each subgroup we relied on measurements from public databases endorsed by the World Health Organization, such as Years of Life Lost (YLLs) and Disability-Adjusted Life Years (DALYs; years of life lost plus years lived with disability).

However, while the model was likely to perform equitably across age groups for cancer conditions specifically, we discovered that it had room for improvement across age groups for non-cancer conditions. For example, those 70+ have the poorest health outcomes related to non-cancer skin conditions, yet the model didn't prioritize performance for this subgroup.

For holistic evaluation, the HEAL metric cannot be employed in isolation. Instead this metric should be contextualized alongside many other factors ranging from computational efficiency and data privacy to ethical values, and aspects that may influence the results (e.g., selection bias or differences in representativeness of the evaluation data across demographic groups).

As an adversarial example, the HEAL metric can be artificially improved by deliberately reducing model performance for the most advantaged subpopulation until performance for that subpopulation is worse than all others. For illustrative purposes, given subpopulations A and B where A has worse health outcomes than B, consider the choice between two models: Model 1 (M1) performs 5% better for subpopulation A than for subpopulation B. Model 2 (M2) performs 5% worse on subpopulation A than B. The HEAL metric would be higher for M1 because it prioritizes performance on a subpopulation with worse outcomes. However, M1 may have absolute performances of just 75% and 70% for subpopulations A and B respectively, while M2 has absolute performances of 75% and 80% for subpopulations A and B respectively. Choosing M1 over M2 would lead to worse overall performance for all subpopulations because some subpopulations are worse-off while no subpopulation is better-off.

Accordingly, the HEAL metric should be used alongside a Pareto condition (discussed further in the paper), which restricts model changes so that outcomes for each subpopulation are either unchanged or improved compared to the status quo, and performance does not worsen for any subpopulation.
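The M1 versus M2 example and the Pareto condition can be made concrete with a short sketch. The dictionaries below reuse the illustrative percentages from the adversarial example; the function names are hypothetical, and per-subpopulation model performance is used here as a stand-in for outcomes.

```python
def performance_gap(model, worse_off="A", better_off="B"):
    """Performance for the worse-off subpopulation minus the better-off one."""
    return model[worse_off] - model[better_off]


def pareto_acceptable(status_quo, candidate):
    """True if the candidate is no worse than the status quo for every subpopulation."""
    return all(candidate[group] >= status_quo[group] for group in status_quo)


# Absolute performance per subpopulation, taken from the example in the text.
m1 = {"A": 0.75, "B": 0.70}
m2 = {"A": 0.75, "B": 0.80}

# M1 looks better on prioritization alone (positive gap in favor of A) ...
print(performance_gap(m1) > performance_gap(m2))

# ... but replacing M2 with M1 lowers performance for B, so it fails the check,
# whereas replacing M1 with M2 leaves A unchanged and improves B.
print(pareto_acceptable(status_quo=m2, candidate=m1))
print(pareto_acceptable(status_quo=m1, candidate=m2))
```

Checking the Pareto condition first and comparing prioritization only among the models that pass it is one simple way to rule out the adversarial case.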

The HEAL framework, in its current form, assesses the likelihood that an ML-based model prioritizes performance for subpopulations with respect to pre-existing health disparities for specific subpopulations. This differs from the goal of understanding whether ML will reduce disparities in outcomes across subpopulations in reality. Specifically, modeling improvements in outcomes requires a causal understanding of steps in the care journey that happen both before and after use of any given model. Future research is needed to address this gap.

The HEAL framework enables a quantitative assessment of the likelihood that health AI technologies prioritize performance with respect to health disparities. The case study demonstrates how to apply the framework in the dermatological domain, indicating a high likelihood that model performance is prioritized with respect to health disparities across sex and race/ethnicity, but also revealing the potential for improvements for non-cancer conditions across age. The case study also illustrates limitations in the ability to apply all recommended aspects of the framework (e.g., mapping societal context, availability of data), thus highlighting the complexity of health equity considerations of ML-based tools.

This work is a proposed approach to address a grand challenge for AI and health equity, and may provide a useful evaluation framework not only during model development, but during pre-implementation and real-world monitoring stages, e.g., in the form of health equity dashboards. We hold that the strength of the HEAL framework is in its future application to various AI tools and use cases and its refinement in the process. Finally, we acknowledge that a successful approach towards understanding the impact of AI technologies on health equity needs to be more than a set of metrics. It will require a set of goals agreed upon by a community that represents those who will be most impacted by a model.

The research described here is joint work across many teams at Google. We are grateful to all our co-authors: Terry Spitz, Malcolm Pyles, Heather Cole-Lewis, Ellery Wulczyn, Stephen R. Pfohl, Donald Martin, Jr., Ronnachai Jaroensri, Geoff Keeling, Yuan Liu, Stephanie Farquhar, Qinghan Xue, Jenna Lester, Cían Hughes, Patricia Strachan, Fraser Tan, Peggy Bui, Craig H. Mermel, Lily H. Peng, Yossi Matias, Greg S. Corrado, Dale R. Webster, Sunny Virmani, Christopher Semturs, Yun Liu, and Po-Hsuan Cameron Chen. We also thank Lauren Winer, Sami Lachgar, Ting-An Lin, Aaron Loh, Morgan Du, Jenny Rizk, Renee Wong, Ashley Carrick, Preeti Singh, Annisah Um'rani, Jessica Schrouff, Alexander Brown, and Anna Iurchenko for their support of this project.


Machine intelligence-accelerated discovery of all-natural plastic substitutes – Nature.com

Materials

MMT (BYK Additives Incorporation; Cloisite Na+), northern bleached softwood kraft (NBSK) pulp (NIST RM 8495), TEMPO (Sigma-Aldrich, 99%), sodium bromide (NaBr, Sigma-Aldrich, ACS reagent, 99.0%), sodium hypochlorite solution (NaClO, Sigma-Aldrich, reagent grade, available chlorine 10–15%), sodium hydroxide (NaOH, Sigma-Aldrich, reagent grade, 98%), gelatin (Sigma-Aldrich, from cold-water fish skin) and glycerol (Sigma-Aldrich, ACS reagent, 99.5%) were used as received without further purification. Deionized (DI) water (18.2 MΩ) was obtained from a Milli-Q water purification system (Millipore) and used as the water source throughout this work.

The MMT nanosheet dispersion was prepared according to the literature57. To obtain medium-sized MMT nanosheets, MMT powders were mixed in DI water at 10 mg ml⁻¹, and the mixture was ultrasonicated for 2 h and continuously stirred for another 12 h. Afterward, the mixture was centrifuged at 1,252g for 60 min, and the supernatant was then collected as the dispersion of MMT nanosheets with a concentration of about 8 mg ml⁻¹. To obtain small-sized MMT nanosheets, the ultrasonication time was extended to 3 h, and the mixture was centrifuged at 5,009g for 60 min. Conversely, for large-sized MMT nanosheets, the ultrasonication time was reduced to 1 h, and the mixture was centrifuged at a slower speed of 489g for 15 min.

The CNF dispersion was prepared according to the literature58. First, 20 g of NBSK pulp was suspended in 1.0 litre of DI water, and then TEMPO (2 × 10⁻³ mol) and NaBr (0.02 mol) were added into the pulp. The TEMPO-mediated oxidation was initiated by adding 0.2 mol of NaClO, and the oxidation process was maintained under continuous stirring for 5–6 h, during which the pH was controlled at 10.0 by adding NaOH solution (3.0 M). The TEMPO-oxidized pulp was repeatedly washed with DI water until the pH returned to 7.0. Afterward, the pulp was disassembled in a microfluidizer processor (Microfluidics M-110EH), and the concentration of the CNF dispersion was about 10 mg ml⁻¹.

A total of 8.0 g of gelatin was dissolved in 1.0 litre of DI water followed by continuous stirring for 48 h, and the concentration of the gelatin solution was 8.0 mg ml⁻¹.

A total of 8.4 g of glycerol was dissolved in 1.0 litre of DI water followed by continuous stirring for 12 h, and the concentration of the glycerol solution was 8.4 mg ml⁻¹.

An automated pipetting robot (Opentrons OT-2) was operated to prepare different mixtures with varying MMT/CNF/gelatin/glycerol ratios. For each mixture, the dispersions/solutions of MMT nanosheets, CNFs, gelatin and glycerol were mixed at different volumes. Afterward, the robot-prepared mixtures were vortexed at 3,000 rpm for 30 s and placed in a vacuum desiccator to remove air bubbles. Then, the mixtures were cast into a flat, polystyrene-based container at 40 °C and air dried for 48 h.
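To illustrate how a target composition might translate into pipetting volumes, the sketch below divides each component's target mass by its stock concentration. It assumes the MMT/CNF/gelatin/glycerol ratios are mass fractions of the dry film and uses the approximate stock concentrations stated above; the function name, the example fractions and the total mass are hypothetical, and this is not the robot protocol used in the study.

```python
# Approximate stock concentrations from the preparation steps, in mg per ml.
STOCK_MG_PER_ML = {"MMT": 8.0, "CNF": 10.0, "gelatin": 8.0, "glycerol": 8.4}


def volumes_ml(mass_fractions, total_mass_mg):
    """Volume of each stock needed so the solids match the target mass fractions.

    Assumes the fractions are by mass and sum to 1.
    """
    if abs(sum(mass_fractions.values()) - 1.0) > 1e-9:
        raise ValueError("mass fractions must sum to 1")
    return {
        name: fraction * total_mass_mg / STOCK_MG_PER_ML[name]
        for name, fraction in mass_fractions.items()
    }


# Hypothetical target composition and total solid mass for one film.
target = {"MMT": 0.4, "CNF": 0.3, "gelatin": 0.2, "glycerol": 0.1}
for name, volume in volumes_ml(target, total_mass_mg=100.0).items():
    print(f"{name}: {volume:.2f} ml")
```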

Each nanocomposite film was subjected to detachment and flatness testing after it dried. Regarding detachability, except for samples that could be clearly labelled as detachable or non-detachable (Supplementary Fig. 38a), mechanical delamination tests were conducted to measure the binding energies of nanocomposite films on hydrophobic polystyrene substrates. As shown in Supplementary Fig. 39, all the detachable samples exhibited binding energies of <0.4 J cm⁻², while the undetachable ones had binding energies of >0.6 J cm⁻². Thus, the threshold binding energy was set to 0.5 J cm⁻² to classify the detachability of nanocomposite films. Regarding flatness, except for samples that could be clearly labelled as flat or curved (Supplementary Fig. 38b), a high-speed laser scanning confocal microscope was employed to characterize the roughness of nanocomposite films. As demonstrated in Supplementary Fig. 40a, the nanocomposite films considered flat exhibited height differences of <200 µm. Meanwhile, those considered curved typically showed height differences of >500 µm (Supplementary Fig. 40b). Once the detachment and flatness tests were finished, only the detachable and flat samples were identified as A-grade nanocomposites.

After constructing the SVM classifier, we examined its prediction accuracy using a set of testing data points. As shown in Supplementary Table 3, a total of 35 MMT/CNF/gelatin/glycerol ratios were randomly selected, and 35 nanocomposite films were fabricated according to the established procedure. Detachment and flatness tests were conducted to categorize these nanocomposite films into different grades. Subsequently, the MMT/CNF/gelatin/glycerol ratios (that is, composition labels) were input into the SVM classifier to obtain the predicted grades, which were then compared with the experimental results. In this study, the SVM classifier accurately predicted the grades for 33 out of the 35 nanocomposite films, resulting in a prediction accuracy of 94.3%.
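The accuracy figure follows directly from the counts: 33 correct predictions out of 35 is about 94.3%. A minimal sketch of that comparison is shown below; the grade labels and the positions of the two mismatches are placeholders, not the data from Supplementary Table 3.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the experimentally determined grades."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Placeholder labels: 35 test films, with two predictions flipped on purpose.
actual = ["A"] * 20 + ["not-A"] * 15
predicted = list(actual)
predicted[0] = "not-A"
predicted[-1] = "A"

print(f"{accuracy(predicted, actual) * 100:.1f}%")  # 33/35 -> 94.3%
```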

After constructing the ANN-based prediction model, we examined its prediction accuracy using a set of testing data points.

The deviation between model-predicted property labels and actual property values was quantified using the mean relative error (MRE), defined in equation (3),

$$\mathrm{MRE}=\frac{1}{N}\sum_{i=1}^{N}\left|\frac{\mathrm{output}^{i}-E^{i}}{E^{i}}\right|,$$

(3)

where \(N\) is the cumulative number of testing data, \(\mathrm{output}^{i}\) is the model-predicted property label based on testing datum \(i\), and \(E^{i}\) is the actual property value of testing datum \(i\). A smaller MRE value indicates higher prediction accuracy and vice versa.
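Equation (3) translates directly into code. The sketch below is a plain implementation of the formula; the function name and the example numbers are made up for illustration.

```python
def mean_relative_error(outputs, actuals):
    """MRE = (1/N) * sum(|output_i - E_i| / |E_i|) over the N testing data."""
    if len(outputs) != len(actuals) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(e == 0 for e in actuals):
        raise ValueError("actual values must be non-zero")
    return sum(abs((o - e) / e) for o, e in zip(outputs, actuals)) / len(actuals)


# Two hypothetical test points, each off by 10% of the actual value.
print(mean_relative_error([110.0, 90.0], [100.0, 100.0]))
```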

The thickness of each all-natural nanocomposite was initially determined using a digital micrometre (293-340-30, Mitutoyo). For each strip sample used in the mechanical test, the nanocomposite thickness was gauged at three separate points, and the average thickness value was derived. Furthermore, the thickness of the all-natural nanocomposites was verified using a field emission scanning electron microscope (Tescan XEIA) operating at 15.0 kV. Cross-sectional SEM images were taken, followed by thickness measurements to validate the earlier readings.

The transmittance spectra of all-natural nanocomposites were measured with an ultraviolet (UV)–visible spectrometer from 250 to 1,100 nm (UV-3600 Plus, PerkinElmer) equipped with an integrating sphere. The transmittance values at 365, 550 and 950 nm were extracted as the spectral labels (\(T_{\mathrm{UV}}\), \(T_{\mathrm{Vis}}\) and \(T_{\mathrm{IR}}\)), respectively.

The fire resistances of the all-natural nanocomposites were assessed using a horizontal combustibility testing method, modified from the standard test method (ASTM D6413)59. The all-natural nanocomposites were cut into 1 cm × 1 cm squares, and then they were exposed to the flame of an ethanol burner for 30 s (with a flame temperature ranging from 600 °C to 850 °C)60. The fire resistance of the all-natural nanocomposites was quantified in terms of \(\mathrm{RR}\). Three replicates were conducted, and the average \(\mathrm{RR}\) values were recorded as the fire labels.

The stress–strain curves of the all-natural nanocomposites were determined using a mechanical testing machine (Instron 68SC-05) fitted with a 500-N load cell. After calibrating the load cell, the all-natural nanocomposites were cut into 3 cm × 1 cm strips and subjected to a tensile test at an extension rate of 0.02 mm s⁻¹. The tensile tests started with an initial fixture gap of 2 cm. Three replicates were conducted for each all-natural nanocomposite.

The surface functional groups of all-natural nanocomposites were characterized using Fourier transform infrared spectroscopy (FT-IR, Thermo Nicolet NEXUS 670).

The cytotoxic effects of all-natural nanocomposites on the cultured cells (that is, L929 cells) were determined by complying with ISO 10993. Six all-natural nanocomposites with different MMT/CNF/gelatin/glycerol ratios were incubated with Dulbecco's modified Eagle medium (DMEM, Gibco) supplemented with foetal bovine serum (Biological Industries) at 37 °C for 24 h, and the media were then extracted for cell culture. L929 cells were then seeded in 96-well cell culture plates at a density of 1 × 10⁴ cells per well and incubated in a standard cell incubation environment with 5% CO₂. After 24 h of cell culture, the culture media were removed and replaced with the extracts of all-natural nanocomposites followed by an additional 24-h incubation. After 24 h, the culture media were withdrawn, and 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide solution was added to each well. Then, the cell culture plate was incubated for 2 h at 37 °C. After the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide solution was discarded, 200 ml of dimethyl sulfoxide was added to dissolve the formazan crystals. The optical density of the formazan solution was read by an enzyme-linked immunosorbent assay plate reader at 570 nm with a reference wavelength of 650 nm.

The cytotoxicity of all-natural nanocomposites was evaluated by a cytotoxicity detection kit (Roche). First, the L929 cells were incubated with the all-natural nanocomposite extracts at 37 °C for 24 h, and the medium (100 µl) was collected and incubated with the reaction mixture from the kit following the manufacturer's instructions. LDH content was assessed by enzyme-linked immunosorbent assay and read at an absorbance of 490 nm in a plate reader with a reference wavelength of 630 nm. To further confirm the cytotoxicity of all-natural nanocomposites, a fluorescence-based live/dead assay (LIVE/DEAD kit, Life) was performed. After the L929 cells were cultured with the extracts for 24 h, calcein was mixed with ethidium homodimer-1 according to the manufacturer's instructions, and the dye (100 µl) was mixed with the retained medium (100 µl), which was added to each well and incubated at 37 °C for 15 min. After the incubation, we used an inverted microscope (Leica DMi8) to capture the images of live (green) and dead (red) cells. Fluorescence with excitation wavelengths of 488 nm and 561 nm was used to visualize the green (515 nm) and red (635 nm) fluorescence signals emitted by calcein and ethidium homodimer-1, respectively. ImageJ software was employed to calculate the proportion of live and dead cell areas. The relative percentages of fluorescence intensity were also determined. ImageJ was utilized to quantify the areas of red and green fluorescence, which produced average values. These numerical values were subsequently used in the quantification formula to determine the fluorescence intensity of live/dead cells in equation (4):

$$\mathrm{Fluorescence}\ \mathrm{intensity}=(\mathrm{Live}/\mathrm{Dead})/(\mathrm{Live}+\mathrm{Dead})\times 100\%$$

(4)
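One way to read equation (4) is that "Live/Dead" in the numerator stands for either the live or the dead signal, each normalized by their sum. Under that assumption (it is an interpretation, not something the text spells out), the calculation can be sketched as follows; the function name and the example areas are hypothetical.

```python
def fluorescence_percentages(live_area, dead_area):
    """Return (live %, dead %) of the total live + dead fluorescence area."""
    total = live_area + dead_area
    if total <= 0:
        raise ValueError("total fluorescence area must be positive")
    return live_area / total * 100, dead_area / total * 100


# Hypothetical ImageJ area measurements in arbitrary units.
live_pct, dead_pct = fluorescence_percentages(live_area=90.0, dead_area=10.0)
print(f"live: {live_pct:.1f}%, dead: {dead_pct:.1f}%")
```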

The full atomistic simulations utilized the ReaxFF potential within the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) simulation package61. The ReaxFF potential is widely used to describe chemical bonds and weak interactions of cellulose chains and MMT nanosheets62,63. As shown in Supplementary Fig. 41a, the MD model of the MMT/CNF nanocomposite was configured as a multilayered microstructure comprising alternating CNF chains and MMT nanosheets, similar to the SEM observations in Supplementary Fig. 41b. The length of the cellulose chains was set to 104, and the scale of the MMT nanosheets was randomly set between 30 and 60, corresponding to the length scale ratio in the experiments (\(L_{\mathrm{CNF}}:L_{\mathrm{MMT}}=1:2\)). The cellulose chains and MMT nanosheets were passivated by polar hydrogens or OH groups. The entire system was equilibrated under the isothermal-isobaric ensemble (that is, NPT ensemble) at 300 K and 0 atm, using the Nosé–Hoover thermostat and barostat. Then, the micro-canonical ensemble was applied in the stretching process. The timestep was set as 0.5 fs, and the periodic boundary conditions were applied in all directions (x, y and z) for all models. To better understand intermolecular interactions, both cellulose chains and MMT nanosheets were randomly arranged in alignment in the periodic box. All calculations were relaxed using the conjugate gradient algorithm to minimize the total energy of the system until the total atomic forces were converged to less than 10⁻⁹ eV Å⁻¹.


Accurate and rapid antibiotic susceptibility testing using a machine learning-assisted nanomotion technology platform – Nature.com

All experimental procedures and bioinformatic analyses in this work comply with ethical regulations and good scientific practices. An ethics approval for the pre-clinical experiments was not required as anonymized biological material, i.e., anonymized blood for the blood culture incubation, was provided by a blood donation center in Switzerland. The clinical study protocol for the PHENOTECH-1 study (NCT05613322) was approved by the Ethics Committee for Investigation with Medicinal Products (CEIm) in Madrid (ID 239/22), the Cantonal Commission for Ethics in Research on Human Beings (CER-VD) in Lausanne (ID 2022-02085), and the Ethics Committee of the Medical University of Innsbruck in Innsbruck (ID 1271/2022).

The strain collection used in this study consists of ATCC reference strains and clinical isolates either from patient blood samples at hospital sites or procured from strain collections (Supplementary Data 1). In order to establish a methodology for nanomotion-based AST, we used the E. coli reference strain ATCC-25922, which is susceptible to ceftriaxone (CRO; ceftriaxone disodium salt hemi(heptahydrate) analytical standard, Merck & Cie, Schaffhausen, Switzerland), cefotaxime (CTX; cefotaxime sodium, Pharmaceutical Secondary Standard, Supelco, Merck & Cie, Schaffhausen, Switzerland), ciprofloxacin (CIP; ciprofloxacin, VETRANAL, analytical standard, Merck & Cie, Schaffhausen, Switzerland), and ceftazidime-avibactam (Sigma-Aldrich, Merck & Cie, Schaffhausen, Switzerland). Our reference strains for antibiotic resistance were BAA-2452 (resistant to CRO and CTX, blaNDM producer) and BAA-2469 (resistant to CIP). The K. pneumoniae reference isolate ATCC-27736 was susceptible to CRO.

To differentiate between resistant and susceptible phenotypes, clinical isolates were selected based on their MIC in accordance with the European Committee on Antimicrobial Susceptibility Testing (EUCAST) interpretation guidelines59. MIC strips and disk diffusion tests were performed on MH Agar plates (Mueller-Hinton agar, VWR International GmbH, Dietikon, Switzerland). During all nanomotion experiments, bacteria in the measurement chamber were incubated with filtered (0.02 µm polyethersulfone, PES, Corning, or Millipore) LB (Miller's LB Broth, Corning) half-diluted in deionized water (Molecular Biology Grade Water, Cytiva), hereafter referred to as 50% LB.

All bacterial strains were stored at −80 °C in 20% glycerol. Bacterial samples for nanomotion experiments were prepared by first thawing new cell aliquots and growing them at 37 °C on Columbia agar medium solid plates (Columbia blood Agar, 5% sheep blood, VWR International GmbH, Dietikon, Switzerland). These cells were then used to inoculate blood culture medium and subsequently grown for nanomotion experimentation.

We performed MIC gradient tests (MIC strips) to determine the minimal inhibitory concentration (MIC) for each antibiotic used in this study. Cell suspensions were prepared by selecting three to five colonies grown overnight (ON) at 37 °C on a Columbia agar plate and resuspending them in 0.9% NaCl solution (Sodium Chloride, 0.9%, HuberLab, PanReac Applichem) at a density of 0.5 McFarland units (corresponding to OD600nm = 0.07). This suspension was then spread on MH plates using a sterile cotton swab to create a confluent culture. MIC strips (ceftriaxone 0.016–256 µg/mL, ciprofloxacin 0.002–32 µg/mL, cefotaxime 0.016–256 µg/mL, ceftazidime 0.016–256 µg/mL, and ceftazidime-avibactam 0.016/4–256/4 µg/mL MIC test strips, Liofilchem, Roseto degli Abruzzi, Teramo, Italy) were then placed onto inoculated plates using tweezers. The plates were subsequently incubated at 37 °C for 16–20 h, with the growth inhibition area surrounding the MIC strip present after this incubation period used to interpret MICs.

While MIC strips served as the primary AST reference method, some situations presented difficult interpretations or exceeded the scale of the CRO MIC strips. Here, broth microdilution assays were performed according to EUCAST recommendations59. Furthermore, a disk diffusion assay (DDA) was performed in parallel to each sample assessed using nanomotion technology for quality assurance purposes20,60.

To facilitate bacterial attachment and prevent cellular detachment during AST recording, we incubated the cantilever with 50 µl of 0.1 mg/ml PDL (Poly-D-Lysine hydrobromide, MP Biomedicals, Santa Ana, California, USA) diluted in molecular biology grade water (HyClone, Logan, Utah, United States) for 20 min at room temperature (RT). This treatment created a homogenous positive electric charge that enabled the attachment of negatively charged bacteria. Following incubation, the PDL drop was removed and discarded, after which the cantilever tip was gently washed with 100 µl of molecular biology-grade water. The sensors on the cantilever were then allowed to dry for at least 15 min before use.

Spiking refers to the process of inoculating blood culture samples with artificially infected blood. Here, we cultured strains of interest on Columbia Agar plates ON at 37 °C, isolated a single colony, and resuspended it in 0.9% NaCl with volumes adjusted to obtain a 0.5 McFarland density. We then performed two 1:10 serial dilutions, starting with that suspension, to generate a final dilution of 1:100. Finally, 10 µl of the final dilution was added to 9,990 µl of EDTA blood from a donor, provided by a blood donation center in Switzerland. The blood was received fully anonymized.
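The overall dilution implied by the spiking procedure can be worked out step by step. The sketch below assumes that "1:10" means one part sample in ten parts total (one part sample plus nine parts diluent) and treats the final step as 10 µl of suspension in 10,000 µl total; the function name is hypothetical.

```python
from fractions import Fraction


def overall_dilution(steps):
    """Multiply per-step dilution factors given as (sample, diluent) volumes."""
    factor = Fraction(1)
    for sample, diluent in steps:
        factor *= Fraction(sample, sample + diluent)
    return factor


steps = [
    (1, 9),       # first 1:10 serial dilution
    (1, 9),       # second 1:10 serial dilution, giving 1:100
    (10, 9990),   # 10 µl of the 1:100 dilution into 9,990 µl of blood
]
print(overall_dilution(steps))  # 1/100000 relative to the 0.5 McFarland suspension
```

Under these assumptions the spiked blood carries the 0.5 McFarland suspension at a 1:100,000 dilution.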

To generate spiked blood cultures, we added 10 ml of artificially infected blood to either anaerobic (ANH) or aerobic (AEH) blood culture bottles (BD BACTEC™ Lytic Anaerobic medium and BD BACTEC™ Standard Aerobic medium Culture Vials; Becton Dickinson, Eysins, Switzerland) using a syringe. These culture bottles were then incubated until positivity, as determined by the BACTEC™ 9240 automated blood culture system (Becton Dickinson), was reached. In most cases, this process took 12 h or an overnight incubation.

To generate and purify bacterial pellets for nanomotion recordings, we used either the MBT Sepsityper IVD Kit (Bruker) or the direct attachment method (DA). When using the MBT Sepsityper IVD Kit, we followed the manufacturer's instructions. Briefly, 1 ml of blood culture was combined with 200 µl Lysis Buffer, mixed by vortexing, and then centrifuged for 2 min at 12,000g to obtain a bacterial pellet. The supernatant was discarded, while the bacterial pellet was resuspended in 1 ml of Washing Buffer. The resuspension was then centrifuged again for 1 min at 12,000g to remove debris. For DA, 1 ml of positive blood culture (PBC) was syringe filtered (5 µm pore size, Acrodisc Syringe Filters with Supor Membrane, Pall, Fribourg, Switzerland). The pellet was then used for attachment to the cantilever.

Bacterial cells from prepared pellets needed to be immobilized onto the surface of the functionalized cantilever for nanomotion recording. First, pellets were resuspended in a PBS (Phosphate Buffer Saline, Corning) solution containing 0.04% agarose. Next, the sensor was placed on a clean layer of Parafilm M (Amcor, Victoria, Australia). The tip of the sensor, containing the chip with the cantilever, was placed into contact with a single drop of bacterial cell suspension for 1 min. After this, the sensor was removed, gently washed with PBS, and assessed using phase microscopy for attachment quality. In the event of unsatisfactory attachment, the sensor was re-incubated in the cell suspension for an additional 30–60 s, or until satisfactory attachment was achieved. We aimed for an even bacterial distribution across the sensor (Fig. 1b, c and Supplementary Fig. 2). The attachment of bacteria is part of a filed patent (PCT/EP2020/087821).

Our nanomotion measurement platform, the Resistell Phenotech device (Resistell AG, Muttenz, Switzerland), comprises a stainless-steel head with a measurement fluid chamber, an active vibration damping system, acquisition and control electronics, and a computer terminal.

Nanomotion-based AST strategies utilize technologies that are well-established in atomic-force microscopy (AFM). Specifically, our nanomotion detection system is based on an AFM setup for cantilever-based optical deflection detection. However, in contrast to standard AFM devices, in the Phenotech device the light source and the photodetector are placed below the cantilever to facilitate the experimental workflow. A light beam, focused at the cantilever end, originates in a superluminescent diode (SLED) module (wavelength: 650 nm, optical power: 2 mW), is reflected, and reaches a four-sectional position-sensitive photodetector that is a part of a custom-made precision preamplifier (Resistell AG). The flexural deflection of the cantilever is transformed into an electrical signal, which is further processed by a custom-made dedicated electronic module (Resistell AG) and recorded using a data acquisition card (USB-6212; National Instruments, Austin, TX, USA). The device is controlled using dedicated AST software (custom-made, Resistell AG).

The custom-made sensors used for the described experiments (Resistell AG) contain quartz-like tipless cantilevers with a gold coating acting as a mirror for the light beam (SD-qp-CONT-TL, spring constant: 0.1 N/m, length × width × thickness: 130 × 40 × 0.75 µm, resonant frequency in air: 32 kHz; NanoWorld AG, Neuchâtel, Switzerland). During an AST experiment, bacterial nanoscale movements actuate the cantilever to deflect in specific frequencies and amplitudes.

For the development of temperature-controlled experiments with CZA at 37 °C, we used modular NanoMotion Device (NMD) prototypes. These allowed the hardware setup to be reconfigured to work with either a standard incubator or a modified measurement head that warms up only the measurement chamber. For the combination of an NMD with a BINDER BD 56 incubator, the incubator is large enough to hold the entire NMD head with the active vibration damping module, while still permitting comfortable manual operation. The incubator shelf was rigid and able to hold the vibration isolator and NMD head (ca. 10 kg), and the incubator was modified with an access port to pass through control cables operating the light source, photodetector, and vibration damping module from the outside. Another NMD prototype was equipped with a locally heated measurement chamber, thermally insulated from the measurement head set-up. A Peltier module was installed under the measurement chamber as a heating element and adapted to temperature control by adding a Pt100 temperature sensor. Temperature was kept at 37 °C by a Eurotherm EPC3016 PID controller (Eurotherm Ltd, Worthing, United Kingdom) and a custom-made Peltier module driver. Both setups had a temperature stability of <0.2 °C, which matches the requirement for stable culture conditions.

Each sampled nanomotion signal was split into 10 s timeframes. For each timeframe, the linear trend was removed and the variance of the residue frame was estimated. For some experiments, the variance signal was too noisy for classification, necessitating the application of an additional smoothing procedure. A running median with a 1 min time window was applied to smooth the variance signal and allow plot interpretation. For the calculation of the SP slope of the variance in the drug phase used for determining the nanomotion dose response in Fig. 2b and Supplementary Fig. 4, we used the formula $\log_{10}(x) = \log_{10}(C) + a\,t$, where $x$ is the variance, $t$ is time (in min), $a$ is the slope of the common logarithm of the variance trend, and $\log_{10}(C)$ is the intercept. Variance plots were used here for the visual inspection of results, and are currently the primary tool accessible for investigators. However, more sophisticated SPs are necessary for reliably classifying phenotypes in ASTs.
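The variance processing described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' software: the function names, the sampling-rate argument, and the way the 1 min window is expressed as six 10 s frames are all assumptions made for the example.

```python
import math
import statistics


def detrended_variance(frame):
    """Remove the least-squares linear trend from one frame; return the residual variance."""
    n = len(frame)
    t_mean = (n - 1) / 2
    y_mean = sum(frame) / n
    sxx = sum((i - t_mean) ** 2 for i in range(n))
    sxy = sum((i - t_mean) * (y - y_mean) for i, y in enumerate(frame))
    slope = sxy / sxx
    residuals = [y - (y_mean + slope * (i - t_mean)) for i, y in enumerate(frame)]
    return statistics.pvariance(residuals)


def frame_variances(signal, sample_rate_hz, frame_s=10):
    """Split the signal into consecutive frames of frame_s seconds (assumed layout)."""
    size = int(sample_rate_hz * frame_s)
    return [detrended_variance(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]


def running_median(values, window=6):
    """Median over a window of `window` frames around each point (6 x 10 s = 1 min)."""
    half = window // 2
    return [statistics.median(values[max(0, i - half): i + half])
            for i in range(len(values))]


def log_variance_slope(variances, frame_s=10):
    """Least-squares fit of log10(x) = log10(C) + a*t with t in minutes; returns (a, log10(C))."""
    t = [i * frame_s / 60 for i in range(len(variances))]
    y = [math.log10(v) for v in variances]
    t_mean, y_mean = sum(t) / len(t), sum(y) / len(y)
    a = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
         / sum((ti - t_mean) ** 2 for ti in t))
    return a, y_mean - a * t_mean
```

A negative slope `a` in the drug phase would correspond to a variance that decays over time on the logarithmic scale.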

Nanomotion-based AST was performed using Resistell Phenotech devices (Resistell AG, Muttenz, Switzerland) on a standard laboratory benchtop. Each recording comprises two phases: a 2-h medium phase and a 2-h drug phase. In addition, a short blank phase is conducted to measure the baseline deflections of a new, bare, functionalized cantilever in 50% LB medium for 5–10 min. Raw nanomotion recordings were used to develop classification models using machine learning.

The signal during the blank phase is expected to be constant and primarily flat (variance around 2.6 E-6 or lower). Higher median values or the clear presence of peaks are indicators of potential contamination of the culture medium inside the measurement fluid chamber, sensor manufacturing errors, or an unusual external environmental noise source that should be identified and rectified. In particular, contamination (OD600<0.01) can cause deflection signals that are several orders of magnitude higher than expected for sterile media due to interactions between floating particles in the fluid chamber and the laser beam. The blank phase serves as a quality control but is not used for classification models and, therefore, can be performed several hours prior to recording medium and drug phases.

The medium phase records cantilever deflections after bacterial attachment, showing the oscillations caused by natural bacterial nanomotions stemming from metabolic and cellular activity. Here, variance is expected to be greater (10⁻⁵ to 10⁻³) than during the blank phase. The 2-h medium phase duration allows cells to adapt to their new environment within the fluid chamber and generates a baseline that can be compared to bacterial nanomotions during the drug phase. The drug phase measures cellular vibrations after an antibiotic has been introduced to the fluid chamber. The antibiotic is directly pipetted into the medium already present within the measurement chamber.

The Phenotech device detects nanomotion signals resulting from the activity of living cells. However, other sources can create detectable noise during cantilever-based sensing [61]. Thermal drift occurring on the cantilever [62], as well as external sources such as acoustic noise and mechanical vibrations, can all impact measurements. Distinguishing cell-generated vibrations from background noise can be challenging. As such, we employed a supervised machine learning-based approach to extract signal parameters (SPs) containing diagnostic information while minimizing overall background noise. The entire procedure of analyzing motional activity of particles is part of a filed patent (PCT/EP 2023/055596).

First, a batch of initial SPs related to frequency and time domains were extracted, with time and frequency resolution being high to allow for further statistical analysis at this level. Next, different statistical parameters were created with a much coarser time and frequency scale. Finally, various combinations (differences, ratios, etc.) were calculated, forming a final batch of SPs that are more related to antibiotic susceptibility. SPs were estimated for experiments with cells and conditions with well-defined and known outputs (e.g., susceptibility to a given antibiotic could be known through reference AST methods). Here, extracted SPs and outputs formed labeled datasets that could be used for supervised machine learning.

A feature selection algorithm extracted the SPs related to the phenomenon of interest. These SPs were selected from the overall batch of SPs to optimize the performance of the machine learning model. In this case, the model was a classifier, validated using metrics that measure how well it distinguishes antibiotic susceptibility. A forward selection method was applied. In the first iteration, each SP was evaluated individually in the classifier with repeated stratified cross-validation. The SP that enabled the classifier to reach maximal accuracy was added to the stack of selected SPs and removed from the remaining SPs. In the next iteration, each remaining SP was tested together with the already-selected SPs, and the best-performing SP was again added to the selected SP stack. This process was repeated until the overall performance reached a plateau or a predefined number of SPs had been selected. In the final model (iii), these newly found SPs were then used as machine learning model features. Classifier models were trained using the complete available dataset and could then be used to classify previously unseen data. The Supplementary information elaborates in more detail on that process and lists all SPs used in the different classification models.
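The forward selection loop described above can be sketched generically as follows. This is an illustration under stated assumptions rather than the authors' implementation: `score_fn` is a hypothetical callback standing in for the repeated stratified cross-validation accuracy of the classifier, and `min_gain` is an assumed way of detecting the performance plateau.

```python
def forward_select(candidates, score_fn, max_features=None, min_gain=1e-6):
    """Greedy forward selection.

    candidates   -- iterable of feature (SP) names
    score_fn     -- hypothetical callback: list of features -> validation score
    max_features -- optional cap on the number of selected features
    min_gain     -- assumed plateau criterion: stop when the best gain is smaller
    """
    selected = []
    remaining = list(candidates)
    best_score = float("-inf")
    while remaining and (max_features is None or len(selected) < max_features):
        # Try each remaining feature together with the already-selected ones.
        scored = [(score_fn(selected + [c]), c) for c in remaining]
        top_score, top_candidate = max(scored, key=lambda pair: pair[0])
        if top_score - best_score < min_gain:
            break  # performance has reached a plateau
        selected.append(top_candidate)
        remaining.remove(top_candidate)
        best_score = top_score
    return selected, best_score
```

In practice `score_fn` would train and cross-validate the classifier on the given feature subset, so each iteration costs one cross-validation run per remaining feature.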

After achieving Pareto optimality, the models were tested on independent test datasets consisting exclusively of strains of K. pneumoniae or E. coli that were not used in the training of the corresponding model. We used either spiked blood cultures or directly anonymized remnant PBC from the Lausanne University Hospital (CHUV) in Lausanne. Spiking was predominantly utilized to increase the fraction of resistant strains to obtain more representative specificity (classification performance of resistant strains), as resistance rates at that hospital are around 10% for CRO and CIP and close to non-existent for CZA. Each nanomotion recording was classified separately, and the recordings belonging to one sample were combined using the median score. Accuracy, sensitivity and specificity were then reported exactly as described for the training performance.

In addition to this, we performed an interim analysis of the multicentric clinical performance study PHENOTECH-1 (NCT05613322), conducted in Switzerland (Lausanne University Hospital, Lausanne), Spain (University Hospital Ramón y Cajal, Madrid) and Austria (Medical University of Innsbruck, Innsbruck). The study evaluates the performance of the nanomotion AST with the Phenotech device using the CRO model on E. coli and K. pneumoniae from fresh residual PBC. Ethical review and approval were obtained by the hospital ethics committee at each participating site. In Lausanne and Innsbruck, only samples from patients who had previously agreed to the use of their residual biological material were utilized. In Madrid, consent for participation was not required for this study in accordance with institutional requirements. No compensation was paid to participants. The interim results reported here comprise the first included 85 samples with complete data entry. The eventual sample size of 250 was estimated based on the expected rate of E. coli and K. pneumoniae samples susceptible to the antibiotic in the three countries (i.e., 80%). Allowing for up to 10% samples with missing data or technical errors, an overall sample size of 250 would include 180 truly susceptible samples with 98% power to demonstrate that sensitivity is at least 90%. The PHENOTECH-1 study is expected to conclude in 2024. The endpoints of this study include the accuracy, sensitivity, and specificity of the device according to ISO-20776-2 (2021), as well as the time to result from the start of the AST to the generation of the result in the form of a time-stamped report. Regarding inclusion criteria, patients aged 18 years or older, with positive blood cultures for either E. coli or K. pneumoniae, are eligible for participation in the study. Additionally, Phenotech AST needs to be performed within 24 h of the blood culture turning positive. Patients with polymicrobial samples are excluded from the study.
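The figure of 180 truly susceptible samples follows from the stated planning numbers by simple arithmetic. The short sketch below reproduces only that count; the variable names are illustrative, and the 98% power statement is not recomputed here because the assumed true sensitivity is not given in the text.

```python
planned_samples = 250
max_unusable_fraction = 0.10          # missing data or technical errors
expected_susceptible_fraction = 0.80  # expected susceptibility rate

# Samples expected to remain evaluable after allowing for 10% losses.
evaluable = round(planned_samples * (1 - max_unusable_fraction))   # 225
# Of those, the number expected to be truly susceptible.
susceptible = round(evaluable * expected_susceptible_fraction)     # 180
```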

Qualitative results of the Kirby Bauer disk diffusion assay, i.e., either R or S, were used for benchmarking. Clinical breakpoints for the class definition were according to EUCAST in 2022. The samples coming from one PBC were measured in technical triplicates for 4 h. The results from each recording were automatically combined to a sample. Instead of the median score, a majority voting system was in place; that is, RRR, RRS and RR- return predicted resistance, while SSS, SSR and SS- return susceptibility. In this way, even if one recording needed to be excluded because of technical errors or detection of substantial elongation of the specimen, the sample could be interpreted. Only if two or more recordings were excluded, or the exclusion of one recording resulted in a disagreement between the two remaining recordings, would the sample be classified as non-conclusive. The experiments were not randomized and the investigators were unblinded during experiments and outcome assessment. Information on sex, gender, and age of participants was not collected in this study, as it was considered to have no impact on the generalizability and translation of the findings. At the time of analysis, the data set included 119 samples, of which 12 were screening failures, 5 had technical errors or elongation, and 12 were incomplete or unverified. Samples with complete, verified and cleaned data amounted to 90. Of these, the first 85 samples were selected, of which 20 samples derived from CHUV, 48 samples from Ramón y Cajal Hospital and 17 samples from the Medical University of Innsbruck.
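The majority voting rule for technical triplicates can be written down compactly. The sketch below is an illustration of the rule as stated in the text, with a hypothetical function name and `None` standing in for an excluded recording (shown as "-" above).

```python
def combine_triplicate(calls):
    """Combine up to three recording-level calls into one sample-level result.

    calls -- list of 'R', 'S', or None (recording excluded), e.g. ['R', 'S', None]
    """
    valid = [c for c in calls if c in ("R", "S")]
    if len(valid) < 2:
        return "non-conclusive"      # two or more recordings excluded
    resistant = valid.count("R")
    susceptible = valid.count("S")
    if resistant == susceptible:
        return "non-conclusive"      # one excluded, the remaining two disagree
    return "R" if resistant > susceptible else "S"
```

For example, `combine_triplicate(["R", "R", None])` returns `"R"`, while `combine_triplicate(["R", "S", None])` returns `"non-conclusive"`.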

Statistical details can be found in the figure legends. Data are presented as mean or median ± SD or representative single experiments and are provided in the Source data file. In Figs. 3, 5, and 6, the performance calculation is based on single recordings for which a score was calculated. Each recording is depicted as a datapoint representing a biological replicate originating from a different PBC. Performance calculation in Fig. 4 is based on the median of the scores calculated for each technical replicate originating from the same PBC. Thus, each datapoint represents the median score as it is currently implemented in the PHENOTECH-1 clinical performance study. In each case, scores are logits predicted by the corresponding logistic regression model. In Fig. 5e, the two-tailed Mann–Whitney U test was performed for calculating a p-value. Statistical analysis and graphs were generated with GraphPad Prism 10.

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Continued here:
Accurate and rapid antibiotic susceptibility testing using a machine learning-assisted nanomotion technology platform - Nature.com

Expert on how machine learning could lead to improved outcomes in urology – Urology Times

In this video, Glenn T. Werneburg, MD, PhD, shares the take-home message from the abstracts "Machine learning algorithms demonstrate accurate prediction of objective and patient-reported response to botulinum toxin for overactive bladder and outperform expert humans in an external cohort" and "Machine learning algorithms predict urine culture bacterial resistance to first line antibiotic therapy at the time of sample collection," which were presented at the Society of Urodynamics, Female Pelvic Medicine & Urogenital Reconstruction 2024 Winter Meeting in Fort Lauderdale, Florida. Werneburg is a urology resident at Glickman Urological & Kidney Institute at Cleveland Clinic, Cleveland, Ohio.

We're very much looking forward to being able to clinically implement these algorithms, both on the OAB side and the antibiotic resistance side. For the OAB, if we can identify who would best respond to sacral neuromodulation, and who would best respond to onabotulinumtoxinA injection, then we're helping patients achieve an acceptable outcome faster. We're improving their incontinence or their urgency in a more efficient way. So we're enthusiastic about this. Once we can implement this clinically, we believe it's going to help us in this way. It's the same for the antibiotic resistance algorithms. When we can get these into the hands of clinicians, we'll be able to have a good suggestion in terms of which is the best antibiotic to use for this patient at this time. And in doing so, we hope to be able to improve our antibiotic stewardship. Ideally, we would use an antibiotic with the narrowest spectrum that would still cover the infecting organism, and in doing so, it reduces the risk for resistance. So if that same patient requires an antibiotic later on in his or her lifetime, chances are (and we'd have to determine this with data and experiments) that if we're implementing a narrower spectrum antibiotic to treat an infection, they're going to be less likely to be resistant to other antibiotics down the line.

This transcription was edited for clarity.

Read more from the original source:
Expert on how machine learning could lead to improved outcomes in urology - Urology Times

Machine Learning, Quantum Computing Can Transform Health Care, Including Diagnosing Pneumonia – Carnegie Mellon University

"Machine learning is used for prediction, and in health care we want to predict if somebody has a disease or not," he said. "If you give enough examples of images that have pneumonia and not pneumonia, because there are two cases, this is called binary classification."

Tayur and a team of researchers studied a technique called support vector machine for classification using quantum-inspired computing, then compared it to other methods in a recent paper.

"We showed that it is pretty competitive," he said. "It makes fewer mistakes and it takes less time."

Tayur founded the Quantum Technologies Group at CMU to better understand and apply quantum computing methods to industries such as health care.

"People are always looking for more efficient ways of solving problems and novel methods and technologies to tackle it," he said.

In the mid-20th century, scientists who led the first quantum revolution changed the world with innovations such as the transistor, laser and atomic clock. While hardware to compute using qubits is still in development, simulators are capable of tackling problems of realistic size with specially tailored algorithms, which is why this approach is known as quantum-inspired computing.

"Assuming that qubit devices of larger size and lower errors are going to be developed, we can simulate them on a regular computer right now," Tayur said.

These technologies, however, are still at the leading edge of considerations when it comes to the application of artificial intelligence in health care.

In order to do so, the industry has four challenges ahead of it, as Tayur described in research with Tinglong Dai of Johns Hopkins Carey Business School: physician buy-in, patient acceptance, provider investment and payer support.

To achieve these goals, any AI applied to health care systems should consider how physicians will integrate it into their practices, and then review how patients perceive the role of AI in health care delivery.

"We wrote that paper in 2022, but things haven't changed that much. It's not just about building a better mousetrap, it's about getting people to use that mousetrap," he said, referencing a long-held business idea that success comes from simply designing the best product.

First, as an example, Tayur explained that more than 500 medical AI devices have been approved by the FDA, but wide adoption of these technologies is still just beginning, in part because of the state of the health care industry and where financial incentives lie.

"Having a good product is necessary, but it's not sufficient," he said. "You still need to figure out how people are going to use it, and who is going to pay for it."

Second, a major consideration in health care is liability. When it comes to devices, a company might encourage doctors to adopt them, but what happens if the device gives a faulty diagnosis or a doctor gives an incorrect interpretation of the data from the device?

"In the paper, we basically talk about the fact that you have to figure out the business case, both risk and reward, along with training and upfront investments in adopting the technology," he said.

In applying elements of AI and quantum computing to health care, Tayur said while at least some progress has been made, there is still a long way to go.

"Many times what happens is a lot of the AI in health care is derived by scientists and research physicians," he said. "What they need is a business person who is less enamored by the mousetrap and more sensitive to the patient journey and commercial viability."

See original here:
Machine Learning, Quantum Computing Can Transform Health Care, Including Diagnosing Pneumonia - Carnegie Mellon University

Creation of a machine learning-based prognostic prediction model for various subtypes of laryngeal cancer | Scientific … – Nature.com

Data collection

The Surveillance, Epidemiology, and End Results Program (SEER) database was used to gather the study's data (Incidence-SEER Research Plus Data 17 Registries Nov 2021 Sub). Using the SEER*Stat program (version 8.4.2), we retrieved individuals who had been diagnosed with larynx carcinoma according to the third edition of the International Classification of Diseases for Oncology (ICD-O-3). The study period covers cases treated between 2000 and 2019. The inclusion criteria were as follows: the behavior was identified as malignant, and the site and morphology were coded as "larynx".

In total, 54,613 patients with primary laryngeal malignant tumors were included. The median follow-up duration of the sample in this study is 38 months. We used the following exclusion criteria to clean up the data: (1) Patients with limited follow-up information; (2) Patients without T stage (AJCC7), N stage (AJCC7), M stage (AJCC7), or AJCC stage grade information.

We selected variables that were directly related to the clinic, such as age, race, and gender, based on clinical experience. We chose the T stage, N stage, M stage, AJCC stage (AJCC stage 7), tumor size, and pathological categorization to assess the patient's health. Finally, to evaluate the patient's treatment plans, we also included radiation therapy, surgery, and chemotherapy.

A classic model for survival analysis, the Cox proportional hazards (CoxPH) model has been the most commonly applied multifactor analysis technique in survival analysis to date [18,19].

CoxPH is a statistical technique for survival analysis, which is mainly used to study the relationship between survival time and one or more predictors. The core of the model is the proportional risk hypothesis.

It is expressed as $h(t \mid x) = h_0(t)\,\exp(\beta^{\top} x)$, where $h(t \mid x)$ is the instantaneous risk function given the covariates $x$, $h_0(t)$ is the baseline risk function, and $\exp(\beta^{\top} x)$ represents the multiplicative effect of the covariates on risk.
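A practical consequence of the proportional hazards form h(t|x) = h0(t) exp(β·x) is that the baseline hazard cancels when two patients are compared, so the hazard ratio depends only on the coefficients and covariates. The following sketch illustrates this; the coefficient values and covariate vectors are made up for the example and are not taken from the study.

```python
import math


def relative_hazard(beta, x):
    """exp(beta . x): the multiplicative effect of covariates x on the baseline hazard."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))


def hazard_ratio(beta, x_a, x_b):
    """Hazard of patient A relative to patient B; h0(t) cancels, so it is constant in t."""
    return relative_hazard(beta, x_a) / relative_hazard(beta, x_b)


# Hypothetical coefficients for two covariates (illustrative values only).
beta = [math.log(2.0), 0.03]
# Two patients who differ only in the first (binary) covariate.
ratio = hazard_ratio(beta, [1, 60], [0, 60])
```

Here `ratio` is approximately 2, meaning the first patient's hazard is modelled as about twice the second's at every time point.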

The random survival forest (RSF) model is an extremely efficient integrated learning model that can handle complex data linkages and is made up of numerous decision trees [20].

RSF can improve the accuracy and robustness of the prediction, but it does not have a single expression because it is an integrated model consisting of multiple decision trees [21]. RSF constructs 1000 trees and calculates the importance of variables. To find the optimal model parameters, we adjust three key parameters: the maximum number of features of the tree (mtry), the minimum sample size of each node (nodesize), and the maximum depth of the tree (nodedepth). The values of these parameters are set to mtry from 1 to 10, nodesize from 3 to 30, and nodedepth from 3 to 6. We use a random search strategy (RandomSearch) to optimize the parameters. To evaluate the performance of the model under different parameter configurations, we use tenfold cross-validation and use C-index (ConcordanceIndex) as the evaluation index. The purpose of this process is to find the parameter configuration that can maximize the prediction accuracy of the model through many iterations.
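Since the C-index serves as the evaluation index here, a small reference implementation may help make it concrete. The sketch below computes Harrell's concordance index in plain Python; it is an illustration with hypothetical argument names, not the implementation used in the study (which relied on R packages), and it ignores ties in survival time for simplicity.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index.

    times  -- observed follow-up times
    events -- 1 if the event was observed, 0 if censored
    risks  -- predicted risk scores (higher = expected to fail earlier)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i had an observed event strictly before j's time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0      # earlier failure got the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5      # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, while 0.5 corresponds to random ranking.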

The gradient boosting machine (GBM) model is one of the integrated learning methods known as boosting, which constructs a strong prediction model by combining several weak prediction models (usually decision trees). At each step, GBM adds a new weak learner by minimizing the loss function. The newly added model is trained to reduce the residual generated in the previous step, and the direction is determined by the gradient descent method. It can be expressed as $F_{m+1}(x) = F_m(x) + \gamma_m h_m(x)$, where $h_m(x)$ is the newly added weak model and $\gamma_m$ is the learning rate.
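The boosting update can be illustrated with a deliberately small example. The sketch below assumes squared-error loss, one-dimensional inputs, decision stumps as weak learners, and a constant learning rate in place of a step-dependent one; all of these are simplifying assumptions for illustration and do not reflect the survival-specific GBM used in the study.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals (least squares)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Build F(x) = F_0 + learning_rate * sum of stumps, one stump per round."""
    f0 = sum(ys) / len(ys)                 # initial constant model
    learners = []
    preds = [f0] * len(xs)
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        learners.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: f0 + learning_rate * sum(h(x) for h in learners)
```

Each round shrinks the remaining residual by a factor that depends on the learning rate, which is why smaller rates usually need more rounds.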

XGBoost is an efficient implementation of GBM, especially in optimizing computing speed and efficiency. To reuse the learner with the highest performance, it linearly combines the base learners with various weights [22]. eXtreme Gradient Boosting (XGBoost) is an optimization of the Gradient Boosting Decision Tree (GBDT), which boosts the algorithm's speed and effectiveness [23]. The neural network-based multi-task logistic regression model developed by DeepSurv outperforms the conventional linear survival model in terms of performance [24]. DeepSurv uses a deep neural network to simulate the Cox proportional hazards model. Therefore, DeepSurv can be expressed as $h(t \mid x) = h_0(t)\,\exp(g(x))$, where $g(x)$ is the output of the neural network, which takes the place of the linear combination of the covariates $x$ [8].

We group the five models according to the variable screening technique suited to each. The RSF, GBM, and XGBoost models are screened using the least absolute shrinkage and selection operator (LASSO) regression analysis, while the CoxPH model is screened using the traditional univariate and multivariate Cox regression analysis [25,26,27].

In contrast, the DeepSurv model can automatically extract features and handle high-dimensional data and nonlinear relationships, so variable screening is not necessary [28]. To further illustrate the model's dependability, we randomly split the data set into the combined training and validation data and a test set in the ratio of 9:1 using SPSS (version 26), so that a randomly selected 10% of the data served as external verification. The remaining data were then divided into the training set and validation set in the ratio of 7:3, and for both splits, the log-rank test was used to evaluate any differences between the two cohorts. Once the variables had been filtered following the aforementioned stages, the mlr3 package of R (version 4.2.2) was used with the grid search approach to fine-tune the hyperparameters of the RSF, GBM, and XGBoost models on the validation set and to choose the most beneficial hyperparameters to build the survival model. Finally, the DeepSurv model was constructed using the Python (version 3.9) sksurv package, and the model was additionally optimized using grid search.
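The two-stage split (9:1 into development and test data, then 7:3 into training and validation data) can be sketched as follows. The study performed the split in SPSS; this Python version is only an illustration, and the patient identifiers, seeds, and helper name are hypothetical.

```python
import random


def split(items, first_fraction, seed):
    """Shuffle reproducibly and cut the list into two parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * first_fraction)
    return shuffled[:cut], shuffled[cut:]


patients = list(range(1000))                              # hypothetical patient IDs
development, test = split(patients, 0.9, seed=0)          # 9:1 split
train, validation = split(development, 0.7, seed=1)       # 7:3 split of the 90%
```

With 1000 patients this yields 630 training, 270 validation, and 100 test cases, and no patient appears in more than one set.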

We used the integrated Brier score (IBS), which is appropriate for 1-year, 3-year, and 5-year time points, as the major assessment metric when evaluating the prediction performance of the model in the test set. In addition, the calibration curve is drawn, and the conventional time-dependent receiver operating characteristic (ROC) curve as well as the area under the curve (AUC) (1 year, 3 years, and 5 years) are compared. By calculating the clinical net benefit to address the actual needs of clinical decisions, Decision Curve Analysis (DCA), a clinical evaluation prediction model, incorporates the preferences of patients or decision-makers into the analysis. The contribution of the various clinicopathological characteristics to prognosis also needs to be calculated. We visualized the survival contribution of several clinicopathological characteristics at 1, 3, and 5 years using the Shapley Additive Explanations (SHAP) plot.

Clinically speaking, various individuals require personalized care. Consequently, it is crucial to estimate the likelihood that a single patient will survive. The survival probability of a certain patient is predicted using the ggh4x package of R (version 4.2.2), along with the contribution of several clinicopathological characteristics to survival. This has major clinical work implications.

Read more:
Creation of a machine learning-based prognostic prediction model for various subtypes of laryngeal cancer | Scientific ... - Nature.com

Optimize price-performance of LLM inference on NVIDIA GPUs using the Amazon SageMaker integration with NVIDIA … – AWS Blog

NVIDIA NIM microservices now integrate with Amazon SageMaker, allowing you to deploy industry-leading large language models (LLMs) and optimize model performance and cost. You can deploy state-of-the-art LLMs in minutes instead of days using technologies such as NVIDIA TensorRT, NVIDIA TensorRT-LLM, and NVIDIA Triton Inference Server on NVIDIA accelerated instances hosted by SageMaker.

NIM, part of the NVIDIA AI Enterprise software platform listed on AWS Marketplace, is a set of inference microservices that bring the power of state-of-the-art LLMs to your applications, providing natural language processing (NLP) and understanding capabilities, whether you're developing chatbots, summarizing documents, or implementing other NLP-powered applications. You can use pre-built NVIDIA containers to host popular LLMs that are optimized for specific NVIDIA GPUs for quick deployment or use NIM tools to create your own containers.

In this post, we provide a high-level introduction to NIM and show how you can use it with SageMaker.

NIM provides optimized and pre-generated engines for a variety of popular models for inference. These microservices support a variety of LLMs, such as Llama 2 (7B, 13B, and 70B), Mistral-7B-Instruct, Mixtral-8x7B, NVIDIA Nemotron-3 22B Persona, and Code Llama 70B, out of the box using pre-built NVIDIA TensorRT engines tailored for specific NVIDIA GPUs for maximum performance and utilization. These models are curated with the optimal hyperparameters for model-hosting performance for deploying applications with ease.

If your model is not in NVIDIA's set of curated models, NIM offers essential utilities such as the Model Repo Generator, which facilitates the creation of a TensorRT-LLM-accelerated engine and a NIM-format model directory through a straightforward YAML file. Furthermore, an integrated community backend of vLLM provides support for cutting-edge models and emerging features that may not have been seamlessly integrated into the TensorRT-LLM-optimized stack.

In addition to creating optimized LLMs for inference, NIM provides advanced hosting technologies such as optimized scheduling techniques like in-flight batching, which can break down the overall text generation process for an LLM into multiple iterations on the model. With in-flight batching, rather than waiting for the whole batch to finish before moving on to the next set of requests, the NIM runtime immediately evicts finished sequences from the batch. The runtime then begins running new requests while other requests are still in flight, making the best use of your compute instances and GPUs.
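To make the benefit of in-flight batching concrete, the toy simulation below compares it with static batching. It is a scheduling sketch only and does not use or represent any NIM, TensorRT-LLM, or Triton API: each request is reduced to a hypothetical number of tokens to generate, and one loop iteration stands for one generation step on the model.

```python
from collections import deque


def static_batching_steps(request_lengths, max_batch_size):
    """Each batch runs until its longest request finishes before the next batch starts."""
    steps = 0
    for start in range(0, len(request_lengths), max_batch_size):
        steps += max(request_lengths[start:start + max_batch_size])
    return steps


def inflight_batching_steps(request_lengths, max_batch_size):
    """Finished sequences are evicted after every step and free slots are refilled at once."""
    waiting = deque(request_lengths)
    active = []  # tokens still to generate for each running request
    steps = 0
    while waiting or active:
        # Admit waiting requests into any free batch slots.
        while waiting and len(active) < max_batch_size:
            active.append(waiting.popleft())
        steps += 1  # one generation iteration for every active request
        # Requests that just produced their last token leave the batch.
        active = [remaining - 1 for remaining in active if remaining > 1]
    return steps
```

For request lengths `[3, 1, 2]` and a batch size of 2, static batching needs 5 iterations, whereas in-flight batching needs 3 because the third request starts as soon as the one-token request finishes.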

NIM integrates with SageMaker, allowing you to host your LLMs with performance and cost optimization while benefiting from the capabilities of SageMaker. When you use NIM on SageMaker, you can use capabilities such as scaling out the number of instances to host your model, performing blue/green deployments, and evaluating workloads using shadow testing, all with best-in-class observability and monitoring with Amazon CloudWatch.

Using NIM to deploy optimized LLMs can be a great option for both performance and cost. It also helps make deploying LLMs effortless. In the future, NIM will also allow for Parameter-Efficient Fine-Tuning (PEFT) customization methods like LoRA and P-tuning. NIM also plans to have LLM support by supporting Triton Inference Server, TensorRT-LLM, and vLLM backends.

We encourage you to learn more about NVIDIA microservices and how to deploy your LLMs using SageMaker and try out the benefits available to you. NIM is available as a paid offering as part of the NVIDIA AI Enterprise software subscription available on AWS Marketplace.

In the near future, we will post an in-depth guide for NIM on SageMaker.

James Park is a Solutions Architect at Amazon Web Services. He works with Amazon.com to design, build, and deploy technology solutions on AWS, and has a particular interest in AI and machine learning. In his spare time he enjoys seeking out new cultures, new experiences, and staying up to date with the latest technology trends. You can find him on LinkedIn.

Saurabh Trikande is a Senior Product Manager for Amazon SageMaker Inference. He is passionate about working with customers and is motivated by the goal of democratizing machine learning. He focuses on core challenges related to deploying complex ML applications, multi-tenant ML models, cost optimizations, and making deployment of deep learning models more accessible. In his spare time, Saurabh enjoys hiking, learning about innovative technologies, following TechCrunch, and spending time with his family.

Qing Lan is a Software Development Engineer in AWS. He has been working on several challenging products in Amazon, including high performance ML inference solutions and high performance logging systems. Qing's team successfully launched the first billion-parameter model in Amazon Advertising with very low latency required. Qing has in-depth knowledge of infrastructure optimization and Deep Learning acceleration.

Nikhil Kulkarni is a software developer with AWS Machine Learning, focusing on making machine learning workloads more performant on the cloud, and is a co-creator of AWS Deep Learning Containers for training and inference. He's passionate about distributed Deep Learning Systems. Outside of work, he enjoys reading books, fiddling with the guitar, and making pizza.

Harish Tummalacherla is a Software Engineer with the Deep Learning Performance team at SageMaker. He works on performance engineering for serving large language models efficiently on SageMaker. In his spare time, he enjoys running, cycling and ski mountaineering.

Eliuth Triana Isaza is a Developer Relations Manager at NVIDIA empowering Amazon's AI MLOps, DevOps, Scientists and AWS technical experts to master the NVIDIA computing stack for accelerating and optimizing Generative AI Foundation models spanning from data curation, GPU training, model inference and production deployment on AWS GPU instances. In addition, Eliuth is a passionate mountain biker, skier, tennis and poker player.

Jiahong Liu is a Solution Architect on the Cloud Service Provider team at NVIDIA. He assists clients in adopting machine learning and AI solutions that leverage NVIDIA accelerated computing to address their training and inference challenges. In his leisure time, he enjoys origami, DIY projects, and playing basketball.

Kshitiz Gupta is a Solutions Architect at NVIDIA. He enjoys educating cloud customers about the GPU AI technologies NVIDIA has to offer and assisting them with accelerating their machine learning and deep learning applications. Outside of work, he enjoys running, hiking and wildlife watching.

Read more from the original source:
Optimize price-performance of LLM inference on NVIDIA GPUs using the Amazon SageMaker integration with NVIDIA ... - AWS Blog

Applying machine learning algorithms to predict the stock price trend in the stock market: The case of Vietnam … – Nature.com

Foundation theory

When discussing the stock market, with its inherent complexity, the predictability of stock returns has always been a subject of debate that attracts much research. Fama (1970) proposed the efficient market hypothesis, which holds that the current price of an asset always and immediately reflects all previously available information. In addition, the random walk hypothesis states that a stock's price changes independently of its history; in other words, tomorrow's price will depend only on tomorrow's information, regardless of today's price (Burton, 2018). Together, these two hypotheses imply that there is no means of accurately predicting stock prices.

On the other hand, other authors argue that stock prices can in fact be predicted, at least to some extent. A variety of methods for predicting and modeling stock behavior have been the subject of research in many different disciplines, such as economics, statistics, physics, and computer science (Lo and MacKinlay, 1999).

A popular method for modeling and predicting the stock market is technical analysis, which is based on historical market data, primarily price and volume. Technical analysis follows several assumptions: (1) prices are determined exclusively by supply and demand relationships; (2) prices change with the trend; (3) changes in supply and demand cause the trend to reverse; (4) changes in supply and demand can be identified on the chart; and (5) the patterns on the chart tend to repeat. In other words, technical analysis does not take into account external factors such as political, social, or macroeconomic conditions (Kirkpatrick & Dahlquist, 2010). Research by Biondo et al. (2013) shows that short-term trading strategies based on technical analysis indicators can work better than some traditional methods, such as the moving average convergence divergence (MACD) and the relative strength index (RSI).
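
For readers unfamiliar with the two indicators named above, the following sketch computes them from a list of closing prices. It is an illustration under stated assumptions, not the procedure used in any of the cited studies: the 12/26/9 and 14-period settings are the commonly quoted defaults, the RSI uses the simple-average variant rather than Wilder's smoothing, and the function names are our own.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(alpha * value + (1 - alpha) * out[-1])
    return out


def macd(prices, fast=12, slow=26, signal=9):
    """Return (MACD line, signal line), both aligned with `prices`."""
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    return macd_line, ema(macd_line, signal)


def rsi(prices, period=14):
    """RSI over the last `period` price changes, using simple averages."""
    changes = [b - a for a, b in zip(prices[:-1], prices[1:])][-period:]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)


rising = [float(p) for p in range(1, 40)]  # made-up, steadily rising prices
macd_line, signal_line = macd(rising)
print("MACD:", macd_line[-1], "signal:", signal_line[-1], "RSI:", rsi(rising))
```

A positive MACD line means the fast average sits above the slow one, and an RSI near 100 means recent changes were almost all gains; both are read from price history alone, in line with the assumptions listed above.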

Technical analysis is a well-known method of forecasting future market trends by generating buy or sell signals based on specific information obtained from prices. Its popularity and continued application are widely recognized, with techniques for uncovering hidden patterns ranging from very rudimentary analysis of moving averages to the recognition of rather complex time series patterns. Brock et al. (1992) show that simple trading rules based on the movement of short-term and long-term moving average returns have significant predictive power with daily data for more than a century on the Dow Jones Industrial Average. Fifield et al. (2005) went on to investigate the predictive power of the filter rule and the moving average oscillator rule in 11 European stock markets, covering the period from January 1991 to December 2000. Their key findings indicate that four emerging markets (Greece, Hungary, Portugal and Turkey) are informationally inefficient compared with the seven more advanced markets. Past empirical results support technical analysis (Fifield et al. 2005); however, such evidence can be criticized because of data bias (Brock et al. 1992).
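
A minimal sketch of a moving-average trading rule of the kind discussed above: signal "buy" while the short-term average is above the long-term average and "sell" while it is below. The window lengths, the prices, and the function names are hypothetical choices for illustration; the cited studies use their own rule definitions and much longer windows.

```python
def sma(prices, window):
    """Simple moving average; None until `window` prices are available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def ma_rule_signals(prices, short=2, long=4):
    """Return 'buy', 'sell', or 'hold' for each day from a short/long MA rule."""
    signals = []
    for short_value, long_value in zip(sma(prices, short), sma(prices, long)):
        if short_value is None or long_value is None:
            signals.append("hold")  # not enough history yet
        elif short_value > long_value:
            signals.append("buy")
        elif short_value < long_value:
            signals.append("sell")
        else:
            signals.append("hold")
    return signals


prices = [10, 11, 12, 13, 12, 11, 10, 9]  # made-up closing prices
print(ma_rule_signals(prices))
```

The rule uses nothing but past prices, which is why evidence that it has predictive power is treated as a challenge to the efficient market hypothesis.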

Elman (1990) proposed the Recurrent Neural Network (RNN). In essence, an RNN addresses the problem of processing sequence data such as text, voice, and video. Samples of this data type are sequentially related, and each sample is associated with the one before it. For example, in text, a word is related to the word that precedes it. In meteorological data, the temperature on one day is related to the temperatures of the previous few days. A set of observations is defined as a sequence from which multiple sequences can be observed. This feature makes the RNN algorithm well suited to the time series data used in stock analysis, as shown in Fig. 1:

Source: Lai et al. (2019).

Figure 1 shows the structure of an RNN, in which the output of the hidden layer is stored in memory. Memory can be thought of as another input. The main reason RNNs are difficult to train is the way the hidden-layer weight parameter is passed along. Because error propagation through the RNN is not controlled, the value of this parameter is multiplied repeatedly during both forward and backward propagation. (1) Gradient Vanishing problem: when the gradient is small, it shrinks exponentially and has almost no effect on the output. (2) Gradient Exploding problem: conversely, if the gradient is large, multiplying it exponentially leads to gradient explosion. Of course, this problem exists in any deep neural network, but it is especially evident in the RNN because of its recursive structure. Further, RNNs differ from traditional feed-forward networks in that their neural connections do not run in only one direction; in other words, neurons can transmit data to a previous layer or to the same layer. Because information is not passed in a single direction, the network has a practical form of short-term memory, in addition to the long-term memory that neural networks acquire through training.
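
The repeated multiplication described above can be seen in a few lines. The sketch below assumes the simplest possible case, a scalar linear recurrence h_t = w * h_{t-1} + x_t, so the gradient of the final state with respect to the initial state is just w multiplied by itself once per time step. The weights and the step count are arbitrary illustrative values.

```python
def gradient_through_time(w, steps):
    """d h_T / d h_0 for the linear recurrence h_t = w * h_{t-1} + x_t."""
    grad = 1.0
    for _ in range(steps):
        grad *= w  # each time step multiplies by the same recurrent weight
    return grad


for w in (0.5, 1.0, 1.5):
    print(w, gradient_through_time(w, steps=50))
```

With a weight below 1 the gradient collapses toward zero after 50 steps, and with a weight above 1 it grows by many orders of magnitude, which is the vanishing and exploding behavior in miniature.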

The Long Short-Term Memory (LSTM) algorithm, introduced by Hochreiter and Schmidhuber (1997), aims to provide better performance by solving the Gradient Vanishing problem that recurrent networks suffer from when dealing with long sequences of data. In LSTM, each neuron is a memory cell that connects previous information to the current task. An LSTM network is a special type of RNN. The LSTM can capture the error so that it can be propagated back through the layers over time. LSTM keeps the error at a more constant level, so the network can learn over many time steps, which opens the door to adjusting the parameters of the algorithm (Liu et al. 2018). The LSTM is a special network topology with three gate structures (shown in Fig. 2). Three gates are placed in an LSTM unit: the input, forget, and output gates. As information enters the LSTM network, it is selected according to rules. Only information that matches the algorithm is forwarded; information that does not match is forgotten through the forget gate.

Source: Ding et al. (2015).

This gate-based architecture allows information to be selectively forwarded to the next unit based on the principle of the activation function of the LSTM network. LSTM networks are widely used and have achieved positive results when compared with other methods (Graves, 2012), especially in Natural Language Processing and in handwriting recognition (Graves et al. 2008). The LSTM algorithm has branched out into a number of variations, but compared with the original, they do not seem to have made any significant improvements to date (Greff et al. 2016).
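
As a rough illustration of the three gates, here is one time step of a scalar LSTM cell written with the standard textbook update equations. It is a sketch, not the configuration used in the article: the parameter dictionary layout, the gate names, and all weights are hypothetical, and real models use vectors and learned weight matrices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell; params maps gate -> (w_x, w_h, bias)."""
    def pre_activation(name):
        w_x, w_h, bias = params[name]
        return w_x * x + w_h * h_prev + bias

    i = sigmoid(pre_activation("input"))         # how much new information to write
    f = sigmoid(pre_activation("forget"))        # how much old cell state to keep
    o = sigmoid(pre_activation("output"))        # how much cell state to expose
    g = math.tanh(pre_activation("candidate"))   # candidate value to write
    c = f * c_prev + i * g                       # updated cell (memory) state
    h = o * math.tanh(c)                         # updated hidden state
    return h, c


# Hypothetical weights: forget gate almost fully open, input gate almost closed.
keep_memory = {
    "input": (0.0, 0.0, -20.0),
    "forget": (0.0, 0.0, 20.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.7, params=keep_memory)
print(h, c)
```

With these settings the cell state passes through almost unchanged, which shows how the gates let an LSTM carry information across many steps instead of letting it fade.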

Data on the stock market is very large and non-linear in nature. To model this type of data, it is necessary to use models that can analyze the patterns on the chart. Deep learning algorithms are capable of identifying and exploiting information hidden within data through the process of self-learning. Unlike other algorithms, deep learning models can model this type of data efficiently (Agrawal et al. 2019).

Studies that analyze financial time series data with neural network models use many different types of input variables to predict stock returns. In some studies, the input data used to build the model includes only a single time series (Jia, 2016). Other studies include both indicators of market information and macroeconomic variables (White, 1988). In addition, there are many variations in how neural network models are applied to time series analysis: Ding et al. (2015) combine financial time series analysis with natural language processing, while Roman and Jameel (1996) and Heaton et al. (2016) use deep learning architectures to model multivariable financial time series. Chan et al. (2000) introduce a neural network model that uses technical analysis variables to predict the Shanghai stock market, and compare the performance of two algorithms and two different weight initialization methods. The results show that the efficiency of back-propagation can be increased by conjugate gradient learning with multiple linear regression weight initialization.

Because the recurrent neural network (RNN) model is well suited to the task and performs well, a lot of research has been done on applying RNNs to stock analysis and forecasting. Roman and Jameel (1996) used backpropagation models and RNNs to predict stock indexes for five different stock markets. Saad, Prokhorov, and Wunsch (1998) apply time-delay, recurrent, and probabilistic neural network models to predict daily stock data. Hegazy et al. (2014) applied machine learning algorithms such as PSO and LS-SVM to forecast the S&P 500 stock market. With the advent of LSTM, the analysis of time-dependent data became more efficient. The LSTM algorithm has the ability to store historical information and is widely used in stock price prediction (Heaton et al. 2016).

For stock price prediction, LSTM network performance has been greatly appreciated when combined with NLP, which uses news text data as input to predict price trends. In addition, there are also a number of studies that use price data to predict price movements (Chen et al. 2015), using historical price data in addition to stock indices to predict whether stock prices will increase, decrease or stay the same during the day (Di Persio and Honchar, 2016), or compare the performance of the LSTM with its own proposed method based on a combination of different algorithms (Pahwa et al. 2017).
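
To show what "using historical price data to predict whether prices will increase, decrease or stay the same" can look like as a dataset, the sketch below slices a price series into fixed-length input windows and labels each with the direction of the next move. The window length, the threshold, the prices, and the function name are assumptions made for illustration; the cited studies define their inputs and labels in their own ways.

```python
def make_windows(prices, window, threshold=0.0):
    """Split a price series into (input window, next-step trend label) pairs."""
    samples = []
    for end in range(window, len(prices)):
        history = prices[end - window:end]
        # Relative change from the last price in the window to the next price.
        change = (prices[end] - prices[end - 1]) / prices[end - 1]
        if change > threshold:
            label = "up"
        elif change < -threshold:
            label = "down"
        else:
            label = "flat"
        samples.append((history, label))
    return samples


prices = [100.0, 101.0, 101.0, 99.0, 102.0]  # made-up closing prices
for history, label in make_windows(prices, window=3):
    print(history, label)
```

Each pair is one training example: the window is what a sequence model such as an LSTM would read, and the label is what it would be asked to predict.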

Zhuge et al. (2017) combine LSTM with the Naive Bayes method to extract market sentiment factors and improve predictive performance. This method can be used to predict financial markets on completely different time scales from other variables. The sentiment analysis model is integrated with the LSTM time series model to predict the stock's opening price, and the results show that this model can improve prediction accuracy.

Jia (2016) discussed the effectiveness of LSTM in stock price prediction research and showed that LSTM is an effective method to predict stock returns. The real-time wavelet transform was combined with the LSTM network to predict the East Asian stock index, which corrected some logic defects in previous studies. Compared with the model using only LSTM, the combined model greatly improves prediction accuracy and has a small regression error. In addition, Gülmez (2023) argued that the LSTM model is suitable for time series data on financial markets, where stock prices are established by supply and demand relationships. Studying the Dow Jones stock index, a market for stocks, bonds and other securities in the USA, that work also produced stock forecasts for the period 2019 to 2023. Another study, by Usmani and Shamsi (2023) on the Pakistan stock market, examined general market, industry, and stock-related news categories and their influence on stock price forecasts. This confirms that the LSTM model has recently become more widely used in stock price forecasting.

See original here:
Applying machine learning algorithms to predict the stock price trend in the stock market: The case of Vietnam ... - Nature.com

Nacha Rules Illustrate Need for AI and ML as Banks Battle Fraud – PYMNTS.com

For financial institutions and stakeholders, the twin threats of business email compromise (BEC) and credit-push fraud demand more monitoring, along with the advanced technologies needed to carry it out.

Nacha introduced new rules Monday (March 18) that create a base level of ACH monitoring for all parties in the ACH Network with the exception of consumers.

"The rules are designed to promote the detection of fraud through the credit-push payment flow, from the point of origination through the point of receipt at an account at the [receiving depository financial institution (RDFI)]," PYMNTS reported Monday.

In terms of the mechanics of the new rules, when fraud is detected, the originating depository financial institution (ODFI) can request the return of the payment, the RDFI can delay funds availability (within limits), and the RDFI can return a suspicious transaction, without waiting for a request or a customer claim.

The rules are a bid to reduce BEC and vendor impersonation, among other scams.

PYMNTS Intelligence found last year that 43% of U.S. banks saw a rise in fraudulent transactions. Twelve percent of scams came from impersonation schemes. From 2022 to 2023, the share of firms that experienced an increase in payments made via same-day ACH increased by about 35 percentage points, surging from nearly 11% to nearly 46%.

The financial impact of that fraud has been considerable: the price tag of fraudulent transactions came in at $3.2 million in 2023, up from $2.3 million the year before.

The share of fraudulent transactions due to schemes that impersonated authorized parties grew over the same timeframe from 11.1% in 2022 to 13.9% last year.

Separately, the FBI reported that BEC scams grew by double-digit percentage points. Between December 2021 and December 2022, there was a 17% increase in identified global exposed losses. There were more than 277,900 reported incidents, resulting in more than $50.8 billion in reported losses.

The Federal Trade Commission reported earlier this year that impostor scams totaled $2.7 billion, with $800 as the median amount lost when consumers were targeted.

In the United Kingdom, draft legislation would require banks and payment firms to reimburse victims of authorized push payment fraud up to 415,000 pounds (about $528,000) per incident, as well as implement a policy delaying payments for up to four days if fraud is suspected.

PYMNTS Intelligence found that rules-based algorithms, artificial intelligence and machine learning are the technologies most used to combat fraud, especially among larger banks. Sixty percent of financial institutions reported using rules-based algorithms to combat fraud, up from 50% in 2022.

PYMNTS Intelligence also found that at least 66% of financial institutions with more than $5 billion in assets use AI and ML, exceeding the 44% of smaller banks that do the same.

Elsewhere, 56% of financial institutions overall plan to increase their use of AI and ML models to combat fraud.

Go here to see the original:
Nacha Rules Illustrate Need for AI and ML as Banks Battle Fraud - PYMNTS.com