Category Archives: Machine Learning

Open source observability for AWS Inferentia nodes within Amazon EKS clusters | Amazon Web Services – AWS Blog

Recent developments in machine learning (ML) have led to increasingly large models, some of which require hundreds of billions of parameters. Although they are more powerful, training and inference on those models require significant computational resources. Despite the availability of advanced distributed training libraries, it's common for training and inference jobs to need hundreds of accelerators (GPUs or purpose-built ML chips such as AWS Trainium and AWS Inferentia), and therefore tens or hundreds of instances.

In such distributed environments, observability of both instances and ML chips becomes key to model performance fine-tuning and cost optimization. Metrics allow teams to understand workload behavior, optimize resource allocation and utilization, diagnose anomalies, and increase overall infrastructure efficiency. For data scientists, ML chip utilization and saturation are also relevant for capacity planning.

This post walks you through the Open Source Observability pattern for AWS Inferentia, which shows you how to monitor the performance of ML chips used in an Amazon Elastic Kubernetes Service (Amazon EKS) cluster with data plane nodes based on Amazon Elastic Compute Cloud (Amazon EC2) instances of type Inf1 and Inf2.

The pattern is part of the AWS CDK Observability Accelerator, a set of opinionated modules to help you set up observability for Amazon EKS clusters. The AWS CDK Observability Accelerator is organized around patterns, which are reusable units for deploying multiple resources. The open source observability set of patterns instruments observability with Amazon Managed Grafana dashboards, an AWS Distro for OpenTelemetry collector to collect metrics, and Amazon Managed Service for Prometheus to store them.

The following diagram illustrates the solution architecture.

This solution deploys an Amazon EKS cluster with a node group that includes Inf1 instances.

The AMI type of the node group is AL2_x86_64_GPU, which uses the Amazon EKS optimized accelerated Amazon Linux AMI. In addition to the standard Amazon EKS-optimized AMI configuration, the accelerated AMI includes the NeuronX runtime.

To access the ML chips from Kubernetes, the pattern deploys the AWS Neuron device plugin.

Metrics are exposed to Amazon Managed Service for Prometheus by the neuron-monitor DaemonSet, which deploys a minimal container with the Neuron tools installed. Specifically, the neuron-monitor DaemonSet runs the neuron-monitor command piped into the neuron-monitor-prometheus.py companion script (both commands are part of the container), which converts the monitor's output into Prometheus-compatible metrics.

Data is visualized in Amazon Managed Grafana by the corresponding dashboard.

The rest of the setup to collect and visualize metrics with Amazon Managed Service for Prometheus and Amazon Managed Grafana is similar to that used in other open source based patterns, which are included in the AWS Observability Accelerator for CDK GitHub repository.

You need the following to complete the steps in this post:

Complete the following steps to set up your environment:

Note that, in the sample output, COA_AMG_ENDPOINT_URL needs to include https://.

The secret will be accessed by the External Secrets add-on and made available as a native Kubernetes secret in the EKS cluster.

The first step to any AWS CDK deployment is bootstrapping the environment. You use the cdk bootstrap command in the AWS CDK CLI to prepare the environment (a combination of AWS account and AWS Region) with resources required by AWS CDK to perform deployments into that environment. AWS CDK bootstrapping is needed for each account and Region combination, so if you already bootstrapped AWS CDK in a Region, you don't need to repeat the bootstrapping process.

Complete the following steps to deploy the solution:

The actual settings for the Grafana dashboard JSON files are expected to be specified in the AWS CDK context. You need to update the context in the cdk.json file, located in the current directory. The location of the dashboard is specified by the fluxRepository.values.GRAFANA_NEURON_DASH_URL parameter, and neuronNodeGroup is used to set the instance type, number of instances, and Amazon Elastic Block Store (Amazon EBS) volume size used for the nodes.

You can replace the Inf1 instance type with Inf2 and change the size as needed. To check availability in your selected Region, you can run an AWS CLI command such as aws ec2 describe-instance-type-offerings, amending the instance-type Values filter as you see fit.

Complete the following steps to validate the solution:


Log in to your Amazon Managed Grafana workspace and navigate to the Dashboards panel. You should see a dashboard named Neuron / Monitor.

To see some interesting metrics on the Grafana dashboard, we apply a sample workload manifest to the cluster.

The workload compiles the torchvision ResNet50 model and runs repetitive inference in a loop to generate telemetry data.
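The manifest itself isn't reproduced in this excerpt, but the container's workload is conceptually close to the following Python sketch. It assumes the Inf1 PyTorch Neuron SDK (torch_neuron); the input shape and the endless loop are illustrative choices, not the exact code from the pattern.

```python
# Hypothetical sketch of the telemetry-generating workload: compile
# torchvision ResNet50 for Inferentia and run inference repeatedly so that
# neuron-monitor has NeuronCore utilization and latency metrics to report.
import torch
import torch_neuron  # Inf1 SDK; Inf2/Trn1 would use torch_neuronx instead
import torchvision

model = torchvision.models.resnet50()  # weights are irrelevant for telemetry
model.eval()

# Trace/compile the model for the NeuronCores using an example input.
example = torch.rand(1, 3, 224, 224)
neuron_model = torch.neuron.trace(model, example_inputs=[example])

# Repetitive inference loop that keeps the ML chips busy.
while True:
    with torch.no_grad():
        _ = neuron_model(example)
```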

To verify the pod was successfully deployed, list the pods in the cluster (for example, with kubectl get pods).

You should see a pod named pytorch-inference-resnet50.

After a few minutes, the Neuron / Monitor dashboard should show the gathered metrics.

Grafana Operator and Flux always work together to synchronize your dashboards with Git. If you delete your dashboards by accident, they will be re-provisioned automatically.

When you're done, you can delete the whole AWS CDK stack with the AWS CDK destroy command.

In this post, we showed you how to introduce observability, with open source tooling, into an EKS cluster featuring a data plane running EC2 Inf1 instances. We started by selecting the Amazon EKS-optimized accelerated AMI for the data plane nodes, which includes the Neuron container runtime, providing access to AWS Inferentia and Trainium Neuron devices. Then, to expose the Neuron cores and devices to Kubernetes, we deployed the Neuron device plugin. The actual collection and mapping of telemetry data into Prometheus-compatible format was achieved via neuron-monitor and neuron-monitor-prometheus.py. Metrics were sourced from Amazon Managed Service for Prometheus and displayed on the Neuron dashboard of Amazon Managed Grafana.

We recommend that you explore additional observability patterns in the AWS Observability Accelerator for CDK GitHub repo. To learn more about Neuron, refer to the AWS Neuron Documentation.

Riccardo Freschi is a Sr. Solutions Architect at AWS, focusing on application modernization. He works closely with partners and customers to help them transform their IT landscapes in their journey to the AWS Cloud by refactoring existing applications and building new ones.

Go here to read the rest:
Open source observability for AWS Inferentia nodes within Amazon EKS clusters | Amazon Web Services - AWS Blog

Brake Noise And Machine Learning (4 of 4) – The BRAKE Report

Article by: Antonio Rubio, Project Engineer, Braking Systems in Applus IDIADA

Review Part One | Review Part Two | Review Part Three

The field of artificial intelligence (AI) has made significant progress in recent years, with applications ranging from natural language processing to computer vision. In recent years, the Applus IDIADA Brakes department has presented several studies on the application of artificial intelligence to the detection of brake noises. In this paper, Applus IDIADA presents the research done in this area, focusing on the development of an AI model for predicting subjective ratings of brake squeal noises based on objective measurements collected through the instrumentation in a typical Brake Noise Durability programme. Subjective ratings are based on human opinions and can be challenging to quantify. Objective measurements, on the other hand, can be objectively quantified and provide a more reliable basis for prediction.

The first part of the article introduced the data processing, whereas the second and third parts focused on the AI model creation and validation, respectively. This fourth part, on the other hand, summarizes the main results and draws the conclusions.

Other drivers' evaluations

Subjective ratings from two different highly skilled drivers were used (different from the reference driver selected for training the model). The noises and noise conditions should therefore be similar, but the drivers' evaluations differ. The dataset per rating used to evaluate the other drivers' evaluations is shown in Table 9.

Using different drivers for validation, we are validating several aspects at the same time.

Ideally, the model's prediction accuracy should be similar to the accuracy obtained when validating the model with the reference driver. Differences between the accuracy of the reference driver's model and the accuracy on the other drivers' datasets can be attributed to differences in subjective criteria between the reference driver and the driver being evaluated.

It can be seen that the dataset contains more subjective ratings with high values than with low values.

Similar to the validation of the model for the reference driver, results for each driver are presented in terms of accuracy. Results can be checked in Table 10 and accuracy per driver/rating in Table 11.

Accuracy is calculated by comparing the model's subjective rating predictions with the drivers' actual ratings; 100% accuracy means the model predicts the same rating as the driver for every subjective rating. In addition, the percentage of ratings not correctly assigned, broken down by a difference of 1, 2, or 3 rating points, is calculated.
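As a rough illustration of this scoring scheme (the function and the sample ratings below are invented for the example, not taken from the study), the accuracy and the off-by-N percentages could be computed as follows:

```python
# Compare model-predicted ratings with the driver's actual ratings and report
# the exact-match accuracy plus the share of predictions off by 1, 2, or 3+.
import numpy as np

def rating_agreement(driver_ratings, model_ratings):
    error = np.abs(np.asarray(model_ratings) - np.asarray(driver_ratings))
    return {
        "accuracy": float(np.mean(error == 0)),   # same rating as the driver
        "off_by_1": float(np.mean(error == 1)),   # 1 rating-point discrepancy
        "off_by_2": float(np.mean(error == 2)),
        "off_by_3_plus": float(np.mean(error >= 3)),
    }

# Invented example: five noise events rated by the driver and by the model.
print(rating_agreement([9, 8, 7, 6, 9], [9, 7, 7, 8, 9]))
```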


Summary results

Regarding the reference driver validation, close to 70% of the predicted ratings are the same as the reference driver's rating. Rating discrepancies between the model and the driver are mainly of 1 rating point, and discrepancies of more than 2 points are minimal. Accuracy for ratings 9, 8, and 7 is around 70%; accuracy for ratings of 6 or lower decreases to 50% or lower.

Regarding the other drivers' evaluations, the accuracy is around 50% for both of them. The same tendency as in the reference driver results can be seen: there is an increase in rating discrepancies, mainly of 1 rating point. The decrease in accuracy can be explained by the difference between these drivers' subjective criteria and those of the reference driver.

Conclusion

The goal of the project is to use a model to replicate the evaluation of brake noise annoyance performed by an expert driver. Data containing noise samples collected during several years of testing at Applus IDIADA, together with the corresponding subjective ratings from a reference driver, was provided for this purpose.

The data analysis revealed a feasible opportunity to clean and preprocess the dataset by removing variables that do not contribute value to the model. Outliers were removed from the dataset. The data was split into three parts: 70% of noise events for training, 20% for testing, and 10% for validation.

Two artificial intelligence models were trained with the dataset: a classification model and a regression model. According to the results of the test phase of training, the models achieve a good knowledge of the dataset. Finally, based on the different trials, the final model combines the classification and regression models: a threshold determines when to rely on the classification model's prediction and when to prioritize the rounded output from the regression model.
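A minimal sketch of this kind of combination, assuming scikit-learn-style classifier and regressor objects already trained on the noise features (the threshold value and names below are illustrative assumptions, not the study's actual settings):

```python
# Combine the two models: trust the classification model when its confidence
# exceeds a threshold, otherwise fall back to the rounded regression output.
import numpy as np

CONFIDENCE_THRESHOLD = 0.6  # assumed value for illustration

def predict_rating(features, classifier, regressor):
    probabilities = classifier.predict_proba([features])[0]
    if probabilities.max() >= CONFIDENCE_THRESHOLD:
        # Classification model is confident enough: use its predicted rating.
        return int(classifier.classes_[np.argmax(probabilities)])
    # Otherwise prioritize the regression model, rounded to a whole rating.
    return int(round(float(regressor.predict([features])[0])))
```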

The model underwent validation by comparing its results with evaluations from the reference driver, using different vehicles in the conditions that were used for training. An accuracy of 68.5% was achieved, with rating discrepancies between the model and the driver mainly of 1 rating point.

In addition, ratings predicted by the reference driver's model were compared with ratings from different drivers. Accuracy decreased in comparison with the reference driver, which can be explained by differences in the other drivers' subjective criteria.

The results of the study were promising: the model achieved a substantial level of accuracy in predicting subjective ratings based on objective measurements, indicating that its predictions were close to the actual subjective ratings. In fact, the training results show that the models learn a characterization of the reference driver's subjective criteria. The main rating discrepancies between the model and the driver are of 1 rating point, which could be explained by some uncertainty in the subjective criteria of the reference driver. This uncertainty could stem from a variety of uncontrolled variables that can result in different subjective ratings for the same noise event. These differences appear mainly for ratings below 6. In addition, the dataset contained fewer ratings of 6 or below than ratings above 6.

In conclusion, the development of an AI model for predicting subjective ratings based on objective measurements is an important step towards understanding the relationship between subjective ratings and objective measurements for brake squeal noise. Predictions from the current artificial intelligence model are based on objective measurements of 20 variables that together characterize the most important features of the noise, such as frequency, amplitude, duration, and corner source. Furthermore, the results of this study demonstrate the potential of AI models to be implemented in the near-to-medium future on autonomous vehicles, providing more accurate subjective ratings based on objective data. Future work in this area could involve expanding the model to include additional variables or incorporating other machine learning techniques to further improve performance.

About Applus IDIADA

With over 25 years' experience and 2,450 engineers specializing in vehicle development, Applus IDIADA is a leading engineering company providing design, testing, engineering, and homologation services to the automotive industry worldwide.

Applus IDIADA is located in California and Michigan, with further presence in 25 other countries, mainly in Europe and Asia.

http://www.applusidiada.com

See the original post:
Brake Noise And Machine Learning (4 of 4) - The BRAKE Report

Artificial Intelligence Tool to Improve Heart Failure Care – UVA Health Newsroom

Heart failure occurs when the heart is unable to pump enough blood. Symptoms can include fatigue, weakness, swollen legs and feet and, ultimately, death.

UVA Health researchers have developed a powerful new risk assessment tool for predicting outcomes in heart failure patients. The researchers have made the tool publicly available for free to clinicians.

The new tool improves on existing risk assessment tools for heart failure by harnessing the power of machine learning (ML) and artificial intelligence (AI) to determine patient-specific risks of developing unfavorable outcomes with heart failure.

"Heart failure is a progressive condition that affects not only quality of life but quantity as well. All heart failure patients are not the same. Each patient is on a spectrum along the continuum of risk of suffering adverse outcomes," said researcher Sula Mazimba, MD, a heart failure expert. "Identifying the degree of risk for each patient promises to help clinicians tailor therapies to improve outcomes."

Heart failure occurs when the heart is unable to pump enough blood for the body's needs. This can lead to fatigue, weakness, swollen legs and feet and, ultimately, death. Heart failure is a progressive condition, so it is extremely important for clinicians to be able to identify patients at risk of adverse outcomes.

Further, heart failure is a growing problem. More than 6 million Americans already have heart failure, and that number is expected to increase to more than 8 million by 2030. The UVA researchers developed their new model, called CARNA, to improve care for these patients. (Finding new ways to improve care for patients across Virginia and beyond is a key component of UVA Health's first-ever 10-year strategic plan.)

The researchers developed their model using anonymized data drawn from thousands of patients enrolled in heart failure clinical trials previously funded by the National Institutes of Health's National Heart, Lung and Blood Institute. Putting the model to the test, they found it outperformed existing predictors for determining how a broad spectrum of patients would fare in areas such as the need for heart surgery or transplant, the risk of rehospitalization and the risk of death.

The researchers attribute the model's success to the use of ML/AI and the inclusion of hemodynamic clinical data, which describe how blood circulates through the heart, lungs and the rest of the body.

"This model presents a breakthrough because it ingests complex sets of data and can make decisions even among missing and conflicting factors," said researcher Josephine Lamp, of the University of Virginia School of Engineering's Department of Computer Science. "It is really exciting because the model intelligently presents and summarizes risk factors, reducing decision burden so clinicians can quickly make treatment decisions."

By using the model, doctors will be better equipped to personalize care to individual patients, helping them live longer, healthier lives, the researchers hope.

"The collaborative research environment at the University of Virginia made this work possible by bringing together experts in heart failure, computer science, data science and statistics," said researcher Kenneth Bilchick, MD, a cardiologist at UVA Health. "Multidisciplinary biomedical research that integrates talented computer scientists like Josephine Lamp with experts in clinical medicine will be critical to helping our patients benefit from AI in the coming years and decades."

The researchers have made their new tool available online for free at https://github.com/jozieLamp/CARNA.

In addition, they have published the results of their evaluation of CARNA in the American Heart Journal. The research team consisted of Lamp, Yuxin Wu, Steven Lamp, Prince Afriyie, Nicholas Ashur, Bilchick, Khadijah Breathett, Younghoon Kwon, Song Li, Nishaki Mehta, Edward Rojas Pena, Lu Feng and Mazimba. The researchers have no financial interest in the work.

The project was based on one of the winning submissions to the National Heart, Lung and Blood Institute's Big Data Analysis Challenge: Creating New Paradigms for Heart Failure Research. The work was supported by the National Science Foundation Graduate Research Fellowship, grant 842490, and NHLBI grants R56HL159216, K01HL142848 and L30HL148881.

To keep up with the latest medical research news from UVA, subscribe to the Making of Medicine blog.

See the original post here:
Artificial Intelligence Tool to Improve Heart Failure Care - UVA Health Newsroom

Machine learning could help reveal undiscovered particles within data from the Large Hadron Collider – Newswise

Newswise Scientists used a neural network, a type of brain-inspired machine learning algorithm, to sift through large volumes of particle collision data.

For over two decades, the ATLAS particle detector has recorded the highest energy particle collisions in the world within the Large Hadron Collider (LHC) located at CERN, the European Organization for Nuclear Research. Beams of protons are accelerated around the LHC at close to the speed of light, and upon their collision at ATLAS, they produce a cascade of new particles, resulting in over a billion particle interactions per second.

Particle physicists are tasked with mining this massive and growing store of collision data for evidence of undiscovered particles. In particular, they're searching for particles not included in the Standard Model of particle physics, our current understanding of the universe's makeup that scientists suspect is incomplete.

As part of the ATLAS collaboration, scientists at the U.S. Department of Energy's (DOE) Argonne National Laboratory and their colleagues recently used a machine learning approach called anomaly detection to analyze large volumes of ATLAS data. The method has never before been applied to data from a collider experiment. It has the potential to improve the efficiency of the collaboration's search for something new. The collaboration involves scientists from 172 research organizations.

The team leveraged a brain-inspired type of machine learning algorithm called a neural network to search the data for abnormal features, or anomalies. The technique breaks from more traditional methods of searching for new physics. It is independent of and therefore unconstrained by the preconceptions of scientists.


Traditionally, ATLAS scientists have relied on theoretical models to help guide their experiment and analysis in the directions most promising for discovery. This often involves performing complex computer simulations to determine how certain aspects of collision data would look according to the Standard Model. Scientists compare these Standard Model predictions to real data from ATLAS. They also compare them to predictions made by new physics models, like those attempting to explain dark matter and other phenomena unaccounted for by the Standard Model.

But so far, no deviations from the Standard Model have been observed in the billions of billions of collisions recorded at ATLAS. And since the discovery of the Higgs boson in 2012, the ATLAS experiment has yet to find any new particles.

"Anomaly detection is a very different way of approaching this search," said Sergei Chekanov, a physicist in Argonne's High Energy Physics division and a lead author on the study. "Rather than looking for very specific deviations, the goal is to find unusual signatures in the data that are completely unexplored and that may look different from what our theories predict."

To perform this type of analysis, the scientists represented each particle interaction in the data as an image that resembles a QR code. Then, the team trained their neural network by exposing it to 1% of the images.

The network consists of around 2 million interconnected nodes, which are analogous to neurons in the brain. Without human guidance or intervention, it identified and remembered correlations between pixels in the images that characterize Standard Model interactions. In other words, it learned to recognize typical events that fit within Standard Model predictions.

After training, the scientists fed the other 99% of the images through the neural network to detect any anomalies. When given an image as input, the neural network is tasked with recreating the image using its understanding of the data as a whole.

"If the neural network encounters something new or unusual, it gets confused and has a hard time reconstructing the image," said Chekanov. "If there is a large difference between the input image and the output it produces, it lets us know that there might be something interesting to explore in that direction."
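The article doesn't spell out the network architecture, but the reconstruction-error idea can be illustrated with a small autoencoder sketch. The image size, layer widths, random stand-in data, and the anomaly threshold below are all assumptions made for the example, not the ATLAS configuration.

```python
# Autoencoder-style anomaly detection: learn to reconstruct "typical" event
# images, then flag events whose reconstruction error is unusually large.
import numpy as np
import tensorflow as tf

IMG = 16 * 16  # each event encoded as a small QR-code-like image, flattened

autoencoder = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(IMG,)),
    tf.keras.layers.Dense(8, activation="relu"),      # compressed representation
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(IMG, activation="sigmoid"),
])
autoencoder.compile(optimizer="adam", loss="mse")

# Train on a small fraction of events (random stand-ins here for real data).
train_images = np.random.rand(1_000, IMG)
autoencoder.fit(train_images, train_images, epochs=5, verbose=0)

# Score the remaining events: a large input-versus-output difference means
# the network struggled to reconstruct the event, i.e. a potential anomaly.
events = np.random.rand(10_000, IMG)
reconstructed = autoencoder.predict(events, verbose=0)
errors = np.mean((events - reconstructed) ** 2, axis=1)
anomalies = events[errors > np.quantile(errors, 0.999)]
print(f"{len(anomalies)} candidate anomalies flagged")
```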

Using computational resources at Argonne's Laboratory Computing Resource Center, the neural network analyzed around 160 million events within LHC Run-2 data collected from 2015 to 2018.

Although the neural network didn't find any glaring signs of new physics in this data set, it did spot one anomaly that the scientists think is worth further study. An exotic particle decay at an energy of around 4.8 teraelectronvolts results in a muon (a type of fundamental particle) and a jet of other particles in a way that does not fit with the neural network's understanding of Standard Model interactions.

"We'll have to do more investigation," said Chekanov. "It is likely a statistical fluctuation, but there's a chance this decay could indicate the existence of an undiscovered particle."

The team plans to apply this technique to data collected during the LHC Run-3 period, which began in 2022. ATLAS scientists will continue to explore the potential of machine learning and anomaly detection as tools for charting unknown territory in particle physics.

The results of the study were published in Physical Review Letters. This work was funded in part by the DOE Office of Science's Office of High Energy Physics and the National Science Foundation.

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation's first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America's scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy's Office of Science.

The U.S. Department of Energy's Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.

Read more:
Machine learning could help reveal undiscovered particles within data from the Large Hadron Collider - Newswise

4 Hats of a Full-Stack Data Scientist | by Shaw Talebi – Towards Data Science

When I first learned data science (5+ years ago), data engineering and ML engineering were not as widespread as they are today. Consequently, the role of a data scientist was often more broadly defined than what we may see these days.

For example, data scientists may have written ETL scripts, set up databases, performed feature engineering, trained ML models, and deployed models into production.

Although it is becoming more common to split these tasks across multiple roles (e.g., data engineers, data scientists, and ML engineers), many situations still call for contributors who are well-versed in all aspects of ML model development. I call these contributors full-stack data scientists.

More specifically, I see a full-stack data scientist as someone who can manage and implement an ML solution end-to-end. This involves formulating business problems, designing ML solutions, sourcing and preparing data for development, training ML models, and deploying models so their value can be realized.

Given the rise of specialized roles for implementing ML projects, this notion of FSDS may seem outdated. At least, that was what I thought in my first corporate data science role.

These days, however, the value of learning the full tech stack is becoming increasingly obvious to me. This all started last year when I interviewed top data science freelancers from Upwork.

Almost everyone I spoke to fit the full-stack data scientist definition given above. This wasn't just out of fun and curiosity but from necessity.

A key takeaway from these interviews was data science skills (alone) are limited in their potential business impact. To generate real-world value (that a client will pay for), building solutions end-to-end is a must.

But this isn't restricted to freelancing; FSDS can be beneficial in a few other contexts as well.

In other words, full-stack data scientists are generalists who can see the big picture and dive into specific aspects of a project as needed. This makes them a valuable resource for any business looking to generate value via AI and machine learning.

While FSDS requires several skills, the role can be broken down into four key hats: Project Manager, Data Engineer, Data Scientist, and ML Engineer.

Of course, no one can be world-class in all hats (probably). But one can certainly be above average across the board (it just takes time).

Here, I'll break down each of these hats based on my experience as a data science consultant and interviews with 27 data/ML professionals.

The key role of a project manager (IMO) is to answer 3 questions: what, why, and how. In other words, what are we building? Why are we building it? How will we do it?

While it might be easy to skip over this work (and start coding), failing to put on the PM hat properly risks spending a lot of time (and money) solving the wrong problem. Or solving the right problem in an unnecessarily complex and expensive way.

The starting point for this is defining the business problem. In most contexts, the full-stack data scientist isn't solving their own problem, so this requires the ability to work with stakeholders to uncover the problem's root causes. I discussed some tips on this in a previous article.

Once the problem is clearly defined, one can identify how AI can solve it. This sets the target from which to work backward to estimate project costs, timelines, and requirements.

In the context of FSDS, data engineering is concerned with making data readily available for model development or inference (or both).

Since this is inherently product-focused, the DE hat may be more limited than a typical data engineering role. More specifically, this likely won't require optimizing data architectures for several business use cases.

Instead, the focus will be on building data pipelines. This involves designing and implementing ETL (or ELT) processes for specific use cases.

ETL stands for extract, transform, and load. It involves extracting data from their raw sources, transforming it into a meaningful form (e.g., data cleaning, deduplication, exception handling, feature engineering), and loading it into a database (e.g., data modeling and database design).
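To make the idea concrete, here is a minimal, hypothetical ETL sketch in Python; the file name, columns, table, and cleaning rules are invented placeholders rather than anything from the article:

```python
# Minimal ETL pipeline: extract raw records, transform them, load into SQLite.
import sqlite3

import pandas as pd

def extract(path: str) -> pd.DataFrame:
    return pd.read_csv(path)                          # extract from the raw source

def transform(df: pd.DataFrame) -> pd.DataFrame:
    df = df.drop_duplicates()                         # deduplication
    df = df.dropna(subset=["customer_id"])            # basic exception handling
    df["signup_month"] = pd.to_datetime(df["signup_date"]).dt.month  # feature engineering
    return df

def load(df: pd.DataFrame, db_path: str) -> None:
    with sqlite3.connect(db_path) as conn:            # simple data model: one table
        df.to_sql("customers", conn, if_exists="replace", index=False)

if __name__ == "__main__":
    load(transform(extract("raw_customers.csv")), "warehouse.db")
```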

Another important area here is data monitoring. While the details of this will depend on the specific use case, the ultimate goal is to give ongoing visibility to data pipelines via alerting systems, dashboards, or the like.

I define a data scientist as someone who uses data to uncover regularities in the world that can be used to drive impact. In practice, this often boils down to training a machine learning model (because computers are much better than humans at finding regularities in data).

For most projects, one must switch between this Hat and Hats 1 and 2. During model development, it is common to encounter insights that require revisiting the data preparation or project scoping.

For example, one might discover that an exception was not properly handled for a particular field or that the extracted fields do not have the predictive power that was assumed at the project's outset.

An essential part of model training is model validation. This consists of defining performance metrics that can be used to evaluate models. Bonus points if this metric can be directly translated into a business performance metric.

With a performance metric, one can programmatically experiment with and evaluate several model configurations by adjusting, for example, train-test splits, hyperparameters, predictor choice, and ML approach. If no model training is required, one may still want to compare the performance of multiple pre-trained models.
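A rough sketch of that kind of programmatic experimentation, using scikit-learn on synthetic data (the candidate models and the F1 metric are arbitrary choices for illustration):

```python
# Evaluate a few model configurations against one agreed-upon metric.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1_000),
    "random_forest_100": RandomForestClassifier(n_estimators=100, random_state=0),
    "random_forest_300": RandomForestClassifier(n_estimators=300, max_depth=10, random_state=0),
}

# F1 stands in for whichever metric best maps to the business outcome.
for name, model in candidates.items():
    score = cross_val_score(model, X, y, cv=5, scoring="f1").mean()
    print(f"{name}: mean f1 = {score:.3f}")
```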

The final hat involves taking the ML model and turning it into an ML solution: integrating the model into business workflows so its value can be realized.

A simple way to do this is to containerize the model and set up an API so external systems can make inference calls. For example, the API could be connected to an internal website that allows business users to run a calculation.
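For instance, a minimal inference API along these lines could be containerized and called by internal systems. The framework (FastAPI), file names, and request schema are assumptions for the sketch, not something prescribed in the article:

```python
# Minimal inference API: load a trained model and expose a /predict endpoint.
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.joblib")  # hypothetical artifact from the previous hat

class Features(BaseModel):
    values: list[float]              # illustrative flat feature vector

@app.post("/predict")
def predict(features: Features) -> dict:
    prediction = model.predict([features.values])[0]
    return {"prediction": float(prediction)}

# Run locally with: uvicorn main:app --reload  (assuming this file is main.py)
```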

Some use cases, however, may not be so simple and require more sophisticated solutions. This is where an orchestration tool can help define complex workflows. For example, if the model requires monthly updates as new data become available, the whole model development process, from ETL to training to deployment, may need to be automated.
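A sketch of what such an automated monthly workflow might look like with an orchestrator such as Apache Airflow; the DAG name, schedule, and empty task bodies are placeholders, and any comparable orchestration tool would work:

```python
# Monthly ETL -> train -> deploy pipeline defined as an Airflow DAG.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def run_etl():       ...  # pull and prepare the newest month of data
def train_model():   ...  # retrain and validate the model
def deploy_model():  ...  # publish the new model behind the inference API

with DAG(
    dag_id="monthly_model_refresh",
    start_date=datetime(2024, 1, 1),
    schedule="@monthly",
    catchup=False,
) as dag:
    etl = PythonOperator(task_id="etl", python_callable=run_etl)
    train = PythonOperator(task_id="train", python_callable=train_model)
    deploy = PythonOperator(task_id="deploy", python_callable=deploy_model)
    etl >> train >> deploy
```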

Another important area of consideration is model monitoring. Like data monitoring, this involves tracking model predictions and performance over time and making them visible through automated alerts or other means.

While many of these processes can run on local machines, deploying these solutions using a cloud platform is common practice. Every ML engineer (MLE) I have interviewed uses at least 1 cloud platform and recommended cloud deployments as a core skill of MLEs.

While a full-stack data scientist may seem like a technical unicorn, the point (IMO) isn't to become a guru of all aspects of the tech stack. Rather, it is to learn enough to be dangerous.

In other words, it's not about mastering everything but being able to learn anything you need to get the job done. From this perspective, I surmise that most data scientists will become full stack given enough time.

Toward this end, I am using 3 principles to accelerate my personal FSDS development.

A full-stack data scientist can manage and implement an ML solution end-to-end. While this may seem like overkill for contexts where specialized roles exist for key stages of model development, this generalist skillset is still valuable in many situations.

As part of my journey toward becoming a full-stack data scientist, future articles of this series will walk through each of the 4 FSDS Hats via the end-to-end implementation of a real-world ML project.

In the spirit of learning, if you feel anything is missing here, I invite you to drop a comment (they are appreciated)

Here is the original post:
4 Hats of a Full-Stack Data Scientist | by Shaw Talebi - Towards Data Science

Sea-surface pCO2 maps for the Bay of Bengal based on advanced machine learning algorithms | Scientific Data – Nature.com


Read this article:
Sea-surface pCO2 maps for the Bay of Bengal based on advanced machine learning algorithms | Scientific Data - Nature.com

Early OpenAI investor bets on alternative to Sam Altman's approach to AI – Semafor

Each major breakthrough in AI has occurred by removing human involvement from part of the process. Before deep learning, machine learning involved humans labeling data meticulously so that algorithms could then understand the task, deciphering patterns and making predictions. But now, deep learning obviates the need for labeling. The software can, in essence, teach itself the task.

But humans have still been needed to build the architecture that told a computer how to learn. Large language models like ChatGPT came from a breakthrough in architecture known as the transformer. It was a major advance that allowed a deep learning method called neural networks to keep improving as they grew to unfathomably large sizes. Before the transformer, neural networks plateaued after reaching a certain size.

That is why Microsoft and others are spending tens of billions on AI infrastructure: It is a bet that bigger will continue to mean better.

The big downside of this kind of neural network, though, is that the transformer is imperfect. It tells the model to predict the next word in a sentence based on how groups of letters relate to one another. But there is nothing inherent in the model about the deeper meaning of those words.

It is this limitation that leads to what we call hallucinations; transformer-based models don't understand the concept of truth.

Morgan and many other AI researchers believe if there is an AI architecture that can learn concepts like truth and reasoning, it will be developed by the AI itself, and not humans. "Now, humans no longer have to describe the architecture," he said. "They just describe the constraints of what they want."

The trick, though, is getting the AI to take on a task that seems to exist beyond the comprehension of the human brain. The answer, he believes, has something to do with a mathematical concept known as category theory.

Increasingly popular in computer science and artificial intelligence, category theory can turn real-world concepts into mathematical formulas, which can be converted into a form of computer code. Symbolica employees, along with researchers from Google DeepMind, published a paper on the subject last month.

The idea is that category theory could be a method to instill constraints in a common language that is precise and understandable to humans and computers. Using category theory, Symbolica hopes its method will lead to AI with guardrails and rules baked in from the beginning. In contrast, foundation models based on transformer architecture require those factors to be added on later.

Morgan said it will be the key to creating AI models that are reliable and don't hallucinate. But like OpenAI, it's aiming big in hopes that its new approach to machine learning will lead to the holy grail: software that knows how to reason.

Symbolica, though, is not a direct competitor to foundation model companies like OpenAI and views its core product as bespoke AI architectures that can be used to build AI models for customers.

That is an entirely new concept in the field. For instance, Google did not view the transformer architecture as a product. In fact, it published the research so that anyone could use it.

Symbolica plans to build customized architectures for customers, which will then use them to train their own AI models. "If they give us their constraints, we can just build them an architecture that meets those constraints and we know it's going to work," Morgan said.

Morgan said the method will lead to interpretability, a buzzword in the AI industry these days that means the ability to understand why models act the way they do. The lack of interpretability is a major shortcoming of large language models, which are so vast that it is extremely challenging to understand how, exactly, they came up with their responses.

The limitation of Symbolica's models, though, is that they will be more narrowly focused on specific tasks compared to generalist models like GPT-4. But Morgan said that's a good thing.

"It doesn't make any sense to train one model that tries to be good at everything when you could train many, tinier models for less money that are way better than GPT-4 could ever be at a specific task," he said.

(Correction: An earlier version of this article incorrectly said that some Symbolica employees had worked at Google DeepMind.)

See more here:
Early OpenAI investor bets on alternative to Sam Altman's approach to AI - Semafor

Reducing Toxic AI Responses – Neuroscience News

Summary: Researchers developed a new machine learning technique to improve red-teaming, a process used to test AI models for safety by identifying prompts that trigger toxic responses. By employing a curiosity-driven exploration method, their approach encourages a red-team model to generate diverse and novel prompts that reveal potential weaknesses in AI systems.

This method has proven more effective than traditional techniques, producing a broader range of toxic responses and enhancing the robustness of AI safety measures. The research, set to be presented at the International Conference on Learning Representations, marks a significant step toward ensuring that AI behaviors align with desired outcomes in real-world applications.


Source: MIT

A user could ask ChatGPT to write a computer program or summarize an article, and the AI chatbot would likely be able to generate useful code or write a cogent synopsis. However, someone could also ask for instructions to build a bomb, and the chatbot might be able to provide those, too.

To prevent this and other safety issues, companies that build large language models typically safeguard them using a process called red-teaming. Teams of human testers write prompts aimed at triggering unsafe or toxic text from the model being tested. These prompts are used to teach the chatbot to avoid such responses.

But this only works effectively if engineers know which toxic prompts to use. If human testers miss some prompts, which is likely given the number of possibilities, a chatbot regarded as safe might still be capable of generating unsafe answers.

Researchers from Improbable AI Lab at MIT and the MIT-IBM Watson AI Lab used machine learning to improve red-teaming. They developed a technique to train a red-team large language model to automatically generate diverse prompts that trigger a wider range of undesirable responses from the chatbot being tested.

They do this by teaching the red-team model to be curious when it writes prompts, and to focus on novel prompts that evoke toxic responses from the target model.

The technique outperformed human testers and other machine-learning approaches by generating more distinct prompts that elicited increasingly toxic responses. Not only does their method significantly improve the coverage of inputs being tested compared to other automated methods, but it can also draw out toxic responses from a chatbot that had safeguards built into it by human experts.

"Right now, every large language model has to undergo a very lengthy period of red-teaming to ensure its safety. That is not going to be sustainable if we want to update these models in rapidly changing environments.

"Our method provides a faster and more effective way to do this quality assurance," says Zhang-Wei Hong, an electrical engineering and computer science (EECS) graduate student in the Improbable AI lab and lead author of a paper on this red-teaming approach.

Hong's co-authors include EECS graduate students Idan Shenfield, Tsun-Hsuan Wang, and Yung-Sung Chuang; Aldo Pareja and Akash Srivastava, research scientists at the MIT-IBM Watson AI Lab; James Glass, senior research scientist and head of the Spoken Language Systems Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL); and senior author Pulkit Agrawal, director of Improbable AI Lab and an assistant professor in CSAIL. The research will be presented at the International Conference on Learning Representations.

Automated red-teaming

Large language models, like those that power AI chatbots, are often trained by showing them enormous amounts of text from billions of public websites. So, not only can they learn to generate toxic words or describe illegal activities, the models could also leak personal information they may have picked up.

The tedious and costly nature of human red-teaming, which is often ineffective at generating a wide enough variety of prompts to fully safeguard a model, has encouraged researchers to automate the process using machine learning.

Such techniques often train a red-team model using reinforcement learning. This trial-and-error process rewards the red-team model for generating prompts that trigger toxic responses from the chatbot being tested.

But due to the way reinforcement learning works, the red-team model will often keep generating a few similar prompts that are highly toxic to maximize its reward.

For their reinforcement learning approach, the MIT researchers utilized a technique called curiosity-driven exploration. The red-team model is incentivized to be curious about the consequences of each prompt it generates, so it will try prompts with different words, sentence patterns, or meanings.

"If the red-team model has already seen a specific prompt, then reproducing it will not generate any curiosity in the red-team model, so it will be pushed to create new prompts," Hong says.

During its training process, the red-team model generates a prompt and interacts with the chatbot. The chatbot responds, and a safety classifier rates the toxicity of its response, rewarding the red-team model based on that rating.

Rewarding curiosity

The red-team model's objective is to maximize its reward by eliciting an even more toxic response with a novel prompt. The researchers enable curiosity in the red-team model by modifying the reward signal in the reinforcement learning setup.

First, in addition to maximizing toxicity, they include an entropy bonus that encourages the red-team model to be more random as it explores different prompts. Second, to make the agent curious, they include two novelty rewards.

One rewards the model based on the similarity of words in its prompts, and the other rewards the model based on semantic similarity. (Less similarity yields a higher reward.)

To prevent the red-team model from generating random, nonsensical text, which can trick the classifier into awarding a high toxicity score, the researchers also added a naturalistic language bonus to the training objective.
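Putting these pieces together, a highly simplified sketch of such a combined reward might look like the following. The weights, the word-level similarity measure, and the placeholder scoring functions are illustrative assumptions, not the paper's actual formulation.

```python
# Combined red-teaming reward: toxicity of the chatbot's response, plus an
# entropy bonus, two novelty bonuses (lexical and semantic), and a
# naturalness bonus that discourages nonsensical prompts.
from difflib import SequenceMatcher

def word_similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.split(), b.split()).ratio()

def novelty(prompt: str, history: list[str], similarity) -> float:
    """Higher reward for prompts unlike anything generated so far."""
    if not history:
        return 1.0
    return 1.0 - max(similarity(prompt, past) for past in history)

def red_team_reward(prompt, response, history, toxicity_score,
                    semantic_similarity, entropy_bonus, naturalness_score,
                    w_tox=1.0, w_ent=0.1, w_lex=0.5, w_sem=0.5, w_nat=0.5):
    return (
        w_tox * toxicity_score(response)                         # safety classifier's rating
        + w_ent * entropy_bonus                                  # encourages random exploration
        + w_lex * novelty(prompt, history, word_similarity)      # word-level novelty
        + w_sem * novelty(prompt, history, semantic_similarity)  # semantic novelty
        + w_nat * naturalness_score(prompt)                      # naturalistic language bonus
    )
```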

With these additions in place, the researchers compared the toxicity and diversity of responses their red-team model generated with other automated techniques. Their model outperformed the baselines on both metrics.

They also used their red-team model to test a chatbot that had been fine-tuned with human feedback so it would not give toxic replies. Their curiosity-driven approach was able to quickly produce 196 prompts that elicited toxic responses from this safe chatbot.

"We are seeing a surge of models, which is only expected to rise. Imagine thousands of models or even more and companies/labs pushing model updates frequently. These models are going to be an integral part of our lives and it's important that they are verified before released for public consumption.

"Manual verification of models is simply not scalable, and our work is an attempt to reduce the human effort to ensure a safer and trustworthy AI future," says Agrawal.

In the future, the researchers want to enable the red-team model to generate prompts about a wider variety of topics. They also want to explore the use of a large language model as the toxicity classifier. In this way, a user could train the toxicity classifier using a company policy document, for instance, so a red-team model could test a chatbot for company policy violations.

"If you are releasing a new AI model and are concerned about whether it will behave as expected, consider using curiosity-driven red-teaming," says Agrawal.

Funding: This research is funded, in part, by Hyundai Motor Company, Quanta Computer Inc., the MIT-IBM Watson AI Lab, an Amazon Web Services MLRA research grant, the U.S. Army Research Office, the U.S. Defense Advanced Research Projects Agency Machine Common Sense Program, the U.S. Office of Naval Research, the U.S. Air Force Research Laboratory, and the U.S. Air Force Artificial Intelligence Accelerator.

Author: Adam Zewe
Source: MIT
Contact: Adam Zewe – MIT
Image: The image is credited to Neuroscience News

Original Research: The findings will be presented at the International Conference on Learning Representations

More here:
Reducing Toxic AI Responses - Neuroscience News

A Vision of the Future: Machine Learning in Packaging Inspection – Packaging Digest

As we navigate through the corridors of modern manufacturing, the influence of machine vision and machine learning on the packaging industry stands as a testament to technological evolution. This integration, though largely beneficial, introduces a spectrum of complexities, weaving a narrative that merits a closer examination.

In unpacking the layers of this technological marvel, we should not only tout its enhancements but also recognize its challenges and ethical considerations.

Machine vision, equipped with the power of machine learning algorithms, has ushered in a new era for packaging. This synergy has transcended traditional boundaries, offering precision, efficiency, and adaptability previously unattainable. With the ability to analyze visual data and learn from it, these systems have revolutionized quality control, ensuring that products meet the high standards consumers have come to expect.

The benefits are manifold. Machine vision systems, with their tireless eyes, can inspect products at speeds and accuracies far beyond human capabilities. They detect even the minutest defects, from misaligned labels to imperfect seals, ensuring that only flawless products reach the market. This not only enhances brand reputation but also significantly reduces waste, contributing to more sustainable manufacturing practices.

Moreover, machine learning algorithms enable these systems to improve over time. They learn from every product inspected, becoming more adept at identifying defects and adapting to new packaging designs without the need for extensive reprogramming. This adaptability is crucial in an era where product cycles are rapid and consumer demands are ever-evolving.
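As a rough illustration of that kind of incremental learning, the sketch below uses scikit-learn's SGDClassifier to update a simple defect classifier after every inspected item. The feature extraction, labels, and two-class scheme are hypothetical simplifications; production inspection systems are considerably more elaborate.

```python
# Hedged sketch of an inspection model that learns from each inspected item.
# Features and labels are placeholders, not a specific vendor's pipeline.
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import SGDClassifier

classifier = SGDClassifier(loss="log_loss")
CLASSES = np.array([0, 1])  # 0 = acceptable package, 1 = defect (e.g., misaligned label)

def inspect_and_learn(image_features, operator_label):
    """Predict pass/fail for one package, then update the model online from
    the confirmed label so accuracy improves with every item inspected."""
    x = np.asarray(image_features).reshape(1, -1)
    try:
        prediction = int(classifier.predict(x)[0])
    except NotFittedError:
        prediction = 0  # no model yet; treat as acceptable until first update
    # Incremental update: the system adapts without full reprogramming.
    classifier.partial_fit(x, [operator_label], classes=CLASSES)
    return prediction
```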

One of the most significant impacts of machine vision and learning in packaging is the leap in operational efficiency it enables. Automated inspection lines reduce downtime, allowing for continuous production that keeps pace with demand.

Furthermore, the integration of these technologies facilitates personalized packaging at scale. Machine vision systems can adjust to package products according to individual specifications, catering to the growing market for personalized goods, from custom-labeled beverages to bespoke cosmetic kits.

Yet, as with any technological advancement, the integration of machine vision and machine learning in packaging is not without its challenges.

The complexity of these systems necessitates a high level of expertise, posing a significant hurdle for smaller manufacturers. The initial investment in sophisticated equipment and the ongoing need for skilled personnel to manage and interpret data can widen the technological divide, potentially pushing smaller players out of the competition.

Data privacy and security emerge as paramount concerns. Machine learning algorithms thrive on data, raising questions about the ownership and protection of the data collected during the packaging process. As these systems become more integrated into manufacturing operations, ensuring the security of sensitive information against breaches becomes a critical issue that manufacturers must address.

Moreover, the reliance on machine vision and learning systems introduces the risk of over-automation. While these technologies can enhance efficiency, there is a fine line between leveraging them to support human workers and replacing them altogether. The potential for job displacement raises ethical questions about the responsibility of manufacturers to their workforce and the broader societal implications of widespread automation.

The path forward requires a careful balancing act. Manufacturers must embrace the benefits of machine vision and learning while remaining cognizant of the potential pitfalls.

Investing in training and development programs can help mitigate the risk of job displacement, ensuring that workers are equipped with the skills needed to thrive in a technologically advanced workplace.

Transparency in data collection and processing, coupled with robust cybersecurity measures, can address privacy concerns, building trust among consumers and stakeholders. Moreover, manufacturers can adopt a phased approach to the integration of these technologies, allowing for gradual adaptation and minimizing disruption.

The impact of machine vision and machine learning on the packaging industry is undeniable, offering unparalleled enhancements in quality control, efficiency, and customization. Yet, as we chart this course of technological integration, we must navigate the complexities it introduces with foresight and responsibility.

By addressing the challenges head-on and adhering to ethical standards, the packaging industry can harness the full potential of these advancements, propelling itself towards a future that is not only more efficient and adaptable but also equitable and secure.

In this journey, the clear sight of progress must be guided by the wisdom to recognize its potential shadows, ensuring that the path we tread is illuminated by both innovation and integrity.

View post:
A Vision of the Future: Machine Learning in Packaging Inspection - Packaging Digest

From data to decision-making: the role of machine learning and digital twins in Alzheimer's Disease – UCI MIND

For patients experiencing cognitive decline due to Alzheimer's Disease (AD), choosing the most appropriate treatment course at the right time is of great importance. A key element of these decisions is the careful consideration of the available scientific evidence, particularly from randomized clinical trials (RCTs) such as the recent lecanemab trial. Translating RCT results into patient-level decisions, however, can be challenging, because trial results tell us about the outcomes of groups rather than individuals. A doctor must judge how similar their patient is to the groups studied in trials. For AD, where patients vary widely in clinical presentation and rate of cognitive decline, this can be a difficult task.

As a step towards more personalized decision-making, prescribing physicians may focus on specific patient characteristics that could affect the disease course and response to treatment, such as demographics (e.g., sex, age, education) or genetic factors. In fact, subgroup analyses from some RCTs suggest that at least some drugs could differ in safety or efficacy based on these factors. Nevertheless, these types of results have two main limitations: the subgroups are often small, which increases the risk of spurious findings, and they do not consider the overall impact of many different factors simultaneously. This is where machine learning (ML) may close the gap between data and decision-making.

ML uses patterns found in large datasets to predict health outcomes and treatment response by considering many patient characteristics at once and, further, how they may interact. This underlying model can subsequently be used to form a "digital twin" for a patient, or the best possible copy of their characteristics and health status. We can use this twin to ask "what if" questions. For example, "If we prescribed this patient this drug at this time, what would be their most likely outcome six months from now?" Under the hood, an ML algorithm would utilize previously collected data, such as from RCTs, to locate potential twins and use their outcomes to formulate a response. This could give us a more pinpointed prediction of patient outcomes compared to subgroup analyses. Ideally, this targeted view would help facilitate better care for AD patients.
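One simple way to picture that matching step is a nearest-neighbour lookup over previously collected trial data, as in the hedged sketch below. The function name, its inputs, and the choice of k are illustrative assumptions, not the specific models used in AD research.

```python
# Hedged sketch of the "digital twin" idea: match a patient to similar,
# previously studied trial participants and aggregate their outcomes.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def predict_counterfactual(patient_features, trial_features, trial_treated,
                           trial_outcomes, treated, k=25):
    """Estimate a patient's likely outcome under a treatment choice by
    averaging the outcomes of their k most similar trial participants
    ("twins") who actually received that treatment."""
    mask = np.asarray(trial_treated) == treated
    candidates = np.asarray(trial_features)[mask]
    outcomes = np.asarray(trial_outcomes)[mask]

    # Locate potential twins among participants with the same treatment status.
    nn = NearestNeighbors(n_neighbors=min(k, len(candidates))).fit(candidates)
    _, idx = nn.kneighbors(np.asarray(patient_features).reshape(1, -1))

    # The twins' observed outcomes approximate the answer to the "what if" question.
    return float(outcomes[idx[0]].mean())
```

In practice, more sophisticated models would replace the plain nearest-neighbour search, but the underlying idea of borrowing outcomes from comparable individuals is the same.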

Roy S. Zawadzki

The stage is set for digital twins to play a bigger role in clinical research and practice in AD: we have the methodology, the data, and, most importantly, a large unmet clinical need for new and more effective treatments. Digital twins can be integrated in a wide variety of contexts that can potentially save clinical trial costs, quicken the time until approval, and better utilize the treatments we already have for the patients that need them the most. For these reasons, biotech companies, academic researchers, and healthcare systems alike should be investigating how digital twins can help assist their particular goals.

To learn more about real-world opportunities and considerations surrounding digital twins, please check out my latest post on my Substack

Roy S. Zawadzki, graduate trainee with Professor Daniel Gillen and supported by the TITAN T32 training grant

Read this article:
From data to decision-making: the role of machine learning and digital twins in Alzheimer's Disease - UCI MIND