Category Archives: Machine Learning

High-growth supply chain businesses adopting AI and Machine Learning at faster pace than competitors, Epicor study … – Intelligent CIO

According to the 2024 Agility Index research study from Epicor and Nucleus Research, nearly half of surveyed companies across the make, move and sell industries cited concern over escalating costs as the foremost challenge confronting supply chains, with more than half using Artificial Intelligence, automation or Machine Learning for at least one supply chain management application to address it.

Notably, a higher percentage of businesses (63%) that identify as high-growth, defined by revenue growth of 20% or more over the past three years, have already integrated generative AI into their respective supply chain operations to manage cost and operational challenges.

Nucleus Research surveyed more than 1,700 supply chain management leaders worldwide to understand how they are leveraging powerful technologies like artificial intelligence and machine learning to thrive while navigating challenges like supply chain disruptions, escalating costs and skilled labor gaps. The study also uncovered anticipated future investments in these technologies.

"When workers are empowered to spend more time innovating, which is what humans do best, that's where the real value creation happens. That is agility," said Vaibhav Vohra, Chief Product and Technology Officer at Epicor. "Our 2024 Agility Index underscores the growing adoption of AI and other automation technologies as an essential factor in enabling supply chain businesses to better thrive and compete. These cognitive capabilities are coming together to empower workers and their businesses to more readily adapt to shifting market conditions and better serve their customers."

Survey respondents indicated they are integrating generative AI into digital supply chain operations across various functions such as product descriptions, customer service chatbots, natural language querying, reporting and in-application assistance. Specifically, the adoption of generative AI in customer service chatbots, noted by 72% of organisations, is highlighted as the most prevalent use case. This widespread implementation is attributed to the technology's ability to streamline customer interactions across various sectors.

Similarly, 67% of organisations currently employ generative AI for crafting product descriptions, leveraging the technology's capacity to analyse customer sentiment and forecast market demand. This enables a more informed approach to product design and feature development.

Businesses are also implementing machine learning most frequently in inventory optimisation (45%) and demand forecasting (40%), underlining the critical role of these technologies in managing inventory levels and accurately predicting future demand.

According to survey respondents, the greatest hope for the impact of automation technologies lies in increased efficiency and productivity (32%), cost savings (26%), and improved supply chain automation (23%). This reflects a strong belief in the potential of these technologies to drive significant improvements in supply chain management.

Read the original here:
High-growth supply chain businesses adopting AI and Machine Learning at faster pace than competitors, Epicor study ... - Intelligent CIO

The Interesting Applications of AI in Nutrition – AutoGPT

It's soothing to know that AI is making significant strides in the field of nutrition. According to research by MarketsandMarkets, the market for AI in healthcare, including nutrition, will reach $148.4 billion by 2029. Now that's a staggering figure!

The WHO has never stopped emphasizing that dietary factors are a leading cause of death and disability globally. Yet, maintaining a healthy diet in today's fast-paced world can be tough. With so many options and varying opinions, how do you know what's best for you?

In this article, I'll let you in on the incredible ways AI is transforming personalized nutrition, making healthy eating easier and more effective than ever before.

Imagine having a personal nutritionist available 24/7. That's the promise of diet AI. The rise of AI in personalized nutrition is transforming the way we approach our diets, offering tailored recommendations based on individual needs and preferences.

Traditional dietary guidelines often follow a one-size-fits-all approach, which might not be effective for everyone. AI technology, however, enables a more customized approach to nutrition, considering everything from DNA to daily habits to recommend the best foods for individuals.

These intelligent systems can provide real-time advice and adjustments to your diet, ensuring you stay on track with your health goals.

The answer is simple: machine learning dietary analysis!

Machine learning (ML) is a subset of AI that enables systems to learn from data, identify patterns, and make decisions with minimal human intervention. In nutrition, machine learning is proving to be a powerful tool for dietary analysis.

By processing vast amounts of dietary data, machine learning algorithms can provide insights that help individuals make healthier food choices tailored to their specific needs.

Let's talk a bit about how machine learning is revolutionizing dietary analysis, shall we?

Machine learning algorithms analyze an individual's dietary habits, health data, and lifestyle choices to create personalized diet plans. These plans are continuously refined as more data is collected, ensuring that the dietary recommendations remain relevant and effective.

AI can analyze your genetic data to determine how your body responds to different nutrients. By examining specific genetic markers, AI systems can predict your susceptibility to certain conditions like diabetes or heart disease, and recommend dietary changes to mitigate these risks.

For example, if your genetic profile indicates a higher risk for high cholesterol, AI can suggest a diet lower in saturated fats and higher in fiber. Companies like 23andMe and AncestryDNA already provide genetic data that AI can analyze to determine your nutritional needs.

Machine learning models can predict potential health outcomes based on an individual's diet. By analyzing historical dietary data and health records, these models identify patterns that correlate specific eating habits with health risks or benefits. This predictive capability enables proactive dietary adjustments to prevent or manage health conditions.

For individuals with chronic conditions like diabetes, AI can continuously monitor health data and provide real-time dietary suggestions to maintain optimal health. What would that look like? AI will typically analyze health data and dietary patterns, and machine learning models will suggest foods that help manage these conditions effectively.

For diabetes, for example, AI will analyze blood sugar levels and recommend meals that help stabilize glucose levels, improving overall well-being.

AI systems can also take into account an individual's lifestyle and dietary preferences.

AI considers how active you are, adjusting calorie and nutrient intake accordingly. Whether you're vegan, gluten-free, or have specific food allergies, AI can curate meal plans that align with your dietary choices while ensuring nutritional adequacy.

This personalized approach helps individuals adhere to their dietary goals without feeling deprived or restricted.

AI integrates various data points, including genetic information, health records, dietary habits, and lifestyle choices, to create a comprehensive nutritional profile.

Machine learning algorithms then analyze this data to identify patterns and correlations that human nutritionists might overlook. This holistic view enables more accurate and personalized dietary recommendations.

Tracking nutrient intake manually can be tedious and prone to error. Machine learning algorithms simplify this process by accurately identifying and logging the nutritional content of meals based on user input or even photos of food.

This automated tracking helps individuals ensure they meet their nutritional goals.

Machine learning can also analyze behavioral data to understand how different factors, such as stress or sleep patterns, influence dietary habits. This comprehensive analysis helps in creating more effective and holistic dietary plans that consider the user's overall lifestyle.

Nutrigenomix is a leading AI-driven platform that uses genetic testing to provide personalized dietary recommendations. By analyzing an individual's genetic makeup, Nutrigenomix offers insights into how different nutrients affect the body.

DayTwo focuses on personalized nutrition through gut microbiome analysis. It predicts blood sugar responses to various foods, helping users manage conditions like diabetes and maintain overall health.

DNAfit offers genetic testing to create personalized diet and fitness plans. It helps users understand their genetic predispositions and tailor their diet accordingly.

Nutrino uses AI to provide personalized nutrition insights and meal recommendations. It integrates data from various sources, including wearables, to create a holistic view of an individual's dietary needs.

Habit offers personalized nutrition plans based on genetic, blood, and lifestyle data. It provides a comprehensive approach to individualized diet planning.

InsideTracker combines DNA testing with blood analysis to create personalized diet and lifestyle plans. It focuses on optimizing health and performance through tailored recommendations.

GenoPalate provides personalized nutrition recommendations based on DNA analysis. It focuses on helping users make better food choices aligned with their genetic makeup.

Baze uses blood testing to determine nutrient deficiencies and offers personalized supplement and diet recommendations. It aims to optimize nutrition based on individual needs.

myDNA offers personalized health and wellness plans based on genetic insights. It provides users with DNA-based recommendations for diet, fitness, and overall wellness.

Nutripal leverages AI to offer personalized nutrition advice based on user data, preferences, and goals. It helps users achieve their health and wellness objectives through tailored diet plans.

The future of AI in nutrition looks promising. As machine learning technology continues to advance, its applications in dietary analysis will become even more sophisticated. We can expect even more accurate and personalized dietary recommendations.

Future developments may include:

AI will continue to revolutionize how we approach nutrition, making it easier to stay healthy and fit.

AI in nutrition is transforming how we approach our diets. By leveraging genetic data, health conditions, and lifestyle preferences, AI creates personalized diet plans that are more effective than generic advice.

With the help of diet AI and machine learning, maintaining a healthy diet has never been easier or more tailored to your unique needs.

AI is used in nutrition to create personalized diet plans, analyze dietary patterns, predict health outcomes, and optimize nutrient intake based on individual data like genetics, health conditions, and lifestyle.

An example of AI in food is the use of AI-driven apps like Nutrigenomix, which analyzes genetic data to provide personalized nutrition recommendations and meal plans.

AI can help in food by offering personalized dietary advice, improving food safety through advanced detection methods, optimizing supply chains, and reducing food waste by predicting demand and managing inventory.

While AI can provide valuable insights and personalized recommendations, it cannot fully replace nutritionists. Human nutritionists offer personalized care, empathy, and expertise in interpreting data within a broader health context.

Continued here:
The Interesting Applications of AI in Nutrition - AutoGPT

THINKFREE Launches Beta Version of Global Corporate AI Search Service Refinder AI – AiThority

THINKFREE, a subsidiary of the world-class AI technology company HANCOM, has launched the beta version of Refinder AI, an AI search and Q&A solution targeting the global corporate market.

Refinder AI is an AI service that enables integrated searches of massive data scattered across numerous business platforms used by an enterprise, regardless of the data sources and relations. Linking all productivity and collaboration platforms such as Gmail, Google Drive, Confluence, Jira, Slack and Notion, it provides an all-in-one service for finding web content, office documents, PDF files, emails, messages, etc., saved in the platforms. It is characterized by fast and accurate answers provided on the basis of verified data within the boundaries of the respective enterprise.

Notably, in addition to simple data searches, Refinder AI also plays the role of an assistant. When a user enters a query or a search word, the AI understands the meaning of the query and the user's intention, and provides an answer in natural language by combining information with the highest accuracy and relevance from the data scattered across the enterprise. It provides results by accessing all platforms through a single search so the user doesn't have to search every platform or memorize where information is saved.

As the service handles corporate data saved in various platforms, the level of security has been reinforced. Refinder AI is designed to not search critical data for unauthorized users. The AI provides answers by referring only to data that have been authorized by the company that introduced the solution. In particular, unlike other corporate search solutions, it does not require a separate development process, and the various applications used by an enterprise can be conveniently loaded.

"While the amount of data generated and retained each year by enterprises across the world is increasing at an exponential rate, the rate of effective data use in business is very low," said THINKFREE CEO Kim Du-yeong. "THINKFREE will target the global cloud market with Refinder AI, and grow into a company that draws the world's attention by combining HANCOM's world-class document technology with advanced AI technologies."

Read this article:
THINKFREE Launches Beta Version of Global Corporate AI Search Service Refinder AI - AiThority

Artificial Intelligence (AI) in Computer Vision Market Booming with USD 148.8 billion by 2031 Fueled by AI-Driven … – PR Newswire

WESTFORD, Mass., July 8, 2024 /PRNewswire/ -- According to SkyQuest, the global Artificial Intelligence (AI) in Computer Vision Market size was valued at USD 20.7 billion in 2022 and is poised to grow from USD 25.8 billion in 2023 to USD 148.8 billion by 2031, growing at a CAGR of 24.5% during the forecast period (2024-2031).
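As a quick sanity check on the figures above, the growth rate can be recomputed from the 2023 and 2031 market values. This is a minimal sketch: the function name is illustrative, and counting eight compounding years (2023 to 2031) is an assumption about how the release defines its forecast span.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.245 means 24.5%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes quoted in the release, in USD billions.
rate = cagr(start_value=25.8, end_value=148.8, years=2031 - 2023)
print(f"Implied CAGR: {rate:.1%}")
```

Under that assumption the implied rate agrees with the quoted 24.5% to within rounding.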

Artificial intelligence (AI) in computer vision is growing rapidly due to high demand across industries. AI has become central to predictive maintenance, using CCTV and deep learning algorithms to accurately detect faults in many systems, which highlights the importance of the technology to industry. With the introduction of image sensors, smart cameras and deep learning algorithms, computer vision systems are on the rise, and their application across technologies is driving market growth and innovation.

Download a detailed overview:

https://www.skyquestt.com/sample-request/ai-in-computer-vision-market

Global Artificial Intelligence (AI) in Computer Vision Market Overview:

Report Coverage: Details
Market Revenue in 2023: USD 25.8 billion
Estimated Value by 2031: USD 148.8 billion
Growth Rate: Poised to grow at a CAGR of 24.5%
Forecast Period: 2024-2031
Forecast Units: Value (USD Billion)
Report Coverage: Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
Segments Covered: Application, Component, Function, Machine Learning Models and End Use Industry
Geographies Covered: North America, Europe, Asia Pacific, Middle East & Africa, Latin America
Report Highlights: Updated financial information / product portfolio of players
Key Market Opportunities: Inception of Computer Vision Technologies Need to Inspire the AI
Key Market Drivers: Rise in Demand for Automation

Segments covered in Artificial Intelligence (AI) In Computer Vision Market are as follows:

Request Free Customization of this report:

https://www.skyquestt.com/speak-with-analyst/ai-in-computer-vision-market

End User Innovation: Harnessing AI in Computer Vision in Healthcare Segment

The healthcare industry is a major player in the global computer vision market, as it offers many use cases with a significant impact on care and disease diagnosis. Today, computer vision technology powered by artificial intelligence (AI) is used in medical imaging, diagnosis, surgical planning and patient monitoring. Computer vision systems can examine medical images such as X-rays, MRI and CT scans for abnormalities, supporting early detection and giving healthcare professionals the insights they need. Adoption is driven by the desire to improve the accuracy, efficiency and cost of medical imaging and diagnostic tests.

On the other hand, the fastest growing area of AI in the computer vision industry is the automotive field. With the integration of AI into cars, the automotive industry is being transformed with the help of computer vision technology. Computer vision systems underpin ADAS and automation features. These systems can analyze real-time visual data from cameras and sensors to detect objects, identify pedestrians, understand traffic signs, and act as navigational guides. The automotive world is allocating significant funding to AI-powered computer vision systems to improve safety, driver assistance, and even autonomous driving.

View report summary and Table of Contents (TOC):

https://www.skyquestt.com/report/ai-in-computer-vision-market

Software Segment: Powering the AI Revolution

The software segment emerged as the largest segment in the market. Advances in deep learning algorithms and neural networks have dramatically increased the accuracy and efficiency of computer vision algorithms, reinforcing the dominance of this segment, which spans deep learning algorithms, image recognition software, video analysis tools and other AI-based algorithms. The software component represents the basis for training AI models for object recognition, image segmentation, and facial recognition, as well as most other tasks associated with computer vision.

On the other hand, the hardware segment, growing at a CAGR of 19.5% as of 2023, has emerged as the fastest growing segment in the artificial intelligence (AI) in computer vision market. As artificial intelligence is increasingly being used in industry, these solutions are needed in a variety of applications to maximize efficiency.

Envisioning Tomorrow: The Future of Artificial Intelligence (AI) in Computer Vision Market

Artificial intelligence (AI) has revolutionized computer vision, unlocking unprecedented capabilities in image and video analysis. The market is poised for tremendous growth driven by advances in deep learning, neural networks and computing power. These technologies have enabled deployments ranging from autonomous vehicles to medical imaging and industrial devices.

Looking ahead, AI is set to spread further into the market as AI systems become more sophisticated and accessible. Emerging trends such as edge AI, interpretable AI, and integration of AI into Internet of Things (IoT) devices will shape the future.

Related Report:

AI Market

Artificial Intelligence of Things (AIoT) Market

Edge Artificial Intelligence (AI) Market

Mobile Artificial Intelligence (AI) Market

Artificial Intelligence (AI) Hardware Market

About Us:

SkyQuest is an IP focused Research and Investment Bank and Accelerator of Technology and assets. We provide access to technologies, markets and finance across sectors viz. Life Sciences, CleanTech, AgriTech, NanoTech and Information & Communication Technology.

We work closely with innovators, inventors, innovation seekers, entrepreneurs, companies and investors alike in leveraging external sources of R&D. Moreover, we help them in optimizing the economic potential of their intellectual assets. Our experience with innovation management and commercialization has expanded our reach across North America, Europe, ASEAN and Asia Pacific.

Contact: Mr. Jagraj Singh, SkyQuest Technology, 1 Apache Way, Westford, Massachusetts 01886, USA. Phone: (+1) 351-333-4748. Website: https://www.skyquestt.com/

Logo : https://mma.prnewswire.com/media/2446095/SkyQuest_Logo.jpg

SOURCE SkyQuest Technology

Continued here:
Artificial Intelligence (AI) in Computer Vision Market Booming with USD 148.8 billion by 2031 Fueled by AI-Driven ... - PR Newswire

China Steams Ahead of the US in the AI Patent Race – Technology Magazine

China has surged ahead in the generative AI (Gen AI) landscape, filing more than 38,000 Gen AI patents between 2014 and 2023.

This places it ahead of other nations focused on digital transformation, such as the US (6,276), South Korea (4,155), Japan (3,409) and India (1,350), according to a recent report by Reuters.

The sharp rise in patenting activity reflects the recent technological advances and the potential within Gen AI, the UN report explains.

In recent years, China has been committed to innovating the next generation of Gen AI, powering ahead with powerful innovations across several key industries, including supply chain, manufacturing and electric vehicles (EV). Now, it aims to continue surging ahead with its innovation, further committing itself to a transformative AI market.

Gen AI, which generates text, images and computer code, has seen a rapid rise in popularity across multiple business sectors in recent months. As a result, organisations worldwide are eager to develop their own Gen AI technology, inspired by models like OpenAI's ChatGPT and Anthropic's Claude.

According to the World Intellectual Property Organization (WIPO), more than 50,000 Gen AI patent applications have been filed in the past decade, with a quarter of these in 2023 alone. Chatbots in particular, a prominent application of Gen AI, are transforming sectors by enhancing workplace productivity and customer service.

As Gen AI technology evolves, businesses globally are developing policies and strategies to harness its potential responsibly. If used responsibly, the technology could revolutionise company operations and be integrated effectively within global workforces.

Top patent applicants currently in China include TikTok owner ByteDance, Tencent, Ping An Insurance Group and Baidu.

China's AI market continues to thrive despite US-imposed restrictions on AI chips. The country supports its businesses and prioritises AI investments, aiming to be a global AI leader by 2030. Forecasts suggest that China's AI market will nearly triple, from US$23.196bn in 2021 to approximately US$61.85bn by 2025.

Significantly, the country has successfully managed to keep its AI momentum going, having introduced world-first regulations into AI technology in areas such as deepfakes and generative models. Likewise, the country's interim measures on Gen AI in 2023 stated that AI technology must adhere to the core socialist values of China and should not endanger national security.

Initiatives like Made in China 2025 and the Next Generation Artificial Intelligence Development Plan (2017) illustrate China's commitment to AI sector growth, enterprise innovation and AI governance.

Quoted in our sister publication AI Magazine on the subject, Robin Li, co-founder, chairman and CEO of Baidu, said: "We believe that artificial intelligence will revolutionise every industry we know today. The immense long-term value of AI and its transformative impact on all aspects of life are only in their infancy."

Original post:
China Steams Ahead of the US in the AI Patent Race - Technology Magazine

Medical content creation in the age of generative AI | Amazon Web Services – AWS Blog

Generative AI and transformer-based large language models (LLMs) have been in the top headlines recently. These models demonstrate impressive performance in question answering, text summarization, code, and text generation. Today, LLMs are being used in real settings by companies, including the heavily-regulated healthcare and life sciences industry (HCLS). The use cases can range from medical information extraction and clinical notes summarization to marketing content generation and medical-legal review automation (MLR process). In this post, we explore how LLMs can be used to design marketing content for disease awareness.

Marketing content is a key component in the communication strategy of HCLS companies. It's also a highly non-trivial balancing exercise, because the technical content should be as accurate and precise as possible, yet engaging and empowering for the target audience. The main goal of the marketing content is to raise awareness about certain health conditions and disseminate knowledge of possible therapies among patients and healthcare providers. By accessing up-to-date and accurate information, healthcare providers can adapt their patients' treatment in a more informed and knowledgeable way. However, because medical content is highly sensitive, the generation process can be relatively slow (from days to weeks), and may go through numerous peer-review cycles, with thorough regulatory compliance and evaluation protocols.

Could LLMs, with their advanced text generation capabilities, help streamline this process by assisting brand managers and medical experts in their generation and review process?

To answer this question, the AWS Generative AI Innovation Center recently developed an AI assistant for medical content generation. The system is built upon Amazon Bedrock and leverages LLM capabilities to generate curated medical content for disease awareness. With this AI assistant, we can effectively reduce the overall generation time from weeks to hours, while giving the subject matter experts (SMEs) more control over the generation process. This is accomplished through an automated revision functionality, which allows the user to interact and send instructions and comments directly to the LLM via an interactive feedback loop. This is especially important since the revision of content is usually the main bottleneck in the process.

Since every piece of medical information can profoundly impact the well-being of patients, medical content generation comes with additional requirements and hinges upon the content's accuracy and precision. For this reason, our system has been augmented with additional guardrails for fact-checking and rules evaluation. The goal of these modules is to assess the factuality of the generated text and its alignment with pre-specified rules and regulations. With these additional features, you have more transparency and control over the underlying generative logic of the LLM.

This post walks you through the implementation details and design choices, focusing primarily on the content generation and revision modules. Fact-checking and rules evaluation require special coverage and will be discussed in an upcoming post.

Image 1: High-level overview of the AI assistant and its different components

The overall architecture and the main steps in the content creation process are illustrated in Image 2. The solution has been designed using the following services:

Image 2: Content generation steps

The workflow is as follows:

To generate accurate medical content, the LLM is provided with a set of curated scientific data related to the disease in question, e.g. medical journals, articles, websites, etc. These articles are chosen by brand managers, medical experts and other SMEs with adequate medical expertise.

The input also consists of a brief, which describes the general requirements and rules the generated content should adhere to (tone, style, target audience, number of words, etc.). In the traditional marketing content generation process, this brief is usually sent to content creation agencies.

It is also possible to integrate more elaborate rules or regulations, such as the HIPAA privacy guidelines for the protection of health information privacy and security. Moreover, these rules can either be general and universally applicable or they can be more specific to certain cases. For example, some regulatory requirements may apply to some markets/regions or a particular disease. Our generative system allows a high degree of personalization so you can easily tailor and specialize the content to new settings, by simply adjusting the input data.

The content should be carefully adapted to the target audience, either patients or healthcare professionals. Indeed, the tone, style, and scientific complexity should be chosen depending on the reader's familiarity with medical concepts. Content personalization is incredibly important for HCLS companies with a large geographical footprint, as it enables synergies and yields more efficiencies across regional teams.

From a system design perspective, we may need to process a large number of curated articles and scientific journals. This is especially true if the disease in question requires sophisticated medical knowledge or relies on more recent publications. Moreover, medical references contain a variety of information, structured in either plain text or more complex images, with embedded annotations and tables. To scale the system, it is important to seamlessly parse, extract, and store this information. For this purpose, we use Amazon Textract, a machine learning (ML) service for entity recognition and extraction.

Once the input data is processed, it is sent to the LLM as contextual information through API calls. With a context window as large as 200K tokens for Anthropic Claude 3, we can choose to either use the original scientific corpus, hence improving the quality of the generated content (though at the price of increased latency), or summarize the scientific references before using them in the generative pipeline.
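The choice between the raw corpus and summaries can be sketched as a simple budget check. This is an illustrative sketch only, not the post's implementation: the four-characters-per-token estimate, the reserved headroom, and the function names are all assumptions, and only the 200K-token window comes from the text. It also covers just the "does it fit" question, not the latency trade-off.

```python
CONTEXT_WINDOW_TOKENS = 200_000  # window size quoted in the post
RESERVED_TOKENS = 8_000          # assumed headroom for instructions and output
CHARS_PER_TOKEN = 4              # rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def needs_summarization(articles: list[str]) -> bool:
    """Return True when the raw corpus would not fit in the context budget."""
    total = sum(estimate_tokens(article) for article in articles)
    return total > CONTEXT_WINDOW_TOKENS - RESERVED_TOKENS


# Synthetic placeholder texts standing in for curated references.
small_corpus = ["short abstract " * 100]
large_corpus = small_corpus + ["long full-text article " * 50_000]
print(needs_summarization(small_corpus))
print(needs_summarization(large_corpus))
```

In practice a real tokenizer count would replace the character heuristic, and a team might still choose to summarize a corpus that fits in order to reduce latency.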

Medical reference summarization is an essential step in the overall performance optimization and is achieved by leveraging LLM summarization capabilities. We use prompt engineering to send our summarization instructions to the LLM. Importantly, when performed, summarization should preserve as much of each article's metadata as possible, such as the title, authors, date, etc.

Image 3: A simplified version of the summarization prompt

To start the generative pipeline, the user can upload their input data to the UI. This will trigger the Textract and, optionally, the summarization Lambda functions, which, upon completion, will write the processed data to an S3 bucket. Any subsequent Lambda function can read its input data directly from S3. By reading data from S3, we avoid throttling issues usually encountered with WebSockets when dealing with large payloads.
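A downstream function that reads its input from S3 rather than from the request payload might look like the sketch below. The event fields (`bucket`, `key`), the JSON layout with an `articles` list, and the function names are hypothetical; the only AWS call used is the S3 client's `get_object`, and the client is injectable so the logic can be exercised without AWS access.

```python
import json


def load_processed_input(s3_client, bucket: str, key: str) -> dict:
    """Fetch a JSON document written by an upstream step and decode it."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return json.loads(response["Body"].read())


def handler(event, context, s3_client=None):
    """Lambda-style entry point; the event carries only a pointer to S3."""
    if s3_client is None:
        import boto3  # imported lazily so the sketch can run with a stub client
        s3_client = boto3.client("s3")
    # Assumed event shape: {"bucket": "...", "key": "..."}
    payload = load_processed_input(s3_client, event["bucket"], event["key"])
    # Assumed payload shape: {"articles": [...]}
    return {"n_articles": len(payload["articles"])}
```

Passing only the bucket and key between steps keeps each message small regardless of how large the extracted text is.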

Image 4: A high-level schematic of the content generation pipeline

Our solution relies primarily on prompt engineering to interact with Bedrock LLMs. All the inputs (articles, briefs and rules) are provided as parameters to the LLM via a LangChain PromptTemplate object. We can guide the LLM further with few-shot examples illustrating, for instance, the citation styles. Fine-tuning, in particular Parameter-Efficient Fine-Tuning techniques, can specialize the LLM further to the medical knowledge and will be explored at a later stage.
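To make the idea of a parameterized prompt concrete, here is a minimal sketch using plain Python string formatting as a stand-in for a prompt template object. The template wording, tag names, and function name are illustrative assumptions and do not reproduce the authors' actual prompt or the LangChain API.

```python
# Illustrative template only; the real prompt used by the solution is not shown in the post.
PROMPT_TEMPLATE = """You are assisting with medical content for disease awareness.

<articles>
{articles}
</articles>

<brief>
{brief}
</brief>

<rules>
{rules}
</rules>

<examples>
{examples}
</examples>

Write the content described in the brief, using only the articles above and following every rule."""


def build_generation_prompt(articles, brief, rules, examples=()):
    """Fill the template with the three inputs (articles, brief, rules) and optional few-shot examples."""
    return PROMPT_TEMPLATE.format(
        articles="\n\n".join(articles),
        brief=brief,
        rules="\n".join(f"- {rule}" for rule in rules),
        examples="\n\n".join(examples) or "(none)",
    )
```

Keeping the inputs in separate delimited sections makes it easier to swap in summarized references or add few-shot citation examples without rewriting the instructions.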

Image 5: A simplified schematic of the content generation prompt

Our pipeline is multilingual in the sense it can generate content in different languages. Claude 3, for example, has been trained on dozens of different languages besides English and can translate content between them. However, we recognize that in some cases, the complexity of the target language may require a specialized tool, in which case, we may resort to an additional translation step using Amazon Translate.

Image 6: Animation showing the generation of an article on Ehlers-Danlos syndrome, its causes, symptoms, and complications

Revision is an important capability in our solution because it enables you to further tune the generated content by iteratively prompting the LLM with feedback. Since the solution has been designed primarily as an assistant, these feedback loops allow our tool to seamlessly integrate with existing processes, hence effectively assisting SMEs in the design of accurate medical content. The user can, for instance, enforce a rule that has not been perfectly applied by the LLM in a previous version, or simply improve the clarity and accuracy of some sections. The revision can be applied to the whole text. Alternatively, the user can choose to correct individual paragraphs. In both cases, the revised version and the feedback are appended to a new prompt and sent to the LLM for processing.

Image 7: A simplified version of the content revision prompt

Upon submission of the instructions to the LLM, a Lambda function triggers a new content generation process with the updated prompt. To preserve the overall syntactic coherence, it is preferable to re-generate the whole article, keeping the other paragraphs untouched. However, one can improve the process by re-generating only those sections for which feedback has been provided. In this case, proper attention should be paid to the consistency of the text. This revision process can be applied recursively, by improving upon the previous versions, until the content is deemed satisfactory by the user.
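As a rough illustration of the revision step, the sketch below builds a new prompt from the previous version and the user's feedback, targeting either the whole text or a single paragraph. The function name, paragraph numbering scheme, and prompt wording are assumptions made for this example.

```python
def build_revision_prompt(previous_version, feedback, paragraph_index=None):
    """Build a revision prompt from the previous text and the user's feedback.

    paragraph_index is a 0-based index of the paragraph to revise,
    or None to revise the whole article.
    """
    paragraphs = previous_version.split("\n\n")
    if paragraph_index is None:
        target = "the whole article"
    else:
        target = f"paragraph {paragraph_index + 1} only; keep the other paragraphs unchanged"
    # Number the paragraphs so the feedback can refer to them unambiguously.
    numbered = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(paragraphs))
    return (
        "Here is the previous version of the article:\n\n"
        f"{numbered}\n\n"
        f"Reviewer feedback: {feedback}\n\n"
        f"Revise {target}, then return the full article."
    )
```

Because each call takes the latest version as input, the same function can be applied repeatedly until the user is satisfied.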

Image 8: Animation showing the revision of the Ehlers-Danlos article. The user can ask, for example, for additional information

With the recent improvements in the quality of LLM-generated text, generative AI has become a transformative technology with the potential to streamline and optimize a wide range of processes and businesses.

Medical content generation for disease awareness is a key illustration of how LLMs can be leveraged to generate curated and high-quality marketing content in hours instead of weeks, hence yielding a substantial operational improvement and enabling more synergies between regional teams. Through its revision feature, our solution can be seamlessly integrated with existing traditional processes, making it a genuine assistant tool empowering medical experts and brand managers.

Marketing content for disease awareness is also a landmark example of a highly regulated use case, where precision and accuracy of the generated content are critically important. To enable SMEs to detect and correct any possible hallucination and erroneous statements, we designed a factuality checking module with the purpose of detecting potential misalignment in the generated text with respect to source references.

Furthermore, our rule evaluation feature can help SMEs with the MLR process by automatically highlighting any inadequate implementation of rules or regulations. With these complementary guardrails, we ensure both scalability and robustness of our generative pipeline, and consequently, the safe and responsible deployment of AI in industrial and real-world settings.

Sarah Boufelja Y. is a Sr. Data Scientist with 8+ years of experience in Data Science and Machine Learning. In her role at the GenAII Center, she worked with key stakeholders to address their Business problems using the tools of machine learning and generative AI. Her expertise lies at the intersection of Machine Learning, Probability Theory and Optimal Transport.

Liza (Elizaveta) Zinovyeva is an Applied Scientist at AWS Generative AI Innovation Center and is based in Berlin. She helps customers across different industries to integrate Generative AI into their existing applications and workflows. She is passionate about AI/ML, finance and software security topics. In her spare time, she enjoys spending time with her family, sports, learning new technologies, and table quizzes.

Nikita Kozodoi is an Applied Scientist at the AWS Generative AI Innovation Center, where he builds and advances generative AI and ML solutions to solve real-world business problems for customers across industries. In his spare time, he loves playing beach volleyball.

Marion Eigner is a Generative AI Strategist who has led the launch of multiple Generative AI solutions. With expertise across enterprise transformation and product innovation, she specializes in empowering businesses to rapidly prototype, launch, and scale new products and services leveraging Generative AI.

Nuno Castro is a Sr. Applied Science Manager at AWS Generative AI Innovation Center. He leads Generative AI customer engagements, helping AWS customers find the most impactful use case from ideation, prototype through to production. He has 17 years of experience in the field in industries such as finance, manufacturing, and travel, leading ML teams for 10 years.

Aiham Taleb, PhD, is an Applied Scientist at the Generative AI Innovation Center, working directly with AWS enterprise customers to leverage Gen AI across several high-impact use cases. Aiham has a PhD in unsupervised representation learning, and has industry experience that spans across various machine learning applications, including computer vision, natural language processing, and medical imaging.

Read the original post:
Medical content creation in the age of generative AI | Amazon Web Services - AWS Blog

Advancing common bean (Phaseolus vulgaris L.) disease detection with YOLO driven deep learning to enhance … – Nature.com

Building a diverse dataset for common bean disease detection

Common beans, often described as a nearly perfect food for their nutritional richness, stand as a linchpin for economic stability, uplifting the livelihoods of smallholder farmers worldwide [67]. Yet, the specter of disease looms large over common bean cultivation, presenting a daunting challenge. Detecting and validating these diseases constitutes a second major hurdle for pathologists, a complex and time-consuming endeavor that invariably demands expert supervision.

To expedite disease detection and facilitate timely management interventions, and recognizing the inadequacy of existing public resources such as the PlantVillage dataset for common bean diseases, we developed a comprehensive image dataset in collaboration with CGIAR bean network experts. Collectively, 9564 original field images from diverse disease hotspots were amassed. A subset of these images formed our annotated image dataset, outlined in Table 1. These images were curated by expert pathologists for precise disease identification.

To ensure heterogeneity, images were captured in real-field settings, documenting the intricacies of actual field conditions and plant interactions across different growth stages (Supplementary Table 2). Additionally, various cameras were used to capture the images, introducing variations in image quality and background complexity. The resulting realistic spectrum of disease presentations reflects the challenges that crops encounter under dynamic environmental conditions during various growth stages, which is an essential step in developing a globally beneficial, mobile-assisted disease detection tool [35]. This strategic preparation equips our model for deployment in diverse and unpredictable environments where common beans are cultivated.

To enhance the performance of our CNN in identifying common bean diseases, we implemented micro-annotations and data augmentation techniques to create a more robust training dataset. Data augmentation techniques, such as flipping and brightness adjustments, were applied strategically to generate additional data variations, diversifying the dataset and mitigating overfitting (Table 1). These techniques were selectively applied to the classes with the least data to maximize their impact: the CBMV, Rust, and ANTH classes for whole leaf annotations; the Rust class for micro leaf annotations; and the healthy class for pod annotations. These augmentation strategies enriched the training dataset, introducing diversity into the samples and enhancing the performance and generalization of the deep-learning models used for disease detection.
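The two augmentations named above can be sketched in a few lines of plain Python on a small grayscale image stored as a list of rows. This is an illustration only, not the authors' pipeline; the function names and the box format (x_min, y_min, x_max, y_max) in pixel coordinates are assumptions. For object detection, a flip also has to be applied to the bounding boxes, which the second function shows.

```python
def flip_horizontal(image):
    """Mirror a row-major image (list of rows of pixel values) left to right."""
    return [list(reversed(row)) for row in image]


def flip_box_horizontal(box, image_width):
    """Mirror an (x_min, y_min, x_max, y_max) box so it still covers the same object."""
    x_min, y_min, x_max, y_max = box
    return (image_width - x_max, y_min, image_width - x_min, y_max)


def adjust_brightness(image, factor):
    """Scale 8-bit pixel intensities by factor, clipping the result to [0, 255]."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


image = [[0, 100, 200],
         [10, 20, 30]]
flipped = flip_horizontal(image)
brighter = adjust_brightness(image, 1.5)
```

Brightness changes leave the annotations untouched, which is one reason they are a cheap way to add variation for under-represented classes.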

Conversely, micro annotations focus on identifying specific disease symptoms at a micro level, which are essential for training highly accurate and sensitive models. While manually annotating each small symptom can be challenging due to resource constraints, micro annotations have the potential to enhance the generalization of models, allowing them to recognize a wider range of disease variations. However, their performance is highly dependent on factors such as data complexity, data quantity and annotation quality.

The dataset was split (70% training, 20% testing, and 10% validation) to ensure representation across different disease classes. Each image underwent rigorous validation by a bean phytopathologist, resulting in a comprehensive set of 44,022 annotations before data augmentation, expanding to 54,264 after data augmentation (Table 1). This labor-intensive annotation process, conducted by three experts over 4 months, underpins the dataset's quality and reliability. This precise level of labeling surpasses the scope of publicly available datasets, making our model more robust against common issues like overfitting and underfitting. Consequently, the system demonstrates greater efficacy and adaptability for real-world disease detection in diverse agricultural settings.
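One way to obtain a 70/20/10 split that keeps every class represented is to split each class separately. The paper does not describe its exact procedure, so the per-class (stratified) approach, the function name, and the fixed seed below are assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.2, 0.1), seed=0):
    """Split sample indices into (train, test, val) class by class.

    labels holds one class label per image; fractions follow the
    70% / 20% / 10% order given in the text.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test, val = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_test = round(fractions[1] * len(indices))
        train += indices[:n_train]
        test += indices[n_train:n_train + n_test]
        val += indices[n_train + n_test:]  # remainder goes to validation
    return train, test, val


labels = ["rust"] * 10 + ["healthy"] * 20
train, test, val = stratified_split(labels)
```

With images that carry several disease classes at once, a per-image label is a simplification, and a real split would need a rule for assigning such images.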

This study represents a trailblazing effort in evaluating one-stage YOLO object detectors, including YOLOv7, YOLOv8, and YOLO-NAS, specifically for detecting CB diseases. The YOLO series is known for its single-stage detection capability and real-time processing efficiency [68, 69]. Notably, YOLO-NAS stands out within the YOLO family for its advanced performance in detection metrics and its rapid inference speed. We comprehensively assess the performance of our advanced YOLO-based detectors using a range of detailed metrics. The metrics encompass various annotation resolutions (whole and micro) for both leaf and pod datasets. This multifaceted evaluation approach allows us to compare the detectors' performance across different plant parts, providing a comprehensive analysis.

Training loss analysis plays a crucial role in emphasizing the efficiency, adaptability, and stability of the YOLOv7 and YOLOv8 models during the learning process. Both models exhibited a rapid initial decline in loss for both the leaf and pod datasets (see Fig. 5). This rapid decrease signifies the overall effectiveness of the learning and adaptation to the training data. This observation is consistent with prior findings on training duration and loss convergence [70], affirming the diverse convergence rates observed during training. The consistent decline in training loss further validates the effectiveness of the model.

Total loss for different models, at whole annotations level; (a) train loss for leaves and pods, (b) validation loss for leaves and pods.

However, the YOLOv8 model displayed an anomaly in the annotated pod dataset, where the loss starts to increase around epoch 16 before continuing on a downward trend. This could be attributed to the increased complexity of the annotated pod dataset compared to its previous training datasets. Once it overcomes the initial hurdle, the model begins to effectively learn from the data, signifying a positive stability in the learning process.

Despite this anomaly, both models exhibited a consistent and steady decline in loss over time, indicating a positive stability in their learning process. Lower loss values on the training set compared to the validation set align with expectations. The relative stability in the difference between training and validation losses across epochs indicates the absence of significant overfitting in both the models, highlighting effective generalization, a common challenge in model training. This means that the models appear to learn underlying patterns in the data rather than memorizing specific training examples.

On the contrary, while the YOLO-NAS model exhibited similar trends for the leaf dataset (Fig.6), in both full annotation and micro-annotation levels (Supplementary Fig.1), its validation losses for the pod dataset displayed significant fluctuations (Fig.7, Supplementary Fig.2). These fluctuations suggest potential overfitting, likely stemming from non-representative validation data or inadequate regularization techniques. This behavior could elucidate the lower mAP scores observed for YOLO-NAS in the pod dataset. It underscores the critical importance of careful dataset curation and the potential need for adjusting regularization while training such models.

YOLO-NAS model evaluation indicators (loss and mAP@0.5) during training, for whole leaf annotations.

YOLO-NAS model evaluation indicators (loss and mAP@0.5) during training, for whole pod annotations.

Notably, at the micro annotation level, the YOLOv8 training on the pod dataset stopped at epoch 217 (Supplementary Fig.3) due to the absence of improvement in the validation loss for 50 consecutive epochs. Additionally, YOLOv7 training on the leaf dataset showed a slight overfitting tendency as the validation loss followed an increasing trend around epoch 40 instead of decreasing, signifying an augmented difference between validation and training loss. The individual contributions of each loss to the total loss of the YOLOv7 model can be seen in Supplementary Figs. 4, 5, 6 and 7, and those of the YOLOv8 model in Supplementary Figs. 8, 9, 10 and 11.
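The stopping rule described here, halting after a fixed number of epochs without improvement in validation loss, can be sketched as follows. This is a generic patience-based rule written for illustration, not the training framework's actual implementation; the function name is an assumption.

```python
def early_stopping_epoch(val_losses, patience=50):
    """Return the 1-based epoch at which training would stop, or None if it never stops.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Toy example with a short patience: the best loss is at epoch 3.
toy_losses = [1.0, 0.9, 0.8, 0.85, 0.85, 0.85, 0.85, 0.85]
stop_epoch = early_stopping_epoch(toy_losses, patience=3)
```

A rule like this also acts as a guard against the kind of late rise in validation loss noted for YOLOv7 on the leaf dataset.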

To assess the performance of object detection models, particularly focusing on the widely recognized mean Average Precision (mAP) metric [35], we evaluated different YOLO models for leaf and pod detection. This metric has been the benchmark in competitions like PASCAL VOC, ImageNet, and COCO datasets. We complemented this analysis with confusion matrices to gain deeper insights into model performance, with a particular emphasis on unseen data at both whole and micro annotation levels.

The mAP scores of different YOLO models for leaf and pod detection are detailed in Table 3. YOLO-NAS stands out with a remarkable performance for whole leaf disease annotation, achieving an impressive mAP score of 97%. Nevertheless, for micro annotation on leaves, YOLOv8 excelled with a notable mAP score of 67%. In pod detection tasks, YOLOv8 continued to perform best, achieving mAP scores of 96% and 87% for whole and micro annotations, respectively (Table 4). It is worth noting that YOLOv7 closely mirrored the performance of YOLOv8, achieving high mAP scores for both leaf and pod datasets.

Across all classes and models, whole annotations generally yielded better results than micro annotations. Specifically, for the healthy class, YOLOv7 and YOLOv8 achieved high mAP accuracies of 96% and 98%, respectively, for pods across both annotation levels, and for leaves at the micro annotation level. In the case of whole leaf annotations, these models also performed well, with mAP scores of 96% and 85%, respectively. The evolution of the validation mAP during model training is illustrated in Fig.8, demonstrating a continual increase of the mAP throughout the epochs until it becomes relatively uniform.

mAP@0.5 evaluation metric for different models; (a) whole leaf annotations, (b) whole pod annotations.

These findings provide further context to our study, emphasizing that although all three YOLO versions can achieve high accuracy, their performance nuances become apparent based on the complexity of annotation levels and the specific nature of the tasks. The noteworthy performance of YOLO-NAS under certain conditions and the close competition between YOLOv7 and YOLOv8 highlight the continuous advancements in object detection technologies, showcasing their potential applications in precision agriculture.

The differences in loss patterns and mAP scores among the models suggest that while YOLOv7 and YOLOv8 exhibit robustness in various scenarios, YOLO-NAS may require more specific tuning, especially when confronted with datasets of higher variability or complexity. This insight proves invaluable for future model development and application, particularly in precision agriculture, where precise and reliable disease detection is imperative. These findings underscore the necessity for continuous model evaluation and adjustment to cater to the specific characteristics of diverse datasets and detection tasks.

In our study, confusion matrices served as a pivotal tool for assessing the performance of various YOLO model variants. These matrices, delineating true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), played a crucial role in evaluating disease-specific accuracy and identifying misclassifications (Fig.9). The analysis revealed valuable instances, where class complexity resulted in reduced accuracy, providing insights into areas prone to errors for targeted improvements.

Confusion matrix using whole annotations for different models, (a) YOLO-NAS model, (b) YOLOv7 model, (c) YOLOv8 model.

The YOLO-NAS model demonstrated robustness with over 97% accuracy in detecting symptoms of leaf diseases across all classes. However, for ALS pod symptoms, YOLOv7 and YOLOv8 outperformed YOLO-NAS with accuracies of 95% and 93%, respectively. YOLO-NAS's detection rate for these symptoms dropped from 79% for whole annotations to 48% at the micro annotation level, compared to 66% each for YOLOv7 and YOLOv8. Diseases like rust and anthracnose, particularly challenging at micro levels, showed lower accuracies around 56%. No misclassifications occur at the micro annotation level (Supplementary Fig. 12), but the number of undetected objects increases considerably.

Interestingly, the confusion matrices highlighted additional detections made by the models. Although these additional detections slightly affected precision, they were mostly correct and indicated the models' effectiveness in identifying objects, a critical factor in complex agricultural scenarios where exhaustive annotation might be challenging. This aspect is significant as it showcases the models' capability in comprehensive detection despite the inherent difficulties in annotating every detail in a small diseased image. Furthermore, recognizing the widespread occurrence of CB leaf spot disease across diverse regions such as Asia, Latin America, and Africa, we are actively compiling and annotating early-stage symptom images. This endeavor aims at enhancing the models' accuracy and adaptability for real agricultural settings.

Our study also focused on evaluating the performance of various YOLO models in detecting CB diseases, utilizing precision and recall as key metrics. Precision, indicating the ratio of correctly identified positive cases to all predicted positives, and recall, measuring the ratio of correctly identified positives out of all actual positive cases, are essential metrics for assessing the diagnostic accuracy of the models.
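The two definitions translate directly into code. The counts in the example are invented purely to illustrate how many extra detections can lower precision while recall stays high; they are not taken from the paper's tables.

```python
def precision_recall(tp, fp, fn):
    """Compute precision = TP / (TP + FP) and recall = TP / (TP + FN).

    Returns 0.0 for a metric whose denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative counts only: 90 correct detections, 10 spurious ones, 30 missed objects.
precision, recall = precision_recall(tp=90, fp=10, fn=30)
```

Note that true negatives do not appear in either formula, which is why these metrics suit object detection, where the "background" has no natural count.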

The YOLOv7 and YOLOv8 models exhibited excellent performance, achieving precision and recall scores of 0.9 at the whole annotation level for both leaves and pods, as shown in Table 5. However, a decrease in these scores was observed for micro annotations, suggesting a variation in model performance based on annotation detail. YOLO-NAS demonstrated high effectiveness, particularly in whole-leaf annotations, with a precision of 0.7 and an impressive recall of 0.988. Despite its lower precision due to a higher number of extra detections, its high recall confirms its strong detection capability. For whole pod annotations, YOLO-NAS showed a precision of 0.6, primarily impacted by the model misclassifying some diseased pods as healthy (Fig.9). The model maintained high recall levels even at the micro annotation level, though there was a slight drop compared to whole annotations.

In the evaluation of the performance of models, a confidence threshold (conf) of 0.35 was set uniformly across all three models. This threshold choice involves a trade-off between precision and recall, where a higher confidence threshold tends to increase precision by reducing false positives but may decrease recall as some true positives with lower confidence might be overlooked (Fig. 10a,b, Supplementary Figs. 13-20b,c). This delicate balance is visually represented in the Precision-Recall (PR) curve (Fig. 10c,d, Supplementary Figs. 13-20a), with the area under the PR curve (AUPRC) serving as a comprehensive metric summarizing model performance across all thresholds. A similar behavior is observed at the micro annotation level (Supplementary Fig. 21).
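The trade-off can be seen by filtering a list of detections at different confidence thresholds. The detections below are made up for illustration, and matching detections to ground truth (for example by IoU) is assumed to have been done already, so each detection simply carries a correct/incorrect flag.

```python
def precision_recall_at(detections, n_ground_truth, conf_threshold):
    """Precision and recall after discarding detections below conf_threshold.

    detections is a list of (confidence, is_correct) pairs.
    By convention here, precision is 1.0 when nothing is kept.
    """
    kept = [is_correct for conf, is_correct in detections if conf >= conf_threshold]
    tp = sum(kept)
    fp = len(kept) - tp
    precision = tp / (tp + fp) if kept else 1.0
    recall = tp / n_ground_truth if n_ground_truth else 0.0
    return precision, recall


# Hypothetical detections for an image set containing 5 annotated objects.
detections = [(0.95, True), (0.80, True), (0.60, False),
              (0.50, True), (0.40, False), (0.30, True)]
at_035 = precision_recall_at(detections, n_ground_truth=5, conf_threshold=0.35)
at_070 = precision_recall_at(detections, n_ground_truth=5, conf_threshold=0.70)
```

In this toy case, raising the threshold from 0.35 to 0.70 removes both false positives but also drops a correct low-confidence detection, so precision rises while recall falls. Sweeping the threshold over its full range traces out the PR curve.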

Operation results curve for YOLO-NAS model, using whole leaf and pod annotations; (a) precision-confidence curve, (b) recall-confidence curve, (c) precision-recall curve for leaf, and (d) precision-recall curve for pod.

The standardized dataset employed in our study ensures rigorous evaluation, enabling reliable comparisons of each model's performance.

After analyzing the results obtained from all the metrics mentioned above, Fig.11 shows the predictions made by the YOLO-NAS model on the leaf dataset at the whole annotation level, Fig.12 at the micro annotation level, and finally Fig.13 shows the predictions on the pod dataset at both annotation levels.

Some examples of common bean disease detection results using YOLO-NAS and whole annotations.

Some examples of common bean disease detection results using YOLO-NAS and micro annotations.

Some examples of common bean disease detection results using YOLO-NAS on the pod dataset.

Surprisingly, our analysis found that micro annotation yielded lower performance than whole annotation across all explored YOLO models, regardless of disease classes. This result contradicts the hypothesis that micro annotations will improve detection accuracy. The discrepancy suggests that the effectiveness of annotation methods may vary depending on factors such as dataset complexity, disease characteristics, and model requirements. However, the detection accuracy may be lower with micro annotation models because we did not fully annotate all lesions in each image, particularly due to the high number of lesions per image. Developing improved annotation techniques could significantly enhance the accuracy and efficiency of these annotations. Further investigation is warranted to understand this finding and optimize annotation strategies for future CB disease detection research.

Conversely, Figs. 14 and 15 show situations where specific models outperform others. For example, in Fig. 14, the YOLOv7 and YOLOv8 models successfully detect the pod within the image, whereas YOLO-NAS does not. This aligns with the mAP, precision, and recall results shown above. However, Fig. 15 demonstrates the ability of YOLO-NAS to more accurately identify healthy leaves, while the YOLOv8 model fails to detect many of the healthy leaves. This aligns with the models' respective mAP scores for the healthy class.

Example of prediction in the same image. (a) YOLO-NAS model, (b) YOLOv7 model, (c) YOLOv8 model.

Example of prediction in the same image. (a) YOLO-NAS model, (b) YOLOv8 model.

To bridge the gap between research and practical application, we seamlessly integrated our promising whole annotation YOLO-NAS models into a user-friendly Android app. The app boasts a straightforward design, allowing users to either upload existing photos from their storage or capture new ones in real time for immediate analysis (Fig.16). This real-time capability played a pivotal role in evaluating the functional accuracy of the models within the practical context of a real-world app.

Developed mobile application for bean disease detection. (a) Initial screen, (b) image taking and scan, (c) diagnostic screen for leaf, (d) recommendations screen.

To evaluate the real-time performance of the app, we tested real-field images from disease hotspots in Latin America and Africa. Our results demonstrate that the YOLO-NAS model detected almost all classes with outstanding accuracy (Table 6). Specifically, the model achieved close to 100% accuracy with high confidence scores across all disease classes, except for the pod classes (Table 6). The lower confidence in pod predictions can be attributed to the inherent complexity of the task. In real-field environments, bean pods are often surrounded by diverse background elements that make it difficult to distinguish the pods from the background, leading to lower confidence levels in the model's predictions. Furthermore, factors such as varying lighting conditions, shadows, and cluttered backgrounds can contribute to the difficulty of the task. Despite these limitations, the model successfully manages to detect almost all pods correctly.

This successful field deployment not only validates the reliability of the app but also underscores its potential as a robust agricultural tool for enabling timely interventions in CB disease management practices that can significantly enhance crop yields and reduce reliance on pesticides.

Our extensive analysis revealed the prowess of the YOLO-NAS model in real-time CB disease detection within agricultural settings (Supplementary Fig. 22, Table 6). Engineered through the cutting-edge technique of Neural Architecture Search (NAS), YOLO-NAS adeptly strikes a balance between speed and accuracy amidst the challenges in field conditions, which is a pivotal attribute for prompt and precise disease diagnosis [71]. Notably, it effectively combines the quick detection characteristic of one-stage detectors and the precision akin to two-stage detectors, achieving high efficiency and precision, with reduced risk of overfitting [72]. The performance of YOLO-NAS stands out, particularly for its high mAP scores on benchmark datasets like COCO and its lower latency compared to counterparts such as YOLOv7 and YOLOv8.

Continuing the exploration of model performance, the YOLOv7 and YOLOv8 models have demonstrated robustness for edge computing applications in agricultural contexts, especially in remote areas with limited internet connectivity. Our findings demonstrated high accuracy, precision, and recall achieved by these models, proving them reliable tools for rapid and effective CB disease management. Their ability to function independently on local devices empowers farmers to conduct immediate on-site diagnostics. This aligns with the findings of [44], which emphasized the proficiency of the YOLOv7 model in tea leaf disease detection. That research, which recorded precision and recall values exceeding 90%, reinforces the effectiveness of YOLOv7 in accurately identifying plant diseases, mirroring our own observations with both YOLOv7 and YOLOv8 models.

In contrast, while YOLO-NAS exhibits good precision and generalization, its current complexity introduces challenges for offline use in field conditions. This primarily stems from the lack of readily available tools to efficiently convert the model into a lightweight format suitable for mobile devices. This limitation renders YOLO-NAS less viable for immediate use in the field without cloud support. Nevertheless, the ongoing advancements in YOLO models, particularly the online proficiency of YOLO-NAS, paint a promising future. Researchers aim to amalgamate the accuracy of YOLO-NAS with the offline capabilities of YOLOv7 and YOLOv8, expanding the accessibility of AI-powered disease detection tools to agricultural professionals worldwide, irrespective of their internet connectivity. This transformative interaction holds the potential to revolutionize disease management strategies in agriculture, offering a seamless blend of precision and accessibility.

Follow this link:
Advancing common bean (Phaseolus vulgaris L.) disease detection with YOLO driven deep learning to enhance ... - Nature.com

Machine Learning Revolutionizes Flood Relief Efforts in Assam – TechiExpert.com

Floods have become an annual phenomenon in Assam. Millions of people are displaced every year and forced to take refuge in relief camps. In response, the Indian government allocates hundreds of crores for relief efforts. It is important to ensure that the funds and resources are efficiently utilized and reach the right locations. This has always been a challenge due to fragmented government data stored across various departments.

A breakthrough has emerged in the form of CivicDataLabs (CDL) lately. It is a startup that specializes in data unification and analysis. It has partnered with the Assam government to streamline and standardize disaster-related data.

CDL is tackling the fragmented data issue. It cleans and standardizes datasets from 18 different state departments and central agencies. It introduced the Open Contracting Data Standard, which has been adopted by more than 50 countries. Assam became the first Indian state to adopt the standard.

CDL employed simple machine learning techniques to perform hazard analysis and disaster modeling. The analysis thereafter identified high-risk districts and revenue circles. Valuable insights were provided for the Assam government.

CDL enables comprehensive assessment of past trends and helps in planning for the upcoming monsoon cycles and restoration efforts. The Assam government has seen positive results with the model in identifying previously overlooked areas. The data-driven approach has significantly improved the allocation of resources and simultaneously ensured more effective flood relief efforts.

Moreover, the capabilities of the Assam government departments have expanded. Generating analytical reports during disasters was a slow process earlier due to lack of suitable software and data visualization techniques.

With the new development, there is a renewed enthusiasm for data science. Government officials are now actively pursuing online courses in geospatial data management and AI.

Excerpt from:
Machine Learning Revolutionizes Flood Relief Efforts in Assam - TechiExpert.com

Accelerated PyTorch inference with torch.compile on AWS Graviton processors | Amazon Web Services – AWS Blog

Originally PyTorch used an eager mode where each PyTorch operation that forms the model is run independently as soon as it's reached. PyTorch 2.0 introduced torch.compile to speed up PyTorch code over the default eager mode. In contrast to eager mode, torch.compile pre-compiles the entire model into a single graph in a manner that's optimal for running on a given hardware platform. AWS optimized the PyTorch torch.compile feature for AWS Graviton3 processors. This optimization results in up to 2x better performance for Hugging Face model inference (based on geomean of performance improvement for 33 models) and up to 1.35x better performance for TorchBench model inference (geomean of performance improvement for 45 models) compared to the default eager mode inference across several natural language processing (NLP), computer vision (CV), and recommendation models on AWS Graviton3-based Amazon EC2 instances. Starting with PyTorch 2.3.1, the optimizations are available in torch Python wheels and AWS Graviton PyTorch deep learning container (DLC).

In this blog post, we show how we optimized torch.compile performance on AWS Graviton3-based EC2 instances, how to use the optimizations to improve inference performance, and the resulting speedups.

In eager mode, operators in a model are run immediately as they are encountered. It's easier to use, more suitable for machine learning (ML) researchers, and hence is the default mode. However, eager mode incurs runtime overhead because of redundant kernel launches and memory reads. In torch.compile mode, by contrast, operators are first synthesized into a graph, in which one operator is merged with another to reduce and localize memory reads and total kernel launch overhead.
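As a rough illustration of why merging operators helps, the toy sketch below evaluates relu(x * w + b) two ways: operator by operator, writing an intermediate buffer after each step, and as a single fused pass. It is plain Python, not PyTorch internals, and the function names and the buffer counts are hypothetical, introduced only for this illustration.

```python
from typing import List, Tuple


def eager_style(x: List[float], w: float, b: float) -> Tuple[List[float], int]:
    """Run each operator separately; every step writes a full intermediate buffer."""
    scaled = [v * w for v in x]                 # op 1: multiply -> buffer 1
    shifted = [v + b for v in scaled]           # op 2: add      -> buffer 2
    activated = [max(v, 0.0) for v in shifted]  # op 3: relu     -> buffer 3
    return activated, 3


def fused_style(x: List[float], w: float, b: float) -> Tuple[List[float], int]:
    """Apply all three operators per element in one pass; only the output is written."""
    out = [max(v * w + b, 0.0) for v in x]
    return out, 1


if __name__ == "__main__":
    data = [-2.0, -0.5, 0.0, 1.5, 3.0]
    eager_out, eager_buffers = eager_style(data, w=2.0, b=1.0)
    fused_out, fused_buffers = fused_style(data, w=2.0, b=1.0)
    # Same numerical result, fewer buffers written and read back.
    print(eager_out == fused_out, eager_buffers, fused_buffers)
```

Both versions compute the same values; the fused one simply avoids writing and re-reading two intermediate buffers, which is the kind of saving graph compilation aims for at a much larger scale.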

The goal for the AWS Graviton team was to optimize the torch.compile backend for Graviton3 processors. PyTorch eager mode was already optimized for Graviton3 processors with Arm Compute Library (ACL) kernels using oneDNN (also known as MKLDNN). So the question was how to reuse those kernels in torch.compile mode to get the best of graph compilation and optimized kernel performance together.

The AWS Graviton team extended the torch inductor and oneDNN primitives that reused the ACL kernels and optimized compile mode performance on Graviton3 processors. Starting with PyTorch 2.3.1, the optimizations are available in the torch Python wheels and AWS Graviton DLC. Please see the Running an inference section that follows for the instructions on installation, runtime configuration, and how to run the tests.

To demonstrate the performance improvements, we used NLP, CV, and recommendation models from TorchBench and the most downloaded NLP models from Hugging Face across Question Answering, Text Classification, Token Classification, Translation, Zero-Shot Classification, Summarization, Feature Extraction, Text Generation, Text2Text Generation, Fill-Mask, and Sentence Similarity tasks to cover a wide variety of customer use cases.

We started by measuring TorchBench model inference latency, in milliseconds (msec), for the eager mode, which is marked 1.0 with a red dotted line in the following graph. Then we measured the improvement from torch.compile for the same model inference and plotted the normalized results in the graph. For the 45 models we benchmarked, there is a 1.35x latency improvement (geomean for the 45 models).
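To make the normalization concrete, here is a minimal sketch of how a geomean speedup can be derived from per-model latencies. The model names and latency values are hypothetical placeholders, not measurements from this post; the eager latency of each model is the 1.0 reference, and the speedup is eager latency divided by compiled latency.

```python
from statistics import geometric_mean
from typing import Dict


def normalized_speedups(eager_ms: Dict[str, float],
                        compiled_ms: Dict[str, float]) -> Dict[str, float]:
    """Per-model speedup relative to eager mode (eager itself is 1.0)."""
    return {name: eager_ms[name] / compiled_ms[name] for name in eager_ms}


def geomean_speedup(eager_ms: Dict[str, float],
                    compiled_ms: Dict[str, float]) -> float:
    """Geometric mean of the per-model speedups."""
    return geometric_mean(list(normalized_speedups(eager_ms, compiled_ms).values()))


if __name__ == "__main__":
    # Hypothetical latencies in milliseconds, for illustration only.
    eager = {"model_a": 100.0, "model_b": 40.0, "model_c": 90.0}
    compiled = {"model_a": 50.0, "model_b": 40.0, "model_c": 45.0}
    print(normalized_speedups(eager, compiled))
    print(geomean_speedup(eager, compiled))
```

The geometric mean is the usual choice for averaging ratios, because a single model with a very large speedup does not dominate the summary the way it would with an arithmetic mean.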

Image 1: PyTorch model inference performance improvement with torch.compile on AWS Graviton3-based c7g instance using TorchBench framework. The reference eager mode performance is marked as 1.0. (higher is better)

Similar to the preceding TorchBench inference performance graph, we started by measuring the Hugging Face NLP model inference latency, in msec, for the eager mode, which is marked 1.0 with a red dotted line in the following graph. Then we measured the improvement from torch.compile for the same model inference and plotted the normalized results in the graph. For the 33 models we benchmarked, there is around a 2x performance improvement (geomean for the 33 models).

Image 2: Hugging Face NLP model inference performance improvement with torch.compile on AWS Graviton3-based c7g instance using Hugging Face example scripts. The reference eager mode performance is marked as 1.0. (higher is better)

Starting with PyTorch 2.3.1, the optimizations are available in the torch Python wheel and in AWS Graviton PyTorch DLC. This section shows how to run inference in eager and torch.compile modes using torch Python wheels and benchmarking scripts from Hugging Face and TorchBench repos.

To successfully run the scripts and reproduce the speedup numbers mentioned in this post, you need an instance from the Graviton3 family (c7g/r7g/m7g/hpc7g) of hardware. For this post, we used the c7g.4xl (16 vCPU) instance. The instance, the AMI details, and the required torch library versions are mentioned in the following snippet.

The generic runtime tunings implemented for eager mode inference are equally applicable to torch.compile mode, so we set the following environment variables to further improve torch.compile performance on AWS Graviton3 processors.
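One plausible set of such variables, assumed here from PyTorch's general Graviton inference tuning guidance rather than taken from this post, can be applied from Python as sketched below. The variable names and values are assumptions to verify against the original article, and the helper name apply_tunings is hypothetical. It is simplest to apply them before importing torch.

```python
import os
from typing import Dict, MutableMapping

# Assumed tuning values; verify against the original post before relying on them.
GRAVITON_TUNINGS: Dict[str, str] = {
    # Let oneDNN use bfloat16 fast-math GEMM kernels for fp32 inference.
    "DNNL_DEFAULT_FPMATH_MODE": "BF16",
    # Enable transparent huge page allocations for tensor memory.
    "THP_MEM_ALLOC_ENABLE": "1",
    # Cache oneDNN primitives to avoid re-creating them on every call.
    "LRU_CACHE_CAPACITY": "1024",
}


def apply_tunings(env: MutableMapping[str, str] = os.environ) -> None:
    """Set each tuning variable unless the user has already exported a value."""
    for key, value in GRAVITON_TUNINGS.items():
        env.setdefault(key, value)


if __name__ == "__main__":
    apply_tunings()
    for key in GRAVITON_TUNINGS:
        print(key, "=", os.environ[key])
```

Using setdefault keeps any value that was already exported in the shell, so the sketch does not silently override a deliberate local setting.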

TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance. We benchmarked 45 models using the scripts from the TorchBench repo. The following code shows how to run the scripts for the eager mode and the compile mode with the inductor backend.

On successful completion of the inference runs, the script stores the results in JSON format. The following is the sample output:

The Google T5 Small Text Translation model is one of roughly 30 Hugging Face models we benchmarked. We're using it as a sample model to demonstrate how to run inference in eager and compile modes. The additional configurations and APIs required to run it in compile mode are highlighted in BOLD. Save the following script as google_t5_small_text_translation.py.

Run the script with the following steps.

On successful completion of the inference runs, the script prints the torch profiler output with the latency breakdown for the torch operators. The following is the sample output from torch profiler:

Next, we're extending the torch inductor CPU backend support to compile the Llama model, and adding support for fused GEMM kernels to enable the torch inductor operator fusion optimization on AWS Graviton3 processors.

In this tutorial, we covered how we optimized torch.compile performance on AWS Graviton3-based EC2 instances, how to use the optimizations to improve PyTorch model inference performance, and demonstrated the resulting speedups. We hope that you will give it a try! If you need any support with ML software on Graviton, please open an issue on the AWS Graviton Technical Guide GitHub.

Sunita Nadampalli is a Software Development Manager and AI/ML expert at AWS. She leads AWS Graviton software performance optimizations for AI/ML and HPC workloads. She is passionate about open source software development and delivering high-performance and sustainable software solutions for SoCs based on the Arm ISA.

See the article here:
Accelerated PyTorch inference with torch.compile on AWS Graviton processors | Amazon Web Services - AWS Blog

Create an end-to-end serverless digital assistant for semantic search with Amazon Bedrock | Amazon Web Services – AWS Blog

With the rise of generative artificial intelligence (AI), an increasing number of organizations use digital assistants to have their end-users ask domain-specific questions, using Retrieval Augmented Generation (RAG) over their enterprise data sources.

As organizations transition from proofs of concept to production workloads, they establish objectives to run and scale their workloads with minimal operational overhead, while optimizing on costs. Organizations also require the implementation of common security practices such as identity and access management, to make sure that only authorized and authenticated users are allowed to perform specific actions or access specific resources.

This post covers a solution to create an end-to-end digital assistant as a web application using a serverless architecture to address these requirements. Because the solution components primarily use serverless technologies, it provides several benefits, such as automatic scaling, built-in high availability, and a pay-per-use billing model to optimize on costs. The solution also includes an authentication layer and an authorization layer to manage identities and permissions.

This solution also uses the hybrid search feature of Knowledge Bases for Amazon Bedrock to increase the relevancy of retrieved results using RAG. When receiving a query from an end-user, hybrid search performs both a semantic search and a keyword search:

For example, if a user submits a prompt that includes keywords, a text-based search may provide better results than a semantic search. This is why hybrid search combines the two approaches: the precision of semantic search and coverage of keywords. For more information about hybrid search, see Knowledge Bases for Amazon Bedrock now supports hybrid search.
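As a minimal sketch of how a client might ask for hybrid search, the helper below builds the keyword arguments for the boto3 bedrock-agent-runtime retrieve call, setting overrideSearchType to HYBRID. The function names, the default of five results, and the placeholder knowledge base ID are assumptions for illustration; the request is only sent if you call retrieve_passages with valid AWS credentials and an existing knowledge base.

```python
from typing import Any, Dict, List


def build_retrieve_request(knowledge_base_id: str, query: str,
                           top_k: int = 5, hybrid: bool = True) -> Dict[str, Any]:
    """Build keyword arguments for the bedrock-agent-runtime `retrieve` call."""
    return {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {
                "numberOfResults": top_k,
                # HYBRID combines semantic and keyword search; SEMANTIC uses vectors only.
                "overrideSearchType": "HYBRID" if hybrid else "SEMANTIC",
            }
        },
    }


def retrieve_passages(knowledge_base_id: str, query: str) -> List[Dict[str, Any]]:
    """Send the request; needs boto3, AWS credentials, and an existing knowledge base."""
    import boto3  # imported lazily so the builder above works without the AWS SDK

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve(**build_retrieve_request(knowledge_base_id, query))
    return response["retrievalResults"]


if __name__ == "__main__":
    # "KB_ID_PLACEHOLDER" is a placeholder, not a real knowledge base ID.
    print(build_retrieve_request("KB_ID_PLACEHOLDER", "What is the refund policy?"))
```

Keeping the request construction separate from the network call makes the search type easy to inspect and to switch between HYBRID and SEMANTIC when comparing result quality.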

In this post, we provide an operational overview of the solution, and then describe how to set it up with the following services:

The solution architecture involves the following steps:

After Step 9, the foundation model generates a response that is returned to the user in the web application's digital assistant.

The following diagram illustrates this workflow.

To follow along and set up this solution, you must have the following:

In this section, we create a knowledge base in Amazon Bedrock. The knowledge base will enrich the prompt submitted to an Amazon Bedrock foundation model with contextual information derived from our data source (in our case, documents uploaded to an S3 bucket).

During the creation of the knowledge base, a vector store will also be created to ingest documents encoded as vectors, using an embeddings model. An embeddings model encodes data as vectors in order to capture the meaning and context of our sample documents. This allows us to find data relevant to our end-user prompts.
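To illustrate how vectors make relevant data findable, the toy sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors and document names are hypothetical stand-ins; a real embeddings model produces vectors with many more dimensions, and a vector store uses approximate nearest-neighbor indexes rather than this exhaustive comparison.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec: Sequence[float],
                   doc_vecs: Dict[str, Sequence[float]]) -> List[str]:
    """Return document names ordered from most to least similar to the query."""
    return sorted(doc_vecs,
                  key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
                  reverse=True)


if __name__ == "__main__":
    # Hypothetical embeddings, for illustration only.
    documents = {
        "refund_policy": [0.9, 0.1, 0.0],
        "shipping_times": [0.1, 0.9, 0.0],
        "office_hours": [0.0, 0.1, 0.9],
    }
    query = [0.8, 0.2, 0.0]
    print(rank_documents(query, documents))
```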

For our use case, we use the vector engine for OpenSearch Serverless as the vector store and the Titan Text Embeddings G1 model as the embeddings model.

Complete the following steps to create an S3 bucket to upload documents, and synchronize them with a knowledge base in Amazon Bedrock:

In this section, we create the following resources:

Complete the following steps to create the API and the backend of the digital assistant's web application, using AWS CloudFormation templates:

You can retrieve the knowledge base ID by running the following AWS CLI command:
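The same lookup can also be sketched in Python with boto3, assuming the SDK is installed and credentials are configured. The helper names and the knowledge base name used in the example are hypothetical; the sketch reads only the first page returned by list_knowledge_bases and does not follow pagination tokens.

```python
from typing import Any, Dict, Optional


def find_knowledge_base_id(response: Dict[str, Any], name: str) -> Optional[str]:
    """Pick the ID of the knowledge base called `name` from a list_knowledge_bases response."""
    for summary in response.get("knowledgeBaseSummaries", []):
        if summary.get("name") == name:
            return summary.get("knowledgeBaseId")
    return None


def lookup_knowledge_base_id(name: str) -> Optional[str]:
    """Query AWS for the first page of knowledge bases; needs boto3 and credentials."""
    import boto3  # imported lazily so the parser above works without the AWS SDK

    client = boto3.client("bedrock-agent")
    return find_knowledge_base_id(client.list_knowledge_bases(), name)


if __name__ == "__main__":
    # A hand-written response with made-up values, shaped like the API output.
    sample = {
        "knowledgeBaseSummaries": [
            {"knowledgeBaseId": "EXAMPLEID1", "name": "other-kb"},
            {"knowledgeBaseId": "EXAMPLEID2", "name": "assistant-kb"},
        ]
    }
    print(find_knowledge_base_id(sample, "assistant-kb"))
```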

In this section, we create a user in our Amazon Cognito user pool. This user will be used to log in to our web application.

Complete the following steps to configure the Amazon Cognito user pool created in the previous section:

You can also complete these steps by running the script cognito-create-testuser.sh available in the api folder as follows (provide your email address):

After you create the user, you should receive an email with a temporary password in this format: Your username is #your-email-address# and temporary password is #temporary-password#.

Keep note of these login details (email address and temporary password) to use later when testing the web application.

In this section, we build a web application using Amplify and publish it to make it accessible through an endpoint URL. To complete this section, you must first install and set up the Amplify CLI, as discussed in the prerequisites.

Complete the following steps to create the web application of the digital assistant:

The amplify-setup.sh script creates an Amplify application and configures it to integrate with resources you created in the previous modules:

In this step, we configure how the web application will be deployed and hosted:

The web application is now available for testing and a URL should be displayed, as shown in the following screenshot. Take note of the URL to use in the following section.

In this section, you test the web application of the digital assistant:

You should receive a response along with sources, as shown in the following screenshot.

To make sure that no additional cost is incurred, remove the resources provisioned in your account. Make sure you're in the correct AWS account before deleting the following resources.

You should exercise caution when performing the preceding steps. Make sure you are deleting the resources in the correct AWS account.

In this post, we walked through a solution to create a digital assistant using serverless services. First, we created a knowledge base and ingested documents into it from an S3 bucket. Then we created an API and a Lambda function to submit prompts to the knowledge base. We also configured a user pool to grant a user access to the digital assistant's web application. Finally, we created the frontend of the web application in Amplify.

For further information on the services used, consult the Amazon Bedrock, Security in Amazon Bedrock, Amazon OpenSearch Serverless, AWS Amplify, Amazon API Gateway, AWS Lambda, Amazon Cognito, and Amazon S3 product pages.

To dive deeper into this solution, a self-paced workshop is available in AWS Workshop Studio, at this location.

Mehdi Amrane is a Senior Solutions Architect at Amazon Web Services. He supports customers on their initiatives and provides them with prescriptive guidance to achieve their goals and accelerate their cloud journey. He is passionate about creating content on application architecture, DevOps and Serverless technologies.

View original post here:
Create an end-to-end serverless digital assistant for semantic search with Amazon Bedrock | Amazon Web Services - AWS Blog