Category Archives: Machine Learning
Efficient and cost-effective multi-tenant LoRA serving with Amazon SageMaker | Amazon Web Services – AWS Blog
In the rapidly evolving landscape of artificial intelligence (AI), the rise of generative AI models has ushered in a new era of personalized and intelligent experiences. Organizations are increasingly using the power of these language models to drive innovation and enhance their services, from natural language processing to content generation and beyond.
Using generative AI models in the enterprise environment, however, requires taming their intrinsic power and enhancing their skills to address specific customer needs. In cases where an out-of-the-box model is missing knowledge of domain- or organization-specific terminologies, a custom fine-tuned model, also called a domain-specific large language model (LLM), might be an option for performing standard tasks in that domain or micro-domain. BloombergGPT is an example of an LLM that was trained from scratch to have a better understanding of highly specialized vocabulary found in the financial domain. In the same sense, domain specificity can be addressed through fine-tuning at a smaller scale. Customers are fine-tuning generative AI models based on domains including finance, sales, marketing, travel, IT, HR, procurement, healthcare and life sciences, customer service, and many more. Additionally, independent software vendors (ISVs) are building secure, managed, multi-tenant, end-to-end generative AI platforms with models that are customized and personalized based on their customers' datasets and domains. For example, Forethought introduced SupportGPT, a generative AI platform for customer support.
As the demand for personalized and specialized AI solutions grows, businesses often find themselves grappling with the challenge of efficiently managing and serving a multitude of fine-tuned models across diverse use cases and customer segments. With the need to serve a wide range of AI-powered use cases, from resume parsing and job skill matching to domain-specific email generation and natural language understanding, these businesses are often left with the daunting task of managing hundreds of fine-tuned models, each tailored to specific customer needs or use cases. The complexities of this challenge are compounded by the inherent scalability and cost-effectiveness concerns that come with deploying and maintaining such a diverse model ecosystem. Traditional approaches to model serving can quickly become unwieldy and resource intensive, leading to increased infrastructure costs, operational overhead, and potential performance bottlenecks.
Fine-tuning enormous language models is prohibitively expensive in terms of the hardware required and the storage and switching cost for hosting independent instances for different tasks. LoRA (Low-Rank Adaptation) is an efficient adaptation strategy that neither introduces inference latency nor reduces input sequence length while retaining high model quality. Importantly, it allows for quick task switching when deployed as a service by sharing the vast majority of the model parameters.
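To make the parameter-sharing idea concrete, the following is a minimal, dependency-free sketch of a LoRA layer. It assumes the standard formulation from the LoRA paper, in which the frozen base weight W0 is combined with a low-rank update scaled by alpha / r. The tiny matrices and the helper names (matvec, lora_forward, merge) are illustrative only and are not part of any SageMaker or LMI API.

```python
"""Toy LoRA layer: frozen base weights plus a low-rank update."""


def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(w0, a, b, x, alpha, rank):
    # Unmerged form: h = W0 x + (alpha / r) * B (A x).
    base = matvec(w0, x)
    delta = matvec(b, matvec(a, x))
    scale = alpha / rank
    return [h + scale * d for h, d in zip(base, delta)]


def merge(w0, a, b, alpha, rank):
    # Merged form: W = W0 + (alpha / r) * B A.
    scale = alpha / rank
    return [
        [w0[i][j] + scale * sum(b[i][k] * a[k][j] for k in range(rank))
         for j in range(len(w0[0]))]
        for i in range(len(w0))
    ]


# A 4x4 identity base layer with a rank-1 adapter (illustrative values).
W0 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
A = [[0.5, 0.0, -0.5, 1.0]]        # shape (r, d_in) = (1, 4)
B = [[1.0], [0.0], [2.0], [-1.0]]  # shape (d_out, r) = (4, 1)
x = [1.0, 2.0, 3.0, 4.0]

unmerged = lora_forward(W0, A, B, x, alpha=2, rank=1)
merged = matvec(merge(W0, A, B, alpha=2, rank=1), x)

adapter_params = len(A) * len(A[0]) + len(B) * len(B[0])
base_params = len(W0) * len(W0[0])
print(unmerged, merged, adapter_params, base_params)
```

Both forms give the same output, but only the unmerged form leaves W0 untouched, which is what lets many adapters share one copy of the base model and be swapped per request.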
In this post, we explore a solution that addresses these challenges head-on using LoRA serving with Amazon SageMaker. By using the new performance optimizations of LoRA techniques in SageMaker large model inference (LMI) containers along with inference components, we demonstrate how organizations can efficiently manage and serve their growing portfolio of fine-tuned models, while optimizing costs and providing seamless performance for their customers.
The latest SageMaker LMI container offers unmerged-LoRA inference, sped up with our LMI-Dist inference engine and OpenAI style chat schema. To learn more about LMI, refer to LMI Starting Guide, LMI handlers Inference API Schema, and Chat Completions API Schema.
There are two kinds of LoRA that can be put onto various engines:
The new LMI container offers out-of-box integration and abstraction with SageMaker for hosting multiple unmerged LoRA adapters with higher performance (low latency and high throughput) using the LMI-Dist backend, which builds on vLLM, which in turn uses S-LoRA and Punica. The LMI container offers two backends for serving LoRA adapters: the LMI-Dist backend (recommended) and the vLLM backend. Both backends are based on the open source vLLM library for serving LoRA adapters, but the LMI-Dist backend provides an additional optimized continuous (rolling) batching implementation, which is why we recommend starting with it. You are not required to configure these libraries separately; the LMI container provides the higher-level abstraction through the vLLM and LMI-Dist backends.
S-LoRA stores all adapters in the main memory and fetches the adapters used by the currently running queries to the GPU memory. To efficiently use the GPU memory and reduce fragmentation, S-LoRA proposes unified paging. Unified paging uses a unified memory pool to manage dynamic adapter weights with different ranks and KV cache tensors with varying sequence lengths. Additionally, S-LoRA employs a novel tensor parallelism strategy and highly optimized custom CUDA kernels for heterogeneous batching of LoRA computation. Collectively, these features enable S-LoRA to serve thousands of LoRA adapters on a single GPU or across multiple GPUs with a small overhead.
Punica is designed to efficiently serve multiple LoRA models on a shared GPU cluster. It achieves this by following three design guidelines:
Punica uses a new CUDA kernel design called Segmented Gather Matrix-Vector Multiplication (SGMV) to batch GPU operations for concurrent runs of multiple LoRA models, significantly improving GPU efficiency in terms of memory and computation. Punica also implements a scheduler that routes requests to active GPUs and migrates requests for consolidation, optimizing GPU resource allocation. Overall, Punica achieves high throughput and low latency in serving multi-tenant LoRA models on a shared GPU cluster. For more information, read the Punica whitepaper.
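The grouping idea behind SGMV can be illustrated on the CPU with a toy sketch. This is not Punica's kernel, which runs on the GPU; it only shows how requests in one batch can be gathered into segments that share an adapter so that each adapter's weights are fetched once per segment. The function name sgmv_toy and the adapter values are hypothetical.

```python
def sgmv_toy(xs, adapter_ids, adapters):
    """Apply each request's LoRA adapter, one segment per adapter."""
    # Gather: order request indices so same-adapter requests are contiguous.
    order = sorted(range(len(xs)), key=lambda i: adapter_ids[i])
    out = [None] * len(xs)
    start = 0
    while start < len(order):
        aid = adapter_ids[order[start]]
        end = start
        while end < len(order) and adapter_ids[order[end]] == aid:
            end += 1
        a, b = adapters[aid]  # fetch this adapter's weights once per segment
        for i in order[start:end]:
            ax = [sum(p * q for p, q in zip(row, xs[i])) for row in a]
            out[i] = [sum(p * q for p, q in zip(row, ax)) for row in b]
        start = end
    return out  # results are returned in the original request order


# Two rank-1 adapters over 2-dimensional inputs (illustrative values).
adapters = {
    "es": ([[1.0, 0.0]], [[1.0], [1.0]]),
    "fr": ([[0.0, 2.0]], [[1.0], [-1.0]]),
}
batch = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
ids = ["fr", "es", "fr"]
deltas = sgmv_toy(batch, ids, adapters)
print(deltas)
```

In a real serving stack, these per-request deltas would be added to the output of the shared base model, which is computed once for the whole batch.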
The following figure shows the multi LoRA adapter serving stack of the LMI container on SageMaker.
As shown in the preceding figure, the LMI container provides the higher-level abstraction through the vLLM and LMI-Dist backends to serve LoRA adapters at scale on SageMaker. As a result, you're not required to configure the underlying libraries (S-LoRA, Punica, or vLLM) separately. However, there might be cases where you want to control some of the performance-driving parameters depending on your use case and application performance requirements. The following are the common configuration options the LMI container provides to tune LoRA serving. For more details on configuration options specific to each backend, refer to vLLM Engine User Guide and LMI-Dist Engine User Guide.
Enterprises grappling with the complexities of managing generative AI models often encounter scenarios where a robust and flexible design pattern is crucial. One common use case involves a single base model with multiple LoRA adapters, each tailored to specific customer needs or use cases. This approach allows organizations to use a foundational language model while maintaining the agility to fine-tune and deploy customized versions for their diverse customer base.
An enterprise offering a resume parsing and job skill matching service may use a single high-performance base model, such as Mistral 7B. The Mistral 7B base model is particularly well-suited for job-related content generation tasks, such as creating personalized job descriptions and tailored email communications. Mistral's strong performance in natural language generation and its ability to capture industry-specific terminology and writing styles make it a valuable asset for such an enterprise's customers in the HR and recruitment space. By fine-tuning Mistral 7B with LoRA adapters, enterprises can make sure the generated content aligns with the unique branding, tone, and requirements of each customer, delivering a highly personalized experience.
On the other hand, the same enterprise may use the Llama 3 base model for more general natural language processing tasks, such as resume parsing, skills extraction, and candidate matching. Llama 3's broad knowledge base and robust language understanding capabilities enable it to handle a wide range of documents and formats, making sure their services can effectively process and analyze candidate information, regardless of the source. By fine-tuning Llama 3 with LoRA adapters, such enterprises can tailor the model's performance to specific customer requirements, such as regional dialects, industry-specific terminology, or unique data formats. By employing a multi-base model, multi-adapter design pattern, enterprises can take advantage of the unique strengths of each language model to deliver a comprehensive and highly personalized service that matches job profiles to candidate resumes. This approach allows enterprises to cater to the diverse needs of their customers, making sure each client receives tailored AI-powered solutions that enhance their recruitment and talent management processes.
Effectively implementing and managing these design patterns, where multiple base models are coupled with numerous LoRA adapters, is a key challenge that enterprises must address to unlock the full potential of their generative AI investments. A well-designed and scalable approach to model serving is crucial in delivering cost-effective, high-performance, and personalized experiences to customers.
The following sections outline the coding steps to deploy a base LLM, TheBloke/Llama-2-7B-Chat-fp16, with LoRA adapters on SageMaker. It involves preparing a compressed archive with the base model files and LoRA adapter files, uploading it to Amazon Simple Storage Service (Amazon S3), selecting and configuring the SageMaker LMI container to enable LoRA support, creating a SageMaker endpoint configuration and endpoint, defining an inference component for the model, and sending inference requests specifying different LoRA adapters like Spanish (es) and French (fr) in the request payload to use those fine-tuned language capabilities. For more information on deploying models using SageMaker inference components, see Amazon SageMaker adds new inference capabilities to help reduce foundation model deployment costs and latency.
To showcase multi-base models with their LoRA adapters, we add another base model, mistralai/Mistral-7B-v0.1, and its LoRA adapter to the same SageMaker endpoint, as shown in the following diagram.
You need to complete some prerequisites before you can run the notebook:
To prepare the LoRA adapters, create an adapters.tar.gz compressed archive containing the LoRA adapters directory. The adapters directory should contain subdirectories for each of the LoRA adapters, with each adapter subdirectory containing the adapter_model.bin file (the adapter weights) and the adapter_config.json file (the adapter configuration). We typically obtain these adapter files by using the PeftModel.save_pretrained() method from the PEFT library. After you assemble the adapters directory with the adapter files, you compress it into an adapters.tar.gz archive and upload it to an S3 bucket for deployment or sharing. We include the LoRA adapters in the adapters directory as follows:
Download LoRA adapters, compress them, and upload the compressed file to Amazon S3:
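As a minimal sketch of the packaging step, the following uses only the Python standard library to check that each adapter subdirectory holds the two files described above and then builds the archive. The function name package_adapters and the placeholder files in the demo are illustrative; real adapter files would come from PeftModel.save_pretrained().

```python
import tarfile
import tempfile
from pathlib import Path

REQUIRED_FILES = ("adapter_config.json", "adapter_model.bin")


def package_adapters(adapters_dir, archive_path):
    """Validate the adapters directory and compress it into a .tar.gz."""
    adapters_dir = Path(adapters_dir)
    for sub in sorted(p for p in adapters_dir.iterdir() if p.is_dir()):
        missing = [f for f in REQUIRED_FILES if not (sub / f).is_file()]
        if missing:
            raise FileNotFoundError(f"{sub.name} is missing {missing}")
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # Keep "adapters/" as the top-level entry inside the archive.
        tar.add(str(adapters_dir), arcname=adapters_dir.name)
    return Path(archive_path)


# Demo with placeholder files in a temporary directory:
# adapters/es/... and adapters/fr/...
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "adapters"
    for lang in ("es", "fr"):
        (root / lang).mkdir(parents=True)
        (root / lang / "adapter_config.json").write_text("{}")
        (root / lang / "adapter_model.bin").write_bytes(b"\x00")
    archive = package_adapters(root, Path(tmp) / "adapters.tar.gz")
    with tarfile.open(str(archive)) as tar:
        names = sorted(tar.getnames())

print(names)
```

The resulting archive can then be uploaded with the boto3 S3 client, for example s3.upload_file(str(archive), bucket, key), where bucket and key are values you choose.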
SageMaker provides optimized containers for LMI that support different frameworks for model parallelism, allowing the deployment of LLMs across multiple GPUs. For this post, we employ the DeepSpeed container, which encompasses frameworks such as DeepSpeed and vLLM, among others. See the following code:
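The following sketch shows one way the container configuration could be assembled as the request for the boto3 create_model call. The request shape (ModelName, ExecutionRoleArn, PrimaryContainer with Image, Environment, and ModelDataUrl) follows the SageMaker API. The environment variable names are assumptions based on the LMI OPTION_* convention and should be checked against the LMI-Dist and vLLM engine user guides; the image URI, role ARN, and S3 path are placeholders.

```python
def build_model_request(model_name, role_arn, image_uri, base_model_id,
                        adapters_s3_uri, max_loras=4, max_lora_rank=64):
    """Build keyword arguments for sagemaker_client.create_model(**request)."""
    env = {
        # Assumed LMI settings; verify the names against the engine guides.
        "HF_MODEL_ID": base_model_id,
        "OPTION_ROLLING_BATCH": "lmi-dist",
        "OPTION_ENABLE_LORA": "true",
        "OPTION_MAX_LORAS": str(max_loras),
        "OPTION_MAX_LORA_RANK": str(max_lora_rank),
    }
    return {
        "ModelName": model_name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {
            "Image": image_uri,
            "Environment": env,
            # Assumption: the adapters archive is passed as model data.
            "ModelDataUrl": adapters_s3_uri,
        },
    }


request = build_model_request(
    model_name="llama2-7b-lora",                           # hypothetical name
    role_arn="arn:aws:iam::111122223333:role/ExampleRole",  # placeholder
    image_uri="<LMI_IMAGE_URI>",                           # placeholder
    base_model_id="TheBloke/Llama-2-7B-Chat-fp16",
    adapters_s3_uri="s3://example-bucket/lora/adapters.tar.gz",  # placeholder
)
print(request["PrimaryContainer"]["Environment"])
```

Keeping the request as plain data makes it easy to review the settings before any AWS call is made.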
Create an endpoint configuration using the appropriate instance type. Set ContainerStartupHealthCheckTimeoutInSeconds to account for the time taken to download the LLM weights from Amazon S3 or the model hub, and the time taken to load the model on the GPUs:
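A hedged sketch of the endpoint configuration follows. The field names match the boto3 create_endpoint_config request for endpoints that host inference components; the instance type, timeout value, and resource names are illustrative choices rather than values from this post.

```python
def build_endpoint_config(config_name, role_arn, instance_type,
                          timeout_seconds=900, variant_name="AllTraffic"):
    """Build keyword arguments for create_endpoint_config(**config)."""
    return {
        "EndpointConfigName": config_name,
        "ExecutionRoleArn": role_arn,
        "ProductionVariants": [
            {
                "VariantName": variant_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
                # Allow time to download weights and load them on the GPUs.
                "ModelDataDownloadTimeoutInSeconds": timeout_seconds,
                "ContainerStartupHealthCheckTimeoutInSeconds": timeout_seconds,
                "RoutingConfig": {
                    "RoutingStrategy": "LEAST_OUTSTANDING_REQUESTS"
                },
            }
        ],
    }


config = build_endpoint_config(
    config_name="lora-endpoint-config",                    # hypothetical name
    role_arn="arn:aws:iam::111122223333:role/ExampleRole",  # placeholder
    instance_type="ml.g5.12xlarge",                        # illustrative choice
)
print(config["ProductionVariants"][0])
```

With a boto3 SageMaker client named sm, the configuration would be submitted with sm.create_endpoint_config(**config), followed by sm.create_endpoint(EndpointName=..., EndpointConfigName="lora-endpoint-config").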
Create a SageMaker endpoint based on the endpoint configuration defined in the previous step. You use this endpoint to host the inference components (models) and to make invocations.
Now that you have created a SageMaker endpoint, let's create our model as an inference component. The SageMaker inference component enables you to deploy one or more foundation models (FMs) on the same SageMaker endpoint and control how many accelerators and how much memory is reserved for each FM. See the following code:
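The reservation of accelerators and memory can be sketched as the request for the boto3 create_inference_component call. The field names follow the SageMaker API; the component name, accelerator count, and memory size below are illustrative assumptions.

```python
def build_inference_component(name, endpoint_name, model_name,
                              accelerators, min_memory_mb,
                              copies=1, variant_name="AllTraffic"):
    """Build keyword arguments for create_inference_component(**spec)."""
    return {
        "InferenceComponentName": name,
        "EndpointName": endpoint_name,
        "VariantName": variant_name,
        "Specification": {
            "ModelName": model_name,
            "ComputeResourceRequirements": {
                # Resources reserved for each copy of this model.
                "NumberOfAcceleratorDevicesRequired": accelerators,
                "MinMemoryRequiredInMb": min_memory_mb,
            },
        },
        "RuntimeConfig": {"CopyCount": copies},
    }


spec = build_inference_component(
    name="ic-llama2-7b-lora",       # hypothetical name
    endpoint_name="lora-endpoint",  # hypothetical name
    model_name="llama2-7b-lora",    # hypothetical name
    accelerators=2,
    min_memory_mb=16384,
)
print(spec["Specification"]["ComputeResourceRequirements"])
```

Because each component declares its own requirements, a second base model can be placed on the same endpoint as long as the instance has enough free accelerators and memory.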
With the endpoint and inference model ready, you can now send requests to the endpoint using the LoRA adapters you fine-tuned for Spanish and French languages. The specific LoRA adapter is specified in the request payload under the "adapters" field. We use "es" for the Spanish language adapter and "fr" for the French language adapter, as shown in the following code:
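A minimal sketch of the request body follows. The "adapters" field is the one described above; pairing exactly one adapter name with each prompt and the "parameters" block are assumptions about the LMI request schema, so check them against the LMI handlers Inference API Schema.

```python
import json


def build_lora_request(prompts, adapters, max_new_tokens=64):
    """Serialize a batch of prompts, each routed to a named LoRA adapter."""
    if len(prompts) != len(adapters):
        raise ValueError("need exactly one adapter name per prompt")
    payload = {
        "inputs": prompts,
        "adapters": adapters,  # for example "es" or "fr"
        "parameters": {"max_new_tokens": max_new_tokens},  # assumed schema
    }
    return json.dumps(payload)


body = build_lora_request(
    ["Tell me about Alaska", "Tell me about Alaska"],
    ["es", "fr"],
)
print(body)
```

With a boto3 sagemaker-runtime client named runtime, the body could be sent with runtime.invoke_endpoint(EndpointName=..., InferenceComponentName=..., ContentType="application/json", Body=body), where the endpoint and component names are the ones you created earlier.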
Let's add another base model and its LoRA adapter to the same SageMaker endpoint for multi-base models with multiple fine-tuned LoRA adapters. The code is very similar to the previous code for creating the Llama base model and its LoRA adapter.
Configure the SageMaker LMI container to host the base model (mistralai/Mistral-7B-v0.1) and its LoRA adapter (mistral-lora-multi-adapter/adapters/fr):
Create a new SageMaker model and inference component for the base model (mistralai/Mistral-7B-v0.1) and its LoRA adapter (mistral-lora-multi-adapter/adapters/fr):
Invoke the same SageMaker endpoint for the newly created inference component for the base model (mistralai/Mistral-7B-v0.1) and its LoRA adapter (mistral-lora-multi-adapter/adapters/fr):
Delete the SageMaker inference components, models, endpoint configuration, and endpoint to avoid incurring unnecessary costs:
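The cleanup order can be sketched as follows. The four delete_* methods are real boto3 SageMaker client calls; the RecordingClient class is a hypothetical stand-in used here so that the sketch runs without AWS credentials. In a real account you may need to wait for each inference component deletion to finish before deleting the endpoint.

```python
def cleanup(sm, component_names, model_names, endpoint_name, config_name):
    """Delete resources in the order listed above."""
    for name in component_names:
        sm.delete_inference_component(InferenceComponentName=name)
    for name in model_names:
        sm.delete_model(ModelName=name)
    sm.delete_endpoint_config(EndpointConfigName=config_name)
    sm.delete_endpoint(EndpointName=endpoint_name)


class RecordingClient:
    """Stand-in for a boto3 SageMaker client that only records calls."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any method call is accepted and appended to self.calls.
        def record(**kwargs):
            self.calls.append((name, kwargs))
        return record


client = RecordingClient()
cleanup(
    client,
    component_names=["ic-llama", "ic-mistral"],    # hypothetical names
    model_names=["llama-model", "mistral-model"],  # hypothetical names
    endpoint_name="lora-endpoint",
    config_name="lora-endpoint-config",
)
call_order = [name for name, _ in client.calls]
print(call_order)
```

Passing a real client created with boto3.client("sagemaker") in place of RecordingClient would issue the actual delete requests.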
The ability to efficiently manage and serve a diverse portfolio of fine-tuned generative AI models is paramount if you want your organization to deliver personalized and intelligent experiences at scale in today's rapidly evolving AI landscape. With the inference capabilities of SageMaker LMI coupled with the performance optimizations of LoRA techniques, you can overcome the challenges of multi-tenant fine-tuned LLM serving. This solution enables you to consolidate AI workloads, batch operations across multiple models, and optimize resource utilization for cost-effective, high-performance delivery of tailored AI solutions to your customers. As demand for specialized AI experiences continues to grow, we've shown how the scalable infrastructure and cutting-edge model serving techniques of SageMaker position AWS as a powerful platform for unlocking generative AI's full potential. To start exploring the benefits of this solution for yourself, we encourage you to use the code example and resources we've provided in this post.
Michael Nguyen is a Senior Startup Solutions Architect at AWS, specializing in leveraging AI/ML to drive innovation and develop business solutions on AWS. Michael holds 12 AWS certifications and has a BS/MS in Electrical/Computer Engineering and an MBA from Penn State University, Binghamton University, and the University of Delaware.
Dhawal Patel is a Principal Machine Learning Architect at AWS. He has worked with organizations ranging from large enterprises to mid-sized startups on problems related to distributed computing and artificial intelligence. He focuses on deep learning, including the NLP and computer vision domains. He helps customers achieve high-performance model inference on SageMaker.
Vivek Gangasani is an AI/ML Startup Solutions Architect for Generative AI startups at AWS. He helps emerging GenAI startups build innovative solutions using AWS services and accelerated compute. Currently, he is focused on developing strategies for fine-tuning and optimizing the inference performance of large language models. In his free time, Vivek enjoys hiking, watching movies, and trying different cuisines.
Qing Lan is a Software Development Engineer at AWS. He has worked on several challenging products at Amazon, including high-performance ML inference solutions and a high-performance logging system. Qing's team successfully launched the first billion-parameter model in Amazon Advertising under very low latency requirements. Qing has in-depth knowledge of infrastructure optimization and deep learning acceleration.
Human Touch vs. Machine Precision: Debating the Role of AI in Content Creation – hackernoon.com
Back in the day, whenever you had an idea and wanted to bring it to life, you had to get a pen and paper. At some later point, you could do this using a computer, but you still had to type and the idea still had to come from you. Today, bringing an idea to life in the form of a written piece is as easy as entering a prompt and having an AI-driven tool do the rest.
Artificial intelligence isn't just transforming the world of content creation; it's creating a new playing field. But is it all that it's cracked up to be? The debate between AI precision and human creativity has been an ongoing one. Some argue that AI is incapable of embodying the nuances of human emotions and applying them to its outputs. In this article, we'll explore these arguments by looking at both sides and proposing a balance between AI precision and human creativity.
Generative AI is a phrase that almost everyone in the field of content creation must have heard of in some way. While many know it in its finished state as a tool for eliminating the time and stress it takes to generate ideas, there's a lot more that has happened before it's able to do this.
Generative AI involves the design and creation of machine learning algorithms that can learn a set of rules by studying an incomprehensible amount of already existing data.
At its core, this training data, as it's called, comprises a multitude of known information with rules that are learned to create something new.
For example, generative AI that has been trained on 2,000 descriptions of the moon might establish rules that say the moon is solid, rocky, and roughly 238,855 miles away. Therefore, when it is asked to generate a piece of text describing the moon, it has all the information it needs to do so. Although AI-powered programs can produce factually accurate content, they often struggle to add a personal touch to it, making the work feel robotic, impersonal, or lacking in depth.
While AI certainly has its strong points, it is both hard to ignore and crucial to acknowledge the unique nuances that humans bring to content creation. People, unlike machines, possess emotional intelligence, critical thinking, and dynamic mental creativity that can be tweaked as the need arises. Artificial intelligence, on the other hand, excels in identifying patterns, processing large amounts of data, and making data-driven decisions with no emotional undertones.
The flexibility of the human mind makes it easy for content creators to adjust the style, tone, essence, and approach of their writing based on the purpose, audience, and medium of the content. There are also intricacies involved in understanding and expressing emotions, which ultimately adds depth to human-written content and makes it more relatable to another person.
Also, the ability to imagine what someone else might be thinking or feeling as they're reading content is a crucial component of content creation that generative AI cannot replicate.
The use of AI in content creation has become increasingly prevalent in the digital landscape today. Being able to process massive amounts of data in a few seconds and produce content from it can significantly boost efficiency and productivity. Articles, reports, guest posts, and product descriptions can be generated at a much faster rate than humans can write them.
There's also the fact that AI-generated content gives off a feel of consistency in writing style and tone. This uniformity can help businesses build a strong and reliable brand identity that customers can come to relate with.
Furthermore, scaling up your content to meet a growing and diverse audience is seamless with Artificial Intelligence. Businesses, writers, or brand designers who need to create content for a growing customer base can adapt, incorporate, and produce content accordingly to save time and effort.
Connecting to each customer on a personal level is not far-fetched here, as AI can also be used to automate the process of analyzing customer data and personalizing the content to suit the varied tastes of your audience.
One of the major drawbacks of AI in content creation is the apparent lack of creativity, empathy, and originality in the finished work. Being able to produce massive amounts of content, and personalize it if need be, is great for efficiency, but all the information produced relies on pre-existing templates and algorithms.
This quickly makes AI output repetitive, formulaic, and lacking the creative spark of the human mind. AI can also struggle to fully grasp the context and nuances of certain topics. It may generate content that is scientifically and factually accurate but includes references that are unfamiliar to certain people in terms of relevance and depth.
It is also important to emphasize the potential biases that may have been injected into the training data which an AI tool uses as its information source. If generative AI was trained on data that is skewed in favor of one group over another, then all content generated by the system will perpetuate those biases. These biases can range from selection and confirmation bias to stereotyping and measurement bias. In the long run, this may lead to reduced interactions, as people start to feel less connected to the essence of the generated content.
A pivotal concept in the ever-changing digital landscape of today is the collaborative approach combining AI precision with human creativity. This collaboration involves the seamless integration of machine learning and capability with the human mind in solving problems. In this symbiotic relationship, both entities can complement each other's strengths and create content that is both factual and full of emotional depth.
AI systems are complex and occasionally unpredictable, so human oversight acts as a buffer that limits mistakes and makes sure they are corrected to prevent recurrences.
While AI has excelled at creating large amounts of content in little time, there's something about the human touch that adds meaning to information. Generating data that can stand the test of time is something the best AI tools can manage, but connecting with it on a deeper level can only be achieved with the human factor.
For this reason, it's crucial to explore more ways by which humans and AI can collaborate to bring about the best possible piece of content.
Machine Learning Advances Toward Certification For Cockpit Use – Aviation Week
PilotEye is a vision-based situational-awareness system using a neural network to detect objects.
Credit: Avidyne/Daedalean
Swiss startup Daedalean anticipates certification of the first cockpit avionics to incorporate machine learning (ML) by year-end or early in 2025. The company is working with U.S. avionics manufacturer Avidyne to certify the PilotEye visual awareness system, which uses a neural network to recognize...
Machine Learning Advances Toward Certification For Cockpit Use is published in Advanced Air Mobility Report, an Aviation Week Intelligence Network (AWIN) Market Briefing and is included with your AWIN Premium membership.
Optimizing multi-spectral ore sorting incorporating wavelength selection utilizing neighborhood component analysis for … – Nature.com
Mineral-intensive technologies such as renewable energy and electric vehicles will be in high demand as the world addresses climate change and transitions to a sustainable energy future. Despite this, the mining sector, especially in copper, is facing considerable difficulties due to the growing demand [1,2]. The depletion of high-grade ore and the rise of high-arsenic copper resources are more prominent issues in this field. Arsenic in low-grade copper ores not only makes mineral processing more difficult, but also causes environmental and health concerns due to the presence of arsenic in wastewater and exhaust gas [3,4]. The correlation between arsenic exposure and a range of health problems highlights the pressing requirement for ecologically viable methods in the mining sector. The incorporation of modern technologies such as hyperspectral imaging into sensor-based ore sorting systems has significant potential in this situation [5]. Ore sorting systems can successfully separate high-arsenic ores from valuable material by utilizing the precise and accurate analysis of mineral composition provided by hyperspectral imaging. This not only mitigates environmental dangers but also reduces processing costs. This strategy not only tackles the difficulties presented by ores containing high levels of arsenic, but also aids in the advancement of a mining industry that is environmentally friendly, in line with the objectives of the Paris Agreement.
Sensor-based ore sorting has become a crucial technology in mineral processing, providing numerous advantages that transform conventional mining methods. With sensor-based ore sorting systems, valuable minerals can be selectively extracted from ore streams according to their unique physical and chemical properties using advanced sensor technologies. This process of selective extraction maximizes the efficient use of resources by effectively separating valuable ore from nonvaluable materials (or gangue minerals). In the field of mineral processing, sensor-based ore sorting is a vital component as it enhances ore grades and minimizes the amount of waste material that is processed [6]. Evidence demonstrates that it effectively decreases the usage of energy, water, and reagents, while also minimizing the formation of fine waste, by rejecting waste material before additional processing [7,8]. To successfully apply sensor-based sorting, it is crucial to select a sensing approach that can efficiently distinguish between ore and waste [9]. Sensor fusion, the integration of data from several sensing systems, has the potential to enhance the characterization of the detected material and improve sorting capability [4]. Microwave imaging (MWI) is a promising technique that can penetrate deeply into rock particles and serve as an additional approach for analyzing ores with significant differences in electromagnetic characteristics [10]. The efficacy of MWI in ore sorting has been validated by simulations and tests, affirming its capability to segregate valuable minerals or metals from unproductive particles. The utilization of sensor-based ore sorting presents substantial advantages in terms of reducing costs and enhancing efficiency in mineral processing.
Ore sorting techniques can be significantly enhanced by leveraging hyperspectral imaging technology, which offers unparalleled capabilities for mineral characterization and classification. Hyperspectral imaging allows ore sorting systems to analyze the distinct spectral fingerprints of minerals over a wide range of wavelengths, unlike traditional sorting methods that only consider physical attributes like size, shape, and density. This enables the identification and differentiation of minerals by analyzing their unique chemical compositions and optical features. Hyperspectral imaging is used in sensor-based ore sorting to analyze ore streams in real time without causing damage [5]. This technique offers important details on the mineralogy and quality of the material being processed. By integrating hyperspectral imaging technology into sorting systems, mining companies can enhance their efficiency, precision, and selectivity in segregating valuable minerals from waste material. As a result, mineral processing enterprises have higher rates of recovery, lower costs of processing, and increased profitability.
The processing of hyperspectral data is more challenging than that of other types of data due to the sheer volume of information collected and its high dimensionality. High-dimensional spectral bands in hyperspectral images are often highly similar, which makes them susceptible to the "curse of dimensionality," a phenomenon that affects many traditional algorithms [11]. Within the domain of hyperspectral ore sorting systems, the notion of wavelength selection arises as a crucial strategy for enhancing sorting efficiency and precision. Wavelength selection is the process of strategically identifying and using wavelengths of electromagnetic radiation (light) that provide the most useful information for differentiating between various minerals or compounds in an ore stream. Through the analysis of distinct spectral patterns displayed by minerals at different wavelengths, the process of wavelength selection allows ore sorting systems to concentrate on the specific spectral bands that are most effective in distinguishing the desired minerals. By employing this focused method, the precision, effectiveness, and dependability of mineral identification and separation procedures are enhanced, resulting in better utilization of resources and increased operational performance in mineral processing. Choosing the right wavelengths is also extremely important to reduce the likelihood of false positive and false negative results, to maximize the rate at which valuable minerals are recovered, and to reduce the loss of potentially valuable materials to the waste stream. The significance of ore sorting lies in its ability to facilitate efficient and precise separation of valuable ore from waste or gangue materials. Based on their unique reflectance or absorption properties, sensors can effectively distinguish ore from gangue by using specific wavelengths, such as visible or mid-infrared ones.
This enhances the system's ability to choose and efficiently sort materials, especially when working with intricate ores or comparable substances. Utilizing wavelength selection can improve the ability of photometric sensors to distinguish between different substances and simplify the creation of new sensors for the purpose of sorting ores and characterizing minerals [12]. A variety of advanced techniques are used to analyze multidimensional spectral data and extract relevant features from hyperspectral data, such as spectral feature extraction and machine learning algorithms, for example linear regression, K-means clustering, and neural networks [13,14,15].
The intricate nature and extensive dimensions of multi-spectral data require the application of sophisticated data classification techniques such as Neighborhood Component Analysis (NCA). Advanced data classification techniques like NCA are needed to handle multi-spectral data for several reasons. To begin with, hyperspectral data typically includes a substantial number of spectral bands, which might cause computing difficulties during processing and analysis [16]. The issue can be addressed by using NCA to lower the dimensionality of the data, which leads to improved processing and classification efficiency [17]. Additionally, it is essential to note that conventional classification methods designed for multispectral data may not be appropriate for hyperspectral data, as the latter offers more intricate and comprehensive spectral information [18]. The NCA method can effectively handle hyperspectral data with a high number of dimensions. It achieves improved classification accuracies by taking into account both spectral and spatial information [19]. Additionally, NCA offers advantages such as low computational requirements and shorter classification times [20]. Therefore, advanced techniques like NCA are essential for accurately classifying hyperspectral data while overcoming the challenges associated with high dimensionality and detailed spectral information.
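For orientation, one common formulation of NCA as a feature (band) selector can be written as follows. This is a sketch under the assumption that a regularized, feature-weighting variant of NCA is meant; the text above does not state which variant was used, and the symbols w_r (band weight), sigma (kernel width), and lambda (regularization strength) are introduced here for illustration only.

```latex
% Weighted distance between spectra x_i and x_j over p wavelength bands
d_w(x_i, x_j) = \sum_{r=1}^{p} w_r^{2}\, \lvert x_{ir} - x_{jr} \rvert

% Probability that x_i picks x_j as its reference neighbor (p_{ii} = 0)
p_{ij} = \frac{\exp\!\left(-d_w(x_i, x_j)/\sigma\right)}
              {\sum_{k \neq i} \exp\!\left(-d_w(x_i, x_k)/\sigma\right)},
\qquad j \neq i

% Regularized leave-one-out objective, with y_{ij} = 1 if x_i and x_j
% share a class label and 0 otherwise
F(w) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j \neq i} y_{ij}\, p_{ij}
       \;-\; \lambda \sum_{r=1}^{p} w_r^{2}
```

Maximizing F(w) drives the weights of uninformative bands toward zero, so the bands with the largest weights are the natural candidates to keep for a multispectral sensor.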
In this study, Neighborhood Component Analysis (NCA) was applied as a preprocessing step to reduce the dimension of hyperspectral (HS) data of arsenic-bearing minerals by identifying several wavelength bands important for mineral identification. The identified wavelength bands were then used as inputs to train machine learning algorithms for identifying the concentration of arsenic (As) minerals in simulated ore materials. Multispectral (MS) cameras are more cost-effective and provide faster data collection and processing than HS cameras; hence, they are expected to enable mineral identification using data from only a few wavelength bands. Specifically, NCA, a machine learning method, was applied to the HS data of an arsenic-bearing mineral (enargite) as a band selector to identify the wavelength bands important for mineral identification. The data containing only the minimum number of wavelengths were then analyzed with machine learning algorithms to identify mineral contents/ratios. This approach will improve the selectivity of wavelengths, taking into account the characteristics of the ore produced by each mine. In addition, applying the machine learning algorithm proposed herein to HS image analysis is expected to improve the efficiency of ore selectivity, i.e., to increase the speed of the ore sorting process.
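The two-step idea described above (rank wavelength bands with NCA, then train a classifier on only the top-ranked bands) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the spectra are synthetic, the band indices are hypothetical, and scikit-learn's NeighborhoodComponentsAnalysis learns a linear transformation rather than per-band weights, so the column norms of that transformation are used here as a heuristic band-importance score.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.preprocessing import StandardScaler


def make_synthetic_spectra(n_samples=300, n_bands=204, informative=(40, 90, 150), seed=0):
    """Synthetic stand-in for HS pixel spectra (assumption: not real mineral data)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, len(informative), size=n_samples)
    X = rng.normal(0.0, 1.0, size=(n_samples, n_bands))
    # Each class gets a raised response in one hypothetical informative band.
    for cls, band in enumerate(informative):
        X[y == cls, band] += 3.0
    return X, y


def rank_bands_with_nca(X, y, n_components=5, seed=0):
    """Rank bands by the column norms of the learned NCA transformation (heuristic)."""
    X_std = StandardScaler().fit_transform(X)
    nca = NeighborhoodComponentsAnalysis(
        n_components=n_components, max_iter=50, random_state=seed
    )
    nca.fit(X_std, y)
    # components_ has shape (n_components, n_bands); a large column norm means
    # the band contributes strongly to the learned embedding.
    weights = np.linalg.norm(nca.components_, axis=0)
    order = np.argsort(weights)[::-1]
    return order, weights


def select_and_classify(X, y, k=5, seed=0):
    """Keep the k top-ranked bands and score a k-NN classifier on held-out data."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    order, _ = rank_bands_with_nca(X_tr, y_tr, seed=seed)
    selected = np.sort(order[:k])
    clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr[:, selected], y_tr)
    return selected, clf.score(X_te[:, selected], y_te)


if __name__ == "__main__":
    X, y = make_synthetic_spectra()
    bands, accuracy = select_and_classify(X, y, k=5)
    print("selected band indices:", bands)
    print("held-out accuracy:", accuracy)
```

The band ranking is fitted on the training split only, so the held-out score is not influenced by the selection step. Any classifier could replace the k-nearest-neighbors model used here.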
To develop resources in an environmentally sustainable way, it is essential to develop advanced metal recovery technology for these high-grade arsenic ores, and Sensor-Based Ore Sorting (SBOS) can achieve this. SBOS, when implemented as a presorting process before the normal beneficiation process, can reduce the amount of ore that must be processed to produce a certain amount of value-added metal, which has a significant impact on the economics of the mine and the plant as a whole21,22,23. It can also reduce the environmental impact by reducing the tailings produced in the subsequent beneficiation process. Non-destructively classified tailings are geotechnically stable and can be easily stored due to their low moisture content24. Robben and Wotruba highlighted that the introduction of SBOS would affect both the environmental and economic aspects of mineral processing25. However, the authors pointed out that SBOS is still in the market entry stage in the mineral industry and that further technological development is required.
Mineral analysis requires knowledge of crystallography as well as chemical analysis26. However, the methods commonly used for mineral analysis, such as Electron Probe Micro Analyzer (EPMA), X-ray diffraction (XRD) and Scanning Electron Microscope (SEM), are relatively time-consuming and depend on operator experience27. They are therefore not realistic in terms of identification speed, convenience, and economy when used in actual mineral processing operations.
Therefore, SBOS has been developed as a form of mineral identification suitable for beneficiation. In recent years, more and more equipment has been installed that can withstand larger production scales25. SBOS methods have utilized a range of sensing technologies, including X-ray transmission, X-ray fluorescence, optical sensing, and inductive sensing9,10,28. Furthermore, the use of area-scan cameras and predictive tracking systems that rely on machine learning approaches has demonstrated potential in minimizing characterization and separation errors29. Researchers have also studied the combination of data from several sensing approaches to improve the sorting ability of these systems6. While different SBOS methods have been developed and introduced, particularly focusing on short-wave infrared (SWIR) data, there are few studies or methods on mineral identification/sorting using short-wavelength visible to near-infrared (VNIR) HS data. However, there is growing interest in VNIR spectroscopy for mineral identification, and in some recent studies VNIR wavelengths have been used to classify rocks and minerals30,31.
Sensor-based ore sorting methods and technologies have the potential to significantly improve ore grades and reduce waste in mineral processing6. These methods, which rely on the electromagnetic spectrum, can be classified based on their characteristics and limitations28. An example of a successful method for sorting complicated ores is the utilization of hyperspectral short-wave infrared sensors in conjunction with machine learning, as demonstrated by Tusa32. Sensor-based ore sorting can be applied at various stages in the process flow diagram, making it a versatile and valuable tool in the mining industry33.
In the field of remote sensing, mineral identification in the near-infrared region has been widely used34,35 and has shown excellent performance in ore classification. On the other hand, the high cost of HS cameras and the time required for data acquisition have been barriers to their application in actual operations, where immediate classification is required. In previous studies24,36, HS data of minerals were analyzed by deep learning to identify minerals. The use of deep learning allows the creation of more versatile and simplified learning models compared to conventional machine learning or identification methods that combine multiple machine learning models. However, since HS images consist of several hundred spectral bands, adjacent bands are highly correlated, and data analysis without preprocessing is highly redundant and computationally intensive. Therefore, dimensionality reduction is necessary as a preprocessing step for the large amount of data generated.
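The redundancy argument above can be made concrete by measuring how strongly neighboring bands correlate. The sketch below uses synthetic, smoothly varying spectra (an assumption standing in for real HS pixels, generated as a random walk along the wavelength axis) and reports the correlation between each band and the next.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_bands = 500, 204

# Synthetic spectra: a random walk along the band axis, so that each band
# differs from its neighbor by only a small increment (illustrative assumption).
spectra = np.cumsum(rng.normal(0.0, 1.0, size=(n_pixels, n_bands)), axis=1)

# Band-by-band correlation matrix, shape (n_bands, n_bands).
corr = np.corrcoef(spectra, rowvar=False)

# The first off-diagonal holds the correlation of band i with band i + 1.
adjacent_corr = np.diag(corr, k=1)

print("mean correlation between adjacent bands:", adjacent_corr.mean())
print("correlation between first and last band:", corr[0, -1])
```

When adjacent bands are this strongly correlated, most of them add little new information, which is the motivation for reducing the number of bands before training a model.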
Dimensionality reduction methods for HS data commonly fall into two categories: band extraction and wavelength selection. Band extraction methods map a high-dimensional feature space to a low-dimensional space; they therefore cannot preserve the original physical interpretation of the image and are not applicable as a dimensionality reduction method here37. Wavelength selection methods, in contrast, maintain the original physical interpretation of the images. According to a review by Sun and Du38, wavelength selection methods can be categorized into six groups: ranking-based methods, searching-based methods, clustering-based methods, sparsity-based methods, embedding learning-based methods, and hybrid scheme-based methods.
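The difference between the two categories can be illustrated with a short sketch. PCA stands in for band extraction and a simple variance ranking stands in for a ranking-based wavelength selection; both are illustrative choices rather than the methods used in the study, the spectra are random numbers, and the evenly spaced 400 to 1000 nm wavelength grid is an assumption.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
wavelengths = np.linspace(400.0, 1000.0, 204)  # assumed evenly spaced grid, in nm
spectra = rng.random((100, 204))               # synthetic pixels x bands matrix

# Band extraction: every new feature is a weighted mixture of all bands,
# so it no longer corresponds to a physical wavelength.
pca = PCA(n_components=3)
extracted = pca.fit_transform(spectra)
bands_per_component = np.count_nonzero(pca.components_, axis=1)

# Wavelength selection: keep a subset of the original columns, so each
# retained feature is still a measurement at a known wavelength.
variances = spectra.var(axis=0)
selected_idx = np.sort(np.argsort(variances)[-3:])
selected = spectra[:, selected_idx]
selected_wavelengths = wavelengths[selected_idx]

print("bands mixed into each extracted feature:", bands_per_component)
print("wavelengths kept by selection (nm):", selected_wavelengths)
```

Only the selected wavelengths could be acquired directly by a camera with a few fixed bands, whereas the extracted features still require measuring the full spectrum first.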
A variety of studies have explored different techniques for wavelength selection and spectral data classification in mineral processing. Ghosh39 introduced an infrared thermography-based method for sorting alumina-rich iron ores, while Kern40 suggested utilizing short-wavelength infrared and dual-energy X-ray transmission sensors for the Hämmerlein Sn–In–Zn deposit. Phiri41 investigated the potential of near-infrared sensors for separating carbonate-rich gangue from copper-bearing particles in a copper–gold ore sample. Tusa32 advanced the field by evaluating hyperspectral short-wave infrared sensors, combined with machine learning methods, for pre-sorting complex ores. These studies collectively illustrate the potential of various wavelength selection techniques for enhancing the efficiency and effectiveness of ore sorting systems.
Furthermore, numerous research endeavors have investigated the implementation of machine learning algorithms to automate the task of wavelength selection and spectral data classification. Passos42 introduced an automated deep learning pipeline to optimize neural architecture and hyperparameters for spectral classification. Duan43 proposed a template matching approach achieving high accuracy without training, while Wang44 developed a multifunctional optical spectrum analysis technique utilizing support vector machines for optimal accuracy and speed. Baskir45 presented a MATLAB toolbox for user-friendly wavelength selection and automated spectral region selection. These investigations collectively underscore the potential of machine learning in automating and enhancing the process of wavelength selection and spectral data classification.
In addition, advancements in hyperspectral imaging technology have significantly expanded its potential applications46. However, the complexity of hyperspectral data, including its high dimensionality and size, requires innovative methodologies for effective processing and analysis47. These challenges have led to the development of a range of image processing and machine learning analysis pipelines46. Notably, hyperspectral imaging finds application in microscopy, enabling the capture and identification of different spectral signatures in a single-pass evaluation48.
The effectiveness of machine learning algorithms, particularly Neighborhood Component Analysis (NCA), for multi-spectral data classification in mineral processing has been highlighted in recent research. Jahoda49 and Sinaice50 both emphasize the advantages of combining spectroscopic methods with machine learning for mineral identification. Jahoda49 specifically highlights the superiority of machine learning methods in this context, while Sinaice50 proposes a system integrating hyperspectral imaging, NCA, and machine learning for rock and mineral classification. These findings are further supported by Carey51, who stresses the importance of spectrum preprocessing and a weighted-neighbors classifier for optimal mineral spectrum matching performance.
In their previous study, Okada et al.24 developed a basic SBOS technology using hyperspectral (HS) imaging and deep learning as an expert system for mineral identification. HS imaging is promising as an SBOS technique for avoiding As-containing copper minerals. In that study, HS imaging was used as a sensor to collect wavelength intensity data, which were then used to train deep learning algorithms for mineral identification. An HS image is cube-shaped data with dimensions in the wavelength and spatial directions, with wavelength data from the visible to near-infrared regions (400–1000 nm, 204 bands). Identification of minerals (hematite, chalcopyrite, galena) was performed by using deep learning to analyze detailed wavelength data of 204 bands in the shorter wavelength range of 400–1000 nm (from the visible light region to part of the near-infrared region). However, the HS data used in that study were high-dimensional, consisting of 204 bands, and required heavy computational resources. In addition, the HS camera itself is expensive, which was a barrier to its introduction/implementation at the operating site (mineral processing plant).
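The cube-shaped layout described above is usually flattened before machine learning, so that every pixel becomes one sample with 204 band intensities as features. A minimal sketch, assuming the cube is stored as a NumPy array ordered (height, width, bands); the function names and the storage order are illustrative assumptions.

```python
import numpy as np


def cube_to_pixel_matrix(cube):
    """Flatten an HS cube of shape (height, width, bands) to (pixels, bands)."""
    height, width, bands = cube.shape
    return cube.reshape(height * width, bands)


def pixel_labels_to_map(labels, height, width):
    """Fold per-pixel predictions back into a (height, width) mineral map."""
    return np.asarray(labels).reshape(height, width)


if __name__ == "__main__":
    # Tiny synthetic cube: 4 x 5 pixels, 204 bands.
    cube = np.arange(4 * 5 * 204, dtype=float).reshape(4, 5, 204)
    pixels = cube_to_pixel_matrix(cube)
    print(pixels.shape)
```

Row r * width + c of the flattened matrix holds the spectrum of the pixel at row r, column c, so classifier outputs can be mapped back to image coordinates with the second helper.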
Yokoya and Iwasaki52 reported that, since each pixel provides continuous electromagnetic spectral characteristics, it is possible to obtain detailed information about the target object. Owing to its high spectral resolution, HS imaging is applied in fields such as remote sensing and quality control of food and pharmaceuticals. Robben et al.53 pointed out that minerals show specific absorption characteristics in the near-infrared region from 1300 to 2550 nm due to vibrations of the bonding molecules contained in each mineral. A skilled expert can identify some minerals visually (by color), and the continuous electromagnetic spectrum in the short wavelength region is considered to contain optical data with mineral-specific physical properties.
Sorting machines that use MS images with a reduced number of dimensions are now technically feasible. They sort by acquiring specific wavelength information predetermined for each ore type. However, even for the same type of mineral, there are subtle differences in the formation and shape of each mine that affect the spectra. Additionally, the light environment inside each plant varies, which also affects the spectrum. Based on these factors, it is suggested that ore selectivity could be improved by defining the wavelength to be acquired for each ore type. To achieve this, we propose a framework that allows for the selection of spectral bands based on the characteristics of the ore produced. This framework will greatly support the tuning of the sorting process. As a case study, we will use a mineral sample containing arsenic.
The literature review highlights various gaps in current mineral processing practices, emphasizing the need for innovative approaches to improve efficiency and sustainability. While Sensor-Based Ore Sorting (SBOS) offers promise for environmentally friendly metal recovery, further technological development is required to enhance its effectiveness and practicality in operational settings. Traditional mineral analysis methods are time-consuming and impractical for real-time processing, prompting the exploration of faster and more economical techniques. Additionally, the application of machine learning algorithms and hyperspectral imaging for mineral identification presents computational challenges and limitations in the practical implementation due to the high dimensionality of data.
In response to these challenges, the proposed framework integrates Neighborhood Component Analysis (NCA) and machine learning algorithms to address the complexities of mineral identification and sorting using multi-spectral data. By reducing data dimensionality and identifying crucial wavelength bands, the framework enables efficient mineral identification while considering the unique characteristics of each ore type. Furthermore, by utilizing multi-spectral cameras with reduced dimensions, the framework enhances sorting efficiency and selectivity, paving the way for more sustainable mining practices and improved operational outcomes in mineral processing plants. In this study, a clustering-based method, Neighborhood Components Analysis (NCA), was used to perform dimensionality reduction and wavelength selection on HS data. After the selection, the selected bands were learned by machine learning algorithms to experiment with mineral identification. It is expected that mineral identification using fewer wavelengths than HS data will enable data acquisition with less expensive MS cameras and increase the efficiency of mineral identification.
The rest is here:
Optimizing multi-spectral ore sorting incorporating wavelength selection utilizing neighborhood component analysis for ... - Nature.com
EVOLVE 2024 Welcomes AWS Leader to Discuss the Transformative Power of Generative AI – GlobeNewswire
AUSTIN, Texas, May 21, 2024 (GLOBE NEWSWIRE) -- Asure (NASDAQ: ASUR), a leading provider of cloud-based Human Capital Management (HCM) software solutions, is pleased to announce that Vidya Sagar Ravipati, Applied Science Manager at the Amazon Web Services (AWS) Generative AI Innovation Center, will be a guest speaker at its EVOLVE 2024 conference, June 4-7 in Austin, TX. Exclusively designed for Asure Reseller partners, EVOLVE offers a blend of strategic insights and practical sessions aimed at helping HCM resellers drive growth and enhance efficiency.
Vidya Sagar Ravipati is a seasoned expert in large-scale distributed systems and machine learning. As the Applied Science Manager at the Generative AI Innovation Center, Vidya Sagar leads a team at the cutting edge of generative AI applications. These include personalized chatbots, hyper-personalization, call center analytics, image and video understanding, knowledge graph-augmented large language models, and personalized advertising. The AWS Generative AI Innovation Center helps customers navigate the new era of innovation with strategic guidance and delivery of impactful Generative AI solutions. Vidya's work is instrumental in helping AWS customers across various industries accelerate their AI and cloud adoption journey.
"Next-generation HCM technology is all about integrating advanced AI and machine learning capabilities to create more agile, efficient, and user-centric solutions," said Pat Goepel, Chairman and CEO of Asure. "We're thrilled to have Vidya share with our reseller partner community how our collaboration with AWS enables us to push the boundaries of what's possible in HCM, to expedite innovation and deliver exceptional user experiences."
In his session titled "Transformative Power of Generative AI," Vidya Sagar Ravipati will delve into how Asure is leveraging the transformative potential of AI through its partnership with AWS and the Generative AI Innovation Center. This collaboration connects AWS AI and ML experts with development teams like Asure's, enabling the envisioning, design, and launch of new generative AI products and features.
Attendees will learn how these technologies can optimize workflows, enhance user experiences, and turn innovative ideas into reality faster and more effectively. This session is set to provide invaluable insights into the future of AI and its practical applications in driving digital transformation.
As a member of the AWS Application Modernization Lab, Asure collaborates directly with AWS to define and leverage an innovative development framework to enhance its HCM SaaS (Software as a Service) offerings with advancements like cloud optimization and artificial intelligence (AI) to deliver premium agility and speed to market.
About Asure
Asure Software (NASDAQ: ASUR) provides cloud-based Human Capital Management (HCM) software solutions that assist organizations of all sizes in streamlining their HCM processes. Asure's suite of HCM solutions includes HR, payroll, time and attendance, benefits administration, payroll tax management, and talent management. The company's approach to HR compliance services incorporates AI technology to enhance scalability and efficiency while prioritizing client interactions. For more information, please visit www.asuresoftware.com.
Contact Information: Patrick McKillop Vice President, Investor Relations 617-335-5058 patrick.mckillop@asuresoftware.com
Read more here:
EVOLVE 2024 Welcomes AWS Leader to Discuss the Transformative Power of Generative AI - GlobeNewswire
Prediction of incomplete immunization among under-five children in East Africa from recent demographic and health … – Nature.com
Read this article:
Prediction of incomplete immunization among under-five children in East Africa from recent demographic and health ... - Nature.com
DeepDive: estimating global biodiversity patterns through time using deep learning – Nature.com
Read this article:
DeepDive: estimating global biodiversity patterns through time using deep learning - Nature.com
REMINDER – Vector Institute affiliated AI Trust and Safety Experts available for commentary related to the AI Global … – GlobeNewswire
TORONTO, May 21, 2024 (GLOBE NEWSWIRE) -- The second AI Global Forum will take place in South Korea next week gathering government officials, corporate leaders, civil societies, and academics from around the world to discuss the future of AI.
The Vector Institute is affiliated with a significant number of world-leading researchers working on AI Trust and Safety who are available to provide comment in the lead-up to and during the AI Global Forum.
In addition, Vector's President and CEO, Tony Gaffney, will participate in person at the Forum in South Korea on Wednesday, May 22, 2024, and will be available for comment while on site.
Media availability:
Experts will be available for commentary on AI Trust and Safety in the lead-up to and during the AI Global Forum.
Vector Institute Affiliated AI Trust and Safety Experts:
Jeff Clune
Jeff focuses on deep learning, including deep reinforcement learning. His research also focuses on AI Safety, including regulatory recommendations and improving the interpretability of agents.
Roger Grosse
Roger's research examines training dynamics in deep learning. He is applying his expertise to AI alignment, to ensure the progress of AI is aligned with human values. Some of his recent work has focused on better understanding how large language models work in order to head off the potential for risk in their deployment.
Gillian Hadfield
Gillian's current research is focused on innovative design for legal and regulatory systems for AI and other complex global technologies; computational models of human normative systems; and working with machine learning researchers to build ML systems that understand and respond to human norms.
Sheila McIlraith
Sheila's research addresses AI sequential decision making that is human compatible, with a focus on safety, alignment, and fairness. Recent work looked at the impact of ethics education on computer science students.
Rahul G. Krishnan
Rahul's research focuses on building robust and generalizable machine learning algorithms to advance computational medicine. His recent work has developed new algorithms for causal decision making, built risk scores for patients on the transplant waitlist, and created automated guardrails for predictive models deployed in high-risk settings.
Xiaoxiao Li
Xiaoxiao specializes in the interdisciplinary field of deep learning and biomedical data analysis. Her primary mission is to make AI more reliable, especially when it comes to sensitive areas like healthcare.
Nicolas Papernot
Nicolas's work focuses on privacy-preserving techniques in deep learning, and advancing more secure and trusted machine learning models.
About the Vector Institute
Launched in 2017, the Vector Institute works with industry, institutions, startups, and governments to build AI talent and drive research excellence in AI to develop and sustain AI-based innovation to foster economic growth and improve the lives of Canadians. Vector aims to advance AI research, increase adoption in industry and health through programs for talent, commercialization, and application, and lead Canada towards the responsible use of AI. Programs for industry, led by top AI practitioners, offer foundations for applications in products and processes, company-specific guidance, training for professionals, and connections to workforce-ready talent. Vector is funded by the Province of Ontario, the Government of Canada through the Pan-Canadian AI Strategy, and leading industry sponsors from across multiple sectors of Canadian industry.
This availability is for media only.
For more information or to speak with an AI expert, contact: media@vectorinstitute.ai
Read the original here:
REMINDER - Vector Institute affiliated AI Trust and Safety Experts available for commentary related to the AI Global ... - GlobeNewswire
Prediction of adolescent weight status by machine learning: a population-based study – BMC Public Health – BMC Public Health
Design and setting
We conducted a retrospective cohort study of P4 students from the 1995/1996 to 2015/2016 academic cohorts, who were followed until Secondary 6 (S6, Grade 12 in the US). P4 students are cognitively competent to provide self-reported measurements [22]. Additionally, we chose a cohort of P6 students from the 1995/1996 to 2013/2014 academic cohorts to predict weight status after P6, the last year of primary education in Hong Kong before students are promoted to the secondary level. Students who attended in at least two years and had complete health measurement records were included. Data were obtained from the Student Health Service (SHS) of the Department of Health in Hong Kong, which has provided voluntary territory-wide annual health assessment services for primary and secondary students since 1995/1996. The health assessment questionnaire changed in 2015/2016 [23]. Therefore, we included P4 students during 1995/1996 to 2014/2015, allowing at least one year of follow-up prediction. Further details of the health assessment scheme can be found elsewhere [24, 25].
Weight (to the nearest 0.1 kg) and height (to the nearest 0.1 cm) were measured annually at the SHS by well-trained healthcare workers or nurses according to the study protocol. Demographics included sex, age and family socioeconomic level. The family's socioeconomic status was indicated by parental educational level, parental occupation and the type of housing [26].
Dietary habits were assessed by breakfast eating habit, sweetness preference during the past 7 days, junk food intake habit, fruit/vegetable intake, and milk consumption habit. Physical activity behaviors were assessed by the weekly frequency of aerobic exercise, the weekly hours of aerobic exercise, and the daily hours of TV viewing. All of these predictors in the structured questionnaires had four response options representing different degrees of frequency or duration. Breakfast habits were assessed by the item "I usually have breakfast at?", for which we considered three response categories: (i) "home", representing frequently eating at home; (ii) "rarely at home", after combining the original categories of "fast food stall/cafeteria/restaurant" and "some other places"; and (iii) "no breakfast at all", representing never eating at home. Thus, this item can be considered an assessment of the frequency of breakfast eating at home.
Psychological development was assessed using the 60-item self-reported Culture Free Self-Esteem Inventory for Children Questionnaire (CFSEI-2), which has been validated in Hong Kong children and adolescents [27, 28]. The Self-Esteem Inventory (SEI) comprises a total score and four domain scores: (i) general self-esteem, denoting children's overall perception of themselves, with a score ≤7 considered very low; (ii) social self-esteem, denoting children's perception of their peer relationships; (iii) school-related self-esteem, denoting children's perception of their ability to achieve academic success; and (iv) parent-related self-esteem, denoting children's perception of their family's thoughts. Scores ≤2 in any of these three subscales were considered very low [27]. Children with a total score ≤19 or a very low score in any domain were considered to have low self-esteem. A lie scale score was also obtained, and a score2 indicates the corresponding child's self-reported assessment is unreliable [27].
Potential behavioral problems of children and adolescents were assessed using the 4-item Rutter Behavior Questionnaire (RBQ), which has been validated in Hong Kong children [29]. It inquired about hyperactivity, conduct, and emotional disturbances and was completed by parents. An RBQ total score19 indicated a potential behavior problem [30]. In total, 25 predictors were considered as input variables in developing multiclass prediction models.
The predicted weight status was classified as normal, obese, overweight, or underweight, based on the body mass index (BMI, expressed in kg/m²) at the next measurement year and the age- and sex-specific BMI references of the International Obesity Task Force (IOTF) standards.
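As a rough illustration of the outcome definition, the sketch below computes BMI and maps it to one of the four weight statuses. The function names and the cutoff tuple are hypothetical. The real IOTF references give separate cutoffs for every age and sex, so the adult-equivalent anchors used here (18.5, 25 and 30 kg/m²) are only a stand-in and not the values applied in the study.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight in kg and height in cm."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def classify_weight_status(bmi_value: float, cutoffs: tuple) -> str:
    """Map a BMI value to a weight status.

    `cutoffs` is a hypothetical (underweight, overweight, obese) tuple for a
    single age/sex group; an age- and sex-specific lookup would replace it.
    """
    under, over, obese = cutoffs
    if bmi_value < under:
        return "underweight"
    if bmi_value >= obese:
        return "obese"
    if bmi_value >= over:
        return "overweight"
    return "normal"


# Stand-in cutoffs: the adult-equivalent anchor points, NOT the child tables.
ADULT_ANCHORS = (18.5, 25.0, 30.0)

value = bmi(weight_kg=60.0, height_cm=165.0)
status = classify_weight_status(value, ADULT_ANCHORS)
print(round(value, 2), status)
```

In practice the cutoff tuple would be looked up from the student's age and sex at the next measurement year before calling `classify_weight_status`.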
Children with a lie self-esteem score2 were considered unreliable and removed. For the type of housing and parental occupation, we ordered the response categories by socioeconomic level, using the median monthly domestic household income for each type of housing and occupation obtained from the Hong Kong Census and Statistics Department. Sex, as a categorical variable, was one-hot encoded. The responses of the dietary and physical activity behavioral measurements were treated as ordinal variables, and the other predictors were treated as continuous variables. Missing data on socioeconomic status were filled in according to the information reported in the student's other assessment years. The other measurements had less than 5% missing data, which was considered inconsequential to the validity of the model development [31]. We applied k-nearest neighbour imputation to the training and test sets separately to facilitate the use of ML methods that require complete data [32].
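The imputation step can be sketched in plain Python as follows. This is a minimal illustration of k-nearest-neighbour imputation run on the training and test sets independently, not the authors' code. The helper names, the distance definition (root mean squared difference over jointly observed features) and the toy rows are all assumptions. In a Scikit-Learn workflow, `sklearn.impute.KNNImputer` would be the usual tool for this step.

```python
import math


def _distance(a, b):
    """Root mean squared difference over features observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no common features, so the rows cannot be compared
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(rows, k=3):
    """Return a copy of `rows` with None entries replaced by the mean of that
    feature among the k nearest rows in which it is observed."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if nearest:  # leave the gap if no comparable donor exists
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Toy data (hypothetical): each set is imputed using only its own rows.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, None], [10.0, 100.0]]
test = [[1.5, None], [9.0, 90.0], [2.5, 25.0]]

train_filled = knn_impute(train, k=2)
test_filled = knn_impute(test, k=2)
print(train_filled[2], test_filled[0])
```

Imputing each set from its own rows keeps test-set information out of the training data, at the cost of the test set being imputed from fewer donors.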
Categorical data were expressed as numbers with percentages for each weight status and compared using the chi-square test. Numerical data were presented as the mean ± standard deviation (SD).
P4 students were randomly divided into a training set and a test set at an 80:20 ratio. Multiclass prediction models were developed using the P4 training data to predict weight status in each subsequent year until S6, creating eight prediction windows. We used the same procedure to develop prediction models for the P6 training cohort, creating six prediction windows until S6. The weight status in our cohorts was imbalanced, with the underweight, overweight and obese categories being underrepresented. Such imbalance can bias model performance, making a model more accurate at predicting the majority weight status while performing poorly on the minority weight statuses. To address this issue, we applied the Synthetic Minority Oversampling Technique (SMOTE) to the training sets [33]. SMOTE is a widely used technique that creates synthetic samples for the minority categories by generating new instances that are similar to the original underrepresented categories. We attempted several ML approaches, including Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), and eXtreme Gradient Boosting (XGBoost), as well as the LG approach for comparison. The short- and long-term prediction abilities of the models were compared by calculating the correct classification rate, the overall accuracy on the test set, and the micro- and macro-averaged area under the curve (AUC). Receiver operating characteristic (ROC) curves for each weight status on the test set were also obtained. The AUC, precision, recall and F1-score were calculated to evaluate the model prediction accuracy and to assess the ability to predict an abnormal weight status. Precision and recall are conceptually equivalent to the positive predictive value and sensitivity, respectively, and the F1-score is the harmonic mean of precision and recall [34].
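The oversampling idea behind SMOTE can be sketched in a few lines of plain Python: pick a minority sample, pick one of its nearest minority neighbours, and place a new point somewhere on the line between them. This is a simplified illustration under assumed names and toy data, not the implementation used in the study. Like the procedure described above, it would be applied to the training set only.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate `n_new` synthetic samples from a list of minority-class rows.

    Each synthetic row is a random interpolation between a randomly chosen
    minority row and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        partner = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> partner
        synthetic.append([a + gap * (b - a) for a, b in zip(base, partner)])
    return synthetic


# Hypothetical two-feature rows for one underrepresented weight status.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6, k=2, seed=42)
print(len(new_points))
```

Because every synthetic row lies between two real minority rows, the new points stay inside the region already occupied by that class rather than duplicating existing records.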
For predicting a specific weight status, all accuracy measures ranged from 0 to 1, with a higher value indicating a higher accuracy.
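To make the per-class measures concrete, the sketch below computes precision, recall and F1-score for each weight status from a small set of invented labels, along with the overall accuracy and a macro-averaged F1. The function names and label vectors are hypothetical and serve only to show the arithmetic. The macro average is included as one example of averaging over classes.

```python
def per_class_scores(y_true, y_pred, labels):
    """Precision, recall and F1 for each label, treating it as the positive class."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Convention used here: a ratio with a zero denominator is set to 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


def accuracy(y_true, y_pred):
    """Share of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


labels = ["normal", "overweight", "obese", "underweight"]
y_true = ["normal", "normal", "normal", "normal",
          "overweight", "overweight", "obese", "underweight"]
y_pred = ["normal", "normal", "normal", "overweight",
          "overweight", "normal", "obese", "normal"]

scores = per_class_scores(y_true, y_pred, labels)
macro_f1 = sum(s["f1"] for s in scores.values()) / len(labels)
overall_accuracy = accuracy(y_true, y_pred)
print(scores["normal"], round(macro_f1, 3), overall_accuracy)
```

In this toy example the overall accuracy (0.625) is higher than the macro-averaged F1 (about 0.54) because the single underweight case is missed, which is the kind of gap that per-class measures are meant to expose in imbalanced data.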
To examine the importance of each predictor at both the population and individual levels, we applied Shapley Additive Explanations (SHAP) to the best-performing prediction models to obtain each predictor's contribution for a prediction window [35]. A SHAP value is assigned to each predictor and quantifies its contribution by comparing the model's predictions with and without that predictor. The Shapley values from all prediction windows in each cohort were used to compare the summary importance of predictors by weight status. Furthermore, to better understand the individual-level prediction of weight status, we selected two students as examples and used SHAP waterfall plots to illustrate the importance of different predictors for each student. Figure 1 shows the workflow used for this study. All prediction models were developed and compared using Python software (version 3.10) with Scikit-Learn.
Graphical illustration of the workflow used for this study
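The "with and without that predictor" comparison behind Shapley values can be shown exactly for a very small model. The sketch below enumerates every coalition of predictors for one toy instance and averages each predictor's marginal contribution. It is an illustration of the underlying definition only. The toy scoring function, the feature meanings and the baseline are assumptions, and "removing" a predictor is approximated by setting it to a baseline value. The SHAP library uses far more efficient estimators than this brute-force loop.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance by enumerating all coalitions.

    Predictors outside a coalition take their `baseline` value.
    """
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(x):
    """Hypothetical risk score over [BMI, daily TV hours, weekly exercise hours]."""
    return 0.5 * x[0] + 2.0 * x[1] - 1.0 * x[2] + 0.1 * x[0] * x[1]


instance = [22.0, 3.0, 1.0]   # one hypothetical student
baseline = [18.0, 1.0, 2.0]   # hypothetical reference values

phi = shapley_values(toy_score, instance, baseline)
print([round(p, 3) for p in phi])
```

The contributions add up to the difference between the score for the instance and the score for the baseline, which is the additivity property that a waterfall plot displays one predictor at a time.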
See original here:
Prediction of adolescent weight status by machine learning: a population-based study - BMC Public Health - BMC Public Health
How Machine Learning Revolutionizes Automation Security with AI-Powered Defense – Automation.com
Summary
Machine learning is sometimes considered a subset of overarching AI. But in the context of digital security, it may be better understood as a driving force, the fuel powering the engine.
The terms AI and machine learning are often used interchangeably by professionals outside the technology, managed IT and cybersecurity trades. But, truth be told, they are separate and distinct tools that can be coupled to power digital defense systems and frustrate hackers.
Artificial intelligence has emerged as an almost ubiquitous part of modern life. We experience its presence in everyday household robots and the familiar Alexa voice that always seems to be listening. Practical uses of AI mimic human behavior and take it one step further. In cybersecurity, it can deliver 24/7 monitoring, eliminating the need for a weary flesh-and-blood guardian to stand a post.
Machine learning is sometimes considered a subset of overarching AI. But in the context of digital security, it may be better understood as a driving force, the fuel powering the engine. Using programmable algorithms, it recognizes sometimes subtle patterns. This proves useful when deployed to follow the way employees and other legitimate network users navigate systems. Although discussing AI and machine learning separately can feel redundant to some degree, together they are a powerful one-two punch in terms of automating security decisions.
Integrating AI calls for a comprehensive understanding of mathematics, logical reasoning, cognitive sciences and a working knowledge of business networks. The professionals who implement AI for security purposes must also possess high-level expertise and protection planning skills. Used as a problem-solving tool, AI can provide real-time alerts and take pre-programmed actions. But it cannot effectively stem the tide of bad actors without support. Enter machine learning.
In this context, machine learning emphasizes software solutions driven by data analysis. Unlike human information processing limitations, machine learning can handle massive swaths of data. What machine learning learns, for lack of a better word, translates into actionable security intel for the overarching AI umbrella.
Some people think about machine learning as a subcategory of AI, which it is. Others comprehend it in a functional way, i.e., two sides of the same coin. But for cybersecurity experts determined to deter, detect and repel threat actors, machine learning is the gasoline that powers AI engines.
It's now essential to leverage machine learning capabilities to develop a so-called intelligent computer that can defend itself, to some degree. Although the relationship between AI and machine learning is diverse and complex, an expert can integrate them into a cybersecurity posture with relative ease. It's simply a matter of repetition and the following steps.
When properly orchestrated and refined to detect user patterns and subtle anomalies, the AI-machine learning relationship helps cybersecurity professionals keep valuable and sensitive digital assets away from prying eyes and greedy digital hands.
First and foremost, it's crucial to put AI and machine learning benefits in context. Studies consistently conclude that more than 80% of all cybersecurity failures are caused by human error. Using automated technologies removes many mistake-prone employees and other network users from the equation. Along with minimizing risk, these are benefits of onboarding these automated next-generation technologies.
Improved cybersecurity efficiency. According to the 2023 Global Security Operations Center Study, cybersecurity professionals spend one-third of their workday chasing down false positives. This waste of time negatively impacts their ability to respond to legitimate threats, leaving a business at higher than necessary risk. The strategic application of AI and machine learning can be deployed to recognize harmless anomalies and alert a CISO or vCISO only when authentic threats are present.
Increased threat hunting capabilities. Without proactive, automated security measures like MDR (managed detection and response), organizations are too often following an outdated break-and-fix model. Hackers breach systems or deposit malware, and then the IT department spends the remainder of their day, or week, trying to purge the threat and repair the damage. Cybersecurity experts have widely adopted the philosophy that the best defense is a good offense. A thoughtful AI-machine learning strategy can engage in threat hunting without ever needing a coffee break.
Cure business network vulnerabilities. Vulnerability management approaches generally employ technologies that provide proactive automation. They close cybersecurity gaps and cure inherent vulnerabilities by identifying these weaknesses and alerting human decision-makers. Unlike scheduling a routine annual risk assessment, these cutting-edge technologies deliver ongoing analytics and constant vigilance.
Resolve cybersecurity skills gap. It's something of an open secret that there are not enough trained, certified cybersecurity experts to fill corporate positions. That's one of the reasons why industry leaders tend to outsource managed IT and cybersecurity to third-party firms. Outsourcing helps to onboard the high-level knowledge and skills required to protect valuable digital assets and sensitive information. Without enough cybersecurity experts to safeguard businesses, automation allows the resources available to companies to drill down and identify true threats. Without these advanced technologies being used to bolster network security, it's likely the number of debilitating cyberattacks would grow exponentially.
The type of predictive analytics and swift decision-making capabilities this two-prong approach delivers has seemingly endless industry applications. Banking and financial sector organizations can not only use AI and machine learning to repel hackers but also ferret out fraud. Healthcare organizations have a unique opportunity to exceed Health Insurance Portability and Accountability Act (HIPAA) requirements due to the advanced personal identity record protections it affords. Companies conducting business in the global marketplace can also get a leg up in meeting the EU's General Data Protection Regulation (GDPR) designed to further informational privacy.
Perhaps the greatest benefit organizations garner from AI and machine learning security automation is the ability to detect, respond and expel threat actors and malicious applications. Managed IT cybersecurity experts can help companies close the skills gap by integrating these and other advanced security strategies.
John Funk is a Creative Consultant at SevenAtoms. A lifelong writer and storyteller, he has a passion for tech and cybersecurity. When he's not found enjoying craft beer or playing Dungeons & Dragons, John can often be found spending time with his cats.
Original post:
How Machine Learning Revolutionizes Automation Security with AI-Powered Defense - Automation.com