
How to Use Deep Learning to Process and Analyze Audio Data for Various Tasks and Domains – Analytics Insight

Audio data is a type of unstructured data that contains information about sound waves, such as frequency, amplitude, and phase. Audio data can be used for various applications, such as speech recognition, music generation, noise reduction, and audio classification. However, audio data is also complex and noisy, which makes it challenging to process and analyze.

Deep learning, a subset of machine learning, leverages artificial neural networks to glean insights from data and execute various tasks. Deep learning can handle large and high-dimensional data, such as audio data, and extract useful features and patterns from it. Deep learning can also achieve state-of-the-art results in various audio processing and analysis tasks.

This article explains how to use deep learning to process and analyze audio data. Follow these steps:

The first step in any deep learning project is to prepare the data for the model. For audio data, this involves the following steps:

Audio data can be stored in various file formats, such as WAV, MP3, or WMA. To load audio data into Python, we can use libraries such as Librosa. These libraries can read audio files and convert them into NumPy arrays, which are compatible with deep learning frameworks such as TensorFlow or PyTorch.
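
For example, a minimal loading sketch with Librosa might look like this; the file name "speech.wav" and the 16 kHz target rate are illustrative assumptions:

```python
import librosa

# librosa.load returns a float32 NumPy array and the sampling rate.
# sr=16000 resamples on load; sr=None would keep the file's native rate.
waveform, sample_rate = librosa.load("speech.wav", sr=16000, mono=True)

print(waveform.shape, waveform.dtype, sample_rate)
# e.g. (80000,) float32 16000 for a 5-second clip
```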

Audio data can have different characteristics, such as sampling rate, bit depth, or number of channels. To make the data consistent and compatible, we need to preprocess it by applying operations such as resampling, normalization, or channel mixing.
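
A minimal preprocessing sketch, assuming a 16 kHz target rate and simple peak normalization:

```python
import librosa
import numpy as np

def preprocess(waveform: np.ndarray, orig_sr: int, target_sr: int = 16000) -> np.ndarray:
    # Resample to a common rate so every clip has the same time resolution.
    if orig_sr != target_sr:
        waveform = librosa.resample(waveform, orig_sr=orig_sr, target_sr=target_sr)
    # Peak-normalize to [-1, 1] so loudness differences between recordings
    # do not dominate the learned features.
    peak = np.max(np.abs(waveform))
    if peak > 0:
        waveform = waveform / peak
    return waveform
```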

Audio data can be limited or imbalanced, which can affect the performance and generalization of the model. To increase the quantity and diversity of the data, we can augment it by applying operations such as shifting, stretching, cropping, or adding background noise. We can also use techniques such as pitch shifting or spectrogram masking to generate further variations.
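
A sketch of a few common waveform augmentations; the shift range, stretch factors, and noise level are illustrative choices:

```python
import librosa
import numpy as np

def augment(waveform: np.ndarray, sr: int, rng: np.random.Generator) -> np.ndarray:
    # Random time shift: roll the signal by up to 100 ms.
    shift = rng.integers(-sr // 10, sr // 10)
    out = np.roll(waveform, shift)
    # Random time stretch between 0.9x and 1.1x speed.
    out = librosa.effects.time_stretch(out, rate=rng.uniform(0.9, 1.1))
    # Random pitch shift of up to +/- 2 semitones.
    out = librosa.effects.pitch_shift(out, sr=sr, n_steps=rng.uniform(-2, 2))
    # Add low-level Gaussian background noise.
    out = out + 0.005 * rng.standard_normal(len(out))
    return out.astype(np.float32)
```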

The second step in any deep learning project is to build the model for the task. For audio data, this involves the following steps:

Audio data is usually represented as a time series of amplitude values, which can be hard to process and analyze by the model. To make the data more meaningful and compact, we need to extract features from it by converting it into a different domain, such as frequency, time-frequency, or cepstral.
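
For instance, Librosa can produce time-frequency and cepstral features from the raw waveform; the frame, hop, and band counts below are typical values, not requirements:

```python
import librosa
import numpy as np

# Time-frequency representation: log-mel spectrogram, a common input for audio CNNs.
mel = librosa.feature.melspectrogram(
    y=waveform, sr=sample_rate, n_fft=1024, hop_length=256, n_mels=64
)
log_mel = librosa.power_to_db(mel, ref=np.max)      # shape: (64, n_frames)

# Cepstral representation: MFCCs, often used for speech tasks.
mfcc = librosa.feature.mfcc(y=waveform, sr=sample_rate, n_mfcc=13)
```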

Audio data can be processed and analyzed by various types of deep learning models, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), or attention-based models. The choice of the model depends on the task and the data.
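
As one concrete option, a compact CNN over log-mel spectrograms is a common choice for audio classification; the following PyTorch sketch uses illustrative layer sizes:

```python
import torch
import torch.nn as nn

class AudioCNN(nn.Module):
    """Small CNN that treats a log-mel spectrogram as a 1-channel image."""
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes)
        )

    def forward(self, x):          # x: (batch, 1, n_mels, n_frames)
        return self.head(self.features(x))
```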

To train a deep learning model, we need to define the loss function, the optimizer, and the hyperparameters. The loss function measures the difference between the model output and the ground truth. The optimizer modifies the model parameters to reduce the loss.
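
Continuing the AudioCNN sketch above, a minimal training loop could look like this; `train_loader` is an assumed DataLoader yielding (spectrogram, label) batches, and the learning rate and epoch count are example hyperparameters:

```python
import torch

model = AudioCNN(n_classes=10)
criterion = torch.nn.CrossEntropyLoss()            # loss for classification
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(20):                            # hyperparameter: number of epochs
    for batch, labels in train_loader:             # assumed (spectrogram, label) batches
        optimizer.zero_grad()
        loss = criterion(model(batch), labels)     # compare output to ground truth
        loss.backward()                            # backpropagate
        optimizer.step()                           # update parameters to reduce the loss
```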

To evaluate a deep learning model, we need to measure its accuracy and robustness on unseen data. We can use metrics such as accuracy, precision, recall, or F1-score to measure the performance of the model on the task.
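
For example, with scikit-learn these metrics can be computed from predictions on a held-out set; `test_loader` is an assumed DataLoader of unseen examples:

```python
import torch
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

model.eval()
preds, targets = [], []
with torch.no_grad():
    for batch, labels in test_loader:              # assumed held-out (spectrogram, label) batches
        preds.extend(model(batch).argmax(dim=1).tolist())
        targets.extend(labels.tolist())

accuracy = accuracy_score(targets, preds)
precision, recall, f1, _ = precision_recall_fscore_support(targets, preds, average="macro")
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```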

The third step in any deep learning project is to deploy and use the model for the task. For audio data, this involves the following steps:

To save a deep learning model, we need to store its architecture, parameters, and configuration. We can use formats such as HDF5, ONNX, or TensorFlow Saved Model to save deep learning models for audio data.
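
A short sketch of both options for the PyTorch model above; the file names and the dummy input shape are illustrative:

```python
import torch

# Native PyTorch checkpoint: the architecture is re-created in code at load time.
torch.save(model.state_dict(), "audio_cnn.pt")

# ONNX export for framework-neutral deployment; the dummy input fixes the
# expected input shape (batch, 1, n_mels, n_frames).
dummy = torch.randn(1, 1, 64, 128)
torch.onnx.export(model, dummy, "audio_cnn.onnx", input_names=["log_mel"])
```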

To load a deep learning model, we need to restore its architecture, parameters, and configuration. We can use libraries such as TensorFlow or PyTorch to load deep-learning models for audio data. These libraries can reconstruct the model structure and functionality, as well as enable inference and prediction on new data.
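
For the checkpoint saved above, loading in PyTorch might look like this (the class `AudioCNN` is assumed to be importable from your own code):

```python
import torch

# Rebuild the architecture in code, then restore the saved parameters.
model = AudioCNN(n_classes=10)
model.load_state_dict(torch.load("audio_cnn.pt", map_location="cpu"))
model.eval()   # inference mode: disables dropout and batch-norm updates
```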

To serve a deep learning model, we need to expose it as a service that can receive and process audio data from various sources and clients. We can use frameworks such as TensorFlow Serving, PyTorch Serve, or FastAPI to serve deep learning models for audio data.
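
A minimal FastAPI sketch under the assumptions that a WAV file is uploaded, the model expects 16 kHz log-mel input, and `AudioCNN` plus its checkpoint come from the earlier sketches; the endpoint name is illustrative:

```python
import io
import librosa
import torch
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

# Load the trained model once at startup.
model = AudioCNN(n_classes=10)
model.load_state_dict(torch.load("audio_cnn.pt", map_location="cpu"))
model.eval()

@app.post("/classify")
async def classify(file: UploadFile = File(...)):
    audio_bytes = await file.read()
    # Assumes a WAV upload; other formats may need additional decoding backends.
    waveform, sr = librosa.load(io.BytesIO(audio_bytes), sr=16000, mono=True)
    mel = librosa.feature.melspectrogram(y=waveform, sr=sr, n_mels=64)
    log_mel = librosa.power_to_db(mel)
    x = torch.tensor(log_mel, dtype=torch.float32).unsqueeze(0).unsqueeze(0)
    with torch.no_grad():
        label = model(x).argmax(dim=1).item()
    return {"predicted_class": label}
```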



How Neara uses AI to protect utilities from extreme weather – Yahoo Singapore News

Over the past few decades, extreme weather events have not only become more severe, but are also occurring more frequently. Neara is focused on enabling utility companies and energy providers to create models of their power networks and anything that might affect them, like wildfires or flooding. The Redfern, New South Wales, Australia-based startup recently launched AI and machine learning products that create large-scale models of networks and assess risks without having to perform manual surveys.

Since launching commercially in 2019, Neara has raised a total of $45 million AUD (about $29.3 million USD) from investors like Square Peg Capital, Skip Capital and Prosus Ventures. Its customers include Essential Energy, Endeavour Energy and SA Power Networks. It is also partnered with Southern California Edison and EMPACT Engineering.

Neara's AI and machine learning-based features are already part of its tech stack and have been used by utilities around the world, including Southern California Edison, SA Power Networks and Endeavour Energy in Australia, ESB in Ireland and Scottish Power.

Co-founder Jack Curtis tells TechCrunch that billions are spent on utilities infrastructure, including maintenance, upgrades and the cost of labor. When something goes wrong, consumers are affected immediately. When Neara started integrating AI and machine learning capabilities into its platform, it was to analyze existing infrastructure without manual inspections, which he says can often be inefficient, inaccurate and expensive.

Then Neara grew its AI and machine learning features so it can create a large-scale model of a utility's network and surroundings. Models can be used in many ways, including simulating the impact of extreme weather on electricity supplies before, after and during an event. This can increase the speed of power restoration, keep utilities' teams safe and mitigate the impact of weather events.

"The increasing frequency and severity of severe weather motivates our product development more so than any one event," says Curtis. Recently there has been an uptick of severe weather events across the world, and the grid is being impacted by this phenomenon. Some examples are Storm Isha, which left tens of thousands without power in the United Kingdom, winter storms that caused massive blackouts across the United States, and tropical cyclones in Australia that leave Queensland's electricity grid vulnerable.


By using AI and machine learning, Neara's digital models of utility networks can prepare energy providers and utilities for these events. Some situations Neara can predict include where high winds might cause outages and wildfires, flood water levels that mean networks need to turn off their energy, and ice and snow buildups that can make networks less reliable and resilient.

In terms of training the model, Curtis says AI and machine learning were baked into the digital network from inception, with lidar being critical to Neara's ability to simulate weather events accurately. He adds that its AI and machine learning model was trained on over one million miles of diverse network territory, which "helps us capture seemingly small but highly consequential nuances with hyper-accuracy."

That's important because in scenarios like a flood, a single degree difference in elevation geometry can result in modeling inaccurate water levels, which means utilities might need to de-energize electricity lines before they need to or, on the other hand, keep power on longer than is safe.

Neara co-founders Daniel Danilatos, Karamvir Singh and Jack Curtis. Image Credits: Neara

Lidar imagery is captured by utility companies or third-party capture companies. Some customers scan their networks to continuously feed new data into Neara, while others use it to get new insights from historic data.

"A key outcome from ingesting this lidar data is the creation of the digital twin model," says Curtis. "That's where the power lies as opposed to the raw lidar data."

A couple of examples of Neara's work include Southern California Edison, where its goal is auto-prescription, or automatically identifying where vegetation is likely to catch fire more accurately than manual surveys. It also helps inspectors tell survey teams where to go, without putting them at risk. Because utility networks are often massive, different inspectors are sent to different areas, which means multiple sets of subjective data. Curtis says using Neara's platform keeps data more consistent.

In Southern California Edison's case, Neara uses lidar and satellite imagery and simulates factors that contribute to the spread of wildfire through vegetation, including wind speed and ambient temperature. But predicting vegetation risk is made more complex by the fact that Southern California Edison needs to answer more than 100 questions for each of its electric poles due to regulations, and it's also required to inspect its transmission system annually.

In the second example, Neara started working with SA Power Networks in Australia after the 2022-2023 River Murray flooding crisis, which impacted thousands of homes and businesses and is considered one of the worst natural disasters to hit southern Australia. SA Power Networks captured lidar data from the Murray River region and used Neara to perform digital flood impact modeling and see how much of its network was damaged and how much risk remained.

This enabled SA Power Networks to complete a report in 15 minutes that analyzed 21,000 power line spans within the flood area, a process that would have otherwise taken months. Because of this, SA Power Networks was able to re-energize power lines within five days, compared to the three weeks it originally anticipated.

The 3D modeling also allowed SA Power Networks to model the potential impact of various flood levels on parts of its electricity distribution networks and predict where and when power lines might breach clearances or be at risk for electricity disconnection. After river levels returned to normal, SA Power Networks continued to use Nearas modeling to help it plan the reconnection of its electrical supply along the river.

Neara is currently doing more machine learning R&D. One goal is to help utilities get more value out of their existing live and historical data. It also plans to increase the number of data sources that can be used for modeling, with a focus on image recognition and photogrammetry.

The startup is also developing new features with Essential Energy that will help utilities assess each asset, including poles, in a network. Individual assets are currently assessed on two factors: the likelihood of an event like extreme weather and how well it might hold up under those conditions. Curtis says this type of risk/value analysis has usually been performed manually and sometimes doesn't prevent failures, as in the case of blackouts during California wildfires. Essential Energy plans to use Neara to develop a digital network model that will be able to perform more precise analysis of assets and reduce risks during wildfires.

"Essentially, we're allowing utilities to stay a step ahead of extreme weather by understanding exactly how it will affect their network, allowing them to keep the lights on and their communities safe," says Curtis.


Graph Learning at the Scale of Modern Data Warehouses – InfoQ.com

Transcript

Dulloor: My name is Dulloor. I'm a founding engineer at Kumo.ai. At Kumo, we are working on bringing the value of AI driven predictive analytics to relational data, which is the most common form of data in enterprises today. Specifically, we apply graph learning to data in warehouses to deliver high quality predictions for a wide variety of business use cases. This talk is about the challenges of productionizing graph learning at scale, and the systems that we built at Kumo to address those challenges.

For this talk, I'll first provide a brief overview of graph neural networks or GNNs, and explain their advantages over traditional machine learning. Then we'll have a quick primer on graph representation learning using PyG, a popular open source GNN library. After that, I will delve into how the Kumo GNN platform with PyG at its core simplifies productionizing GNNs at a very large scale for enterprise grade applications.

First, some background. We have all seen how deep learning has transformed the field of machine learning. It has revolutionized the way we approach complex tasks, such as computer vision and natural language processing. In computer vision, deep learning has replaced traditional handcrafted feature engineering with representation learning, enabling the learning of optimal embeddings for each pixel and its neighbors. This has enabled a significant improvement in accuracy and efficiency for computer vision tasks. Similarly, in the case of natural language processing, the state-of-the-art performance achieved with deep learning is truly astounding. With deep learning, it's now possible to train models that can understand and generate natural language with incredible accuracy and fluency. The ability to transfer-learn from data-rich tasks to data-poor tasks has further extended the capabilities of deep learning, allowing high model performance to be achieved with fewer labels. Overall, the impact of deep learning on machine learning has been transformative, and it has unlocked new levels of accuracy, efficiency, and capabilities across a wide range of applications.

While deep learning has seen wide adoption for visual data and natural language, enterprise data typically consists of interconnected tabular data with a lot of attribute-rich entities. This data can be visualized as a graph where tables are connected by primary and foreign keys. Graph neural networks, or GNNs, bring the revolution of deep learning to such graph data in enterprises.

Graph neural networks or GNNs are a type of neural network designed to operate on graph structured data. They extend traditional neural network architectures to model and reason about relationships between entities, such as nodes and edges in a graph. The basic idea behind GNNs is to learn representations for each node in the graph by aggregating information from its neighbors. This is done through a series of message passing steps, where each node receives messages from its neighbors, processes them, and then sends out its own message to its neighbors. At each step, the GNN computes a hidden state for each node based on its current state and the messages it has received. This hidden state is then updated using a nonlinear activation function and passed to the next layer of the GNN. This process is repeated until a final representation is obtained for each node, which can then be used for downstream tasks such as node classification, link prediction, and graph level prediction. GNNs provide a very powerful tool for modeling and reasoning about graph structured data.
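
To make the message-passing loop concrete, here is a deliberately simplified NumPy sketch of a single aggregation-and-update step; it is a toy illustration of the idea described above, not Kumo's or PyG's implementation:

```python
import numpy as np

def message_passing_step(node_states, edges, weight):
    """One simplified GNN layer: aggregate neighbor states by mean,
    then apply a shared linear transform and a nonlinearity.

    node_states: (n, d) array of current node representations
    edges: list of (src, dst) index pairs
    weight: (d, d_out) shared transform applied to every node
    """
    n = node_states.shape[0]
    aggregated = np.zeros_like(node_states)
    counts = np.zeros((n, 1))
    for src, dst in edges:                  # each edge carries a "message"
        aggregated[dst] += node_states[src]
        counts[dst] += 1
    aggregated = aggregated / np.maximum(counts, 1)
    # Combine each node's own state with its aggregated neighborhood.
    return np.tanh((node_states + aggregated) @ weight)

# Toy usage: 3 nodes with 4-dimensional states on a small directed graph.
rng = np.random.default_rng(0)
states = rng.standard_normal((3, 4))
new_states = message_passing_step(states, [(0, 1), (1, 2), (2, 0)], rng.standard_normal((4, 4)))
```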

Now let's see how graph-based ML compares to traditional tabular ML when it comes to graph data. Traditional tabular ML often requires one-off efforts to formulate the problem, to perform necessary feature engineering, select an ML algorithm, construct training data for the problem. Finally, train the model using one of the many frameworks. As a result of so many one-off steps, traditional tabular ML is often error prone. In a sense, it amounts to throwing everything at the wall and seeing what sticks. Adding to that, the need to deal with so many frameworks and their peculiarities makes the problem even worse. Graph-based ML, on the other hand, is a much more principled approach to learning on graph data. Graph learning offers better performance at scale, and also generalizes to a wide variety of tasks. Problem formulation is much easier too because a use case has to be translated into only one of a handful of graph ML tasks. Once a graph ML task is defined, GNNs automatically learn how to aggregate and combine information to learn complex relational patterns at scale. GNNs are also good at reasoning across multiple hops. It is very hard to pre-compute and capture this as input features. That's one reason why traditional tabular ML tends to lose signal and result in poorer model quality, particularly as the use cases become more sophisticated. Finally, GNN's learned representations are more effective and generalizable than manually engineered features, and they are well suited for use in the downstream tasks. With traditional ML, problem formulation and mapping the problem to an ML task must be done pretty much from scratch for each use case. There is very little generalization that you can get across multiple tasks.

No wonder because of all these advantages, GNN adoption has really picked up in the industry. GNNs have repeatedly demonstrated their effectiveness in real-world scenarios, with machine learning teams at top companies successfully leveraging them to improve performance across a range of tasks. These tasks include recommendation and personalization systems, fraud and abuse detection systems, forecasting dynamic systems, and modeling complex networks, and many more use cases. GNNs have certainly become a potent tool for data scientists and engineers who are dealing with graph data.

Let's do a quick overview of how graph representation learning is done today in the research and open source community. For this part, we'll focus on PyTorch Geometric or PyG. PyG is a very popular open source library for deep learning on graphs that is built on top of PyTorch. It is a leading open source framework for graph learning, with wide adoption in the research community and industry. PyG's thriving community consists of practitioners and researchers contributing to the best new GNN architectures and design all the time. As we will see later, PyG is also at the core of the Kumo GNN platform. PyG provides a set of tools for implementing various graph-based neural networks, as well as utilities for data loading and preprocessing. It enables users to efficiently create and train graph neural networks for a wide range of applications from node classification and link prediction to graph classification. PyG also provides several curated datasets for deep learning on graphs. These datasets are specifically designed for benchmarking and evaluating the performance of graph-based neural networks. PyG also provides a variety of examples and tutorials for users to get started with graph deep learning.

PyG's programming model is designed to be flexible and modular, allowing users to easily define and experiment with different types of GNNs for various graph-based machine learning tasks. The first step is creating and instantiating graph datasets and graph transformations. For this step, PyG provides a variety of built-in graph datasets, such as citation networks, social networks, and bioinformatics graphs. You can also create your own custom dataset by extending the PyG dataset class. Graph transformations allow you to perform preprocessing steps on your graphs, such as adding self-loops, normalizing node features, and so on. After creating the dataset, the next step is defining how to obtain mini-batches from it. In PyG, mini-batches are created using the data loader class, which takes in a graph dataset and a batch size. PyG provides several different sampling methods for mini-batching out of the box, such as random sampling and the neighbor sampling used by GraphSAGE, and so on. You can also define your own sampling methods by creating a custom sampler class. The third step is designing your own custom GNN via predefined building blocks or using predefined GNN models. PyG provides a variety of predefined GNN layers, such as graph convolutional networks, graph attention networks, and GraphSAGE. You can also create your own custom GNN layers by extending the underlying PyG message passing class. GNNs in PyG are designed as modules, which can be stacked together to create a multi-layer GNN. The final step in doing GNNs with PyG is implementing your own training and inference routines. PyG provides an API for training and evaluating GNNs, which looks very similar to the PyTorch API. Training a GNN involves defining a loss function and an optimizer, and calling the train method on your GNN module. Inference involves calling the eval method on your GNN module, followed by making predictions on new graphs.
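
As a concrete reference point, the condensed sketch below closely follows PyG's introductory node-classification example on the built-in Cora dataset; it trains full-batch, so the mini-batching step is omitted, and the layer sizes and hyperparameters are the usual tutorial defaults rather than recommendations:

```python
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GCNConv

# Step 1: a built-in citation-network dataset (Cora).
dataset = Planetoid(root="data/Planetoid", name="Cora")
data = dataset[0]

# Step 3: a two-layer GCN built from predefined PyG layers.
class GCN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(dataset.num_features, 16)
        self.conv2 = GCNConv(16, dataset.num_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

# Step 4: a familiar PyTorch-style training loop for node classification.
model = GCN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)
for epoch in range(200):
    model.train()
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    loss = F.cross_entropy(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()
```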

While PyG provides a great basis for graph learning, productionizing GNNs requires many additional capabilities that are difficult to build and scale. Firstly, let's consider graph creation. PyG expects a graph to be in either COO or CSR format to model complex heterogeneous graphs. Both nodes and edges can hold any set of curated features. Datasets that come with PyG are ready to use, however for non-curated datasets, such as what we will see in enterprises all the time, it is up to the users to create and provide a graph that PyG expects. Graph creation and management are not trivial, particularly at scale. The second issue is the problem formulation. While PyG supports any graph related machine learning tasks, the problem formulation itself is up to the user. It is the user's responsibility to define a business problem as one of the supported graph learning task types in PyG. Even after the problem formulation, curation of training labels for a given task is also the user's responsibility. When doing this, one has to make sure that temporal consistency is maintained during label generation and neighbor sampling, and all the way in the pipeline to avoid leakage from future entities. The same goes for predictions. Avoiding data leakage and serving GNN predictions is a non-trivial task. Another limitation is that, while GNN supports full customization, from model architecture to training routine pipeline, the best model architecture itself is both data and task dependent, and users must take into account several factors such as, which GNN is best suited for a given task? How many neighbors and how many hops to sample? How to ensure that the model generalizes over time, and how to deal with class imbalances and overfitting. To add to all this complexity, when new data arrives, or the structure of the graph changes, it requires updating the graph, retraining and versioning the model outputs. As you can see, productionizing GNNs particularly at scale is challenging, because there are many complex and error prone steps generally required to go from a business problem to a production model. It requires significant GNN and systems' expertise to productionize GNNs at scale. The Kumo GNN platform addresses this gap.

We will now dive deeper into the Kumo programming model and the platform. The Kumo platform makes it easier to express business problems as graph-based machine learning tasks. It is also easy to securely connect to large amounts of data in warehouses and start running queries. Once a connection to a data source is established, and a meta-graph is specified, the Kumo platform automatically materializes the graph at the record level. The business problem itself can be specified declaratively using predictive query syntax. Internally, predictive queries are compiled into a query execution plan, which includes the training plan for the corresponding ML task. As data in the data source changes, the materialized graph and features are incrementally updated, and Kumo automatically optimizes the graph structure for specific tasks, such as adding meta-paths between records in the same table to reduce the number of hops that is required to learn effective representations. When it comes to model training, Kumo provides out of the box few-shot AutoML capabilities that are tailored to GNNs. Finally, Kumo also provides enterprise grade features such as explainability in addition to MLOps capabilities.

The Kumo programming model for training and deploying GNNs is pretty simple, and it can be broken down into five steps. The first step is to create Secure Connectors to one or more data sources. The next step is to create one or more business graphs by connecting tables from the data sources. Following that, we formulate the business problem with predictive queries on the business graphs. Finally, the predictive queries are trained, typically with an AutoML trainer, whose search space has been configured by the predictive query compiler. Then, running inference can be done multiple times on this trained model. All of the steps described here can also be done via UI. The Python SDK and UI both share the same REST service backend. Let's take a look at each of these steps in more detail. We use the popular H&M Kaggle dataset for the running example. It has three tables, user, sales, and products that are linked to each other as shown here. The first step is to create a connector for each data source. In the Kumo platform, connectors abstract physical data sources to provide a uniform interface to read metadata and data. Kumo supports connectors to S3 and a number of cloud warehouses such as Snowflake and Redshift. For datasets that are in S3, Kumo supports both CSV and Parquet file formats, and a number of common partitioning schemes, and direct tree layouts. The Kumo platform uses the Secure Connectors to ingest data and cache them for downstream processing. Working with a wide variety of data sources can be challenging due to data cleanliness issues, particularly typing related issues. That's something the Kumo platform takes care of automatically. After creating connectors, the next step is to register a meta-graph by adding tables and specifying linkage between them. In well-designed schemas, these linkages are typically primary key, foreign key relationships, and the meta-graph represents the graph at the schema level. The actual graph at the record level is typically large and difficult to construct and maintain. The graph here is materialized automatically in the backend in a scalable manner.

After building a graph, users can define their business use cases declaratively with the predictive query syntax. Predictive query makes it easy to express business problems without worrying about how they map to graph-based ML tasks. Once a predictive query is defined, Kumo automatically infers the task type, generates the training labels for the task, handles the train and evaluation splits, and determines the best training strategy based on the understanding of both data and task. Through the training process, Kumo also automatically handles time correctness to prevent data leakage. Users may create any number of predictive queries on a given graph. Kumo then provides multiple options for creating your model. As mentioned before, for a given predictive query, based on the understanding of data and the task, Kumo can automatically generate the best training strategy and search space for few-shot AutoML. This auto-selected config includes both AutoML process options, such as data split strategy, upsampling and downsampling of target classes, the training budget. It also includes GNN design space options, such as GNN model architecture, sampling strategy, optimization parameters, and so on. In addition, the type of encoding to use for features is also part of the search space. The encoding options are selected automatically based on data understanding and statistics, thereby avoiding manual feature engineering, which is error prone, and often ad hoc. Advanced users, they also have the option of customizing the AutoML search space and restricting or expanding the set of training experiments to run. AutoML requires that the Kumo platform is able to run a large number of concurrent experiments for each predictive query.

The final step is to run inference on the best trained model. This inference could be run potentially many times a day. The results from running the inference, which is the model output, can be pushed to S3 or directly to a warehouse such as Snowflake. Kumo GNNs can produce both predictions and embeddings. Predictions are usually integrated directly into business applications, while embeddings are typically used in a variety of downstream applications to improve their performance. To summarize, Kumo GNNs provide a flexible and powerful way of doing ML on enterprise graph data. The Kumo programming model itself is simple enough that even non-experts in ML and GNNs can bootstrap quickly and deliver business value. However, enabling this nice user experience, particularly at scale, is challenging.

Let's get into how we productionize GNNs at scale at Kumo. Before getting into the Kumo platform architecture, let's first take a look at how a typical workflow for training a single predictive query looks. When a predictive query arrives, the platform first determines the data dependencies. Based on that, data is ingested from data source and cached. Next, we compute stats and run some custom logic to better understand data. Simultaneously, the platform also materializes features and edges to a format that is better suited to support training, particularly mini-batch fetching. These steps so far can be shared between predictive queries with the same set of dependencies. Then the workflows that follow are more query specific. First, the predictive query compiler takes the predictive query as input, infers the type of ML task, and generates a query execution plan based on the task and data characteristics. The workflows following that take the query execution plan as input, generate the target table, and train models using the specified AutoML config. It is important to note that to search for the best model, AutoML itself spawns a number of training jobs and additional workflows. The key takeaway from the previous slide is that there is a lot happening in the backend, even to train a single predictive query. When you add to that, the Kumo platform's objective of running model training and inference for many predictive queries simultaneously, the problem becomes much more challenging. To address this challenge, the Kumo platform was able to scale seamlessly and on-demand to a very large number of data and compute intensive workflows.

Here, I'll first discuss some of the key principles that we followed in designing the Kumo platform architecture. These principles can be seen as common-sense guidelines that have been proven effective in large scale systems for a long time. Firstly, since the Kumo platform must be scalable enough to handle a large number of fine-grained concurrent workflows, a microservice based architecture is a straightforward choice. For operational simplicity, we want to use standard cluster management tools like K8s. Another common design choice in large scale systems is a separation of compute and storage. In our context, workflows must be well defined, taking inputs from S3, and producing outputs to S3 to maintain the separation of compute and storage. This simple principle enables the system to scale compute independently from storage, allowing for efficient resource usage and flexibility. The third principle is that the microservice architecture also makes it easy to choose AWS instances based on specific workload requirements. Data processing and ML workflows in Kumo have very different characteristics, requiring a mix of GPU instances, memory optimized instances, and IO optimized instances often with very large SSDs. A tailored approach to choosing the instance types is required for optimized performance and cost efficiency. Finally, the Kumo platform deals with data in the order of tens of terabytes. For continuous operations, starting from scratch after every data update can become prohibitively expensive, and slow. Incremental processing of newly arriving data and incremental materialization of graphs and features are necessary, not only for efficient resource usage, but also to reduce the overall turnaround time for training and predictions. The Kumo platform's design is guided by these four simple scaling principles.

The overall architecture can be broken down into four major components. The control plane is the Always On component that holds the REST service, manages metadata, and the workflow orchestrator. The control plane also houses the predictive query planner or compiler that compiles a user provided predictive query into an execution plan driven by the workflow orchestrator. The data engine is the component responsible for interfacing with the data sources. It is responsible for ingesting and caching data, inferring column semantics, and computing table column edge stats for data understanding. Data engine also materializes edges and features from raw data into artifacts that are used by the graph engine for neighbor sampling and feature serving, respectively. The Compute Engine is the ML workhorse of the system with PyG at its core. The Compute Engine is where model training jobs are run for all experiments and for all predictive queries. Inference jobs that produce predictions or embeddings are also run in the Compute Engine. The Compute Engine works on mini-batches of data during the training process. Each example in a mini-batch represents a sub-graph that contains a node and a sample of its neighborhood. It is the only component that requires GPU instances. Finally, the graph engine exists as a separate component to provide Compute Engine with two services. One is graph neighbor sampling, and the second, feature servicing. Compute Engine requires these services to construct mini-batches for training. Graph engine is an independently scaling shared service that's used by all trainer instances in the Compute Engine.

Next, we'll get into the details of each of these components and the specific challenges that we had to solve with them. First, let's take a look at the data engine. Like mentioned before, the data engine is responsible for all raw data processing and transformations in the system. Data engine uses Secure Connectors to ingest data from the data sources and cache them internally in a canonical format. During this process, data engine also tries to infer the semantic meaning of columns, and it computes extensive table, column, and edge level stats for better data understanding. This information is used later by the predictive query planner in feature encoding decisions and in determining the AutoML search space. For efficient execution of predictive queries, some components need data in columnar format, while others need it in row-oriented format. For instance, data engine needs data in columnar format for statistics computation, and semantic inference, and edge materialization. Whereas the feature store in the graph engine requires data in row-wise format for fast feature fetching and mini-batch construction. The data engine efficiently produces all these artifacts, and is also responsible for versioning and lifecycle management of data artifacts in the cache.

Let's first take a look at one of the two main functions of the data engine, which is feature materialization. Materializing features converts raw data features in tables to a row-oriented format that is suited for random feature serving. Materializing features is not only data intensive, but the artifacts produced are also very large. One of the primary goals here is to avoid any additional processing when these artifacts are loaded into the feature store. We want the loading to be as lightweight as possible. Additionally, we want to minimize the time taken to load the materialized features. Since we implement the feature store as a RocksDB key-value store, the data engine achieves both of these objectives by materializing features directly to RocksDB's internal SST file format. These SST files can later be loaded directly by the feature store. During feature materialization, nodes are assigned unique indices, and these same indices are used when materializing the list of edges between each pair of connected tables. That's how we can make the edges themselves a fixed size and more storage efficient. Finally, feature materialization can be easily parallelized both across tables and across partitions within a table. The second big function of the data engine is graph or edge materialization. Let's take a look at that now. To materialize the graph, we need to generate the list of edges for each pair of connected tables in the graph. This step is where automatic graph creation occurs so that users don't have to bring their own graph. The edges are produced by joining tables on the keys that link them. The materialized edges are also output in COO format to enable more data parallelism while generating these edges. Additionally, as new data arrives, the materialized edges are updated incrementally, which is faster and also more efficient.

As we can see, between data caching, stats computation, and materialization of features and edges, there's a lot happening in the data engine, particularly when data is in the order of tens of terabytes. Fortunately, all of this data processing can be distributed and parallelized. However, it is important to ensure that all data processing in the Kumo data engine is completely out of core, meaning that at no time do we need all of the input to be present in memory. For this purpose, the Kumo data engine uses Spark for data processing. Specifically, we use PySpark with EMR on EKS, which enables the data engine to autoscale based on its compute needs. EMR on EKS also integrates cleanly with the Kumo control plane, which is Kubernetes based. Also, the data engine uses Apache Livy to manage Spark jobs. Livy provides a simple REST interface to submit jobs as snippets of code. It also provides a clean, simple interface to retrieve the results synchronously or asynchronously. Livy simplifies Spark context management and provides excellent support for concurrent, long-running Spark contexts that can be reused across multiple jobs and clients. Livy also manages and simplifies the sharing of cached RDDs across jobs and clients, which turned out to be very helpful for efficient job execution. Finally, as you can see here, the data engine implements a lightweight Livy driver, which launches Spark jobs on-demand in EMR on EKS. The job launches from the Livy driver are themselves triggered by workflow activities, which are scheduled by the orchestrator.

Now that we are done with data engine, let's move on to the scaling challenges with the Compute Engine. The Compute Engine is the ML workhorse of the Kumo GNN platform. It is where AutoML driven model training happens, and also where model outputs from inference are produced. Take the example of a training pipeline for a single AutoML training job. Taking this example, the GPU based trainer instance with PyG at its core, must continuously fetch mini-batches of training examples from the graph engine. Train on these mini-batches, and do all of these while the feature serving and mini-batch production matches the training throughput. Construction of the mini-batches itself is a two-step process. The first step is to sample the neighborhood for a set of requested nodes and construct a subgraph for each of those nodes. The next step is to retrieve features for the nodes in these subgraphs, and construct a mini-batch of examples for the training. This process is repeated for each step of model training and Kumo supports several sampling strategies. While the specific sampling method may vary across trainer jobs, this sampler itself always ensures temporal consistency if the event timestamps are present. An AutoML search algorithm decides which configurations the trainers will execute based on data, task, and past performance metrics. The goal of this searcher algorithm is to ensure that the optimal set of parameters is learned for any given task. At any given time, we may have multiple trainer jobs waiting to be executed for a predictive query, and many predictive queries in flight. For a reasonable turnaround time, the Compute Engine must be able to launch and execute many trainers in parallel. That is the main scaling requirement for the Compute Engine to be able to scale up the number of trainer jobs on-demand.

To scale to a large number of trainers in parallel, we rely on two key ideas. The first is a separation of the graph engine from the Compute Engine, so that we are able to scale these two components independently. By sharing the feature and graph stores across multiple training jobs, we are able to run a large number of training jobs in parallel with low resource requirements on the GPU nodes themselves. These trainers communicate independently with the feature and graph stores through a shared mini-batch fetcher, which not only fetches, but also caches, the mini-batches that are requested by trainers. In practice, we have seen this mini-batch fetcher to be quite helpful. The second key idea is autoscaling trainers with intelligent selection of the type of GPU instance. The Kumo engine implements a lightweight driver that manages a K8s cluster of trainer instances. When requested by the AutoML searcher, this driver is able to launch a trainer on-demand, selecting the type of instance that is suited for the specific training job config. The resulting architecture is highly scalable, resource efficient, and cost efficient. It is also extremely flexible, and makes it easy to integrate new machine learning approaches. To enable this sharing of the feature store across multiple trainer jobs, we keep only raw features in the feature store. The feature encoding itself happens on the fly in the Compute Engine. The specific types of encoding, which are part of the AutoML search space, depend on both data and task. Advanced users always have the option of overriding feature encoders, just like they are able to override other options in the generated AutoML config. Kumo supports a large number of encoding types out of the box, and continues to add more as needed.

Finally, let's take a look at the graph engine and how we scale feature and graph stores. The Kumo feature store is a horizontally scalable persistent key-value store that is optimized for random read throughput. It is implemented as a service for feature fetching over RPC. We have implemented three key optimizations to improve the efficiency of feature serving, and also speed up the conversion of raw features in the storage to Tensors expected by the caller. The first optimization is to reduce communication overhead. We use protobuf/gRPC to communicate between the data server and the compute client. We store these individual node features as protobufs. Typically, like in TensorFlow, these example features are defined as a list of individual feature messages that contain the name of the feature and its associated value. However, this representation is not particularly storage friendly, due to the duplication of names and lack of memory alignment. We circumvent this issue with a simple optimization, which is based on this idea. The idea is pretty simple, that we create a separate feature config, which determines the order of columns in the protobuf that contains the feature values. Additionally, we also group the columns in this feature config by the data types to enable features of the same type to be stored in a compacted array. That further reduces the message size.

The second optimization is related to how we convert from row-wise to column-wise feature representation on the client side. On the client side, the feature store receives features that are stored as protobuf in the row-wise feature representation, as has been extracted from the feature store. These features need to be converted to a column-wise feature matrix, so that we can easily perform feature transformations that are applied to columns. The image on the bottom left actually depicts this process. While the column-wise feature matrix is constructed in a client that is written in C++, it is later used in a number of places in Python code, and in one or more processes. We chose Arrow as our column-wise data format to benefit from its zero-copy design. To be able to do this, the most challenging part that we had to handle was basically dealing with the NA values for which we designed lightweight math, and some careful design decisions that helped us get to the zero-copy design. We want to optimize the feature access performance by improving the data locality so that we can maximize the number of features that are fetched within each field operation. We implemented a simple but very effective idea to reorder the nodes in the feature store based on the neighbors. For example, as you can see here in the picture at the bottom right, we placed triangle nodes accessed by the same circle neighbor as close as possible. Then we placed the square nodes accessed by the same triangle neighbors as close as possible. This optimization works very well for a lot of real-world applications, and in our particular scenario, where we have to traverse the graph and get the features for neighbors. With these optimizations in place, we were able to achieve 3x speedup in the end-to-end feature fetching, which includes the time it takes to convert features to a Tensor. As a result, we are able to keep GPUs fully utilized during model training, and run many more trainer instances in parallel.

Let's move on to the Kumo graph store, which is implemented as an in-memory graph store that is optimized for sampling. It is also implemented as a separate service to allow for independent scaling from the feature store and the Compute Engine. Sampling in GNNs produces a uniquely sampled subgraph for each seed node. To ensure maximum flexibility during training, the graph store must be optimized for fast random access to outgoing neighbors given an input node. Additionally, to scale to large graphs with tens of billions of edges, the graph store must minimize the memory footprint by using compressed graph formats. The graph engine achieves both of these objectives by leveraging core PyG sampling algorithms that are optimized for heterogeneous graphs in CSR format. When timestamps are provided, the edges in CSR are secondarily sorted by timestamp to speed up the temporal sampling process. Furthermore, since enterprise graphs are typically sparse, the CSR representation itself is able to achieve very high compression ratios.
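
As a rough illustration of why CSR supports this access pattern, the sketch below samples outgoing neighbors from toy CSR arrays; it is illustrative only and not the graph engine's actual implementation:

```python
import numpy as np

# Toy CSR graph: indptr[i]:indptr[i+1] slices the neighbor list of node i.
indptr = np.array([0, 2, 5, 6, 6])        # 4 nodes
indices = np.array([1, 2, 0, 2, 3, 1])    # flattened neighbor ids

def sample_neighbors(node, k, rng):
    """Uniformly sample up to k outgoing neighbors of `node` from the CSR arrays."""
    neighbors = indices[indptr[node]:indptr[node + 1]]   # O(1) slice, no scan
    if len(neighbors) <= k:
        return neighbors
    return rng.choice(neighbors, size=k, replace=False)

rng = np.random.default_rng(0)
print(sample_neighbors(1, 2, rng))   # two of node 1's neighbors drawn from [0, 2, 3]
```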

Now let's move on to the control plane and some of the specific challenges that we had to solve over there. The Kumo platform, as we have seen before, is designed to run many predictive queries on potentially many graphs simultaneously. Training a predictive query requires running a large number of workflows to securely ingest data into caches, compute stats, materialize features and edges for the graph engine. Then, additional workflows are needed to automatically generate the training and validation tables, train the models, and run inference to produce model outputs. To support all of this, the Kumo control plane is built around a central orchestrator with dynamic task queues that can scale to execute thousands of stateful workflows, and activities, all of them concurrently. After carefully evaluating a number of options for orchestrators out there, we chose Temporal for workflow orchestration. In addition, the control plane includes a metadata manager and enterprise grade security and access controls. The metadata itself is stored in a highly available transactional database. The control plane also handles all aspects of graph management, including scheduling incremental updates to the graph, and also graph versioning. Furthermore, the control plane provides basic MLOps functionality, including model versioning, and tools to monitor data and model quality.

GNNs bring deep learning revolution to graph data found in enterprises. The graph-based ML can significantly simplify predictive analytics on relational data by replacing ad hoc feature engineering with a much more principled approach that automatically learns from connections between entities in a graph. However, deploying GNNs can be very challenging, particularly at the scale that is required by many enterprises. The Kumo platform is designed from the ground up with an architecture that scales GNNs to very large graphs. Then the platform is also able to simultaneously train many models and produce model outputs from many queries on the same graph at the same time. While building out the capabilities in the Kumo platform requires deep expertise in both GNNs and high-performance distributed systems, the complexity itself is hidden away from users by means of a simple yet intuitive API. As an example of how much Kumo platform can scale, in one specific customer deployment, we score 45 trillion combinations of user item pairs and generate 6.3 billion top ranking link predictions in a matter of a couple of hours, starting from scratch. By designing for flexibility, the Kumo GNN platform is able to enable users to quickly go from the business use case to a deployable GNN at scale, and therefore derive business value much faster.



Is T-Mobile’s AI training model the reason it keeps getting hacked? – Light Reading

A new lawsuit against T-Mobile's board of directors contains some surprising and timely accusations against the 5G provider: that it pooled its customers' data into one big database that it is using to train its AI services, and that this is the reason the US company has suffered through a series of devastating hacks into its systems.

"In order to train the sophisticated AI and machine learning models T-Mobile needed ... T-Mobile pooled all its data, pooled credentials, and prioritized (and still prioritizes) model training and accessibility over data security," according to the lawsuit, which was filed by a T-Mobile investor.

"Because T-Mobile centrally maintains credentials and configurations for its databases, then allows software programs to query and combine their disparate data, T-Mobile essentially maintains a single consolidated pool of data," the lawsuit states. "This single-point of access data centralization is incredibly dangerous and a serious departure from well-accepted baseline data security and enterprise data storage practices."

T-Mobile and its parent company Deutsche Telekom (DT) have soundly rejected the allegations in the lawsuit.

T-Mobile responds

The lawsuit "is based solely on speculation (piled on speculation), not well-pleaded facts," according to T-Mobile's response to the plaintiff's initial claims. The response was filed in Delaware Court of Chancery.

"Plaintiff points to no T-Mobile board minutes discussing any directive or any documents (either internal or external) at all that mention such a directive," T-Mobile's response continues. "Plaintiff's opposition ignores that fatal flaw and instead asks the court to infer such a directive based on nothing more than (1) two YouTube videos, (2) an irrelevant PowerPoint slide from a DT supervisory board meeting, and(3) the fact that T-Mobile announced a merger with Sprint in 2018. None of those comes close to supporting such an inference."

The Delaware Court of Chancery is often the forum for disputes involving the internal affairs of corporations. The lawsuit against members of T-Mobile's board was first filed by T-Mobile investor Jenna Harper in late 2022. Lawyers in the case earlier this month presented their arguments before Vice Chancellor Sam Glasscock III, who reportedly seemed skeptical of some of the lawsuit's claims.

It's unclear what might happen next in the case, which is being viewed by some as an important look into the sometimes murky rules around AI development, security and data management. Leading AI companies, including OpenAI, the maker of ChatGPT, have argued that AI services are only as good as the data they're trained on: the more data, the better. And in the wireless industry, network operators command vast amounts of data about their operations and their customers.

Teaching the AI

According to Harper's lawsuit, T-Mobile's AI efforts stem from a program started in 2014 in DT's T-Labs research division. "DT's plan, unprecedented in the staid telecommunications space, was to roll out a unified, incredibly audacious data-mining and AI-training architecture," according to the lawsuit.

The suit notes that the effort initially fell under DT's "big data" efforts. Big data is a term that has been used to describe using high-performance computing services, including those running in the cloud, to comb through mountains of data to glean business insights. Today, however, big data has been mostly subsumed by the AI craze because artificial intelligence is used to find correlations inside those data warehouses.

According to Harper's complaint, DT sought a leg up over competitors by unifying its "data lake" across business units and country borders. "Deutsche Telekom has launched an overarching AI program, eLIZA, for the purpose of linking all AI solutions within the Deutsche Telekom Group," the lawsuit states. Doing so will "commingle and share everything learned from that data, including ML/AI models, for the benefit of DT as a whole," it adds.

The AI program stretched into T-Mobile following its acquisition of Sprint, which closed in 2020, according to the suit. DT has been increasing its ownership stake in T-Mobile since that merger.

However, T-Mobile sought to cut corners in its efforts to participate in DT's AI program, according to the lawsuit.

The 'qAPI' hole

"For example, although most enterprises used sophisticated and robust programming languages, such as Python, to develop machine-learning applications, T-Mobile's team used the programming language R a language used for statistical modeling," according to the lawsuit. "While R could help T-Mobile's data scientists rapidly prove that their ML models had predictive capacity, the language was poorly suited to security, data management and data infrastructure, as it lacked many of the software libraries available in other programming languages, like Python."

The complaint also states that T-Mobile created an application programming interface (API) that could interact with multiple databases of information. But the company didn't implement a secure method for accessing that API, dubbed qAPI.

"Critically, qAPI allowed 'credential' centralization," according to the lawsuit. "That meant that individual usernames and passwords or other database access keys would not have to be maintained by each app. They would be held by the API, which in turn would enforce access from querying apps. This meant that the credentials for every database would be centrally maintained creating a single point of failure for T-Mobile's security."

The complaint continues: "As a result, a single compromised test server anywhere in the entire T-Mobile ecosystem can easily and durably access, save and export the entirety of T-Mobile's data ecosystem because T-Mobile designed its system that way."

The hacks

Harper's lawsuit then highlights the multiple breaches into T-Mobile's security systems that happened shortly after the company closed its merger with Sprint. The most serious occurred in August 2021, when John Binns discovered "an unprotected [T-Mobile] router exposed on the Internet." The 21-year-old told his story to The Wall Street Journal.

"In short, Binns found a single unsecured router publicly exposed on T-Mobile's network, and was quickly able to gain access to a centralized repository of credentials that allowed him the keys to T-Mobile's entire data kingdom, including more than 100 servers," the lawsuit states. "This matches the precise architecture of the qAPI system."

The lawsuit also alleges that T-Mobile hasn't fixed its data architecture, and is maintaining its systems in order to continue participating in DT's wide-ranging AI training efforts.

But that argument doesn't fly, according to DT and T-Mobile.

"Plaintiff's central thesis that T-Mobile's board disloyally allowed DT to 'loot' T-Mobile's data, for DT's own benefit, thus exposing T-Mobile to cyberattacks is based solely on speculation (piled on speculation), not well-pleaded facts," T-Mobile's response states.

The context and the background

Big data, and now AI, remain hot topics in the wireless industry and in the wider tech marketplace. For example, AT&T has detailed its own efforts to unify its data assets to improve business strategies.

"AT&T carries more than 534.7 petabytes of data across its global network every day," explains AT&T's Andy Markus, chief data officer, in a 2022 post to the company's website. "To manage data at this scale, the CDO [Chief Data Office] team has defined a common approach to how data is stored, managed, accessed and shared across AT&T."

By applying AI technologies to all that data, AT&T says it has been able to block robocalls, predict outages and develop virtual assistants for customer care services, among other services. Other carriers have boasted of similar efforts.

AI is also becoming a key technological consideration among telecom vendors. Giants like Ericsson and startups like Aira Technologies are promising dramatic improvements in performance and operations thanks to AI. But the gains are contingent on the vendors' ability to sift through mountains of networking data.

As a result, such data is growing in value. For example, companies like The New York Times are moving to prevent generative AI companies from using their data to train AI agents. In response, OpenAI, the maker of ChatGPT, has proposed ways for content owners to license their data for AI training programs.

Indeed, AI technology continues to create sticky legal questions. For example, the US Patent and Trademark Office now has guidelines on how to award patents for inventions developed with the assistance of AI platforms.

Finally, cybersecurity remains a top concern among regulators and others. For example, the FCC recently moved forward with rules that would require mobile network operators to notify customers when their data is illegally accessed.

Go here to see the original:
Is T-Mobile's AI training model the reason it keeps getting hacked? - Light Reading

EU AI Act Takes Another Step Forward – InformationWeek

European lawmakers on Tuesday ratified a provisional agreement that paves the way for landmark legislation for artificial intelligence that will likely have worldwide implications.

While the Biden Administration's AI executive order provides some regulation in the US, the EU's AI Act will represent the first major world power codifying AI protections into law. The landmark rules will establish regulations for AI systems like the popular OpenAI chatbot, ChatGPT, and will rein in governments' use of biometric surveillance.

The European Parliament's civil liberties (LIBE) and internal market (IMCO) committees approved a draft by a vote of 71-8. The regulations were originally proposed in April 2021 and moved ahead quickly last year after the explosive growth of AI sparked by the success of ChatGPT.

With passage in April, the EU would roll out the AI Act in phases between 2024 and 2027 as increasing levels of legal requirements target high-risk AI applications.

Despite the overwhelming support, there were some dissenters. In a statement, LIBE committee member Patrick Breyer said the rules did not go far enough to offer safeguards. "The EU's AI Act opens the door to permanent facial surveillance in real time: Over 7,000 people are wanted by European arrest warrant for the offences listed in the AI Act. Any public space in Europe can be placed under permanent biometric mass surveillance on these grounds."

He added, "This law legitimizes and normalizes a culture of mistrust. It leads Europe into a dystopian future of a mistrustful high-tech surveillance state."

Earlier this month, EU member states endorsed the AI Act deal reached in December.

Margrethe Vestager, the EU's digital chief, said the AI Act was urgently needed considering the recent spread of fake sexually explicit images of pop star Taylor Swift on social media.

"What happened to @taylorswift13 tells it all: the #harm that #AI can trigger if badly used, the responsibility of #platforms, & why it is so important to enforce #tech regulation," she posted on X (formerly Twitter).

Var Shankar, Responsible AI Institute's executive director, tells InformationWeek in an email interview that the EU Act may complement other AI regulatory initiatives. "The EU AI Act represents a thoughtful and comprehensive approach to AI governance and positions the EU as a leader in setting global rules for AI use," he says. "At the same time, it is not clear whether we will see a Brussels effect like we did with GDPR [the EU's General Data Protection Regulation], since the US and China both have well-developed AI governance models and host the largest AI companies."

Shankar adds, "Organizations are also looking to international AI standards and to efforts like the G7's Hiroshima Code of Conduct for Advanced AI Systems to help guide an international consensus on what constitutes responsible AI implementation."

Go here to read the rest:
EU AI Act Takes Another Step Forward - InformationWeek

Revolutionizing Pancreatic Cancer Treatment through Innovative Research – Medriva

A recently published research paper in Oncotarget's Volume 15, titled "Genetic and therapeutic landscapes in cohort of pancreatic adenocarcinomas: next-generation sequencing and machine learning for full tumor exome analysis," has made significant strides in the field of pancreatic cancer research. This study aims to construct a mutational landscape of pancreatic adenocarcinomas (PCa) in the Russian population using cutting-edge techniques like full exome next-generation sequencing (NGS) and machine learning.

The researchers used a limited group of patients to create a comprehensive map of the genetic changes and heterogeneous molecular profile of PCa. This map was made possible by full exome NGS, which allowed for a complete analysis of the tumor exome. This innovative approach gives unprecedented insight into the genetic alterations and mutations that drive the progression of PCa.

Equally impressive is the application of machine learning models on the individual full exome data. This high-tech approach was used to generate personalized recommendations for targeted treatment options for each clinical case. These recommendations were then compiled into a unique therapeutic landscape, providing a comprehensive, personalized view of potential treatment strategies for each patient based on their unique genetic profile.

The results of this study are promising for the future of pancreatic adenocarcinoma treatment. By combining full exome next-generation sequencing with machine learning, researchers can now provide personalized, targeted treatment options based on a patient's unique genetic profile. This approach could revolutionize how we approach and treat pancreatic adenocarcinomas, moving away from a one-size-fits-all approach to a more personalized and effective treatment strategy.

This groundbreaking study represents a significant advancement in the field of personalized medicine. The use of next-generation sequencing and machine learning to analyze the tumor exome in pancreatic adenocarcinomas has opened up new avenues for customized treatment options. These findings not only provide hope for patients diagnosed with PCa but also serve as an inspiration for further research in the field of personalized medicine. As we continue to unlock the secrets of our genetic code, the possibilities for personalized, targeted treatment are almost limitless.

Original post:
Revolutionizing Pancreatic Cancer Treatment through Innovative Research - Medriva

AMD expands its AI development tools with ROCm 6.0 – OC3D

With ROCm 6.0, AMD has expanded its AI development tools with broader software and GPU support. ROCm now supports AMD's Radeon RX 7900 XTX, RX 7900 XT, and RX 7900 GRE, as well as AMD's Radeon Pro W7900 and Pro W7800 GPUs.

With this expanded hardware support, AMD are giving AI researchers and engineers the ability to use a broader selection of Radeon hardware. With support for Radeon desktop GPUs, AMD are offering this support at very affordable pricing levels. Sadly, AMD's ROCm platform does not support AMD's Radeon RX 7800 XT or RX 7700 XT, though this may change with future versions of ROCm.

AMD's announcement reads: "Building on our previously announced support of the AMD Radeon RX 7900 XT, XTX and Radeon PRO W7900 GPUs with AMD ROCm 5.7 and PyTorch, we are now expanding our client-based ML Development offering, both from the hardware and the software side, with AMD ROCm 6.0.

"Firstly, AI researchers and ML engineers can now also develop on Radeon PRO W7800 and on Radeon RX 7900 GRE GPUs. With support for such a broad product portfolio, AMD is helping the AI community to get access to desktop graphics cards at even more price points and at different performance levels.

"Furthermore, we are complementing our solution stack with support for ONNX Runtime. ONNX, short for Open Neural Network Exchange, is an intermediary Machine Learning framework used to convert AI models between different ML frameworks. As a result, users can now perform inference on a wider range of source data on local AMD hardware. This also adds INT8 via MIGraphX, AMD's own graph inference engine, to the available data types (including FP32 and FP16).

"With AMD ROCm 6.0, we are continuing our support for the PyTorch framework, bringing mixed precision with FP32/FP16 to Machine Learning training workflows.

"These are exciting times for anyone deciding to start working on AI. ROCm for AMD Radeon desktop GPUs is a great solution for AI engineers, ML researchers and enthusiasts alike, and no longer remains exclusive to those with large budgets. AMD is determined to keep broadening hardware support and adding more capabilities to our Machine Learning Development solution stack over time."
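For readers curious what FP32/FP16 mixed precision looks like in practice, the sketch below is a minimal PyTorch training loop using autocast and a gradient scaler. The tiny model and random data are placeholders, and on ROCm builds of PyTorch the "cuda" device string maps to AMD GPUs.

    # Minimal mixed-precision (FP32/FP16) training loop sketch in PyTorch.
    # The model and data are toy placeholders, not a real workload.
    import torch
    import torch.nn as nn

    device = "cuda" if torch.cuda.is_available() else "cpu"  # ROCm exposes AMD GPUs as "cuda"
    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    use_amp = device == "cuda"
    scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

    for step in range(100):
        x = torch.randn(32, 128, device=device)
        y = torch.randint(0, 10, (32,), device=device)

        optimizer.zero_grad()
        # The forward pass runs in FP16 where it is safe, FP32 elsewhere.
        with torch.cuda.amp.autocast(enabled=use_amp):
            loss = loss_fn(model(x), y)

        # Scale the loss to avoid FP16 gradient underflow, then step and update.
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()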

On the software side, ROCm 6.0 features support for the ONNX runtime. ONNX (Open Neural Network Exchange) is an open standard for machine learning algorithms and software tools that allows developers to easily convert AI models to new frameworks. This support allows AI researchers and developers to utilise a wider range of source data.
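As a rough sketch of the workflow that ONNX Runtime support enables, the snippet below exports a toy PyTorch model to ONNX and runs it locally with onnxruntime. The model is a placeholder, and which execution provider is available depends on the installed build.

    # Sketch: export a toy PyTorch model to ONNX, then run it with ONNX Runtime.
    import numpy as np
    import torch
    import torch.nn as nn
    import onnxruntime as ort

    model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 2)).eval()
    dummy_input = torch.randn(1, 16)

    # Convert the model to the framework-neutral ONNX format.
    torch.onnx.export(model, dummy_input, "toy_model.onnx",
                      input_names=["input"], output_names=["logits"])

    # Run inference locally; swap in the execution provider your build supports.
    session = ort.InferenceSession("toy_model.onnx", providers=["CPUExecutionProvider"])
    outputs = session.run(None, {"input": np.random.randn(1, 16).astype(np.float32)})
    print(outputs[0].shape)  # (1, 2)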

With ROCm 6.0, AMD is allowing AI developers to work with more affordable graphics cards and with a wider range of data sources. This is great news for both developers and AMD themselves. Developers now have stronger AI development tools from AMD. AMD now has more selling points for their Navi 31 based Radeon graphics cards. That's a win-win.

In the future, we would like to see AMD's RX 7800 XT support ROCm. This would further expand the Radeon GPU options that developers have. If AMD's 16GB RX 7900 GRE can support ROCm, there is no reason why AMD's 16GB RX 7800 XT shouldn't. Perhaps we will see this with ROCm 6.x.

You can join the discussion on AMD's expanded ROCm support for Radeon GPUs on the OC3D Forums.

See more here:
AMD expands its AI development tools with ROCm 6.0 - OC3D

Hugging Face: Everything you need to know about the AI platform – Android Police

Hugging Face is a platform for viewing, sharing, and showcasing machine learning models, datasets, and related work. It aims to make Neural Language Models (NLMs) accessible to anyone building applications powered by machine learning. Many popular AI and machine-learning models are accessible through Hugging Face, including LLaMA 2, an open source language model that Meta developed in partnership with Microsoft.

Hugging Face is a valuable resource for beginners to get started with machine-learning models. You don't need to pay for any special apps or programs to get started. You only need a web browser to browse and test models and datasets on any device, even on budget Chromebooks.

Hugging Face provides machine-learning tools for building applications. Notable tools include the Transformers model library, pipelines for performing machine-learning tasks, and collaborative resources. It also offers dataset, model evaluation, simulation, and machine learning libraries. In short, Hugging Face provides hosting for models and datasets, libraries for building and evaluating models, and a community for collaboration.

Hugging Face receives funding from companies including Google, Amazon, Nvidia, Intel, and IBM. Some of these companies have created open source models accessible through Hugging Face, like the LLaMA 2 model mentioned at the beginning of this article.

The number of models available through Hugging Face can be overwhelming, but it's easy to get started. We walk you through everything you need to know about what you can do with Hugging Face and how to create your own tools and applications.

The core of Hugging Face is the Transformers model library, dataset library, and pipelines. Understanding these services and technologies gives you everything you need to use Hugging Face's resources.

The Transformers model library is a library of open source transformer models. Hugging Face has a library of over 495,000 models grouped into data types called modalities. You can use these models to perform tasks with pipelines, which we explain later in this article.

Some of the tasks you can perform through the Transformers model library include text classification, text generation, summarization, translation, question answering, image classification, and automatic speech recognition.

A complete list of these tasks can be seen on the Hugging Face website, categorized for easy searching.

Within these categories are numerous user-created models to choose from. For example, Hugging Face currently hosts over 51,000 models for Text Generation.

If you aren't sure how to get started with a task, Hugging Face provides in-depth documentation on every task. These docs include use cases, explanations of model and task variants, relevant tools, courses, and demos. For example, the demo on the Text Generation task page uses the Zephyr language model to generate completions from a prompt. Refer to each model's page for instructions on how to use it for the task.

These tools make experimenting with models easy. While some are pre-trained with data, you'll need datasets for others, which is where the datasets library comes into play.

The Hugging Face datasets library is suitable for all machine-learning tasks offered within the Hugging Face model library. Each dataset contains a dataset viewer, a summary of what's included in the dataset, the data size, suggested tasks, data structure, data fields, and other relevant information.

For example, the Wikipedia dataset contains cleaned Wikipedia articles of all languages. It has all the necessary documentation for understanding and using the dataset, including helpful tools like a data visualization map of the sample data. Depending on what dataset you access, you may see different examples.
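As a minimal sketch of how that works (the configuration string is one of the dated snapshots listed on the dataset page and may change over time), loading and inspecting a dataset looks roughly like this:

    # Sketch: load a dataset from the Hugging Face Hub and inspect one record.
    from datasets import load_dataset

    # "20220301.en" is one dated English snapshot of the Wikipedia dataset;
    # other languages and dates are available as separate configurations.
    wiki = load_dataset("wikipedia", "20220301.en", split="train")

    print(wiki)                   # row count and column names
    print(wiki[0]["title"])       # title of the first article
    print(wiki[0]["text"][:200])  # first 200 characters of its text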

Models and datasets are the power behind performing tasks from Hugging Face, but pipelines make it easy to use these models to complete tasks.

Hugging Face's pipelines simplify using models through an API that removes the need to write boilerplate code. You can provide a pipeline with multiple models by specifying which one you want to use for specific actions. For example, you can use one model for generating results from an input and another for analyzing them. Refer to the page of the model that produced the results to interpret its formatted output correctly.

Hugging Face has a full breakdown of the tasks you can use pipelines for.
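As a minimal sketch of that pattern, the example below chains two pipelines: one generates text and a second analyzes it. The model choices are arbitrary small defaults, not recommendations.

    # Sketch: one pipeline generates text, a second pipeline analyzes the result.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    classifier = pipeline("sentiment-analysis")  # falls back to the task's default model

    prompt = "The best thing about learning machine learning is"
    generated = generator(prompt, max_new_tokens=30)[0]["generated_text"]

    # Feed the generated text into the second model for analysis.
    print(generated)
    print(classifier(generated))  # e.g. [{'label': 'POSITIVE', 'score': 0.99}]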

Now that you have an understanding of the models, datasets, and pipelines provided by Hugging Face, you're ready to use these assets to perform tasks.

You only need a browser to get started. We recommend using Google Colab, which lets you write and execute Python code in your browser. It provides free access to computing resources, including GPUs and TPUs, making it ideal for basic machine-learning tasks. Google Colab is easy to use and requires zero setup.

After you've familiarized yourself with Colab, you're ready to install the transformer libraries using the following command:
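    # Run this in a Colab cell; the leading "!" executes it as a shell command.
    !pip install transformers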

Then check it was installed correctly using this command:
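    # In a new cell, import the library and print its version to confirm the install.
    import transformers
    print(transformers.__version__)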

You're now ready to dive into Hugging Face's libraries. There are a lot of places to start, but we recommend Hugging Face's introductory course, which explains the concepts we outlined earlier in detail with examples and quizzes to test your knowledge.

Collaboration is a huge part of Hugging Face, allowing you to discuss models and datasets with other users. Hugging Face encourages collaboration through a discussion forum, a community blog, Discord, and classrooms.

Models and datasets on Hugging Face also have their own forums where you can discuss errors, ask questions, or suggest use cases.

Machine learning and AI are daunting for beginners, but platforms like Hugging Face provide a great way to introduce these concepts. Many of the popular models on Hugging Face are large language models (LLMs), so familiarize yourself with LLMs if you plan to use machine-learning tools for text generation or analysis.

Read more:
Hugging Face: Everything you need to know about the AI platform - Android Police

Microsoft, OpenAI: US Adversaries Armed with GenAI – InformationWeek

Microsoft and OpenAI say Iran, North Korea, Russia, and China have started arming their US cyberattack efforts with generative artificial intelligence (GenAI).

The companies said in a blog post on Microsoft's website Wednesday that they jointly detected and stopped attacks using their AI technologies. The companies listed several examples of specific attacks using large language models to enhance malicious social engineering efforts -- leading to better deepfakes and voice cloning in attempts to crack US systems.

Microsoft said North Korea's Kimsuky cyber group, Iran's Revolutionary Guard, Russia's military, and a Chinese cyberespionage group called Aquatic Panda all used the companies' large language model tools for potential attacks and malicious activity. The attacks from Iran included phishing emails pretending to come from an international development agency and others attempting to lure prominent feminists to an attacker-built website on feminism.

Cyberattacks from foreign adversaries have been steadily increasing in severity and complexity. This month, the Cybersecurity and Infrastructure Security Agency (CISA) said China-backed threat actor Volt Typhoon targeted several Western nations' critical infrastructure and has had access to the systems for at least five years. Experts fear such attacks will only increase in severity as nation-states use GenAI to enhance their efforts.

Nazar Tymoshyk, CEO at cybersecurity firm UnderDefense, tells InformationWeek in a phone interview that even as threats become more sophisticated through GenAI, the fundamentals of cybersecurity should stay the same. The onus for safeguarding, he said, is on the company producing AI. "Every product is AI-enabled, so it's now a feature in every program," he says. "It becomes impossible to distinguish between what's an AI attack. So, it's the company who is responsible to put additional controls in place."

Microsoft called the attack attempts "early stage," and said "our research with OpenAI has not identified significant attacks employing the LLMs we monitor closely. At the same time, we feel this is important research to expose early-stage, incremental moves that we observe well-known threat actors attempting, and share information on how we are blocking and countering them with the defender community."

The companies say hygiene practices like multifactor authentication and zero-trust defenses are still vital weapons against attacks -- AI-enhanced or not. "While attackers will remain interested in AI and probe technologies' current capabilities and security controls, it's important to keep these risks in context," the post states.

In a separate blog post, OpenAI says it will continue to work with Microsoft to identify potential threats using GenAI models.

"Although we work to minimize potential misuse by such actors, we will not be able to stop every instance. But by continuing to innovate and investigate, collaborate, and share, we make it harder for malicious actors to remain undetected across the digital ecosystem and improve the experience for everyone else."

OpenAI declined to make an executive available for comment.

While Microsoft and OpenAI's report was focused on how threat actors are using AI tools for attacks, AI can also be a vector for attack. "That's an important thing to remember with businesses implementing GenAI tools at a feverish pace," Chris "Tito" Sestito, CEO and co-founder of adversarial AI firm HiddenLayer, tells InformationWeek in an email.

"Artificial intelligence is, by a wide margin, the most vulnerable technology ever to be deployed in production systems," Sestito says. "It's vulnerable at a code level, during training and development, post-deployment, over networks, via generative outputs and more. With AI being rapidly implemented across sectors, there has also been a substantial rise in intentionally harmful attacks, proving why defensive solutions to secure AI are needed."

He adds, "Security has to maintain pace with AI to accelerate innovation. That's why it's imperative to safeguard your most valuable assets from development to implementation. Companies must regularly update and refine their AI-specific security program to address new challenges and vulnerabilities."

See the rest here:
Microsoft, OpenAI: US Adversaries Armed with GenAI - InformationWeek

Top 5 Robot Trends 2024 International Federation of Robotics reports | RoboticsTomorrow – Robotics Tomorrow

The stock of operational robots around the globe hit a new record of about 3.9 million units. Demand is driven by a number of exciting technological innovations. The International Federation of Robotics reports on the top 5 automation trends in 2024:

The trend of using Artificial Intelligence in robotics and automation keeps growing. The emergence of generative AI opens up new solutions. This subset of AI is specialized in creating something new from things it has learned via training, and has been popularized by tools such as ChatGPT. Robot manufacturers are developing generative AI-driven interfaces which allow users to program robots more intuitively by using natural language instead of code. Workers will no longer need specialized programming skills to select and adjust the robot's actions.

Another example is predictive AI analyzing robot performance data to identify the future state of equipment. Predictive maintenance can save manufacturers machine downtime costs. In the automotive parts industry, each hour of unplanned downtime is estimated to cost US$1.3 million, the Information Technology & Innovation Foundation reports. This indicates the massive cost-saving potential of predictive maintenance. Machine learning algorithms can also analyze data from multiple robots performing the same process for optimization. In general, the more data a machine learning algorithm is given, the better it performs.
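To make the idea concrete, here is a small, purely illustrative sketch (synthetic data and hypothetical feature names, not any vendor's product) of one common approach: fit an anomaly detector on telemetry from healthy robots, then flag readings that drift away from that baseline so maintenance can be scheduled before a failure.

    # Illustrative predictive-maintenance sketch: flag anomalous robot telemetry.
    # The data is synthetic; a real system would use logged sensor histories.
    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(0)

    # Telemetry from healthy operation: [joint torque, vibration, motor temperature]
    healthy = rng.normal(loc=[50.0, 0.2, 60.0], scale=[5.0, 0.05, 3.0], size=(1000, 3))

    detector = IsolationForest(contamination=0.01, random_state=0).fit(healthy)

    # New readings: the second shows elevated vibration and temperature.
    new_readings = np.array([
        [51.0, 0.22, 61.0],
        [55.0, 0.60, 75.0],
    ])
    print(detector.predict(new_readings))  # 1 = normal, -1 = anomaly -> schedule maintenance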

Human-robot collaboration continues to be a major trend in robotics. Rapid advances in sensors, vision technologies and smart grippers allow robots to respond in real-time to changes in their environment and thus work safely alongside human workers.

Collaborative robot applications offer a new tool for human workers, relieving and supporting them. They can assist with tasks that require heavy lifting, repetitive motions, or work in dangerous environments.

The range of collaborative applications offered by robot manufacturers continues to expand.

A recent market development is the increase of cobot welding applications, driven by a shortage of skilled welders. This demand shows that automation is not causing a labor shortage but rather offers a means to solve it. Collaborative robots will therefore complement, not replace, investments in traditional industrial robots, which operate at much faster speeds and will therefore remain important for improving productivity in response to tight product margins.

New competitors are also entering the market with a specific focus on collaborative robots. Mobile manipulators, the combination of collaborative robot arms and mobile robots (AMRs), offer new use cases that could expand the demand for collaborative robots substantially.

Mobile manipulators, so-called MoMas, are automating material handling tasks in industries such as automotive, logistics or aerospace. They combine the mobility of robotic platforms with the dexterity of manipulator arms. This enables them to navigate complex environments and manipulate objects, which is crucial for applications in manufacturing. Equipped with sensors and cameras, these robots perform inspections and carry out maintenance tasks on machinery and equipment. One of the significant advantages of mobile manipulators is their ability to collaborate with and support human workers. A shortage of skilled labor and a lack of staff applying for factory jobs are likely to increase demand.

Digital twin technology is increasingly used as a tool to optimize the performance of a physical system by creating a virtual replica. Since robots are more and more digitally integrated in factories, digital twins can use their real-world operational data to run simulations and predict likely outcomes. Because the twin exists purely as a computer model, it can be stress-tested and modified with no safety implications while saving costs. All experimentation can be checked before the physical world itself is touched. Digital twins bridge the gap between digital and physical worlds.

Robotics is witnessing significant advancements in humanoids, designed to perform a wide range of tasks in various environments. The human-like design with two arms and two legs allows the robot to be used flexibly in work environments that were actually created for humans. It can therefore be easily integrated, for example, into existing warehouse processes and infrastructure.

The Chinese Ministry of Industry and Information Technology (MIIT) recently published detailed goals for the country's ambitions to mass-produce humanoids by 2025. The MIIT predicts humanoids are likely to become another disruptive technology, similar to computers or smartphones, that could transform the way we produce goods and the way humans live.

The potential impact of humanoids on various sectors makes them an exciting area of development, but their mass market adoption remains a complex challenge. Costs are a key factor and success will depend on their return on investment competing with well-established robot solutions like mobile manipulators, for example.

"The five mutually reinforcing automation trends in 2024 show that robotics is a multidisciplinary field where technologies are converging to create intelligent solutions for a wide range of tasks," says Marina Bill, President of the International Federation of Robotics. "These advances continue to shape the merging industrial and service robotics sectors and the future of work."

Read more:
Top 5 Robot Trends 2024 International Federation of Robotics reports | RoboticsTomorrow - Robotics Tomorrow
