Category Archives: DeepMind
DeepMind’s AI finds new solution to decades-old math puzzle outsmarting humans – TNW
DeepMind has used a large language model (LLM) to generate a novel solution to one of humanity's toughest math problems, in a breakthrough that could herald a new era in AI development.
The model, known as FunSearch, discovered a solution to the so-called cap set puzzle. The decades-old math conundrum essentially comes down to how many dots you can jot down on a page while drawing lines between them, without any three of them ever forming a straight line.
If that gave you a migraine, don't worry. What's important to note is that the problem has never been solved, and researchers have only ever found solutions for small dimensions. Until now.
FunSearch successfully discovered new constructions for large cap sets that far exceeded the best-known ones. While the LLM didn't solve the cap set problem once and for all (contrary to some of the news headlines swirling around), it did find facts new to science.
"To the best of our knowledge, this shows the first scientific discovery, a new piece of verifiable knowledge about a notorious scientific problem, using an LLM," wrote the researchers in a paper published in Nature this week.
In previous experiments, researchers have used large language models to solve maths problems with known solutions.
FunSearch works by combining a pre-trained LLM, in this case a version of Google's PaLM 2, with an automated evaluator. This fact-checker guards against the production of false information.
LLMs have been shown to regularly produce so-called hallucinations, basically when they just make shit up and present it as fact. This has, naturally, limited their usefulness in making verifiable scientific discoveries. However, researchers at the London-based lab claim that the use of an in-built fact-checker makes FunSearch different.
FunSearch engages in a continuous back-and-forth dance between the LLM and the evaluator. This process transforms initial solutions into new knowledge.
What also makes the tool quite promising for scientists is that it outputs programs that reveal how its solutions are constructed, rather than just what the solutions are.
"We hope this can inspire further insights in the scientists who use FunSearch, driving a virtuous cycle of improvement and discovery," said the researchers.
DeepMind AI with built-in fact-checker makes mathematical discoveries – New Scientist
DeepMind's FunSearch AI can tackle mathematical problems
Google DeepMind claims to have made the first ever scientific discovery with an AI chatbot by building a fact-checker to filter out useless outputs, leaving only reliable solutions to mathematical or computing problems.
Previous DeepMind achievements, such as using AI to predict the weather or protein shapes, have relied on models created specifically for the task at hand, trained on accurate and specific data. Large language models (LLMs), such as GPT-4 and Google's Gemini, are instead trained on vast amounts of varied data to create a breadth of abilities. But that approach also makes them susceptible to hallucination, a term researchers use for producing false outputs.
Gemini, which was released earlier this month, has already demonstrated a propensity for hallucination, getting even simple facts such as the winners of this year's Oscars wrong. Google's previous AI-powered search engine even made errors in the advertising material for its own launch.
One common fix for this phenomenon is to add a layer above the AI that verifies the accuracy of its outputs before passing them to the user. But creating a comprehensive safety net is an enormously difficult task given the broad range of topics that chatbots can be asked about.
Alhussein Fawzi at Google DeepMind and his colleagues have created a generalised LLM called FunSearch, based on Google's PaLM 2 model, with a fact-checking layer, which they call an evaluator. The model is constrained to providing computer code that solves problems in mathematics and computer science, which DeepMind says is a much more manageable task because these new ideas and solutions are inherently and quickly verifiable.
The underlying AI can still hallucinate and provide inaccurate or misleading results, but the evaluator filters out erroneous outputs and leaves only reliable, potentially useful concepts.
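To make that filtering concrete, here is a minimal Python sketch of an automated evaluator of this general kind: it compiles each generated program, runs it against known test cases, and silently discards anything that crashes or answers incorrectly. The toy task, the solve entry point and the scoring rule are illustrative assumptions, not DeepMind's actual implementation.

```python
# Minimal sketch of an automated evaluator that filters generated programs.
# The toy task, the `solve` entry point and the scoring are illustrative
# assumptions, not DeepMind's actual code.

def evaluate_candidate(source_code, test_cases):
    """Compile a candidate program, run it on test cases, and score it.

    Returns a numeric score, or None if the program crashes or gives an
    invalid answer, so erroneous outputs are simply discarded.
    """
    namespace = {}
    try:
        exec(source_code, namespace)          # define the candidate's functions
        solve = namespace["solve"]            # assumed entry point
        score = 0
        for x, expected in test_cases:
            if solve(x) != expected:          # validity check for this toy task
                return None
            score += 1
    except Exception:
        return None                           # broken or nonsensical code is filtered
    return score


# Toy usage: only the correct candidate survives the evaluator.
candidates = [
    "def solve(x):\n    return x * x",           # correct
    "def solve(x):\n    return x + '2'",         # crashes on integers
    "def solve(x):\n    return undefined_name",  # NameError at call time
]
tests = [(2, 4), (3, 9), (10, 100)]
kept = [c for c in candidates if evaluate_candidate(c, tests) is not None]
print(len(kept), "of", len(candidates), "candidates pass the evaluator")
```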
"We think that perhaps 90 per cent of what the LLM outputs is not going to be useful," says Fawzi. "Given a candidate solution, it's very easy for me to tell you whether this is actually a correct solution and to evaluate the solution, but actually coming up with a solution is really hard. And so mathematics and computer science fit particularly well."
DeepMind claims the model can generate new scientific knowledge and ideas, something LLMs haven't done before.
To start with, FunSearch is given a problem and a very basic solution in source code as an input, then it generates a database of new solutions that are checked by the evaluator for accuracy. The best of the reliable solutions are given back to the LLM as inputs with a prompt asking it to improve on the ideas. DeepMind says the system produces millions of potential solutions, which eventually converge on an efficient result, sometimes surpassing the best-known solution.
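That back-and-forth can be pictured as a simple evolve-and-evaluate loop. In the hedged Python sketch below, llm_propose stands in for the language model (here it merely recombines and perturbs numeric "programs"), and the pool size, selection rule and scoring function are all illustrative assumptions rather than details of DeepMind's system.

```python
import random

# Sketch of the evolutionary loop described above: keep a pool of scored
# "programs", ask a stand-in LLM to build on the best ones, evaluate the
# result, and feed improvements back into the pool.

def llm_propose(parents):
    """Stand-in for the LLM: combine parent 'programs' and add variation."""
    base = sum(parents) / len(parents)
    return base + random.uniform(-1.0, 1.0)

def evaluate(program):
    """Toy evaluator: higher is better, with the optimum at 10."""
    return -abs(program - 10.0)

def funsearch_style_loop(iterations=1000, pool_size=20):
    pool = [random.uniform(0, 5) for _ in range(pool_size)]   # seed solutions
    for _ in range(iterations):
        parents = random.sample(pool, 2)                      # select from the pool
        child = llm_propose(parents)                          # "LLM" suggests a new program
        if evaluate(child) > min(evaluate(p) for p in pool):  # keep only improvements
            pool.remove(min(pool, key=evaluate))              # drop the worst member
            pool.append(child)
    return max(pool, key=evaluate)

print(round(funsearch_style_loop(), 2))   # converges toward the optimum of 10
```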
For mathematical problems, the model writes computer programs that can find solutions rather than trying to solve the problem directly.
Fawzi and his colleagues challenged FunSearch to find solutions to the cap set problem, which involves determining patterns of points where no three points make a straight line. The problem gets rapidly more computationally intensive as the number of points grows. The AI found a solution consisting of 512 points in eight dimensions, larger than any previously known.
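One reason the cap set problem suits this evaluator-driven approach is that candidate sets are cheap to verify even though large ones are hard to find. The Python sketch below checks the cap set property under the standard formulation over points in {0, 1, 2}^n, where three distinct points lie on a line exactly when they sum to the zero vector modulo 3; it is an illustrative verifier, not code from the FunSearch paper.

```python
from itertools import combinations

# Verifier for the cap set property: no three distinct points in {0,1,2}^n
# may lie on a common line, i.e. sum to the zero vector coordinate-wise mod 3.
# Finding large cap sets (such as the 512-point set in eight dimensions) is
# the hard part; checking a candidate is easy.

def is_cap_set(points, n):
    pts = [tuple(p) for p in points]
    if len(set(pts)) != len(pts):
        return False                                   # points must be distinct
    point_set = set(pts)
    for a, b in combinations(pts, 2):
        # The unique third point that would complete a line through a and b:
        c = tuple((-a[i] - b[i]) % 3 for i in range(n))
        if c != a and c != b and c in point_set:
            return False
    return True

# Tiny usage example: a valid 4-point cap set in 2 dimensions.
example = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(is_cap_set(example, 2))   # True
```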
When tasked with the bin-packing problem, where the aim is to efficiently place objects of various sizes into containers, FunSearch found solutions that outperform commonly used algorithms, a result that has immediate applications for transport and logistics companies. DeepMind says FunSearch could lead to improvements in many more mathematical and computing problems.
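For bin packing, what gets evolved is essentially a heuristic that decides where each arriving item should go. The Python sketch below shows the general shape of such an online heuristic; the particular priority rule (a best-fit-style score) is an illustrative assumption, not the heuristic FunSearch actually discovered.

```python
# Online bin packing with a swappable heuristic: items arrive one at a time
# and a priority function scores each open bin that can still hold the item.
# The priority function is the part a FunSearch-style search would evolve.

def priority(item, remaining_capacity):
    """Score a candidate bin; higher means more preferred (best-fit flavour)."""
    return -(remaining_capacity - item)   # prefer bins the item fills most tightly

def pack(items, bin_capacity):
    bins = []                                   # remaining capacity of each open bin
    for item in items:
        candidates = [i for i, cap in enumerate(bins) if cap >= item]
        if candidates:
            best = max(candidates, key=lambda i: priority(item, bins[i]))
            bins[best] -= item
        else:
            bins.append(bin_capacity - item)    # no bin fits, open a new one
    return len(bins)

# Toy usage: six items packed into bins of capacity 10.
print(pack([4, 7, 3, 6, 5, 5], bin_capacity=10))   # 3 bins
```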
Mark Lee at the University of Birmingham, UK, says the next breakthroughs in AI won't come from scaling up LLMs to ever-larger sizes, but from adding layers that ensure accuracy, as DeepMind has done with FunSearch.
"The strength of a language model is its ability to imagine things, but the problem is hallucinations," says Lee. "And this research is breaking that problem: it's reining it in, or fact-checking. It's a neat idea."
Lee says AIs shouldn't be criticised for producing large amounts of inaccurate or useless outputs, as this is not dissimilar to the way that human mathematicians and scientists operate: brainstorming ideas, testing them and following up on the best ones while discarding the worst.
Google’s DeepMind creates generative AI model with fact checker to crack unsolvable math problem – SiliconANGLE News
Google LLC's DeepMind artificial intelligence research unit claims to have cracked an unsolvable math problem using a large language model-based chatbot equipped with a fact-checker to filter out useless outputs.
By using a filter, DeepMind researchers say the LLM can generate millions of responses, but only submit the ones that can be verified as accurate.
It's a milestone achievement, as previous DeepMind breakthroughs have generally relied on AI models that were specifically created to solve the task at hand, such as predicting weather or designing new protein shapes. Those models were trained on very accurate and specific datasets, which makes them quite different from LLMs such as OpenAI's GPT-4 or Google's Gemini.
Those LLMs are trained on vast and varied datasets, enabling them to perform a wide range of tasks and talk about almost any subject. But the approach carries risks, as LLMs are susceptible to so-called hallucinations, which is the term for producing false outputs.
Hallucinations are a big problem for LLMs. Gemini, which was only released this month and is said to be Google's most capable LLM ever, has already shown it's vulnerable, inaccurately answering fairly simple questions such as who won this year's Oscars.
Researchers believe that hallucinations can be fixed by adding a layer above the AI model that verifies the accuracy of its outputs before passing them onto users. But this kind of safety net is tricky to build when LLMs have been trained to discuss such a wide range of topics.
At DeepMind, Alhussein Fawzi and his team members created a generalized LLM called FunSearch, which is based on Google's PaLM 2 model. They added a fact-checking layer, called an evaluator. In this case, FunSearch has been geared to solving only math and computer science problems by generating computer code. According to DeepMind, this makes it easier to create a fact-checking layer, because its outputs can be rapidly verified.
Although the FunSearch model is still susceptible to hallucinations and generating inaccurate or misleading results, the evaluator can easily filter them out, and ensure the user only receives reliable outputs.
"We think that perhaps 90% of what the LLM outputs is not going to be useful," Fawzi said. "Given a candidate solution, it's very easy for me to tell you whether this is actually a correct solution and to evaluate the solution, but actually coming up with a solution is really hard. And so mathematics and computer science fit particularly well."
According to Fawzi, FunSearch is able to generate new scientific knowledge and ideas, which is a new milestone for LLMs.
The researchers tested its abilities by giving it a problem, plus a very basic solution in source code, as an input. Then, the model generated a database of new solutions that were checked by the evaluator for their accuracy. The most reliable of those solutions are then fed back into the LLM as inputs, together with a prompt asking it to improve on its ideas. According to Fawzi, by doing it this way, FunSearch produces millions of potential solutions that eventually converge to create the most efficient result.
When tasked with mathematical problems, FunSearch writes computer code that can find the solution, rather than trying to tackle it directly.
Fawzi and his team tasked FunSearch with finding a solution to the cap set problem, which involves determining patterns in points, where no three points make a straight line. As the number of points grows, the problem becomes vastly more complex.
However, FunSearch was able to create a solution consisting of 512 points across eight dimensions, which is larger than any human mathematician has managed. The results of the experiment were published in the journal Nature.
Although most people are unlikely ever to come across the cap set problem, let alone attempt to solve it, it's an important achievement. Even the best human mathematicians do not agree on the best way to solve this challenge. According to Terence Tao, a professor at the University of California, Los Angeles, who describes the cap set problem as his favorite open question, FunSearch is an extremely promising paradigm since it can potentially be applied to many other math problems.
FunSearch proved as much when tasked with the bin-packing problem, where the goal is to efficiently place objects of different sizes into as few containers as possible. Fawzi said FunSearch was able to find solutions that outperform the best algorithms created to solve this particular problem. Its results could have significant implications in industries such as transport and logistics.
FunSearch is also notable because, unlike with other LLMs, users can actually see how it goes about generating its outputs, meaning they can learn from it. This sets it apart from other LLMs, where the AI is more akin to a black box.
DeepMind claims its AI can tackle unsolved mathematics – SiliconRepublic.com
The company said its FunSearch AI model has an automated evaluator to prevent hallucinations, allowing the model to find the best answers for advanced problems.
Google-owned DeepMind claims one of its AI models found a new answer for an unsolved mathematical problem, by tackling one of the biggest issues in large language models (LLMs).
This key issue is the tendency for these AI models to share factually incorrect information, which is commonly referred to as hallucinations. This issue has been noted in many popular AI models, such as ChatGPT, which has faced lawsuits for defaming individuals.
DeepMind claims its AI model FunSearch tackles this issue by including an automated evaluator that protects against hallucinations and incorrect ideas.
The company tested this model on an unsolved maths problem known as the cap set problem, which involves finding the largest size of a certain type of set. DeepMind claims FunSearch discovered new constructions of large cap sets that go beyond the best known ones.
"In addition, to demonstrate the practical usefulness of FunSearch, we used it to discover more effective algorithms for the bin-packing problem, which has ubiquitous applications such as making data centres more efficient," DeepMind said in a blog post.
The AI model contains the automated evaluator and a pre-trained LLM that aims to provide creative solutions to problems. DeepMind claims the back-and-forth of these two components creates an evolutionary method of finding the best ways to solve a problem.
Problems are presented to the AI model in the form of code, which includes a procedure to evaluate programs and a seed program used to initialise a pool of programs. DeepMind said FunSearch then selects some programs and creatively builds upon them.
The results are evaluated and the best ones are added back to the pool of existing programs, which creates a self-improving loop, according to DeepMind.
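As a rough illustration of what a problem "presented in the form of code" can look like, the hedged Python sketch below pairs an evaluate procedure with a deliberately weak seed program on a toy task; the task and every name in it are assumptions made for illustration, not one of DeepMind's benchmark problems.

```python
# Sketch of a problem specification for a FunSearch-style system: an
# evaluate() procedure that scores candidate programs, plus a trivial seed
# program used to initialise the pool. The toy objective (hit as many
# target numbers as possible) is purely illustrative.

TARGETS = {3, 7, 11, 15, 19}

def evaluate(candidate_fn):
    """Score a candidate program by how many target numbers it proposes."""
    try:
        proposed = set(candidate_fn())
    except Exception:
        return float("-inf")                  # broken programs score worst
    return len(proposed & TARGETS)

def seed_program():
    """Deliberately weak starting point; the search must improve on it."""
    return [1, 2, 3]

pool = [seed_program]                         # the initial pool of programs
print("seed score:", evaluate(seed_program))  # 1 (only 3 hits a target)
```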
"FunSearch demonstrates that if we safeguard against LLMs' hallucinations, the power of these models can be harnessed not only to produce new mathematical discoveries, but also to reveal potentially impactful solutions to important real-world problems," DeepMind said.
DeepMind has claimed to hit multiple breakthroughs with the power of AI. Last year, DeepMind claimed its AlphaFold model predicted the structure of nearly every protein known to science, more than 200 million in total.
At the end of October, DeepMind claimed the next version of AlphaFold can predict nearly all molecules in the Protein Data Bank, a database for the 3D structures of various biological molecules.
DeepMind also claims that one of its AI models, GraphCast, can predict weather conditions up to 10 days in advance, more accurately than standard industry methods. Meanwhile, the company claims one of its AI models has been used by researchers to create hundreds of new materials in laboratory settings.
Google DeepMind announces the FunSearch training method – Gizchina.com
Google DeepMind has introduced a new method called FunSearch, which uses large language models (LLMs) to search for new solutions in mathematics and computer science. The method is described in a paper published in Nature. FunSearch is an evolutionary method that promotes and develops the highest-scoring ideas, expressed as computer programs, with the running and evaluation of those programs handled automatically. FunSearch uses Google's PaLM 2, but it is compatible with other LLMs trained on code.
According to Google, FunSearch can tackle upper-bound problems and a range of other complex problems involving mathematics and computer science. The FunSearch training method chiefly introduces an evaluator system alongside the AI model: the model outputs a series of creative problem-solving programs, and the evaluator is responsible for judging them. After multiple iterations, this yields an AI model with stronger mathematical capabilities.
Google DeepMind used the PaLM 2 model for testing. The researchers established a dedicated code pool, input a series of questions to the model in code form, and set up an evaluator process. In each iteration, the model automatically drew problems from the code pool, generated creative new solutions and submitted them to the evaluator. The best solutions were re-added to the code pool to start another iteration.
FunSearch uses an iterative procedure. First, the user writes a description of the problem in the form of code; this description comprises a procedure to evaluate programs and a seed program used to initialise a pool of programs. At each iteration, the system selects some programs from the current pool and feeds them to the LLM. The LLM creatively builds upon these and generates new programs, which are automatically evaluated. The best ones are added back to the pool of existing programs, creating a self-improving loop.
Discovering new mathematical knowledge and algorithms in different domains is a notoriously difficult task, largely beyond the power of the most advanced AI systems. Tackling such challenging problems with FunSearch requires multiple key components. FunSearch generates programs that describe how its solutions were arrived at. This show-your-working approach is how scientists generally operate, with discoveries or phenomena explained through the process used to produce them. FunSearch also favours solutions represented by highly compact programs, that is, solutions with low Kolmogorov complexity: short programs can describe very large objects, allowing FunSearch to scale.
Google said the FunSearch training method is particularly good at discrete mathematics (combinatorics): a model trained with it can tackle problems in extremal combinatorics. In a press release, the researchers described how the model works through upper-bound problems of this kind, a central concern of mathematics involving counting and permutations.
To test its versatility, the researchers used FunSearch to approach another hard problem in math: the bin-packing problem, which involves trying to fit items of different sizes into as few containers as possible. The researchers left out the lines in the program that would specify how to solve it. That is where FunSearch comes in: it gets Codey to fill in the blanks, in effect suggesting code that will solve the problem. A second algorithm then checks and scores what Codey comes up with. The best suggestions, even if not yet correct, are saved and given back to Codey, which tries to complete the program again. "Many will be nonsensical, some will be sensible, and a few will be decent," says Kohli. "You take those truly inspired ones and you say, okay, take these ones and repeat."
The bin-packing problem is the problem of putting items of different sizes into a minimum number of containers. FunSearch's solution is a just-in-time one: it generates a program that automatically adjusts to the sizes of the items it has to place. Researchers mentioned that, compared with other AI training methods that use neural networks to learn, the code output by a model trained with FunSearch is easier to check and deploy, which means it is easier to integrate into real industrial environments.
The AI system, called FunSearch, made progress on Set-inspired problems in combinatorics, a field of mathematics that studies how to count the possible arrangements of sets. FunSearch automatically creates requests for a specially trained LLM, asking it to write short computer programs that can generate solutions to a particular scenario. The system then checks quickly to see whether those solutions are better than known ones. If not, it provides feedback to the LLM so that it can improve at the next round. "The way we use the LLM is as a creativity engine," says DeepMind computer scientist Bernardino Romera-Paredes. Not all programs that the LLM generates are useful, and some are so incorrect that they wouldn't even be able to run, he says.
What I find really exciting, even more so than the specific results we found, is the prospects it suggests for the future of human-machine interaction in math. Instead of generating a solution, FunSearch generates a program that finds the solution. A solution to a specific problem might give me no insight into how to solve other related problems, but a program that finds the solution is something that can be studied and adapted to related problems. On the strength of such results, artificial intelligence researchers claim to have made the world's first scientific discovery using an LLM, a breakthrough that suggests the technology behind ChatGPT and similar programs can generate genuinely new scientific knowledge.
In short, FunSearch is a new method that uses large language models to search for new solutions in mathematics and computer science, described in a paper in Nature, a top academic journal. It is an evolutionary method that promotes and develops the highest-scoring ideas as computer programs, with running and evaluation handled automatically: the system selects some programs from the current pool and feeds them to an LLM, which builds on them. FunSearch is built on Google's PaLM 2 but is compatible with other LLMs trained on code, and its improved algorithms could optimise logistics and reduce energy consumption in manufacturing.
Google’s AI powerhouse finds millions of new crystals that could change the fate of humanity forever and, for better … – TechRadar
Researchers at Google DeepMind have used artificial intelligence to discover new crystals and inorganic materials that could power future technologies as part of a landmark study.
Using the Graph Networks for Materials Exploration (GNoME) deep learning tool, researchers found 2.2 million new crystals, including 380,000 stable materials.
The discovery could represent a landmark moment in the discovery of materials used to power modern technologies, such as computer chips, batteries, and solar panels - all of which rely on inorganic crystals.
Availability and stability of these materials are a common hurdle in the development of such technologies. However, researchers said that by using the GNoME AI tool, they were able to dramatically increase the speed and efficiency of discovery by predicting the stability of new materials.
"To enable new technologies, crystals must be stable, otherwise they can decompose, and behind each new, stable crystal can be months of painstaking experimentation," the study noted.
"With GNoME, we've multiplied the number of technologically viable materials known to humanity. Of its 2.2 million predictions, 380,000 are the most stable, making them promising candidates for experimental synthesis."
Among these candidates are materials that have the potential to enable future transformative technologies, ranging from superconductors for powering supercomputers to next-generation batteries that could boost the efficiency of electric vehicles.
Google DeepMind's findings were published in the journal Nature, with the firm noting that, over the last decade, more than 28,000 new materials had been discovered following extensive research.
However, traditional AI-based approaches to searching for novel crystal structures have typically been expensive, trial-and-error processes that could take months to deliver minimal results.
"AI-guided approaches hit a fundamental limit in their ability to accurately predict materials that could be experimentally viable," the study said.
GNoME's recent discovery of 2.2 million materials would be equivalent to about 800 years' worth of knowledge, researchers said, which highlights the transformative power and accuracy now afforded to scientists operating in the field.
Around 52,000 new compounds similar to graphene have been discovered as part of the project, which the study said has the potential to revolutionize electronics and superconductor development.
Previously, just 1,000 materials of this kind had been identified using earlier techniques.
"We also found 528 potential lithium-ion conductors, 25 times more than a previous study, which could be used to improve the performance of rechargeable batteries."
Long-term, Google DeepMind researchers said the GNoME project aims to drive down the cost of discovering new materials.
So far, external researchers have created 736 of the materials discovered through GNoME in a lab environment. The company also plans to release its database of newly discovered crystals and share its findings with the research community.
"By giving scientists the full catalog of the promising recipes for new candidate materials, we hope this helps them to test and potentially make the best ones."
Discovering the Potential of AI in Mathematics: A Breakthrough by Google DeepMind – Medriva
Discovering the Potential of AI in Mathematics
Artificial Intelligence is making strides in various fields, and mathematics is no exception. Google DeepMind's new AI tool, FunSearch, is a significant breakthrough in this regard. The tool is not limited to specific tasks like its predecessors. Instead, it utilizes a large language model named Codey to unveil new mathematical functions, demonstrating the potential of AI in fundamental math and computer science.
The process is fascinating. FunSearch suggests code that potentially solves complex mathematical problems. The incorrect or nonsensical answers are rejected, and the good ones are plugged in. After millions of suggestions and repetitions, FunSearch has successfully produced a correct and previously unknown solution to the cap set problem, a niche but important problem in mathematics. This achievement is a testament to the fact that large language models are indeed capable of making groundbreaking discoveries.
The Google DeepMind FunSearch tool is a project developed by DeepMind with the primary goal of helping researchers and developers discover and explore new solutions to problems in mathematics and computer science. It pairs a language model with an automated evaluator to sift through a massive number of candidate programs and keep only the relevant, correct ones.
The ultimate aim of FunSearch is to foster innovation and collaboration within the AI community. The tool has been instrumental in solving complex problems, such as the cap set problem, demonstrating its potential to revolutionize the way we approach problem-solving in various fields.
Despite the evident success of FunSearch, the researchers behind it confess that they don't fully understand why it works as efficiently as it does. However, the results are undeniable and speak volumes about the tool's effectiveness. The FunSearch tool is a testament to the unexplored potential of artificial intelligence, demonstrating the capacity of AI to make meaningful contributions to diverse fields.
Developed by Google's AI research lab, DeepMind, FunSearch takes its name from the mathematical functions it searches for, rather than from finding "fun" content. It uses a pre-trained language model to propose candidate programs and an automated evaluator to keep only those that verifiably solve the problem at hand, which makes it a practical tool for researchers tackling hard combinatorial questions.
As artificial intelligence continues to evolve, tools like FunSearch are likely to become invaluable assets in various fields, including mathematics, computer science, and beyond. The success of FunSearch is a significant step forward in the journey of AI. It underscores the transformative potential of AI, which is no longer limited to the digital world but is also making substantial contributions to fundamental sciences.
Gemini AI: What do we know about Google’s answer to ChatGPT? – Livescience.com
Google DeepMind has released a rival to ChatGPT, named Gemini, and it can understand and generate multiple types of media including images, videos, audio, and text.
Most artificial intelligence (AI) tools only understand and generate one type of content. For example, OpenAI's ChatGPT, "reads" and creates only text. But Gemini can generate multiple types of output based on any form of input, Google said in a blog post.
The three versions of Gemini 1.0 are Gemini Ultra, the largest version; Gemini Pro, which is being rolled out into Google's digital services; and Gemini Nano, designed to be used on devices like smartphones.
According to DeepMind's technical report on the chatbot, Gemini Ultra beat GPT-4 and other leading AI models in 30 of 32 key academic benchmarks used in AI research and development. These include high school exams and tests on morality and law.
Specifically, Gemini won out in nine image comprehension benchmarks, six video understanding tests, five in speech recognition and translation, and 10 of 12 text and reasoning benchmarks. The two in which Gemini Ultra failed to beat GPT-4 were in common-sense reasoning, according to the report.
Building models that process multiple forms of media is hard because biases in the training data are likely to be amplified, performance tends to drop significantly, and models tend to overfit, meaning they perform well when tested against the training data but can't perform when exposed to new input.
Multimodal training also normally involves training different components of a model separately, each on a single type of medium and then stitching these components together. But Gemini was trained jointly across text, image, audio and video data at the same time. Scientists sourced this data from web documents, books and code.
Scientists trained Gemini by curating the training data and incorporating human supervision in the feedback process.
The team deployed servers across multiple data centers on a much grander scale than previous AI training efforts and relied on thousands of Google's AI accelerator chips, known as tensor processing units (TPUs).
Google built these chips specifically to speed up model training, and DeepMind packaged them into clusters of 4,096 chips known as "SuperPods" before training its system. The overall result of the reconfigured infrastructure and methods was that the goodput, the volume of genuinely useful data that moved through the system (as opposed to throughput, which is all data), increased from 85% in previous training endeavors to 97%, according to the technical report.
DeepMind scientists envision the technology being used in scenarios such as a person uploading photos of a meal being prepared in real-time, and Gemini responding with instructions on the next step in the process.
That said, the scientists did concede that hallucinations, a phenomenon in which AI models return false information with maximum confidence, remain an issue for Gemini. Hallucinations are normally caused by limitations or biases in the training data, and they're difficult to eradicate.
Advancements in machine learning for machine learning Google Research Blog – Google Research
Posted by Phitchaya Mangpo Phothilimthana, Staff Research Scientist, Google DeepMind, and Bryan Perozzi, Senior Staff Research Scientist, Google Research
With the recent and accelerated advances in machine learning (ML), machines can understand natural language, engage in conversations, draw images, create videos and more. Modern ML models are programmed and trained using ML programming frameworks, such as TensorFlow, JAX, PyTorch, among many others. These libraries provide high-level instructions to ML practitioners, such as linear algebra operations (e.g., matrix multiplication, convolution, etc.) and neural network layers (e.g., 2D convolution layers, transformer layers). Importantly, practitioners need not worry about how to make their models run efficiently on hardware because an ML framework will automatically optimize the user's model through an underlying compiler. The efficiency of the ML workload, thus, depends on how good the compiler is. A compiler typically relies on heuristics to solve complex optimization problems, often resulting in suboptimal performance.
In this blog post, we present exciting advancements in ML for ML. In particular, we show how we use ML to improve efficiency of ML workloads! Prior works, both internal and external, have shown that we can use ML to improve performance of ML programs by selecting better ML compiler decisions. Although there exist a few datasets for program performance prediction, they target small sub-programs, such as basic blocks or kernels. We introduce TpuGraphs: A Performance Prediction Dataset on Large Tensor Computational Graphs (presented at NeurIPS 2023), which we recently released to fuel more research in ML for program optimization. We hosted a Kaggle competition on the dataset, which recently completed with 792 participants on 616 teams from 66 countries. Furthermore, in Learning Large Graph Property Prediction via Graph Segment Training, we cover a novel method to scale graph neural network (GNN) training to handle large programs represented as graphs. The technique both enables training arbitrarily large graphs on a device with limited memory capacity and improves generalization of the model.
ML compilers are software routines that convert user-written programs (here, mathematical instructions provided by libraries such as TensorFlow) to executables (instructions to execute on the actual hardware). An ML program can be represented as a computation graph, where a node represents a tensor operation (such as matrix multiplication), and an edge represents a tensor flowing from one node to another. ML compilers have to solve many complex optimization problems, including graph-level and kernel-level optimizations. A graph-level optimization requires the context of the entire graph to make optimal decisions and transforms the entire graph accordingly. A kernel-level optimization transforms one kernel (a fused subgraph) at a time, independently of other kernels.
To provide a concrete example, imagine a 2x3 matrix (2D tensor) whose first row holds the values A, B and C and whose second row holds a, b and c.
It can be stored in computer memory as [A B C a b c] or [A a B b C c], known as row- and column-major memory layout, respectively. One important ML compiler optimization is to assign memory layouts to all intermediate tensors in the program. The figure below shows two different layout configurations for the same program. Let's assume that on the left-hand side, the assigned layouts (in red) are the most efficient option for each individual operator. However, this layout configuration requires the compiler to insert a copy operation to transform the memory layout between the add and convolution operations. On the other hand, the right-hand side configuration might be less efficient for each individual operator, but it doesn't require the additional memory transformation. The layout assignment optimization has to trade off between local computation efficiency and layout transformation overhead.
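A quick way to see the two layouts side by side is to flatten the same matrix in C (row-major) and Fortran (column-major) order; the short NumPy snippet below is only meant to illustrate the ordering, not anything about the compiler itself.

```python
import numpy as np

# The same 2x3 matrix of labels, flattened in row-major ("C") and
# column-major ("F") order.
m = np.array([["A", "B", "C"],
              ["a", "b", "c"]])

print(m.flatten(order="C"))   # ['A' 'B' 'C' 'a' 'b' 'c']  row-major
print(m.flatten(order="F"))   # ['A' 'a' 'B' 'b' 'C' 'c']  column-major
```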
If the compiler makes optimal choices, significant speedups can be made. For example, we have seen up to a 32% speedup when choosing an optimal layout configuration over the compiler's default configuration in the XLA benchmark suite.
Given the above, we aim to improve ML model efficiency by improving the ML compiler. Specifically, it can be very effective to equip the compiler with a learned cost model that takes in an input program and compiler configuration and then outputs the predicted runtime of the program.
With this motivation, we release TpuGraphs, a dataset for learning cost models for programs running on Google's custom Tensor Processing Units (TPUs). The dataset targets two XLA compiler configurations: layout (generalization of row- and column-major ordering, from matrices, to higher dimension tensors) and tiling (configurations of tile sizes). We provide download instructions and starter code on the TpuGraphs GitHub. Each example in the dataset contains a computational graph of an ML workload, a compilation configuration, and the execution time of the graph when compiled with the configuration. The graphs in the dataset are collected from open-source ML programs, featuring popular model architectures, e.g., ResNet, EfficientNet, Mask R-CNN, and Transformer. The dataset provides 25x more graphs than the largest (earlier) graph property prediction dataset (with comparable graph sizes), and graph size is 770x larger on average compared to existing performance prediction datasets on ML programs. With this greatly expanded scale, for the first time we can explore the graph-level prediction task on large graphs, which is subject to challenges such as scalability, training efficiency, and model quality.
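Based only on that description, one example in the dataset can be pictured roughly as the record below; the field names and types are assumptions made for illustration, not the released dataset's actual schema.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical shape of one TpuGraphs example: a computation graph, a
# compiler configuration, and the measured runtime under that configuration.

@dataclass
class TpuGraphsExample:
    node_opcodes: List[int]            # tensor operation type for each node
    node_features: List[List[float]]   # remaining per-node features
    edges: List[Tuple[int, int]]       # data-flow edges between nodes
    config_features: List[float]       # layout or tiling configuration
    runtime_seconds: float             # execution time under this configuration
```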
We provide baseline learned cost models with our dataset (architecture shown below). Our baseline models are based on a GNN since the input program is represented as a graph. Node features, shown in blue below, consist of two parts. The first part is an opcode id, the most important information of a node, which indicates the type of tensor operation. Our baseline models, thus, map an opcode id to an opcode embedding via an embedding lookup table. The opcode embedding is then concatenated with the second part, the rest of the node features, as inputs to a GNN. We combine the node embeddings produced by the GNN to create the fixed-size embedding of the graph using a simple graph pooling reduction (i.e., sum and mean). The resulting graph embedding is then linearly transformed into the final scalar output by a feedforward layer.
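The baseline just described can be sketched end to end in a few lines of NumPy. The sizes, the single message-passing step and the random weights below are illustrative assumptions; the sketch mirrors the structure (opcode embedding lookup, concatenation with node features, a GNN step, sum-and-mean pooling, and a linear readout) rather than reproducing the paper's exact model.

```python
import numpy as np

# Toy version of the baseline cost model: opcode embedding lookup, feature
# concatenation, one neighbour-aggregation step, sum/mean graph pooling,
# and a linear readout to a scalar runtime prediction.

rng = np.random.default_rng(0)

NUM_OPCODES, EMB_DIM, FEAT_DIM, HIDDEN = 120, 16, 8, 32
opcode_table = rng.normal(size=(NUM_OPCODES, EMB_DIM))   # opcode embedding table
W_msg = rng.normal(size=(EMB_DIM + FEAT_DIM, HIDDEN))    # message/update weights
w_out = rng.normal(size=(2 * HIDDEN,))                   # readout weights

def predict_runtime(opcode_ids, node_feats, adjacency):
    """opcode_ids: (N,), node_feats: (N, FEAT_DIM), adjacency: (N, N) 0/1 matrix."""
    x = np.concatenate([opcode_table[opcode_ids], node_feats], axis=1)
    # One round of neighbour aggregation (with self-loops) and a ReLU.
    h = np.maximum((adjacency + np.eye(len(x))) @ x @ W_msg, 0.0)
    graph_emb = np.concatenate([h.sum(axis=0), h.mean(axis=0)])  # sum & mean pooling
    return float(graph_emb @ w_out)                              # scalar prediction

# Tiny 3-node example graph: node 0 feeds nodes 1 and 2.
opcodes = np.array([3, 7, 7])
feats = rng.normal(size=(3, FEAT_DIM))
adj = np.array([[0, 1, 1], [0, 0, 0], [0, 0, 0]], dtype=float)
print(predict_runtime(opcodes, feats, adj))
```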
Furthermore, we present Graph Segment Training (GST), a method for scaling GNN training to handle large graphs on a device with limited memory capacity in cases where the prediction task is on the entire graph (i.e., graph-level prediction). Unlike scaling training for node- or edge-level prediction, scaling for graph-level prediction is understudied but crucial to our domain, as computation graphs can contain hundreds of thousands of nodes. In a typical GNN training (Full Graph Training, on the left below), a GNN model is trained using an entire graph, meaning all nodes and edges of the graph are used to compute gradients. For large graphs, this might be computationally infeasible. In GST, each large graph is partitioned into smaller segments, and a random subset of segments is selected to update the model; embeddings for the remaining segments are produced without saving their intermediate activations (to avoid consuming memory). The embeddings of all segments are then combined to generate an embedding for the original large graph, which is then used for prediction. In addition, we introduce the historical embedding table to efficiently obtain graph segment embeddings and segment dropout to mitigate the staleness from historical embeddings. Together, our complete method speeds up the end-to-end training time by 3x.
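The segment-level idea can be sketched conceptually as well. In the hedged Python sketch below, the segmentation and the stand-in embed_segment function are deliberate simplifications; the point is only to show that a random subset of segments would carry gradients while the rest are embedded without stored activations, and that all segment embeddings are then combined into one graph embedding.

```python
import numpy as np

# Conceptual sketch of Graph Segment Training (GST): split a large graph
# into segments, let only a random subset be "trainable" in a given step,
# embed the rest without keeping activations, then combine all segment
# embeddings into a single graph-level embedding.

rng = np.random.default_rng(0)

def embed_segment(segment_nodes):
    """Stand-in for running the GNN on one segment of the graph."""
    return segment_nodes.mean(axis=0)

def gst_step(node_features, num_segments=8, trainable_fraction=0.25):
    segments = np.array_split(node_features, num_segments)
    k = max(1, int(trainable_fraction * num_segments))
    trainable_ids = {int(i) for i in rng.choice(num_segments, size=k, replace=False)}
    segment_embs = []
    for i, seg in enumerate(segments):
        emb = embed_segment(seg)
        if i not in trainable_ids:
            # In a real framework this branch would run under stop-gradient /
            # no-grad, so intermediate activations are never stored.
            emb = np.asarray(emb)   # stand-in for detaching the embedding
        segment_embs.append(emb)
    graph_emb = np.stack(segment_embs).mean(axis=0)   # combine all segments
    return graph_emb, sorted(trainable_ids)

features = rng.normal(size=(100_000, 4))   # node features of a large graph
emb, trainable = gst_step(features)
print(emb.shape, "trainable segments:", trainable)
```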
Finally, we ran the Fast or Slow? Predict AI Model Runtime competition over the TpuGraphs dataset. This competition ended with 792 participants on 616 teams. We had 10,507 submissions from 66 countries. For 153 users (including 47 in the top 100), this was their first competition. We learned many interesting new techniques employed by the participating teams.
We will debrief the competition and preview the winning solutions at the competition session at the ML for Systems workshop at NeurIPS on December 16, 2023. Finally, congratulations to all the winners and thank you for your contributions to advancing research in ML for systems!
If you are interested in more research about structured data and artificial intelligence, we hosted the NeurIPS Expo panel Graph Learning Meets Artificial Intelligence on December 9, which covered advancing learned cost models and more!
Sami Abu-el-Haija (Google Research) contributed significantly to this work and write-up. The research in this post describes joint work with many additional collaborators including Mike Burrows, Kaidi Cao, Bahare Fatemi, Jure Leskovec, Charith Mendis, Dustin Zelle, and Yanqi Zhou.
From hallucinations to discovery: For the first time a large language model finds new solutions to math problems – ZME Science
Large Language Models (LLMs) like ChatGPT have a lot of things going for them. These powerful AI systems can synthesize and interpret vast amounts of information and are surprisingly human-like with language. At the same time, they're also notorious for making up facts with confidence. Put simply, they hallucinate, as people have come to describe this annoying behavior.
A huge question ever since this technology was released is whether LLMs are capable of discovering new knowledge, rather than repurposing and rehashing existing information. As it turns out, they can.
Researchers at Google's DeepMind branch have unveiled a new AI method called FunSearch, which can forge new paths to find solutions to complex problems in mathematics and computer science.
The innovation of FunSearch lies in the pairing of a pre-trained LLM with an automated evaluator. This setup is designed to leverage the LLM's strength in generating creative solutions in the form of computer code, while the evaluator rigorously checks these solutions for accuracy. The highest-performing solutions are continuously fed back into the cycle, fostering a self-improving loop of problem-solving and innovation.
This partnership enables an iterative refinement process, transforming initial creative outputs into verified, novel knowledge. The focus on discovering functions in computer code is what gives FunSearch its distinctive name and operational approach.
This initiative marks the first time LLMs have contributed to solving open problems in the scientific and mathematical community. FunSearch found novel solutions to the cap set problem, a long-standing mathematical challenge.
The cap set problem in mathematics involves finding the largest subset of the integers from 0 to 3^n - 1 (with each integer written in base 3, so that it corresponds to a point in an n-dimensional grid over {0, 1, 2}) such that no three distinct elements of the subset lie on a common line, which in this setting means no three of them sum, digit by digit modulo 3, to zero. It's a challenge in combinatorics, a field concerned with counting, arrangement, and structure. Terence Tao, one of the world's leading mathematicians, once described the cap set problem as one of his favorite open questions in the field.
FunSearch succeeded in discovering new, larger cap sets, contributing valuable insights to the problem and demonstrating the potential of AI in advancing mathematical research. FunSearch's contribution marks the largest increase in the size of cap sets in the past two decades.
"These results demonstrate that the FunSearch technique can take us beyond established results on hard combinatorial problems, where intuition can be difficult to build. We expect this approach to play a role in new discoveries for similar theoretical problems in combinatorics, and in the future it may open up new possibilities in fields such as communication theory," wrote the DeepMind researchers in a blog post.
Moreover, FunSearch has proven itself further by enhancing algorithms for the bin-packing problem. The bin-packing problem is a classic algorithmic challenge. It involves efficiently packing objects of different sizes into a finite number of bins or containers in a way that minimizes the number of bins used.
Unlike many computational tools that offer solutions without explanation, like a black box, FunSearch provides a detailed account of how its conclusions are reached.
"This show-your-working approach is how scientists generally operate, with new discoveries or phenomena explained through the process used to produce them," add the DeepMind researchers.
The ability of FunSearch to not only generate innovative solutions but also provide the details of the problem-solving process holds immense potential. With the continual advancement of LLM technology, the capabilities of tools like FunSearch are expected to expand, paving the way for groundbreaking discoveries and solutions to some of societys most pressing scientific and engineering challenges.
The findings were reported in the journal Nature.