Google parent Alphabet on December 6 unveiled Gemini, its largest and most capable AI model to date, as the tech giant looks to take on rivals OpenAI's GPT-4 and Meta's Llama 2 in a race to lead the nascent artificial intelligence (AI) space.
This is the first AI model from Alphabet after the merger of its AI research units, DeepMind and Google Brain, into a single division called Google DeepMind, led by DeepMind CEO Demis Hassabis.
Gemini has been built from the ground up and is "multimodal" in nature, meaning it can understand and work with different types of information, including text, code, audio, image and video, at the same time.
The AI model will be available in three different sizes: Ultra (for highly complex tasks), Pro (for scaling across a wide range of tasks) and Nano (for on-device tasks).
"These are the first models of the Gemini era and the first realisation of the vision we had when we formed Google DeepMind earlier this year. This new era of models represents one of the biggest science and engineering efforts we've undertaken as a company," said Alphabet CEO Sundar Pichai.
Gemini Pro will be accessible to developers through the Gemini API in Google AI Studio and Google Cloud Vertex AI starting December 13.
On the other hand, Gemini Nano will be accessible to Android developers through AICore, a new system capability introduced in Android 14. This capability will be made available on Pixel 8 Pro devices starting December 6, with plans to extend support to other Android devices in the future.
Gemini Ultra is currently being made available to select customers, developers, partners, and safety and responsibility experts for early experimentation and feedback, with a broader rollout to developers and enterprise customers planned for early next year.
Google will also be using Gemini across all its products. Starting December 6, Bard will use a fine-tuned version of Gemini Pro for more advanced reasoning, planning, and understanding.
Meanwhile, Gemini Nano will be powering new features on Pixel 8 Pro smartphones like 'Summarise' in the Recorder app and will soon be available in Smart Reply in Gboard, starting with WhatsApp - with more messaging apps coming next year.
Gemini is also being used to make Google's generative AI search offering, Search Generative Experience (SGE), faster for users. The company said it saw a 40 percent reduction in latency in English in the United States, alongside improvements in quality.
Hassabis said that Gemini will be integrated into more of the company's products and services, including Search, Ads, Chrome, and Duet AI in the coming months.
'Transition to AI far bigger than mobile or web'
Pichai said that every technology shift is an opportunity to advance scientific discovery, accelerate human progress and improve lives.
"I believe the transition we are seeing right now with AI will be the most profound in our lifetimes, far bigger than the shift to mobile or the web before it," he said.
Pichai added, "AI has the potential to create opportunities - from the everyday to the extraordinary - for people everywhere. It will bring new waves of innovation and economic progress and drive knowledge, learning, creativity, and productivity on a scale we haven't seen before... We're only beginning to scratch the surface of what's possible."
Alphabet first previewed Gemini at its annual developer conference, Google I/O, in May 2023. This launch comes at a time when the tech giant is racing to catch up with Microsoft-backed OpenAI, which released its latest AI model, GPT-4 Turbo, during its OpenAI DevDay last month. GPT-4 Turbo is an improved version of the AI upstart's flagship GPT-4 model, which was released in March 2023.
Most flexible model yet
In a blog post, Hassabis said that Gemini Ultra's performance exceeds the current state-of-the-art results on 30 of the 32 widely used academic benchmarks in large language model (LLM) research and development.
It is also the first model to outperform human experts on the MMLU (massive multitask language understanding) benchmark, which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics to test both world knowledge and problem-solving abilities.
Meanwhile, before its public launch, Gemini Pro outperformed GPT-3.5 in six of eight benchmarks, including MMLU and GSM8K (Grade School Math 8K), which measures grade school math reasoning, said Sissie Hsiao, Vice-President, Google Assistant and Bard.
"This is a significant milestone in the development of AI, and the start of a new era for us at Google as we continue to rapidly innovate and responsibly advance the capabilities of our models," Hassabis said.
Hassabis said that for a long time, they wanted to build a new generation of AI models, inspired by the way people understand and interact with the world. "AI that feels less like a smart piece of software and more like something useful and intuitive - an expert helper or assistant. Today, we're a step closer to this vision," he said.
He mentioned that Gemini is their most flexible model yet since it can run efficiently on everything from data centres to mobile devices and its capabilities will significantly enhance the way developers and enterprise customers build and scale with AI.
Hassabis said that the multimodal reasoning capabilities of the first version of Gemini can help make sense of complex written and visual information, which lets it extract insights from hundreds of thousands of documents by reading, filtering and understanding information.
He said it also better understands nuanced information and can answer questions relating to complicated topics, making it adept at explaining reasoning in complex subjects like math and physics.
The AI model can also understand, explain, and generate high-quality code in many popular programming languages, such as Python, Java, C++ and Go.
"We're working hard to further extend its capabilities for future versions, including advances in planning and memory, and increasing the context window for processing even more information to give better responses," Hassabis said.
Considering Gemini's capabilities, Alphabet is also adding new protections that build on its safety policies and AI principles to tackle potential risks.
"We've conducted novel research into potential risk areas like cyber-offence, persuasion, and autonomy, and have applied Google Research's best-in-class adversarial testing techniques to help identify critical safety issues in advance of Gemini's deployment," Hassabis said.
The company is also working with a diverse group of external experts and partners to stress-test their models across a range of issues, he said.
View original post here:
Google unveils Gemini, its largest AI model, to take on OpenAI - Moneycontrol