Inside DeepMinds effort to understand its own creations – Semafor

With missteps at industry leader OpenAI possibly providing an opening for rivals touting safety advances, Google DeepMind unveiled fresh details of how its building systems to catch potentially dangerous leaps in artificial intelligence capabilities.

OpenAI has tried to reassure the public, announcing a new safety committee earlier this week, after a top safety researcher joined rival firm Anthropic. That move came before actress Scarlett Johansson accused Sam Altmans firm of using her voice without her permission for ChatGPT.

With AI guardrails becoming a possible competitive advantage, Google DeepMind executives told Semafor that the methods for predicting and identifying threats will likely involve a combination of humans and what the company calls auto evaluations, in which AI models analyze other models or even themselves.

The effort, though, has become particularly challenging, now that the most advanced AI models have made the jump to multimodality, meaning they were trained not only on text, but video and audio as well, they said.

We have some of the best people in the world working on this, but I think everybody recognizes the field of science and evaluations is still very much an area where we need additional investment research, collaboration and also best practices, said Tom Lue, general counsel and head of governance at Google DeepMind.

Google, which released a comprehensive new framework earlier this month to assess the dangers of AI models, has been working on the problem for years. But the efforts have ramped up now that foundation models like GPT and DeepMinds Gemini have ignited a global, multibillion dollar race to increase the capabilities of AI models.

The challenge, though, is that the massive foundation models that power these popular products are still in their infancy. They are not yet powerful enough to pose any imminent threat, so researchers are trying to design a way to analyze a technology that has not yet been created.

When it comes to new multimodal models, automated evaluation is still in the distant horizon, said Helen King, Google DeepMinds senior director of responsibility. We havent matured the evaluation approach yet and actually trying to automate that is almost premature, she said.

See the original post:
Inside DeepMinds effort to understand its own creations - Semafor

Related Posts

Comments are closed.