
Maybe We’ve Got The Artificial Intelligence In Law ‘Problem’ All Wrong – Above the Law

When some hapless NY lawyers submitted a brief riddled with case citations hallucinated by consumer-facing artificial intelligence juggernaut ChatGPT and then doubled down on the error, we figured the resulting discipline would serve as a wake-up call to attorneys everywhere. But there would be more. And more. And more.

We've repeatedly balked at declaring this an "AI problem," because nothing about these cases really turned on the technology. Lawyers have an obligation to check their citations, and if they're firing off briefs without bothering to read the underlying cases, that's a professional problem whether ChatGPT spit out the case or their summer associate inserted the wrong cite. Regulating AI for an advocate falling down on the job seemed to miss the point at best and at worst poison the well against a potentially powerful legal tool before it's even gotten off the ground.

Another popular defense of AI against the slings and arrows of grandstanding judges is that the legal industry needs to remember that AI isn't human. It's just like every other powerful but ultimately dumb tool, and you can't just trust it like you can a human. Conceived this way, AI fails because it's not human enough. Detractors have their human egos stroked, and AI champions can market their bold future where AI creeps ever closer to humanity.

But maybe weve got this all backward.

"The problem with AI is that it's more like humans than machines," David Rosen, co-founder and CEO of Catylex, told me off-handedly the other day. "With all the foibles, and inaccuracies, and idiosyncratic mistakes." It's a jarring perspective to hear after months of legal tech chit chat about generative AI. Every conversation I've had over the last year frames itself around making AI more like a person, more able to parse through what's important and what's superfluous. Though the more I thought about it, there's something to this idea. It reminded me of my issue with AI research tools trying to find the right answer when that might not be in the lawyer's or the client's best interest.

How might the whole discourse around AI change if we flipped the script?

If we started talking about AI as too human, we could worry less about figuring out how it makes a dangerous judgment call between two conclusions and worry more about a tool that tries too hard to please its bosses, makes sloppy errors when it jumps to conclusions, and holds out the false promise that it can deliver insights for the lawyers themselves. Reorient around promising a tool that's going to ruthlessly and mechanically process tons more information than a human ever could and deliver it to the lawyer in a format that the humans can digest and evaluate themselves.

Make AI Artificial Again, if you will.

Joe Patrice is a senior editor at Above the Law and co-host of Thinking Like A Lawyer. Feel free to email any tips, questions, or comments. Follow him on Twitter if you're interested in law, politics, and a healthy dose of college sports news. Joe also serves as a Managing Director at RPN Executive Search.

FDA approves AI-driven test for sepsis made by Prenosis – The Washington Post

Bobby Reddy Jr. roamed a hospital as he built his start-up, observing how patient care began with a diagnosis and followed a set protocol. The electrical engineer thought he knew a better way: an artificial intelligence tool that would individualize treatment.

Now, the Food and Drug Administration has greenlighted such a test developed by Reddy's company, Chicago-based Prenosis, to predict the risk of sepsis, a complex condition that contributes to at least 350,000 deaths a year in the United States. It is the first algorithmic, AI-driven diagnostic tool for sepsis to receive the FDA's go-ahead, the company said in a statement Wednesday.

"In hospitals and emergency departments, we are still relying on one-size-fits-all, when instead we should be treating each person based on their individual biology," Reddy, the company's CEO, said in an interview.

Sepsis occurs when a patient's immune system tries to fight an infection and ends up attacking the body's own organs. Managing sepsis is a priority among federal health agencies, including the Centers for Disease Control and Prevention and the Centers for Medicare and Medicaid Services.

"Sepsis is a serious and sometimes deadly complication," Jeff Shuren, director of the FDA's Center for Devices and Radiological Health, said in a statement. "Technologies developed to help prevent this condition have the potential to provide a significant benefit to patients."

To build its test, Prenosis acquired more than 100,000 blood samples along with clinical data on hospital patients, and trained its algorithm to recognize the health measures most associated with developing sepsis. The company narrowed its test to 22 parameters, including blood-based measures and other vital signs such as temperature and heart rate. The diagnostic tool now produces a snapshot that classifies a patient's risk of sepsis into four categories, from low to very high.
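
As a rough illustration only (Prenosis has not published its model, feature names, or thresholds; everything below is invented for the sketch), a tool like this boils down to scoring a patient's parameters with a trained classifier and bucketing the score into risk bands:

```python
# Hypothetical sketch: maps a trained model's sepsis probability over
# ~22 vitals and blood measures into four risk bands. The cutoffs and
# labels are invented; a real product would calibrate them clinically.
RISK_BANDS = [(0.25, "low"), (0.50, "moderate"), (0.75, "high")]

def categorize_sepsis_risk(probability: float) -> str:
    for cutoff, label in RISK_BANDS:
        if probability < cutoff:
            return label
    return "very high"

def predict_risk(features: dict[str, float], model) -> str:
    # `model` stands in for any classifier trained on labeled patient data,
    # e.g. anything exposing scikit-learn's predict_proba convention.
    probability = model.predict_proba([list(features.values())])[0][1]
    return categorize_sepsis_risk(probability)
```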

Though Prenosis is the first to win FDA authorization for such a test, other companies, including Epic Systems, have already brought to market AI-driven diagnostics for the condition. Epic, known for its software that manages electronic health records, has faced questions about the accuracy of its algorithm for predicting sepsis.

Jacob Wright, an Epic spokesman, said that multiple studies have shown that its diagnostic model for sepsis improved patient outcomes, adding that a second version released in 2022 has fewer false positives when compared to the first version. The company is seeking FDA clearance, he said.

Reddy said Prenosis built its technology without initially knowing what problem it would try to solve. An Illinois hospital gave him office space and a badge, allowing him to roam the hospital and observe its staff interacting with patients. "What I saw over and over again is that they really run based on protocols," he said. He later came across a paper on sepsis, he said, that opened his eyes to how many people die of it. "This is going to be what we do," he said.

At least 1.7 million adults develop sepsis in a given year, including at least 350,000 who die during their hospitalization or are discharged to hospice care, according to the CDC. Roughly 1 in 3 people who die in a hospital had sepsis during their stay, and federal agencies are aiming to reward facilities that are making strides to reduce the condition.

Those at higher risk of sepsis include adults 65 and older, people with weakened immune systems, and those with a recent severe illness or hospitalization.

The new test comes as hospitals are grappling with the future of medicine and how to best incorporate artificial intelligence into the practice. In some instances, artificial intelligence tools have created tension among front-line workers who worry the technology could lead to inaccurate results or replace staff.

The Fate of Hundreds of Thousands of Civilians in Gaza depends on Artificial Intelligence – Sarajevo Times

Israeli sources say the army permitted up to 20 civilian casualties for each of the roughly 37,000 suspects identified by Lavender, an artificial intelligence program used to select human targets in attacks on the blockaded Gaza Strip.

Sources told the media outlets +972 and Local Call that Lavender analyzed data on about 2.3 million people in Gaza according to unclear criteria and assessed whether each person had ties to Hamas.

A total of six sources stated that the Israeli army relied fully on the program, especially in the early stages of the war: names identified by Lavender were marked as targets without review and without applying any special criteria beyond the target being a man.

37,000 Palestinians marked as suspects

Sources who testified to +972 said that the concept of a "military target," which allows killing on private property even if there are civilians in the building and its surroundings, previously included only high-level military targets, and that after October 7 the concept was extended to all members of Hamas.

Due to the enormous increase in the number of targets, the need for artificial intelligence arose; examining and verifying targets individually by humans was no longer possible. The sources also state that the program marked close to 37,000 Palestinians as suspects.

The sources said that Lavender was very successful in classifying Palestinians, and that the process was fully automated.

"We killed thousands of people. We automated everything and did not control each target separately. We bombed the targets as soon as they moved into their houses," one source said, confirming that human control of the targets had been eliminated.

One source's remark, that he found it very surprising to be asked to bomb a house to kill an unimportant person, amounts to an acknowledgment of the Israeli massacre of civilians in Gaza.

Green light for high-level targets with up to 100 civilian casualties

The sources stated that up to 20 civilian victims were allowed in actions carried out against lower-ranking targets, that this number changed often during the war, and they emphasized that the principle of proportionality was not applied.

On the other hand, it was stated that the number of possible collateral civilian casualties increased to 100 for high-level targets.

While the sources said they were ordered to bomb every place they could, one of them said that hysteria dominated senior officials and that all they knew was to bomb like crazy to limit Hamas's capabilities.

A senior soldier identified by the initial B., who used the Lavender program, said that the program's margin of error is about ten percent and that there was no need for people to review targets and waste time on it.

Soldier B. stated that in the beginning there were fewer labeled targets, but that as the definition of a Hamas member expanded, the practice widened and the number of targets grew. He added that members of the police and civil protection who may have helped Hamas, but who posed no threat to the Israeli army, were also targeted.

"There are many shortcomings in the system. If the target gave their phone to another person, that person is bombed at home with their entire family. This happened very often. This was one of the most common mistakes Lavender made," said Soldier B.

Most of the killed are women and children

On the other hand, the same sources said that software called "Where's Daddy?" tracks thousands of people at a time and notifies Israeli forces when they enter their homes. Attacks are also carried out based on this program's data.

"Let's say you calculate that there is one member of Hamas and ten civilians in the house; usually those ten people are women and children. So, absurdly, most of the people you kill are women and children," said one of the sources.

Unguided bombs are used to save money

Sources also said that many civilians were killed because less important targets were hit with ordinary, cheaper missiles instead of guided "smart" missiles.

"We usually carried out the attacks with unguided missiles, which meant literally destroying the entire house with its contents. The system kept adding new targets," one of the sources said.

Artificial intelligence is not used to reduce civilian casualties, but to find more targets

Speaking to Al Jazeera on the subject, Marc Owen Jones, professor of Middle East Studies and Digital Humanities at Hamad Bin Khalifa University in Qatar, said it was increasingly clear that Israel was using unproven artificial intelligence systems, which had not undergone transparent evaluation, to help make decisions about the lives of civilians.

Jones believes that Israeli officials activated an artificial intelligence system to select targets to avoid moral responsibility.

He emphasized that the purpose of using the program is not to reduce civilian casualties, but to find more targets.

"Even the officials who run the system see AI as a killing machine. It is unlikely that Israel will stop using artificial intelligence in attacks if its allies do not put pressure on it. The situation in Gaza is genocide supported by artificial intelligence. A call for a moratorium on the use of artificial intelligence in warfare is needed," Jones concluded.

Habsora

Another investigation, published on December 1, 2023, revealed that the Israeli military also used an artificial intelligence application called Habsora (The Gospel) to identify targets in its attacks on the Gaza Strip, including to deliberately strike civilian infrastructure and to attack automatically generated targets. In those cases, the number of civilians who would die along with the target was known in advance.

Habsora is the artificial intelligence technology Israel uses to attack buildings and infrastructure, while Lavender is used when targeting people, Anadolu Agency (AA) writes.

Google DeepMind Co-Founder Voices Concerns Over AI Hype: ‘We’re Talking About All Sorts Of Things That Are Just … – TradingView

The co-founder of Google DeepMind, Sir Demis Hassabis, has voiced concerns that the surge in funding for artificial intelligence (AI) is leading to exaggerated hype, overshadowing the actual scientific progress in the sector.

What Happened: Hassabis shared his worries that the billions being poured into generative AI startups and products are causing hype akin to the crypto buzz. This hype, he fears, is clouding the impressive advancements being made in AI and "brings with it a whole attendant bunch of hype and maybe some grifting and some other things that you see in other hyped-up areas, crypto or whatever," the Financial Times reported.

"In a way, AI's not hyped enough but in some senses, it's too hyped. We're talking about all sorts of things that are just not real, he said,

The launch of the ChatGPT chatbot by OpenAI in November 2022 triggered a rush among investors, as startups hustled to create and launch generative AI and attract venture capital. CB Insights, a market analysis firm, reported that investors put $42.5 billion into 2,500 AI startup equity rounds in the previous year.

Investors have also been attracted to the "Magnificent Seven" tech companies, including Microsoft, Alphabet, and Nvidia, which are at the forefront of the AI revolution. However, companies are under scrutiny from regulators for making false AI-related claims.

Despite the hype, Hassabis is confident that AI is one of the most transformative inventions in human history. He pointed to DeepMind's AlphaFold model, launched in 2021, as a key example of how AI can speed up scientific research. AlphaFold has been used to predict the structures of 200 million proteins and is now being used by over 1 million biologists worldwide.

Why It Matters: Previously, concerns about an AI bubble have been raised by several experts. In February, Apollo Global Management's chief economist, Torsten Sløk, sounded an alarm on the AI bubble, warning that it is "bigger than the 1990s tech bubble."

In March, Richard Windsor, a seasoned tech stock analyst, highlighted potential indicators of an impending market correction due to the ongoing excitement surrounding AI. However, Ken Griffin, CEO of Citadel, expressed confidence in Nvidia's position in the AI market despite the uncertainties.

Image by George Gillams via Flickr

Engineered by Benzinga Neuro, edited by Pooja Rajkumari.

The GPT-4-based Benzinga Neuro content generation system exploits the extensive Benzinga Ecosystem, including native data, APIs, and more to create comprehensive and timely stories for you.

© 2024 Benzinga.com. Benzinga does not provide investment advice. All rights reserved.

Google Deepmind CEO says AI industry is full of ‘hype’ and ‘grifting’ – ReadWrite

The CEO of Google's AI division DeepMind, Sir Demis Hassabis, has called out the amount of hype circulating in the industry, which sometimes obscures the genuine developments.

Having co-founded DeepMind in 2010, Sir Demis has years of experience in the field of AI, from well before machine learning tools hit the mainstream. More and more everyday people can now use AI tools in their lives, but the sheer speed at which the technology has taken off does lead to some confusion about what the future could hold.

In an interview with the Financial Times, Sir Demis compares the AI explosion to the crypto boom of the last few years, highlighting that the billion-dollar investment into generative AI start-ups and products "brings with it a whole attendant bunch of hype and maybe some grifting."

"Some of that has now spilled over into AI, which I think is a bit unfortunate. It clouds the science and the research, which is phenomenal," the CEO continued. "In a way, AI's not hyped enough but in some senses, it's too hyped. We're talking about all sorts of things that are just not real."

This boom can largely be traced back to the launch of OpenAI's ChatGPT tool in November 2022, which brought AI-powered chat to the mainstream for the first time. Other start-ups raced to release similar or competitive tools, backed by a collective $42.5bn from VC groups across 2,500 AI start-up equity rounds in 2023.

Major tech companies like Microsoft, Alphabet, and Nvidia have also risen to the challenge, each bringing their own AI innovation, as well as Google's own push via DeepMind.

Whether under- or over-hyped, it certainly seems to be true that the average person's understanding of AI is somewhat limited. The future possibilities of AI leave a lot left to be discovered, something that Sir Demis himself is looking forward to.

"I think we're only scratching the surface of what I believe is going to be possible over the next decade-plus," Sir Demis stated. "We're at the beginning, maybe, of a new golden era of scientific discovery, a new Renaissance."

Featured image: Ideogram

Google AI chief says AI hype distracts from science and research – Quartz

The billions of dollars going into AI are reminiscent of crypto hype, which is getting in the way of science and research, Google's AI chief warns.

The investment into AI "brings with it a whole attendant bunch of hype and maybe some grifting," Demis Hassabis, co-founder and CEO of DeepMind, which was acquired by Google in 2014, told the Financial Times. He compared the phenomenon to crypto and other "hyped-up areas" and said similar hype "has now spilled over into AI, which I think is a bit unfortunate."

Hassabis also said the hype around AI clouds "the science and the research, which is phenomenal." "In a way, AI's not hyped enough but in some senses it's too hyped. We're talking about all sorts of things that are just not real." Google DeepMind did not immediately respond to a request for comment.

Amid a tight race to develop bigger and better AI models, startups are seeing millions in investments (and even billions) that are boosting their valuations into the billions, leading some analysts to warn of an AI bubble.

After chipmaker Nvidia, which is behind the world's most sought-after hardware powering the AI industry's leading models and products, became the first semiconductor company to reach a $2 trillion valuation in February, some analysts were wary of how far the company can go.

"Another blockbuster quarter from Nvidia raises the question of how long its soaring performance will last," Jacob Bourne, a senior analyst at Insider Intelligence, said after Nvidia beat earnings expectations. "Nvidia's near-term market strength is durable, though not invincible."

After Nvidia reached its $2 trillion valuation, Torsten Sløk, chief economist at Apollo Global Management, warned that the current AI hype has surpassed that of the 1990s dot-com bubble.

Despite such warnings, Hassabis said he believes the industry is only scratching the surface of what is possible with the technology. Alongside its AlphaFold model, which predicts the structures of proteins, Hassabis said Google DeepMind is also using AI to advance drug discovery research, weather prediction models, and nuclear fusion technology.

"We're at the beginning, maybe, of a new golden era of scientific discovery, a new Renaissance," he told the Financial Times.

Liverpool Team Up With Deepmind To Always "take Corners Quickly" – Dataconomy

Some moments in football are unforgettable, like Liverpool's epic "corner taken quickly" goal in the 2019 UEFA Champions League semi-finals. It led to Divock Origi's goal and a remarkable comeback. That's why Liverpool knows corner kicks are big chances to score, but planning them isn't easy. Can AI help? Google's DeepMind team thinks so and has started to create TacticAI, a smart program that predicts what might happen during a corner kick, suggests the best strategies for teams like Liverpool, and could help create more moments like that one.

Did Jürgen Klopp's departure at the end of the season push Liverpool to do this, or did they simply not want to fall behind on AI developments? Either way, if this project succeeds, it may change football as we know it.

TacticAI is an artificial intelligence system developed in partnership with Liverpool FC and AI researchers to enhance football tactics, particularly focusing on corner kicks. This innovative tool utilizes advanced AI techniques to analyze past corner kick scenarios and provide valuable insights to coaches.

Here's how it works: the system employs a geometric deep learning approach, which allows it to represent corner-kick scenarios as graphs.

In these graphs, each player is depicted as a node, containing various attributes such as position, velocity, and height, while the relationships between players are represented by edges. By leveraging this unique representation, TacticAI can accurately predict potential outcomes of corner kicks and propose tactical adjustments to optimize success.
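
To make the idea concrete, here is a minimal sketch of that graph encoding in Python; the attribute names and edge semantics are simplified stand-ins, not the actual TacticAI schema:

```python
# Illustrative sketch of a corner-kick graph: players are nodes carrying
# position, velocity, and height; edges connect related players. A graph
# neural network would pass messages along these edges to predict outcomes.
from dataclasses import dataclass, field

@dataclass
class PlayerNode:
    player_id: str
    position: tuple       # (x, y) pitch coordinates at the moment of the corner
    velocity: tuple       # movement vector
    height_cm: float      # relevant for aerial duels on corners

@dataclass
class CornerKickGraph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (player_id, player_id) pairs

    def add_player(self, node: PlayerNode) -> None:
        self.nodes.append(node)

    def add_edge(self, a: str, b: str) -> None:
        # e.g. an attacker and the defender marking them
        self.edges.append((a, b))
```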

What's really cool about TacticAI is that it can come up with different scenarios for corner kicks, so coaches can try out new ideas and see what works best. It also helps coaches review past plays more easily, so they can learn from what's happened before.

After lots of testing with real football experts, TacticAI has proven to be really helpful. Coaches can rely on it to give them smart advice that could make a big difference in their team's performance.

In simple terms, TacticAI is like having a super-smart assistant coach who knows all about football tactics. It's there to help coaches make better decisions during corner kicks, which could lead to more goals and more wins for their team.

Separating the Hype From Reality in AI – PYMNTS.com

The rapid rise of artificial intelligence (AI) has sparked a heated debate among experts, with some warning that the hype surrounding the technology may be overshadowing genuine scientific advancements.

DeepMind Co-Founder Demis Hassabis recently drew parallels between the current AI frenzy and the cryptocurrency boom, raising concerns about the potential impact on the field's progress.

The debate over whether AI is overpromised has significant implications for the commercial landscape as businesses rush to capitalize on the technology's potential. Observers say striking a balance between enthusiasm and realism will be crucial for the healthy growth of AI-driven commerce.

"While generative AI is powerful, it is still only one segment of AI," Muddu Sudhakar, co-founder and CEO of the generative AI payments platform Aisera, told PYMNTS. "AI encompasses a variety of categories. But with so much attention on generative AI, it means that these areas get neglected and crowded out. It could also limit research, which could mean less innovation."

Interest in AI is growing. According to the PYMNTS Intelligence report "Consumer Interest in Artificial Intelligence," the average consumer uses around five AI technologies weekly, including web browsing, navigation apps, and online recommendations. Nearly two-thirds of Americans are interested in AI assistants for tasks like booking travel, with AI enhancing the personalization of in-car experiences. These intelligent systems, leveraging generative AI, tailor recommendations to users' behaviors and preferences far beyond simple list-based suggestions.

Hassabis expressed concerns to the Financial Times regarding the surge of investment in generative AI startups and products, likening the frenzy to other speculative bubbles. "The billions of dollars being poured into generative AI startups and products brings with it a whole attendant bunch of hype and maybe some grifting and some other things that you see in other hyped-up areas, like crypto or whatever," he said.

Some experts say the hype surrounding AI has reached a fever pitch, with grandiose promises and astronomical investments obscuring the reality of the technologys current capabilities.

One of the main issues with AI hype is that it creates unrealistic expectations among the public and investors. When companies make bold claims about their AI-powered products or services, they often fail to deliver on those promises, leading to disappointment and erosion of trust.

"Most people in the AI space have good intentions and don't want to mislead consumers or users," Zohar Bronfman, co-founder and CEO of Pecan AI, told PYMNTS. "I don't doubt that they're working hard to deliver the best AI products they can. What's been ignored, though, is that generative AI so far just hasn't provided significant business value. It's fascinating and powerful, but so far, most business users have come up empty-handed when they try to use it to really drive business impact."

Sudhakar pointed out the excessive investment in large language models (LLMs), suggesting it may overshadow other vital areas of AI research. This focus risks limiting innovation and neglecting emerging technologies that could offer more significant advancements or solutions to pressing challenges in the field.

"How many of these do we need?" he said. "How can you really tell which one is better? It's not clear. This is why I think just a handful of state-of-the-art models will ultimately prevail. That being said, there will be many SLMs [small language models] that address lots of edge cases, but even in this area, many will fade away."

Sudhakar raised a looming issue in AI: the dwindling supply of data necessary to train LLMs. This scarcity, he warned, could become a significant bottleneck in the development and advancement of these models, potentially hindering progress in AI research and applications.

"One alternative is to use synthetic data," he added. "This is an emerging area and could use much more focus."

Sudhakar also highlighted the importance of shifting focus toward what will eventually succeed the current transformer models in AI. Based on a deep learning architecture, transformer models have revolutionized how machines understand and generate human-like text by enabling them to process each word in relation to all the other words in a sentence, rather than one at a time.

He added, "This is a powerful model, but it has limitations, such as with hallucinations, which are based on the underlying probabilities."
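
For readers who want to see the mechanism behind that "each word in relation to all the others" description, here is a toy single-head scaled dot-product attention in Python with NumPy; it is a minimal sketch of the core transformer operation, not any vendor's implementation:

```python
import numpy as np

def self_attention(q, k, v):
    # Scores measure how strongly each token attends to every other token.
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    # Softmax each row into attention weights that sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output vector is a weighted blend of all value vectors.
    return weights @ v

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 8))   # 5 "words", 8-dimensional embeddings
out = self_attention(tokens, tokens, tokens)
print(out.shape)  # (5, 8): every word's vector now reflects all the others
```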

"While generative AI gets all the attention, the real workhorses of AI, machine learning techniques for prediction and optimization, aren't hyped nearly enough," Bronfman said.

"Tested and proven machine learning methods can quickly take business data and extract a great deal of value," he added. "They may not seem as shiny and new as generative AI, but they definitely shine when they're integrated into business systems the right way. These recognized methods deserve more attention and investment so businesses can achieve the transformative benefits of AI."

Some commenters say that the best use of AI might not be for commerce. Ilia Badeev, head of data science at Trevolution Group, told PYMNTS that the significance of employing AI for nonprofit and scientific endeavors receives inadequate attention.

"I would like to see more hype around AI researchers," he added. "Imagine a ScientistGPT that possesses information from all currently existing textbooks and scientific studies and can use it to advance theoretical and practical science."

Google DeepMind’s Fact Quest: Improving Long-form Accuracy In LLMs With SAFE – Dataconomy

Large language models (LLMs) have demonstrated remarkable abilities: they can chat conversationally, generate creative text formats, and much more. Yet, when asked to provide detailed factual answers to open-ended questions, they can still fall short. LLMs may provide plausible-sounding yet incorrect information, leaving users with the challenge of sorting fact from fiction.

Google DeepMind, the leading AI research company, is tackling this issue head-on. Their recent paper, "Long-form factuality in large language models," introduces innovations in both how we measure factual accuracy and how we can improve it in LLMs.

DeepMind started by addressing the lack of a robust method for testing long-form factuality. They created LongFact, a dataset of over 2,000 challenging fact-seeking prompts that demand detailed, multi-paragraph responses. These prompts cover a broad array of topics to test an LLM's ability to produce factual text in diverse subject areas.

The next challenge was determining how to accurately evaluate LLM responses. DeepMind developed the Search-Augmented Factuality Evaluator (SAFE). Here's the clever bit: SAFE itself uses an LLM to make this assessment!

Here's how it works: SAFE first prompts an LLM to split a long-form response into individual facts and to revise each fact so it stands on its own. It then checks whether each fact is relevant to the original question, and for every relevant fact it issues Google Search queries and reasons over the results to rate the fact as supported or not supported.
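
A schematic sketch of that loop follows; `llm()` and `google_search()` are hypothetical stand-ins for an LLM API and a search API, not real library calls:

```python
# Sketch of the SAFE pipeline as described above. The callables `llm` and
# `google_search` are placeholders supplied by the caller; prompts are
# simplified for illustration.
def safe_evaluate(prompt: str, response: str, llm, google_search) -> dict:
    # Step 1: split the response into individual, self-contained facts.
    facts = llm(f"Split into self-contained facts, one per line:\n{response}").splitlines()
    supported = not_supported = 0
    for fact in facts:
        # Step 2: skip facts that are not relevant to the original question.
        if llm(f"Is this fact relevant to '{prompt}'? Answer yes or no: {fact}") != "yes":
            continue
        # Step 3: have the LLM write a search query and fetch evidence.
        evidence = google_search(llm(f"Write a search query to verify: {fact}"))
        # Step 4: rate the fact against the retrieved evidence.
        verdict = llm(f"Evidence: {evidence}\nIs the fact '{fact}' supported? yes or no:")
        if verdict == "yes":
            supported += 1
        else:
            not_supported += 1
    return {"supported": supported, "not_supported": not_supported}
```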

DeepMind also proposed a new way to score long-form factual responses. The traditional F1 score (used for classification tasks) wasn't designed to handle longer, more complex text. F1@K balances precision (the percentage of provided facts that are correct) against a concept called recall.

Recall takes into account a user's ideal response length, K; after all, an LLM could gain high precision by providing a single correct fact, so recall rewards responses that supply as many supported facts as the user wants, up to K.
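
Based on the paper's description, the metric can be computed roughly as follows (a sketch; K is the user's preferred number of supported facts):

```python
def f1_at_k(supported: int, not_supported: int, k: int) -> float:
    """Combine factual precision with length-aware recall.

    supported / not_supported: counts of facts rated by SAFE.
    k: how many supported facts a user would consider an ideal response.
    """
    if supported == 0:
        return 0.0
    precision = supported / (supported + not_supported)
    recall = min(supported / k, 1.0)
    return 2 * precision * recall / (precision + recall)

# A one-fact answer is precise but scores poorly when the user wants ~64 facts:
print(f1_at_k(1, 0, 64))    # ~0.031
print(f1_at_k(40, 10, 64))  # ~0.702
```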

DeepMind benchmarked a range of large language models of varying sizes, and their findings aligned with the intuition that larger models tend to demonstrate greater long-form factual accuracy. This can be explained by the fact that larger models are trained on massive datasets of text and code, which imbues them with a richer and more comprehensive understanding of the world.

Imagine an LLM like a student who has studied a vast library of books. The more books the student has read, the more likely they are to have encountered and retained factual information on a wide range of topics. Similarly, a larger LLM with its broader exposure to information is better equipped to generate factually sound text.

To perform this measurement, Google DeepMind tested models from the Gemini, GPT, Claude (versions 3 and 2), and PaLM families.

DeepMind's study shows a promising path toward LLMs that can deliver more reliable factual information. SAFE achieved accuracy levels that exceeded human raters on certain tests.

However, it's crucial to note the limitations:

Search engine dependency: SAFE's accuracy relies on the quality of search results and the LLM's ability to interpret them.

Non-repeating facts: The F1@K metric assumes an ideal response won't contain repetitive information.

Despite potential limitations, this work undeniably moves the needle forward in the development of truthful AI systems. As LLMs continue to evolve, their ability to accurately convey facts could have profound impacts on how we use these models to find information and understand complex topics.

Featured image credit: Freepik

Humane, Rabbit, Brilliant, Meta: the AI gadgets are here – The Verge

I'm just going to call it: we'll look back on April 2024 as the beginning of a new technological era. That sounds grandiose, I know, but in the next few weeks, a whole new generation of gadgets is poised to hit the market. Humane will launch its voice-controlled AI Pin. Rabbit's AI-powered R1 will start to ship. Brilliant Labs' AI-enabled smart glasses are coming out. And Meta is rolling out a new feature to its smart glasses that allows Meta's AI to see and help you navigate the real world.

There are many more AI gadgets to come, but the AI hardware revolution is officially beginning. What all these gadgets have in common is that they put artificial intelligence at the front of the experience. When you tap your AI Pin to ask a question, play music, or take a photo, Humane runs your query through a series of language models to figure out what you're asking for and how best to accomplish it. When you ask your Rabbit R1 or your Meta smart glasses who makes that cool mug you're looking at, it pings through a series of image recognition and data processing models in order to tell you that's a Yeti Rambler. AI is not an app or a feature; it's the whole thing.

It's possible that one or many of these devices will so thoroughly nail the user experience and feature list that this month will feel both like the day you got your first flip phone and the day the iPhone made that flip phone look like an antique. But probably not. More likely, what we're about to get are a lot of new ideas about how you interact with technology. And together, they'll show us at least a glimpse of the future.

The primary argument against all these AI gadgets so far has been that the smartphone exists. Why, you might ask, do I need special hardware to access all this stuff? Why can't I just do it on the phone in my pocket? To that, I say, well, you mostly can! The ChatGPT app is great, Google's Gemini is rapidly taking over the Android experience, and if I were a betting man, I'd say there's a whole lot of AI coming to iOS this year.

Smartphones are great! None of these devices will kill or replace your phone, and anyone who says otherwise is lying to you. But after so many years of using our phones, we've forgotten how much friction they actually contain. To do almost anything on your phone, you have to take the device out of your pocket, look at it, unlock it, open an app, wait for the app to load, tap between one and 40,000 times, switch to another app, and repeat over and over again. Smartphones are great because they're able to contain and access practically everything, but they're not actually particularly efficient tools. And they're not going to get better, not as long as the app store business model stays the way it is.

The promise of AI, and I want to emphasize the word "promise" because nothing we've seen so far comes remotely close to accomplishing this, is to abstract all those steps and all that friction out of existence. All you need to do is declare your intentions (play music, navigate home, text Anna, tell me what poison ivy looks like) and let the system figure out how to get it done. Your phone contains multitudes, but it's not really optimized for anything. An AI-optimized gadget can be easier to reach, quicker to launch, and alert to your input at all times.

If that pans out, we might get not only a new set of gadgets but also a new set of huge companies. Google and Apple won the smartphone wars, and no company over the last decade has even come close to upsetting that app store duopoly. So much of the race to augmented reality, the metaverse, wearables, and everything else has been about trying to open up a new market. (On the flip side, it's no accident that while so many other companies are building AI gadgets, Google and Apple are working hastily to shove AI into your phone.) AI might turn out to be just another flailing attempt from the folks that lost the smartphone wars. But it might also be the first general-purpose, all-things-to-all-people technology that actually feels like an upgrade.

Obviously, the AI-first approach brings its own set of challenges. Starting with the whole "AI is not yet very good or reliable" thing. But even once we're past that, all the simplicity by abstraction can actually turn into confusion. What if I text Anna in multiple places? What if I listen to podcasts in Pocket Casts and music in Spotify and audiobooks in Audible, and I have accounts with a bunch of other music services I never even use? What if the closest four-star coffee shop is a Starbucks, and I hate Starbucks? If I tell my AI device to buy something, what card does it use? What retailer does it pick? How fast will it ship? Automation requires trust, and we don't yet have many reasons to trust AI.

So far, the most compelling approach seems to be a hybrid one. Both Humane and Rabbit have built complex web apps through which you can manage all your accounts, payment systems, conversation history, and other preferences. Rabbit allows you to actually teach your device how to do things the way you like. Both also have some kind of display (Humane, a laser projector; Rabbit, a small screen on the R1) on which you can check the AI's work or change the way it's planning to do something. The AI glasses from Meta and Brilliant try to address these problems either by directing you to look at something on your phone or just by not trying to do everything for everyone. AI can't do everything yet.

In many ways, it feels like it's 2004 again. I'd bet that none of these new devices will feel like a perfectly executed, entirely feature-complete product; even the people who make these gadgets don't think they've finished the job, no matter how self-serious their product videos might be. But before the iPhone turned the whole cellphone market into panes of glass, phones swiveled; they flipped; they were candy bars and clamshells and sliders and everything in between. Right now, everyone's searching for the iPhone of AI, but we're not getting that anytime soon. We might not get it ever, for that matter, because the promise of AI is that it doesn't require a certain kind of perfected interface; it doesn't require any interface at all. What we're going to get instead are the Razr, the Chocolate, the Treo, the Pearl, the N-Gage, and the Sidekick of AI. It's going to be chaos, and it's going to be great.
