Category Archives: AI
Alibaba’s AI video generator just dunked on Sora by making the Sora lady sing – Mashable
Alibaba wants you to compare its new AI video generator to OpenAI's Sora. Otherwise, why use it to make Sora's most famous creation belt out a Dua Lipa song?
On Tuesday, an organization called the "Institute for Intelligent Computing" within the Chinese e-commerce juggernaut Alibaba released a paper about an intriguing new AI video generator it has developed that's shockingly good at turning still images of faces into passable actors and charismatic singers. The system is called EMO, a fun backronym supposedly drawn from the words "Emote Portrait Alive" (though, in that case, why is it not called "EPA"?).
EMO is a peek into a future where a system like Sora makes video worlds, and rather than being populated by attractive mute people just kinda looking at each other, the "actors" in these AI creations say stuff or even sing.
Alibaba put demo videos on GitHub to show off its new video-generating framework. These include a video of the Sora lady famous for walking around AI-generated Tokyo just after a rainstorm singing "Don't Start Now" by Dua Lipa and getting pretty funky with it.
The demos also reveal how EMO can, to cite one example, make Audrey Hepburn speak the audio from a viral clip of Riverdale's Lili Reinhart talking about how much she loves crying. In that clip, Hepburn's head maintains a rather soldier-like upright position, but her whole face, not just her mouth, really does seem to emote the words in the audio.
In contrast to this uncanny version of Hepburn, Reinhart in the original clip moves her head a whole lot, and she also emotes quite differently, so EMO doesn't seem to be a riff on the sort of AI face-swapping that went viral back in the mid-2010s and led to the rise of deepfakes in 2017.
Over the past few years, applications designed to generate facial animation from audio have cropped up, but they haven't been all that inspiring. For instance, the NVIDIA Omniverse software package touts an app with an audio-to-facial-animation framework called "Audio2Face" which relies on 3D animation for its outputs rather than simply generating photorealistic video like EMO.
Despite Audio2Face only being two years old, the EMO demo makes it look like an antique. In a video that purports to show off its ability to mimic emotions while talking, the 3D face it depicts looks more like a puppet in a facial expression mask, while EMO's characters seem to express the shades of complex emotion that come across in each audio clip.
It's worth noting at this point that, like with Sora, we're assessing this AI framework based on a demo provided by its creators, and we don't actually have our hands on a usable version that we can test. So it's tough to imagine that right out of the gate this piece of software can churn out such convincingly human facial performances based on audio without significant trial and error, or task-specific fine-tuning.
The characters in the demos mostly aren't expressing speech that calls for extreme emotions (faces screwed up in rage or melting down in tears, for instance), so it remains to be seen how EMO would handle heavy emotion with audio alone as its guide. What's more, despite being made in China, it's depicted as a total polyglot, capable of picking up on the phonics of English and Korean and making the faces form the appropriate phonemes with decent, though far from perfect, fidelity. In other words, it would be nice to see how EMO performed if you fed it audio of a very angry person speaking a lesser-known language.
Also fascinating are the little embellishments between phrases (pursed lips or a downward glance) that insert emotion into the pauses rather than just the times when the lips are moving. These are examples of how a real human face emotes, and it's tantalizing to see EMO get them so right, even in such a limited demo.
According to the paper, EMO's model relies on a large dataset of audio and video (once again: from where?) to give it the reference points necessary to emote so realistically. Its diffusion-based approach apparently doesn't involve an intermediate step in which 3D models do part of the work. Instead, the model pairs a reference-attention mechanism with a separate audio-attention mechanism, producing animated characters whose facial animations match what comes across in the audio while remaining true to the facial characteristics of the provided base image.
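To make that concrete, here is a minimal sketch of how a denoising step might pair the two attention mechanisms the paper describes. This is our illustration of the general technique, not Alibaba's code; every module name, shape, and dimension below is an assumption.

```python
# Minimal sketch of a diffusion denoising step pairing reference attention
# (to keep the identity of the base portrait) with audio attention (to drive
# expression from the soundtrack). Module names and tensor shapes are
# illustrative assumptions, not Alibaba's implementation.
import torch
import torch.nn as nn

class EmoStyleDenoiser(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.ref_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.audio_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.out = nn.Linear(dim, dim)

    def forward(self, noisy_frames, ref_tokens, audio_tokens):
        # Attend to the reference image tokens so generated frames stay
        # faithful to the subject's facial characteristics.
        x, _ = self.ref_attn(noisy_frames, ref_tokens, ref_tokens)
        # Attend to audio features so mouth shape and expression track
        # the phonemes and emotion in the clip.
        x, _ = self.audio_attn(x, audio_tokens, audio_tokens)
        return self.out(x)  # predicted noise residual for this step

denoiser = EmoStyleDenoiser()
frames = torch.randn(1, 16, 256)   # 16 latent video frames
ref = torch.randn(1, 77, 256)      # tokens from the still portrait
audio = torch.randn(1, 50, 256)    # per-frame audio embeddings
print(denoiser(frames, ref, audio).shape)  # torch.Size([1, 16, 256])
```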
It's an impressive collection of demos, and after watching them it's impossible not to imagine what's coming next. But if you make your money as an actor, try not to imagine too hard, because things get pretty disturbing pretty quick.
Original post:
Alibaba's AI video generator just dunked on Sora by making the Sora lady sing - Mashable
The AI Culture Wars Are Just Getting Started – WIRED
Google was forced to turn off the image-generation capabilities of its latest AI model, Gemini, last week after complaints that it defaulted to depicting women and people of color when asked to create images of historical figures that were generally white and male, including vikings, popes, and German soldiers. The company publicly apologized and said it would do better. And Alphabet's CEO, Sundar Pichai, sent a mea culpa memo to staff on Wednesday. "I know that some of its responses have offended our users and shown bias," it reads. "To be clear, that's completely unacceptable, and we got it wrong."
Google's critics have not been silenced, however. In recent days, conservative voices on social media have highlighted text responses from Gemini that they claim reveal a liberal bias. On Sunday, Elon Musk posted screenshots on X showing Gemini stating that it would be unacceptable to misgender Caitlyn Jenner even if this were the only way to avert nuclear war. "Google Gemini is super racist and sexist," Musk wrote.
A source familiar with the situation says that some within Google feel that the furor reflects how norms about what it is appropriate for AI models to produce are still in flux. The company is working on projects that could reduce the kinds of issues seen in Gemini in the future, the source says.
Google's past efforts to increase the diversity of its algorithms' output have met with less opprobrium. Google previously tweaked its search engine to show greater diversity in images. This means more women and people of color in images depicting CEOs, even though this may not be representative of corporate reality.
Google's Gemini often defaulted to showing non-white people and women because of how the company used a process called fine-tuning to guide the model's responses. The company tried to compensate for the biases that commonly occur in image generators due to the presence of harmful cultural stereotypes in the images used to train them, many of which are sourced from the web and show a white, Western bias. Without such fine-tuning, AI image generators show biases by predominantly generating images of white people when asked to depict doctors or lawyers, or disproportionately showing Black people when asked to create images of criminals. It seems that Google ended up overcompensating, or didn't properly test the consequences of the adjustments it made to correct for bias.
Why did that happen? Perhaps simply because Google rushed Gemini. The company is clearly struggling to find the right cadence for releasing AI. It once took a more cautious approach with its AI technology, deciding not to release a powerful chatbot due to ethical concerns. After OpenAI's ChatGPT took the world by storm, Google shifted into a different gear. In its haste, quality control appears to have suffered.
"Gemini's behavior seems like an abject product failure," says Arvind Narayanan, a professor at Princeton University and coauthor of a book on fairness in machine learning. "These are the same kinds of issues we've been seeing for years. It boggles the mind that they released an image generator without apparently ever trying to generate an image of a historical person."
Chatbots like Gemini and ChatGPT are fine-tuned through a process that involves having humans test a model and provide feedback, either according to instructions they were given or using their own judgment. Paul Christiano, an AI researcher who previously worked on aligning language models at OpenAI, says Gemini's controversial responses may reflect that Google sought to train its model quickly and didn't perform enough checks on its behavior. But he adds that trying to align AI models inevitably involves judgment calls that not everyone will agree with. The hypothetical questions being used to try to catch out Gemini generally force the chatbot into territory where it's tricky to satisfy everyone. "It is absolutely the case that any question that uses phrases like 'more important' or 'better' is going to be debatable," he says.
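The feedback step Christiano describes is typically implemented by training a reward model on human preference judgments. Below is a generic, minimal sketch of that idea, a Bradley-Terry style loss over stand-in embeddings; it illustrates the technique in the abstract and is not Google's or OpenAI's pipeline.

```python
# Toy reward-model update from a single human preference judgment.
# Generic illustration of preference fine-tuning, not any vendor's pipeline.
import torch
import torch.nn as nn
import torch.nn.functional as F

reward_model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

preferred = torch.randn(1, 128)  # embedding of the answer the rater chose
rejected = torch.randn(1, 128)   # embedding of the answer they passed on

# Bradley-Terry style loss: push the chosen answer's score above the other's.
# Which answer counts as "chosen" is exactly the judgment call Christiano
# says not everyone will agree with.
loss = -F.logsigmoid(reward_model(preferred) - reward_model(rejected)).mean()
opt.zero_grad()
loss.backward()
opt.step()
```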
Read the original post:
AI Outshines Humans in Creative Thinking – Neuroscience News
Summary: ChatGPT-4 was pitted against 151 human participants across three divergent thinking tests, revealing that the AI demonstrated a higher level of creativity. The tests, designed to assess the ability to generate unique solutions, showed GPT-4 providing more original and elaborate answers.
The study underscores the evolving capabilities of AI in creative domains, yet acknowledges the limitations of AI's agency and the challenges in measuring creativity. While AI shows potential as a tool for enhancing human creativity, questions remain about its role and the future integration of AI in creative processes.
Source: University of Arkansas
Score another one for artificial intelligence. In a recent study, 151 human participants were pitted against ChatGPT-4 in three tests designed to measure divergent thinking, which is considered to be an indicator of creative thought.
Divergent thinking is characterized by the ability to generate a unique solution to a question that does not have one expected solution, such as "What is the best way to avoid talking about politics with my parents?" In the study, GPT-4 provided more original and elaborate answers than the human participants.
The study, "The current state of artificial intelligence generative language models is more creative than humans on divergent thinking tasks," was published in Scientific Reports and authored by U of A Ph.D. students in psychological science Kent F. Hubert and Kim N. Awa, as well as Darya L. Zabelina, an assistant professor of psychological science at the U of A and director of the Mechanisms of Creative Cognition and Attention Lab.
The three tests utilized were the Alternative Uses Task, which asks participants to come up with creative uses for everyday objects like a rope or a fork; the Consequences Task, which invites participants to imagine possible outcomes of hypothetical situations, like "what if humans no longer needed sleep?"; and the Divergent Associations Task, which asks participants to generate 10 nouns that are as semantically distant as possible. For instance, there is not much semantic distance between "dog" and "cat," while there is a great deal between words like "cat" and "ontology."
Answers were evaluated for the number of responses, length of response and semantic difference between words. Ultimately, the authors found that, "Overall, GPT-4 was more original and elaborate than humans on each of the divergent thinking tasks, even when controlling for fluency of responses. In other words, GPT-4 demonstrated higher creative potential across an entire battery of divergent thinking tasks."
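Semantic difference in tasks like the Divergent Associations Task is usually scored as the average pairwise distance between word embeddings. A minimal sketch of that scoring, with toy vectors standing in for the pretrained embeddings (such as GloVe) that real scoring uses:

```python
# DAT-style scoring sketch: average pairwise cosine distance between the
# nouns a participant produced. Toy 3-d vectors stand in for pretrained
# word embeddings; real scoring would load something like GloVe.
from itertools import combinations
import numpy as np

embeddings = {
    "dog": np.array([0.9, 0.1, 0.0]),
    "cat": np.array([0.8, 0.2, 0.1]),
    "ontology": np.array([0.0, 0.3, 0.9]),
}

def cosine_distance(a, b):
    return 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def dat_score(words):
    return np.mean([cosine_distance(embeddings[a], embeddings[b])
                    for a, b in combinations(words, 2)])

print(f"dog/cat/ontology: {dat_score(['dog', 'cat', 'ontology']):.3f}")
print(f"dog/cat only:     {dat_score(['dog', 'cat']):.3f}")  # far smaller
```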
This finding does come with some caveats. The authors state, "It is important to note that the measures used in this study are all measures of creative potential, but the involvement in creative activities or achievements are another aspect of measuring a person's creativity."
The purpose of the study was to examine human-level creative potential, not necessarily people who may have established creative credentials.
Hubert and Awa further note that AI, unlike humans, does not have agency and is dependent on the assistance of a human user. Therefore, the creative potential of AI is in a constant state of stagnation unless prompted.
Also, the researchers did not evaluate the appropriateness of GPT-4 responses. So while the AI may have provided more responses and more original responses, human participants may have felt they were constrained by their responses needing to be grounded in the real world.
Awa also acknowledged that the human motivation to write elaborate answers may not have been high, and said there are additional questions about "how do you operationalize creativity? Can we really say that using these tests for humans is generalizable to different people? Is it assessing a broad array of creative thinking? So I think it has us critically examining what are the most popular measures of divergent thinking."
Whether the tests are perfect measures of human creative potential is not really the point. The point is that large language models are rapidly progressing and outperforming humans in ways they have not before. Whether they are a threat to replace human creativity remains to be seen.
For now, the authors see promise ahead: "Moving forward, future possibilities of AI acting as a tool of inspiration, as an aid in a person's creative process or to overcome fixedness is promising."
Author: Hardin Young. Source: University of Arkansas. Contact: Hardin Young, University of Arkansas. Image: The image is credited to Neuroscience News.
Original Research: Open access. "The current state of artificial intelligence generative language models is more creative than humans on divergent thinking tasks" by Kent Hubert et al., Scientific Reports.
Abstract
The current state of artificial intelligence generative language models is more creative than humans on divergent thinking tasks
The emergence of publicly accessible artificial intelligence (AI) large language models such as ChatGPT has given rise to global conversations on the implications of AI capabilities.
Emergent research on AI has challenged the assumption that creative potential is a uniquely human trait; thus, there seems to be a disconnect between human perception and what AI is objectively capable of creating.
Here, we aimed to assess the creative potential of humans in comparison to AI. In the present study, human participants (N=151) and GPT-4 provided responses for the Alternative Uses Task, Consequences Task, and Divergent Associations Task.
We found that AI was robustly more creative along each divergent thinking measurement in comparison to the human counterparts. Specifically, when controlling for fluency of responses, AI was more original and elaborate.
The present findings suggest that the current state of AI language models demonstrates higher creative potential than human respondents.
Original post:
AI Outshines Humans in Creative Thinking - Neuroscience News
Wendy’s surge price reframe, Tyler Perry blames AI, and inflation falls : The Indicator from Planet Money – NPR
Wendy's surge price reframe, Tyler Perry blames AI, and inflation falls : The Indicator from Planet Money. It's Indicators of the Week, our weekly look under the hood of the global economy! Today on the show: Tyler Perry halts his film studio expansion plans because of AI, Wendy's communications about a new pricing board go haywire and a key inflation measure falls.
Related episodes: Listener Questions: the 30-year fixed mortgage, upgrade auctions, PCE inflation (Apple / Spotify); AI creates, transforms and destroys... jobs (Apple / Spotify); The secret entrance that sidesteps Hollywood picket lines (Apple / Spotify); The Birth And Death Of The Price Tag
For sponsor-free episodes of The Indicator from Planet Money, subscribe to Planet Money+ via Apple Podcasts or at plus.npr.org.
Music by Drop Electric. Find us: TikTok, Instagram, Facebook, Newsletter.
Read more:
Google Is Paying Publishers Five-Figure Sums to Test an Unreleased Gen AI Platform – Adweek
Google launched a private program for a handful of independent publishers last month, providing the news organizations with beta access to an unreleased generative artificial intelligence platform in exchange for receiving analytics and feedback, according to documents seen by ADWEEK.
As part of the agreement, the publishers are expected to use the suite of tools to produce a fixed volume of content for 12 months. In return, the news outlets receive a monthly stipend amounting to a five-figure sum annually, as well as the means to produce content relevant to their readership at no cost.
"In partnership with news publishers, especially smaller publishers, we're in the early stages of exploring ideas to potentially provide AI-enabled tools to help journalists with their work," a Google representative said in a statement. "This speculation about this tool being used to re-publish other outlets' work is inaccurate. The experimental tool is being responsibly designed to help small, local publishers produce high quality journalism using factual content from public data sources, like a local government's public information office or health authority. These tools are not intended to, and cannot, replace the essential role journalists have in reporting, creating and fact-checking their articles."
The beta tools let under-resourced publishers create aggregated content more efficiently by indexing recently published reports generated by other organizations, like government agencies and neighboring news outlets, and then summarizing and publishing them as a new article.
Other gen AI experiments Google has released over the past two years include a tool codenamed Genesis, which can reportedly produce whole news articles and was privately demonstrated to several publishers last summer, according to The New York Times. Others, including Search Generative Experience and Gemini, are available for public use and threaten to upend many of the commercial foundations of digital publishing.
The program is part of the Google News Initiative, which launched in 2018 to provide publishers with technology and training.
Although many of its programs indisputably benefit the publishers involved, the broader reception of GNI has been mixed.
Google has used GNI to drum up positive press and industry goodwill during moments of reputational duress, and many of the commercial problems it aims to solve for publishers were created by Google in the first place, said Digital Content Next CEO Jason Kint.
"The larger point here is that Google is in legislative activity and antitrust enforcement globally for extracting revenue from the publishing world," Kint said. "Instead of giving up some of that revenue, it's attacking the cost side for its long-tail members with the least bargaining power."
Google first shared a call for news organizations to apply to test the emerging technologies in an October edition of the Local Independent Online News newsletter.
GNI began onboarding publishers in January, and the yearlong program kicked off in February.
According to the conditions of the agreement, participating publishers must use the platform to produce and publish three articles per day, one newsletter per week and one marketing campaign per month.
To produce articles, publishers first compile a list of external websites that regularly produce news and reports relevant to their readership. These sources of original material are not asked for their consent to have their content scraped or notified of their participation in the process, a potentially troubling precedent, said Kint.
When any of these indexed websites produce a new article, it appears on the platform dashboard. The publisher can then apply the gen AI tool to summarize the article, altering the language and style of the report to read like a news story.
The resulting copy is underlined in different colors to indicate its potential accuracy: yellow, with language taken almost verbatim from the source material, is the most accurate, followed by blue and then red, with text that is least based on the original report.
A human editor then scans the copy for accuracy before publishing three such stories per day. The program does not require that these AI-assisted articles be labeled.
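Google has not said how the tool computes those accuracy bands. A simple way to approximate "closeness to the source," though, is n-gram overlap; the sketch below is purely illustrative, and both the metric and the thresholds are our assumptions.

```python
# Rough approximation of the described color-coding: score each generated
# sentence by word-trigram overlap with the source article. The overlap
# metric and thresholds are illustrative assumptions; Google has not
# published how the yellow/blue/red bands are computed.
def ngrams(text: str, n: int = 3) -> set:
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def color_for(sentence: str, source: str) -> str:
    grams = ngrams(sentence)
    if not grams:
        return "red"
    overlap = len(grams & ngrams(source)) / len(grams)
    if overlap > 0.7:
        return "yellow"  # near-verbatim, treated as most accurate
    if overlap > 0.3:
        return "blue"
    return "red"         # heavily rewritten, needs the closest human check

source = "the city council approved the new transit budget on tuesday night"
print(color_for("The city council approved the new transit budget", source))
# -> yellow
```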
The platform cannot gather facts or information that have not already been produced elsewhere, limiting its utility for premium publishers.
Articles produced by the platform could also draw traffic away from the original sources, negatively affecting their businesses. The process resembles the ripping technique newly in use at Reach plc, except that in this case, the text is pulled from external sources.
"I think this calls into question the mission of GNI," Kint said. "It's hard to argue that stealing people's work supports the mission of the news. This is not adding any new information to the mix."
Go here to read the rest:
Google Is Paying Publishers Five-Figure Sums to Test an Unreleased Gen AI Platform - Adweek
Elon Musk sues OpenAI and Sam Altman over betrayal of nonprofit AI mission – TechCrunch
Elon Musk has sued OpenAI, its co-founders Sam Altman and Greg Brockman, and its affiliated entities, alleging the ChatGPT makers have breached their original contractual agreements by pursuing profits instead of the nonprofit's founding mission to develop AI that benefits humanity.
Musk, a co-founder and early backer of OpenAI, claims Altman and Brockman convinced him to help found and bankroll the startup in 2015 with promises it would be a nonprofit focused on countering the competitive threat from Google. The founding agreement required OpenAI to make its technology freely available to the public, the lawsuit alleges.
The lawsuit, filed in a court in San Francisco late Thursday, says that OpenAI, the world's most valuable AI startup, has shifted to a for-profit model focused on commercializing its AGI research after partnering with Microsoft, the world's most valuable company, which has invested about $13 billion into the startup.
"In reality, however, OpenAI, Inc. has been transformed into a closed-source de facto subsidiary of the largest technology company in the world: Microsoft. Under its new board, it is not just developing but is actually refining an AGI to maximize profits for Microsoft, rather than for the benefit of humanity," the lawsuit adds. "This was a stark betrayal of the Founding Agreement."
The lawsuit follows Musk airing concerns about OpenAI's shift in priorities over the past year. According to the legal complaint, Musk donated over $44 million to the nonprofit between 2016 and September 2020. "For the first several years, he was the largest contributor to OpenAI," the lawsuit adds. Musk, who left OpenAI's board in 2018, has been offered a stake in the for-profit arm of the startup but has refused to accept it, citing what he earlier described as a principled stand.
X, the social network owned by Musk, last year launched Grok, a rival to ChatGPT.
Altman has also addressed some of Musk's concerns in the past, including the close ties with Microsoft. "I like the dude. I think he's totally wrong about this stuff," he said of Musk's criticisms at a conference last year. "He can sort of say whatever he wants, but I'm like proud of what we're doing and I think we're going to make a positive contribution to the world and I try to stay above all that."
OpenAI's launch of ChatGPT in late 2022 sparked an AI arms race, with rivals still scrambling to match its uncannily human-like responses. Microsoft CEO Satya Nadella landed a gloved jab at the rest of the industry last month. "We have the best model today. Even with all the hoopla, one year after, GPT-4 is better," he said. "We are waiting for the competition to arrive. It will arrive, I'm sure, but the fact [is] that we have the leading LLM out there."
An email exchange between Musk and Altman, presented as evidence in the lawsuit. Image Credits: TechCrunch/screenshot
The Thursday lawsuit alleges close alignment between Microsoft and OpenAI, citing a recent interview with Nadella. Amid a dramatic leadership shake-up at OpenAI late last year, Nadella stated that if OpenAI disappeared tomorrow, "we have all the IP rights and all the capability. We have the people, we have the compute, we have the data, we have everything. We are below them, above them, around them." The lawsuit presents this as evidence that OpenAI has strongly served Microsoft's interests.
The lawsuit also centers on OpenAI's GPT-4, which Musk claims constitutes AGI, an AI whose intelligence is on par with, if not higher than, that of humans. He alleges OpenAI and Microsoft have improperly licensed GPT-4 despite agreeing that OpenAI's AGI capabilities would remain dedicated to humanity.
Through the lawsuit, Musk is seeking to compel OpenAI to adhere to its original mission and to bar it from monetizing technologies developed under its nonprofit for the benefit of OpenAI executives or partners like Microsoft.
The suit also requests that the court rule that AI systems like GPT-4 and other advanced models in development constitute artificial general intelligence that reaches beyond licensing agreements. In addition to injunctions forcing OpenAI's hand, Musk asks for accounting and potential restitution of donations meant to fund its public-minded research should the court find it now operates for private gain.
"Mr. Altman hand-picked a new Board that lacks similar technical expertise or any substantial background in AI governance, which the previous board had by design. Mr. D'Angelo, a tech CEO and entrepreneur, was the only member of the previous board to remain after Mr. Altman's return. The new Board consisted of members with more experience in profit-centric enterprises or politics than in AI ethics and governance," the lawsuit adds.
Read this article:
Elon Musk sues OpenAI and Sam Altman over betrayal of nonprofit AI mission - TechCrunch
How AI is already helping scientists track threatened humpback whales – NPR
Humpback whales that spend their winters in Hawaii, like this mother and calf, have declined over the last decade. Martin van Aswegen/Marine Mammal Research Program, University of Hawaii at Manoa, NMFS Permit No. 21476/21321
After decades of whaling decimated their numbers, humpback whales have made a remarkable comeback. The 50-foot giants, known for their elaborate songs, have become common in parts of the Pacific Ocean they disappeared from.
Now, a new study finds that climate change could be slowing that recovery. Using artificial intelligence-powered image recognition, the survey finds the humpback population in the North Pacific Ocean declined 20% from 2012 to 2021.
The decline coincides with "the blob," a severe marine heat wave that raised water temperatures from Alaska to California. The impacts cascaded through the food web, affecting fish, birds and whales.
"I think the scary part of some of the changes we've seen in ocean conditions is the speed at which they're occurring," says John Calambokidis, a whale biologist at Cascadia Research and a co-author on the study. "And that would put long-lived, slow-reproducing species like humpback whales and other large whales as more vulnerable."
Ted Cheeseman is a coauthor of the new study, and for 30 years he worked as a naturalist, guiding trips on boats around Antarctica. That meant looking for whales, which wasn't easy in the early 1990s.
"We saw very, very few whales," he says. "In the 2000s, we saw more. The 2010s we started seeing quite a few whales."
The whales were making a slow recovery after industrial whaling, which continued into the 1960s for many species. Over years of photographing whales, Cheeseman realized he was collecting valuable data for scientists.
Photographs are key for counting whales. As they dive deep, humpbacks raise their tails out of the water, revealing markings and patterns unique to each individual. Scientists typically identify whales photo by photo, matching the tails in a painstaking process.
Humpback whale tails have unique markings, allowing both scientists and computer algorithms to identify individual whales. Ted Cheeseman
Cheeseman figured that technology could do that more quickly. He started Happy Whale, which uses artificial intelligence-powered image recognition to identify whales. The project pulled together about 200,000 photos of humpback whales. Many came from scientists who had built large image catalogs over the years. Others came from whale watching groups and citizen scientists, since the website is designed to share the identity of a whale and where it's been seen.
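Under the hood, photo-ID systems like this typically turn each fluke photo into an embedding vector and match a new sighting against the nearest catalog entry. Here is a generic sketch of that matching step; Happy Whale's actual model, catalog IDs, and similarity threshold are not public, so everything below is illustrative.

```python
# Generic catalog-matching sketch: embed each fluke photo as a vector and
# assign a new sighting to the nearest known whale if it is similar enough.
# The embeddings, IDs, and threshold here are illustrative stand-ins.
import numpy as np

catalog = {  # whale ID -> embedding of its cataloged fluke photo
    "whale-0001": np.array([0.9, 0.1, 0.3]),
    "whale-0002": np.array([0.2, 0.8, 0.5]),
}

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def match(sighting, threshold=0.95):
    best_id, best_sim = max(
        ((wid, cosine(sighting, emb)) for wid, emb in catalog.items()),
        key=lambda pair: pair[1],
    )
    return best_id if best_sim >= threshold else "new individual"

print(match(np.array([0.88, 0.12, 0.31])))  # -> whale-0001
```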
"In the North Pacific, we have identified almost every living whale," Cheeseman says. "We were just doing this as a study of the population. We didn't expect to see a major impact of climate."
Humpbacks in the North Pacific Ocean likely dropped to only 1,200 to 1,600 individuals in the wake of whaling. By 2012, they had climbed back to around 33,000 whales. The study finds that after that, their numbers started falling again.
The biggest decline was seen in one particular group of humpbacks in the Pacific. As migratory animals, the whales swim thousands of miles, returning to the same sites every year. Some whales spend their summers feeding in Alaska and then head to Hawaii for the winter. The study found this group declined 34%, while other groups didn't see as sharp of a drop.
"It tells us something pretty dramatic happened for humpback whales," Calambokidis says. "We are facing a new era of impacts."
Calambokidis says that for years, scientists wondered whether humpbacks had recovered so well that they'd hit a natural plateau, a point where the ecosystem couldn't support more animals. He says the study shows something else is at play too.
The Alaska-Hawaii whales may have been more susceptible to the dramatic changes caused by "the blob." Spanning several years, the intense marine heat wave disrupted the food chain, including tiny organisms like krill that feed larger animals like whales. Studies show that marine heat waves are likely to become more common as the climate keeps warming due to the burning of fossil fuels. Humpbacks are also vulnerable to ship strikes and getting entangled in fishing gear off the West Coast.
Calambokidis says the humpback decline was easier to detect because the whales have recovered so strongly. For rarer whales, it's much harder to track and count them, making it difficult to see how marine heat waves may be having an impact. The hope is that new technology, like Happy Whale, will help reveal these changes faster than ever before.
Here is the original post:
How AI is already helping scientists track threatened humpback whales - NPR
Adobe announces new prototype AI tool for creating and editing audio – The Verge
Adobe's latest generative AI experiment aims to help people create and customize music without any professional audio experience. Announced during the Hot Pod Summit in Brooklyn on Wednesday, Project Music GenAI Control is a new prototype tool that allows users to generate music using text prompts and then edit that audio without jumping over to dedicated editing software.
Users start by inputting a text description that will generate music in a specified style, such as "happy dance" or "sad jazz." Adobe says its integrated editing controls then allow users to customize those results, adjusting any repeating patterns, tempo, intensity, and structure. Sections of music can be remixed, and audio can be generated as a repeating loop for people who need things like backing tracks or background music for content creation.
Adobe also says the tool can adjust the generated audio based on a reference melody and extend the length of audio clips if you want to make the track long enough for things like a fixed animation or podcast segments. The actual user interface for editing generated audio hasn't been revealed yet, so we'll need to use our imaginations for now.
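While we wait, here is one way the workflow Adobe describes could map onto code. To be clear, this is a hypothetical sketch: Adobe has published no API or interface for Project Music GenAI Control, and every class and method name below is invented for illustration.

```python
# Hypothetical sketch of the described workflow: generate a track from a
# text prompt, then edit tempo and extend its length. All names here are
# invented; Adobe has not published an API for this prototype.
from dataclasses import dataclass, field

@dataclass
class GeneratedTrack:
    prompt: str
    tempo_bpm: int = 120
    intensity: float = 0.5          # 0.0 quiet .. 1.0 full
    length_seconds: int = 30
    edits: list = field(default_factory=list)

    def set_tempo(self, bpm: int):
        self.tempo_bpm = bpm
        self.edits.append(f"tempo -> {bpm} bpm")

    def extend(self, seconds: int):
        # The tool reportedly extends clips for fixed-length uses such as
        # animations or podcast segments.
        self.length_seconds += seconds
        self.edits.append(f"extended by {seconds}s")

track = GeneratedTrack(prompt="happy dance")
track.set_tempo(128)
track.extend(60)
print(track.edits)  # ['tempo -> 128 bpm', 'extended by 60s']
```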
Adobe says public domain content was uploaded for the public Project Music GenAI Control demo, but it's not clear if the tool could allow any audio to be directly uploaded as reference material, or how long clips can be extended. We have asked Adobe to clarify this and will update this article if we hear back.
While similar tools are already available or being developed, such as Google's MusicLM and Meta's open-source AudioCraft, these only allow users to generate audio via text prompts, with little to no support for editing the music output. That means you'd have to keep generating audio from scratch until you get the results you want, or manually make those edits yourself using audio editing software.
"One of the most exciting things about these new tools is that they aren't just about generating audio," said Nicholas Bryan, a senior research scientist at Adobe Research, in a press release. "They're taking it to the level of Photoshop by giving creatives the same kind of deep control to shape, tweak, and edit their audio. It's a kind of pixel-level control for music."
Project Music GenAI is being developed in collaboration with the University of California and the School of Computer Science at Carnegie Mellon University. Adobe describes it as an early-stage experiment, so while these features may eventually be incorporated into the company's existing editing tools like Audition and Premiere Pro, it's going to take some time. The tool isn't available to the public yet, and no release date has been announced. You can track Project Music GenAI's development alongside other experiments Adobe is working on over at the Adobe Labs website.
Go here to see the original:
Adobe announces new prototype AI tool for creating and editing audio - The Verge
Why scientists are tracking whale tails with AI – Popular Science
Researchers using an AI photo-scanning tool similar to facial recognition have learned that there's been a 20% decline in North Pacific Ocean humpback whale populations over the past decade. The researchers pointed to a climate change-related heat wave as a possible culprit. The findings, published this week in Royal Society Open Science, used the artificial intelligence-powered image detection model to analyze more than 200,000 photographs of humpback whales taken between 2001 and 2022.
Facial recognition models used to identify humans have faced sustained criticism from researchers and advocates who say the models struggle to accurately identify nonwhite people. In this case, the model scanning humpback whale photos was trained to spot and recognize unique identifiers on a whale's tail. These identifiers function like a one-of-a-kind whale fingerprint and can consist of marks, variations in pigmentation, scarring, and overall size. Researchers used successful photo matches to inform estimates for humpback whale populations over time.
Images of the whale tails, captured by scientists and whale watchers alike, are stored by a nonprofit called HappyWhale, which describes itself as the "largest individual identification resource ever built for marine mammals." HappyWhale encourages everyday citizen scientists to take photos of whales they see and upload them to its growing database. The photos include the date and location of where the whale was spotted.
From there, users can track a whale they photographed and contribute to a growing corpus of data researchers can use to more accurately understand the species' population and migration patterns. Prior to this AI-assisted method, experts had to comb through individual whale tail photographs looking for similarities with the naked eye, a process both painstaking and time-consuming. Image-matching technology speeds up that process, giving researchers more time to investigate changes in population data.
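Those photo matches feed directly into standard capture-recapture estimators: the share of whales re-sighted in a later season indicates how many were never photographed at all. A minimal sketch of the classic Lincoln-Petersen estimator, using illustrative numbers rather than the study's data:

```python
# Lincoln-Petersen capture-recapture estimate from photo-ID data: the more
# of season two's whales that were already cataloged in season one, the
# smaller the total population is likely to be. Numbers are illustrative.
def lincoln_petersen(marked_first: int, caught_second: int,
                     recaptured: int) -> float:
    return marked_first * caught_second / recaptured

# 1,000 whales identified in year one; of 800 identified in year two,
# 620 were already in the catalog.
print(round(lincoln_petersen(1000, 800, 620)))  # -> 1290
```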
"Having an algorithm like this dramatically speeds up the information-gathering process, which hopefully speeds up timely management actions," Philip Patton, a University of Hawaii at Manoa PhD student who has worked with the tool, said in a previous interview with Spectrum News.
Humpback whales, once on the brink of extinction, have seen their population grow in the 40 years since commercial hunting of the species was made illegal, so much so that the giant mammals were removed from the endangered species list in the US in 2016. But that rebound is at risk of being short-lived. Researchers analyzing the whale data estimate the population peaked in 2012 at around 33,488. Then the numbers started trickling downward. From 2012 to 2021, the whale population dropped to 26,662, a decline of around 20%. Researchers say that downward trend coincided with a record heat wave that raised ocean temperatures and may have altered the course of the species' recovery.
That historic heat wave resulted in rising surface sea temperatures and decreases in nutrient-rich water, which in turn led to reductions in phytoplankton biomass. These changes led to greater disruptions in the food chain, which the researchers say limited the whales' access to krill and other food sources. While they acknowledged that ship collisions and entanglements could be responsible for some of the population decline, the researchers said those factors couldn't account for the entirety of it.
"These advances have shifted the abundance estimation paradigm from data scarcity and periodic study to continuous and accessible tracking of the ocean-basin-wide population through time," the researchers wrote.
Whales aren't the only animals having their photos run through image detection algorithms. Scientists use various forms of the technology to research populations of cows, chickens, salmon, and lemurs, among other species. Though primarily used as an aid for conservation and population estimation, some researchers have reportedly used the technology to analyze facial cues in domesticated sheep to determine whether or not they felt pain in certain scenarios. Others have used photo-matching software to try to find missing pets.
These examples and others highlight the upside of image- and pattern-matching algorithms capable of sifting through vast image databases. In the case of conservation, accurate population estimates made possible by these technologies can help determine whether certain species require endangered classifications or other resources to help maintain healthy populations.
See the original post here:
Why scientists are tracking whale tails with AI - Popular Science
An update on the BBC’s plans for Generative AI (Gen AI) and how we plan to use AI tools responsibly – BBC.com
Rhodri Talfan Davies is the BBC's Director of Nations. He's responsible for bringing teams together across the BBC to shape our response to an emerging area of technology called Generative AI (or Gen AI). Here he sets out the latest on our plans:
In October 2023, we shared our approach to working with Generative AI (Gen AI) technology and outlined three principles that will shape our approach in this area.
We set out that we would use Gen AI in support of our public mission, prioritise talent and creativity, and be open and transparent with our audiences.
Today I wanted to give you an update on what we've been doing, including an outline of a number of Gen AI test pilots that we're currently working on and details of new guidance that we're providing on the use of AI.
In October we said that we would start a number of projects that explore the use of Gen AI in both what we make and how we work - taking a targeted approach to better understand both the opportunities and risks.
We've now chosen the first projects, which will help us explore different areas where we think that Gen AI could bring significant benefits for audiences and for staff.
At this stage, the vast majority of the pilots are internal-only and won't be used to create content for audiences until we have had an opportunity to learn more.
There are 12 pilots in total (examples below) across three themes:
1) Maximising value of existing content
2) New audience experiences
3) Making how we do things quicker and easier
Theme 1: Pilots that maximise value of existing content include:
1) Translating content to make it available to more people
2) Reformatting our existing content in a way that widens its appeal
Theme 2: Pilots that aim to build new audience experiences include:
1) A BBC Assistant
2) More personalised marketing
Theme 3: Pilots that aim to do things quicker and easier include:
1) Supporting journalists
2) Streamlining how we organise and label our content
We will experiment in each of these areas over the next few months, testing and learning as we go. We'll see what works, what doesn't - and make a call on what we take forward. It'll be exciting to see how this develops.
In October, we shared clear guiding principles to ensure we use Gen AI technology responsibly at the BBC.
As a reminder, our principles commit us to harnessing the new technology to support our public mission, prioritising talent and creativity, and being open and transparent with our audiences whenever and wherever we deploy Gen AI.
Since then, we have updated the BBC's Editorial Guidance on the use of AI. This is for all content-makers, to ensure any use of AI supports our editorial values. It has been designed to ensure we never undermine the trust of our audience, and to ensure that all AI usage has active human oversight.
There's lots going on, and we'll be providing updates on this activity as it progresses.
Thanks for reading.
Read more from the original source: