
Vitalik Buterin Pioneers A Fresh Approach To Decentralize Ethereum Staking – Blockchain Magazine

March 28, 2024 by Diana Ambolis


Vitalik Buterin, Ethereum's Co-founder, Proposes Penalizing Validators Based on Deviation from Average Failure Rate

Ethereum co-founder Vitalik Buterin recently proposed a method to promote greater decentralization within the Ethereum network by implementing penalties for correlated failures among validators.

In a post to the Ethereum Research forum on March 27, Buterin advocated for reinforcing decentralized staking through anti-correlation incentives. Under his proposal, if multiple validators controlled by the same entity fail simultaneously, they would face more severe penalties than they would for individual failures.

Buterin highlighted the likelihood of correlated failures among validators in the same cluster, such as staking pools, which often share infrastructure. His proposal involves penalizing validators based on their deviation from the average failure rate, with higher penalties imposed when numerous validators fail in a given slot.


Based on simulations of this approach, Buterin believes it could diminish the advantage that larger Ethereum stakers enjoy over smaller ones, since large entities are more prone to causing spikes in failure rates due to correlated failures.

The proposed benefits include encouraging decentralization by necessitating separate infrastructure for each validator and making solo staking economically competitive relative to staking pools.

Buterin also suggested exploring other penalty schemes to mitigate the advantage of larger validators over smaller ones, as well as assessing the impact on geographical and client decentralization.

However, Buterin did not address the possibility of reducing the staking requirement from the current 32 Ether (worth roughly $114,000 at the current price of about $3,560 per ETH). This omission may be noteworthy, considering the popularity of staking pools and liquid staking services like Lido, which enable participation with smaller amounts of ETH.

As of now, Lido boasts $34 billion worth of ETH staked, representing about 30% of all staked ether. While these services have gained traction, Ethereum advocates and developers have raised concerns about the dominance and potential cartelization of platforms like Lido, which could lead to outsized profits compared to non-pooled capital.


Ethereum Founder Vitalik Says Shiba Inu Outperformed His Expectations – The Crypto Basic

The Ethereum founder Vitalik Buterin has admitted that Shiba Inu (SHIB) massively outperformed his earlier expectations for the project.

Buterin disclosed this while reacting to a statement from Nathan Young, a web designer and director at Frostwork. Young spotlighted the crypto assets held by the Future of Life Institute (FLI), a non-profit focused on AI research, noting that he never believed the organization's crypto balance amounted to much.

For context, FLI was one of the beneficiaries of Vitalik Buterin's Shiba Inu donations. Recall that the Ethereum founder received half of the entire Shiba Inu supply from Ryoshi, SHIB's anonymous founder. Buterin burned 410 trillion tokens and sent the rest to non-profits in May 2021.

One of these non-profits was FLI. Nathan Young had assumed the Shiba Inu tokens FLI received were worth little, but a filing revealed that their value actually amounted to $665 million.

Responding to Young's disclosure, Buterin noted that the FLI case is similar to what he observed with CryptoRelief, an Indian charity. It bears mentioning that Buterin also donated SHIB tokens to CryptoRelief.

He sent 50.6 trillion SHIB to CryptoRelief on May 12, 2021. The worth of the tokens at the time stood at $1.2 billion. Interestingly, in his latest disclosure, Buterin revealed that he initially believed Shiba Inu would crash shortly after he received the tokens from Ryoshi.


Notably, SHIB had already recorded a massive price rally before Buterin donated the tokens. However, the Ethereum co-founder expected the asset to crash within a short period, as is customary with memecoins. He had to move fast and donate these tokens before the expected price collapse.

Buterin noted that he expected CryptoRelief to salvage at least $10 million to $25 million from the donation before any drop in SHIB's price. However, the expected crash did not materialize: Shiba Inu continued to rally after the donation, eventually hitting its all-time high of $0.00008845 in October 2021, five months later.

"But of course, SHIB massively outperformed my expectations," Buterin disclosed in his recent remarks, calling attention to how the canine-themed project defied the odds that pointed to its eventual failure. Three years later, Shiba Inu still retains relevance, battling to seal its spot among the top 10.

While SHIB's worth did drop by 30% shortly after Buterin's donation to CryptoRelief, the token bounced back. At the all-time high in October 2021, the donated assets were worth $4.42 billion. If the charity had held onto the tokens until now, they'd still hold $1.57 billion.

Vitalik Buterin has consistently underestimated the meme market due to its lack of utility. In 2022, the Ethereum founder was surprised at the massive amount of Shiba Inu tokens held by Crypto.com. The Shiba Inu project is now gradually tilting toward utility, constantly exceeding expectations from industry leaders like Buterin.

Disclaimer: This content is informational and should not be considered financial advice. The views expressed in this article may include the author's personal opinions and do not reflect The Crypto Basic's opinion. Readers are encouraged to do thorough research before making any investment decisions. The Crypto Basic is not responsible for any financial losses.



3 Strong Buy Cloud Computing Stocks to Add to Your Q2 Must-Watch List – InvestorPlace

These companies are global leaders in the cloud computing space

Cloud computing is the storing and accessing of data and programs over the internet rather than on a computer's hard drive. With today's wired world creating data at an exponential rate, cloud computing is one of the fastest-growing areas of technology. And spending on cloud computing services is growing almost as fast as the data being accumulated.

According to Statista, spending on cloud infrastructure services worldwide grew by $12 billion in the fourth quarter of 2023, reaching $73.7 billion for the three months ended Dec. 31. For all of last year, $270 billion was spent on cloud infrastructure services. The dollars up for grabs explain why technology companies large and small are racing into the space and trying to gain market share in the hotly contested and super-competitive sector. Here are three strong-buy cloud computing stocks to add to your Q2 must-watch list.


Microsoft's (NASDAQ:MSFT) current growth is being driven not by artificial intelligence (AI) but by its cloud-computing segment. The Redmond, Washington-based technology giant most recently reported that its Intelligent Cloud segment posted $25.88 billion in quarterly revenue, up 20% from a year ago. Within the cloud segment, revenue from Azure and other cloud services grew 30%, ahead of the 27% expected on Wall Street.

In its earnings release, Microsoft noted that it now has 53,000 Azure AI customers, and one-third of them are new to Azure in the past year. The company added that the number of commitments it has received to spend more than $1 billion on its Azure cloud services in the year ahead increased during the last quarter of 2023. The company has also introduced new custom cloud-computing microchips.


Oracle (NYSE:ORCL) is also seeing strong growth driven by its cloud unit. Oracle's cloud services and license support segment, its largest business, saw sales rise 12% to $9.96 billion in Q4 2023. Growth in the cloud services unit offset declines in the company's other units, especially its hardware business, where revenue fell 7% year over year to $754 million. The strong cloud growth also led Oracle to issue upbeat forward guidance.

The company expects earnings of $1.62 to $1.66 a share for the current quarter, compared with analysts' estimates of $1.64 in earnings. The company said it anticipates revenue growth of 4% to 6% in the current quarter, ahead of analyst expectations. Management also said during an earnings call that continued growth of its cloud-computing unit should help the company reach $65 billion in sales by 2026.

ORCL stock has gained 40% in the last 12 months, including a 20% increase so far in 2024.


Amazon (NASDAQ:AMZN) remains the cloud-computing king with a 31% share of the global market for cloud infrastructure. That said, Microsoft is quickly gaining on Amazon and now holds a 24% share of the market. There have been some concerns about Amazon Web Services (AWS) maintaining its dominant position. In its most recent quarterly results, the company reported that AWS revenue totaled $24.20 billion, matching though not exceeding Wall Street's expectations.

The concerns might be a little premature. Many analysts expect Amazon's cloud unit to get a big boost in coming quarters from the adoption of AI technologies. Analysts at Monness, Crespi, Hardt recently reiterated a buy rating on AMZN stock with a $215 price target, implying 20% upside from current levels. The analysts said: "We believe the leading cloud service providers are well positioned to benefit from the early-stage ramp of generative AI projects, including AWS."

AMZN stock has gained 82% in the last 12 months, including a 19% increase so far this year.

On the date of publication, Joel Baglole held a long position in MSFT. The opinions expressed in this article are those of the writer, subject to the InvestorPlace.com Publishing Guidelines.

Joel Baglole has been a business journalist for 20 years. He spent five years as a staff reporter at The Wall Street Journal, and has also written for The Washington Post and Toronto Star newspapers, as well as financial websites such as The Motley Fool and Investopedia.


Hive: Distributed Cloud Computing Company Raises €12 Million – Pulse 2.0

SC Ventures (Standard Chartered's ventures arm) is leading a €12 million (USD $13 million) Series A funding round for distributed cloud provider Hive, which aims to increase businesses' and individuals' access to sustainable and high-powered computing resources. OneRagtime, a French venture capital fund that led Hive's seed round, and a collection of private investors joined the round.

Hive is reinventing the cloud, moving from a centralized model that relies on expensive physical servers to a distributed cloud infrastructure that aggregates the unused hard-drive space and computing capacity of individual devices.

Hive's model helps businesses efficiently manage their cloud-related expenses, reduce dependency on multiple cloud providers, and significantly reduce cloud energy use.

Since October 2023, Hive has had over 25,000 active users and contributors from 147 countries. These users store their files on hiveDisk and contribute unused hard drives to hiveNet to lower their subscription costs and effectively build the distributed cloud.

The computing capacity contributed to hiveNet also powers hiveCompute, allowing companies to manage workloads such as GenAI inference, video processing, and 3D modeling. hiveNet's architecture provides access to additional CPU, GPU, or NPU capacity when needed. Companies looking for more control can build a private hiveNet, where IT managers control the devices entirely.

In December, Hive unveiled a Joint Development Partner (JDP) initiative, working closely with key partners to innovate the cloud landscape for businesses leveraging GenAI LLM computations.

Hive is a champion of sustainable technological progress, offering a practical solution to the challenges posed by traditional cloud computing models. With its latest funding round, Hive is set on growing its team and global footprint to address the enterprise markets, starting with startups and SMBs. The team prioritizes several business areas, including product development, building an engaged community of contributing Hivers, and sales and marketing efforts to reach users at scale.

KEY QUOTES:

Hive is addressing the pressing need for a new cloud paradigm that democratizes access, lowers financial barriers, and encourages innovation. With over 70% of the computing power available in our devices and billions of devices connected to the Internet, Hive's community-driven model builds The Right Cloud to offer a greener, more resilient and secure alternative network that also promotes a more equitable cloud solution. We thank our investors, as well as INRIA and Bpifrance, for their continuous support as we look to achieve our ambitious goals.

David Gurlé, Hive Founder

We are big believers in Hive's distributed cloud technology, which will enable cheaper and more efficient access to computing power and storage, a critical point when most of our ventures may have an AI component requiring increasing amounts of such computing power. In addition to our investment, our ventures will be leveraging Hive's services.

Alex Manson, who heads SC Ventures

Cloud technology has opened up horizons of innovation, but it also comes with challenges in terms of costs, security, data privacy, and environmental impact, heightened by the increasing demand for computing resources, especially for artificial intelligence. Hive, with its pioneering approach to distributed cloud, makes cloud access more secure, affordable, and efficient for everyone, and enables the sharing of computational power resources. As an early investor and believer, OneRagtime is particularly excited to support Hive's vision and team.

Stéphanie Hospital, Founder & CEO at OneRagtime


Cloud computing trends – Enterprise License Optimization Blog

The thirteenth annual Flexera 2024 State of the Cloud Report (previously known as the RightScale State of the Cloud Report) highlights the latest cloud computing trends and statistics, including strategies, challenges and initiatives from a broad cross-section of industries and organizations. The report explores the thinking of 753 IT professionals and executive leaders surveyed in late Q4 2023 and highlights the year-over-year (YoY) changes to help identify trends. The respondents, global cloud decision-makers and users, revealed their experiences with cloud migration and cloud computing, and their thoughts about the public, private and multi-cloud market.

Select highlights of the report on cloud computing are included below.

Terminology used:

This marks the second year in a row that managing cloud spending is the top challenge facing organizations; as in previous years, a lack of resources/expertise follows close behind. More than a quarter of respondents spend over $12 million a year on cloud (29%), and nearly a quarter (22%) spend that much on SaaS.

Respondents saw a slight increase in multi-cloud usage, up from 87% last year to 89% this year.

Sixty-one percent of large enterprises use multi-cloud security, and 57% use multi-cloud FinOps (cost optimization) tools.

The top two multi-cloud implementations are apps siloed on different clouds and DR/failover between clouds. Apps siloed on different clouds increased the most (up to 57% from 44% YoY). Data integration between clouds increased to 45% from 37% YoY as organizations looked for the best fit for applications and data analysis.

Adoption grew for Amazon Web Services (AWS), Microsoft Azure and Google Cloud. Forty-nine percent of respondents reported using AWS for significant workloads, while 45% reported using Azure and 21% reported using Google Cloud Platform. In contrast, Oracle Cloud Infrastructure, IBM and Alibaba Cloud usage is substantially lower and relatively unchanged compared to the previous year.

SMBs remain the highest cloud adopters but fell off slightly from the previous year, with 61% of workloads (down from 67% last year) and 60% of data in the public cloud (unchanged YoY).

Nearly all platform-as-a-service (PaaS) offerings saw a gain in usage, with the most prominent being in the data warehouse (up to 65% from 56% YoY). Container-as-a-service (52%) and serverless (function-as-a-service) (48%) are both up nine percentage points this year. Machine learning/artificial intelligence (ML/AI) had a modest gain at 41%, up from 36% last year. However, ML/AI is the PaaS offering getting the most attention from companies experimenting (32%) or planning to use it (17%).

Forty-eight percent of respondents say they already have defined sustainability initiatives that include tracking the carbon footprint of cloud usage. When asked how sustainability compares to cost optimization, 59% prioritized cost optimization, though an additional 29% say that both cloud cost optimization and sustainability are equally prioritized.

The world has experienced extraordinary disruption in the past few years, and while organizations of all sizes are prioritizing every dollar of spend, the cloud and technology will weather economic storms. Enterprises that remain focused on digital transformation, seizing new opportunities and evolving strategic initiatives through a cost-conscious lens will be better positioned for success than their competitors.

Get the latest insights in cloud computing trends and cloud migration statistics by viewing the complete survey results here.


Private vs. public cloud security: Benefits and drawbacks – TechTarget

Regardless of whether an enterprise's infrastructure operates in a private, public or hybrid cloud, or across multiple clouds, cybersecurity is a critical component. Some cloud architectures greatly simplify security tasks and tool integrations, but that often comes at the cost of flexibility.

Let's look at the benefits and challenges organizations face as they compare private vs. public cloud security, as well as hybrid cloud security and multi-cloud security, in 2024 and beyond.

As its name implies, private clouds grant a business private access to dedicated infrastructure resources within a cloud. This infrastructure has both advantages and disadvantages.

Private clouds are attractive to organizations seeking more granular control over the underlying infrastructure. This commonly includes customer configuration access to the network, OSes and server virtualization platform.

From a security perspective, private cloud's advantages include the following:

The flexibility of private cloud comes at a cost in two areas: pricing and management.

For these two reasons, it is critically important that IT decision-makers carefully weigh the cybersecurity benefits of private clouds against the added financial expenses and management overhead.

Organizations can employ third-party cloud service providers (CSPs) to manage applications and data within their data center infrastructure. Many CSPs also provide built-in security tools to help protect business-critical data.

Businesses are attracted to public cloud infrastructures for a variety of reasons, including low Capex, service scalability and easing the management workload for in-house IT staff.

Public cloud model security benefits include the following:

Other businesses, especially larger ones with massive IT infrastructures, might find that public cloud security is not the right fit.

Potential public cloud security challenges include the following:

In hybrid cloud environments, some business applications and data reside in public clouds, while others are managed inside private clouds or private data centers.

With hybrid cloud, the whole might be greater than the sum of its parts. Security advantages of hybrid cloud infrastructure include the following:

Like with a private cloud, the flexibility of a hybrid cloud infrastructure has its downsides. For example, decisions about where applications and data reside are a significant responsibility and require deliberation.

Organizations should consider the following potential security disadvantages of the hybrid cloud model:

As the name suggests, a multi-cloud environment involves an organization using two or more cloud platforms or vendors. For example, an organization might use AWS for IaaS, Google App Engine for PaaS, and SaaS applications such as Microsoft 365 and Salesforce.

As in hybrid environments, multi-cloud deployments enable admins to put applications and data into the service with the most appropriate security levels. Similarly, they can adopt the most secure cloud offerings across CSPs.

Multi-cloud environments also offer the following security benefits:

Like hybrid cloud security challenges, multi-cloud environments require close management and consideration to decide where applications and data should reside. It can be difficult to apply a single security policy across multiple clouds, which can create security gaps. Using multiple clouds also requires security teams to know how to secure each cloud, as well as the best tools to use.

Multi-cloud deployments are also prone to the following security challenges:

With these challenges in mind, remember that many infrastructure security tools are now largely virtualized. This means the same security tools and policy configurations deployed within in-house data centers and across the corporate LAN can extend to private clouds to achieve hybrid or multi-cloud security parity. For many security departments, this greatly reduces security complexity from a uniformity point of view.

When it comes to cloud computing and cloud security, no single architecture is suitable for all businesses. IT architects must gauge the cybersecurity needs for all business applications and data sets. Once defined, the technology services can be categorized and earmarked for deployment in the public or private cloud -- whichever makes the most sense both from a cost and cybersecurity perspective.

Andrew Froehlich is founder of InfraMomentum, an enterprise IT research and analyst firm, and president of West Gate Networks, an IT consulting company. He has been involved in enterprise IT for more than 20 years.

Sharon Shea is executive editor of TechTarget Security.


Bitmovin to run Live Encoder on Akamai cloud – Televisual

Bitmovin, a leading video streaming software solutions provider, is launching its Live Encoder running on Akamai Cloud Computing.

By reducing data transfer out (DTO) fees, the combination of Bitmovin Live Encoding and Akamai can significantly lower operating costs.

Running Bitmovin Live Encoder on Akamai Cloud Computing is intended to help streaming services deliver better live viewing experiences across a host of use cases, including sports/eSports, news, online fitness, eLearning, religious services, large-scale events, corporate communications, and political campaigns. Bitmovin's Live Encoder also supports several ad monetization models, including 24/7 linear television channels and Free Ad-supported Television (FAST) channels.

"Bitmovin can help its live-streaming customers deliver higher-quality viewing experiences, and reduce and better control costs, by running Live Encoder on Akamai Cloud Computing," said Dan Lawrence, Vice President of Cloud Computing at Akamai (pictured). "Placing and executing Live Encoder's critical compute functions closer to end users can realize lower-latency streaming while maintaining the high quality of service that consumers have come to expect and demand from streaming providers. It can also help dramatically reduce DTO fees in many cases. Collectively, we believe this meets the industry's desire to continue raising the standards of live streaming, provide lower and more predictable operational costs, and create more opportunities to monetize content."

Bitmovin's Live Encoder has a user interface designed to make it easy for users of all levels to set up live streams quickly, while Bitmovin's API gives developers control over every aspect of the encoding pipeline. Live Encoder is pre-integrated with Akamai Media Services Live to support live-to-VOD and live clipping, which is part of Akamai Connected Cloud, to support secure and efficient streaming at massive scale across Akamai's global content delivery network (CDN).

Customers who run Bitmovin's Live Encoder on Akamai will also benefit from pre-integrated third-party solutions for their video streaming workflows, including Videon's LiveEdge contribution encoders; Grass Valley's Agile Media Processing Platform (AMPP) for live production; Zixi for secure transport and ingest; EZDRM for multi-DRM and content encryption; Yospace for server-side ad insertion (SSAI); and more.

"Our Live Encoder elevates live streaming, eliminating sub-par image and audio quality so audiences can enjoy truly immersive live experiences," said Stefan Lederer, CEO and co-founder of Bitmovin. "It's a huge honor to announce our Live Encoder is running on Akamai Cloud Computing, which will help organizations of every size accelerate the quality of their live streaming workflows and deliver world-class viewing experiences."

The Bitmovin Live Transcoder running on Akamai comes by way of Bitmovin joining the Akamai Qualified Computing Partner (QCP) Program. The program is designed to make solution-based services that are interoperable with Akamai Cloud Computing services easily accessible to Akamai customers. The services are provided by Akamai technology partners that complete a rigorous qualification process to ensure they are readily available to deploy and scale across the globally distributed Akamai Connected Cloud.

Bitmovin will demonstrate its Live Encoding on Akamai Cloud Computing at the 2024 NAB Show in Las Vegas, April 14-17 (Bitmovin exhibitor stand W3013, Akamai meeting space W235LMR).



Google DeepMind unveils ‘superhuman’ AI system that excels in fact-checking, saving costs and improving accuracy – VentureBeat


A new study from Google's DeepMind research unit has found that an artificial intelligence system can outperform human fact-checkers when evaluating the accuracy of information generated by large language models.

The paper, titled "Long-form factuality in large language models" and published on the pre-print server arXiv, introduces a method called Search-Augmented Factuality Evaluator (SAFE). SAFE uses a large language model to break down generated text into individual facts, and then uses Google Search results to determine the accuracy of each claim.

"SAFE utilizes an LLM to break down a long-form response into a set of individual facts and to evaluate the accuracy of each fact using a multi-step reasoning process comprising sending search queries to Google Search and determining whether a fact is supported by the search results," the authors explained.
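That description maps onto a simple loop. Below is a minimal, hypothetical sketch of the pipeline; the safe_evaluate name and the llm and google_search helpers are assumptions for illustration, not the authors' actual code or prompts.

```python
# Hypothetical sketch of the SAFE flow described above; `llm` and
# `google_search` stand in for a real language model and search client.
def safe_evaluate(response: str, llm, google_search) -> dict:
    # Step 1: an LLM splits the long-form response into individual facts.
    facts = llm(f"Split into individual facts, one per line:\n{response}").splitlines()

    verdicts = {}
    for fact in facts:
        # Step 2: multi-step reasoning - generate a search query, run it,
        # and let the LLM judge whether the results support the fact.
        query = llm(f"Write a Google Search query to verify: {fact}")
        evidence = google_search(query)
        answer = llm(
            f"Fact: {fact}\nSearch results: {evidence}\n"
            "Is the fact supported? Answer 'supported' or 'not supported'."
        )
        verdicts[fact] = answer.strip().lower() == "supported"
    return verdicts
```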

The researchers pitted SAFE against human annotators on a dataset of roughly 16,000 facts, finding that SAFE's assessments matched the human ratings 72% of the time. Even more notably, in a sample of 100 disagreements between SAFE and the human raters, SAFE's judgment was found to be correct in 76% of cases.


While the paper asserts that LLM agents "can achieve superhuman rating performance," some experts are questioning what "superhuman" really means here.

Gary Marcus, a well-known AI researcher and frequent critic of overhyped claims, suggested on Twitter that in this case "superhuman" may simply mean better than an underpaid crowd worker, rather than a true human fact-checker.

"That makes the characterization misleading," he said. "Like saying that 1985 chess software was superhuman."

Marcus raises a valid point. To truly demonstrate superhuman performance, SAFE would need to be benchmarked against expert human fact-checkers, not just crowdsourced workers. The specific details of the human raters, such as their qualifications, compensation, and fact-checking process, are crucial for properly contextualizing the results.

One clear advantage of SAFE is cost: the researchers found that using the AI system was about 20 times cheaper than human fact-checkers. As the volume of information generated by language models continues to explode, having an economical and scalable way to verify claims will be increasingly vital.

The DeepMind team used SAFE to evaluate the factual accuracy of 13 top language models across 4 families (Gemini, GPT, Claude, and PaLM-2) on a new benchmark called LongFact. Their results indicate that larger models generally produced fewer factual errors.

However, even the best-performing models generated a significant number of false claims. This underscores the risks of over-relying on language models that can fluently express inaccurate information. Automatic fact-checking tools like SAFE could play a key role in mitigating those risks.

While the SAFE code and LongFact dataset have been open-sourced on GitHub, allowing other researchers to scrutinize and build upon the work, more transparency is still needed around the human baselines used in the study. Understanding the specifics of the crowdworkers' background and process is essential for assessing SAFE's capabilities in proper context.

As the tech giants race to develop ever more powerful language models for applications ranging from search to virtual assistants, the ability to automatically fact-check the outputs of these systems could prove pivotal. Tools like SAFE represent an important step towards building a new layer of trust and accountability.

However, it's crucial that the development of such consequential technologies happens in the open, with input from a broad range of stakeholders beyond the walls of any one company. Rigorous, transparent benchmarking against human experts, not just crowdworkers, will be essential to measure true progress. Only then can we gauge the real-world impact of automated fact-checking on the fight against misinformation.



Pandas: From Messy To Beautiful. This is how to make your pandas code | by Anna Zawadzka | Mar, 2024 – Towards Data Science

Scripting around a pandas DataFrame can turn into an awkward pile of (not-so-)good old spaghetti code. My colleagues and I use this package a lot, and while we try to stick to good programming practices, like splitting code into modules and unit testing, sometimes we still get in the way of one another by producing confusing code.

I have gathered some tips and pitfalls to avoid in order to make pandas code clean and infallible. Hopefully you'll find them useful too. We'll get some help from Robert C. Martin's classic Clean Code, applied specifically to the context of the pandas package. TL;DR at the end.

Let's begin by observing some faulty patterns inspired by real life. Later on, we'll try to rephrase that code in order to favor readability and control.

Pandas DataFrames are value-mutable [2, 3] objects. Whenever you alter a mutable object, it affects the exact same instance that you originally created, and its physical location in memory remains unchanged. In contrast, when you modify an immutable object (e.g., a string), Python creates a whole new object at a new memory location and swaps the reference for the new one.

This is the crucial point: in Python, objects are passed to functions by assignment [4, 5]. See the graph: the value of df was assigned to the variable in_df when it was passed to the function as an argument. Both the original df and the in_df inside the function point to the same memory location (the numeric value in parentheses), even though they go by different variable names. During the modification of its attributes, the location of the mutable object remains unchanged. Now all other scopes can see the changes too, because they reach into the same memory location.
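A minimal sketch of that situation (the modify_df name is reused from the note further down; the column is made up for illustration):

```python
import pandas as pd

def modify_df(in_df: pd.DataFrame) -> pd.DataFrame:
    in_df["name_len"] = in_df["name"].str.len()  # mutates the caller's object
    return in_df

df = pd.DataFrame({"name": ["Bert", "Albert"]})
print(id(df))
df = modify_df(df)
print(id(df))                # same location: the "returned" df is the original
print(df.columns.tolist())   # ['name', 'name_len'] - the input was changed
```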

Actually, since we have modified the original instance, it's redundant to return the DataFrame and assign it to the variable. This code has the exact same effect:
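A sketch of that equivalent, return-free version (same assumed names as above):

```python
import pandas as pd

def modify_df(in_df: pd.DataFrame) -> None:
    in_df["name_len"] = in_df["name"].str.len()  # the in-place change is enough

df = pd.DataFrame({"name": ["Bert", "Albert"]})
modify_df(df)  # no assignment needed: df already carries the new column
```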

Heads-up: the function now returns None, so be careful not to overwrite the df with None if you do perform the assignment: df = modify_df(df).

In contrast, if the object is immutable, it will change its memory location throughout the modification, just like in the example below. Since the red string cannot be modified (strings are immutable), the green string is created on top of the old one, but as a brand-new object, claiming a new location in memory. The returned string is not the same string, whereas the returned DataFrame was the exact same DataFrame.
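An analogous sketch with a string (the names here are made up):

```python
def modify_string(s: str) -> str:
    s = s + "!"   # builds a brand-new string object at a new memory location
    return s

text = "pandas"
print(id(text))
text = modify_string(text)
print(id(text))   # different location: the returned string is a new object
```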

The point is, mutating DataFrames inside functions has a global effect. If you don't keep that in mind, you may lose control over what gets appended to or removed from your DataFrame, and where.

We'll fix that problem later, but here is another don't before we move on to the do's.

The design from the previous section is actually an anti-pattern called output argument [1 p.45]. Typically, the inputs of a function are used to create an output value. If the sole point of passing an argument to a function is to modify it, so that the input argument changes state, it challenges our intuitions. Such behavior is called a side effect [1 p.44] of a function; side effects should be well documented and minimized, because they force the programmer to remember what happens in the background, making the script error-prone.

"When we read a function, we are used to the idea of information going in to the function through arguments and out through the return value. We don't usually expect information to be going out through the arguments." [1 p.41]

Things get even worse if the function has a double responsibility: to modify the input and to return an output. Consider this function:
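A sketch of such a double-duty function (the function and column names are assumptions consistent with the rest of the article):

```python
import pandas as pd

def find_max_name_length(df: pd.DataFrame) -> int:
    df["name_len"] = df["name"].str.len()  # hidden side effect on the input
    return df["name_len"].max()            # plus the return value you expect
```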

It does return a value as you would expect, but it also permanently modifies the original DataFrame. The side effect takes you by surprise: nothing in the function signature indicated that our input data was going to be affected. In the next step, we'll see how to avoid this kind of design.

To eliminate the side effect, in the code below we have created a new temporary variable instead of modifying the original DataFrame. The notation lengths: pd.Series indicates the datatype of the variable.
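A version along those lines, using the lengths variable from the description (the function name is assumed):

```python
import pandas as pd

def find_max_name_length(df: pd.DataFrame) -> int:
    lengths: pd.Series = df["name"].str.len()  # temporary variable, no mutation
    return lengths.max()
```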

This function design is better in that it encapsulates the intermediate state instead of producing a side effect.

Another heads-up: please be mindful of the difference between a deep and a shallow copy [6] of elements from the DataFrame. In the example above we computed a new value from each element of the original df["name"] Series, so the old DataFrame and the new variable share no elements. However, if you directly assign one of the original columns to a new variable, the underlying elements still share the same references in memory. See the examples:
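A sketch of both cases; note that the shallow-copy behavior depends on your pandas version and its copy-on-write settings:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Bert", "Albert"]})

shallow = df["name"]       # direct assignment: may share memory with df
deep = df["name"].copy()   # deep copy: guaranteed independent data

# With classic (pre-copy-on-write) pandas, mutating `shallow` can leak
# into `df`; mutating the deep copy never will:
deep[0] = "Ada"
print(df.loc[0, "name"])   # still 'Bert' - the deep copy is independent
```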

You can print out the DataFrame after each step to observe the effect. Remember that creating a deep copy allocates new memory, so it's good to consider whether your script needs to be memory-efficient.

Maybe for whatever reason you want to store the result of that length computation. It's still not a good idea to append it to the DataFrame inside the function, both because of the side-effect breach and because of the accumulation of multiple responsibilities inside a single function.

I like the One Level of Abstraction per Function rule that says:

"We need to make sure that the statements within our function are all at the same level of abstraction.

Mixing levels of abstraction within a function is always confusing. Readers may not be able to tell whether a particular expression is an essential concept or a detail." [1 p.36]

Also, let's employ the Single Responsibility Principle [1 p.138] from OOP, even though we're not focusing on object-oriented code right now.

Why not prepare your data beforehand? Let's split data preparation and the actual computation into separate functions:
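A possible split; create_name_len_column is an assumed name, while find_max_element is the name used later in the article:

```python
from typing import Collection

import pandas as pd

def create_name_len_column(df: pd.DataFrame) -> pd.DataFrame:
    # Data preparation: returns a new DataFrame, leaving the source untouched.
    return df.assign(name_len=df["name"].str.len())

def find_max_element(collection: Collection) -> int:
    # Aggregation: generic over any Collection, not tied to pandas.
    return max(collection)

df = pd.DataFrame({"name": ["Bert", "Albert"]})
df_with_lengths = create_name_len_column(df)
max_name_len = find_max_element(df_with_lengths["name_len"])
```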

The individual task of creating the name_len column has been outsourced to another function. It does not modify the original DataFrame and it performs one task at a time. Later we retrieve the max element by passing the new column to another dedicated function. Notice how the aggregating function is generic for Collections.

Let's brush the code up with the following steps:

The way we have split the code really makes it easy to go back to the script later, take the entire function and reuse it in another script. We like that!

There is one more thing we can do to increase the level of reusability: pass column names as parameters to functions. The refactoring goes a little over the top, but sometimes it pays off for the sake of flexibility or reusability.
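A parameterized sketch (the function and parameter names are assumptions):

```python
import pandas as pd

def create_length_column(
    df: pd.DataFrame, source_col: str, target_col: str
) -> pd.DataFrame:
    # Column names arrive as parameters, so the function works for any
    # string column, not just "name".
    return df.assign(**{target_col: df[source_col].str.len()})

df = pd.DataFrame({"name": ["Bert", "Albert"]})
df = create_length_column(df, source_col="name", target_col="name_len")
```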

Did you ever figure out that your preprocessing was faulty after weeks of experiments on the preprocessed dataset? No? Lucky you. I actually had to repeat a batch of experiments because of broken annotations, which could have been avoided if I had tested just a couple of basic functions.

Important scripts should be tested [1 p.121, 7]. Even if the script is just a helper, I now try to test at least the crucial, most low-level functions. Let's revisit the steps that we took from the start:

1. I am not happy to even think of testing this; it's very redundant, and we have paved over the side effect. It also tests a bunch of different features: the computation of name length and the aggregation of the result for the max element. Plus, it fails. Did you see that coming?

2. This is much better: we have focused on one single task, so the test is simpler. We also don't have to fixate on column names like we did before. However, I think that the format of the data gets in the way of verifying the correctness of the computation.

3. Here we have cleaned up the desk. We test the computation function inside out, leaving the pandas overlay behind. It's easier to come up with edge cases when you focus on one thing at a time. I figured out that I'd like to test for None values that may appear in the DataFrame, and I eventually had to improve my function for that test to pass. A bug caught!

4. We're only missing the test for find_max_element:
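A minimal pytest-style sketch for it, assuming find_max_element is defined as above:

```python
def test_find_max_element():
    assert find_max_element([3, 1, 4, 1, 5]) == 5

def test_find_max_element_accepts_any_collection():
    assert find_max_element((7,)) == 7
```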

One additional benefit of unit testing that I never forget to mention is that it is a way of documenting your code: someone who doesn't know it (like you from the future) can easily figure out the inputs and expected outputs, including edge cases, just by looking at the tests. Double gain!

These are some tricks I found useful while coding and reviewing other people's code. I'm far from telling you that one or another way of coding is the only correct one: you take what you want from it, and you decide whether you need a quick scratch or a highly polished and tested codebase. I hope this thought piece helps you structure your scripts so that you're happier with them and more confident about their infallibility.

If you liked this article, I would love to know about it. Happy coding!

TL;DR

There's no one and only correct way of coding, but here are some inspirations for scripting with pandas:

Don'ts:

- don't mutate your DataFrame too much inside functions, because you may lose control over what gets appended to or removed from it, and where,

- don't write methods that mutate a DataFrame and return nothing, because that's confusing.

Dos:

- create new objects instead of modifying the source DataFrame and remember to make a deep copy when needed,

- perform only similar-level operations inside a single function,

- design functions for flexibility and reusability,

- test your functions, because this helps you design cleaner code, secure it against bugs and edge cases, and document it for free.

The graphs were created by me using Miro. The cover image was also created by me using the Titanic dataset and GIMP (smudge effect).


How the New Breed of LLMs is Replacing OpenAI and the Likes – DataScienceCentral.com – Data Science Central

Of course, OpenAI, Mistral, Claude and the likes may adapt. But will they manage to stay competitive in this evolving market? Last week, Databricks launched DBRX. It clearly shows the new trend: specialization, lightweight design, combining multiple LLMs, an enterprise orientation, and better results at a fraction of the cost. Monolithic solutions where you pay by the token encourage the proliferation of models with billions or trillions of tokens, weights and parameters. They are embraced by companies such as Nvidia because they use a lot of GPU capacity and make chip producers wealthy. One of the drawbacks is the cost incurred by the customer, with no guarantee of positive ROI. Quality may also suffer (hallucinations).

In this article, I discuss the new type of architecture under development. Hallucination-free, these systems achieve better results at a fraction of the cost and run much faster, sometimes without GPUs, sometimes without training. Targeting professional users rather than the layman, they rely on self-tuning and customization. Indeed, there is no universal evaluation metric: laymen and experts have very different ratings and expectations when using these tools.

Much of this discussion is based on the technology that I develop for a Fortune 100 company. I show the benefits, but also potential issues. Many of my competitors are moving in the same direction.

Before diving into the architecture of new LLMs, let's first discuss the current funding model. Many startups get funding from large companies such as Microsoft, Nvidia or Amazon, which means they have to use those companies' cloud solutions, services and products. The result is high costs for the customer. Startups that rely on vendor-neutral VC funding face a similar challenge: you cannot raise VC money by saying that you could do better and charge 1,000x less. VC firms expect to make billions of dollars, not mere millions. To maintain this ecosystem, players spend a lot of money on advertising and hype. In the end, if early investors can quickly make big money through acquisitions, it is a win; what happens when clients realize ROI is negative is unimportant, as long as it does not happen too soon. But can investors even achieve this short-term goal?

The problem is compounded by the fact that researchers believe deep neural networks (DNNs) are the panacea, with issues simply fixed by using bigger data, multiple transforms to make DNNs work, or front-end patches such as prompt engineering to address foundational back-end problems. Sadly, no one works on ground-breaking innovations outside DNNs. I am an exception.

In the end, very few self-funded entrepreneurs can compete by offering a far less expensive alternative with no plan to become a billionaire. I may be the only one able to survive and thrive long-term. My intellectual property is open-source, patent-free, and comes with extensive documentation, source code, and comparisons. It appeals to large, traditional corporations. The word is out; it is no longer a secret. In turn, it puts pressure on big players to offer better LLMs. They can see how I do it and implement the same algorithms on their end, or come up with their own solutions independently. Either way, the new type of architecture is pretty much the same in all cases, not much different from mine. The new Databricks LLM (DBRX) epitomizes this trend. Mine is called xLLM.

Surprisingly, none of the startups working on new LLMs considers monetizing its product via advertising: blending organic output with sponsored results relevant to the user prompt. I am contemplating doing it, with a large client interested in signing up when the option is available.

As concisely stated by one of my clients, the main issues to address are:

In addition to blending specialized LLMs (one per top category, each with its own set of embeddings and other summary tables), a new trend is emerging. It consists of blending multiple LLMs focused on the same topic, each with its own flavor: technical, general, or based on different parameters. These models are then combined, just as XGBoost combines multiple small decision trees to get the best from all of them. In short, an ensemble method.

Note that speed and accuracy result from using many small, specialized tables (embeddings and so on) as opposed to one big table with long, fixed-size embedding vectors and expensive semantic/vector search. The user selects the categories that best match his prompt. In my case, there is no neural network involved and no GPU needed, yet there is no latency and there are no hallucinations. Liability is further reduced with a local implementation and explainable AI.
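To illustrate the idea of many small per-category tables with user-selected categories, here is a toy sketch; the table contents and scoring scheme are invented for illustration, and xLLM's actual implementation may differ:

```python
from collections import defaultdict

# Toy stand-in for per-category summary tables: one small keyword table
# per top category instead of one monolithic embedding index.
tables = {
    "statistics": {"variance": 0.9, "regression": 0.8},
    "nlp": {"token": 0.7, "embedding": 0.95},
}

def retrieve(prompt_keywords: list[str], categories: list[str]) -> dict[str, float]:
    # The user picks the categories matching the prompt, so lookups touch
    # only a few small tables - no GPU, no full vector search.
    scores: dict[str, float] = defaultdict(float)
    for category in categories:
        table = tables.get(category, {})
        for keyword in prompt_keywords:
            if keyword in table:
                scores[keyword] += table[keyword]
    return dict(scores)

print(retrieve(["variance", "embedding"], categories=["statistics"]))
# {'variance': 0.9}
```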

Carefully selecting input sources (in many cases, corporate repositories augmented with external data) and smart crawling to reconstruct the hidden structure (underlying taxonomy, breadcrumbs, navigation links, headings, and so on), are critical components of this architecture.

For details about xLLM (technical implementation, comparing output with OpenAI and the likes on the same prompts, Python code, input sources, and documentation), see here. I also offer a free course on the topic, here.

Vincent Granville is a pioneering GenAI scientist and machine learning expert, co-founder of Data Science Central (acquired by a publicly traded company in 2020), Chief AI Scientist at MLTechniques.com and GenAItechLab.com, former VC-funded executive, author (Elsevier) and patent owner, with one patent related to LLM. Vincent's past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. Follow Vincent on LinkedIn.
