Category Archives: Cloud Servers
Ampere Computing 2024 Roadmap Update: 256 Core 3nm CPU In 2025 – Phoronix
Ampere Computing today made public their roadmap update concerning current and future AArch64 server processors. AmpereOne availability remains tough but the company is hoping next year to introduce a 3nm CPU with up to 256 cores and supporting 12 channel DDR5 memory.
Ampere's first update to share was that they are collaborating with Qualcomm to pair Ampere CPUs with Qualcomm Cloud AI 100 Ultra accelerators. Ampere Computing CPUs will be paired with Qualcomm AI accelerators in Supermicro servers as a new option for AI inference servers. (As a reminder, the Qualcomm AI accelerator hardware does have an open-source Linux kernel driver: QAIC.) This announcement came as a bit of a surprise as Ampere Computing has been promoting "GPU-Free AI Inferencing" and a lot around just CPU-based inferencing... though the Qualcomm accelerators are still technically GPU-free. Initially these LLM AI servers will use Ampere Altra processors, with AmpereOne processor options to follow in the coming "months".
And then the most exciting roadmap update: Ampere Computing is planning for a 3nm AmpereOne CPU with up to 256 cores and 12 channels of DDR5 memory support in 2025. Ampere Computing says that the next iteration of AmpereOne CPUs is "ready at fab". When inquiring about the 12 channel DDR5 memory, it's anticipated to be at DDR5-5200 speeds or higher. Twelve channels of DDR5 memory support matches that of AMD EPYC Bergamo/Genoa as well as upcoming Intel Xeon 6 processors.
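For a rough sense of what twelve channels at DDR5-5200 would mean, here is a back-of-the-envelope sketch of theoretical peak memory bandwidth. It assumes a 64-bit data path per channel (ECC bits excluded), and the function name is purely illustrative; sustained real-world bandwidth will be lower than this ceiling.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int,
                       bus_width_bits: int = 64) -> float:
    """Theoretical peak bandwidth in GB/s (decimal gigabytes).

    Assumption: each channel moves bus_width_bits of data per transfer.
    """
    bytes_per_transfer = bus_width_bits / 8
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# 12 channels of DDR5-5200, the configuration mentioned for the 2025 part
print(f"{peak_bandwidth_gbs(12, 5200):.1f} GB/s")  # prints "499.2 GB/s"
```

Faster DIMMs scale the figure linearly, so "DDR5-5200 speeds or higher" would push the ceiling beyond roughly 500 GB/s per socket.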
Being curious if the 256 core AmpereOne was simply a die shrink and core count increase, I brought up the matter of Ampere-1B... Several months back I spotted Ampere1B appearing in the LLVM Clang compiler and also the GCC compiler. When bringing up Ampere-1B in the context of the new AmpereOne CPUs in 2025, I was able to get it confirmed that it indeed will be the "1B" variant. So with that come some ISA additions, with Ampere-1B having Armv8.7+ with the FEAT_WFxT, FEAT_CSSC, FEAT_PAN3 and FEAT_AFP extensions plus 1A additions like the Memory Tagging Extension (MTE) and SM3/SM4 cryptography.
While Ampere is teasing a 256-core AmpereOne processor and indicating that AmpereOne is "generally available", that leads to the elephant in the room: where the hell is AmpereOne? It seems next to impossible to find. Ampere Computing has been talking about AmpereOne since 2022. Last May it was officially "announced" with up to 192 cores, and last September AmpereOne was announced for Oracle Cloud with limited availability in Q4'2023, yet it remains next to impossible to actually find in the middle of 2024.
There have also been no independent reviews/benchmarks of AmpereOne, and Ampere Computing is all too happy to keep promoting Ampere Altra on social media channels, in new blog posts, and with these roadmap updates. Ampere Altra has held up well, but mind you it was announced in 2020 and has been benchmarked since late 2020... Not 1~2 years later like we are now in the AmpereOne cycle. We are now at the middle of 2024 with AmpereOne supposedly "generally available", yet it seems next to impossible to find and we are still awaiting any independent AmpereOne benchmarking analysis or even access to the Oracle Cloud AmpereOne instances... Everyone I know who is independently interested in AmpereOne and trying to acquire the hardware has yet to succeed. Simply put, it's been frustrating and rather ghostly, all while Intel Xeon 6 is imminent with Sierra Forest, followed by Granite Rapids, bringing higher core counts there with 144 cores per socket and even a 288 core variant of Xeon SRF. AMD EPYC 9754 Bergamo has been available since last year with 128 cores / 256 threads per socket, Zen 4C has boasted great power efficiency improvements, and there is also AMD EPYC Zen 5 on the horizon.
When inquiring about AmpereOne availability during the advanced briefing, I was told they've been busy ramping up with their large customers and that over the months ahead they will be expanding to their medium and smaller customers. Today's press release goes on to note that "new AmpereOne OEM and ODM platforms would be shipping within a few months." The explanation also seems a bit odd considering the continued lack of GA access to AmpereOne at any of the large cloud providers. We'll see when the availability story pans out and if/when we see any AmpereOne hardware for independent reviews/benchmarking. The battle only becomes tougher if broad availability of AmpereOne drags out past the launch of Intel Xeon 6 Sierra Forest / Granite Rapids and AMD EPYC Zen 5 hardware. AmpereOne was originally announced before the 128-core Bergamo even shipped, while there are rumors that Bergamo's successor could be pushing 192 cores per socket this year along with further improvements to x86_64 power efficiency.
Last year's roadmap update also signaled the AmpereOne family is for 136 to 192 cores (or now, 256 cores in 2025) while Ampere Altra is for 128 cores and less. Sadly this year's update didn't shed any light on any AmpereOne expansion for 128 cores or less or any sort of refresh for those pursuing any smaller AArch64 core counts while wanting a modern design. So it's not clear if Ampere is just pursuing the larger (cloud) servers or if in time they will release any new products for 128 cores and less. There certainly is interest and a need for AArch64 workstation type systems while for now Ampere Altra is working out for the lack of any viable competition there for socketed AArch64 desktops/workstations. It will be interesting to see what Intel Sierra Forest offers at lower core counts for pure E core efficiency and similarly what AMD is already delivering with the EPYC 8004 Siena processors for great power efficiency and scalability.
Kinsing malware exploits Apache Tomcat on Linux clouds – SecurityBrief Australia
Tenable's Cloud Security Research team has unearthed a series of attacks by the Kinsing malware family, particularly targeting Linux-based cloud infrastructures. In a new development, these malicious programmes are now exploiting Apache Tomcat servers, adopting new advanced stealth techniques for file system penetration and persistence.
Kinsing, a malware family operational for numerous years, primarily attacks Linux-based cloud infrastructure. Known for exploiting a range of vulnerabilities to gain unauthorised access, the hostile actors behind the Kinsing malware frequently install backdoors and illicitly deploy cryptocurrency miners on compromised systems. Once the infection has taken hold, Kinsing co-opts system resources, employing these for cryptomining. This redirection of system resources inhibits server performance and increases operational costs.
The new information disclosed by Tenable today adds another level of complexity to these malicious endeavours exploiting Apache Tomcat servers while adopting fresh tactics for evasion on the file system. A noteworthy aspect of these new methods is Kinsing's use of seemingly innocent and non-suspicious file locations to maintain its presence on the system.
Speaking on this security concern, Ari Eitan, Manager - Research at Tenable, highlighted the growing trend of cloud cryptomining in recent times. This has been largely facilitated by the scalability and flexibility of cloud platforms. Eitan posits that, "Unlike traditional on-premises infrastructure, cloud infrastructure allows attackers to quickly deploy resources for cryptomining, making it easier to exploit." The research team previously discovered multiple servers infected with Kinsing in a single environment, including an Apache Tomcat server with critical vulnerabilities.
Thus, the emergence of the Kinsing malware and its evolution to exploit Apache Tomcat servers with new advanced stealth techniques adds an insidious threat to Linux-based cloud infrastructures. These developments signify how malicious actors are continually devising new strategies to exploit system vulnerabilities for their gain. As Ari Eitan underscores, the extensive capability of cloud infrastructure that allows swift deployment of resources for cryptomining can now equally be exploited by threat actors with relative ease.
This disclosure underscores the exponentially growing cybersecurity threat landscape. As malicious actors become more innovative in their tactics, robust and up-to-date security measures are of the utmost importance. Therefore, the essential role of cyber defense teams such as Tenable's Cloud Security Research team becomes increasingly vital as they enhance their efforts to identify, expose and mitigate such threats.
AWS to launch European ‘sovereign cloud’ in Germany by 2025, earmarks €7.8B – TNW
Amazon Web Services (AWS) today confirmed plans to launch its European sovereign cloud, aiming to enhance data residency and security across the EU.
The German state of Brandenburg will host the first region of cloud servers, which are set to power up by the end of 2025. AWS will invest €7.8bn through 2040.
According to the tech giant, the European sovereign cloud will have its entire infrastructure within the EU and will operate independently from existing cloud regions. Only EU-resident and bloc-based AWS employees will have access to the system.
The service is especially designed for public sector customers and private companies operating in highly-regulated industries, with increased requirements for data residency and autonomy.
As such, it enables cloud users to keep all data and their metadata within the EU.
AWS expects its investment to contribute €17.2bn to Germany's GDP by 2040, supporting an average of 2,800 full-time jobs per year within the data centre supply chain.
Although it initially resisted the concept of a sovereign cloud, AWS first pledged to work on the service in 2022, amid mounting regulatory pressure and increasing competition.
Alongside Google and Microsoft, AWS's cloud services dominate the European market. But growing EU concerns over data handling and storage by overseas companies have resulted in a revamped data strategy and intensified the push towards digital sovereignty.
For their part, Google and Microsoft have already launched their own sovereign cloud solutions.
Apple to employ AI cloud servers using its own processors – Macworld
A new Bloomberg report details a project code-named ACDC, for Apple Chips in Data Centers, in which Apple will use its own silicon to provide cloud AI services. Apple is going big on AI with iOS 18 and macOS 15, and while on-device processing will be a big differentiator for the company, more advanced tasks will require the resources of big server infrastructure.
According to the report, the plan to use Apple's own chips for cloud infrastructure began three years ago but has been accelerated due to the need to quickly bring advanced AI features to market. The first AI server chips will be M2 Ultra processors, it says, but there are already plans to upgrade to chips based on the M4 series in the future.
Apple is expected to perform relatively simple generative AI tasks (like summarizing your missed text messages) on-device, especially those that use your private data which Apple will surely want to ensure stays on your iPhone. The cloud would be used for more intensive gen-AI tasks like image generation or composing lengthier emails. According to the report, the upgraded version of the Siri voice assistant would use cloud processing as well, though we expect simple answers and tasks that use the information contained on your iPhone to still be processed and executed entirely on-device as they are now.
The company is expected to use its own data centers at first, but just as it does with iCloud, will augment that with third-party data centers from Google or other partners.
We'll hear more about Apple's AI plans and products in about one month at WWDC.
Chinese server CPU beats Microsoft, Google and AWS rivals to grab performance crown Alibaba’s Yitian 710 is … – TechRadar
Alibaba Cloud's Yitian 710 processor is the most efficient Arm-based server processor for database tasks in hyperscale cloud environments around today, new research has claimed.
A recent study published in the Transactions on Cloud Computing journal by IEEE found the 128-core processor, developed in 2021, not only trumps rival Arm-based chips, but is reported to run circles around Intel's Xeon Platinum (Sapphire Rapids) processor when it comes to specific database tasks in the cloud.
This finding comes from a research paper titled "Are Arm Cloud Servers Ready for Database Workloads? An Experimental Study," produced by research assistant professor Dumitrel Loghin from the School of Computing at the National University of Singapore. The study, conducted across eight cloud servers, tested performance of five Arm-powered server CPUs, including the Yitian 710, and pitted them against Intel's x86 Xeon Platinum 8488C processor (launched in 2023).
Key players such as AWS, Microsoft Azure, and Google Cloud Platform are no strangers to 64-bit Arm CPUs, having introduced their own versions of virtual machines running on these servers. AWS's Graviton2 and Graviton3, Alibaba's Yitian 710, Huawei's Kunpeng 920, and the Ampere Altra CPUs used by Azure and GCP were all included in the analysis.
Alibaba's Yitian 710 was ahead of its rivals in synthetic Dhrystone and Whetstone benchmarks, and the study concluded that it and AWS's Graviton3 are genuine rivals to Intel's Xeon CPUs, showcasing equal or even superior results for in-memory workloads. That said, for Online Analytical Processing (OLAP), Machine Learning inference, and blockchain tasks, Arm-based servers struggled to match Xeon. The lag was chalked up mainly to unoptimized software, lower clock frequency, and subpar core-level performance.
You can view the full set of benchmarks in The Register's report, which also notes that the Yitian 710 has some inherent advantages: it uses a newer version of the Arm ISA, and speedy DDR5 RAM that some rival CPUs can't utilize.
The report concludes that while ARM servers spend 2X more time in Linux kernel system calls compared to Xeon servers they show great potential. Given their lower cloud computing price, ARM servers could be the ideal choice when the performance is not critical.
Apple plans to use M2 Ultra chips in the cloud for AI – The Verge
Apple plans to start its foray into generative AI by offloading complex queries to M2 Ultra chips running in data centers before moving to its more advanced M4 chips.
Bloomberg reports that Apple plans to put its M2 Ultra on cloud servers to run more complex AI queries, while simple tasks are processed on devices. The Wall Street Journal previously reported that Apple wanted to make custom chips to bring to data centers to ensure security and privacy in a project the publication says is called Project ACDC, or Apple Chips in Data Center. But the company now believes its existing processors already have sufficient security and privacy components.
The chips will be deployed to Apple's data centers and eventually to servers run by third parties. Apple runs its own servers across the United States and has been working on a new center in Waukee, Iowa, which it first announced in 2017.
While Apple has not moved as fast on generative AI as competitors like Google, Meta, and Microsoft, the company has been putting out research on the technology. In December, Apple's machine learning research team released MLX, a machine learning framework that can make AI models run efficiently on Apple silicon. The company has also been releasing other research around AI models that hint at what AI could look like on its devices and how existing products, like Siri, may get an upgrade.
Apple put a big emphasis on AI performance in its announcement of the new M4 chip, saying its new neural engine is "an outrageously powerful chip for AI."
For complex iPhone AI tasks, Apple will use cloud-based servers running M-series chips – PhoneArena
Apple is planning on having more complex AI tasks for iPhones, iPads, and Macs get sent through the cloud to data centers using servers powered by Apple's powerful in-house chips. Less complicated AI tasks will be handled directly on-device which will make them faster and more secure. According to a report in Bloomberg written by the news agency's chief Apple correspondent Mark Gurman, the first chips to be used to power the servers in the data centers will be the M2 Ultra. That chip is currently used to run the Mac Pro and Mac Studio. The scuttlebutt calls for Apple to eventually develop an M4 Ultra chip to power the servers in the data centers. Apparently Apple had come up with a plan to use its own chips and cloud-based servers to run complex AI tasks three years ago but decided to accelerate the timeline once OpenAI kicked off the latest AI craze with the ChatGPT chatbot. In December 2022, when ChatGPT first started to become known to the public, Gmail developer Paul Buchheit said that AI will do to internet search what Google did to the Yellow Pages. Namely, make the older technology obsolete.
Apple will reportedly use the M2 Ultra chip to power the first servers used in the data centers
If you're like me, you can't wait to see how Siri is affected by Apple's AI initiative. The virtual digital assistant, originally launched with the iPhone 4s in 2011, soon found itself not as useful as Google Assistant, with too many responses consisting of excerpts from three websites. Hopefully the use of AI will help Siri deliver more precise responses to queries.
IBM’s New Power Server Extends AI Workloads from Core to Cloud to Edge – HPCwire
May 7, 2024: IBM announced today the expansion of its portfolio of servers with the introduction of IBM Power S1012. This 1-socket, half-wide Power10 processor-based system delivers up to 3X more performance per core versus Power S812, extending AI workloads from core to cloud to edge for added business value across industries. In this recent blog post, Steve Sibley, vice president of IBM Power Product Management, takes a look at the new offering.
As more organizations embrace the promise of artificial intelligence to further drive business value, we are seeing clients in industries such as retail, manufacturing, healthcare, and transportation deploy workloads at the edge to capitalize on data where it originates.
In their March 2024 Market Guide for Edge Computing, Gartner noted, "By placing data, data management capabilities and analytic workloads at optimal points, ranging all the way to endpoint devices, enterprises can enable more real-time use cases. In addition, the flexibility to move data management workloads up and down the continuum from centralized data centers or from the cloud-to-edge devices will enable greater optimization of resources."
To aid in that effort, today IBM announced the expansion of its portfolio of servers with the introduction of IBM Power S1012. This 1-socket, half-wide Power10 processor-based system delivers up to 3X more performance per core versus Power S812. It is available in a 2U rack-mounted or tower deskside form factor, is optimized for edge-level computing, and also delivers the lowest entry price point in the Power portfolio to run core workloads for small and medium-sized organizations. IBM Power S1012 provides clients the flexibility to run AI inferencing workloads in remote office and back office (ROBO) locations outside mainstream datacenter facilities, and in direct connection to cloud services such as IBM Power Virtual Server for backup and disaster recovery.
Achieving More in Less Space at the Edge
IBM Power S1012 is designed to enhance remote management capabilities for clients looking to expand applications such as AI inferencing from core to cloud and at the edge. Edge computing can also provide a competitive advantage with real-time insights across industries, with examples that include analyzing customer behavior in retail; monitoring and optimizing production processes in manufacturing; and many others.
Joint clients of IBM and enhanced analytics ecosystem partner Equitus use IBM Power to run AI models at the edge to provide object classification for defense purposes. Equitus Corp. needed mission-critical hardware platforms for deep edge, forward operations, air-gapped, and traditional cloud environments. "We found that IBM Power10 and its Matrix Math Accelerator (MMA) delivered the best tech for inferencing on the edge as easily as in the data center," said Matt Niessen, President, Equitus Federal Corp. "Today, clients can deploy our Equitus Video Sentinel (EVS) and Knowledge Graph Neural Network (KGNN) AI systems on IBM Power10 servers and Red Hat OpenShift Container Platform for many use cases, including the most crucial ones like helping protect national security." IBM Power S1012 will provide the latest capabilities to support AI inferencing where the data itself is generated.
IBM Power S1012 is engineered to:
Support and Availability
Maintaining high availability throughout the life of systems like IBM Power S1012 is critical. IBM Power Expert Care offers a way of attaching services and support through a tiered approach right away. Clients can receive an optimum level of support for the mission-critical requirements of their IT infrastructure with options ranging from 3 to 5 years of coverage depending on the support tier. Additionally, there are optional committed service levels available, depending on client needs, which can provide further customization and support.
IBM Power S1012 will be generally available from IBM and certified Business Partners on June 14, 2024.
Source: Steve Sibley, Vice President IBM Power Product Management
Apple to Utilize Cloud Servers for iOS 18 AI Features – NextPit International
iOS 18 is widely rumored to add a slew of new generative AI features to iPhones and iPads. While it was originally believed these would be processed locally on the device, it is now shaping up that Apple could actually use cloud servers to power many of the features. And the company could even tap its good ol' chipset to make this happen.
Citing an internal source, Mark Gurman says in his latest Power On newsletter that Apple is building new AI servers to power the generative AI features on iPhones and iPads. Gurman adds that the servers will use Apple's custom M2 Ultra chipset, which debuted in Mac desktops last year, instead of non-Apple chipsets.
According to the journalist's source, this is part of Apple's new project called ACDC, or Apple Chips in Data Centers. It was highlighted that choosing the M2 Ultra chip will give the advantage of increased security and privacy compared with third-party hardware.
However, it wasn't ruled out that the Cupertino company will use third-party cloud servers to handle less crucial tasks when it expands in the future. Plus, it was noted that Apple will put up AI servers fitted with its more powerful M4 chips in the future.
Gurman says that only the most advanced AI tasks will be done in the cloud, which could cover features like image and complex text generation. Meanwhile, basic tasks such as live translation will be processed on the iPhone using the Apple mobile chipset these devices will run on.
It appears Apple is following Samsung and Google's approach to infusing next-level AI into smartphones. For instance, the Google Pixel 8 Pro's Video Boost feature uses Google's cloud servers to enhance the quality of videos due to the intensive processing required. On the Samsung Galaxy S24 range, by contrast, only some of the Galaxy AI features, those that need an internet connection, are said to use the cloud.
It's also unclear how Siri will be utilized with all these new AI features coming out. But earlier reports suggested that Apple will upgrade the assistant by incorporating its Ajax LLM (Large Language Model). So, it's possible that Siri will handle many of the prompts and tasks given by the user.
iOS 18 will be previewed at the WWDC developer conference in June, while it will only be released in the fall alongside the iPhone 16 and iPhone 16 Pro.
How do you think these AI servers will give an edge to new AI features on the iPhone, aside from running intensive processes? Let us hear your opinion.
The iPhone’s big new iOS 18 AI features will be powered by data centers running Apple silicon – TweakTown
There have been a lot of rumors and reports of late that claim Apple is going to bring some big, much-needed AI-powered features to the iPhone when iOS 18 is released later this year. While we've heard that some of those features will run on-device, others will require a server. Those cloud servers will allow Apple to handle more complex tasks, including generative AI workflows, and a new report now suggests that the servers will run Apple's own custom chips.
We've been hearing more and more about Apple's plans to put its own chips into servers of late, and it's a plan that makes sense. Apple's Macs, iPhones, and iPads all use custom-designed chips that are built by TSMC and it's proven to be a real boon for the company. More control means that Apple has a better lock on power usage and performance, and it can tailor chips to specific needs as well. In the case of servers, it's suggested Apple will produce chips that can run AI-related workflows particularly well.
There was previously no timeline for when the Apple-designed chips would be used, but a new Bloomberg report by Mark Gurman suggests that Apple will have its in-house server chips ready soon enough to power the cloud component for the iOS 18 AI push.
Gurman says that the company will put new high-end chips into cloud computing servers and then use those servers to handle more complex AI tasks that would not be practical to run on iPhones locally. However, the report notes that simpler AI-related features will still be processed directly on iPhones, iPads, and Macs.
The benefits to running some AI tasks on-device include speed and privacy. By not having to process AI tasks in the cloud Apple can remove a performance bottleneck associated with wireless data connections, for example. Privacy is a key aspect for Apple as well, and removing the need to send data to a cloud-based server has obvious benefits here.
In terms of performance, Gurman believes that the first chips to be used in Apple's data centers will be the M2 Ultra, a chip that customers can already buy in the Mac Pro and Mac Studio. However, Apple recently announced the M4 as part of the iPad Pro refresh so it's surely only a matter of time before an Ultra version of that chip is being used as well. Apple is yet to confirm when Macs with M4-series chips will be announced, however.