Amid all the fireworks around the Volta V100 processor at the GPU Technology Conference (GTC) last week, NVIDIA also devoted a good deal of time to its new cloud offering, the NVIDIA GPU Cloud (NGC). With NGC, along with its new Volta offerings, the company is now poised to play both ends of the cloud market: as a hardware provider and as a platform-as-a-service provider.
At the heart of NGC is a set of deep learning software stacks that can sit atop NVIDIA GPUs: not just the new Tesla V100, but also the P100 or even the consumer-grade Titan Xp. Each stack comprises popular deep learning frameworks (Caffe, Microsoft Cognitive Toolkit, TensorFlow, Theano and Torch), NVIDIA's deep learning libraries (cuDNN, NCCL, cuBLAS, and TensorRT), the CUDA drivers, and the OS. The various stacks are containerized for different environments using NVDocker (a GPU-flavored wrapper for Docker), and those stacks are then collected in a cloud registry.
Source: NVIDIA
The value proposition here is a wide choice of integrated stacks that can be used to run deep learning applications in many different environments (as long as there is a good-sized Pascal or Volta NVIDIA GPU sitting in the hardware). For an application developer, composing a coherent stack from scratch can be a chore, given the variety of deep learning frameworks and their dependencies on libraries, drivers, and the operating system. And keeping up with the latest versions of all these software components ("arguably the most complex stack of software the world has ever seen," says NVIDIA CEO Jen-Hsun Huang) adds another daunting layer of complexity. With NGC, NVIDIA removes all this fiddling with software.
NGC allows you to run your deep learning application either locally, on your own PC or DGX system, or remotely in the cloud. In fact, a typical progression would be to run your application on an in-house machine and then burst it into the cloud when greater scale is needed. "This is really the world's first hybrid deep learning cloud computing platform," noted Huang.
After you decide whether to run locally or remotely, you select the appropriate stack for the runtime environment, along with your deep learning application and your dataset. If you are running in the cloud, you will have a number of choices. A demonstration during Huang's GTC keynote showed a choice among NVIDIA's in-house DGX SATURNV supercomputer, Microsoft Azure GPU instances, and AWS GPU instances. It's not clear if the SATURNV will be generally available as a public resource, but the demo implies that it will. If so, NVIDIA would be able to charge users for both its cloud platform and the underlying infrastructure.
Beta testing on NGC will begin in July, with pricing to be determined at a future date.
NVIDIA will also use the new Volta V100 GPU to gain a bigger foothold in the cloud hyperscale space. At GTC, Amazon said it was already committed to adding the V100 to its cloud offerings as soon as NVIDIA starts cranking them out. "We'll make Volta available as the foundation for our next general-purpose GPU instance at launch," says Matt Wood, Amazon's General Manager for Deep Learning and AI.
Amazon has been a good customer of NVIDIA, using its GPUs in its own learning efforts for things like Alexa and for product recommendations associated with its online store. But making that technology available to cloud users on AWS is now driving additional GPU uptake at Amazon. Apparently, the current GPU instances are among the fastest growing for AWS. "Our most recent instance, the P2, is just growing like wildfire," says Wood. According to him, it's being used extensively for deep learning across many verticals, everything from medical imaging to autonomous driving.
Likewise, Microsoft has used NVIDIA GPUs to drive its deep learning training on Azure for several years now. Jason Zander, Microsoft corporate VP for Azure, noted that GPUs form the basis for the natural language translation capability in Skype. "That's one of the most sophisticated language deep neural nets that's out there," says Zander. "It's really cool. I can talk to someone in English and they can hear it in Chinese. We can't do that without the power of the cloud and GPUs."
Microsoft is also likely to pick up the enhanced HGX-1 GPU expansion box for the cloud, which will soon be available with V100 GPUs. The HGX-1 was co-designed by Microsoft to offer a hyperscale GPU accelerator chassis for AI. The original HGX-1, announced in March, came with eight P100 GPUs and can be expanded to a four-chassis system containing 32 GPUs. When such a system is built with the new V100s, that mini-cluster will deliver 3.8 petaflops of deep learning performance.
Source: NVIDIA
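The 3.8 petaflops figure for the four-chassis HGX-1 can be sanity-checked with simple arithmetic. The sketch below assumes a peak of 120 Tensor teraflops per V100, which is NVIDIA's announced number and is not stated in this article; the function name is purely illustrative.

```python
# Back-of-envelope check of the four-chassis HGX-1 figure quoted above.
GPUS_PER_CHASSIS = 8           # from the article
CHASSIS_COUNT = 4              # from the article
TENSOR_TFLOPS_PER_V100 = 120   # assumption: announced peak, not given in the article


def cluster_petaflops(gpus_per_chassis, chassis_count, tflops_per_gpu):
    """Aggregate peak deep learning throughput in petaflops."""
    total_gpus = gpus_per_chassis * chassis_count
    return total_gpus * tflops_per_gpu / 1000.0


total_gpus = GPUS_PER_CHASSIS * CHASSIS_COUNT
total_pflops = cluster_petaflops(GPUS_PER_CHASSIS, CHASSIS_COUNT, TENSOR_TFLOPS_PER_V100)
print(f"{total_gpus} GPUs -> {total_pflops:.2f} petaflops")  # 32 GPUs -> 3.84 petaflops
```

Under that assumption the peak comes to 3.84 petaflops, in line with the quoted 3.8. This is a theoretical peak for mixed-precision Tensor Core operations, not a measured application result.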
Amazon and Microsoft, along with most of the other cloud providers and their users, are employing GPUs to train deep neural networks. But NVIDIA wants to expand on that success with its 150-watt V100 offering. As we wrote last week, this low-power version offers 80 percent of the performance of the full 300-watt V100 part, and is aimed at the inferencing side of deep learning. That means NVIDIA is looking to sell these low-power V100s in hyperscale-sized allotments to the big cloud providers.
NVIDIA has targeted this area before, with its Maxwell M4 and M40 GPUs, and more recently with the Pascal P4 and P40 GPUs. But the new V100 offers much better performance and lower latency than any of its predecessors. NVIDIA has also upgraded the TensorRT library for Volta, which can now compile and optimize a trained neural network for ultra-fast inferencing using the V100's Tensor Cores.
Although 150 watts is a fairly high power draw for an accelerator aimed at commodity cloud servers, the rationale is that the V100 can deliver a lot more inferencing throughput per server than competing solutions, thus saving on overall datacenter costs. According to NVIDIA, just 33 nodes of P100-accelerated servers can inference 300 thousand images per second. The company estimates that is about 1/15 as many servers as a CPU-only deployment would need.
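To see why the low-power part is pitched at inferencing, it helps to work out performance per watt from the two figures above. This is a rough sketch using only the article's numbers (80 percent of the performance at 150 watts versus 300 watts); the variable and function names are illustrative, and rated power is used as a stand-in for actual draw.

```python
# Relative performance per watt of the 150 W V100 versus the 300 W part.
FULL_POWER_W = 300       # from the article
LOW_POWER_W = 150        # from the article
FULL_RELATIVE_PERF = 1.0
LOW_RELATIVE_PERF = 0.8  # "80 percent of the performance", from the article


def perf_per_watt(relative_perf, watts):
    """Performance delivered per watt of rated power, in relative units."""
    return relative_perf / watts


efficiency_gain = (
    perf_per_watt(LOW_RELATIVE_PERF, LOW_POWER_W)
    / perf_per_watt(FULL_RELATIVE_PERF, FULL_POWER_W)
)
print(f"Low-power V100 perf/watt advantage: {efficiency_gain:.1f}x")
```

By this estimate the 150-watt version delivers about 1.6 times the performance per watt of the full part, which matters more in power-constrained hyperscale racks than raw peak throughput.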
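The server-count claim can be unpacked the same way. The sketch below takes NVIDIA's figures as quoted (33 GPU-accelerated nodes, 300 thousand images per second, 1/15 as many servers as CPU-only) and derives the implied CPU node count and per-node throughput; these derived values are estimates from the quoted ratio, not numbers NVIDIA published here.

```python
# Implied server counts and per-node throughput from the quoted inference claim.
GPU_NODES = 33                # from the article
IMAGES_PER_SECOND = 300_000   # from the article
CPU_TO_GPU_NODE_RATIO = 15    # "about 1/15 as many servers", from the article

# CPU-only nodes needed for the same aggregate throughput (derived estimate).
cpu_nodes = GPU_NODES * CPU_TO_GPU_NODE_RATIO

# Per-node throughput for each kind of server (derived estimates).
gpu_node_rate = IMAGES_PER_SECOND / GPU_NODES
cpu_node_rate = IMAGES_PER_SECOND / cpu_nodes

print(f"CPU-only nodes needed: about {cpu_nodes}")
print(f"Per GPU node: about {gpu_node_rate:,.0f} images/s")
print(f"Per CPU node: about {cpu_node_rate:,.0f} images/s")
```

That works out to roughly 495 CPU-only servers, or about 9,100 images per second per GPU node against about 600 per CPU node, which is the basis for the datacenter cost argument.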
Inferencing, though, is increasingly using more specialized hardware to maximize performance and minimize power usage. Microsoft, for example, is employing FPGAs for this task, while Google has turned to its own custom-built Tensor Processing Unit (TPU). Additional purpose-built solutions from the likes of Graphcore and Intel/Nervana are also in the works. Whether low-power V100s can compete in this environment remains to be seen, but at least for the time being, NVIDIA seems to be wagering that offering more powerful deep learning silicon, which can serve both training and inferencing, will win the day. And given the nearly insatiable demand for both these days, that could be a smart bet.
More here:
With Volta, NVIDIA Pushes Harder into the Cloud - TOP500 News