- Blockchain Council
- November 30, 2023
In a major stride towards advancing artificial intelligence (AI) infrastructure, Amazon Web Services (AWS) and NVIDIA have announced an expanded strategic collaboration. This collaboration aims to deliver state-of-the-art infrastructure, software, and services, specifically tailored to power generative AI innovations across various industries.
The synergy between NVIDIA’s cutting-edge multi-node systems and AWS technologies, including the Nitro System, Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability, forms the foundation of this collaboration. The goal is to provide a robust platform for training foundation models and building advanced generative AI applications.
According to Adam Selipsky, CEO of AWS, “AWS and NVIDIA have collaborated for more than 13 years, beginning with the world’s first GPU cloud instance. Today, we offer the widest range of NVIDIA GPU solutions for workloads including graphics, gaming, high-performance computing, machine learning, and now, generative AI.”
One notable aspect of this collaboration is AWS becoming the first cloud provider to host NVIDIA GH200 Grace Hopper Superchips. These superchips, featuring multi-node NVLink technology, create a formidable platform on Amazon Elastic Compute Cloud (Amazon EC2), equipped with advanced networking capabilities.
Jensen Huang, founder and CEO of NVIDIA, highlights the transformative impact of generative AI on cloud workloads, stating, “Generative AI is transforming cloud workloads and putting accelerated computing at the foundation of diverse content generation.”
The collaboration extends to hosting NVIDIA DGX Cloud, an AI-training-as-a-service offering powered by GH200 NVL32, on AWS. This move gives developers access to the largest shared memory available in a single instance, accelerating the training of advanced generative AI and large language models.
The collaborative effort further encompasses Project Ceiba, a groundbreaking initiative to design the world’s fastest GPU-powered AI supercomputer. With 16,384 NVIDIA GH200 Superchips, this supercomputer is set to deliver 65 exaflops of AI processing, serving as a catalyst for NVIDIA’s next wave of generative AI innovation.
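The headline figure can be sanity-checked with quick arithmetic. The per-chip throughput below is derived purely from the two numbers quoted above, not from any official NVIDIA specification:

```python
# Back-of-the-envelope check of Project Ceiba's headline figure,
# using only the numbers quoted in the article.
total_ai_exaflops = 65      # claimed aggregate AI performance
num_superchips = 16_384     # NVIDIA GH200 Superchips in the cluster

# Implied per-chip AI throughput, in petaflops (1 exaflop = 1,000 petaflops)
per_chip_petaflops = total_ai_exaflops * 1_000 / num_superchips
print(f"~{per_chip_petaflops:.2f} PFLOPS per superchip")  # ~3.97 PFLOPS
```

The implied figure of roughly 4 petaflops per superchip refers to low-precision AI throughput, the metric typically used in such announcements.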
AWS introduces three new Amazon EC2 instance types, catering to diverse AI workloads. P5e instances, powered by NVIDIA H200 Tensor Core GPUs, target large-scale generative AI and high-performance computing. G6 and G6e instances, equipped with NVIDIA L4 and L40S GPUs respectively, offer versatility for applications such as AI fine-tuning, inference, graphics, and video workloads. G6e instances, specifically designed for 3D workflows and digital twins, leverage NVIDIA Omniverse™.
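In practice, these instances are requested through the standard EC2 API. The sketch below assembles the keyword arguments a boto3 `run_instances` call would take; the instance-size suffix and AMI ID are illustrative placeholders (the instance families are those named in the announcement, but exact sizes and availability depend on region and account):

```python
def build_gpu_launch_params(instance_type: str, ami_id: str, count: int = 1) -> dict:
    """Assemble keyword arguments for an EC2 RunInstances request.

    The instance-type string should come from the families named in the
    announcement (P5e, G6, G6e); the size suffix and AMI ID used in the
    example below are placeholders, not confirmed values.
    """
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
        # EFA networking underpins UltraCluster-scale distributed training
        "NetworkInterfaces": [{"DeviceIndex": 0, "InterfaceType": "efa"}],
    }

# Hypothetical usage: boto3.client("ec2").run_instances(**params)
params = build_gpu_launch_params("p5e.48xlarge", "ami-0123456789abcdef0")
```

Separating parameter construction from the API call keeps the request easy to inspect and test before it touches a live account.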
According to Jensen Huang, these advancements are part of a shared mission with AWS, stating, “By focusing our chip designs on real workloads that matter to customers, we’re able to deliver the most advanced cloud infrastructure to them.”
NVIDIA’s software innovations on AWS play a pivotal role in boosting generative AI development. The introduction of the NVIDIA NeMo™ Retriever microservice and NVIDIA BioNeMo™ on Amazon SageMaker enhances capabilities for creating highly accurate chatbots and accelerates pharmaceutical companies’ drug discovery processes.
AWS leverages NVIDIA software to innovate its services and operations. For instance, the use of the NVIDIA NeMo framework to train next-generation Amazon Titan LLMs and the collaboration with Amazon Robotics using NVIDIA Omniverse Isaac for building digital twins are indicative of the impact these innovations have on real-world applications.
AWS unveils its next-generation chips, Graviton4 and Trainium2, further solidifying its commitment to offering diverse, high-performance AI infrastructure options. Graviton4 promises up to 30% better compute performance, addressing the growing demand for higher performance and larger instance sizes for in-memory databases and analytics workloads.
Trainium2, designed for AI model training, boasts up to 4x faster training performance and scalability up to 100,000 chips in EC2 UltraClusters. The focus on providing scalable and energy-efficient options for AI workloads reflects AWS’s commitment to meeting the evolving needs of its customers.