AI models are exploding in size and complexity as they improve conversational AI with hundreds of billions of parameters, enhance deep recommender systems with embedding tables spanning tens of terabytes, and enable new scientific discoveries. These massive models are pushing the limits of today's systems. Continuing to scale them for accuracy and usefulness requires fast access to a large pool of memory and tight coupling of the CPU and GPU.
The NVIDIA Grace CPU leverages the flexibility of the Arm® architecture to create a CPU and server architecture designed from the ground up for accelerated computing. This innovative design will deliver up to 30X higher aggregate bandwidth than today's fastest servers and up to 10X higher performance for applications running on terabytes of data. NVIDIA Grace is designed to enable scientists and researchers to train the world's largest models and solve the most complex problems.