Google has introduced its new A3 cloud supercomputer, which is now available in private preview.
The new powerhouse can be used to train Machine Learning (ML) models, continuing the tech giant's recent push to provide cloud infrastructure for AI applications, such as the new G2, the first cloud Virtual Machine (VM) to use the new NVIDIA L4 Tensor Core GPU.
In a blog post, the company noted, "Google Compute Engine A3 supercomputers are purpose-built to train and serve the most demanding AI models that power today's generative AI and large language model innovation."
A2 vs. A3
The A3 uses the Nvidia H100 GPU, the successor to the popular A100, which powered the previous A2. The A100 is also used to power ChatGPT, the AI writer that kickstarted the generative AI race when it launched in November last year.
The A3 will also be the first VM in which the GPUs use Google's custom-designed 200 Gbps IPUs, which allow for ten times the network bandwidth of the previous A2 VMs.
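To put that per-VM bandwidth in perspective, here is a back-of-the-envelope sketch. The model size below is an assumed example for illustration, not a figure from the announcement, and the 20 Gbps baseline is simply one tenth of 200 Gbps, per the ten-times claim:

```python
# Rough illustration: time to move a set of model weights at A3's stated
# 200 Gbps per-VM bandwidth vs. an assumed one-tenth (20 Gbps) baseline.
def transfer_seconds(size_gb: float, gbps: float) -> float:
    """Seconds to transfer size_gb gigabytes at gbps gigabits per second."""
    return size_gb * 8 / gbps  # 8 bits per byte

weights_gb = 140  # assumed example: a 70B-parameter model in 16-bit precision
print(transfer_seconds(weights_gb, 200))  # at 200 Gbps
print(transfer_seconds(weights_gb, 20))   # at the 10x-slower baseline
```

At these assumed numbers, the same transfer that takes under six seconds on A3-class bandwidth would take nearly a minute at one tenth the rate.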
The A3 will also make use of Google's Jupiter data center network, which can scale to tens of thousands of interconnected GPUs and "allows for full-bandwidth reconfigurable optical links that can adjust the topology on demand."
Google also claims that the "workload bandwidth… is indistinguishable from more expensive off-the-shelf non-blocking network fabrics, resulting in a lower TCO." The A3 also "delivers up to 26 exaFlops of AI performance, which considerably improves the time and costs for training large ML models."
When it comes to inference workloads, the real work that generative AI performs, Google makes another bold claim: the A3 achieves a 30x inference performance boost over the A2.
In addition to the eight H100s with 3.6 TB/s bisectional bandwidth between them, the other standout specs of the A3 include next-generation 4th Gen Intel Xeon Scalable processors and 2TB of host memory via 4800 MHz DDR5 DIMMs.
"Google Cloud's A3 VMs, powered by next-generation NVIDIA H100 GPUs, will accelerate training and serving of generative AI applications," said Ian Buck, vice president of hyperscale and high performance computing at NVIDIA.
In a complementary announcement at Google I/O 2023, the company also said that generative AI support in Vertex AI will now be available to more customers, allowing them to build ML models on fully managed infrastructure that forgoes the need for maintenance.
Customers will also be able to deploy the A3 on Google Kubernetes Engine (GKE) and Compute Engine, which means they get support for autoscaling and workload orchestration, as well as automatic upgrades.
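As a minimal sketch of what such a GKE deployment might involve, the plain-data structure below mirrors the shape of a node-pool configuration with the autoscaling and auto-upgrade features mentioned above. The machine-type and accelerator names are assumptions for illustration, not values confirmed in this article:

```python
# Illustrative sketch only: a GKE node-pool config for A3-class VMs,
# expressed as plain data. Names below are assumptions, not confirmed here.
a3_node_pool = {
    "name": "a3-training-pool",
    "config": {
        "machineType": "a3-highgpu-8g",  # assumed A3 machine-type name
        "accelerators": [
            {
                "acceleratorType": "nvidia-h100-80gb",  # assumed H100 name
                "acceleratorCount": 8,  # eight H100s per VM, per the specs
            }
        ],
    },
    # Autoscaling and automatic upgrades are the GKE features the article cites.
    "autoscaling": {"enabled": True, "minNodeCount": 0, "maxNodeCount": 4},
    "management": {"autoUpgrade": True},
}

# With autoscaling at its assumed maximum, the pool would expose 32 GPUs.
total_gpus_at_max = (
    a3_node_pool["autoscaling"]["maxNodeCount"]
    * a3_node_pool["config"]["accelerators"][0]["acceleratorCount"]
)
print(total_gpus_at_max)
```

In practice such a pool would be created through the GKE API or CLI rather than built by hand, but the data shape above shows how the eight-GPU VM and the autoscaling bounds fit together.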
It seems that Google is taking the B2B approach when it comes to AI, rather than unleashing an AI for anyone to play around with, perhaps having been burnt by the inauspicious launch of its ChatGPT rival, Google Bard. However, it also announced PaLM 2, its successor, at Google I/O, which is supposedly more powerful than other LLMs, so we'll have to watch this space.