Google unveiled Trillium, the sixth generation of its Tensor Processing Unit (TPU) for data centers, at the I/O 2024 developer conference. The company has not given an exact release date but confirmed that Trillium will be available later in 2024.
Google CEO Sundar Pichai highlighted the company’s long-standing dedication to AI innovation, stating, “Google was born for this moment. We have been a pioneer in TPUs for more than a decade.”
What does Trillium TPU offer?
Pichai then detailed Trillium’s performance gains. The sixth-generation TPU delivers a 4.7 times increase in peak compute performance per chip compared to the previous generation, TPU v5e. Google attributes the improvement to advancements in the chip’s matrix multiply units (MXUs) and a higher clock speed. Trillium also offers twice the memory bandwidth.
Trillium also features Google’s third-generation SparseCore technology, described as “a purpose-built accelerator for common large-scale tasks in advanced ranking and recommendation workloads.” This allows Trillium TPUs to train models more quickly and provide lower latency when serving those models.
Google also focused on energy efficiency, with Pichai calling Trillium the company’s “most energy-efficient” TPU to date. This is particularly important given the growing demand for AI chips, which can have a significant environmental impact. Google claims that Trillium is 67% more energy-efficient than its predecessor.
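A quick worked example helps put the 67% figure in context. The sketch below assumes that "more energy-efficient" means work done per unit of energy rises by 67%; the article does not state the exact metric, so both that reading and the helper name `relative_energy` are assumptions made for illustration.

```python
# Figure from the article: Trillium is described as 67% more
# energy-efficient than its predecessor.
EFFICIENCY_GAIN = 0.67


def relative_energy(efficiency_gain: float) -> float:
    """Energy needed for a fixed amount of work, relative to the older chip.

    Assumption: efficiency is work per joule, so a gain of g means the
    same work costs 1 / (1 + g) of the original energy.
    """
    return 1.0 / (1.0 + efficiency_gain)


ratio = relative_energy(EFFICIENCY_GAIN)
print(f"Energy for the same work: {ratio:.1%} of the predecessor")
print(f"Energy saved: {1 - ratio:.1%}")
```

Under that assumption, a 67% efficiency gain corresponds to roughly 40% less energy for the same workload, not 67% less.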
“Trillium can scale up to 256 TPUs in a single high-bandwidth, low-latency pod. Beyond this pod-level scalability, with multislice technology and Titanium Intelligence Processing Units (IPUs), Trillium TPUs can scale to hundreds of pods, connecting tens of thousands of chips in a building-scale supercomputer interconnected by a multi-petabit-per-second datacenter network.”
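The scaling claim in the quote is simple arithmetic, sketched below. Only the 256-chips-per-pod figure comes from the quote; the pod counts and the helper names `total_chips` and `pods_needed` are hypothetical values chosen for illustration, not published cluster sizes.

```python
import math

CHIPS_PER_POD = 256  # from the quoted statement


def total_chips(pods: int, chips_per_pod: int = CHIPS_PER_POD) -> int:
    """Total chips across a number of fully populated pods."""
    return pods * chips_per_pod


def pods_needed(chips: int, chips_per_pod: int = CHIPS_PER_POD) -> int:
    """Smallest number of full pods that holds at least `chips` chips."""
    return math.ceil(chips / chips_per_pod)


# Hypothetical pod counts, for illustration only.
for pods in (1, 100, 300):
    print(f"{pods:>3} pods -> {total_chips(pods):,} chips")

print("Pods needed for 10,000 chips:", pods_needed(10_000))
```

With 256 chips per pod, about 40 pods are enough to pass 10,000 chips, which is consistent with "hundreds of pods" reaching "tens of thousands of chips".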
While the spotlight often shines on software announcements and AI advancements, it’s the robust hardware developments like Trillium that power these advancements and make them possible. The unveiling of this new TPU underscores a fundamental truth in the tech world: processing power is everything.
Trillium TPUs form an integral part of Google Cloud’s AI Hypercomputer, an advanced supercomputing framework crafted specifically for high-end AI workloads. This architecture combines performance-optimized infrastructure, including Trillium TPUs, with open-source software frameworks and adaptable consumption models.
Google’s support for open-source libraries such as JAX, PyTorch/XLA, and Keras 3 gives developers a choice of frameworks. Support for JAX and XLA means that declarative model descriptions written for earlier TPU generations remain compatible with the new hardware and networking capabilities of Trillium TPUs. Furthermore, Google’s collaboration with Hugging Face on Optimum-TPU simplifies model training and deployment.
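To illustrate what a hardware-independent model description can look like, here is a minimal JAX sketch. It is not Trillium-specific: the function name `dense_layer` and the array shapes are made up for this example, and the code simply runs on whatever backend JAX finds (CPU, GPU, or TPU). The point is that the model code contains no reference to a particular chip; XLA handles compilation for the device that is present.

```python
import jax
import jax.numpy as jnp


def dense_layer(params, x):
    """A single dense layer with ReLU, written as a pure function."""
    w, b = params
    return jax.nn.relu(x @ w + b)


# jax.jit traces the function and hands it to XLA, which compiles it
# for the available backend. Nothing here names a specific device.
dense_layer_jit = jax.jit(dense_layer)

# Illustrative shapes only: batch of 2, 8 input features, 4 outputs.
key = jax.random.PRNGKey(0)
k_w, k_x = jax.random.split(key)
w = jax.random.normal(k_w, (8, 4))
b = jnp.zeros((4,))
x = jax.random.normal(k_x, (2, 8))

out = dense_layer_jit((w, b), x)
print("backend:", jax.default_backend())
print("output shape:", out.shape)
```

The same script can be run unchanged on a laptop CPU or on a Cloud TPU VM; only the reported backend differs.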
Google Cloud TPUs represent the pinnacle of AI acceleration, engineered and optimized to power large-scale artificial intelligence models. Available exclusively through Google Cloud, these TPUs offer unmatched performance and cost-efficiency for both training and deploying AI solutions. Whether dealing with the intricate complexities of large language models or the creative demands of image generation, TPUs enable developers and researchers to extend the frontiers of artificial intelligence.