AMD's MI300 Series: Revolutionizing Energy Efficiency with Smart Shift Technology and Holistic Design Innovations


01/02/2024


In the contemporary era, our thirst for data seems insatiable. As artificial intelligence (AI) adoption surges and industries reliant on intensive computing pursue heightened performance, the demand for energy continues to soar. To meet this demand, new data centers are cropping up and expanding, catering to the needs of cloud infrastructures, high-performance AI applications, supercomputing, and more.

However, as we push the boundaries of computing capabilities, future advancements are increasingly constrained by energy limitations. Collectively, we are on a trajectory to consume more energy than the market can sustain within the next two decades, underscoring the urgency for innovative energy solutions.
 
AMD is at the forefront of driving innovation for energy-efficient computing performance. Our focus on energy efficiency extends deeply into product development, where we meticulously design for power optimization across architecture, connectivity, packaging, and software. This commitment to energy efficiency is not just about cost reduction but also about preserving natural resources and mitigating climate impacts.
 
In 2021, we unveiled our vision to achieve a 30x improvement in energy efficiency by 2025 from a 2020 baseline for accelerated data center compute nodes, known as our 30x25 goal. These nodes, powered by AMD EPYC™ CPUs and AMD Instinct™ accelerators, cater to the rapidly growing computing needs in AI training and high-performance computing (HPC) applications.

Although their demands for computing power and energy keep growing, these applications play a crucial role in addressing some of humanity's most pressing challenges. Supercomputers such as the AMD-powered Frontier at Oak Ridge National Laboratory use accelerated compute to advance scientific research, genomics, drug discovery, climate prediction, and AI, shaping the next generation of computing. AMD now proudly powers eight of the top 10 most energy-efficient supercomputers on the latest Green500 list, including Frontier and the Adastra supercomputer.
 
As of 2023, we are excited to report significant progress towards our 30x25 goal with the launch of the AMD Instinct MI300 series accelerators. Building on the strides made in the previous year, we remain optimistic about achieving our goal by 2025. The latest performance data indicates a remarkable 13.5x improvement from the 2020 baseline using a configuration of four AMD Instinct MI300A APUs (GPU with integrated 4th Gen EPYC™ “Genoa” CPU). Our progress aligns with a measurement methodology validated by Dr. Jonathan Koomey, a renowned researcher and author specializing in compute energy efficiency.
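To put the 13.5x figure in context, the remaining distance to the goal can be sketched with a few lines of arithmetic. This is only an illustration: it assumes the 13.5x result reflects three years of progress (2020 to 2023), that two years remain, and that gains compound evenly from year to year. The function name and the even-compounding model are assumptions for the example, not part of the validated measurement methodology.

```python
def annual_rate(total_gain: float, years: float) -> float:
    """Constant yearly multiplier that compounds to total_gain over the given years."""
    return total_gain ** (1.0 / years)


GOAL = 30.0      # 30x25 target, relative to the 2020 baseline
ACHIEVED = 13.5  # reported improvement with four MI300A APUs

# Assumption: 3 years elapsed (2020 -> 2023) and 2 years remain (2023 -> 2025).
YEARS_ELAPSED = 3
YEARS_LEFT = 2

# Gain still required on top of what has already been achieved.
remaining_gain = GOAL / ACHIEVED

print(f"Remaining gain needed:        {remaining_gain:.2f}x")
print(f"Average yearly gain so far:   {annual_rate(ACHIEVED, YEARS_ELAPSED):.2f}x")
print(f"Yearly gain needed from here: {annual_rate(remaining_gain, YEARS_LEFT):.2f}x")
```

Under these assumptions, a little more than a further doubling of energy efficiency is needed between the 2023 result and the 2025 target.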
 
In our AMD Instinct MI300 Series accelerators, we use AMD Smart Shift technology, which dynamically allocates power between the CPU and GPU based on the needs of the application, such as generative AI workloads. This exemplifies the strength of our APU architecture, which integrates CPU and GPU capabilities within a single package, and showcases the innovation driven by our ambitious energy efficiency goals. Smart Shift technology originated in our pursuit of the 25x20 energy efficiency goal, an effort that delivered a 31.7x improvement in the energy efficiency of our mobile processors over the 2014 baseline.
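The general idea of sharing one power budget between a CPU and a GPU can be illustrated with a toy model. The sketch below is not AMD's algorithm: the function `share_power`, the proportional-scaling rule, and all wattage figures are hypothetical and chosen only to show how a fixed package budget might shift toward whichever side the workload stresses.

```python
def share_power(budget_w: float, cpu_demand_w: float, gpu_demand_w: float) -> tuple[float, float]:
    """Split a fixed package power budget between CPU and GPU (toy model).

    If the combined demand fits within the budget, each side gets what it asks for.
    Otherwise both requests are scaled down by the same factor, so the side with
    the larger demand keeps the larger share of the budget.
    """
    total_demand = cpu_demand_w + gpu_demand_w
    if total_demand <= budget_w:
        return cpu_demand_w, gpu_demand_w
    scale = budget_w / total_demand
    return cpu_demand_w * scale, gpu_demand_w * scale


BUDGET_W = 500.0  # hypothetical package budget, not an MI300A specification

# GPU-heavy phase (for example, a generative AI kernel): demand exceeds the budget.
cpu_w, gpu_w = share_power(BUDGET_W, cpu_demand_w=100.0, gpu_demand_w=600.0)
print(f"GPU-heavy phase: CPU {cpu_w:.1f} W, GPU {gpu_w:.1f} W")

# CPU-heavy phase (for example, data preparation): demand fits, nothing is scaled.
cpu_w2, gpu_w2 = share_power(BUDGET_W, cpu_demand_w=300.0, gpu_demand_w=150.0)
print(f"CPU-heavy phase: CPU {cpu_w2:.1f} W, GPU {gpu_w2:.1f} W")
```

A real power manager would also account for factors such as thermal limits and minimum operating power, which this sketch leaves out.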
 
The design decisions incorporated into the MI300A give it a performance-per-watt advantage of more than 2x over comparable chips from competitors. This translates into substantial benefits, including noteworthy reductions in electricity consumption, greenhouse gas (GHG) emissions, and the overall cost of ownership at the solution level.
 
While this progress is commendable, there is still work to be done.
 
As Moore's Law exhibits diminishing returns and transistor costs escalate, the industry faces a turning point in technology improvement trends. The slowdown is evident in both density and energy per operation improvements, especially as we advance beyond 5nm nodes into 3nm and 2nm territories. Simply relying on smaller transistors is insufficient to drive significant performance and efficiency gains in future processor generations. To navigate this challenge and continue innovating, we adopt a holistic design perspective, reimagining everything from packaging and architecture to memory and software.
 
Although there's more ground to cover to achieve our 30x25 goal, I am pleased with the dedication of our engineers and the results thus far, and I remain optimistic. The MI300 represents a substantial leap forward, and we invite you to stay updated with our annual progress reports as we strive towards our objectives.