w.media HPC Live Webinar 2026: Deep-diving into the next wave of AI and HPC

April 28, 2026 at 12:03 PM GMT+8

w.media recently hosted HPC Live Webinar 2026, where several industry leaders discussed the key engineering and infrastructure issues shaping the next wave of AI and HPC development. These themes will form the foundation of our upcoming High Performance Computing Southeast Asia Summit on June 24th, co-located with the SIJORI CDC event on June 25th. The HPC summit will bring together hyperscalers, operators, research institutions, manufacturers, and policy leaders to discuss the future of HPC in Southeast Asia.

Moderated by Paul Mah, Executive Editor of w.media, the webinar comprised three panel sessions, with highlights from each below:

Session 1: Scaling Next-Generation AI: How HPC Clusters Will Evolve in 2025–2030

Jordan Nanos, a member of the technical team at SemiAnalysis, and Paul Skaria, who leads Southeast Asia technical and solution engagement for HPC and AI workloads at AMD, shared their views.

Paul Mah: Paul [Skaria], from AMD’s vantage point across both CPUs and accelerators (GPUs), how are AI and HPC workloads reshaping system architecture over the next few years?

Paul Skaria: If you look at it from a workload perspective, there is no one-size-fits-all. AMD is unique because we have a broad portfolio: CPUs for general-purpose high-performance computing, GPUs for accelerated computing, FPGAs for flexibility and programmability, and adaptive solutions for edge computing. Of course, there is also the networking piece. This broad range of compute solutions lets customers match the right engine to their specific workload.

Traditionally, HPC has been very CPU-focused. We have continuously pushed performance and density with each generation of our EPYC CPUs, and each of the five generations has delivered significant gains in performance per core and per socket. The 5th-generation Turin offers 192 cores in a single socket, roughly six times the compute density of the 1st generation. The next generation, Venice, will follow the same path and increase compute density to up to 256 cores. This means compute becomes more efficient in a smaller footprint.
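As a rough sanity check on that six-times figure, here is a minimal back-of-envelope sketch; the first-generation core count is our own assumption based on public EPYC specifications, not a number quoted in the session.

```python
# Back-of-envelope check of the generational density claim above.
# The Naples core count is an assumption from public specs, not from the talk.
naples_cores_per_socket = 32    # 1st-gen EPYC "Naples" top SKU (assumed)
turin_cores_per_socket = 192    # 5th-gen EPYC "Turin" dense variant (quoted)

density_gain = turin_cores_per_socket / naples_cores_per_socket
print(f"Cores per socket: {naples_cores_per_socket} -> {turin_cores_per_socket} "
      f"(~{density_gain:.0f}x over five generations)")
```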

HPC applications are steadily tapping into GPU acceleration, where our Instinct GPUs are key. The MI250 series was already powering the world's first exascale supercomputer, Frontier. The following generation, the MI300A, is an APU: from an architecture point of view, it integrates the CPU, the GPU, and high-bandwidth memory into a unified chiplet package using 3D stacking technology. That tight integration removes one of the biggest performance bottlenecks in GPU-based computing: the mem-copy. Typically, you need to copy CPU memory to GPU memory, which is expensive in terms of latency and power consumption. The integration instead allows coherent access to the same dataset for both CPU and GPU. The current TOP500 number one supercomputer, El Capitan, is powered by this chip.
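To make the mem-copy cost concrete, here is a minimal PyTorch sketch of our own (not AMD code) that times the explicit host-to-device transfer a coherent APU design avoids; it assumes PyTorch with a CUDA- or ROCm-capable GPU.

```python
import time
import torch

# Allocate a tensor in host (CPU) memory: ~67 MB of float32 data.
x = torch.randn(4096, 4096)

t0 = time.perf_counter()
x_gpu = x.to("cuda")         # explicit mem-copy: host DRAM -> GPU HBM
torch.cuda.synchronize()     # wait for the transfer to complete
t1 = time.perf_counter()

size_mb = x.numel() * x.element_size() / 1e6
print(f"Host-to-device copy of {size_mb:.0f} MB took {(t1 - t0) * 1e3:.1f} ms")
```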

AI has added a new dimension. While HPC focuses on full-precision FP64 for scientific simulation, AI doesn't need that much accuracy; it can run effectively at lower precisions like FP4, FP8, or FP16, which allows faster processing at lower power. That's why we created a separate line of GPU accelerators. In 2023, we launched the MI300X series specifically for AI workloads. We also increased the HBM memory, which is vital for AI workloads because it lets bigger models fit in a small footprint. This helps customers achieve better TCO when adopting Instinct-based solutions designed for AI.
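A simple back-of-envelope illustration of why precision and HBM capacity matter so much: the weight footprint of a model scales directly with the number of bytes per parameter. The 70-billion-parameter figure below is our own illustrative assumption, not a model discussed in the session.

```python
# Approximate weight-memory footprint at different numeric precisions.
params = 70e9                                   # illustrative 70B-parameter model
bytes_per_param = {"FP64": 8, "FP16": 2, "FP8": 1, "FP4": 0.5}

for fmt, nbytes in bytes_per_param.items():
    print(f"{fmt:>4}: ~{params * nbytes / 1e9:.0f} GB of weights")
```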

Paul Mah: Jordan, from a technical perspective, what are some of the most misunderstood trends in HPC and AI infrastructure today?

Jordan Nanos: I think we can look at this across three layers. First, there are misunderstandings about how these systems behave. In the past, CPU and GPU performance was constrained by flops; you simply added more flops for more performance. Now, especially for LLMs at low batch sizes, you need more memory bandwidth. At scale, however, the bottleneck moves to the network.

Running training and inference for large models requires high-bandwidth connections between machines. Performance is now tied to the programming model for clusters: things like disaggregation of prefill and decode, KV cache offloading, or wide expert parallelism. People use the term “CUDA moat,” but it would be more accurate to talk about collective libraries like NCCL or serving frameworks like Dynamo or vLLM. How to program a cluster, as opposed to a single chip, is often misunderstood.

Second is packaging. People talk about lithography (5nm to 3nm), but advancements are driven by advanced packaging. We saw this with HBM and CoWoS, but now it is moving to system-level advanced packaging like Co-Packaged Optics (CPO). We are optimizing outside the server at the rack level.

Finally, bottlenecks are moving. We saw stock prices change for SSD and hard drive vendors, and recently we saw huge price increases in DRAM. Next will be data center equipment and switch ASICs. There are many bottlenecks to be found as we build this out.
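Circling back to the memory-bandwidth point in the first layer: at low batch sizes, every generated token streams essentially the full set of model weights out of HBM, so bandwidth rather than flops sets the ceiling on tokens per second. A rough sketch, with both figures assumed for illustration only:

```python
# Roofline-style ceiling for single-stream LLM decoding (all numbers assumed).
hbm_bandwidth_gb_s = 5300     # in the range of a current HBM3e accelerator
weight_bytes_gb = 140         # e.g. a 70B-parameter model held at FP16

tokens_per_s_ceiling = hbm_bandwidth_gb_s / weight_bytes_gb
print(f"Decode ceiling at batch size 1: ~{tokens_per_s_ceiling:.0f} tokens/s")
```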

Paul Mah: Do you see a difference in how people approach HPC versus AI?

Jordan Nanos: AI is just another HPC workload, but it is distinct. It uses low precision and has a massive emphasis on the network. It is being done at a completely different scale than traditional weather or crash simulations. The biggest trend is that AI is eating traditional HPC workloads. Whether it is drug discovery or weather prediction, performance is improved by using an AI model.

Paul Skaria: The infrastructure is similar, but the precision requirements differ. From a chip point of view, you have to make a tradeoff on silicon space between low-precision and high-precision engines. That’s why we have separate product lines (MI430X for HPC vs MI455X for AI). In the future, more HPC workloads may use AI to simulate, but we are not there yet.


‘The biggest trend is that AI is eating traditional HPC workloads.’

Session 2: Building HPC-Ready Data Centers: Liquid Cooling, Power Strategy & 100kW+ Racks

The speakers were Mark Langford, Regional Technical Director at STULZ, a Germany-based global manufacturer of precision cooling equipment, and Indrama YM Purba, CEO of NeutraDC Nexera Batam, a joint venture between Telkom (through NeutraDC) and Singtel (through Nexera).

Paul Mah: Indrama, with regard to your campus in Batam, how are customer requirements for AI and HPC fundamentally different from traditional colocation demand?

Indrama YM Purba: Traditionally, power density per rack was low, around 5 to 15 kW, using standard air cooling and gradual scaling. Right now, AI and HPC customers come to us with high-density requirements of up to 60 kW, and sometimes even 100 kW per rack. It is a huge demand. Furthermore, it is not gradual; customers want that capacity on Day 1. Every piece of infrastructure, from cooling and power to low-latency connectivity, must be ready immediately. We also had to anticipate the live load on the slabs. GPU systems are much heavier than traditional CPU systems, so floor strength is a non-negotiable requirement. If you cannot support them from Day 1, customers will go to another player.

Paul Mah: Mark, as densities go beyond 40-60 kW, where do traditional air cooling approaches start to break down?

Mark Langford: Designs originally developed for general-purpose IT are being challenged. In the past, a telco rack drew less than a kilowatt. Now, as we move beyond 50 or 60 kW, we face challenging situations. Simplistically, air becomes a problem when it can no longer support the workload. Raised floors are a major limit: they struggle with the physical weight of AI servers, and the plenums simply don't provide enough airflow. Air cooling was great for delivering 20°C air to low-to-medium density servers, but as we scale, the airflow requirements become impossible to meet. You end up with noise problems, velocity issues, and hotspots. Eventually, you hit a wall where you cannot force enough air into the box; the server fans aren't large enough to make that an efficient process anymore.
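The airflow wall can be seen from the standard sensible-heat relationship (heat removed = air density × specific heat × volumetric flow × temperature rise), rearranged to give the volume of air a rack needs; the temperature rise and rack powers below are our own illustrative assumptions, not figures from the session.

```python
# Airflow needed to remove rack heat with air alone (illustrative numbers).
RHO = 1.2        # air density, kg/m^3
CP = 1005        # specific heat of air, J/(kg*K)
DELTA_T = 15     # assumed server inlet-to-outlet temperature rise, K

for rack_kw in (10, 40, 60, 100):
    flow_m3_s = rack_kw * 1000 / (RHO * CP * DELTA_T)
    flow_cfm = flow_m3_s * 2118.88          # m^3/s -> cubic feet per minute
    print(f"{rack_kw:>3} kW rack: ~{flow_cfm:,.0f} CFM of airflow required")
```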

Paul Mah: At what point does it become necessary to use liquid cooling rather than just as an optimization?

Mark Langford: There are three lines in the sand. Up to 40-50 kW, liquid cooling is an optimization—it’s an energy-saving tool to reduce the load on chillers. At 60 to 70 kW, liquid becomes necessary. You might use in-row cooling or rear-door heat exchangers, but you should be planning for liquid. Once you look at the roadmap toward 130 kW, 300 kW, or 600 kW per rack, liquid cooling is an absolute non-negotiable. Currently, most servers are hybrid: about 80% of the heat (CPU and GPU) is liquid-cooled, but 20% (RAM and power modules) still requires air.
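One practical consequence of that roughly 80/20 hybrid split is that the residual air-cooled load still grows with rack density, so the air system cannot disappear even in liquid-cooled halls. A simple sketch using the split quoted above, with the rack powers taken from the roadmap figures mentioned:

```python
# Residual air-cooled load under an ~80% liquid / 20% air hybrid split.
LIQUID_FRACTION = 0.80

for rack_kw in (50, 130, 300, 600):
    air_kw = rack_kw * (1 - LIQUID_FRACTION)
    print(f"{rack_kw:>3} kW rack: ~{air_kw:.0f} kW still rejected to air")
```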

Indrama YM Purba: For our baseline, up to 30 kW can use air, but beyond 60 kW we move to liquid. Installation is risky; every vendor must provide certified installers to ensure there are no leaks. We use a hybrid model: low-density zones use air, and HPC zones are liquid-ready. We let the customer choose their technology and work with them to fit it into our data hall.


Session 3: The Fabric Behind HPC: Scaling Interconnects for AI, Simulation, and Real-Time Workloads

The sole speaker was Magelli Roxas from the Supermicro Rack Team, which designs and deploys GPU systems and rack-scale products.

Paul Mah: How have server designs evolved to support modern AI workloads?

Magelli Roxas: The major shift is to liquid cooling and to packing as much as possible into a small footprint. It isn't just about space; it is about power and cooling density. We are also increasing data rates: we are talking 800 gig on CX-8 ports. We now use fiber more than copper because copper cannot provide the necessary reach. We are seeing racks move from 42U to 48U and even 52U, and a migration toward OCP MGX racks, where power requirements differ from the traditional EIA standard.

Paul Mah: How do customers decide between air cooling, hybrid, or liquid-cooled server platforms? 

Magelli Roxas: It depends on the location. Many data centers are simply not set up for liquid cooling. For them, the easy answer is to stay with air-cooled products for now. However, if a data center is looking to expand, we suggest liquid because that is where the market is headed. If you are upgrading your facility, you might as well do it now. The hybrid solution is also popular; even on liquid systems, some components are still air-cooled. We are cooling roughly 95% of the server with liquid, depending on the technology. For those conscious about heat release into the room, we offer rear-door heat exchangers. These can absorb 50 to 80 kW of heat, allowing you to deploy in a legacy data center without a containment aisle, provided you have the power.

Paul Mah: What role does modularity play in helping customers scale or pivot?

Magelli Roxas: Customers want it now. Investors need to see the term “AI” in their deployments immediately. NVIDIA has been great at designing reference architectures that people follow. At Supermicro, we follow those designs while optimizing components and validating them before shipping. Modularity is key because most people don't bring 100 MW online at once; they scale up. Standardizing the design makes deployment much faster.