Today, AWS announces the general availability of Amazon Elastic Compute Cloud (Amazon EC2) P5en instances, powered by the latest NVIDIA H200 Tensor Core GPUs. These instances deliver the highest performance in Amazon EC2 for deep learning and high performance computing (HPC) applications.
You can use Amazon EC2 P5en instances for training and deploying increasingly complex large language models (LLMs) and diffusion models that power the most demanding generative AI applications. You can also use P5en instances to deploy HPC applications at scale for pharmaceutical discovery, seismic analysis, weather forecasting, and financial modeling.
P5en instances feature up to 8 NVIDIA H200 GPUs, which provide 1.7x the GPU memory size and 1.5x the GPU memory bandwidth of the H100 GPUs featured in P5 instances. P5en instances pair the H200 GPUs with high performance custom 4th Generation Intel Xeon Scalable processors, enabling PCIe Gen5 between CPU and GPU, which delivers up to 4x the CPU-to-GPU bandwidth and boosts AI training and inference performance. With up to 3,200 Gbps of third-generation Elastic Fabric Adapter (EFA) networking using Nitro v5, P5en instances show up to 35% lower latency than P5 instances, which use the previous generation of EFA and Nitro. This improves collective communications performance for distributed workloads such as deep learning training, generative AI, real-time data processing, and HPC applications. To address customer needs for large scale at low latency, P5en instances are deployed in Amazon EC2 UltraClusters and provide market-leading scale-out capabilities for distributed training and tightly coupled HPC workloads.
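If you want to verify these hardware characteristics from your own account, the following is a minimal sketch (not part of the announcement) that uses the EC2 DescribeInstanceTypes API through boto3 to read the GPU count, GPU memory, EFA support, and network performance reported for p5en.48xlarge; the Region choice and the printed fields are illustrative assumptions.

```python
# Sketch: inspect P5en instance-type metadata with boto3 (assumed usage, not from the announcement).
import boto3

ec2 = boto3.client("ec2", region_name="us-east-2")  # US East (Ohio), one of the launch Regions

resp = ec2.describe_instance_types(InstanceTypes=["p5en.48xlarge"])
itype = resp["InstanceTypes"][0]

gpu_info = itype["GpuInfo"]
net_info = itype["NetworkInfo"]

print("GPUs:", [(g["Manufacturer"], g["Name"], g["Count"]) for g in gpu_info["Gpus"]])
print("Total GPU memory (MiB):", gpu_info["TotalGpuMemoryInMiB"])
print("EFA supported:", net_info["EfaSupported"])
print("Network performance:", net_info["NetworkPerformance"])
```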
P5en instances are now available in the p5en.48xlarge size in the US East (Ohio), US West (Oregon), and Asia Pacific (Tokyo) AWS Regions, as well as in the US East (Atlanta) Local Zone us-east-1-atl-2a.
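As a starting point, the sketch below launches a single p5en.48xlarge instance in US East (Ohio) with boto3. This is an assumed, illustrative example: the AMI ID, key pair, and subnet are placeholders you must replace, and your account needs sufficient P5en capacity (for example, through an EC2 Capacity Block) and service quota in the chosen Region.

```python
# Sketch: launch one p5en.48xlarge instance (placeholders must be replaced).
import boto3

ec2 = boto3.client("ec2", region_name="us-east-2")  # US East (Ohio)

resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",      # placeholder: e.g., an AWS Deep Learning AMI in this Region
    InstanceType="p5en.48xlarge",
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",                # placeholder key pair name
    SubnetId="subnet-0123456789abcdef0",  # placeholder subnet in a supported Availability Zone
)

print("Launched instance:", resp["Instances"][0]["InstanceId"])
```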
To learn more about P5en instances, see Amazon EC2 P5en Instances.