
Amazon SageMaker introduces new capabilities to accelerate scaling of Generative AI Inference

December 6, 2024 By admin

We are excited to announce two new capabilities in SageMaker Inference that significantly enhance the deployment and scaling of generative AI models: Container Caching and Fast Model Loader. These innovations address critical challenges in scaling large language models (LLMs) efficiently, enabling faster response times to traffic spikes and more cost-effective scaling. By reducing model loading times and accelerating autoscaling, these features allow customers to improve the responsiveness of their generative AI applications as demand fluctuates, particularly benefiting services with dynamic traffic patterns.

Container Caching reduces the time required to scale generative AI models for inference by pre-caching container images. This eliminates the need to download them when scaling up, resulting in a significant reduction in scaling time for generative AI model endpoints. Fast Model Loader streams model weights directly from Amazon S3 to the accelerator, loading models much faster than traditional methods. These capabilities allow customers to create more responsive auto-scaling policies, enabling SageMaker to add new instances or model copies quickly when defined thresholds are reached, maintaining performance during traffic spikes while managing costs effectively.
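As a rough illustration of the "auto-scaling policies" mentioned above, the sketch below builds a target-tracking policy for a SageMaker endpoint variant using the Application Auto Scaling API in boto3. It is not taken from the announcement: the endpoint name, variant name, target value, and cooldowns are hypothetical placeholders, and the helper names `build_target_tracking_policy` and `apply_policy` are invented for this example.

```python
from typing import Any, Dict


def build_target_tracking_policy(
    endpoint_name: str,
    variant_name: str,
    invocations_per_instance: float,
    scale_out_cooldown_s: int = 60,
    scale_in_cooldown_s: int = 300,
) -> Dict[str, Any]:
    """Build keyword arguments for Application Auto Scaling's put_scaling_policy.

    All values are illustrative assumptions, not recommendations.
    """
    if invocations_per_instance <= 0:
        raise ValueError("invocations_per_instance must be positive")
    return {
        "PolicyName": f"{endpoint_name}-{variant_name}-invocations",
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Scale so each instance handles roughly this many invocations per minute.
            "TargetValue": float(invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleOutCooldown": scale_out_cooldown_s,
            "ScaleInCooldown": scale_in_cooldown_s,
        },
    }


def apply_policy(policy: Dict[str, Any], min_instances: int, max_instances: int) -> None:
    """Register the variant as a scalable target and attach the policy.

    Requires AWS credentials and an existing endpoint; not called in this sketch.
    """
    import boto3  # imported lazily so the builder above runs without boto3 installed

    client = boto3.client("application-autoscaling")
    client.register_scalable_target(
        ServiceNamespace=policy["ServiceNamespace"],
        ResourceId=policy["ResourceId"],
        ScalableDimension=policy["ScalableDimension"],
        MinCapacity=min_instances,
        MaxCapacity=max_instances,
    )
    client.put_scaling_policy(**policy)


# Hypothetical endpoint and variant names, for illustration only.
policy = build_target_tracking_policy("my-llm-endpoint", "AllTraffic", 20)
```

A shorter scale-out cooldown only helps if new capacity becomes ready quickly, which is where faster container and model loading would matter; the numbers here are assumptions and would need tuning against real traffic.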

These new capabilities are available in all AWS Regions where Amazon SageMaker Inference is available. To learn more, see our documentation for detailed implementation guidance.

Amazon Aurora now available as a quick create vector store in Amazon Bedrock Knowledge Bases

December 6, 2024 By admin

Amazon Aurora PostgreSQL is now available as a quick create vector store in Amazon Bedrock Knowledge Bases. With the new Aurora quick create option, developers and data scientists building generative AI applications can select Aurora PostgreSQL as their vector store with one click to deploy an Aurora Serverless cluster preconfigured with pgvector in minutes. Aurora Serverless is an on-demand, autoscaling configuration where capacity is adjusted automatically based on application demand, making it ideal as a developer vector store.

Knowledge Bases securely connects foundation models (FMs) running in Bedrock to your company data sources for Retrieval Augmented Generation (RAG) to deliver more relevant, context-specific, and accurate responses that make your FM more knowledgeable about your business. To implement RAG, organizations must convert data into embeddings (vectors) and store these embeddings in a vector store for similarity search in generative artificial intelligence (AI) applications. Aurora PostgreSQL, with the pgvector extension, has been supported as a vector store in Knowledge Bases for existing Aurora databases. With the new quick create integration with Knowledge Bases, Aurora is now easier to set up as a vector store for use with Bedrock.
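To make the "similarity search" step concrete, here is a minimal in-memory sketch of ranking stored embeddings against a query embedding by cosine similarity. It is only an illustration of the idea: the document IDs and three-dimensional vectors are made up, real embeddings come from an embedding model and have far more dimensions, and a vector store such as Aurora PostgreSQL with pgvector would run this kind of ranking inside the database (pgvector's `<=>` operator computes cosine distance) rather than in application code.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    store: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k stored items most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "vector store": hypothetical document IDs mapped to made-up embeddings.
store = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.0],
    "api-limits": [0.0, 0.1, 0.9],
}

# A made-up query embedding that points mostly in the "refund-policy" direction.
results = top_k([0.8, 0.2, 0.0], store, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

In a RAG flow, the text behind the top-ranked entries would then be passed to the foundation model as context for the response.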

The quick create option in Bedrock Knowledge Bases is available in these regions, with the exception of AWS GovCloud (US-West), which is planned for Q4 2024. To learn more about RAG with Amazon Bedrock and Aurora, see Amazon Bedrock Knowledge Bases.

Amazon Aurora combines the performance and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. To get started using Amazon Aurora PostgreSQL as a vector store for Amazon Bedrock Knowledge Bases, take a look at our documentation.

Amazon EC2 Hpc6id instances are now available in Europe (Paris) region

December 6, 2024 By admin

Starting today, Amazon EC2 Hpc6id instances are available in an additional AWS Region, Europe (Paris). These instances are optimized to efficiently run memory-bandwidth-bound, data-intensive high performance computing (HPC) workloads, such as finite element analysis and seismic reservoir simulations. With EC2 Hpc6id instances, you can lower the cost of your HPC workloads while taking advantage of the elasticity and scalability of AWS.

EC2 Hpc6id instances are powered by 64 cores of 3rd Generation Intel Xeon Scalable processors with an all-core turbo frequency of 3.5 GHz, 1,024 GB of memory, and up to 15.2 TB of local NVMe solid state drive (SSD) storage. EC2 Hpc6id instances, built on the AWS Nitro System, offer 200 Gbps Elastic Fabric Adapter (EFA) networking for high-throughput inter-node communications that enable your HPC workloads to run at scale. The AWS Nitro System is a rich collection of building blocks that offloads many of the traditional virtualization functions to dedicated hardware and software. It delivers high performance, high availability, and high security while reducing virtualization overhead.

To learn more about EC2 Hpc6id instances, see the product detail page.

Amazon EC2 Hpc7a instances are now available in Europe (Paris) region

December 6, 2024 By admin

Starting today, Amazon EC2 Hpc7a instances are available in an additional AWS Region, Europe (Paris). EC2 Hpc7a instances are powered by 4th generation AMD EPYC processors with up to 192 cores and 300 Gbps of Elastic Fabric Adapter (EFA) network bandwidth for fast, low-latency internode communications. Hpc7a instances feature Double Data Rate 5 (DDR5) memory, which enables high-speed access to data in memory.

Hpc7a instances are ideal for compute-intensive, tightly coupled, latency-sensitive high performance computing (HPC) workloads, such as computational fluid dynamics (CFD), weather forecasting, and multiphysics simulations, helping you scale more efficiently on fewer nodes. To optimize HPC instance networking for tightly coupled workloads, you can access these instances in a single Availability Zone within a Region.

To learn more, see Amazon Hpc7a instances.

What caught your eye? (BAE jobs, Intel CEO, Space cargo)

December 6, 2024 By Alun Williams

We’re talking about BAE Systems recruiting for 2,400 jobs across the UK, Pat Gelsinger leaving Intel, and a space startup delivering cargo to precise locations on Earth…

The post What caught your eye? (BAE jobs, Intel CEO, Space cargo) appeared first on Electronics Weekly.

Fable: The CEO Who Did 6 Years Inside

December 6, 2024 By David Manners

There was once a CEO who spent $2 million on his wife’s birthday party and charged half to the company. It was an excellent bash, held in Sardinia, which included …

The post Fable: The CEO Who Did 6 Years Inside appeared first on Electronics Weekly.

The 18A Enigma

December 6, 2024 By David Manners

The abruptness of the departure of Intel’s CEO – reportedly on a ‘resign or be removed’ ultimatum – could be explained by a horrendous report out today in the Korean …

The post The 18A Enigma appeared first on Electronics Weekly.

Prototype photonic chip uses quantum states of squeezed light

December 6, 2024 By Alun Williams

BBN Technologies, of the RTX Group, is working to deliver a prototype photonic chip that uses exotic quantum states of “squeezed light”. It is part of DARPA’s INSPIRED (Intensity Squeezed …

The post Prototype photonic chip uses quantum states of squeezed light appeared first on Electronics Weekly.

Tiny automotive leadless ICs

December 6, 2024 By David Manners

Nexperia has brought out a portfolio of logic ICs in tiny automotive-qualified MicroPak XSON5 leadless packaging. MicroPak XSON5 is a thermally enhanced plastic enclosure with a 75% smaller PCB footprint …

The post Tiny automotive leadless ICs appeared first on Electronics Weekly.

Broadcom SiP enables development of custom accelerators

December 6, 2024 By David Manners

Broadcom is shipping its 3.5D eXtreme Dimension System in Package (XDSiP™) platform technology which enables consumer AI customers to develop custom accelerators (XPUs). The 3.5D XDSiP integrates more than 6000 …

The post Broadcom SiP enables development of custom accelerators appeared first on Electronics Weekly.