Accelerate your generative AI distributed training workloads with the NVIDIA NeMo Framework on Amazon EKS

In today’s rapidly evolving landscape of artificial intelligence (AI), training large language models (LLMs) poses significant challenges. These models often require enormous computational resources and sophisticated infrastructure to handle the vast amounts of data and complex algorithms involved. Without a …

Scale and simplify ML workload monitoring on Amazon EKS with AWS Neuron Monitor container

Amazon Web Services is excited to announce the launch of the AWS Neuron Monitor container, an innovative tool designed to enhance the monitoring capabilities of AWS Inferentia and AWS Trainium chips on Amazon Elastic Kubernetes Service (Amazon EKS). This solution …

Get started quickly with AWS Trainium and AWS Inferentia using AWS Neuron DLAMI and AWS Neuron DLC

Starting with the AWS Neuron 2.18 release, you can now launch Neuron DLAMIs (AWS Deep Learning AMIs) and Neuron DLCs (AWS Deep Learning Containers) with the latest released Neuron packages on the same day as the Neuron SDK release. When …

Scale AI training and inference for drug discovery through Amazon EKS and Karpenter

This is a guest post co-written with the leadership team of Iambic Therapeutics.

Iambic Therapeutics is a drug discovery startup with a mission to create innovative AI-driven technologies to bring better medicines to cancer patients, faster.

Our advanced generative and …

Open source observability for AWS Inferentia nodes within Amazon EKS clusters

Recent developments in machine learning (ML) have led to increasingly large models, some of which require hundreds of billions of parameters. Although they are more powerful, training and inference on those models require significant computational resources. Despite the availability of …

How Sportradar used the Deep Java Library to build production-scale ML platforms for increased performance and efficiency

This is a guest post co-written with Fred Wu from Sportradar.

Sportradar is the world’s leading sports technology company, at the intersection between sports, media, and betting. More than 1,700 sports federations, media outlets, betting operators, and consumer platforms across …

文 » A