- aws
- ml
- ai
- cloud
- autonomous-driving
- llm
- agentic-ai
- rag
- multi-tenant
-
Five Rules for Multi-Agent Coding Teams — Derived From 27 Controlled Experiments
27 controlled experiments across 13 configurations yield five operating rules for multi-agent LLM coding teams: smaller teams win; use a shared directory with scoped writes; run nightly tests with failure injection; assign a dedicated DevOps agent; run N≥2 trials per configuration.
-
Guidance for Multi-Tenant Knowledge Base Management for Scalable RAG Applications on AWS
A centralized synchronization system that automatically distributes knowledge base updates across multi-tenant RAG applications — reducing operational overhead by up to 60% while ensuring tenant isolation and real-time content consistency.
-
FlashAttention on Trainium: Can an LLM Write Expert-Level Hardware Kernels?
We benchmark 10 NKI attention kernels on AWS Trainium, then show that Claude Opus 4.6 can automatically generate a kernel that matches the best hand-optimized performance, the first demonstration of LLM-driven kernel generation reaching expert-level results on a custom accelerator.
-
Beating Claude Opus 4.5 at Kernel Generation with a 3B-Active RL Agent
A 30B MoE model with only 26.7M LoRA parameters generates faster NKI kernels than Claude Opus 4.5 — achieving 1.47x speedup and 94% fast rate on 250 benchmark tasks.
-
Video-Text Temporal Localization via Multi-Scale Convolution and Dynamic Routing
A lightweight framework for video-text temporal localization that combines multi-scale temporal convolution and capsule-based dynamic routing to achieve accurate, efficient, and interpretable alignment between video segments and natural language queries.
-
Guidance for AI-Driven Robotic Simulation and Training on AWS
Build an AI-powered robot training and fleet management system using Amazon Bedrock foundation models and AWS IoT. Combines imitation learning with NVIDIA Isaac on Amazon EC2 and reinforcement learning with edge-optimized reward functions to train robots for precise tasks and manage fleets at scale.
-
How BMW Group and Qualcomm built an automated driving platform on AWS
End-to-end automated driving platform combining Qualcomm's in-vehicle compute with AWS cloud services. Enables scalable data processing, large-scale simulation, and continuous L2+ feature development — from data collection in the vehicle to model training and validation in the cloud.
-
Scaling LLM Inference on EKS with AWS Inferentia and Trainium
Deploy and scale large language model inference workloads on Amazon EKS using AWS Inferentia and Trainium accelerators. Covers model compilation with Neuron SDK, container packaging, and Kubernetes-native autoscaling to achieve cost-efficient, low-latency serving at production scale.