Artificial Intelligence
Accelerating decode-heavy LLM inference with speculative decoding on AWS Trainium and vLLM
Amazon Web Services
Google Inc.
April 15, 2026