machine learning machine learning deployment Faster LLMs with speculative decoding and AWS Inferentia2 – AWS Blog Google Inc. August 5, 2024 August 5, 2024 Faster LLMs with speculative decoding and AWS Inferentia2 AWS Blog