Fireworks AI Introduces FireAttention: A Custom CUDA Kernel Optimized for Multi-Query Attention Models
MarkTechPost, January 21, 2024