Jay van Zyl @ ecosystem.Ai

Infinite context windows? Streaming LLMs can be extended to infinite sequence lengths without any fine-tuning.

LLMs like GPT-3 struggle in streaming applications like chatbots because their performance tanks on long texts that exceed their training length. I checked out a new paper investigating why windowed attention fails here. By visualizing the attention maps, th…
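
To make the windowed-attention idea concrete, here is a rough Python sketch of two KV-cache eviction policies, assuming the paper in question is the StreamingLLM / attention-sinks work: a plain sliding window that drops the earliest tokens once the cache is full, versus a variant that always keeps a few initial "sink" tokens and only evicts from the middle. The names (evict_window, evict_with_sinks, cache_size, n_sinks) are illustrative, not taken from the paper's code.

```python
# Hypothetical sketch of two streaming KV-cache eviction policies.
# Entries in `cache` stand in for per-token key/value pairs.

def evict_window(cache: list, cache_size: int) -> list:
    """Plain sliding window: keep only the most recent cache_size entries."""
    return cache[-cache_size:]

def evict_with_sinks(cache: list, cache_size: int, n_sinks: int = 4) -> list:
    """Keep the first n_sinks entries plus the most recent tokens."""
    if len(cache) <= cache_size:
        return cache
    return cache[:n_sinks] + cache[-(cache_size - n_sinks):]

if __name__ == "__main__":
    tokens = list(range(12))             # stand-ins for cached tokens 0..11
    print(evict_window(tokens, 8))       # [4, 5, ..., 11]: initial tokens evicted
    print(evict_with_sinks(tokens, 8))   # [0, 1, 2, 3, 8, 9, 10, 11]: sinks retained
```

The contrast is the point: the plain window silently throws away the very first tokens, while the sink-aware policy pins them in the cache and only trims recent history.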