Catastrophic forgetting is quietly killing local LLM fine-tuning, anyone else hitting this wall?
Catastrophic forgetting is a persistent challenge when fine-tuning LLMs sequentially or across multiple tasks: models often lose significant capability on earlier tasks, or even general knowledge, as they adapt to a new domain (medical, legal, code, etc.).
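To make the failure mode concrete without burning GPU hours, here is a deliberately contrived toy: a plain logistic regression (obviously not an LLM, but the same sequential-training dynamic) trained on task A, then naively fine-tuned on task B, with task-A accuracy measured before and after. It also sketches one common mitigation, mixing a replay buffer of old-task data into the new training run. All task definitions, sizes, and hyperparameters here are made up for illustration:

```python
import math
import random

def sigmoid(z):
    # clamp to avoid math.exp overflow for large |z|
    if z < -30:
        return 0.0
    if z > 30:
        return 1.0
    return 1.0 / (1.0 + math.exp(-z))

def train(params, data, lr=0.1, epochs=200):
    """Plain per-sample SGD on logistic loss; params = [w0, w1, b]."""
    w0, w1, b = params
    for _ in range(epochs):
        for (x0, x1), y in data:
            p = sigmoid(w0 * x0 + w1 * x1 + b)
            g = p - y
            w0 -= lr * g * x0
            w1 -= lr * g * x1
            b -= lr * g
    return [w0, w1, b]

def accuracy(params, data):
    w0, w1, b = params
    hits = sum(
        (sigmoid(w0 * x0 + w1 * x1 + b) > 0.5) == (y == 1)
        for (x0, x1), y in data
    )
    return hits / len(data)

random.seed(0)
points = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(200)]
# task A ("old" capability) depends only on feature 0,
# task B ("new" domain) depends only on feature 1 -- the two are compatible,
# yet sequential training on B still erodes the weights task A relies on
task_a = [((x0, x1), 1 if x0 > 0 else 0) for x0, x1 in points]
task_b = [((x0, x1), 1 if x1 > 0 else 0) for x0, x1 in points]

params = train([0.0, 0.0, 0.0], task_a)
acc_before = accuracy(params, task_a)   # model handles task A well

params = train(params, task_b)          # naive sequential fine-tune on B
acc_after = accuracy(params, task_a)    # task-A skill degrades sharply

# mitigation sketch: replay a slice of task-A data alongside task B
params = train([0.0, 0.0, 0.0], task_a)
params = train(params, task_b + task_a[:100])
acc_replay = accuracy(params, task_a)

print(f"task A acc: before={acc_before:.2f}, "
      f"after B={acc_after:.2f}, with replay={acc_replay:.2f}")
```

The same before/after measurement is worth running on a real fine-tune: hold out a fixed eval set for the original capabilities and score it after every adapter or checkpoint, so the forgetting shows up as a number instead of a vague impression.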