New research shows AI models deceive humans more effectively after RLHF