[D] Could AI alignment benefit from “transformational” training instead of mostly transactional reward training?
I’ve been thinking about a possible bridge between AI alignment, reward hacking, and transformational leadership. A lot of AI training seems behaviorally transactional at a simplified level. That makes sense, and I’m not arguing against it. But recen…