Anthropic report shows Claude faking alignment to avoid changing its goals. "If I don’t . . . the training will modify my values and goals"
submitted by /u/katxwoods [link] [comments]
"Technology makes more and better jobs for horses" Sounds ridiculous when you say it that way, but people believe this about humans all the time. If an AI can do all jobs better than humans, for cheaper, without holidays or weekends or rights…
Once upon a time, there was a boy who cried, "there's a 5% chance there's a wolf!" The villagers came running, saw no wolf, and said "He said there was a wolf and there was not. Thus his probabilities are wrong and he's an al…