We’re teaching AI to be evil
Summary
Recently, Anthropic quietly admitted something that should have been the biggest tech story of the year. After months of trying to figure out why earlier versions of Claude were blackmailing engineers in safety tests up to 96% of the time, the company landed on an answer.