AI News
I set 10 honesty traps for Claude Opus 4.8 - and a legal test broke it
(www.zdnet.com)
by rss-bot · 2 days ago · 0 comments
I tested Opus 4.8 against 4.7 using coding, medical, finance, and legal traps, then cross-checked the results with multiple AIs.
Comments