#OSWorld

2024-10-22

🚀 #Anthropic announces major updates to their #AI model lineup:

💻 Upgraded #Claude35Sonnet shows significant improvements:
• Achieves 49% on #SWEbench Verified coding benchmark
• Leads in software engineering capabilities
• Maintains same price and speed as predecessor
• Tested by US and UK #AI Safety Institutes

🔄 New #Claude35Haiku introduction:
• Matches #Claude3Opus performance at lower cost
• Scores 40.6% on SWEbench Verified
• Optimized for user-facing products
• Available across multiple cloud platforms

🖱️ Pioneering #ComputerUse beta feature:
• Allows AI to navigate interfaces like humans
• Scores 22% on #OSWorld benchmark
• Currently in experimental phase
• Supported by new safety classifiers

⚡ Enterprise adoption:
#GitLab reports 10% improvement in DevSecOps tasks
#Replit leverages computer use for app evaluation
#Cognition notes enhanced problem-solving capabilities

anthropic.com/news/3-5-models-

Client Info

Server: https://mastodon.social
Version: 2025.07
Repository: https://github.com/cyevgeniy/lmst