Claude Opus 4.8 Costs 44 Times More Than DeepSeek V4 Pro in New AI Benchmark
2026-06-16 18:23

Data compiled by Woofun AI shows that Artificial Analysis has overhauled its AI Intelligence Index to prioritize autonomous planning and complex task resolution over simple instruction following. The revised methodology introduces high-difficulty scenarios, such as simulated bank customer service interactions, and its primary metric is now the cost and time required to complete tasks successfully.

In the latest rankings, Claude Opus 4.8 secured the top position among available models with a score of 56, narrowly edging out GPT-5.5 at 55 points.

However, a stark cost divergence emerged: executing identical tasks with Claude Opus 4.8 cost $1.78, whereas DeepSeek V4 Pro completed the same work for just $0.04. That makes Claude roughly 44 times more expensive. Completion times also varied significantly, with xAI Grok 4.3 finishing in 1.5 minutes compared with 13.5 minutes for Claude Sonnet 4.6. The updated GDPval-AA test now accounts for 20% of the total evaluation; the update also raises the human benchmark to 1000 and extends conversation limits to 250 rounds.
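The ratios behind these comparisons can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in this article; the variable and function names are illustrative and do not come from Artificial Analysis.

```python
# Figures quoted in the article; names here are illustrative assumptions.
claude_opus_cost_usd = 1.78   # Claude Opus 4.8, cost for the task set
deepseek_cost_usd = 0.04      # DeepSeek V4 Pro, cost for the same tasks
sonnet_minutes = 13.5         # Claude Sonnet 4.6 completion time
grok_minutes = 1.5            # xAI Grok 4.3 completion time


def fold_difference(larger: float, smaller: float) -> float:
    """Return how many times larger the first value is than the second."""
    if smaller <= 0:
        raise ValueError("smaller must be positive")
    return larger / smaller


cost_ratio = fold_difference(claude_opus_cost_usd, deepseek_cost_usd)
time_ratio = fold_difference(sonnet_minutes, grok_minutes)

print(f"Cost ratio: {cost_ratio:.1f}x")
print(f"Time ratio: {time_ratio:.1f}x")
```

On these numbers the cost ratio works out to about 44.5, which the headline rounds down to 44, and the gap in completion time between the two models cited is ninefold.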

Disclaimer: Views are the author's own and do not represent the platform. Do not reproduce without permission. Content is for reference only, not investment advice. Trade at your own risk.
Tags:
Artificial Analysis
AI Intelligence Index
Claude Fable 5
Claude Opus 4.8
GPT-5.5
DeepSeek V4 Pro
MiniMax M3
Kimi K2.6
xAI Grok 4.3
Claude Sonnet 4.6
GDPval-AA