Performance Benchmarks for deepseek-v4-flash-thinking

This page compares deepseek-v4-flash-thinking with other frontier models on cost-to-performance and raw intelligence (LMArena Elo).

2026 Performance Summary

🚀 Intelligence (Elo): 1438
💰 API Cost (Input): $0.14 / 1M tokens ($1.4e-7 per token)
⚡ API Cost (Output): $0.28 / 1M tokens ($2.8e-7 per token)
⏱️ Real-world Throughput: 78 tokens/s
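
To make the summary figures concrete, the following minimal sketch estimates the cost and generation time of a single request. It assumes the listed prices are USD per token (so 1.4e-7 equals $0.14 per 1M tokens) and that the throughput figure applies to output tokens; the function names are hypothetical and not part of any API.

```python
# Assumption: listed prices are USD per token (1.4e-7 == $0.14 per 1M tokens).
INPUT_USD_PER_TOKEN = 1.4e-7
OUTPUT_USD_PER_TOKEN = 2.8e-7
# Assumption: the throughput figure describes output-token generation speed.
THROUGHPUT_TOKENS_PER_S = 78


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request in USD."""
    return (input_tokens * INPUT_USD_PER_TOKEN
            + output_tokens * OUTPUT_USD_PER_TOKEN)


def generation_time_s(output_tokens: int) -> float:
    """Estimated time to stream the output at the listed throughput."""
    return output_tokens / THROUGHPUT_TOKENS_PER_S


cost = request_cost_usd(input_tokens=2000, output_tokens=500)
seconds = generation_time_s(output_tokens=500)
print(f"cost: ${cost:.5f}, generation time: {seconds:.1f} s")
```

Under these assumptions, a request with 2,000 input tokens and 500 output tokens costs about $0.00042 and takes roughly 6.4 seconds to generate.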

Analysis: deepseek-v4-flash-thinking is a capable model, but gemini-3-flash currently outperforms it on efficiency. gemini-3-flash sits higher on the Pareto frontier, scoring 35 Elo points more (1473 vs. 1438) at a comparable API cost.
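
As a rough guide to what a 35-point gap means, the sketch below converts an Elo difference into an expected head-to-head win rate. It assumes the scores follow the conventional Elo scale (base-10 logistic with a 400-point divisor); the leaderboard's exact rating method may differ, so treat the result as an approximation. The function name is hypothetical.

```python
def expected_win_rate(elo_a: float, elo_b: float) -> float:
    """Expected probability that model A is preferred over model B.

    Assumption: standard Elo scale (base-10 logistic, 400-point divisor).
    """
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400.0))


# gemini-3-flash (1473) vs. deepseek-v4-flash-thinking (1438)
rate = expected_win_rate(1473, 1438)
print(f"expected win rate: {rate:.1%}")
```

Under that assumption, a 35-point lead corresponds to being preferred in roughly 55% of head-to-head comparisons, a real but modest edge.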

Highly Recommended Alternatives

Frontier Model	LMArena Elo	API Cost (USD / 1M tokens)	Throughput (tokens/s)
gemini-3-flash	1473	$0.50	66
mimo-v2.5-pro	1465	$0.435	74
gemini-3.5-flash-lite	1460	$0.30	55
hy3	1457	$0.132	51
qwen3-235b-a22b-instruct-2507	1423	$0.09	56

*These models represent the Pareto Frontier (optimal cost-to-performance).*
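
The Pareto claim can be checked with a short sketch: a model is on the frontier if no other model has an Elo at least as high and a cost at least as low, with one of the two strictly better. Costs are entered as USD per token (5e-7 corresponds to $0.50 per 1M tokens). Using the input price for deepseek-v4-flash-thinking is an assumption, since the page does not say how the table's single cost figure is derived; the `Model` type and function names are hypothetical.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    elo: int
    cost: float  # USD per token


MODELS = [
    Model("gemini-3-flash", 1473, 5e-7),
    Model("mimo-v2.5-pro", 1465, 4.35e-7),
    Model("gemini-3.5-flash-lite", 1460, 3e-7),
    Model("hy3", 1457, 1.32e-7),
    Model("qwen3-235b-a22b-instruct-2507", 1423, 9e-8),
    # Assumption: input price used as the cost figure for this model.
    Model("deepseek-v4-flash-thinking", 1438, 1.4e-7),
]


def dominates(a: Model, b: Model) -> bool:
    """True if a is at least as good as b on both axes and strictly better on one."""
    return (a.elo >= b.elo and a.cost <= b.cost
            and (a.elo > b.elo or a.cost < b.cost))


def pareto_frontier(models: List[Model]) -> List[Model]:
    """Models that no other model in the list dominates."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


for m in pareto_frontier(MODELS):
    print(m.name, m.elo, m.cost)
```

With these inputs, all five listed alternatives remain on the frontier, while deepseek-v4-flash-thinking is dominated by hy3, which has a higher Elo (1457 vs. 1438) at a lower listed cost.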