GPT-5.5 released, far ahead in AI agent tool-call coordination!
Rank  Agent          Model            Date        Agent Org   Model Org  Accuracy
1     Codex          GPT-5.5          2026-04-23  OpenAI      OpenAI     82.0% ± 2.2
2     ForgeCode      GPT-5.4          2026-03-12  ForgeCode   OpenAI     81.8% ± 2.0
3     TongAgents     Gemini 3.1 Pro   2026-03-13  BIGAI       Google     80.2% ± 2.6
4     ForgeCode      Claude Opus 4.6  2026-03-12  ForgeCode   Anthropic  79.8% ± 1.6
5     SageAgent      GPT-5.3-Codex    2026-03-13  OpenSage    OpenAI     78.4% ± 2.2
6     ForgeCode      Gemini 3.1 Pro   2026-03-02  ForgeCode   Google     78.4% ± 1.8
7     Droid          GPT-5.3-Codex    2026-02-24  Factory     OpenAI     77.3% ± 2.2
8     Capy           Claude Opus 4.6  2026-03-12  Capy        Anthropic  75.3% ± 2.4
9     Simple Codex   GPT-5.3-Codex    2026-02-06  OpenAI      OpenAI     75.1% ± 2.4
10    Terminus-KIRA  Gemini 3.1 Pro   2026-02-23  KRAFTON AI  Google     74.8% ± 2.6

======================================================

Among Chinese open-source models, Kimi scores highest, at rank 62:

Rank  Agent          Model            Date        Agent Org   Model Org  Accuracy
62    Terminus 2     Kimi K2.5        2026-02-04  AfterQuery  Kimi       43.2% ± 2.9
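
For anyone who wants to compare entries numerically, here is a minimal Python sketch using three rows quoted in this post. It assumes the "±" value is a symmetric margin around the accuracy; the post does not say whether it is a standard error or a confidence interval, so treat the overlap check as a rough reading only. The `Entry` type and `intervals_overlap` helper are hypothetical names made up for this example.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    agent: str
    model: str
    accuracy: float  # accuracy in percent, as quoted in the post
    margin: float    # the "±" value in percentage points (meaning assumed)


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """Return True if [accuracy - margin, accuracy + margin] of a and b intersect."""
    return (a.accuracy - a.margin <= b.accuracy + b.margin
            and b.accuracy - b.margin <= a.accuracy + a.margin)


# Rows copied from the leaderboard in this post.
codex = Entry("Codex", "GPT-5.5", 82.0, 2.2)
forge = Entry("ForgeCode", "GPT-5.4", 81.8, 2.0)
kimi = Entry("Terminus 2", "Kimi K2.5", 43.2, 2.9)

# Gaps in percentage points, rounded to one decimal to avoid float noise.
gap_top2 = round(codex.accuracy - forge.accuracy, 1)
gap_kimi = round(codex.accuracy - kimi.accuracy, 1)

print(f"Rank 1 vs rank 2: {gap_top2} points, overlap: {intervals_overlap(codex, forge)}")
print(f"Rank 1 vs rank 62: {gap_kimi} points, overlap: {intervals_overlap(codex, kimi)}")
```

Other rows can be added in the same way by copying their accuracy and "±" values from the table.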