全自动挂机赚钱 posted on 2026-4-24 14:05:19

GPT 5.5 has been released, and it is far ahead in AI agent tool-call coordination!

Rank  Agent          Model            Date        Agent Org   Model Org  Accuracy
1     Codex          GPT-5.5          2026-04-23  OpenAI      OpenAI     82.0% ± 2.2
2     ForgeCode      GPT-5.4          2026-03-12  ForgeCode   OpenAI     81.8% ± 2.0
3     TongAgents     Gemini 3.1 Pro   2026-03-13  BIGAI       Google     80.2% ± 2.6
4     ForgeCode      Claude Opus 4.6  2026-03-12  ForgeCode   Anthropic  79.8% ± 1.6
5     SageAgent      GPT-5.3-Codex    2026-03-13  OpenSage    OpenAI     78.4% ± 2.2
6     ForgeCode      Gemini 3.1 Pro   2026-03-02  ForgeCode   Google     78.4% ± 1.8
7     Droid          GPT-5.3-Codex    2026-02-24  Factory     OpenAI     77.3% ± 2.2
8     Capy           Claude Opus 4.6  2026-03-12  Capy        Anthropic  75.3% ± 2.4
9     Simple Codex   GPT-5.3-Codex    2026-02-06  OpenAI      OpenAI     75.1% ± 2.4
10    Terminus-KIRA  Gemini 3.1 Pro   2026-02-23  KRAFTON AI  Google     74.8% ± 2.6

======================================================

Among Chinese open-source models, Kimi ranks highest, at rank 62:

Rank  Agent       Model      Date        Agent Org   Model Org  Accuracy
62    Terminus 2  Kimi K2.5  2026-02-04  AfterQuery  Kimi       43.2% ± 2.9
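
For anyone who wants to compare the ± values in the leaderboard, here is a rough Python sketch. It assumes each ± figure is a symmetric margin around the accuracy. The post does not say what the ± actually represents, so that reading is an assumption. The helper names (`interval`, `overlaps`, `overlapping_with_top`) are made up for illustration. The rows are copied from the numbers above.

```python
# Rows copied from the leaderboard in the post:
# (rank, agent, model, accuracy in %, margin in percentage points).
ROWS = [
    (1, "Codex", "GPT-5.5", 82.0, 2.2),
    (2, "ForgeCode", "GPT-5.4", 81.8, 2.0),
    (3, "TongAgents", "Gemini 3.1 Pro", 80.2, 2.6),
    (4, "ForgeCode", "Claude Opus 4.6", 79.8, 1.6),
    (5, "SageAgent", "GPT-5.3-Codex", 78.4, 2.2),
    (6, "ForgeCode", "Gemini 3.1 Pro", 78.4, 1.8),
    (7, "Droid", "GPT-5.3-Codex", 77.3, 2.2),
    (8, "Capy", "Claude Opus 4.6", 75.3, 2.4),
    (9, "Simple Codex", "GPT-5.3-Codex", 75.1, 2.4),
    (10, "Terminus-KIRA", "Gemini 3.1 Pro", 74.8, 2.6),
    (62, "Terminus 2", "Kimi K2.5", 43.2, 2.9),
]


def interval(acc, margin):
    """Return (low, high), treating the margin as symmetric (an assumption)."""
    return (acc - margin, acc + margin)


def overlaps(a, b):
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlapping_with_top(rows):
    """Ranks whose interval overlaps the interval of the first row."""
    top = rows[0]
    top_iv = interval(top[3], top[4])
    return [r[0] for r in rows[1:] if overlaps(top_iv, interval(r[3], r[4]))]


if __name__ == "__main__":
    print("Ranks overlapping rank 1:", overlapping_with_top(ROWS))
```

Running it lists which entries have a ± range that touches the rank 1 range under that assumption. It says nothing about how the benchmark itself computes or interprets the margins.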