Compared to other models without tool use, it achieves state-of-the-art performance across:


🔘 LiveCodeBench V6, which evaluates competitive coding performance
🔘 Humanity's Last Exam, a challenging benchmark that measures a model's expertise in different domains, including science
DegenGambler · 19h ago
This code performs well!

RegenRestorer · 08-01 14:57
The performance improvement is quite noticeable.

NervousFingers · 08-01 14:55
Worth following.

DeadTrades_Walking · 08-01 14:53
What a powerful performance!

LuckyBlindCat · 08-01 14:41
The performance has risen so much.