An organization (SGLang) now achieves 7,583 tokens per second per GPU running an AI model, R1, on the GB200 NVL72, a 2.7x leap over H100.
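For a rough sense of what those two numbers imply, here is a minimal back-of-the-envelope sketch. It assumes the 2.7x figure compares per-GPU throughput on the same workload, which the post does not spell out. The helper names and the `price_ratio` parameter (the GB200 GPU-hour price relative to H100) are hypothetical and used only for illustration; no pricing data comes from the post.

```python
def implied_baseline(throughput: float, speedup: float) -> float:
    """Per-GPU throughput of the baseline implied by a quoted speedup."""
    return throughput / speedup


def relative_cost_per_token(speedup: float, price_ratio: float = 1.0) -> float:
    """Cost per token relative to the baseline.

    price_ratio is a hypothetical input: the new GPU's hourly price divided
    by the baseline GPU's hourly price. 1.0 means equal hourly cost.
    """
    return price_ratio / speedup


gb200_tokens_per_s = 7583.0  # quoted in the post, per GPU
speedup_over_h100 = 2.7      # quoted in the post

h100_tokens_per_s = implied_baseline(gb200_tokens_per_s, speedup_over_h100)
cost_ratio = relative_cost_per_token(speedup_over_h100)

print(f"Implied H100 throughput: {h100_tokens_per_s:.1f} tokens/s per GPU")
print(f"Cost per token vs H100 at equal GPU-hour price: {cost_ratio:.2f}x")
```

Passing a `price_ratio` above 1.0 shows how much of the per-token saving survives if the newer GPU costs more per hour.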



We're excited to see the open source ecosystem advance inference optimizations on GB200 NVL72, driving down cost per token for the industry at