Search results for "EVAL"
07:22

Next version of the Yuntian Tianshu model will be benchmarked against GPT-4, with further gains in multimodal capabilities

Yuntian Lifei recently said during an institutional investor survey that its self-developed 100-billion-scale large model, Yuntian Tianshu, has completed two version updates. Its overall capabilities have improved further, reaching an industry-advanced level in general question answering, language understanding, mathematical reasoning, text generation, and role-playing. In early September this year, the model ranked first on the C-Eval leaderboard for Chinese large models. The next version of Yuntian Tianshu will be benchmarked against GPT-4 and will further improve multimodal capabilities.
03:31

Models are “new every day”: SenseTime’s “SenseChat 2.0” outperforms ChatGPT overall on multiple evaluation benchmarks

SenseTime recently announced the results of its self-developed Chinese language model, "SenseChat 2.0", on three authoritative large language model evaluation benchmarks: MMLU, AGIEval, and C-Eval. According to the results, SenseChat 2.0 outperformed ChatGPT on all three test sets, an important breakthrough for large language model research in China.