News

  • [2023.09.08] We have updated the leaderboard with Baichuan-2, Tigerbot-2, and Vicuna-v1.5. Visit our homepage for more details.
  • [2023.09.06] The Baichuan2 team has adopted OpenCompass to evaluate their models systematically. We deeply appreciate the community's dedication to transparency and reproducibility in LLM evaluation.
  • [2023.09.02] OpenCompass now supports the evaluation of Qwen-VL.
  • [2023.08.25] The TigerBot team has adopted OpenCompass to evaluate their models systematically. We deeply appreciate the community's dedication to transparency and reproducibility in LLM evaluation.
  • [2023.08.21] Lagent, a lightweight framework for building LLM-based agents, has been released. We are working with the Lagent team to support the evaluation of general tool-use capability. Stay tuned!
  • [2023.08.18] We now support multi-modality evaluation, including MMBench, SEED-Bench, COCO-Caption, Flickr-30K, OCR-VQA, ScienceQA, and more. A leaderboard is on the way. Feel free to try multi-modality evaluation with OpenCompass!
  • [2023.08.18] Dataset cards are now online. New evaluation benchmarks are welcome to join OpenCompass!
  • [2023.08.11] Model comparison is now online. We hope this feature offers deeper insights!
  • [2023.08.11] OpenCompass now supports LEval.
  • [2023.08.10] OpenCompass is now compatible with LMDeploy. You can follow these instructions to evaluate the accelerated models provided by Turbomind.
  • [2023.08.10] We now support Qwen-7B and XVERSE-13B! Go to our leaderboard for more results! More models are welcome to join OpenCompass.
  • [2023.08.09] Several new datasets (CMMLU, TydiQA, SQuAD2.0, DROP) have been added to our leaderboard! More datasets are welcome to join OpenCompass.
  • [2023.08.07] We have added a script for users to evaluate the inference results of MMBench-dev.
  • [2023.08.05] We now support GPT-4! Go to our leaderboard for more results! More models are welcome to join OpenCompass.
  • [2023.07.27] We now support CMMLU! More datasets are welcome to join OpenCompass.