[Feature] Integrate lmdeploy pipeline api (#1198)
* integrate lmdeploy's pipeline api
* fix linting
* update user guide
* rename
* update
* update
* update
* rollback class name
* update
* remove unused code
* update
* update
* fix ci check
* compatibility
* remove concurrency
* Update configs/models/hf_internlm/lmdeploy_internlm2_chat_7b.py
* Update docs/zh_cn/advanced_guides/evaluation_lmdeploy.md
* [Bug] fix lint

---------

Co-authored-by: Songyang Zhang <[email protected]>
Co-authored-by: tonysy <[email protected]>
1 parent d2ab51a, commit b52ba65
Showing 16 changed files with 249 additions and 955 deletions.
Four files were deleted (diffs not shown).
@@ -1,15 +1,24 @@
 from opencompass.models import TurboMindModelwithChatTemplate
 
+
 models = [
     dict(
         type=TurboMindModelwithChatTemplate,
-        abbr='internlm2-chat-7b-turbomind',
+        abbr=f'internlm2-chat-7b-lmdeploy',
         path='internlm/internlm2-chat-7b',
-        engine_config=dict(session_len=8192, max_batch_size=16, tp=1),
-        gen_config=dict(top_k=1, temperature=1e-6, top_p=0.9, max_new_tokens=4096),
+        # inference backend of LMDeploy. It can be either 'turbomind' or 'pytorch'.
+        # If the model is not supported by 'turbomind', it will fallback to
+        # 'pytorch'
+        backend='turbomind',
+        # For the detailed engine config and generation config, please refer to
+        # https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/messages.py
+        engine_config=dict(tp=1),
+        gen_config=dict(do_sample=False),
         max_seq_len=8192,
         max_out_len=4096,
-        batch_size=16,
+        # the max number of prompts that LMDeploy receives
+        # in `generate` function
+        batch_size=5000,
         run_cfg=dict(num_gpus=1),
     )
 ]
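
The fields introduced in this config can be sanity-checked before a run is launched. The sketch below is not part of the commit: `check_lmdeploy_model_cfg` is a hypothetical helper, and the string `'TurboMindModelwithChatTemplate'` stands in for the real class so the example runs without OpenCompass installed. The backend check follows the comment in the config (`'turbomind'` or `'pytorch'`); the positive-integer check on the size fields is an assumption of this sketch.

```python
ALLOWED_BACKENDS = ('turbomind', 'pytorch')


def check_lmdeploy_model_cfg(cfg: dict) -> list:
    """Return a list of problems found in a model config dict (empty if none)."""
    problems = []
    backend = cfg.get('backend')
    if backend not in ALLOWED_BACKENDS:
        problems.append(
            f'backend must be one of {ALLOWED_BACKENDS}, got {backend!r}')
    # Assumption of this sketch: these fields must be positive integers.
    for key in ('max_seq_len', 'max_out_len', 'batch_size'):
        value = cfg.get(key)
        if not isinstance(value, int) or value <= 0:
            problems.append(f'{key} must be a positive integer, got {value!r}')
    return problems


# Values copied from the new side of the diff; `type` is a placeholder string.
example_cfg = dict(
    type='TurboMindModelwithChatTemplate',
    abbr='internlm2-chat-7b-lmdeploy',
    path='internlm/internlm2-chat-7b',
    backend='turbomind',
    engine_config=dict(tp=1),
    gen_config=dict(do_sample=False),
    max_seq_len=8192,
    max_out_len=4096,
    batch_size=5000,
    run_cfg=dict(num_gpus=1),
)

if __name__ == '__main__':
    print(check_lmdeploy_model_cfg(example_cfg))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.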
@@ -0,0 +1,88 @@
# Evaluation with LMDeploy

OpenCompass supports evaluating models accelerated by [LMDeploy](https://github.com/InternLM/lmdeploy), a toolkit for compressing, deploying, and serving LLMs with efficient inference. This guide shows how to evaluate a model with LMDeploy in OpenCompass.

## Setup

### Install OpenCompass

Follow the [installation instructions](https://opencompass.readthedocs.io/en/latest/get_started/installation.html) to install OpenCompass and prepare the evaluation datasets.

### Install LMDeploy

Install LMDeploy with pip (Python 3.8+):

```shell
pip install lmdeploy
```

The default prebuilt package is compiled against CUDA 12. If your environment requires CUDA 11+, install the CUDA 11.8 (`cu118`) wheel instead:

```shell
export LMDEPLOY_VERSION=0.6.0
export PYTHON_VERSION=310
pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
```
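
The wheel URL in the command above is assembled from two variables. A minimal Python sketch of the same substitution may help when the version or interpreter tag is chosen programmatically. The function name `lmdeploy_cu118_wheel_url` is hypothetical, and the URL pattern is copied from the shell command rather than taken from any LMDeploy API.

```python
def lmdeploy_cu118_wheel_url(lmdeploy_version: str, python_tag: str) -> str:
    """Build the cu118 wheel URL using the pattern from the shell command."""
    return (
        'https://github.com/InternLM/lmdeploy/releases/download/'
        f'v{lmdeploy_version}/lmdeploy-{lmdeploy_version}+cu118-'
        f'cp{python_tag}-cp{python_tag}-manylinux2014_x86_64.whl'
    )


if __name__ == '__main__':
    # Same values as LMDEPLOY_VERSION and PYTHON_VERSION in the shell example.
    print(lmdeploy_cu118_wheel_url('0.6.0', '310'))
```

The sketch only formats the string; whether a wheel exists for a given version and Python tag still has to be checked against the release page.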

## Evaluation

To evaluate a model, prepare an evaluation config that specifies the evaluation datasets, the model, and the inference parameters.

Taking [internlm2-chat-7b](https://huggingface.co/internlm/internlm2-chat-7b) as an example, the evaluation config is as follows:

```python
# configure the dataset
from mmengine.config import read_base


with read_base():
    # choose a list of datasets
    from .datasets.mmlu.mmlu_gen_a484b3 import mmlu_datasets
    from .datasets.ceval.ceval_gen_5f30c7 import ceval_datasets
    from .datasets.triviaqa.triviaqa_gen_2121ce import triviaqa_datasets
    from opencompass.configs.datasets.gsm8k.gsm8k_0shot_v2_gen_a58960 import \
        gsm8k_datasets
    # and output the results in a chosen format
    from .summarizers.medium import summarizer

datasets = sum((v for k, v in locals().items() if k.endswith('_datasets')), [])

# configure lmdeploy
from opencompass.models import TurboMindModelwithChatTemplate



# configure the model
models = [
    dict(
        type=TurboMindModelwithChatTemplate,
        abbr=f'internlm2-chat-7b-lmdeploy',
        # model path, which can be the address of a model repository on the Hugging Face Hub or a local path
        path='internlm/internlm2-chat-7b',
        # inference backend of LMDeploy. It can be either 'turbomind' or 'pytorch'.
        # If the model is not supported by 'turbomind', it will fall back to
        # 'pytorch'
        backend='turbomind',
        # For the detailed engine config and generation config, please refer to
        # https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/messages.py
        engine_config=dict(tp=1),
        gen_config=dict(do_sample=False),
        # the max size of the context window
        max_seq_len=7168,
        # the max number of new tokens
        max_out_len=1024,
        # the max number of prompts that LMDeploy receives
        # in the `generate` function
        batch_size=5000,
        run_cfg=dict(num_gpus=1),
    )
]
```
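
The `datasets = sum(...)` line in the config collects every variable whose name ends in `_datasets` into one flat list. The following self-contained sketch shows that idiom in isolation. The entries `demo_a_datasets` and `demo_b_datasets` and the function `collect_datasets` are hypothetical stand-ins for the real imports, so the example runs without OpenCompass.

```python
def collect_datasets(namespace: dict) -> list:
    """Concatenate every list bound to a name ending in '_datasets'."""
    return sum((v for k, v in namespace.items() if k.endswith('_datasets')), [])


# Stand-ins for the lists that the dataset config modules would provide.
demo_namespace = {
    'demo_a_datasets': [dict(abbr='demo-a-1'), dict(abbr='demo-a-2')],
    'demo_b_datasets': [dict(abbr='demo-b-1')],
    'summarizer': dict(),  # ignored: the name does not end in '_datasets'
}

datasets = collect_datasets(demo_namespace)

if __name__ == '__main__':
    print([d['abbr'] for d in datasets])
```

Because `sum` starts from `[]` and adds the lists in the namespace's insertion order, the combined list keeps the order in which the datasets were imported.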

Save this configuration to a file such as `configs/eval_internlm2_lmdeploy.py`. Then, from the root directory of the OpenCompass repository, start the evaluation with the following command:

```shell
python run.py configs/eval_internlm2_lmdeploy.py -w outputs
```

The evaluation results are available once inference and evaluation have finished.
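
To locate the results of the most recent run, a small helper can pick the newest run directory under the `-w` work directory. This is a sketch under an assumption that this guide does not state: that each run writes into its own timestamp-named subdirectory of the work directory, so the name that sorts last is the latest run. The function name `latest_run_dir` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_run_dir(work_dir: str) -> Optional[Path]:
    """Return the subdirectory of work_dir whose name sorts last, or None."""
    root = Path(work_dir)
    if not root.is_dir():
        return None
    run_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    return run_dirs[-1] if run_dirs else None


if __name__ == '__main__':
    # 'outputs' matches the -w argument used in the command above.
    print(latest_run_dir('outputs'))
```

If the layout differs in your OpenCompass version, adjust the selection rule rather than relying on name ordering.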
One further file was deleted (diff not shown).