Evaluation for Osprey 🔎

This document explains how to evaluate Osprey on four representative tasks: open-vocabulary segmentation, referring object classification, detailed region description, and region-level captioning. It also covers the Ferret-Bench and POPE benchmarks.

We have developed two types of models: Osprey and Osprey-Chat (denoted Osprey* in our paper). Osprey-Chat is trained with additional LLaVA data (llava_v1_5_mix665k.json) and exhibits better conversation and image-level understanding and reasoning capabilities.

1. Open-Vocabulary Segmentation

  • Download the SentenceBERT model, which is used to calculate semantic similarity.
  • The evaluation is based on detectron2; please install the following dependencies:
git clone https://github.com/facebookresearch/detectron2.git
python -m pip install -e detectron2
pip install git+https://github.com/cocodataset/panopticapi.git
pip install git+https://github.com/mcordts/cityscapesScripts.git

Cityscapes

cd osprey/eval
python eval_open_vocab_seg_detectron2.py --dataset cityscapes --model path/to/osprey-7b --bert path/to/all-MiniLM-L6-v2

ADE20K

cd osprey/eval
python eval_open_vocab_seg_detectron2.py --dataset ade --model path/to/osprey-7b --bert path/to/all-MiniLM-L6-v2
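
The role of SentenceBERT in this task can be illustrated with a minimal sketch. It assumes that the model's free-form answer and every category name are embedded, and that the category with the highest cosine similarity to the answer is taken as the prediction. The helper names `cosine_similarity` and `best_category` and the toy vectors are hypothetical and stand in for real SentenceBERT embeddings; this is not code from eval_open_vocab_seg_detectron2.py.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def best_category(answer_embedding, category_embeddings):
    """Return the category name whose embedding is most similar to the answer."""
    return max(
        category_embeddings,
        key=lambda name: cosine_similarity(answer_embedding, category_embeddings[name]),
    )


# Toy 3-d vectors standing in for SentenceBERT embeddings (assumption).
categories = {
    "road": [1.0, 0.0, 0.0],
    "car": [0.0, 1.0, 0.0],
    "person": [0.0, 0.0, 1.0],
}
answer = [0.1, 0.9, 0.2]  # hypothetical embedding of the model's answer
predicted = best_category(answer, categories)
print(predicted)
```

In a real run the vectors would come from the all-MiniLM-L6-v2 model passed via --bert; the matching logic itself may differ in the actual script.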

2. Referring Object Classification

LVIS

cd osprey/eval
python lvis_paco_eval.py --model path/to/osprey-7b --bert path/to/all-MiniLM-L6-v2 --img path/to/coco-all-imgs --json lvis_val_1k_category.json

PACO

cd osprey/eval
python lvis_paco_eval.py --model path/to/osprey-7b --bert path/to/all-MiniLM-L6-v2 --img path/to/coco-all-imgs --json paco_val_1k_category.json
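
The LVIS and PACO commands differ only in the --json argument, so both can be assembled from one helper. The sketch below is a convenience wrapper, not part of the repository: `build_lvis_paco_command` is a hypothetical name, and the default paths are the placeholders from the commands above and must be replaced with real locations.

```python
import shlex


def build_lvis_paco_command(json_file,
                            model="path/to/osprey-7b",
                            bert="path/to/all-MiniLM-L6-v2",
                            img="path/to/coco-all-imgs"):
    """Assemble the lvis_paco_eval.py argument list for one annotation file."""
    return [
        "python", "lvis_paco_eval.py",
        "--model", model,
        "--bert", bert,
        "--img", img,
        "--json", json_file,
    ]


commands = [
    build_lvis_paco_command(name)
    for name in ("lvis_val_1k_category.json", "paco_val_1k_category.json")
]
for cmd in commands:
    # Printed only; nothing is executed here.
    print(shlex.join(cmd))
```

To actually launch an evaluation, each list could be passed to `subprocess.run(cmd, cwd="osprey/eval", check=True)`.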

3. Detailed Region Description

  • Fill in the GPT interface in eval_gpt.py.
  • Change the path in gpt_eval.sh.
cd osprey/eval
sh gpt_eval.sh

4. Ferret-Bench

Note that we have converted the boxes in box_refer_caption.json and box_refer_reason.json to polygon format denoted by segmentation.
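
For readers who want to apply the same conversion to their own data, a box can be rewritten as a four-corner polygon. The sketch below assumes boxes in [x1, y1, x2, y2] form and a COCO-style segmentation value (a list of flat [x, y, x, y, ...] polygons); the exact layout used in the Ferret-Bench JSON files is not stated here, and `box_to_polygon` and the `bbox` key are hypothetical.

```python
def box_to_polygon(box):
    """Convert [x1, y1, x2, y2] to a flat four-corner polygon, starting at (x1, y1)."""
    x1, y1, x2, y2 = box
    # Corners in order: (x1, y1), (x2, y1), (x2, y2), (x1, y2).
    return [x1, y1, x2, y1, x2, y2, x1, y2]


entry = {"bbox": [10, 20, 110, 70]}  # hypothetical annotation entry
entry["segmentation"] = [box_to_polygon(entry["bbox"])]
print(entry["segmentation"])
```

If the boxes were instead stored as [x, y, width, height], x2 and y2 would first have to be computed as x + width and y + height.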

Referring Description

cd osprey/eval
python ferret_bench_eval.py --model_name path/to/osprey-chat-7b --root_path path/to/coco_imgs --json_path ./ferret_bench/box_refer_caption.json

Referring Reasoning

cd osprey/eval
python ferret_bench_eval.py --model_name path/to/osprey-chat-7b --root_path path/to/coco_imgs --json_path ./ferret_bench/box_refer_reason.json

Then use GPT-4 to evaluate the results, as in Ferret.

5. POPE

  • Download coco from POPE and put it under osprey/eval/pope.
  • Change the path in pope_eval.sh.
cd osprey/eval
sh pope_eval.sh
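
POPE poses yes/no questions about object presence, and results are usually summarized with accuracy, precision, recall, F1 and the ratio of "yes" answers. The sketch below shows how these numbers could be computed from already-normalized answers; `pope_metrics` is a hypothetical helper and not the code invoked by pope_eval.sh.

```python
def pope_metrics(predictions, labels):
    """Binary metrics over "yes"/"no" answers, treating "yes" as the positive class."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, t in pairs if p == "yes" and t == "yes")
    fp = sum(1 for p, t in pairs if p == "yes" and t == "no")
    tn = sum(1 for p, t in pairs if p == "no" and t == "no")
    fn = sum(1 for p, t in pairs if p == "no" and t == "yes")
    total = len(pairs)

    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    yes_ratio = (tp + fp) / total if total else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "yes_ratio": yes_ratio,
    }


# Toy answers for illustration only (not real model output).
predictions = ["yes", "yes", "no", "no"]
labels = ["yes", "no", "no", "yes"]
metrics = pope_metrics(predictions, labels)
print(metrics)
```

Free-form model outputs would first need to be mapped to "yes" or "no"; that normalization step is left out of this sketch.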

6. Region Level Captioning

cd osprey/eval
python refcocog_eval.py --model path/to/Osprey-7B-refcocog-fintune --img path/to/coco-all-imgs --json finetune_refcocog_val_with_mask.json
  • Finally, evaluate the output JSON file using CaptionMetrics.