Detect new object and Keep tracking of old object #5

Open
Greywan opened this issue Sep 23, 2024 · 3 comments


@Greywan

Greywan commented Sep 23, 2024

Hello,
Thanks for this very wonderful and useful project.
I was wondering if it's possible to keep detecting new objects (e.g. by giving a box prompt) and tracking them, while still keeping track of the old ones (even if they temporarily disappear due to occlusion).
Thanks in advance for this!

@heyoeyo
Owner

heyoeyo commented Sep 23, 2024

Thanks for checking out the repo!

Yes it's possible to track multiple objects. The video_segmentation example script has the code needed for handling a single object. The code that's needed to 'start' tracking an object is this part:

from collections import deque

# Get initial detection/memory data for an object
init_mask, init_mem, init_ptr = sammodel.initialize_video_masking(
    init_encoded_img, boxes_tlbr_norm_list, fg_xy_norm_list, bg_xy_norm_list
)
# Prompt memory is unbounded; recent-frame memory is a bounded rolling buffer
prompt_mems = deque([init_mem])
prompt_ptrs = deque([init_ptr])
prev_mems = deque([], maxlen=6)
prev_ptrs = deque([], maxlen=15)

And then the tracking code is this part:

# Update tracking of a single object
obj_score, best_mask_idx, mask_preds, mem_enc, obj_ptr = sammodel.step_video_masking(
    encoded_imgs_list, prompt_mems, prompt_ptrs, prev_mems, prev_ptrs
)
prev_mems.appendleft(mem_enc)
prev_ptrs.appendleft(obj_ptr)

So each new object would need its own copy of the prompt_mems, prompt_ptrs, prev_mems and prev_ptrs variables, which would then be updated in a loop while processing frames. For occlusions, the SAMv2 model already handles them quite well, but you may want to stop recording the memory data (i.e. skip the .appendleft(...) calls above) whenever the obj_score is less than 0. This helps prevent bad data from corrupting the memory while the object is out of view.
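
As a rough illustration of the per-object bookkeeping described above, here is a minimal sketch. The ObjectMemory class, the step_all_objects function and the FakeModel stand-in are hypothetical names made up for this example (they are not part of the repo); only the step_video_masking call and the deque sizes are taken from the snippets above. FakeModel exists only so the sketch runs without model weights.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ObjectMemory:
    """Per-object tracking state (hypothetical helper, not part of the repo)."""

    prompt_mems: deque = field(default_factory=deque)
    prompt_ptrs: deque = field(default_factory=deque)
    prev_mems: deque = field(default_factory=lambda: deque(maxlen=6))
    prev_ptrs: deque = field(default_factory=lambda: deque(maxlen=15))

    @classmethod
    def from_init(cls, init_mem, init_ptr):
        """Build the state from the outputs of initialize_video_masking."""
        return cls(prompt_mems=deque([init_mem]), prompt_ptrs=deque([init_ptr]))

    def record(self, obj_score, mem_enc, obj_ptr):
        """Store this frame's memory only when the score is not negative."""
        if obj_score < 0:
            return False  # Likely occluded: leave the existing memory untouched
        self.prev_mems.appendleft(mem_enc)
        self.prev_ptrs.appendleft(obj_ptr)
        return True


def step_all_objects(sammodel, encoded_imgs_list, memory_per_object):
    """Run one tracking step per object; returns {name: (score, mask_preds, best_idx)}."""
    results = {}
    for name, mem in memory_per_object.items():
        obj_score, best_mask_idx, mask_preds, mem_enc, obj_ptr = sammodel.step_video_masking(
            encoded_imgs_list, mem.prompt_mems, mem.prompt_ptrs, mem.prev_mems, mem.prev_ptrs
        )
        mem.record(obj_score, mem_enc, obj_ptr)
        results[name] = (obj_score, mask_preds, best_mask_idx)
    return results


class FakeModel:
    """Stand-in for sammodel that replays a fixed list of object scores."""

    def __init__(self, scores):
        self._scores = iter(scores)

    def step_video_masking(self, encoded_imgs_list, prompt_mems, prompt_ptrs, prev_mems, prev_ptrs):
        return next(self._scores), 0, "mask_preds", "mem_enc", "obj_ptr"


# Two objects on one frame: the first is visible, the second is occluded
memory_per_object = {
    "horse_a": ObjectMemory.from_init("init_mem_a", "init_ptr_a"),
    "horse_b": ObjectMemory.from_init("init_mem_b", "init_ptr_b"),
}
model = FakeModel(scores=[2.5, -1.0])
results = step_all_objects(model, None, memory_per_object)
```

With the real model, the loop over memory_per_object would run once per video frame, and the occluded object simply keeps its older memory until its score becomes positive again.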

Alternatively, if you just want the tracking and don't need the code, you can use the run_video script, which can keep track of multiple objects using the 'buffers' (you can add more buffers by calling the script with the -n flag). Here's an example on one of the videos from the MedSAM2 demo:

multiobj_tracking_example.webm

@heyoeyo
Owner

heyoeyo commented Sep 23, 2024

I've just posted another example script for doing multi-object video segmentation, which might help if you're looking for a code-based starting point. It has hard-coded prompts which can be updated for your own video, but it's set up to work with a short video of horses available here:
https://www.pexels.com/video/horses-running-on-grassland-4215784/
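
For the original question of picking up new objects partway through a video while older ones keep being tracked, one possible approach is a prompt schedule keyed by frame index. The sketch below is only an assumption about how this could be organised and is not necessarily how the example script does it: prompts_per_frame, start_new_objects and FakeModel are hypothetical names, and the box format (a list of top-left/bottom-right pairs in normalized 0 to 1 coordinates) is inferred from the boxes_tlbr_norm_list variable name.

```python
from collections import deque

# Hypothetical schedule: frame index -> {object name: prompts for that object}
prompts_per_frame = {
    0: {
        "horse_a": {
            "boxes_tlbr_norm_list": [[(0.10, 0.20), (0.40, 0.60)]],
            "fg_xy_norm_list": [],
            "bg_xy_norm_list": [],
        }
    },
    2: {
        "horse_b": {
            "boxes_tlbr_norm_list": [],
            "fg_xy_norm_list": [(0.75, 0.50)],
            "bg_xy_norm_list": [],
        }
    },
}


def start_new_objects(sammodel, encoded_img, frame_idx, schedule, memory_per_object):
    """Initialize memory for any objects scheduled to start on this frame."""
    started = []
    for name, prompt in schedule.get(frame_idx, {}).items():
        init_mask, init_mem, init_ptr = sammodel.initialize_video_masking(
            encoded_img,
            prompt["boxes_tlbr_norm_list"],
            prompt["fg_xy_norm_list"],
            prompt["bg_xy_norm_list"],
        )
        # Each object gets its own independent set of memory buffers
        memory_per_object[name] = {
            "prompt_mems": deque([init_mem]),
            "prompt_ptrs": deque([init_ptr]),
            "prev_mems": deque([], maxlen=6),
            "prev_ptrs": deque([], maxlen=15),
        }
        started.append(name)
    return started


class FakeModel:
    """Stand-in for sammodel so the sketch runs without model weights."""

    def initialize_video_masking(self, encoded_img, boxes, fg_points, bg_points):
        return "init_mask", "init_mem", "init_ptr"


memory_per_object = {}
model = FakeModel()
started_log = []
for frame_idx in range(3):
    # With a real video, encoded_img would be the encoded current frame
    started = start_new_objects(model, None, frame_idx, prompts_per_frame, memory_per_object)
    started_log.append(started)
    # ...then run the tracking step for every object in memory_per_object
```

Objects added on earlier frames stay in memory_per_object untouched, so adding a new object never resets the tracking of the existing ones.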

@Greywan
Author

Greywan commented Sep 29, 2024

Thank you for your reply!
