
stanford_drone_dataset_to_semantic_task

Task: take the Stanford Drone Dataset and train a network for semantic segmentation, not only for the labeled moving objects ['Biker', 'Pedestrian', 'Skater', 'Cart', 'Car', 'Bus'] but also for background categories, so that the segmentation can be used for mobile robot navigation.

Stages of the solution:

  1. Background categories were selected: "road", "sidewalk", "greens", "other_stuff"
  2. For each video sequence, extract a reference frame and label it. I used the COCO Annotator tool to label these frames and saved the results in COCO format. Example of labeling for the class "sidewalk":
     [figure: reference frame and its "sidewalk" mask]
  3. Merge annotations from the two domains: one from the original Stanford dataset and one from my labeling. Only two sequences were labeled: deathCircle->video1 and bookstore->video0. The result of merging is a set of colored masks, where categories have the following priority (from lowest to highest): ['other_stuff'] -> ['road'] -> ['sidewalk'] -> ['greens'] -> ['Biker'|'Pedestrian'|'Skater'|'Cart'|'Car'|'Bus']; see the sketch after this list.
     [figure: merged color mask]
  4. Use the segmentation_models.pytorch repo to train a U-Net via transfer learning (starting from weights pretrained on ImageNet).
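
The actual merge from step 3 lives in utils.py; the following is only a minimal sketch of the priority idea, assuming a single merged COCO annotation file per video. The file path, the per-category colors, and the helper name are illustrative assumptions, not the repo's code:

```python
# Sketch of priority-based merging into a color mask (step 3).
# Assumes one merged COCO json per video; colors and paths are hypothetical.
import numpy as np
from pycocotools.coco import COCO

# Lowest-to-highest priority: later categories overwrite earlier ones.
PRIORITY = ['other_stuff', 'road', 'sidewalk', 'greens',
            'Biker', 'Pedestrian', 'Skater', 'Cart', 'Car', 'Bus']
# Hypothetical RGB color per category for the output mask.
COLORS = {name: np.random.RandomState(i).randint(0, 256, 3)
          for i, name in enumerate(PRIORITY)}

def merge_to_color_mask(coco_json_path, image_id):
    """Rasterize all annotations of one image into a single color mask."""
    coco = COCO(coco_json_path)
    info = coco.loadImgs(image_id)[0]
    mask = np.zeros((info['height'], info['width'], 3), dtype=np.uint8)
    name_of = {c['id']: c['name'] for c in coco.loadCats(coco.getCatIds())}
    anns = coco.loadAnns(coco.getAnnIds(imgIds=image_id))
    # Sort by priority so high-priority categories are painted last.
    anns.sort(key=lambda a: PRIORITY.index(name_of[a['category_id']]))
    for ann in anns:
        binary = coco.annToMask(ann).astype(bool)
        mask[binary] = COLORS[name_of[ann['category_id']]]
    return mask
```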

How to use

  1. To work with video from the Stanford Drone Dataset, use the VideoDataset class from utils.py:

```python
from utils import VideoDataset
v_dataset = VideoDataset(data_root='path to data')
```

1.1. Show all available scenes:

print("Scenes: ".format(v_dataset.get_scenes()))

1.2. Get the first frame from a specific video:

```python
scene_name = "deathCircle"
video_name = "video1"
first_frame = v_dataset.get_frame(scene_name, video_name)
```

1.3. Get the last frame from a specific video:

```python
scene_name = "deathCircle"
video_name = "video1"
last_frame = v_dataset.get_frame(scene_name, video_name, is_last=True)
```

1.4. Split a video sequence into frames for a specific scene and video:

```python
v_dataset.split_video(scene_name, video_name, destination_root='destination path')
```
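
The frames obtained this way can serve as reference frames for labeling (stage 2 above). A minimal sketch, assuming get_frame() returns an OpenCV-compatible NumPy array; the output filename is illustrative:

```python
# Export a reference frame to disk so it can be labeled in COCO Annotator.
# Assumption: get_frame() returns a NumPy image array usable with cv2.
import cv2

frame = v_dataset.get_frame("deathCircle", "video1")
cv2.imwrite("deathCircle_video1_reference.jpg", frame)  # illustrative path
```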
  2. Create color masks. To create color masks for a specific scene and video, move the corresponding stuff.json from ./background_categories_annotations/&lt;scene&gt;/&lt;video&gt;/stuff.json into the data directory at ./annotations/&lt;scene&gt;/&lt;video&gt;/stuff.json, then run:

```python
v_dataset.create_color_masks(scene_name, video_name, idx_frame_from=0)
```
  3. Split the dataset into train/val/test:

```python
v_dataset.split_dataset(parts_size=[0.7, 0.2, 0.1], out_path='output path')
```
  4. Use the notebook transfer_learning_unet.ipynb for transfer learning. The notebook is based on an example from the segmentation_models.pytorch repo.
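
For orientation, here is a minimal sketch of the model setup such a notebook performs, using the public segmentation_models_pytorch API. The encoder choice and the number of classes (6 moving-object + 4 background categories) are assumptions, not the notebook's exact configuration:

```python
# Sketch of the transfer-learning setup with segmentation_models_pytorch.
# Encoder and class count are assumptions, not the notebook's exact config.
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnet34",     # backbone architecture
    encoder_weights="imagenet",  # transfer learning: load ImageNet-pretrained weights
    in_channels=3,               # RGB input frames
    classes=10,                  # e.g. 6 object + 4 background categories
)
```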
