[Mask R-CNN] - Clarify whether partitioning image dataset into landscape and portrait images is allowed #459

Open
qpjaada opened this issue Jul 27, 2021 · 0 comments

qpjaada commented Jul 27, 2021

According to the rules on training data order, where data pipelines randomly order data:

"arbitrary sharding, batching, and packing are allowed provided that (1) the data is still overall randomly ordered and not ordered to improve convergence and (2) each datum still appears exactly once."

We would like clarification on whether this allows sharding the image dataset for Mask R-CNN into landscape and portrait images. We believe something similar is already allowed for audio sequences in the RNN-T speech recognition model via bucketing. Partitioning images by orientation seems like a natural extension of audio-sequence bucketing to image datasets.

Additionally, the rules related to pre-training clearly state:

"High-level statistical information about the dataset, such as distribution of sizes, may be used."

So, we believe that such a portrait/landscape partitioning of an image dataset is consistent with current rules/practices.
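
To make the question concrete, here is a minimal sketch of the kind of partitioning we have in mind. It is illustrative only and not taken from any reference implementation: the function name `orientation_batches`, the `(image_id, width, height)` record layout, and the choice to treat square images as landscape are all assumptions. Each shard is shuffled, cut into batches, and the batches from both shards are then shuffled together, so every image appears exactly once per epoch and the order does not depend on anything other than the random seed and the orientation split.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical record: (image_id, width, height).
ImageInfo = Tuple[int, int, int]


def orientation_batches(
    images: Sequence[ImageInfo], batch_size: int, seed: int
) -> List[List[ImageInfo]]:
    """Shard images by orientation, then return randomly ordered batches.

    Every batch contains only landscape or only portrait images.
    Square images are treated as landscape (an arbitrary assumption).
    """
    rng = random.Random(seed)

    # Partition into two shards using only width and height.
    landscape = [im for im in images if im[1] >= im[2]]
    portrait = [im for im in images if im[1] < im[2]]

    batches: List[List[ImageInfo]] = []
    for shard in (landscape, portrait):
        # Random order within each shard.
        rng.shuffle(shard)
        # Cut the shard into batches; the last one may be smaller.
        for start in range(0, len(shard), batch_size):
            batches.append(shard[start:start + batch_size])

    # Interleave landscape and portrait batches in random order.
    rng.shuffle(batches)
    return batches
```

Whether this ordering still counts as "overall randomly ordered" under condition (1) is exactly the point we would like clarified.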
