[datasets] Allow detection task for built-in datasets #1717

Merged
4 commits merged into mindee:main from the detection-task branch on Oct 1, 2024

felixdittrich92
Contributor

This PR:

  • makes it possible to plug built-in datasets into the detection training script
  • extends the tests
  • updates the eval scripts, which now work with different args out of the box

Any feedback is welcome

@felixdittrich92 felixdittrich92 added the type: bug, topic: documentation, type: enhancement, ext: tests, module: datasets, ext: references, framework: pytorch, framework: tensorflow, topic: text detection, and ext: docs labels Sep 5, 2024
@felixdittrich92 felixdittrich92 added this to the 0.9.1 milestone Sep 5, 2024
@felixdittrich92 felixdittrich92 self-assigned this Sep 5, 2024

codecov bot commented Sep 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.42%. Comparing base (9045dcf) to head (3991bdc).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1717      +/-   ##
==========================================
+ Coverage   96.40%   96.42%   +0.02%     
==========================================
  Files         164      164              
  Lines        7782     7869      +87     
==========================================
+ Hits         7502     7588      +86     
- Misses        280      281       +1     
Flag Coverage Δ
unittests 96.42% <100.00%> (+0.02%) ⬆️

doctr/datasets/cord.py (outdated, resolved)
doctr/datasets/iiit5k.py (resolved)
Comment on lines 60 to 65
if recognition_task and detection_task:
    raise ValueError(
        "recognition_task and detection_task cannot be set to True simultaneously "
        + "to get the whole dataset with boxes and labels leave both to False"
    )
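
For readers skimming the thread, the three allowed flag combinations can be pictured with a toy dataset. This is only a sketch: `ToyDataset`, `_SAMPLES`, and the per-mode target layout are hypothetical illustrations inferred from the error message above, not doctr code.

```python
from typing import Any, List, Tuple

# Hypothetical annotations: one image with two words,
# each box given as (xmin, ymin, xmax, ymax).
_SAMPLES = [
    ("img_0.png", [((0, 0, 10, 5), "hello"), ((12, 0, 20, 5), "world")]),
]


class ToyDataset:
    """Illustrative stand-in for a built-in dataset; not part of doctr."""

    def __init__(self, recognition_task: bool = False, detection_task: bool = False) -> None:
        # Same guard as in the reviewed snippet: the two modes are mutually exclusive.
        if recognition_task and detection_task:
            raise ValueError(
                "recognition_task and detection_task cannot be set to True simultaneously"
            )

        self.data: List[Tuple[Any, Any]] = []
        for img_name, words in _SAMPLES:
            boxes = [box for box, _ in words]
            labels = [text for _, text in words]
            if recognition_task:
                # One sample per word: (reference to the crop, text label)
                self.data.extend(((img_name, box), text) for box, text in words)
            elif detection_task:
                # One sample per image: boxes only
                self.data.append((img_name, boxes))
            else:
                # Both flags False: full target with boxes and labels
                self.data.append((img_name, {"boxes": boxes, "labels": labels}))
```

With both flags left at `False`, each sample carries boxes and labels; setting exactly one flag narrows the target to what that task needs.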

Collaborator

I guess this part can be moved in VisionDataset or even in an abstraction above as this configuration is always forbidden.
It'll also reduce the number of copy paste

Contributor

@felixT2K felixT2K Sep 30, 2024

The problem here is that all datasets inherit from AbstractDataset, but not all datasets can be used for recognition and/or detection. For example, MJSynth is a pure recognition dataset :)

Contributor

So recognition_task / detection_task are only available at the top level. We could add something like raise_for on VisionDataset, but I'm not sure we really want that 😅

Collaborator

Ok, so if you don't want to be able to pass recognition_task or detection_task to every AbstractDataset, the code can stay like this, I'm fine with it. My goal was to move the logic "if both variables are set to True, then raise an error, as that combination is never valid" to a single place.
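
One way to picture the "single place" idea without touching the base class is a small shared guard that only the datasets exposing both flags call. This is a sketch under assumptions: `raise_for_task_conflict`, `DatasetWithBothTasks`, and `RecognitionOnlyDataset` are hypothetical names, not existing doctr functions or classes.

```python
def raise_for_task_conflict(recognition_task: bool, detection_task: bool) -> None:
    """Hypothetical shared guard: reject the one flag combination that is never valid."""
    if recognition_task and detection_task:
        raise ValueError(
            "recognition_task and detection_task cannot be set to True simultaneously"
        )


class DatasetWithBothTasks:
    """Toy dataset exposing both flags; it delegates the check instead of copy-pasting it."""

    def __init__(self, recognition_task: bool = False, detection_task: bool = False) -> None:
        raise_for_task_conflict(recognition_task, detection_task)
        self.recognition_task = recognition_task
        self.detection_task = detection_task


class RecognitionOnlyDataset:
    """Toy dataset without task flags (the thread names MJSynth as a case like this).

    It never calls the guard, so the base class does not need to know about the flags.
    """

    def __init__(self) -> None:
        self.recognition_task = True
        self.detection_task = False
```

The trade-off is that each flag-aware dataset still needs one call to the guard, but the condition and the message live in one function.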

@felixdittrich92
Contributor Author

felixdittrich92 commented Sep 30, 2024

@odulcy-mindee I tried to handle it with kwargs in _AbstractDataset, but to be honest I didn't like forwarding both args only to raise the error, because then it feels like both (detection_task and recognition_task) would do something in the "background" 😅

Do you know what I mean?

Edit: double checked. To handle this properly in one place for every dataset, we would need to init _AbstractDataset with detection_task and recognition_task, but that doesn't fit the logic the base class is responsible for:

class _AbstractDataset:
    data: List[Any] = []
    _pre_transforms: Optional[Callable[[Any, Any], Tuple[Any, Any]]] = None

    def __init__(
        self,
        root: Union[str, Path],
        img_transforms: Optional[Callable[[Any], Any]] = None,
        sample_transforms: Optional[Callable[[Any, Any], Tuple[Any, Any]]] = None,
        pre_transforms: Optional[Callable[[Any, Any], Tuple[Any, Any]]] = None,
    ) -> None:
        if not Path(root).is_dir():
            raise ValueError(f"expected a path to a reachable folder: {root}")

        self.root = root
        self.img_transforms = img_transforms
        self.sample_transforms = sample_transforms
        self._pre_transforms = pre_transforms
        self._get_img_shape = get_img_shape

    def __len__(self) -> int:
        return len(self.data)

    def _read_sample(self, index: int) -> Tuple[Any, Any]:
        raise NotImplementedError

    def __getitem__(self, index: int) -> Tuple[Any, Any]:
        # Read image
        img, target = self._read_sample(index)
        # Pre-transforms (format conversion at run-time etc.)
        if self._pre_transforms is not None:
            img, target = self._pre_transforms(img, target)

        if self.img_transforms is not None:
            # typing issue cf. https://github.com/python/mypy/issues/5485
            img = self.img_transforms(img)

        if self.sample_transforms is not None:
            # Conditions to assess it is detection model with multiple classes and avoid confusion with other tasks.
            if (
                isinstance(target, dict)
                and all(isinstance(item, np.ndarray) for item in target.values())
                and set(target.keys()) != {"boxes", "labels"}  # avoid confusion with obj detection target
            ):
                img_transformed = _copy_tensor(img)
                for class_name, bboxes in target.items():
                    img_transformed, target[class_name] = self.sample_transforms(img, bboxes)
                img = img_transformed
            else:
                img, target = self.sample_transforms(img, target)

        return img, target

    def extra_repr(self) -> str:
        return ""

    def __repr__(self) -> str:
        return f"{self.__class__.__name__}({self.extra_repr()})"

wdyt ?

@odulcy-mindee
Collaborator

Here, I see what you mean. In my opinion it lacks is_recognition_dataset and is_detection_dataset variables, but that's for another PR.
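
The capability-flag idea could look roughly like the sketch below. Only the attribute names `is_recognition_dataset` and `is_detection_dataset` come from the comment above; how they would be used is an assumption, and `_FlagAwareDataset`, `BothTasksDataset`, and `PureRecognitionDataset` are hypothetical classes rather than doctr code.

```python
class _FlagAwareDataset:
    """Hypothetical base class: subclasses declare which task modes they support."""

    is_recognition_dataset: bool = False
    is_detection_dataset: bool = False

    def __init__(self, recognition_task: bool = False, detection_task: bool = False) -> None:
        # The conflict check now lives in exactly one place.
        if recognition_task and detection_task:
            raise ValueError(
                "recognition_task and detection_task cannot be set to True simultaneously"
            )
        # Reject a mode the concrete dataset did not declare support for.
        if recognition_task and not self.is_recognition_dataset:
            raise ValueError(f"{self.__class__.__name__} does not support recognition_task")
        if detection_task and not self.is_detection_dataset:
            raise ValueError(f"{self.__class__.__name__} does not support detection_task")
        self.recognition_task = recognition_task
        self.detection_task = detection_task


class BothTasksDataset(_FlagAwareDataset):
    is_recognition_dataset = True
    is_detection_dataset = True


class PureRecognitionDataset(_FlagAwareDataset):
    # Assumed interpretation: a recognition-only dataset opts out of detection mode.
    is_recognition_dataset = True
```

With this layout the base class never has to forward flags "just to raise", because it also uses them to validate what each dataset supports.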

@felixdittrich92
Contributor Author

Correct :) Something like that :)

@felixdittrich92
Contributor Author

I also think we can keep it as is for the moment, but this would be a possible way to clean up the code a bit 👍

@felixdittrich92 felixdittrich92 merged commit 7f6757c into mindee:main Oct 1, 2024
80 of 81 checks passed
@felixdittrich92 felixdittrich92 deleted the detection-task branch October 1, 2024 08:42
@felixdittrich92 felixdittrich92 modified the milestones: 0.9.1, 0.10.0 Oct 10, 2024
3 participants