[datasets] Allow detection task for built-in datasets #1717

felixdittrich92 · 2024-09-05T07:52:23Z

This PR:

- makes it possible to plug the built-in datasets into the detection training script as well
- extends the tests
- updates the eval scripts, which now work with different args out of the box

Any feedback is welcome

codecov · 2024-09-05T08:10:26Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.42%. Comparing base (9045dcf) to head (3991bdc).
Report is 5 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1717      +/-   ##
==========================================
+ Coverage   96.40%   96.42%   +0.02%     
==========================================
  Files         164      164              
  Lines        7782     7869      +87     
==========================================
+ Hits         7502     7588      +86     
- Misses        280      281       +1

Flag	Coverage Δ
unittests	`96.42% <100.00%> (+0.02%)`	⬆️


doctr/datasets/cord.py

doctr/datasets/iiit5k.py

odulcy-mindee · 2024-09-30T12:07:50Z

doctr/datasets/funsd.py

+        if recognition_task and detection_task:
+            raise ValueError(
+                "recognition_task and detection_task cannot be set to True simultaneously "
+                + "to get the whole dataset with boxes and labels leave both to False"
+            )
+


I guess this part can be moved into VisionDataset, or even into an abstraction above it, since this configuration is always forbidden.
It would also reduce the amount of copy-paste.

The problem here is that all datasets inherit from AbstractDataset, but not all datasets can be used for recognition and/or detection; for example, MJSynth is a pure recognition dataset :)

So recognition_task / detection_task is only available on the top level .. We could do something like raise_for on VisionDataset, but I'm not sure we really want that 😅

Ok, so if you don't want to be able to pass recognition_task or detection_task to every AbstractDataset, the code can stay like this, I'm fine with it. My goal was to move the logic "if both variables are set to True, then raise an error, as it's never possible".
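For illustration only, the idea of keeping the "both flags set" check in one place without touching the base class could be sketched as a small shared helper. The names `check_task_flags` and `ToyDataset` are hypothetical and are not part of docTR; the error message mirrors the one in the diff above.

```python
def check_task_flags(recognition_task: bool = False, detection_task: bool = False) -> None:
    """Hypothetical helper: reject the one flag combination that is never valid."""
    if recognition_task and detection_task:
        raise ValueError(
            "recognition_task and detection_task cannot be set to True simultaneously "
            "to get the whole dataset with boxes and labels leave both to False"
        )


class ToyDataset:
    """Stand-in for a built-in dataset that supports both tasks (not docTR code)."""

    def __init__(self, recognition_task: bool = False, detection_task: bool = False) -> None:
        # A single call replaces the copy-pasted if/raise block in each dataset.
        check_task_flags(recognition_task, detection_task)
        self.recognition_task = recognition_task
        self.detection_task = detection_task


# Either flag alone, or neither, is accepted.
full = ToyDataset()
det_only = ToyDataset(detection_task=True)
```

Datasets that only support one task (a pure recognition dataset, for instance) would simply never call the helper, so the base class does not need to know about either flag.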

felixdittrich92 · 2024-09-30T12:59:44Z

@odulcy-mindee I tried to handle it with kwargs in the _AbstractDataset, but to be honest I didn't like forwarding both args only to raise the error .. because in that case it feels like both (detection_task and recognition_task) would do something in the "background" 😅

Do you know what I mean?

Edit: double checked - to handle this in one place for every dataset, we would need to init _AbstractDataset with detection_task and recognition_task ... but that doesn't fit the logic handled by the base class

class _AbstractDataset:
    data: List[Any] = []
    _pre_transforms: Optional[Callable[[Any, Any], Tuple[Any, Any]]] = None

    def __init__(
        self,
        root: Union[str, Path],
        img_transforms: Optional[Callable[[Any], Any]] = None,
        sample_transforms: Optional[Callable[[Any, Any], Tuple[Any, Any]]] = None,
        pre_transforms: Optional[Callable[[Any, Any], Tuple[Any, Any]]] = None,
    ) -> None:
        if not Path(root).is_dir():
            raise ValueError(f"expected a path to a reachable folder: {root}")

        self.root = root
        self.img_transforms = img_transforms
        self.sample_transforms = sample_transforms
        self._pre_transforms = pre_transforms
        self._get_img_shape = get_img_shape

    def __len__(self) -> int:
        return len(self.data)

    def _read_sample(self, index: int) -> Tuple[Any, Any]:
        raise NotImplementedError

    def __getitem__(self, index: int) -> Tuple[Any, Any]:
        # Read image
        img, target = self._read_sample(index)
        # Pre-transforms (format conversion at run-time etc.)
        if self._pre_transforms is not None:
            img, target = self._pre_transforms(img, target)

        if self.img_transforms is not None:
            # typing issue cf. https://github.com/python/mypy/issues/5485
            img = self.img_transforms(img)

        if self.sample_transforms is not None:
            # Conditions to assess it is detection model with multiple classes and avoid confusion with other tasks.
            if (
                isinstance(target, dict)
                and all(isinstance(item, np.ndarray) for item in target.values())
                and set(target.keys()) != {"boxes", "labels"}  # avoid confusion with obj detection target
            ):
                img_transformed = _copy_tensor(img)
                for class_name, bboxes in target.items():
                    img_transformed, target[class_name] = self.sample_transforms(img, bboxes)
                img = img_transformed
            else:
                img, target = self.sample_transforms(img, target)

        return img, target

    def extra_repr(self) -> str:
        return ""

    def __repr__(self) -> str:
        return f"{self.__class__.__name__}({self.extra_repr()})"

wdyt ?

odulcy-mindee · 2024-10-01T08:38:26Z

Yes, I see what you mean. In my opinion it lacks variables like is_recognition_dataset and is_detection_dataset, but that's for another PR
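A rough sketch of what such capability flags might look like, purely as an assumption for a later PR: the class names below are made up, and only the attribute names `is_recognition_dataset` / `is_detection_dataset` come from this comment.

```python
class _TaskAwareDataset:
    """Hypothetical base class; not the actual docTR _AbstractDataset."""

    # Class-level capability flags, overridden by each concrete dataset.
    is_recognition_dataset: bool = False
    is_detection_dataset: bool = False

    def __init__(self, recognition_task: bool = False, detection_task: bool = False) -> None:
        if recognition_task and detection_task:
            raise ValueError("recognition_task and detection_task cannot be set to True simultaneously")
        if recognition_task and not self.is_recognition_dataset:
            raise ValueError(f"{self.__class__.__name__} cannot be used for recognition")
        if detection_task and not self.is_detection_dataset:
            raise ValueError(f"{self.__class__.__name__} cannot be used for detection")
        self.recognition_task = recognition_task
        self.detection_task = detection_task


class ToyRecognitionOnly(_TaskAwareDataset):
    """Stand-in for a pure recognition dataset."""

    is_recognition_dataset = True


class ToyFullOCR(_TaskAwareDataset):
    """Stand-in for a dataset with both boxes and labels."""

    is_recognition_dataset = True
    is_detection_dataset = True


ocr_det = ToyFullOCR(detection_task=True)
rec_only = ToyRecognitionOnly(recognition_task=True)
```

With flags like these the base class could validate the arguments once, and a pure recognition dataset would reject `detection_task=True` without any per-dataset code.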

felixdittrich92 · 2024-10-01T08:40:56Z

Correct :) Something like that :)

felixdittrich92 · 2024-10-01T08:41:47Z

I agree, for the moment we can keep it as is, but this would be a possible way to clean up the code a bit 👍

allow detection task for built-in datasets

78a6818

felixdittrich92 added this to the 0.9.1 milestone Sep 5, 2024

felixdittrich92 requested a review from odulcy-mindee September 5, 2024 07:52

felixdittrich92 self-assigned this Sep 5, 2024

update tests

e0a2cde

felixdittrich92 mentioned this pull request Sep 5, 2024

[Bug] Fix eval scripts + possible overflow in Resize #1715

Merged

odulcy-mindee reviewed Sep 30, 2024


felixdittrich92 added 2 commits September 30, 2024 14:50

Apply suggestions

f4c49a1

revert

3991bdc

felixdittrich92 requested a review from odulcy-mindee September 30, 2024 13:00

odulcy-mindee approved these changes Oct 1, 2024


felixdittrich92 merged commit 7f6757c into mindee:main Oct 1, 2024
80 of 81 checks passed

felixdittrich92 deleted the detection-task branch October 1, 2024 08:42

felixdittrich92 modified the milestones: 0.9.1, 0.10.0 Oct 10, 2024
