Fixing uniform empty tensor handling #283
Conversation
Makes sense, thanks for adding this!
```python
        self.assertTrue(torch.equal(data["test2"], reloaded["test2"]))

    def test_disjoint_tensors_shared_storage(self):
        A = torch.zeros((10, 10))
```
Nit: not a fan of capitalized names for variables. Not a fan of one-letter names either ;-)
Do you have a better suggestion?
`tensor`, `array`, `matrix`
```diff
@@ -36,7 +36,7 @@ def storage_size(tensor: torch.Tensor) -> int:
 def _find_shared_tensors(state_dict: Dict[str, torch.Tensor]) -> List[Set[str]]:
     tensors = defaultdict(set)
     for k, v in state_dict.items():
-        if v.device != torch.device("meta"):
+        if v.device != torch.device("meta") and storage_ptr(v) != 0 and storage_size(v) != 0:
```
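For reference, a rough sketch (our assumption, not the PR's actual code) of what the two helpers checked here do, using the same untyped-storage API that appears later in this thread:

```python
import torch

def storage_ptr(tensor: torch.Tensor) -> int:
    # Address of the underlying allocation; 0 when nothing was ever allocated.
    return tensor.untyped_storage().data_ptr()

def storage_size(tensor: torch.Tensor) -> int:
    # Size in bytes of the whole allocation, independent of the view's shape.
    return tensor.untyped_storage().size()
```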
Can't you just test that `tensor.numel() != 0`? Nothing against the current implementation, but somehow I don't understand why you need to check the pointer.
Because it's the storage pointer I'm looking at, not the tensor. You can have an empty tensor that is still backed by shared storage, in which case I still want to yell.
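A minimal illustration of such a tensor (variable names are ours):

```python
import torch

base = torch.zeros((2, 1))
view = base[:, :0]  # empty: numel() == 0

print(view.numel())  # 0
# Yet it is backed by the same (non-empty) storage as `base`:
print(view.untyped_storage().data_ptr() == base.untyped_storage().data_ptr())  # True
```

So a `numel() != 0` check would wave this tensor through even though its storage overlaps another tensor's.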
Hmm, so why not check `storage_size(v) != 0` only? You should accept any tensor that has empty storage regardless of where it is actually stored, no?
No, I'm also trying to safeguard users from doing the wrong thing. Saving empty tensors should worry most users. Here the storage was never allocated in the first place, so it looks intentional.
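Concretely, the distinction being drawn (a sketch; the null pointer for never-allocated storage is exactly what the `storage_ptr(v) != 0` check in the diff relies on):

```python
import torch

# Never allocated: empty storage, null data pointer.
print(torch.tensor([]).untyped_storage().data_ptr())  # 0

# An empty *view* of a real allocation still points at live memory.
print(torch.zeros((2, 1))[:, :0].untyped_storage().data_ptr())  # non-zero
```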
I don't understand. Am I right in the following?

```python
torch.tensor([])            # want to support
torch.zeros((2, 0))         # want to support
torch.zeros((2, 1))[:, :0]  # no support
```

If so, `storage_size(v) != 0` should work.
```python
>>> l = [torch.tensor([]), torch.zeros((2, 0)), torch.zeros((2, 1))[:, :0]]
>>> for elt in l:
...     print(elt.untyped_storage().size())
...
0
0
8
```
`torch.zeros((2, 1))[:, :0]` will actually work if it's alone in the state dict. But if two tensors share the same storage, a crash will happen. I don't really want to start checking whether the slices actually overlap or not.
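A sketch of the crash scenario being described, assuming the shared-storage check in `save_file` (the key names and file name are ours):

```python
import torch
from safetensors.torch import save_file

A = torch.zeros((10, 10))
# Two disjoint views, but both point into A's single storage.
data = {"top": A[:2], "bottom": A[4:]}

try:
    save_file(data, "disjoint.safetensors")
except RuntimeError as e:
    # Shared storage is rejected rather than silently duplicated on disk.
    print(e)
```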
Hmm, what I mean is: if your storage size is 0, there's no overlap or anything.
`torch.zeros((2, 1))[:, :0]` does not have a 0-sized storage.
Okay, I think I may be misunderstanding something. Anyway, it's not very important.
What does this PR do?
Empty tensors were accepted when there was a single one, but disallowed when there were many (because all of them appeared to share storage). This fixes it by simply ignoring empty tensors, the same way meta tensors are ignored.
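A before/after sketch of the fixed behavior (our example; per the description above, every never-allocated empty tensor used to be grouped as "sharing" the null storage pointer, so saving more than one of them failed):

```python
import torch
from safetensors.torch import save_file, load_file

# Several never-allocated empty tensors in one state dict.
data = {"a": torch.tensor([]), "b": torch.zeros((2, 0))}

# Before this PR: rejected as shared tensors (both have storage_ptr == 0).
# After this PR: empty tensors are skipped like meta tensors, so this works.
save_file(data, "empty.safetensors")
reloaded = load_file("empty.safetensors")
print(reloaded["a"].shape, reloaded["b"].shape)  # torch.Size([0]) torch.Size([2, 0])
```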