Format error in JSON body: data did not match any variant of untagged enum VectorStruct #744

Open
achillesliu opened this issue Aug 19, 2024 · 8 comments
achillesliu commented Aug 19, 2024

The Python library qdrant-client raises an error when upserting a sparse vector.

Current Behavior

The traceback is as follows:

UnexpectedResponse Traceback (most recent call last)
Cell In[17], line 1
----> 1 qclient.upsert(
2 collection_name=collection_name,
3 points=[
4 models.PointStruct(
5 id=1,
6 payload={'metadata': 'metadata'},
7 vector={
8 # 'default_dense': [1.] * 1024,
9 'text': models.SparseVector(
10 indices=[],
11 values=[],
12 ),
13 },
14 )
15 # for text, sparse_dict, dense_vec in zip(
16 # hypo_answers_original[:100],
17 # # sparse_emb_answers['lexical_weights'],
18 # dense_emb_answers['dense_vecs'][:100],
19 # dense_emb_answers['dense_vecs'][:100],
20 # )
21 ]
22 )

File ~/miniconda3/envs/general/lib/python3.8/site-packages/qdrant_client/qdrant_client.py:1349, in QdrantClient.upsert(self, collection_name, points, wait, ordering, shard_key_selector, **kwargs)
1321 """
1322 Update or insert a new point into the collection.
1323
(...)
1345 Operation Result(UpdateResult)
1346 """
1347 assert len(kwargs) == 0, f"Unknown arguments: {list(kwargs.keys())}"
-> 1349 return self._client.upsert(
1350 collection_name=collection_name,
1351 points=points,
1352 wait=wait,
1353 ordering=ordering,
1354 shard_key_selector=shard_key_selector,
1355 **kwargs,
1356 )

File ~/miniconda3/envs/general/lib/python3.8/site-packages/qdrant_client/qdrant_remote.py:1756, in QdrantRemote.upsert(self, collection_name, points, wait, ordering, shard_key_selector, **kwargs)
1753 if isinstance(points, models.Batch):
1754 points = models.PointsBatch(batch=points, shard_key=shard_key_selector)
-> 1756 http_result = self.openapi_client.points_api.upsert_points(
1757 collection_name=collection_name,
1758 wait=wait,
1759 point_insert_operations=points,
1760 ordering=ordering,
1761 ).result
1762 assert http_result is not None, "Upsert returned None result"
1763 return http_result

File ~/miniconda3/envs/general/lib/python3.8/site-packages/qdrant_client/http/api/points_api.py:1667, in SyncPointsApi.upsert_points(self, collection_name, wait, ordering, point_insert_operations)
1657 def upsert_points(
1658 self,
1659 collection_name: str,
(...)
1662 point_insert_operations: m.PointInsertOperations = None,
1663 ) -> m.InlineResponse2006:
1664 """
1665 Perform insert + updates on points. If point with given ID already exists - it will be overwritten.
1666 """
-> 1667 return self._build_for_upsert_points(
1668 collection_name=collection_name,
1669 wait=wait,
1670 ordering=ordering,
1671 point_insert_operations=point_insert_operations,
1672 )

File ~/miniconda3/envs/general/lib/python3.8/site-packages/qdrant_client/http/api/points_api.py:852, in _PointsApi._build_for_upsert_points(self, collection_name, wait, ordering, point_insert_operations)
850 if "Content-Type" not in headers:
851 headers["Content-Type"] = "application/json"
--> 852 return self.api_client.request(
853 type_=m.InlineResponse2006,
854 method="PUT",
855 url="/collections/{collection_name}/points",
856 headers=headers if headers else None,
857 path_params=path_params,
858 params=query_params,
859 content=body,
860 )

File ~/miniconda3/envs/general/lib/python3.8/site-packages/qdrant_client/http/api_client.py:79, in ApiClient.request(self, type_, method, url, path_params, **kwargs)
77 kwargs["timeout"] = int(kwargs["params"]["timeout"])
78 request = self.client.build_request(method, url, **kwargs)
---> 79 return self.send(request, type_)

File ~/miniconda3/envs/general/lib/python3.8/site-packages/qdrant_client/http/api_client.py:102, in ApiClient.send(self, request, type_)
100 except ValidationError as e:
101 raise ResponseHandlingException(e)
--> 102 raise UnexpectedResponse.for_response(response)

UnexpectedResponse: Unexpected Response: 400 (Bad Request)
Raw response content:
b'{"status":{"error":"Format error in JSON body: data did not match any variant of untagged enum VectorStruct at line 1 column 63"},"time":0.0}'
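The error points at "line 1 column 63" of the request body, so one way to narrow things down is to look at what sits at that position in the JSON that was sent. The following is a minimal sketch using only the standard library; `context_at_column` is a hypothetical helper, the column is assumed to be 1-based, and `example_body` is an illustration only, not the body the client actually produced (field order and spacing may differ).

```python
import json


def context_at_column(body: str, column: int, width: int = 20) -> str:
    """Return the slice of a single-line body around a 1-based column."""
    if column < 1 or column > len(body):
        raise ValueError(f"column {column} is outside the body (length {len(body)})")
    index = column - 1  # assumption: the reported column is 1-based
    start = max(0, index - width)
    end = min(len(body), index + width + 1)
    return body[start:end]


# Hypothetical body for illustration only; the real request may order or
# space its fields differently, so the column will not line up exactly.
example_body = json.dumps(
    {"points": [{"id": 1, "vector": {"text": {"indices": [], "values": []}}}]},
    separators=(",", ":"),
)
print(context_at_column(example_body, column=20, width=10))
```

To use it on a real failure, capture the serialized request body (for example with an HTTP proxy or client-side logging) and pass the column number from the error message.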

Steps to Reproduce

  1. Set up a conda env with Python 3.8 or 3.10 (both tried, both fail) and qdrant-client 1.7, 1.10, or 1.11 (all tried, all fail)
  2. Set up a qdrant Docker container with sudo docker run -d -p 6333:6333 -v qdrant_data:/qdrant/storage:z qdrant/qdrant
  3. Create a qdrant collection with the following code
from qdrant_client import QdrantClient, models

qclient = QdrantClient(url="http://localhost:6333")
collection_name = 'test_multiple_vectors'
qclient.recreate_collection(
    collection_name=collection_name,
    vectors_config={
        'default_dense': models.VectorParams(size=1024, distance=models.Distance.COSINE)
    },
    sparse_vectors_config={
        "text": models.SparseVectorParams(index=models.SparseIndexParams(on_disk=False))
    },
)
  4. Upsert a point with the following code
qclient.upsert(
    collection_name=collection_name,
    points=[
        models.PointStruct(
            id=1,
            vector={
                'text': models.SparseVector(
                    indices=[1, 3],
                    values=[0.1, 0.3],
                ),
            },
        )
    ],
)
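Before sending a sparse vector to the server, it can help to rule out malformed input on the client side. The sketch below uses only the standard library; `sparse_vector_problems` is a hypothetical helper, and the rules it checks (equal lengths, unique non-negative integer indices, finite values) are assumptions about what a well-formed sparse vector looks like, not a statement of what the server enforces or of the cause of this error.

```python
import math
import numbers
from typing import List, Sequence


def sparse_vector_problems(indices: Sequence, values: Sequence) -> List[str]:
    """Collect human-readable problems found in a sparse vector's parts."""
    problems = []
    if len(indices) != len(values):
        problems.append(
            f"length mismatch: {len(indices)} indices vs {len(values)} values"
        )
    if len(set(indices)) != len(indices):
        problems.append("duplicate indices")
    # bool is a subclass of int, so it is rejected explicitly.
    if any(
        isinstance(i, bool) or not isinstance(i, numbers.Integral) or i < 0
        for i in indices
    ):
        problems.append("indices must be non-negative integers")
    if any(
        isinstance(v, bool) or not isinstance(v, numbers.Real) or not math.isfinite(v)
        for v in values
    ):
        problems.append("values must be finite numbers")
    return problems


print(sparse_vector_problems([1, 3], [0.1, 0.3]))
```

An empty list means none of the assumed rules were violated; it does not guarantee that the server will accept the point.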

Possible Solution

I tried creating a new conda env, but the error persists.

@achillesliu achillesliu added the bug Something isn't working label Aug 19, 2024
@timvisee timvisee transferred this issue from qdrant/qdrant Aug 19, 2024
achillesliu commented

Update: when the qdrant client is created in local in-memory mode, as in
client = QdrantClient(':memory:'), everything works.

joein commented Aug 21, 2024

Hi @achillesliu,
could you please tell us which qdrant version you are using?

Unfortunately, we could not reproduce the issue.
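For anyone answering the version question, the server version can be read from the running instance rather than from the client package. This is a minimal sketch using only the standard library; it assumes the server's root endpoint returns a JSON object with a "version" field and that the server listens on http://localhost:6333 as in the report. The function names are hypothetical.

```python
import json
import urllib.request
from typing import Tuple


def parse_server_version(body: str) -> str:
    """Extract the "version" field from the JSON returned by the server root."""
    return json.loads(body)["version"]


def fetch_server_version(base_url: str = "http://localhost:6333") -> str:
    """Query a running server for its version (requires network access)."""
    with urllib.request.urlopen(base_url, timeout=5) as response:
        return parse_server_version(response.read().decode("utf-8"))


def major_minor(version: str) -> Tuple[int, int]:
    """Turn a string such as "1.11.1" into (1, 11) for a coarse comparison."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)
```

Calling `fetch_server_version()` against the Docker container and comparing `major_minor` of the result with that of `importlib.metadata.version("qdrant-client")` shows whether the client and server are on different minor versions.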

DhavalArGEP commented

I am also facing the same issue with sparse embedding.

using qdrant-client version 1.10.1

using sparse model: prithivida/Splade_PP_en_v1

joein commented Aug 29, 2024

@DhavalArGEP hi, could you please try qdrant-client 1.11.1 with qdrant 1.10 or later (preferably 1.11.1)?
Unfortunately, we could not reproduce it ourselves.

DhavalArGEP commented

@joein I am facing the same issue with qdrant-client 1.11.1 as well.

joein commented Aug 30, 2024

@DhavalArGEP is it reproducible with the code provided in the issue, or could you maybe provide your own?

Which version of qdrant (not qdrant-client) are you using?

DhavalArGEP commented

@joein I am not able to share the code because it belongs to my organization. However, the same code works on Windows but not on Linux (I am using Docker).

joein commented Sep 2, 2024

@DhavalArGEP can you create a minimal reproducible example? It does not have to be the exact code you use, just enough to pinpoint the issue.

Could you also tell us the version of qdrant you are using?
