Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Registering Pretrained Opensearch Model Fails Due to Null Model Config #2981

Open
mohamey opened this issue Sep 24, 2024 · 2 comments
Open
Assignees
Labels
bug Something isn't working

Comments

@mohamey
Copy link

mohamey commented Sep 24, 2024

What is the bug?
A clear and concise description of the bug.

On Opensearch 2.11.0, attempting to register Opensearch Pretrained models like amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill fails with the below error:

{
  "task_type": "REGISTER_MODEL",
  "function_name": "SPARSE_TOKENIZE",
  "state": "FAILED",
  "worker_node": [
    "<redacted>"
  ],
  "create_time": 1727095445453,
  "last_update_time": 1727095445784,
  "error": "model config is null",
  "is_async": true
}

The request used is:

POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill",
  "function_name": "SPARSE_TOKENIZE",
  "version": "1.0.0",
  "model_format": "TORCH_SCRIPT",
  "model_content_hash_value": "86bab435d031edb2a6d921fd9ac317a7541d5d95666f642b606e7d0ebfb84358",
  "description": "This is a neural sparse encoding model: It transfers text into sparse vector, and then extract nonzero index and value to entry and weights. It serves only in ingestion and customer should use tokenizer model in query."
}

model config is null only appears in two places in the codebase, but I suspect it's coming from this class. However, I don't think the above requests matches the conditions for this error - which is what leads me to believe this might be a bug.

Can you confirm? And if yes, maybe you can provide some context on how it should work and I'm happy to submit a fix.

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Provision a new Opensearch 2.11 server, this bug was verified by myself on both the docker image & a managed Opensearch cluster provided by AWS
  2. Configure the ML Plugin using the following:
{
  "persistent": {
    "plugins.ml_commons.only_run_on_ml_node": false,
    "plugins.ml_commons.native_memory_threshold": "99",
    "plugins.ml_commons.model_access_control_enabled": "true"
  }
}
  1. Make the following request to register a pretrained OS model (which should not require a URL as per docs):
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill",
  "function_name": "SPARSE_TOKENIZE",
  "version": "1.0.0",
  "model_format": "TORCH_SCRIPT",
  "model_content_hash_value": "86bab435d031edb2a6d921fd9ac317a7541d5d95666f642b606e7d0ebfb84358",
  "description": "This is a neural sparse encoding model: It transfers text into sparse vector, and then extract nonzero index and value to entry and weights. It serves only in ingestion and customer should use tokenizer model in query."
}
  1. Get the task ID from the response in Step 3, and check it's status using `GET /_plugins/_ml/tasks/:task_id
  2. Observe the error which states model config is null

What is the expected behavior?
The requested model should register and deploy successfully.

What is your host/environment?

  • OS: AWS Managed Opensearch Cluster 2.11
  • Version - 2.11
  • Plugins N/A

Do you have any screenshots?
N/A

Do you have any additional context?
Largely been following this tutorial provided by the Opensearch Docs.

@mohamey mohamey added bug Something isn't working untriaged labels Sep 24, 2024
@mingshl mingshl removed the untriaged label Sep 24, 2024
@dhrubo-os
Copy link
Collaborator

Not sure why do you need to provide model_content_hash_value in the request. Can you try this example?

@xinyual could you please look into this issue?

@mohamey
Copy link
Author

mohamey commented Oct 2, 2024

Sorry for the delayed response, the below request fails -

Request:

POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill",
  "version": "1.0.0",
  "model_format": "TORCH_SCRIPT"
}

Task Status:

{
  "task_type": "REGISTER_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "FAILED",
  "worker_node": [
    "e8peipyTTKK5FNLxCPGVXg"
  ],
  "create_time": 1727860126324,
  "last_update_time": 1727860127103,
  "error": "model config is null",
  "is_async": true
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: On-deck
Development

No branches or pull requests

3 participants