
Some InferenceClient tasks missing parameters argument, inconsistent with task specifications #2557

hanouticelina opened this issue Sep 20, 2024 · 7 comments · May be fixed by #2561
@hanouticelina
Contributor

Some task methods in huggingface_hub.InferenceClient do not include a parameters argument to allow passing additional inference parameters.
The affected tasks are: audio-classification, automatic-speech-recognition, fill-mask, image-classification, image-segmentation, object-detection, question-answering, table-question-answering, text-classification, token-classification and translation.
This inconsistency means the implementation is not fully aligned with the task specifications and the Inference API documentation.

Taking text-classification as an example:

With Python requests:

import os

import requests


API_URL = "https://api-inference.huggingface.co/models/distilbert/distilbert-base-uncased-finetuned-sst-2-english"
headers = {"Authorization": f"Bearer {os.environ.get('HF_TOKEN')}"}


def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()


output = query(
    {
        "inputs": "I like you. I love you",
        "parameters": {"top_k": 1},
    }
)

print(output)
[{'label': 'POSITIVE', 'score': 0.9998738765716553}]

With InferenceClient:

from huggingface_hub import InferenceClient

client = InferenceClient()

output = client.text_classification(
    model="distilbert-base-uncased-finetuned-sst-2-english",
    text="I like you. I love you",
    parameters={"top_k": 1},
)

print(output)
output = client.text_classification(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: InferenceClient.text_classification() got an unexpected keyword argument 'parameters'
@hanouticelina hanouticelina self-assigned this Sep 20, 2024
@Wauplin
Contributor

Wauplin commented Sep 20, 2024

Yes, that is an identified problem. For context, when we first implemented the InferenceClient, we didn't have that much harmonization yet (the specs didn't exist). To avoid future maintenance, we decided not to add parameters to a lot of tasks (always better to do less and avoid future breaking changes). That's why the tasks you've listed above don't have parameters.

That being said, if we start adding parameters, we most likely would add a parameters parameter to the method signatures but individual parameters instead (e.g. client.text_classification(..., top_k=1)).

EDIT: another thing to take into consideration is that those tasks are much, much less used than the text generation or image generation ones, hence the reduced scope.

@hanouticelina
Contributor Author

Agreed, I also prefer adding a parameters argument rather than individual parameters, at least for those "less popular" tasks. Even though those tasks have low usage, I think it's still important to have the client aligned with the Inference API documentation.

@Wauplin
Contributor

Wauplin commented Sep 20, 2024

Sorry sorry, my bad 🙈

I meant we most likely wouldn't add a parameters parameter to the method signatures but individual parameters instead.

The problem with a parameters parameter is that it doesn't play well with annotations/autocomplete. Even if we start using typing.TypedDict as OpenAI does, it's still not as natively supported as explicit kwargs.
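
For illustration, a rough sketch of what the individual-kwargs approach could look like for text_classification. The parameter names (top_k, function_to_apply, ClassificationOutputTransform) come from this discussion; the overall signature and the return type are only assumptions, not the actual implementation:

from typing import List, Optional

# Hypothetical signature sketch; the real one would be derived from the
# auto-generated task specs.
def text_classification(
    self,
    text: str,
    *,
    model: Optional[str] = None,
    top_k: Optional[int] = None,  # limit the output to the top K most probable classes
    function_to_apply: Optional["ClassificationOutputTransform"] = None,
) -> List["TextClassificationOutputElement"]:
    ...

A call like client.text_classification("I like you. I love you", top_k=1) then gets autocomplete and static type checking for free, which a parameters dict (or a TypedDict) doesn't provide as naturally.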

@Wauplin
Contributor

Wauplin commented Sep 20, 2024

I think it's still important to have the client aligned with the Inference API documentation.

Makes sense now that we have proper documentation 👍

@Wauplin
Contributor

Wauplin commented Sep 20, 2024

Since we are talking about ~10 tasks that follow exactly the same pattern (i.e. inputs + parameters), I think we should look for a semi-automated way to do that as well. This is how I would picture it (taking text_classification as an example):

  • take the TextClassificationParameters class, which is auto-generated based on the specs
  • take the InferenceClient.text_classification method, which is for now entirely manually written
  • check whether any parameter is not listed in the text_classification method. If a parameter is missing,
    • add it to the method signature with the same type annotation as the dataclass (for instance function_to_apply: Optional["ClassificationOutputTransform"] = None)
    • add the attribute docstring (for instance """When specified, limits the output to the top K most probable classes.""") to the method docstring inside the Args section
    • make sure the imports are correct (typically, from ... import ClassificationOutputTransform)

With such a script, we would be sure all those "less used" methods are still well maintained according to the specs. In the past, I've created a few scripts to auto-generate things (see the ./utils folder), always based on static analysis, i.e. only working on the module text. For such a script here, I think we'll need to work at the AST level.

It's usually good for such a script to have a "check" mode (fails if something has to be changed) and a "format" mode (modifies the file in place). Once we have it, we can integrate it into the Makefile / CI. Let me know what you think :)
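
To make this concrete, a rough sketch of what the "check" mode could look like: it compares the fields of the generated parameters dataclass against the method signature parsed from the client source. The import path for the generated types and the client file location are assumptions, and a "format" mode would additionally need to rewrite the signature, docstring and imports at the AST level:

import ast
import dataclasses
from pathlib import Path

# Assumed locations; adjust to wherever the generated specs and client actually live.
from huggingface_hub.inference._generated.types import TextClassificationParameters

CLIENT_FILE = Path("src/huggingface_hub/inference/_client.py")


def method_arg_names(source: str, method_name: str) -> set:
    """Collect the argument names of a method by walking the module's AST."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name == method_name:
            return {arg.arg for arg in node.args.args + node.args.kwonlyargs}
    raise ValueError(f"Method {method_name!r} not found")


spec_params = {field.name for field in dataclasses.fields(TextClassificationParameters)}
method_params = method_arg_names(CLIENT_FILE.read_text(), "text_classification")

missing = spec_params - method_params
if missing:
    raise SystemExit(f"text_classification is missing parameters: {sorted(missing)}")
print("text_classification is up to date with the specs")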

@hanouticelina
Contributor Author

Sorry sorry, my bad 🙈

I meant we most likely wouldn't add a parameters parameter to the method signatures but individual parameters instead.

The problem with a parameters parameter is that it doesn't play well with annotations/autocomplete. Even if we start using typing.TypedDict as OpenAI does, it's still not as natively supported as explicit kwargs.

Yes, I was actually thinking about adding parameters to avoid modifying the client code if there are future changes to the parameters on the API side. But it makes sense to add individual parameters; I don't think there will be many changes, so let's do that.

Since we are talking about ~10 tasks that follow exactly the same pattern (i.e. inputs + parameters), I think we should look for a semi-automated way to do that as well. This is how I would picture it (taking text_classification as an example):

  • take the TextClassificationParameters class, which is auto-generated based on the specs

  • take the InferenceClient.text_classification method, which is for now entirely manually written

  • check whether any parameter is not listed in the text_classification method. If a parameter is missing,

    • add it to the method signature with the same type annotation as the dataclass (for instance function_to_apply: Optional["ClassificationOutputTransform"] = None)
    • add the attribute docstring (for instance """When specified, limits the output to the top K most probable classes.""") to the method docstring inside the Args section
    • make sure the imports are correct (typically, from ... import ClassificationOutputTransform)

With such a script, we would be sure all those "less used" methods are still well maintained according to the specs. In the past, I've created a few scripts to auto-generate things (see the ./utils folder), always based on static analysis, i.e. only working on the module text. For such a script here, I think we'll need to work at the AST level.

It's usually good for such a script to have a "check" mode (fails if something has to be changed) and a "format" mode (modifies the file in place). Once we have it, we can integrate it into the Makefile / CI. Let me know what you think :)

Yes, I like the idea; it will be nice to have this in the Makefile/CI.
I think the script should:

  • Focus solely on adding missing parameters
  • Not handle parameter renaming or deprecation. For example, if function_to_apply is renamed to fn_to_apply, we'd still need to manually keep the old name and deprecate it. The idea is really to keep the script simple, with little maintenance required.

I will open a separate issue for this later 👍

@Wauplin
Contributor

Wauplin commented Sep 20, 2024

Not handle parameter renaming or deprecation. For example, if function_to_apply is renamed to fn_to_apply, we'd still need to manually keep the old name and deprecate it. The idea is really to keep the script simple, with little maintenance required.

Agree 👍 And if it becomes too annoying to maintain in the future, we can extend the script later.

I will open a separate issue for this later 👍

As you wish, it's also fine to start a PR directly or rename this issue.
