
Arrays and primitive types are not supported in structured output (native or otherwise) #5521

Open
kzu opened this issue Oct 15, 2024 · 3 comments · May be fixed by #5522

Comments

@kzu
Contributor

kzu commented Oct 15, 2024

If you need an array (or list) of typed values from completion rather than a single object, things fail whether you're using native json schema support or not. These two tests fail, for example:

    [ConditionalFact]
    public virtual async Task CompleteAsync_StructuredOutputArray()
    {
        SkipIfNotEnabled();

        var response = await _chatClient.CompleteAsync<Person[]>("""
            Who are described in the following sentence?
            Jimbo Smith is a 35-year-old software developer from Cardiff, Wales.
            Josh Simpson is a 25-year-old software developer from Newport, Wales.
            """,
            useNativeJsonSchema: _chatClient.Metadata.ProviderName == "openai");

        Assert.Equal(2, response.Result.Length);
        Assert.Contains(response.Result, x => x.FullName == "Jimbo Smith");
        Assert.Contains(response.Result, x => x.FullName == "Josh Simpson");
    }

    [ConditionalFact]
    public virtual async Task CompleteAsync_StructuredOutputPrimitive()
    {
        SkipIfNotEnabled();

        var response = await _chatClient.CompleteAsync<int>("""
            As of today (October 2024), Jimbo Smith is a 35-year-old software developer from Cardiff, Wales.
            Which year was he born in?
            """,
            useNativeJsonSchema: _chatClient.Metadata.ProviderName == "openai");

        Assert.Equal(1989, response.Result);
    }

    [ConditionalFact]
    public virtual async Task CompleteAsync_StructuredOutputEnum()
    {
        SkipIfNotEnabled();

        var response = await _chatClient.CompleteAsync<Architecture>("""
            I'm using a Macbook Pro with an M2 chip. What architecture am I using?
            """,
            useNativeJsonSchema: _chatClient.Metadata.ProviderName == "openai");

        Assert.Equal(Architecture.Arm64, response.Result);
    }

    [ConditionalFact]
    public virtual async Task CompleteAsync_StructuredOutputString()
    {
        SkipIfNotEnabled();

        var response = await _chatClient.CompleteAsync<string>("""
            The software developer, Jimbo Smith, is a 35-year-old software developer from Cardiff, Wales.
            What's his full name?
            """,
            useNativeJsonSchema: _chatClient.Metadata.ProviderName == "openai");

        Assert.Equal("Jimbo Smith", response.Result);
    }

    [ConditionalFact]
    public virtual async Task CompleteAsync_StructuredOutputBool()
    {
        SkipIfNotEnabled();

        var response = await _chatClient.CompleteAsync<bool>("""
            The software developer, Jimbo Smith, is a 35-year-old software developer from Cardiff, Wales.
            Is he a medical doctor?
            """,
            useNativeJsonSchema: _chatClient.Metadata.ProviderName == "openai");

        Assert.False(response.Result);
    }

Except for the first test, the others don't even compile, due to a constraint that T must be a class. There is really no need for that constraint, and it seriously restricts the usefulness of the typed API, forcing the creation of intermediate types just to hold those values (e.g. a superfluous Payload&lt;T&gt;).

The native JSON schema support in OpenAI does require the top-level schema type to be an object (not an array or anything else). But this could be solved by having the API itself provide the wrapping object automatically, so that users fall more easily into the proverbial pit of success.
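For comparison, this is the kind of intermediate wrapper the current design forces callers to write by hand (a sketch; `Person`, `PersonListWrapper`, and their members are hypothetical, and the call assumes the `CompleteAsync<T>` API shape shown in the tests above):

```csharp
// Hypothetical hand-written wrapper, needed today only because the schema
// root must be an object and T is constrained to a class.
public record Person(string FullName, int AgeInYears);
public record PersonListWrapper(Person[] People);

// Usage sketch: request the wrapper type, then unwrap manually.
var response = await chatClient.CompleteAsync<PersonListWrapper>("""
    Who are described in the following sentence?
    Jimbo Smith is a 35-year-old software developer from Cardiff, Wales.
    Josh Simpson is a 25-year-old software developer from Newport, Wales.
    """);
Person[] people = response.Result.People;
```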

kzu added a commit to kzu/extensions that referenced this issue Oct 15, 2024
Structured outputs in OpenAI require the top-level schema to be an `object`. By detecting this situation and always wrapping in a `Payload<T>(T Data)` record, we significantly improve the developer experience by making the API transparent to that limitation (which might even be a temporary one?).

The approach works for both OpenAI and Azure Inference without native structured outputs. To signal a wrapped result to the `ChatCompletion<T>`, we use the `AdditionalProperties` dictionary with a non-standard `$wrapped` property, following the typical convention for JSON properties that are not intended for end-user consumption (like `$schema`).
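Based only on the commit description above, the wrapping might be sketched roughly like this (the record name and the `$wrapped` marker come from the message; the member name and unwrapping logic are assumptions):

```csharp
// Per the commit description: when T is not an object type, generate the
// request schema for a wrapper record instead, so the top-level schema is
// always an object.
internal sealed record Payload<T>(T Data);

// Response side (sketch): a non-standard "$wrapped" entry in
// AdditionalProperties signals that the deserialized Payload<T>.Data,
// rather than the raw result, should be surfaced via ChatCompletion<T>.
```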

Fixes dotnet#5521
kzu linked a pull request Oct 15, 2024 that will close this issue
@stephentoub
Member

cc: @SteveSandersonMS

@SteveSandersonMS
Member

SteveSandersonMS commented Oct 16, 2024

The native JSON schema support in OpenAI does require the top-level schema type to be an object (not an array or anything else). But this could be solved by having the API itself provide the wrapping object automatically, so that users fall more easily into the proverbial pit of success.

We could potentially do that, but it has other risks:

  1. The more this library takes responsibility for controlling the behavior, the more it becomes responsible for cases where it doesn't work or behaves inconsistently across LLMs.
  2. It's harder for developers to work out what's going on when looking at telemetry or debugging, since they don't just see their own object shapes, but rather something else provided by the library.

Based on this we've taken the decision (so far at least - it could change) to limit how much magic this library does around structured output. This means:

  • For native structured output, try to pass through the functionality given, not extend it
  • For non-native structured output, limit ourselves to an autogenerated prompt that just says "Output JSON in this schema: {schema}", without giving any further hints about how to write the output, because anything we write might have side effects on some LLMs. For example, even writing this instruction in English makes the LLM more likely to respond in English, even if the developer/user wants the response in another language.

As such it's up to the developer to augment this behavior further if they want. For scalar outputs, they should provide their own wrapping object. And if the non-native prompt isn't sufficient, they should add their own further prompting.
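Under the recommendation above, a caller wanting a scalar result would write something like this today (a sketch; `YearOfBirth` is a hypothetical user-defined record, and the call assumes the `CompleteAsync<T>` API shape from the tests in the issue):

```csharp
// Hypothetical wrapper for a scalar output, defined by the caller so the
// schema root is an object.
public record YearOfBirth(int Year);

var response = await chatClient.CompleteAsync<YearOfBirth>("""
    As of today (October 2024), Jimbo Smith is a 35-year-old software developer from Cardiff, Wales.
    Which year was he born in?
    """);
int year = response.Result.Year;
```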

It's a tricky area and we'll only know if the design is right once people start using this at scale in hundreds of different app scenarios. If we go on to collect evidence from further feedback that a particular change is warranted, we'll certainly be open to making that change. Hope that seems reasonable!

@kzu
Contributor Author

kzu commented Oct 16, 2024

Hm. This is one of the areas that starts to feel like old EF, where the LINQ you got was actually a VERY leaky abstraction that didn't hold in a ton of scenarios and required knowledge of what worked and what didn't. Here you see a T, but a large number of instantiations just won't work. That means lots of frustration and unnecessary wrappers for everyone. I view it as VERY low-hanging fruit, and something future OpenAI LLMs will quite likely solve natively too, since it just looks like an oversight (or a bug).

I don't see how automatically wrapping (AND passing that as the schema you tell the LLM to use, either natively or via your existing prompt) can produce vastly different results from the existing mechanism. My PR shows that you pass essentially the same schema (wrapped at the top level) in both cases, so the delta in behavior would be the same as it can be today.

Sure, this can end up in a package with a single extension method CompleteAsyncForAny<T>, say, so it's not as if it's a showstopper...
