Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for arrays, enums and primitive types #5522

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

kzu
Copy link
Contributor

@kzu kzu commented Oct 15, 2024

Structured outputs in OpenAI require an object schema object. By detecting this situation and wrapping always in a Payload<T>(T Data) record, we significantly improve the developer experience by making the API transparent to that limitation (which might even be a temporary one?).

The approach works for both OpenAI as well as Azure Inference without native structured outputs. In order to signal a wrapped result to the ChatCompletion<T>, we use the AdditionalProperties dictionary with a non-standard $wrapped property which is a typical convention for JSON properties that are not intended for end-user consumption (like $schema).

NOTE: there will be significant conflicts with #5513, so I'll adjust after that merge, if this is deemed a useful addition 🙏

Fixes #5521

Microsoft Reviewers: Open in CodeFlow

Structured outputs in OpenAI require an `object` schema object. By detecting this situation and wrapping always in a `Payload<T>(T Data)` record, we significantly improve the developer experience by making the API transparent to that limitation (which might even be a temporary one?).

The approach works for both OpenAI as well as Azure Inference without native structured outputs. In order to signal a wrapped result to the `ChatCompletion<T>`, we use the `AdditionalProperties` dictionary with a non-standard `$wrapped` property which is a typical convention for JSON properties that are not intended for end-user consumption (like $schema).

Fixes dotnet#5521
@kzu kzu requested a review from a team as a code owner October 15, 2024 19:44
@@ -40,8 +40,7 @@ public static Task<ChatCompletion<T>> CompleteAsync<T>(
IList<ChatMessage> chatMessages,
ChatOptions? options = null,
bool? useNativeJsonSchema = null,
CancellationToken cancellationToken = default)
where T : class =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of whether we take this change, I'd be in favor of removing this constraint since it isn't a good predictor of the JSON shape of the type. We can instead just fail at runtime depending on the value of the corresponding JsonTypeInfo.Kind property.

Rather than relying on the type system, since a source-generated serializer options would not be able to deal with it.
@kzu
Copy link
Contributor Author

kzu commented Oct 16, 2024

Addressed feedback and improved code a bit.

@SteveSandersonMS
Copy link
Member

Thanks for your work on this, @kzu!

I'm not yet certain whether or not we'd add this feature, as per the comment at #5521 (comment). However, I do appreciate you've been able to do this without expanding the built-in prompt. And in many ways the end result is similar to what happens if the developer supplies a wrapper class manually.

One of the key questions that would impact whether we want to build this in is how the reliability would compare to if the developer provides their own wrapper class. A wrapper class has the advantage that its property is named semantically to reinforce what value is desired, whereas the auto-wrapper always calls the property data which is less clear.

It might be that it works fine, but like you say in #5521, we're looking for a "pit of success" here and so if the best results come from having a semantically-named property, we'd want to lead developers to do that even if it means a few more lines of code.

When it comes to reliability, I'm not too worried about OpenAI models (GPT 3.5T and later) since they seem to behave pretty solidly with almost any structured-output case, even the models that don't support native structured output. It's much more relevant to benchmark the reliability of small models, such as the 7B/8B parameter Lllama 2/3, Mistral, or phi3:medium. I found it pretty hard to get those to give valid structured output at all [1], since JSON schema is not a good way of describing the output format (examples are much better). Providing them with a JSON schema that's more abstract could add to the challenge.

If you have any quantification about the reliability of smaller models on tasks like "return a single enum value" or "return a single number" (which developers will often want to do) and how manual wrapping compares with autogenerating a wrapper, that could help with making a decision here. In the extreme, we might conclude that smaller models just aren't reliable for this in any case, which would lead to some other strategy around generating a JSON example instead of a JSON schema, and at that time loosen the rules about what value types you can request.

[1] Sidenote: it turns out that changing the prompt augmentation to use User messages instead of System messages makes the Ollama-based models much more compliant about structured output, so we should do that. But it's still hard for them.

@kzu
Copy link
Contributor Author

kzu commented Oct 16, 2024

Fair enough. May I request then that instead of an opaque result ("couldn't convert to schema"), the API instead detects this situation and errors with a meaningful error, stating that a wrapper IS REQUIRED otherwise things won't work? It's a super easy "bug" to hit and the current behavior isn't great.

@kzu
Copy link
Contributor Author

kzu commented Oct 16, 2024

On the semantic help you get when defining your own wrapper, I'd suggest you add this by default in the transform callback, since the description can add a lot of context for the model:

          TransformSchemaNode = (context, node) =>
          {
              var description = context.PropertyInfo?.AttributeProvider?.GetCustomAttributes(typeof(DescriptionAttribute), false)
                  .OfType<DescriptionAttribute>()
                  .FirstOrDefault()?.Description;

              if (description != null)
                  node["description"] = description;

              return node;
          },

Otherwise, you're just relying on the propery names alone. Should I report that as a separate issue?

@SteveSandersonMS
Copy link
Member

Otherwise, you're just relying on the propery names alone. Should I report that as a separate issue?

That would be great. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Arrays and primitive types are not supported in structured output (native or otherwise)
4 participants