
fix(openai): add structured output instrumentation #2111

Merged — 4 commits, Oct 15, 2024
@@ -432,7 +432,11 @@ def _set_completions(span, choices):
return

     _set_span_attribute(span, f"{prefix}.role", message.get("role"))
-    _set_span_attribute(span, f"{prefix}.content", message.get("content"))
+
+    if message.get("refusal"):
+        _set_span_attribute(span, f"{prefix}.refusal", message.get("refusal"))
+    else:
+        _set_span_attribute(span, f"{prefix}.content", message.get("content"))

function_call = message.get("function_call")
if function_call:
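The refusal branch above can be exercised in isolation. Below is a minimal sketch of the new logic: when a refusal is present it is recorded instead of the content attribute. The `pick_attributes` helper and the `gen_ai.completion.0` prefix are illustrative stand-ins, not the instrumentation's actual API.

```python
# Sketch of the patched branch: a refusal suppresses the content attribute.
# Helper name and attribute prefix are hypothetical, for illustration only.
def pick_attributes(message, prefix="gen_ai.completion.0"):
    attrs = {f"{prefix}.role": message.get("role")}
    if message.get("refusal"):
        attrs[f"{prefix}.refusal"] = message.get("refusal")
    else:
        attrs[f"{prefix}.content"] = message.get("content")
    return attrs

refused = pick_attributes({"role": "assistant", "refusal": "I can't do that."})
answered = pick_attributes({"role": "assistant", "content": "Sure!"})
```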
@@ -197,6 +197,32 @@ def _instrument(self, **kwargs):
"Assistants.create",
assistants_create_wrapper(tracer),
)
Review comment (Member): @9dogs we should wrap these in try-except since this may fail on old OpenAI SDK versions (this is why the tests are currently failing)

wrap_function_wrapper(
"openai.resources.beta.chat.completions",
"Completions.parse",
chat_wrapper(
tracer,
tokens_histogram,
chat_choice_counter,
duration_histogram,
chat_exception_counter,
streaming_time_to_first_token,
streaming_time_to_generate,
),
)
wrap_function_wrapper(
"openai.resources.beta.chat.completions",
"AsyncCompletions.parse",
achat_wrapper(
tracer,
tokens_histogram,
chat_choice_counter,
duration_histogram,
chat_exception_counter,
streaming_time_to_first_token,
streaming_time_to_generate,
),
)
wrap_function_wrapper(
"openai.resources.beta.threads.runs",
"Runs.create",
@@ -217,7 +243,7 @@ def _instrument(self, **kwargs):
"Messages.list",
messages_list_wrapper(tracer),
)
-        except AttributeError:
+        except (AttributeError, ModuleNotFoundError):
             pass

def _uninstrument(self, **kwargs):
1,452 changes: 844 additions & 608 deletions packages/opentelemetry-instrumentation-openai/poetry.lock

Large diffs are not rendered by default.

@@ -39,7 +39,7 @@ pytest = "^8.2.2"
pytest-sugar = "1.0.0"
vcrpy = "^6.0.1"
pytest-recording = "^0.13.1"
-openai = {extras = ["datalib"], version = "^1.31.1"}
+openai = {extras = ["datalib"], version = ">=1.50.0"}
opentelemetry-sdk = "^1.27.0"
pytest-asyncio = "^0.23.7"

@@ -0,0 +1,115 @@
interactions:
- request:
body: '{"messages": [{"role": "system", "content": "You are a poetic assistant,
skilled in explaining complex programming concepts with creative flair."}, {"role":
"user", "content": "Compose a poem that explains the concept of recursion in
programming."}], "model": "gpt-4o", "response_format": {"type": "json_schema",
"json_schema": {"schema": {"properties": {"poem": {"title": "Poem", "type":
"string"}, "style": {"title": "Style", "type": "string"}}, "required": ["poem",
"style"], "title": "StructuredAnswer", "type": "object", "additionalProperties":
false}, "name": "StructuredAnswer", "strict": true}}, "stream": false}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '620'
content-type:
- application/json
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.51.2
x-stainless-arch:
- arm64
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.51.2
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.7
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: !!binary |
H4sIAAAAAAAAA2xVTY/bRgy9+1cQuuSiXWydbXa7t6RtigApWjRBg6IuFvSIkpiMSHVI2XGDBfIj
cunfyy8pKHk/gvRiGEPy8b03HOrDCqDiprqCKvXoaRjzydOfZEu//kKvX1L/+8+v9/8cdPzlh7R7
drk+7Ks6KnT7lpLfVp0mHcZMzipLOBVCp0D95mJ9+e2Ti/V3l3Ng0IZylHWjn5zryfpsfX5ydnly
9uRY2CsnsuoK/lwBAHyYf4OiNPS+uoKz+vZkIDPsqLq6SwKoiuY4qdCMzVG8qu+DScVJZtYfNtWo
NGyqq031QsB7gn3PNlJh6WCv2hhoCwhj0a7gMFB5ZDCwNPVmIy+ZDBCSSqLRwXv0RwZb9R4WJ94D
SgPvWJrTzUZ+ozQVY5XPHz+xf/74r80NKfUKLJC14zSfNjogS3R401MhaCdJYapBwpyjaDDKu2je
IcvcZP4XXTYbeTFgx0KAMHApWgKdpMlkBoW6Qmb1zKfNlDyUshvlFmYsYGlZ2AmayAzMVwquCo1S
ABxVAA3bgonAezZoUBIF6lNIhSyRNBrWBWP7/PETYeqDORfQvUDqI/+Wr4M5FjfYs/eAd4JrMA4f
Z4UtcokGzyaf81iCNmy1OUDPzXIVk0joMcEyoz9f1EdiZiEDtluxJNYXFmqOrGdrdU7VfbQmHMKy
fAAS30fmke+PoYVlpwmDJWR+F2ZnwhbaogMgeKHZjWcFJfVBqZ2K9xRsXG/Hg/0QkH+QAwYvmKTV
3FgNCFs0ghQ/ewKhheZrhb8nWmYgggmbxRwjh8mgLXTH8g17r5Mv1xMstASu9TzCXqfcwJYgq3kA
v0LOwTJwjdAAHQaVaSBxzJDU/PSB9w/51QuZQ8oU7joOC9fvdRKfZ46dCi4DPElDJV6MF801SCQf
+T7H5FoYs9XwnLcqmBLX4LqnYmAKPXf9fFc5g2neUQPbw/081tDx7qhhgD1LZ3GfbT4cR3jAsqMc
yu5qHhlQpg7FYeCu92UUtoRJ4x1A0oalC9u6ieMv5Eg73VT1pjI/ZJpXx1PIh8IJ83wXAzmOvS4H
c1INI7LML+3++YTJ4DzQ7NH8fubytzoVoUMNk0XBjnfcAA/YUTmEIMpT4gadFt+X5XO6qW4e7rhC
7WQYK1amnI/nN3dLM2s3Ft3aMX53Hg/f+utCaCqxIM11rObozQrgr3k5T1/s22osOox+7fqOJAAv
Lxa46v5rcB9cXz4+Rl0d833g8cVxpX+Jd92QI2d7sN6rhKmn5r70bPVA3Ndd/w9iEcjSfYWyOiJV
djCn4bpl6aiMhZcPRjte4/oMz4ken59Xq5vVfwAAAP//AwC6gsp3OQcAAA==
headers:
CF-Cache-Status:
- DYNAMIC
CF-RAY:
- 8d06fd810889b018-BEG
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Thu, 10 Oct 2024 13:35:03 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
openai-organization:
- traceloop
openai-processing-ms:
- '12493'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
x-ratelimit-limit-requests:
- '10000'
x-ratelimit-limit-tokens:
- '30000000'
x-ratelimit-remaining-requests:
- '9999'
x-ratelimit-remaining-tokens:
- '29999939'
x-ratelimit-reset-requests:
- 6ms
x-ratelimit-reset-tokens:
- 0s
x-request-id:
- req_69ccccc4d1f064d1a237d11cf4438738
status:
code: 200
message: OK
version: 1
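The `json_schema` payload in the recorded request body above is derived from the test's Pydantic model. A sketch of how pydantic v2 produces that shape (the SDK's parse helper then wraps it and adds `additionalProperties: false` and `strict: true`), assuming pydantic is installed:

```python
from pydantic import BaseModel


class StructuredAnswer(BaseModel):
    poem: str
    style: str


# Produces the properties/required/title/type structure seen in the cassette.
schema = StructuredAnswer.model_json_schema()
```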
@@ -1,6 +1,7 @@
import pytest
from openai import OpenAI
from opentelemetry.semconv_ai import SpanAttributes, Meters
from pydantic import BaseModel


@pytest.fixture
@@ -73,6 +74,55 @@ def test_chat_completion_metrics(metrics_test_context, openai_client):
assert found_duration_metric is True


@pytest.mark.vcr
def test_chat_parsed_completion_metrics(metrics_test_context, openai_client):
_, reader = metrics_test_context

class StructuredAnswer(BaseModel):
poem: str
style: str

openai_client.beta.chat.completions.parse(
model="gpt-4o",
messages=[
{
"role": "system",
"content": "You are a poetic assistant, skilled in explaining complex programming concepts with "
"creative flair.",
},
{
"role": "user",
"content": "Compose a poem that explains the concept of recursion in programming.",
},
],
response_format=StructuredAnswer,
)

metrics_data = reader.get_metrics_data()
resource_metrics = metrics_data.resource_metrics
assert len(resource_metrics) > 0

found_token_metric = False
found_choice_metric = False
found_duration_metric = False

for rm in resource_metrics:
for sm in rm.scope_metrics:
for metric in sm.metrics:
for data_point in metric.data.data_points:
model = data_point.attributes.get(SpanAttributes.LLM_RESPONSE_MODEL)
if metric.name == Meters.LLM_TOKEN_USAGE and model == 'gpt-4o-2024-08-06':
found_token_metric = True
elif metric.name == Meters.LLM_GENERATION_CHOICES and model == 'gpt-4o-2024-08-06':
found_choice_metric = True
elif metric.name == Meters.LLM_OPERATION_DURATION and model == 'gpt-4o-2024-08-06':
found_duration_metric = True

assert found_token_metric
assert found_choice_metric
assert found_duration_metric


@pytest.mark.vcr
def test_chat_streaming_metrics(metrics_test_context, openai_client):
_, reader = metrics_test_context
@@ -158,7 +208,7 @@ def test_chat_streaming_metrics(metrics_test_context, openai_client):
)
         assert str(
             data_point.attributes[SpanAttributes.LLM_RESPONSE_MODEL]
-        ).startswith("gpt-3.5-turbo")
+        ) in ("gpt-3.5-turbo", "gpt-3.5-turbo-0125", "gpt-4o-2024-08-06")
assert data_point.attributes["gen_ai.operation.name"] == "chat"
assert data_point.attributes["server.address"] != ""

@@ -0,0 +1,102 @@
interactions:
- request:
body: '{"messages": [{"role": "user", "content": "Tell me a joke about opentelemetry"}],
"model": "gpt-4o", "response_format": {"type": "json_schema", "json_schema":
{"schema": {"properties": {"rating": {"title": "Rating", "type": "integer"},
"joke": {"title": "Joke", "type": "string"}}, "required": ["rating", "joke"],
"title": "StructuredAnswer", "type": "object", "additionalProperties": false},
"name": "StructuredAnswer", "strict": true}}, "stream": false}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '455'
content-type:
- application/json
host:
- api.openai.com
user-agent:
- AsyncOpenAI/Python 1.51.2
x-stainless-arch:
- arm64
x-stainless-async:
- async:asyncio
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.51.2
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.7
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: !!binary |
H4sIAAAAAAAAAwAAAP//bFLBbtswDL37KziencFO3DbzZVh32IAB62FFB3QeAkVmbDWyqElM2qDI
vxd23CRDe5AO7+k9kY98TgDQ1FgC6laJ7rydfPn2NTe7eHcV7rN/9xd5tl3Sj7vbn0+/br5fY9or
ePlAWl5VHzV33pIYdgdaB1JCvWt+NZ0X0/4MRMc12V7WeJkUPJlm02KSzSfZ5Shs2WiKWMKfBADg
ebj7El1NT1hClr4iHcWoGsLy+AgAA9seQRWjiaKcYHoiNTshN1T9XGFQYlxTYVmkFT7wmiosK/zd
7qA2NUhLUNOWLHsKsAyk1rDx8GikhRtP7pYsdSRh97mqXFW5a9JqEwloS2EHYjqClkCCoRqEQYLS
PRLADDUYdjGF2BI8qgjCDLyMFLbKCShXw5q8QMfOCAfjGmhNHK073tKHCvfnjQVabaLqc3Uba0d8
f0zKcuMDL+PIH/GVcSa2i0AqsutTicIeB3afAPwdJrL5L2T0gTsvC+E1ud7wcn6ww9MKnMhiNpLC
ouwJz/M8fcduUZMoY+PZSFEr3VJ9kmbJWW9vP33P4tCfcc0bl2R0wriLQt1iZVxDwQdzWJKVX9AF
FZ/yGc1nmOyTFwAAAP//AwDb6fbULQMAAA==
headers:
CF-Cache-Status:
- DYNAMIC
CF-RAY:
- 8cf936e59aa9b018-BEG
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Tue, 08 Oct 2024 21:27:23 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-expose-headers:
- X-Request-ID
openai-organization:
- traceloop
openai-processing-ms:
- '1580'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
x-ratelimit-limit-requests:
- '10000'
x-ratelimit-limit-tokens:
- '30000000'
x-ratelimit-remaining-requests:
- '9999'
x-ratelimit-remaining-tokens:
- '29999973'
x-ratelimit-reset-requests:
- 6ms
x-ratelimit-reset-tokens:
- 0s
x-request-id:
- req_35530e89431ff0752a7d6828e69ced3d
status:
code: 200
message: OK
version: 1
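The `!!binary` response bodies in these cassettes are gzip-compressed JSON stored as base64 (note the `Content-Encoding: gzip` header and the `H4sI` gzip magic prefix). A small stdlib-only sketch for inspecting one offline:

```python
import base64
import gzip
import json


def decode_cassette_body(b64_text: str):
    """Decode a vcrpy `!!binary` body: base64 -> gunzip -> JSON."""
    return json.loads(gzip.decompress(base64.b64decode(b64_text)))


# Round-trip demo with a stand-in payload (not the real response above).
blob = base64.b64encode(gzip.compress(json.dumps({"ok": True}).encode())).decode()
```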