Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP client span clarification and establishing HTTP connection spec #3234

Closed

Conversation

mateuszrzeszutek
Copy link
Member

@mateuszrzeszutek mateuszrzeszutek commented Feb 20, 2023

Fixes open-telemetry/semantic-conventions#1226

Ping @trask @lmolkova

I've added both the additional clarifications about the "low-level" vs "high-level" HTTP span, and the connection span spec. The connection span spec is derived from the current behavior of some of the Java HTTP client instrumentations.

I've made several clauses stronger (SHOULD changed to MUST) to show how I think the two ways of instrumenting HTTP clients should work in principle - consider this a topic for discussion.

Co-authored-by: Trask Stalnaker <[email protected]>
In case the request is resent, the resend attempts MUST follow the [HTTP resend spec](#http-request-retries-and-redirects).
Instrumentations MUST NOT emit a "logical" (encompassing) HTTP span.
Since establishing an HTTP connection happens before the client attempts to send an HTTP requests,
instrumentations SHOULD emit a [CONNECT span](#establishing-an-http-connection) whenever a new HTTP connection is made.
Copy link
Contributor

@lmolkova lmolkova Feb 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that connect span does not cover all cases. If we follow this approach, to catch other problems, we'd need DNS span and read-stream span and whatnot. I think they can be useful, but verbose and don't solve all problems.

Moreover, in scope of an encompassing span, there could be operations that don't involve any physical HTTP, TCP or TLS operation. E.g. when circuit-breaking or client-side throttling is used.

I suggest an alternative approach:

  1. if instrumentation can create HTTP spans for each attempt, it SHOULD. If it can also instrument logical encompassing operation, it MAY create it, but it's not covered in HTTP semconv v1 (and may be covered in v1.1+). It's a different span, it does not have HTTP client semantics

If instrumented HTTP client includes DNS and establishing connection into the first try, outer operation does not help much. When these things happen before the instrumentation point, instrumentation SHOULD add outer span too.

  1. if it can't, it SHOULD create HTTP span for topmost operation (as in the p2 there)

With this approach, we can also add CONNECT, DNS, and any other related events and exceptions into the outer operation.
I think the presence of the outer span should be configurable. E.g. if I use a well-known retry library or client SDK, they should provide outer operation instead of HTTP client instrumentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example (with a couple of other possible error points)

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that connect span does not cover all cases. If we follow this approach, to catch other problems, we'd need DNS span and read-stream span and whatnot. I think they can be useful, but verbose and don't solve all problems.

Yeah, that's true.

  1. if instrumentation can create HTTP spans for each attempt, it SHOULD. If it can also instrument logical encompassing operation, it MAY create it, but it's not covered in HTTP semconv v1 (and may be covered in v1.1+). It's a different span, it does not have HTTP client semantics

I like the idea of the encompassing span that serves as a sort of catch-all net. It worries me that we're not defining it at all though -- if I were to instrument over-the-wire HTTP requests (and the spec says I SHOULD if it's doable) then I would possibly limit the observability of our instrumentations (e.g. all the network/implementation related events/errors); and since the encompassing span is not really well defined I probably couldn't implement it in a stable instrumentation.

WDYT about emitting both the top-most HTTP span and the over-the-wire HTTP spans? Perhaps with an attribute/annotation explaining that it's a "logical" HTTP span.

  1. if it can't, it SHOULD create HTTP span for topmost operation (as in the p2 there)

👍

With this approach, we can also add CONNECT, DNS, and any other related events and exceptions into the outer operation.

👍

Copy link
Contributor

@lmolkova lmolkova Feb 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should somewhat define the outer span, but I don't know if it has to follow any semantic conventions and if it needs to be spec-ed out.

E.g. we can say something along these lines:

  • (when applicable...) instrumentation SHOULD create outer logical INTERNAL(?) span and record connection errors and events that happen before, after, or in between physical HTTP attempts. If no additional information about this logical operation is available, this span SHOULD have "Logical HTTP {VERB}" (?) name or the name of http client class and method called by user such as HttpClient.send.

If this span has HTTP semantics, it will appear in the results of all queries that match span name or attributes and would skew the results of such queries.
It's a common problem for any other network instrumentation, and essentially users may want to create such spans themselves to track the overall success/duration of logical operations, so my guess is that we don't need it to have any specific semantics beyond what general otel spec defines.

For example, if user puts @WithSpan around their function that prepares request, sends and retries it, and parses the response, it's good enough. And even though we might want to add more stuff on this span, we don't have to define it right away. Correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • (when applicable...) instrumentation SHOULD create outer logical INTERNAL(?) span and record connection errors and events that happen before, after, or in between physical HTTP attempts. If no additional information about this logical operation is available, this span SHOULD have "Logical HTTP {VERB}" (?) name or the name of http client class and method called by user such as HttpClient.send.

Hmm, I suppose we could use the code semconv for that outer span. At least they would provide some attributes for sampling.

If this span has HTTP semantics, it will appear in the results of all queries that match span name or attributes and would skew the results of such queries.

👍

For example, if user puts @WithSpan around their function that prepares request, sends and retries it, and parses the response, it's good enough. And even though we might want to add more stuff on this span, we don't have to define it right away. Correct?

Yeah, that sounds correct.

So, to sum up: "low-level" HTTP instrumentations SHOULD/MAY emit outer INTERNAL span with code semantics, enabled by default, users can opt-out per instrumentation lib.

@Oberon00
Copy link
Member

Since we are trying to stabilize these conventions: I think this needs to be added nicely separated as an (initially) experimental add-on spec.

In case the request is resent, the resend attempts MUST follow the [HTTP resend spec](#http-request-retries-and-redirects).
Instrumentations MUST NOT emit a "logical" (encompassing) HTTP span.
Since establishing an HTTP connection happens before the client attempts to send an HTTP requests,
instrumentations SHOULD emit a [CONNECT span](#establishing-an-http-connection) whenever a new HTTP connection is made.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this name is confusing, since there is also the HTTP verb "CONNECT", which presumably is unrelated.

In particular, if the connection was not established due to some network failure, the HTTP client might return with an error before any HTTP client span is emitted.

Spans emitted when an HTTP connection is established should simply be named `CONNECT`.
CONNECT spans are not HTTP spans; they do not represent an outgoing HTTP requests and SHOULD NOT contain any `http.*` attributes.
Copy link
Member

@Oberon00 Oberon00 Feb 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CONNECT spans are not HTTP spans

Then, shouldn't we define such a span in some more general network semantic conventions?

@mateuszrzeszutek
Copy link
Member Author

Per HTTP SIG meeting: I'll split this PR into 2 and implement @lmolkova 's idea about the encompassing span (instead the CONNECT span)

@github-actions
Copy link

github-actions bot commented Mar 4, 2023

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Mar 12, 2023
carlosalberto pushed a commit that referenced this pull request Apr 14, 2023
Related to
https://github.com/open-telemetry/opentelemetry-specification/issues/3155
and
#3234

## Changes

This PR contains the less controversial parts of
#3234;
it describes how the `http.resend_count` attribute should be used, and
proposes two ways of instrumenting HTTP clients.
jsuereth pushed a commit to jsuereth/otel-semconv-test that referenced this pull request Apr 19, 2023
Related to
https://github.com/open-telemetry/opentelemetry-specification/issues/3155
and
open-telemetry/opentelemetry-specification#3234

## Changes

This PR contains the less controversial parts of
open-telemetry/opentelemetry-specification#3234;
it describes how the `http.resend_count` attribute should be used, and
proposes two ways of instrumenting HTTP clients.
jsuereth pushed a commit to open-telemetry/semantic-conventions that referenced this pull request May 11, 2023
Related to
https://github.com/open-telemetry/opentelemetry-specification/issues/3155
and
open-telemetry/opentelemetry-specification#3234

## Changes

This PR contains the less controversial parts of
open-telemetry/opentelemetry-specification#3234;
it describes how the `http.resend_count` attribute should be used, and
proposes two ways of instrumenting HTTP clients.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Establishing HTTP connection spec
4 participants