Skip to content

Commit

Permalink
Merge branch 'main' into content-length-deprecation-note
Browse files Browse the repository at this point in the history
  • Loading branch information
arminru authored Oct 7, 2024
2 parents 8ec5be4 + 6e77ed5 commit d52588b
Show file tree
Hide file tree
Showing 29 changed files with 1,430 additions and 215 deletions.
22 changes: 22 additions & 0 deletions .chloggen/1437.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
# Use this changelog template to create an entry for release notes.
#
# If your change doesn't affect end users you should instead start
# your pull request title with [chore] or use the "Skip Changelog" label.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db)
component: db

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Recommend to capture `db.namespace` from initial connection over not capturing any, also specify `db.namespace` value for PostgreSQL, MySQL and MariaDB

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
# The values here must be integers.
issues: [ 1437 ]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:
4 changes: 4 additions & 0 deletions .chloggen/980.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
change_type: breaking
component: gen_ai
note: Deprecate `gen_ai.prompt` and `gen_ai.completion` attributes, introduce log-based events for GenAI inputs and outputs.
issues: [834, 980]
4 changes: 4 additions & 0 deletions .chloggen/add_openai_specific_attributes.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
change_type: enhancement
component: gen_ai
note: Add system specific conventions for OpenAI.
issues: [1370]
4 changes: 4 additions & 0 deletions .chloggen/size-attributes-opt-in.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
change_type: breaking
component: messaging
note: Mark *.size messaging attributes as Opt-In
issues: [474]
87 changes: 55 additions & 32 deletions docs/attributes-registry/gen-ai.md

Large diffs are not rendered by default.

27 changes: 23 additions & 4 deletions docs/database/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,11 +13,30 @@ This document defines semantic conventions for database client spans as well as
database metrics and logs.

> **Warning**
>
> Existing database instrumentations that are using
> [v1.24.0 of this document](https://github.com/open-telemetry/semantic-conventions/blob/v1.24.0/docs/database/README.md)
> (or prior) SHOULD NOT change the version of the database conventions that they emit by default
> until a transition plan to the (future) stable semantic conventions has been published.
> Conventions include, but are not limited to, attributes, metric and span names, and unit of measure.
> [v1.24.0 of this document](https://github.com/open-telemetry/semantic-conventions/blob/v1.24.0/docs/database/database-spans.md)
> (or prior):
>
> * SHOULD NOT change the version of the database conventions that they emit by default
> until the database semantic conventions are marked stable.
> Conventions include, but are not limited to, attributes,
> metric and span names, and unit of measure.
> * SHOULD introduce an environment variable `OTEL_SEMCONV_STABILITY_OPT_IN`
> in the existing major version which is a comma-separated list of values.
> If the list of values includes:
> * `database` - emit the new, stable database conventions,
> and stop emitting the old experimental database conventions
> that the instrumentation emitted previously.
> * `database/dup` - emit both the old and the stable database conventions,
> allowing for a seamless transition.
> * The default behavior (in the absence of one of these values) is to continue
> emitting whatever version of the old experimental database conventions
> the instrumentation was emitting previously.
> * Note: `database/dup` has higher precedence than `database` in case both values are present
> * SHOULD maintain (security patching at a minimum) the existing major version
> for at least six months after it starts emitting both sets of conventions.
> * SHOULD drop the environment variable in the next major version.
Semantic conventions for database operations are defined for the following signals:

Expand Down
6 changes: 4 additions & 2 deletions docs/database/cassandra.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@ The Semantic Conventions for [Cassandra](https://cassandra.apache.org/) extend a
| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability |
|---|---|---|---|---|---|
| [`db.collection.name`](/docs/attributes-registry/db.md) | string | The name of the Cassandra table that the operation is acting upon. [1] | `public.users`; `customers` | `Conditionally Required` [2] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| [`db.namespace`](/docs/attributes-registry/db.md) | string | The Cassandra keyspace name. [3] | `mykeyspace` | `Conditionally Required` If available. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| [`db.namespace`](/docs/attributes-registry/db.md) | string | The keyspace associated with the session. [3] | `mykeyspace` | `Conditionally Required` If available. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| [`db.operation.name`](/docs/attributes-registry/db.md) | string | The name of the operation or command being executed. [4] | `findAndModify`; `HMSET`; `SELECT` | `Conditionally Required` [5] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| [`db.response.status_code`](/docs/attributes-registry/db.md) | string | [Cassandra protocol error code](https://github.com/apache/cassandra/blob/cassandra-5.0/doc/native_protocol_v5.spec) represented as a string. [6] | `102`; `40020` | `Conditionally Required` [7] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [8] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
Expand All @@ -45,7 +45,9 @@ For batch operations, if the individual operations are known to have the same co

**[2]:** If readily available. The collection name MAY be parsed from the query text, in which case it SHOULD be the first collection name found in the query.

**[3]:** For commands that switch the keyspace, this SHOULD be set to the target keyspace (even if the command fails).
**[3]:** If a database system has multiple namespace components, they SHOULD be concatenated (potentially using database system specific conventions) from most general to most specific namespace component, and more specific namespaces SHOULD NOT be captured without the more general namespaces, to ensure that "startswith" queries for the more general namespaces will be valid.
Semantic conventions for individual database systems SHOULD document what `db.namespace` means in the context of that system.
It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization.

**[4]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization.
If the operation name is parsed from the query text, it SHOULD be the first operation name found in the query.
Expand Down
4 changes: 3 additions & 1 deletion docs/database/hbase.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,9 @@ The Semantic Conventions for [HBase](https://hbase.apache.org/) extend and overr

**[1]:** If table name includes the namespace, the `db.collection.name` SHOULD be set to the full table name.

**[2]:** When performing table-related operations, the instrumentations SHOULD extract the namespace from the table name according to the [HBase table naming conventions](https://hbase.apache.org/book.html#namespace_creation). If namespace is not provided, instrumentation SHOULD set `db.namespace` value to `default`.
**[2]:** If a database system has multiple namespace components, they SHOULD be concatenated (potentially using database system specific conventions) from most general to most specific namespace component, and more specific namespaces SHOULD NOT be captured without the more general namespaces, to ensure that "startswith" queries for the more general namespaces will be valid.
Semantic conventions for individual database systems SHOULD document what `db.namespace` means in the context of that system.
It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization.

**[3]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization.
If the operation name is parsed from the query text, it SHOULD be the first operation name found in the query.
Expand Down
135 changes: 135 additions & 0 deletions docs/database/mariadb.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,135 @@
<!--- Hugo front matter used to generate the website version of this page:
linkTitle: MariaDB
--->

# Semantic Conventions for MariaDB

**Status**: [Experimental][DocumentStatus]

The Semantic Conventions for *MariaDB* extend and override the [Database Semantic Conventions](database-spans.md).

`db.system` MUST be set to `"mariadb"` and SHOULD be provided **at span creation time**.

## Attributes

<!-- semconv db.mariadb -->
<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
<!-- see templates/registry/markdown/snippet.md.j2 -->
<!-- prettier-ignore-start -->
<!-- markdownlint-capture -->
<!-- markdownlint-disable -->

| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability |
|---|---|---|---|---|---|
| [`db.collection.name`](/docs/attributes-registry/db.md) | string | The name of the SQL table that the operation is acting upon. [1] | `users`; `dbo.products` | `Conditionally Required` [2] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| [`db.namespace`](/docs/attributes-registry/db.md) | string | The database associated with the connection. [3] | `products`; `customers` | `Conditionally Required` If available without an additional network call. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| [`db.operation.name`](/docs/attributes-registry/db.md) | string | The name of the operation or command being executed. [4] | `SELECT`; `INSERT`; `UPDATE`; `DELETE`; `CREATE`; `mystoredproc` | `Conditionally Required` [5] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| [`db.response.status_code`](/docs/attributes-registry/db.md) | string | [Maria DB error code](https://mariadb.com/kb/en/mariadb-error-code-reference/) represented as a string. [6] | `1008`; `3058` | `Conditionally Required` If response has ended with warning or an error. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [7] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [8] | `80`; `8080`; `443` | `Conditionally Required` [9] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [10] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [11] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [12] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`db.query.parameter.<key>`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `<key>` being the parameter name, and the attribute value being a string representation of the parameter value. [13] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |

**[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization.
If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix.
For batch operations, if the individual operations are known to have the same collection name then that collection name SHOULD be used, otherwise `db.collection.name` SHOULD NOT be captured.

**[2]:** If readily available. The collection name MAY be parsed from the query text, in which case it SHOULD be the first collection name found in the query.

**[3]:** A connection's currently associated database may change during its lifetime, e.g. from executing `USE <database>`.

If instrumentation is unable to capture the connection's currently associated database on each query
without triggering an additional query to be executed (e.g. `SELECT DATABASE()`),
then it is RECOMMENDED to fallback and use the database provided when the connection was established.

Instrumentation SHOULD document if `db.namespace` reflects the database provided when the connection was established.

It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization.

**[4]:** This SHOULD be the SQL command such as `SELECT`, `INSERT`, `UPDATE`, `CREATE`, `DROP`.
In the case of `EXEC`, this SHOULD be the stored procedure name that is being executed.

**[5]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query.

**[6]:** SQL defines [SQLSTATE](https://wikipedia.org/wiki/SQLSTATE) as a database
return code which is adopted by some database systems like PostgreSQL.
See [PostgreSQL error codes](https://www.postgresql.org/docs/current/errcodes-appendix.html)
for the details.

Other systems like MySQL, Oracle, or MS SQL Server define vendor-specific
error codes. Database SQL drivers usually provide access to both properties.
For example, in Java, the [`SQLException`](https://docs.oracle.com/javase/8/docs/api/java/sql/SQLException.html)
class reports them with `getSQLState()` and `getErrorCode()` methods.

Instrumentations SHOULD populate the `db.response.status_code` with the
the most specific code available to them.

Here's a non-exhaustive list of databases that report vendor-specific
codes with granularity higher than SQLSTATE (or don't report SQLSTATE
at all):

- [DB2 SQL codes](https://www.ibm.com/docs/db2-for-zos/12?topic=codes-sql).
- [Maria DB error codes](https://mariadb.com/kb/en/mariadb-error-code-reference/)
- [Microsoft SQL Server errors](https://docs.microsoft.com/sql/relational-databases/errors-events/database-engine-events-and-errors)
- [MySQL error codes](https://dev.mysql.com/doc/mysql-errors/9.0/en/error-reference-introduction.html)
- [Oracle error codes](https://docs.oracle.com/cd/B28359_01/server.111/b28278/toc.htm)
- [SQLite result codes](https://www.sqlite.org/rescode.html)

These systems SHOULD set the `db.response.status_code` to a
known vendor-specific error code. If only SQLSTATE is available,
it SHOULD be used.

When multiple error codes are available and specificity is unclear,
instrumentation SHOULD set the `db.response.status_code` to the
concatenated string of all codes with '/' used as a separator.

For example, generic DB instrumentation that detected an error and has
SQLSTATE `"42000"` and vendor-specific `1071` should set
`db.response.status_code` to `"42000/1071"`."

**[7]:** The `error.type` SHOULD match the `db.response.status_code` returned by the database or the client library, or the canonical name of exception that occurred.
When using canonical exception type name, instrumentation SHOULD do the best effort to report the most relevant type. For example, if the original exception is wrapped into a generic one, the original exception SHOULD be preferred.
Instrumentations SHOULD document how `error.type` is populated.

**[8]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available.

**[9]:** If using a port other than the default port for this DBMS and if `server.address` is set.

**[10]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext).
For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable.
Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk.

**[11]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext).

**[12]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.

**[13]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders.
If a parameter has no name and instead is referenced only by index, then `<key>` SHOULD be the 0-based index.



The following attributes can be important for making sampling decisions
and SHOULD be provided **at span creation time** (if provided at all):

* [`db.collection.name`](/docs/attributes-registry/db.md)
* [`db.namespace`](/docs/attributes-registry/db.md)
* [`db.operation.name`](/docs/attributes-registry/db.md)
* [`db.query.text`](/docs/attributes-registry/db.md)
* [`server.address`](/docs/attributes-registry/server.md)
* [`server.port`](/docs/attributes-registry/server.md)

`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |



<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- END AUTOGENERATED TEXT -->
<!-- endsemconv -->

[DocumentStatus]: https://opentelemetry.io/docs/specs/otel/document-status
Loading

0 comments on commit d52588b

Please sign in to comment.