Default routing to receiver-specific data streams #34246

Open
felixbarny opened this issue Jul 25, 2024 · 3 comments · May be fixed by #35417
@felixbarny
Contributor

felixbarny commented Jul 25, 2024

Component(s)

exporter/elasticsearch

Is your feature request related to a problem? Please describe.

When dynamic indexing to data streams is enabled, we currently route signals to <type>-generic.otel-default, for example logs-generic.otel-default. The problem is that data from all receivers ends up in the same data streams. We should instead separate the data according to the data stream naming scheme, without risking a data stream explosion.

Because this changes the default routing, we should implement it before GA: changing the data streams afterwards could be considered a breaking change.

Describe the solution you'd like

If scope.name matches the regex \/receiver\/(\w*receiver), the dataset is set to capture group 1 ($1). For example, the scope name github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/internal/scraper/cpuscraper yields the dataset hostmetricsreceiver (or hostmetricsreceiver.otel in the OTel output mode). This ensures that metrics from well-known receivers don't all end up in metrics-generic.otel-default.

We deliberately don't route by arbitrary scope names. Tracing instrumentations typically set a different scope name for each instrumented library, and scope names can be user-defined with unknown cardinality, so routing on them would risk an explosion of data streams. Instead, we route only on receivers by default, where granularity and cardinality are limited and match the data stream naming scheme well.
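To make the proposed rule concrete, here is a minimal Go sketch of the scope-name matching. It is not the exporter's actual code: the helper names datasetFor and dataStreamFor are hypothetical, and the "generic" fallback, the ".otel" suffix, and the <type>-<dataset>-<namespace> layout are assumptions taken from the description above.

```go
package main

import (
	"fmt"
	"regexp"
)

// receiverScope is the regex from the proposal. Go's regexp syntax
// does not require the forward slashes to be escaped.
var receiverScope = regexp.MustCompile(`/receiver/(\w*receiver)`)

// datasetFor is a hypothetical helper: it returns capture group 1 when the
// scope name points at a receiver, and falls back to "generic" otherwise.
// In OTel output mode the ".otel" suffix is appended (assumed from the issue text).
func datasetFor(scopeName string, otelMode bool) string {
	dataset := "generic"
	if m := receiverScope.FindStringSubmatch(scopeName); m != nil {
		dataset = m[1]
	}
	if otelMode {
		dataset += ".otel"
	}
	return dataset
}

// dataStreamFor assembles <type>-<dataset>-<namespace> (hypothetical helper).
func dataStreamFor(signalType, scopeName, namespace string, otelMode bool) string {
	return fmt.Sprintf("%s-%s-%s", signalType, datasetFor(scopeName, otelMode), namespace)
}

func main() {
	scopes := []string{
		// Receiver scope: routed to a receiver-specific data stream.
		"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/internal/scraper/cpuscraper",
		// Arbitrary instrumentation scope: stays in the generic data stream.
		"my.custom.instrumentation",
	}
	for _, s := range scopes {
		fmt.Println(dataStreamFor("metrics", s, "default", true))
	}
}
```

With these assumptions, the first scope maps to metrics-hostmetricsreceiver.otel-default, while the second stays in metrics-generic.otel-default, so user-defined scope names cannot multiply the number of data streams.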

Describe alternatives you've considered

No response

Additional context

No response

@felixbarny felixbarny added enhancement New feature or request needs triage New item requiring triage labels Jul 25, 2024
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Sep 24, 2024
@lahsivjar
Member

/label -needs-triage enhancement

2 participants