
Multiple periodic readers increment the counter abnormally #5866

Open
dashpole opened this issue Oct 2, 2024 · 6 comments · May be fixed by #5900
Labels: area:metrics (Part of OpenTelemetry Metrics), bug (Something isn't working)

Comments

dashpole (Contributor) commented Oct 2, 2024

Description

When multiple (periodic) readers are present, metrics increase abnormally.

Originally reported here: open-telemetry/opentelemetry-collector#11327

Steps To Reproduce

Run the OpenTelemetry collector with the configuration provided in the bug above.

Expected behavior

Metrics should not increase abnormally.

dashpole (Contributor, Author) commented Oct 2, 2024

We should try to create a simpler reproduction (not using the collector) to make this easier to root-cause.

pree-dew (Contributor) commented Oct 3, 2024

@dashpole Can I pick up this task?

dashpole (Contributor, Author) commented Oct 3, 2024

@pree-dew feel free to work on this. It is likely going to be tricky to root-cause, but any help is appreciated.

pree-dew (Contributor) commented Oct 4, 2024

Able to reproduce the issue by configuring multiple periodic readers in the collector's telemetry settings.

[Screenshot attached showing the inflated counter values]

Will try to find a way to reproduce outside of the otel-collector.

pree-dew (Contributor) commented Oct 7, 2024

Able to reproduce this with the SDK.
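
A minimal reproduction along these lines (an illustrative sketch, not necessarily the exact code used; the stdout exporter, the 10s interval, and the names are assumptions):

```go
package main

import (
	"context"
	"log"
	"time"

	"go.opentelemetry.io/otel/exporters/stdout/stdoutmetric"
	"go.opentelemetry.io/otel/metric"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {
	ctx := context.Background()

	// One stdout exporter per reader so each pipeline's output is visible.
	exp1, err := stdoutmetric.New()
	if err != nil {
		log.Fatal(err)
	}
	exp2, err := stdoutmetric.New()
	if err != nil {
		log.Fatal(err)
	}

	// Two periodic readers on the same MeterProvider, mirroring the
	// multi-reader telemetry setup from the collector report.
	mp := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(sdkmetric.NewPeriodicReader(exp1, sdkmetric.WithInterval(10*time.Second))),
		sdkmetric.WithReader(sdkmetric.NewPeriodicReader(exp2, sdkmetric.WithInterval(10*time.Second))),
	)
	defer func() { _ = mp.Shutdown(ctx) }()

	// The callback always reports the same pre-computed cumulative total: 3.
	const reqCount = int64(3)
	_, err = mp.Meter("http").Int64ObservableCounter(
		"http_requests_total",
		metric.WithUnit("count"),
		metric.WithInt64Callback(func(_ context.Context, o metric.Int64Observer) error {
			o.Observe(reqCount)
			return nil
		}),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Let a few collection cycles run. Both readers should report Value: 3,
	// but the values compound across pipelines instead (e.g. 4 and 12).
	time.Sleep(35 * time.Second)
}
```

Output from the two readers: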

========== Reader 1 ==========
{
        "Resource": [
                {
                        "Key": "service.name",
                        "Value": {
                                "Type": "STRING",
                                "Value": "myapp"
                        }
                }
        ],
        "ScopeMetrics": [
                {
                        "Scope": {
                                "Name": "http",
                                "Version": "",
                                "SchemaURL": ""
                        },
                        "Metrics": [
                                {
                                        "Name": "http_requests_total",
                                        "Description": "TODO",
                                        "Unit": "count",
                                        "Data": {
                                                "DataPoints": [
                                                        {
                                                                "Attributes": [],
                                                                "StartTime": "2024-10-07T16:10:11.277202+05:30",
                                                                "Time": "2024-10-07T16:19:41.280994+05:30",
                                                                "Value": 4
                                                        }
                                                ],
                                                "Temporality": "CumulativeTemporality",
                                                "IsMonotonic": true
                                        }
                                }
                        ]
                }
        ]
}
========== Reader 2 ==========
{
	"Resource": [
		{
			"Key": "service.name",
			"Value": {
				"Type": "STRING",
				"Value": "myapp"
			}
		}
	],
	"ScopeMetrics": [
		{
			"Scope": {
				"Name": "http",
				"Version": "",
				"SchemaURL": ""
			},
			"Metrics": [
				{
					"Name": "http_requests_total",
					"Description": "TODO",
					"Unit": "count",
					"Data": {
						"DataPoints": [
							{
								"Attributes": [],
								"StartTime": "2024-10-07T16:10:11.277205+05:30",
								"Time": "2024-10-07T16:19:41.281038+05:30",
								"Value": 12
							}
						],
						"Temporality": "CumulativeTemporality",
						"IsMonotonic": true
					}
				}
			]
		}
	]
}

At the same time, the same metric shows values of 4 and 12, while the correct value is 3. The same setup gives the correct value with a single reader.

So far, the observation is that it happens with ObservableCounter, but I will give it one more pass to be sure.

pree-dew (Contributor) commented Oct 19, 2024

@dashpole I have added a test case to reproduce the behaviour; it will fail as of now but will pass once the fix goes out.
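
The test follows roughly this shape (a sketch of the idea only; the actual test in the PR may differ, and the names here are illustrative):

```go
package repro_test

import (
	"context"
	"testing"

	"go.opentelemetry.io/otel/metric"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
	"go.opentelemetry.io/otel/sdk/metric/metricdata"
)

func TestObservableCounterWithTwoReaders(t *testing.T) {
	ctx := context.Background()

	// Two readers -> two pipelines on the same provider.
	r1, r2 := sdkmetric.NewManualReader(), sdkmetric.NewManualReader()
	mp := sdkmetric.NewMeterProvider(sdkmetric.WithReader(r1), sdkmetric.WithReader(r2))
	t.Cleanup(func() { _ = mp.Shutdown(ctx) })

	const want = int64(3) // the pre-computed cumulative total
	_, err := mp.Meter("test").Int64ObservableCounter("requests",
		metric.WithInt64Callback(func(_ context.Context, o metric.Int64Observer) error {
			o.Observe(want)
			return nil
		}),
	)
	if err != nil {
		t.Fatal(err)
	}

	// Every reader must see exactly the value the callback reported.
	for i, r := range []sdkmetric.Reader{r1, r2} {
		var rm metricdata.ResourceMetrics
		if err := r.Collect(ctx, &rm); err != nil {
			t.Fatal(err)
		}
		sum := rm.ScopeMetrics[0].Metrics[0].Data.(metricdata.Sum[int64])
		if got := sum.DataPoints[0].Value; got != want {
			t.Errorf("reader %d: got %d, want %d", i+1, got, want)
		}
	}
}
```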

I have debugged the issue; this is my understanding:

Step 1: Say the value of the variable (var reqCount = 1) used for the Int64ObservableCounter instrument is 1.

Step 2: Pipeline 1 starts -> calls the registered callback -> calls the measure function -> sets the value to 1, as the previous value is initially 0 (ref).

Step 3: Pipeline 2 starts -> calls the registered callback -> calls the measure function -> picks up the value set by pipeline 1 (1) and adds the value (reqCount) to it, making it 2.

Step 4: When the .Collect method is called, it gets 2 instead of 1. Since this is a precomputed sum, the expected value (as per reqCount) is 1, not 2. [Bug!]

This happens because the instrument is the same for both pipelines: before the previous value gets cleared (here), the next pipeline picks up the previous value set by pipeline 1 and adds to it, thereby making the value incorrect.
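
A toy model of that sequence (illustrative only; these are not the SDK's actual types), with reqCount = 1 as in Step 1:

```go
package main

import "fmt"

// precomputedSum is a stand-in for one pipeline's aggregator. Its measure
// is additive, which is what makes the sequence above go wrong.
type precomputedSum struct{ value int64 }

func (s *precomputedSum) measure(v int64) { s.value += v }

// collect reads and clears this pipeline's value.
func (s *precomputedSum) collect() int64 {
	v := s.value
	s.value = 0
	return v
}

func main() {
	reqCount := int64(1)

	// The instrument is shared, so one callback run fans the observation
	// out to the measures of both pipelines.
	p1, p2 := &precomputedSum{}, &precomputedSum{}
	callback := func() {
		p1.measure(reqCount)
		p2.measure(reqCount)
	}

	// Pipeline 1 collects: the callback adds 1 to BOTH aggregators.
	callback()
	fmt.Println("pipeline 1:", p1.collect()) // 1 (correct)

	// Pipeline 2 collects: the callback adds 1 again, on top of the 1
	// that pipeline 1's cycle already left behind in p2.
	callback()
	fmt.Println("pipeline 2:", p2.collect()) // 2, expected 1 (the bug)
}
```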

With the above understanding, I have been able to reproduce the issue.

Let me know what you think about this.
