Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow GC to collect unneeded slice elements #5804

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ash2k
Copy link
Contributor

@ash2k ash2k commented Sep 11, 2024

type interInst struct {
	x int
}

type inter interface {
}

var sink []inter

func BenchmarkX(b *testing.B) {
	sink = make([]inter, b.N)
	for i := 0; i < b.N; i++ {
		sink[i] = &interInst{}
	}
	clear(sink)
	sink = sink[:0]
	runtime.GC()
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	b.Log(b.N, ms.Frees) // Frees is the cumulative count of heap objects freed.
}
clear:
    ioz_test.go:35: 1 589
    ioz_test.go:35: 100 711
    ioz_test.go:35: 10000 10729
    ioz_test.go:35: 1000000 1010750  <-- 1m+ freed
    ioz_test.go:35: 16076874 17087643
    ioz_test.go:35: 19514749 36602412
no clear:
    ioz_test.go:35: 1 585
    ioz_test.go:35: 100 606
    ioz_test.go:35: 10000 725
    ioz_test.go:35: 1000000 10745  <-- some "overhead" objects freed, not the slice.
    ioz_test.go:35: 16391445 1010765
    ioz_test.go:35: 21765238 17402230

This is documented at https://go.dev/wiki/SliceTricks:

NOTE If the type of the element is a pointer or a struct with pointer fields, which need to be garbage collected, the above implementations of Cut and Delete have a potential memory leak problem: some elements with values are still referenced by slice a’s underlying array, just not “visible” in the slice. Because the “deleted” value is referenced in the underlying array, the deleted value is still “reachable” during GC, even though the value cannot be referenced by your code. If the underlying array is long-lived, this represents a leak.

... followed by examples of how zeroing out the slice elements solves the problem. This PR does the same.

Copy link

codecov bot commented Sep 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.5%. Comparing base (8e9baf2) to head (13656a8).

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##            main   #5804     +/-   ##
=======================================
- Coverage   84.5%   84.5%   -0.1%     
=======================================
  Files        272     272             
  Lines      22804   22810      +6     
=======================================
+ Hits       19287   19292      +5     
- Misses      3171    3172      +1     
  Partials     346     346             

see 5 files with indirect coverage changes

@@ -280,6 +280,7 @@ func (bsp *batchSpanProcessor) exportSpans(ctx context.Context) error {
//
// It is up to the exporter to implement any type of retry logic if a batch is failing
// to be exported, since it is specific to the protocol and backend being sent to.
clear(bsp.batch) // Let GC collect objects
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that not clearing via clear(bsp.batch) and doing bsp.batch = bsp.batch[:0] below is intentional to reuse memory and reduce number of heap allocation.

Copy link
Contributor Author

@ash2k ash2k Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two kinds of memory reuse.

If we did bsp.batch = nil that would indeed throw away the slice's backing array. Instead, we reuse that backing array by re-slicing the slice to have zero length but keeping the capacity and array (bsp.batch[:0]).

clear() does not change the above.

What clear() does is it erases the backing array elements (from zero to len()) with zero values for the array type. IIRC (I'm not looking at the code right now) the type is an interface. So, instead of pointers to objects, we have nils there. And, hence, GC can collect those objects if they are not referenced anywhere anymore. Free memory can be reused.

If the slice was of a scalar type (e.g. int), then clear() would not do anything useful for GC. But for interfaces it does.

I hope this helps to understand why I think this change would be beneficial. Or did I misunderstand your concern?

Copy link
Member

@pellared pellared Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Thanks for a great description.

Can you add benchstat results to PR description to compare the performance difference caused by the change?

I think the same improvement can be done in other places where [:0] is used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a benchmark to the description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pellared I've looked at other [:0] usage and made a few more improvements. PTAL.

@ash2k ash2k changed the title Allow GC to collect exported spans Allow GC to collect unneeded slice elements Oct 9, 2024
// not pointers and they themselves do not contain pointer fields,
// therefore the duplicate values do not need to be zeroed for them to be
// garbage collected.
clear(s.attributes[len(unique):]) // Erase unneeded elements to let GC collect objects
Copy link
Contributor Author

@ash2k ash2k Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is technically correct but the conclusion is not. While there are no pointers, there are string and interface{} fields there:

  • Strings contain pointers to their backing arrays that contain the actual bytes. Those arrays cannot be GCed if the string (with an embedded/hidden pointer) is reachable.
  • interface{} can contain anything. Everything, except for pointers, in an interface is "boxed" - allocated on the heap and a pointer to that data is stored in the interface{}. If it's a pointer already, then it's not "boxed" but... it's a pointer and the destination cannot be GCed. So, in both cases we need to clear those interface{} fields.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants