Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(profilecli): disable aggregate-callees in go-pgo query by default #3638

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

JansonLv
Copy link

In the default configuration related to profilecli query go-pgo, the exported PGO file does not optimize or improve performance. Using --no-aggregate-callees results in a performance boost, but the prompt for the aggregate-callees parameter is not user-friendly and may mislead users into thinking they should use --aggregate-callees=false. Therefore, I suggest changing the default value of aggregate-callees to false and adding relevant prompts.

@JansonLv JansonLv requested a review from a team as a code owner October 18, 2024 03:08
@CLAassistant
Copy link

CLAassistant commented Oct 18, 2024

CLA assistant check
All committers have signed the CLA.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@kolesnikovae
Copy link
Collaborator

kolesnikovae commented Oct 18, 2024

Hi @JansonLv, thanks for the contribution!

Indeed, I believe a clarification could improve UX. The --no-X semantics is enforced with the CLI framework we use, and I think I should have added a clarification regarding --no-aggregate-callees.

However, I have a couple of questions:

In the default configuration related to profilecli query go-pgo, the exported PGO file does not optimize or improve performance. Using --no-aggregate-callees results in a performance boost

I'd like to learn more about the case:

  • Which Go version you're using
  • Can we get profiles exported with and without aggregate-callees option? I understand that profiles may include sensitive data; but dropping strings from the pprof file should help (I can provide you a script that does this)

For the context: the go-pgo optimization is inspired by the discussion and the snippet by Michael Pratt (Go team). I suspect that in the most recent Go version, it may be harmful. Another possibility is that there is a bug in how the callees are being aggregated.

@kolesnikovae kolesnikovae self-requested a review October 18, 2024 06:35
@kolesnikovae kolesnikovae self-assigned this Oct 18, 2024
@JansonLv
Copy link
Author

JansonLv commented Oct 18, 2024

@kolesnikovae Hi, thank you for your reply. I am using version go1.22.7.

@kolesnikovae
Copy link
Collaborator

Thank you @JansonLv! I'll double check that our go-pgo works as expected on go tip. In the meantime, you can find the script in the cmd/trimstrings branch – the simplest way to run it is to clone pyroscope repo and run

go install ./cmd/trimstrings

@kolesnikovae kolesnikovae changed the title fix the PGO issue, modify the default value of aggregate-callees and … fix(plrofilecli): disable aggregate-callees in go-pgo query by default Oct 18, 2024
@kolesnikovae kolesnikovae changed the title fix(plrofilecli): disable aggregate-callees in go-pgo query by default fix(profilecli): disable aggregate-callees in go-pgo query by default Oct 18, 2024
@JansonLv
Copy link
Author

@kolesnikovae thanks,Please check if this is the file you are looking for.
with_agg_profile_clean.pb.gz
without_agg_profile_clean.pb.gz

@JansonLv
Copy link
Author

hi,@kolesnikovae Sorry, that was generated by another project; you can review the information from these two files, and I have modified the trimstrings code, processing only part of the information.
with_agg_profile_clean.pb.gz
without_agg_profile_clean.pb.gz

@kolesnikovae
Copy link
Collaborator

kolesnikovae commented Oct 18, 2024

Thank you so much for providing the samples!

Profiles look good, however they contain surprisingly few samples ~800 and ~1000 samples correspondingly (16.64s of CPU time in both). Idea of go-pgo query was to reduce the profile size and speed up Go builds – usually, when you merge profiles over a period of time (e.g., a couple of days), they include millions of samples, which becomes a problem.

UPD: Same for the second pair: 756/1223/17.45s. I'd suggest querying a longer period of time.

I'm wondering how you've measured the impact of PGO – is it possible to check the compiler logs?

UPD2: correct command:

go build -pgo=auto -gcflags="-m -d=pgodebug=99" . 2>&1 | grep -e "inlining call to" -e "PGO devirtualizing call to" | wc -l

@JansonLv
Copy link
Author

JansonLv commented Oct 18, 2024

@kolesnikovae I obtained this through stress testing the service and did not use any related Go commands. Thank you for your guidance.

@kolesnikovae
Copy link
Collaborator

kolesnikovae commented Oct 18, 2024

Thank you for sharing this! I'd like to figure out why aggregation affects PGO results, because this is really helpful feature and I wouldn't like disabling it by default.

In the log I see use of CGO and lots of undefined reference errors and that both gcc and ld failed – I'm curious if the build was successful. Could you please build the program with -m -d=pgodebug=99 added to -gcflags? Something like:

go build -pgo=auto -gcflags="-m -d=pgodebug=99" . 2>&1 | grep -e "inlining call to" -e "PGO devirtualizing call to" | wc -l

This will tell us how many optimisations Go compiler made (very roughy), and estimate impact of the PGO (not the app performance).

As for load/stress tests – as far as I understand from the log, it might involve IO (message broker), which would very likely dominate the result. Usually, the expected improvement (reduce in CPU consumption) is within 2-5%. Also, please note that PGO won't help with C code.

@JansonLv
Copy link
Author

JansonLv commented Oct 18, 2024

hi,@kolesnikovae I'm so sorry to keep you waiting for so long. It took me some time to set up the relevant environment locally.

In the log I see use of CGO and lots of undefined reference errors and that both gcc and ld failed – I'm curious if the build was successful.

There is indeed a problem, which is why I didn't add the required tag for github.com/confluentinc/confluent-kafka-go/v2/kafka."

in the end, the result is still 0, even though I used the one obtained through the /debug/pprof/profile. I also want to figure out why this is happening.

I think we can keep the aggregation for now, as it will take some time to investigate this issue.

Enjoy your weekend

./main.go:6:13: PGO devirtualize considering call cmd.Execute()
{"Pkg":"main","Pos":"/mnt/c/Users/bodhi/worker/server/main.go:6:13","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"server/cmd.Execute","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
hot-callsite-thres-from-CDF=0.02697599136768276
hot-cg before inline in dot format:
digraph G {
forcelabels=true;
"main.main" [color=black, style=solid, label="main.main"];
"server/cmd.Execute" [color=black, style=solid, label="server/cmd.Execute"];
edge [color=black, style=solid];
"main.main" -> "server/cmd.Execute" [label="0.00"];
}
./main.go:5:6: can inline main

@kolesnikovae
Copy link
Collaborator

Hi @JansonLv, no worries at all – I'm here to help. Thank you for sharing this! Please let me know if I can help you in any way.

In the meantime, I believe that clarifying the CLI help message and documentation (in ./docs) would improve the user experience. If you're willing to contribute, your effort would be very welcome :) Otherwise, I can take care of this

@JansonLv
Copy link
Author

Hi,@kolesnikovae, I'm very honored to contribute to Grafana and improve the user experience. I will make the necessary modifications and commit them as soon as possible. I'm also glad to have your help.

@JansonLv JansonLv requested a review from a team as a code owner October 19, 2024 04:36
@JansonLv
Copy link
Author

Hi, @kolesnikovae, I have submitted the code, but I am uncertain whether the last sentence ‘Try both options to see which gives better for your PGO’ is suitable. I hope to get your opinion on this.

Wish you a pleasant weekend

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants