Fix exception in GrpcRetryer. #2021

Merged
merged 7 commits into from
Apr 17, 2024


@chronos-tachyon chronos-tachyon commented Mar 28, 2024

What was changed

  • Ensure that ChannelManager.getServerCapabilities follows the standard gRPC retry policy, even though it is called by GrpcRetryer.

Why?

While writing new gRPC retry tests for the features repo, we discovered that any gRPC failure returned by the server in response to WorkflowService.getSystemInfo causes an uncaught io.grpc.StatusRuntimeException, thrown before the actual gRPC retry code gets a chance to run.
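
To illustrate the failure mode described above, here is a simplified sketch. It is not the SDK's code: `TransientRpcException` stands in for io.grpc.StatusRuntimeException, and `retry`, `callBefore`, `callAfter`, and `flaky` are hypothetical names invented for this example. It assumes only that a capability lookup made outside the retry loop is not protected by that loop.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for io.grpc.StatusRuntimeException (not the real gRPC class).
class TransientRpcException extends RuntimeException {
    TransientRpcException(String message) {
        super(message);
    }
}

class RetryOrderingSketch {
    // Minimal retry loop: re-invokes the call until it succeeds or attempts run out.
    static <T> T retry(int maxAttempts, Supplier<T> call) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TransientRpcException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (TransientRpcException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    // Before: the capability lookup runs outside the retry loop,
    // so a single failure escapes to the caller unretried.
    static String callBefore(Supplier<String> getCapabilities, Supplier<String> rpc) {
        String caps = getCapabilities.get();
        return retry(3, () -> caps + ":" + rpc.get());
    }

    // After: the capability lookup is retried under the same policy as the RPC.
    static String callAfter(Supplier<String> getCapabilities, Supplier<String> rpc) {
        String caps = retry(3, getCapabilities);
        return retry(3, () -> caps + ":" + rpc.get());
    }

    // Fake server call that fails the first `failures` times, then returns `value`.
    static Supplier<String> flaky(int failures, String value) {
        AtomicInteger calls = new AtomicInteger();
        return () -> {
            if (calls.incrementAndGet() <= failures) {
                throw new TransientRpcException("UNAVAILABLE");
            }
            return value;
        };
    }

    public static void main(String[] args) {
        try {
            callBefore(flaky(1, "caps"), flaky(0, "ok"));
            System.out.println("before: unexpectedly succeeded");
        } catch (TransientRpcException e) {
            System.out.println("before: uncaught " + e.getMessage());
        }
        System.out.println("after: " + callAfter(flaky(1, "caps"), flaky(0, "ok")));
    }
}
```

With one transient failure on the capability lookup, `callBefore` surfaces the exception to the caller, while `callAfter` absorbs it and returns normally.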

Checklist

  1. Closes internal JIRA issue SDK-1888. (Sorry, I should have opened it as a GitHub issue and let it sync to JIRA instead; I didn't learn this until after the fact.)

  2. How was this tested: using the new features repo tests added in "Add features tests for gRPC retry behavior" (features#435).

@chronos-tachyon chronos-tachyon requested a review from a team as a code owner March 28, 2024 20:16

@cretz cretz left a comment

After internal discussion, everyone seems to agree that the Java SDK should not change when it makes its first call (that would be an incompatible change), but fixing retry behavior and similar issues of course makes sense. Will defer to @Quinn-With-Two-Ns on final approval.

@chronos-tachyon chronos-tachyon force-pushed the dking/SDK-1888 branch 2 times, most recently from 7d454e3 to 5d2e9f4 Compare April 3, 2024 18:23
@Quinn-With-Two-Ns

I don't think the CI failure is a result of any of your changes. I opened a separate issue, #2038, to address this flake.

@Quinn-With-Two-Ns

As part of the PR, can we please also add some unit tests to verify that the changes address the issue? I understand the integration tests will live elsewhere, so maybe just a targeted test of getServerCapabilitiesWithRetryOrThrow, since that takes a channel and you can create an in-process server like this:

https://github.com/grpc/grpc-java/blob/master/examples/src/test/java/io/grpc/examples/helloworld/HelloWorldClientTest.java
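
As a rough sketch of the shape such a targeted test could take, the example below uses only the JDK instead of grpc-java. Everything in it is hypothetical: `FakeSystemInfoServer` stands in for an in-process server's getSystemInfo handler, `FakeUnavailableException` for io.grpc.StatusRuntimeException, and `fetchCapabilitiesWithRetry` for the method under test, whose real signature is not shown in this thread. A real test would presumably build the channel against an in-process server as in the linked example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for io.grpc.StatusRuntimeException.
class FakeUnavailableException extends RuntimeException {
    FakeUnavailableException(String message) {
        super(message);
    }
}

// Hypothetical fake: fails a configured number of times, then answers, and counts calls.
class FakeSystemInfoServer {
    private final int failuresBeforeSuccess;
    private final AtomicInteger calls = new AtomicInteger();

    FakeSystemInfoServer(int failuresBeforeSuccess) {
        this.failuresBeforeSuccess = failuresBeforeSuccess;
    }

    String getSystemInfo() {
        if (calls.incrementAndGet() <= failuresBeforeSuccess) {
            throw new FakeUnavailableException("UNAVAILABLE");
        }
        return "capabilities";
    }

    int callCount() {
        return calls.get();
    }
}

class CapabilitiesRetryTestSketch {
    // Hypothetical method under test: retries the lookup, rethrowing the last failure.
    static String fetchCapabilitiesWithRetry(FakeSystemInfoServer server, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        FakeUnavailableException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return server.getSystemInfo();
            } catch (FakeUnavailableException e) {
                last = e;
            }
        }
        throw last;
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        // Case 1: the server recovers within the attempt budget.
        FakeSystemInfoServer recovering = new FakeSystemInfoServer(2);
        String caps = fetchCapabilitiesWithRetry(recovering, 3);
        check(caps.equals("capabilities"), "expected capabilities after retries");
        check(recovering.callCount() == 3, "expected exactly 3 calls");

        // Case 2: the server never recovers, so the last failure is rethrown.
        FakeSystemInfoServer down = new FakeSystemInfoServer(Integer.MAX_VALUE);
        boolean threw = false;
        try {
            fetchCapabilitiesWithRetry(down, 3);
        } catch (FakeUnavailableException e) {
            threw = true;
        }
        check(threw, "expected failure once attempts are exhausted");
        check(down.callCount() == 3, "expected the attempt budget to be used up");
        System.out.println("all checks passed");
    }
}
```

Counting calls on the fake is what lets the test distinguish "failed once and gave up" from "retried under the policy", which is the behavior this PR is about.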

@chronos-tachyon chronos-tachyon merged commit ed211fa into master Apr 17, 2024
10 checks passed
@chronos-tachyon chronos-tachyon deleted the dking/SDK-1888 branch April 17, 2024 16:29