Add task polling data to task responses #447

Open · wants to merge 2 commits into master
Conversation

@Sushisource (Member) commented Aug 20, 2024

What changed?
Add some additional information about the task queue to polling responses.

Outcome of discussion w/ @ShahabT and @dnr

Initially, I would've liked TaskQueueStats to be returned on every poll response. That appears to be too much from both a performance and complexity perspective, because regularly updating and distributing aggregated data across partitions would require more frequent internal RPCs, and would require them to flow in a direction they currently don't (from sub-partition -> root rather than just the other way around).

The alternative is to have the SDK acquire that data via DescribeTaskQueue. Calling that very frequently would also be problematic, since there is a per-namespace RPS limit.

The middle ground we landed on, which is low complexity but ought to be reasonably efficient, is to include TaskQueueStats in the response whenever a poll hits the root partition, where the data is readily available. If the SDK has not seen such a response within some reasonable TTL (~seconds), it will call DescribeTaskQueue to get an updated view. In practice, it should be fairly unlikely for a worker with a reasonable number of pollers to somehow miss the root, and low-traffic queues are also more likely to forward polls to the root to see if there is a task. Since the desired outcome on the SDK side is to have only a small number of pollers when traffic is low, that should work out.
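
To make the flow concrete, here is a minimal SDK-side sketch in Go of the caching described above. It is illustrative only: the type and constant names are hypothetical, `TaskQueueStats` is assumed to be the message proposed in this PR, and the `refresh` hook merely stands in for an explicit DescribeTaskQueue call.

```go
package pollerscaling

import (
	"context"
	"sync"
	"time"

	taskqueuepb "go.temporal.io/api/taskqueue/v1"
)

// statsTTL is an illustrative "~seconds" freshness window, per the description above.
const statsTTL = 5 * time.Second

// rootStatsCache caches the TaskQueueStats that occasionally arrive on poll
// responses which happened to hit the root partition.
type rootStatsCache struct {
	mu        sync.Mutex
	stats     *taskqueuepb.TaskQueueStats
	updatedAt time.Time
}

// recordFromPoll is called for every poll response; stats is nil unless the
// poll hit the root partition, in which case the cached copy is refreshed.
func (c *rootStatsCache) recordFromPoll(stats *taskqueuepb.TaskQueueStats) {
	if stats == nil {
		return
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.stats, c.updatedAt = stats, time.Now()
}

// get returns the cached stats, refreshing them when the TTL has expired.
// refresh is a placeholder standing in for a DescribeTaskQueue call.
func (c *rootStatsCache) get(ctx context.Context, refresh func(context.Context) (*taskqueuepb.TaskQueueStats, error)) (*taskqueuepb.TaskQueueStats, error) {
	c.mu.Lock()
	stats := c.stats
	fresh := c.stats != nil && time.Since(c.updatedAt) < statsTTL
	c.mu.Unlock()
	if fresh {
		return stats, nil
	}
	latest, err := refresh(ctx)
	if err != nil {
		return stats, err // fall back to the stale value if the refresh fails
	}
	c.mu.Lock()
	c.stats, c.updatedAt = latest, time.Now()
	c.mu.Unlock()
	return latest, nil
}
```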

Why?
This will give poller auto-tuning an idea of whether it should increase the number of pollers because most matches are not sync (possibly in conjunction with a backlog hint).

If we are making sync matches, then we can reduce the number of pollers once the wait duration starts to exceed some threshold.

It can also tell the SDK the minimum latency achievable when polling: if we didn't make a sync match, then we were pulling something out of the backlog and delivered a task roughly as quickly as possible.

The task queue stats are useful for deciding how quickly pollers should be ramped up, if at all.
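
As a rough illustration of how these signals could combine (this is not the actual algorithm from the PR), a poller-scaling heuristic might look like the sketch below; the thresholds, limits, and function name are made up for the example.

```go
package pollerscaling

import "time"

const (
	idleWaitThreshold = 10 * time.Second // sustained idle polls suggest the backlog is drained
	minPollers        = 2
	maxPollers        = 64
)

// suggestPollerCount applies the reasoning above: if matches are not sync (tasks are
// coming out of a backlog), more pollers may help; if matches are sync and polls are
// spending a long time waiting for a task, fewer pollers are needed.
func suggestPollerCount(current int, pollWait time.Duration, wasSyncMatch bool, approximateBacklog int64) int {
	switch {
	case !wasSyncMatch && approximateBacklog > 0 && current < maxPollers:
		return current + 1 // tasks are being pulled from a backlog; scale up
	case wasSyncMatch && pollWait > idleWaitThreshold && current > minPollers:
		return current - 1 // polls idle a long time before a task arrives; scale down
	default:
		return current
	}
}
```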

Breaking changes
nope

Server PR
Will make one, assuming we're OK with this addition.

@Sushisource Sushisource requested review from a team as code owners August 20, 2024 01:03
@Sushisource Sushisource changed the title Add sync_match_wait_duration to task responses Add task polling data to task responses Aug 20, 2024
Comment on lines +271 to +274
// See `TaskQueueStats`. This field is only set if the poll hit the root partition, where the data is readily
// available. SDKs should cache this data, and explicitly refresh it with a call to `DescribeTaskQueue` if some
// TTL has expired.
TaskQueueStats stats = 3;
@cretz (Member) · Aug 20, 2024

Hrmm, I wonder a bit about this. I wasn't expecting SDKs to call describe task queue ever, only users. Can I get clarity on when a worker may implicitly call describe task queue? Is there information in here to drive decisions that we need? If not, I wonder if we should remove this field and ask users that want this info to call describe themselves.

@Sushisource (Member Author)

It's going to give us this same stats information, which is used to drive poller scaling decisions. Users aren't going to interact with any of this directly. So the SDK will periodically call DescribeTaskQueue if, by chance, no pollers have hit the root partition in the last unit of time, which should be relatively rare.

@cretz (Member) · Aug 20, 2024
Hrmm, can the poller scaling decisions be done without changing workers to make new describe task queue calls? Or is it a strong requirement at this point? I saw the first version of this PR didn't need these stats.

@Sushisource (Member Author) · Aug 20, 2024
What's the concern here? Nothing is, strictly speaking, required - we could skip poller scaling entirely, but the information is helpful in making those decisions. I already had some discussion with server folks about the load this call would add, and that's all fine. Ideally I would've liked the stats to be in the task response all the time, but that turns out to be more expensive than this option.

(Member)
The concern, I think, is adding confusing results to the poll (we already got bit by this with the backlog hint, where it was confusing for everyone what it meant and when the data was accurate). There is also the concern of adding a logic burden on workers to start making additional, separate out-of-band calls beyond polling in order to support polling. I was hoping poller scaling could work with the calls the worker already makes and, if it needs more info, put the burden on the centralized part (i.e. the server) to provide it as part of those existing calls, or acknowledge that we may need to work with limited information.

@dnr (Member)
I don't think we need to talk about calling DescribeTaskQueue, or call it. If we're polling fairly frequently, then we'll hit the root and get the data. If we're not, then there's no tasks and no reason to scale pollers. Let's just say the SDK can make use of this field when it's provided, and do without it if it's missing.

(Member)
That workers won't be expected to call DescribeTaskQueue definitely alleviates one of my concerns. But it points to another: it sounds like we don't yet know what we want workers to do. Maybe we can even use this field's proto doc to define worker poller behavior. Or maybe, since it sounds like this is an optional field and poller scaling needs to work well without it, we can do poller scaling first and come back to add this field and the associated optimizations.

@Sushisource (Member Author)
I've been writing some pseudocode for the algorithm, and it's nearly complete. I'm quite sure I want this data, and delivering it in this field is cheap. It's likely to be present often enough to be useful. I don't really see the cost of having it here as problematic at all.

More specifically, it's very useful for knowing when to scale down pollers, which fits perfectly with the point mentioned earlier that David puts clearly:

If we're polling fairly frequently, then we'll hit the root and get the data. If we're not, then there's no tasks and no reason to scale pollers.

So we're going to get that data when we most need it.

@cretz (Member) · Aug 27, 2024
👍 So the docs make clear that it may not appear, but per what @dnr put a few messages up in this thread, can we remove the part about expecting SDKs to call describe task queue to get this info?

@ShahabT - can you expand on what you mean by aggregate-and-cache? How out of date might this value be compared to a regular describe call? Is there a design step needed to confirm how this aggregate-and-cache might work? Or any fears we may learn something during impl that would affect these protos?

@ShahabT (Contributor)
For the stats to reflect the whole task queue, and not only the root partition itself, we need to fan out to all partitions and get their stats (via the internal DescribeTaskQueuePartition API), aggregate all of them, and return the aggregated value. This is what Enhanced DescribeTaskQueue does.

Now, if we're to send such stats in the root partition's poll responses, this fan-out and the DescribeTaskQueuePartition calls would have to happen constantly (as opposed to today, where they happen only when DescribeTaskQueue is called).

To manage the load, the server would have to add caching in the root partition to limit the fan-out frequency (say, to once per second).

It's not something we can't manage, but it isn't free either.
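
A minimal sketch of that aggregate-and-cache idea, assuming hypothetical server-side names: the root partition refreshes the aggregated stats at most once per interval, no matter how many poll responses carry them. The `fanOut` hook is a placeholder for the internal DescribeTaskQueuePartition fan-out and aggregation, not a real server API.

```go
package pollerscaling

import (
	"context"
	"sync"
	"time"

	taskqueuepb "go.temporal.io/api/taskqueue/v1"
)

// aggregatedStatsCache throttles how often the root partition re-aggregates
// stats across all partitions (say, once per second, as suggested above).
type aggregatedStatsCache struct {
	mu          sync.Mutex
	cached      *taskqueuepb.TaskQueueStats
	refreshedAt time.Time
	interval    time.Duration
	// fanOut is a placeholder for the internal DescribeTaskQueuePartition
	// fan-out plus aggregation across all partitions.
	fanOut func(ctx context.Context) (*taskqueuepb.TaskQueueStats, error)
}

// statsForPollResponse returns the aggregate to attach to a root-partition poll
// response, re-running the fan-out only when the cached value is older than the interval.
func (c *aggregatedStatsCache) statsForPollResponse(ctx context.Context) *taskqueuepb.TaskQueueStats {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.cached != nil && time.Since(c.refreshedAt) < c.interval {
		return c.cached // serve the cached aggregate; no fan-out needed
	}
	latest, err := c.fanOut(ctx)
	if err != nil {
		return c.cached // on error, fall back to the last known aggregate
	}
	c.cached, c.refreshedAt = latest, time.Now()
	return c.cached
}
```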

@cretz (Member) left a comment

Approving, but still a bit concerned w/ the new stats field and its necessity. Also concerned with whether workers need this new out-of-band RPC to drive poller scaling. I would prefer the field didn't exist for the initial implementation.

I think we should try simple poller scaling before adding these advanced machinations around task queue stats.

@dnr (Member) left a comment

As usual, please wait to merge until we have a server implementation, since we usually discover things to tweak in the process of implementing.

// Set to the amount of time a poll call waited for a task to be available. This value is allowed to be unset
// if the match was not sync. It is useful to help the SDK understand if it may reduce the number of pollers, since
// any substantial wait time here means the backlog is drained, and poll calls are idling waiting for tasks.
google.protobuf.Duration time_poller_waited = 2;
(Member)

I'd call it poll_wait_time or even just wait_time

}

// Data attached to successful poll responses (for any task type) which the SDK can use to optimize its polling
// strategy.
(Member)
Maybe to head off potential confusion, add a comment like "Don't assume the accuracy of this data for any purpose other than poller scaling."


@Sushisource (Member Author)

Leaving this as an FYI: I started the server work, but it's likely I won't get back to it for a bit because of other Replay-related priorities, so this might have to sit for a while.
