Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update_agent: put state behind RwLock #577

Merged
merged 1 commit into from
Jul 27, 2021
Merged

update_agent: put state behind RwLock #577

merged 1 commit into from
Jul 27, 2021

Conversation

kelvinfan001
Copy link
Member

@kelvinfan001 kelvinfan001 commented Jun 10, 2021

We would like to put update agent state behind a RwLock so ticks
can be handled sequentially while allowing messages to the
update_agent actor to be handled concurrently, in general.

Closes: #569

@@ -370,7 +372,7 @@ pub(crate) struct UpdateAgent {
/// Update strategy.
strategy: UpdateStrategy,
/// Current status for agent state machine.
state: UpdateAgentState,
state: Arc<RwLock<UpdateAgentState>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self-note: I instinctively got some doubts on this Arc but I've no reason to believe it is wrong. It may need a short explanation to clarify why a plain RwLock is not enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Also added as a comment in the code) I believe we need to use an Arc because consumers of this field will likely need to own it (e.g. consumers in futures).

src/update_agent/actor.rs Show resolved Hide resolved
log::trace!(
"scheduling next agent refresh in {} seconds",
pause.as_secs()
);
Self::tick_later(ctx, pause);
} else {
let update_timestamp = chrono::Utc::now();
actor.state_changed = update_timestamp;
actor.info.state_changed = update_timestamp;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly related to this PR, but I noticed that the logic around updating this state_changed field is highly coupled with the logic around refresh_delay() which is in turn dependent on whether the state was actually changed. I think this made sense back in simpler times when tick_now if and only if change in state; however, there are now cases where a change in state does not necessarily tick_now.
Anyways, IIUC, I think the logic around updating this state_changed field should be refactored to a more intuitive location.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering, would this state_changed field fit more naturally into the state itself?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that, but UpdateAgentState is an enum. Would it be a good idea to attach the data to the enum the same way we attach e.g. Release to some of the variants?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more thinking about the opposite: turning state into its own struct and making the enum just one of its field, next to the timestamp.
On a second thought, I'm not seeing right now what we are using this state_changed information for. Possibly it's unused at the moment, in which case you can either decide to drop it or to clarify its location and semantics (e.g. so that it could be sanely queried maybe via dbus).

@kelvinfan001
Copy link
Member Author

kelvinfan001 commented Jul 27, 2021

Some groundwork that happened in this PR includes:

  1. use async/await (instead of Actix-style future combinators) within the tick functions and their helpers
  2. convert the ticks and some of their helpers to methods of the new UpdateAgentInfo instead of UpdateAgent

1. solved the issue of it being super cumbersome to change all the parameters and return types of helper functions to always take ownership of state and return ownership of state once they're done with it, and also makes the code cleaner to read, IMO
2. was necessary because the async block does not have access to (ownership of) an entire UpdateAgent instance; it now only has access to a clone of the "read-only" side of UpdateAgent and a lock guard to the UpdateAgent's state.

We would like to put most of the "writable" parts of the update
agent behind a RwLock in order to make sure that, when the
update_agent actor receives multiple messages, it sequentially
processes the messages that would like to write/update the agent
state to avoid races.

In order to do this, we explicitly separate out a `state` field
and an `info` field in the `UpdateAgent` struct, where the `state`
field holds the `UpdateAgentState` that is behind a RwLock.
Though nothing enforces this, all the fields (except for
`state_changed`) in `UpdateAgentInfo` should be read-only, and
needs not be behind any lock.

During each tick, because we are passing the asynchronously acquired
lock guard of the agent state across futures, we can no longer make
use of Actix's `ResponseActFuture` (which contains the actor state and
context). We now use the newer `async/await` syntax where we can to
make it easier to move data across futures and generally make the code
easier to read. For context, the reason we previously stuck to the less
ergonomic future combinator (`.then()`) syntax was because we wanted to
make use of `ResponseActFuture`.
@kelvinfan001 kelvinfan001 changed the title WIP: lock update agent state Put update agent state behind RwLock Jul 27, 2021
UpdateAgentState::EndState => Ok(()),
};

if modify_state_outcome.is_err() {
Copy link
Contributor

@lucab lucab Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like most of the tick_* functions are internally doing error logging/handling, so the error type here is not really used (in fact, it's a ()). Do you think it would be possible to drop the Result at all? That would probably allow factoring our the .await too, I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, there are actually only a couple tick-functions that even make use of this Result. I was originally thinking to make minimal changes to the areas that didn't have to be changed. Taking another look to see if not returning Result is doable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up to this here #610

@lucab
Copy link
Contributor

lucab commented Jul 27, 2021

This is amazing! The side-effect of switching to plain async is possibly even nicer than the improved locking granularity 🎉
I've left a couple of comments related to possible low hanging fruits / cleanups, but the PR itself already looks good.

@lucab lucab enabled auto-merge July 27, 2021 14:42
@lucab lucab added this to the vNext milestone Jul 27, 2021
@lucab lucab changed the title Put update agent state behind RwLock update_agent: put state behind RwLock Jul 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow concurrent handling of messages to UpdateAgent actor
2 participants