incentives: cache top online accounts and use when building AbsentParticipationAccounts #6085
base: master
…ding AbsentParticipationAccounts
975ddb4
to
21db44d
Compare
LGTM.
Just a thought: why not init the top online list on startup and then maintain it in acctonline while processing incoming blocks?
My first approach was to make it a field in the |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6085      +/-   ##
==========================================
+ Coverage   56.22%   56.24%   +0.02%
==========================================
  Files         494      494
  Lines       69954    70040      +86
==========================================
+ Hits        39330    39394      +64
- Misses      27947    27962      +15
- Partials     2677     2684       +7
…break TestAbsenteeChecks
Co-authored-by: John Jannotti <[email protected]>
f5b42d4
to
01b150a
Compare
1c4c898
to
c558d59
Compare
@@ -458,7 +458,7 @@ func TestOnlineAcctModelSimple(t *testing.T) {
})
// test same scenario on double ledger
t.Run("DoubleLedger", func(t *testing.T) {
m := newDoubleLedgerAcctModel(t, protocol.ConsensusFuture, true)
m := newDoubleLedgerAcctModel(t, protocol.ConsensusV39, true) // TODO simulate heartbeats
Why not keep this on future?
It fails because heartbeats aren't implemented, but proposers aren't being set, so the big accounts are challenged and kicked offline, and all the stake numbers don't match the test expectations. I could have tried to fix this by ensuring all the test accounts show up as proposers as often as necessary to avoid suspension, but I thought maybe it would be better to see after heartbeats were implemented whether that would make the tests pass without as much modification.
@@ -47,6 +47,7 @@ type roundCowParent interface {
// lookup retrieves agreement data about an address, querying the ledger if necessary.
lookupAgreement(basics.Address) (basics.OnlineAccountData, error)
onlineStake() (basics.MicroAlgos, error)
knockOfflineCandidates() (map[basics.Address]basics.OnlineAccountData, error)
Maybe a NIT: should we actually call this top online accounts or similar naming? It's very clear from comments that's what we are requesting, more a debate over if the name should be based on what it's sourced from vs the use-case we have for this atm.
It is a potentially stale list of top online accounts: if new accounts came online in the last 256 rounds (since the last state proof), they wouldn't appear. So the word "candidates" was intended to make it seem a little less definitive that this was the complete list of top online accounts for the round... but happy to pick any other name, I wasn't particularly happy with this name.
This is already being used in a method JJ called "generateKnockOfflineAccountsList" in #5757 which is where the "knockOffline" part came from.
@@ -810,6 +810,9 @@ func TestTotalWeightChanges(t *testing.T) {
a := require.New(fixtures.SynchronizedTest(t))

consensusParams := getDefaultStateProofConsensusParams()
consensusParams.Payouts = config.ProposerPayoutRules{} // TODO re-enable payouts when nodes aren't suspended
How are we tracking these?
I guess I should make an issue to address the "update this test once heartbeats are implemented" TODOs in this PR
I think we will not have the top online cache before the first state proof, right? Maybe it would make sense to seed it during genesis (since the online accounts are listed out for us in the genesis file, I think). That could avoid special cases in the tests.
func (eval *BlockEvaluator) endOfBlock() error {
// When generating a block, participating addresses are passed to prevent a
// proposer from suspending itself.
func (eval *BlockEvaluator) endOfBlock(participating ...basics.Address) error {
Why ...basics.Address instead of []basics.Address? I assume callers always have a slice, as opposed to call sites with, say, 5 explicit arguments.
it's true, this is just me optimizing for a smaller diff, to not change other endOfBlock callers, but the idea is to pass a slice — can change
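For readers following the variadic-versus-slice question above, here is a minimal sketch of the trade-off. The `Address` type and the `countVariadic`/`countSlice` helpers are hypothetical stand-ins, not the real `basics.Address` or `endOfBlock`: a variadic parameter lets existing zero-argument call sites compile unchanged, while a caller that already holds a slice has to spread it with `...`.

```go
package main

import "fmt"

// Address is a hypothetical stand-in for basics.Address.
type Address string

// countVariadic accepts zero or more addresses, so call sites that pass
// nothing keep compiling unchanged (the "smaller diff" argument).
func countVariadic(participating ...Address) int {
	return len(participating)
}

// countSlice requires every caller to pass a slice, possibly nil.
func countSlice(participating []Address) int {
	return len(participating)
}

func main() {
	addrs := []Address{"A", "B"}
	fmt.Println(countVariadic())         // old-style call site, no arguments
	fmt.Println(countVariadic(addrs...)) // a caller holding a slice must spread it
	fmt.Println(countSlice(nil))         // slice form makes "no addresses" explicit
	fmt.Println(countSlice(addrs))
}
```

Inside the function both forms see a plain slice, so the choice only affects how call sites read.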
IncentiveEligible bool // currently unused below, but may be needed in the future
}
candidates := make(map[basics.Address]candidateData)
partAddrs := util.MakeSet(participating...)
Do we do anything else with this slice? Maybe we should push the Set type up through the callers, so that it is built as a Set when it is first created to pass to endOfBlock?
It's used in GenerateBlock while making a map of end-of-block account state for participating addresses, to include in the UnfinishedBlock... if we pushed it up to GenerateBlock then it could protect against looking up the same participating address twice, if duplicate addresses were passed to GenerateBlock.
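The deduplication point above can be illustrated with a small sketch. Everything here is hypothetical: `Set`, `MakeSet`, and `lookupAll` are simplified stand-ins written for this example and are not the `util` package's actual API or `GenerateBlock`'s real logic. The idea is only that building the set once, at the outermost caller, means each distinct address is looked up a single time even if duplicates were passed in.

```go
package main

import "fmt"

// Address is a hypothetical stand-in for basics.Address.
type Address string

// Set is a minimal generic set, written for illustration only.
type Set[T comparable] map[T]struct{}

// MakeSet builds a Set from the given elements, collapsing duplicates.
func MakeSet[T comparable](elems ...T) Set[T] {
	s := make(Set[T], len(elems))
	for _, e := range elems {
		s[e] = struct{}{}
	}
	return s
}

// Contains reports whether e is a member of the set.
func (s Set[T]) Contains(e T) bool {
	_, ok := s[e]
	return ok
}

// lookupAll calls lookup exactly once per distinct participating address
// and returns the collected end-of-block state (here just a uint64).
func lookupAll(participating Set[Address], lookup func(Address) uint64) map[Address]uint64 {
	out := make(map[Address]uint64, len(participating))
	for addr := range participating {
		out[addr] = lookup(addr)
	}
	return out
}

func main() {
	lookups := 0
	lookup := func(Address) uint64 { lookups++; return 1 }
	// "A" is passed twice but only looked up once.
	state := lookupAll(MakeSet[Address]("A", "B", "A"), lookup)
	fmt.Println(len(state), lookups)
}
```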
if maxSuspensions > 0 {
knockOfflineCandidates, err := eval.state.knockOfflineCandidates()
if err != nil {
// Log an error and keep going; generating lists of absent and expired
So this implies some nodes can "choose" not to search for absent/expired accounts.
// Now, check these candidate accounts to see if they are expired or absent.
for accountAddr, acctData := range candidates {
if acctData.MicroAlgosWithRewards.IsZero() {
Does a zero balance imply the account is closed 100% of the time?
//
// This function is passed a list of participating addresses so a node will not
// propose a block that suspends or expires itself.
func (eval *BlockEvaluator) generateKnockOfflineAccountsList(participating []basics.Address) {
participating is really "participating accounts excluding any I host"
I'm good in general, a few small comments.
blkEval = l.nextBlock(t)
//require.Empty(t, vb.Block().ExpiredParticipationAccounts)
Why is this added commented out?
challenge := byte(0)
for i := uint64(0); i < uint64(1210); i++ { // A bit past one grace period (200) past challenge at 1000.
vb := l.endBlock(t, blkEval)
for i := uint64(0); i < uint64(1200); i++ { // Just before first suspension at 1171
Would this not go past first suspension - why 1200?
}

st := txn.Sign(keys[0])
err = eval.Transaction(st, transactions.ApplyData{})
Why remove all of these eval.Transaction calls?
}

// fetch fresh data up to this round from online account cache. These accounts should all
// be in cache, as long as proto.StateProofTopVoters < onlineAccountsCacheMaxSize.
This feels like a condition to call out in the consensus file.
Summary
In #5757 a mechanism was introduced to suspend "absentee" accounts that don't participate (by making a proposal, or heartbeat as in #5799), by adding a block header AbsentParticipationAccounts, similar to ExpiredParticipationAccounts.
Currently, the list is generated by considering any account touched by a transaction in the current block, since this data is readily available at endOfBlock(). This PR adds a periodically-updated cache of top online accounts to the ledger, to find additional online accounts not mentioned in the current block.
All of these tracked addresses will now be checked for absentee or expired status each round. To get a recent list of top online accounts, this PR uses recent work done by the votersTracker and state proof worker. (Every 256 rounds, the state proof system performs a TopOnlineAccounts query.) This adds access to the votersTracker to fetch the most recent list of top online addresses, and for each address looks up the latest round's data from the online account cache.
LastProposed and LastHeartbeat are added to the online accounts table's DB representation in this PR. This also fixes an issue introduced in #5965 where uses of ledgercore.OnlineAccountData (which didn't have LastHeartbeat/LastProposed fields) were replaced by basics.OnlineAccountData (which did) and ended up with those fields not being set in a couple of conversions from AccountData.
Test Plan
update test/e2e-go/features/incentives/suspension_test.go (TODO return later after heartbeats)