Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incremental verify checkpoints #4487

Open
wants to merge 11 commits into
base: master
Choose a base branch
from
19 changes: 17 additions & 2 deletions docs/software/commands.md
Original file line number Diff line number Diff line change
Expand Up @@ -209,8 +209,23 @@ apply.
hash for a checkpoint ledger, and then verifies the entire earlier history
of an archive that ends in that ledger hash, writing the output to a reference
list of trusted checkpoint hashes.
Option **--output-filename <FILE-NAME>** is mandatory and specifies the file
to write the trusted checkpoint hashes to.
* Option **--history-hash <HASH>** is optional and specifies the hash of the ledger
at the end of the verification range. When provided, `stellar-core` will use the history
hash to verify the range, rather than the latest checkpoint hash obtained from consensus.
Used in conjunction with `--history-ledger`.
* Option **--history-ledger <LEDGER-NUMBER>** is optional and specifies the ledger
number to end the verification at. Used in conjunction with `--history-hash`.
* Option **--output-filename <FILE-NAME>** is mandatory and specifies the file
to write the trusted checkpoint hashes to. The file will contain a JSON array
of arrays, where each inner array contains the ledger number and the corresponding
checkpoint hash of the form `[[999, "hash-abc"], [935, "hash-def"], ... [0, "hash-xyz]]`.
* Option **--trusted-checkpoint-file <FILE-NAME>** is optional. If provided,
stellar-core will parse the latest checkpoint ledger number and hash from the file and verify from this ledger to the latest checkpoint ledger obtained from the network.
* Option **--from-ledger <LEDGER-NUMBER>** is optional and specifies the ledger
number to start the verification from.

> Note: It is an error to provide both the `--trusted-checkpoint-hashes` and `--from-ledger` options.

* **version**: Print version info and then exit.

## HTTP Commands
Expand Down
2 changes: 1 addition & 1 deletion src/catchup/CatchupWork.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -177,7 +177,7 @@ CatchupWork::downloadVerifyLedgerChain(CatchupRange const& catchupRange,

mVerifyLedgers = std::make_shared<VerifyLedgerChainWork>(
mApp, *mDownloadDir, verifyRange, mLastClosedLedgerHashPair,
mRangeEndFuture, std::move(fatalFailurePromise));
std::nullopt, mRangeEndFuture, std::move(fatalFailurePromise));

// Never retry the sequence: downloads already have retries, and there's no
// point retrying verification
Expand Down
25 changes: 23 additions & 2 deletions src/catchup/VerifyLedgerChainWork.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -107,6 +107,7 @@ trySetFuture(std::promise<T>& promise, T value)
VerifyLedgerChainWork::VerifyLedgerChainWork(
Application& app, TmpDir const& downloadDir, LedgerRange const& range,
LedgerNumHashPair const& lastClosedLedger,
std::optional<LedgerNumHashPair> const& maxPrevVerified,
std::shared_future<LedgerNumHashPair> trustedMaxLedger,
std::promise<bool>&& fatalFailure,
std::shared_ptr<std::ofstream> outputStream)
Expand All @@ -118,6 +119,7 @@ VerifyLedgerChainWork::VerifyLedgerChainWork(
: mApp.getHistoryManager().checkpointContainingLedger(
mRange.last()))
, mLastClosed(lastClosedLedger)
, mMaxPrevVerified(maxPrevVerified)
, mFatalFailurePromise(std::move(fatalFailure))
, mTrustedMaxLedger(trustedMaxLedger)
, mVerifiedMinLedgerPrevFuture(mVerifiedMinLedgerPrev.get_future().share())
Expand Down Expand Up @@ -209,7 +211,7 @@ VerifyLedgerChainWork::verifyHistoryOfSingleCheckpoint()
}

// Verify ledger with local state by comparing to LCL
// When checking against LCL, see it the local node is in the bad state,
// When checking against LCL, see if the local node is in a bad state
// or if the archive is in a bad state (in which case, retry)
if (curr.header.ledgerSeq == mLastClosed.first)
{
Expand Down Expand Up @@ -240,6 +242,20 @@ VerifyLedgerChainWork::verifyHistoryOfSingleCheckpoint()
mChainDisagreesWithLocalState = lclResult;
}
}
// If the curr history entry is the same ledger as our mMaxPrevVerified,
// verify that the hashes match.
if (mMaxPrevVerified &&
curr.header.ledgerSeq == mMaxPrevVerified->first &&
curr.hash != mMaxPrevVerified->second)
{
CLOG_ERROR(History,
"Checkpoint {} does not agree with trusted "
"checkpoint hash {}",
LedgerManager::ledgerAbbrev(curr),
LedgerManager::ledgerAbbrev(mMaxPrevVerified->first,
*mMaxPrevVerified->second));
return HistoryManager::VERIFY_STATUS_ERR_BAD_HASH;
}

if (beginCheckpoint)
{
Expand Down Expand Up @@ -354,7 +370,7 @@ VerifyLedgerChainWork::verifyHistoryOfSingleCheckpoint()
}
else
{
// Otherwise we just finished a checkpoint _after_ than the first call
// Otherwise we just finished a checkpoint _after_ the first call
// to this method and the `incoming` value we read out of
// `mVerifiedAhead` should have content, because the previous call
// should have saved something in `mVerifiedAhead`.
Expand Down Expand Up @@ -409,6 +425,11 @@ VerifyLedgerChainWork::onSuccess()
{
for (auto const& pair : mVerifiedLedgers)
{
if (mMaxPrevVerified && mMaxPrevVerified->first == pair.first)
{
// Skip writing the trusted hash to the output file.
continue;
}
(*mOutputStream) << "\n[" << pair.first << ", \""
<< binToHex(*pair.second) << "\"],";
}
Expand Down
5 changes: 5 additions & 0 deletions src/catchup/VerifyLedgerChainWork.h
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,10 @@ class VerifyLedgerChainWork : public BasicWork
LedgerRange const mRange;
uint32_t mCurrCheckpoint;
LedgerNumHashPair const mLastClosed;
// The max ledger number and hash that we have verified up to at some time
// in the past (or genesis if we have no previous verification). Invocations
// of VerifyLedgerChainWork will verify down to this ledger.
std::optional<LedgerNumHashPair> const mMaxPrevVerified;

// Record any instance where the chain we're verifying disagrees with the
// local node state. This _might_ mean we can't possibly catch up (eg. we're
Expand Down Expand Up @@ -78,6 +82,7 @@ class VerifyLedgerChainWork : public BasicWork
VerifyLedgerChainWork(
Application& app, TmpDir const& downloadDir, LedgerRange const& range,
LedgerNumHashPair const& lastClosedLedger,
std::optional<LedgerNumHashPair> const& maxPrevVerified,
std::shared_future<LedgerNumHashPair> trustedMaxLedger,
std::promise<bool>&& fatalFailure,
std::shared_ptr<std::ofstream> outputStream = nullptr);
Expand Down
2 changes: 1 addition & 1 deletion src/history/test/HistoryTests.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -242,7 +242,7 @@ TEST_CASE("Ledger chain verification", "[ledgerheaderverification]")
std::shared_future<bool> fatalFailureFuture =
fataFailurePromise.get_future().share();
auto w = wm.executeWork<VerifyLedgerChainWork>(
tmpDir, ledgerRange, lclPair, ledgerRangeEndFuture,
tmpDir, ledgerRange, lclPair, std::nullopt, ledgerRangeEndFuture,
std::move(fataFailurePromise));
REQUIRE(expectedState == w->getState());
REQUIRE(fatalFailureFuture.valid());
Expand Down
Loading
Loading