Skip to content

Commit

Permalink
RBMC: Use sibling fields in role determination
Browse files Browse the repository at this point in the history
Now that the sibling information is available, use it in role
determination.

At a high level the new rules are:
1. If something is wrong with the sibling, choose active.
2. If the sibling already has a role, choose the opposite.
3. Otherwise, choose based on position.

There will be cases that that would require the local BMC to be passive,
such as if it isn't provisioned.  These would be checked before even
looking at any sibling fields.

The role determination function now also returns if there was an error
case detected.  This is needed so that in the future:

1. An error log can be created based on that specific case.

2. Eventually, the previous role will be used in role determination,
   before defaulting to the role based on position.  However, we won't
   want to do that if the role was only passive because of an error.
   (Meaning the fact that there was an error will need to be
   persisted.)

Tested:
New unit tests pass.

And in simulation:
```
phosphor-rbmc-state-manager[956]: Role = active due to the sibling already being passive

phosphor-rbmc-state-manager[1005]: Role = passive due to the sibling already being active
```

Signed-off-by: Matt Spinler <[email protected]>
  • Loading branch information
spinler committed Oct 21, 2024
1 parent f28792d commit 0cd16c3
Show file tree
Hide file tree
Showing 5 changed files with 181 additions and 17 deletions.
13 changes: 11 additions & 2 deletions redundant-bmc/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,13 +9,22 @@ redundancy related properties.

## Startup

So far on startup the application will just immediately determine its role.
On startup, the code will wait for up to six minutes for the sibling BMC's
heartbeat to start, assuming the BMC is present. After that it will determine
its role.

## Role Determination Rules

The current rules for role determination are:

1. If BMC position zero choose the active role, otherwise passive.
1. If the sibling BMC doesn't have a heartbeat, choose active. It could be the
sibling isn't even present.
1. If the sibling BMC's position matches this BMC's position, choose passive.
This is an error case.
1. If the sibling isn't provisioned, choose active.
1. If the sibling is already passive, choose active.
1. If the sibling is already active, choose passive.
1. If this BMC's position is zero choose active, otherwise passive.

If there is an internal failure during role determination, like an exception,
the BMC will also have to become passive.
Expand Down
28 changes: 22 additions & 6 deletions redundant-bmc/src/manager.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -80,26 +80,42 @@ sdbusplus::async::task<> Manager::doHeartBeat()

Role Manager::determineRole()
{
Role role{Role::Unknown};
using namespace role_determination;

RoleInfo roleInfo{Role::Unknown, ErrorCase::noError};

try
{
// Note: If these returned nullopts, the algorithm wouldn't use
// them anyway because there would be no heartbeat.
auto siblingRole = sibling->getRole().value_or(Role::Unknown);
auto siblingProvisioned = sibling->getProvisioned().value_or(false);
auto siblingPosition = sibling->getPosition().value_or(0xFF);

role_determination::Input input{
.bmcPosition = services->getBMCPosition()};
.bmcPosition = services->getBMCPosition(),
.siblingPosition = siblingPosition,
.siblingRole = siblingRole,
.siblingHeartbeat = sibling->hasHeartbeat(),
.siblingProvisioned = siblingProvisioned};

role = role_determination::run(input);
roleInfo = role_determination::run(input);
}
catch (const std::exception& e)
{
lg2::error("Exception while determining, role. Will have to be "
"passive. Error = {ERROR}",
"ERROR", e);
role = Role::Passive;
roleInfo.role = Role::Passive;
roleInfo.error = ErrorCase::internalError;
}

lg2::info("Role Determined: {ROLE}", "ROLE", role);
if (roleInfo.error != role_determination::ErrorCase::noError)
{
// TODO: Create an error log
}

return role;
return roleInfo.role;
}

} // namespace rbmc
42 changes: 39 additions & 3 deletions redundant-bmc/src/role_determination.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -7,14 +7,50 @@
namespace rbmc::role_determination
{

Role run(const Input& input)
RoleInfo run(const Input& input)
{
// Must check this before any other sibling fields
if (!input.siblingHeartbeat)
{
lg2::info("Role = active due to no sibling heartbeat");
return {Role::Active, ErrorCase::noError};
}

if (input.bmcPosition == input.siblingPosition)
{
lg2::error(
"Role = passive due to both BMC's having the same position {POSITION}",
"POSITION", input.bmcPosition);
return {Role::Passive, ErrorCase::samePositions};
}

if (!input.siblingProvisioned)
{
lg2::info("Role = active due to the sibling not being provisioned");
return {Role::Active, ErrorCase::noError};
}

if (input.siblingRole == Role::Passive)
{
lg2::info("Role = active due to the sibling already being passive");
return {Role::Active, ErrorCase::noError};
}

if (input.siblingRole == Role::Active)
{
lg2::info("Role = passive due to the sibling already being active");
return {Role::Passive, ErrorCase::noError};
}

if (input.bmcPosition == 0)
{
lg2::info("Role = active due to BMC position 0");
return Role::Active;
return {Role::Active, ErrorCase::noError};
}
return Role::Passive;

lg2::info("Role = passive due to BMC position {POSITION}", "POSITION",
input.bmcPosition);
return {Role::Passive, ErrorCase::noError};
}

} // namespace rbmc::role_determination
30 changes: 28 additions & 2 deletions redundant-bmc/src/role_determination.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -18,16 +18,42 @@ namespace role_determination
struct Input
{
size_t bmcPosition;
size_t siblingPosition;
Role siblingRole;
bool siblingHeartbeat;
bool siblingProvisioned;
};

/**
* @brief Role determination error cases.
*/
enum class ErrorCase
{
noError,
internalError,
samePositions
};

/**
* @brief The role and the error reason returned from run()
*/
struct RoleInfo
{
Role role;
ErrorCase error;

// use the default <, ==, > operators for compares
auto operator<=>(const RoleInfo&) const = default;
};

/**
* @brief Determines if this BMC should claim the Active or Passive role.
*
* @param[in] input - The structure of inputs
*
* @return The role
* @return The role and error case
*/
Role run(const Input& input);
RoleInfo run(const Input& input);

} // namespace role_determination

Expand Down
85 changes: 81 additions & 4 deletions redundant-bmc/test/role_determination_test.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -8,13 +8,90 @@ using namespace role_determination;

TEST(RoleDeterminationTest, RoleDeterminationTest)
{
using enum ErrorCase;
using enum Role;

// BMC pos 0 with sibling healthy
{
Input input{.bmcPosition = 0,
.siblingPosition = 1,
.siblingRole = Unknown,
.siblingHeartbeat = true,
.siblingProvisioned = true};

RoleInfo info{Active, noError};
EXPECT_EQ(run(input), info);
}

// BMC pos 1 with sibling healthy
{
Input input{.bmcPosition = 1,
.siblingPosition = 0,
.siblingRole = Unknown,
.siblingHeartbeat = true,
.siblingProvisioned = true};

RoleInfo info{Passive, noError};
EXPECT_EQ(run(input), info);
}

// No Sibling heartbeat, BMC pos 1
{
Input input{.bmcPosition = 0};
EXPECT_EQ(run(input), Role::Active);
Input input{.bmcPosition = 1,
.siblingPosition = 0,
.siblingRole = Unknown,
.siblingHeartbeat = false,
.siblingProvisioned = true};

RoleInfo info{Active, noError};
EXPECT_EQ(run(input), info);
}

// Both BMCs report the same position
{
Input input{.bmcPosition = 1};
EXPECT_EQ(run(input), Role::Passive);
Input input{.bmcPosition = 0,
.siblingPosition = 0,
.siblingRole = Unknown,
.siblingHeartbeat = true,
.siblingProvisioned = true};

RoleInfo info{Passive, samePositions};
EXPECT_EQ(run(input), info);
}

// Sibling not provisioned
{
Input input{.bmcPosition = 1,
.siblingPosition = 0,
.siblingRole = Unknown,
.siblingHeartbeat = true,
.siblingProvisioned = false};

RoleInfo info{Active, noError};
EXPECT_EQ(run(input), info);
}

// Sibling already active, this pos = 0
{
Input input{.bmcPosition = 0,
.siblingPosition = 1,
.siblingRole = Active,
.siblingHeartbeat = true,
.siblingProvisioned = true};

RoleInfo info{Passive, noError};
EXPECT_EQ(run(input), info);
}

// Sibling already passive, this pos = 1
{
Input input{.bmcPosition = 1,
.siblingPosition = 0,
.siblingRole = Passive,
.siblingHeartbeat = true,
.siblingProvisioned = true};

RoleInfo info{Active, noError};
EXPECT_EQ(run(input), info);
}
}

0 comments on commit 0cd16c3

Please sign in to comment.