-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
the sensors reading on plugable boards issue (valid sensors detection method proposal) #25
Comments
Here's my thoughts:
Expected behavior is that it should set all zones with missing sensors to failsafe mode, and continue loading. Zones without missing sensors should function normally. Zones with missing sensors should remain in failsafe mode. If all of the missing sensors appear at a later time, and the zone is no longer missing any sensors, the zone should come out of failsafe mode, and begin normal operation. I'm hoping the program receives notifications when new sensors are added, so that it doesn't have to be manually restarted in order to search for sensors again. If not, that's a bug.
If we simply filter out all missing sensors, then we run the risk of overheating the system if an important sensor is missing. We should set the system to failsafe mode, instead. Failsafe mode is designed to be used if any sensors are missing, or not reporting valid data. Some unimportant sensors may be designated as optional. Those, I'm OK with filtering them out in the updateSensors() loop, as you described. This designation would be a new feature, so that it would not impact existing sensors already in the field. Alternatively, a compile-time constant could be used, to change all sensors from mandatory to optional, defaulting to false. This would most likely be much easier to implement. By defaulting to false, it would not affect existing systems in the field, so it would not compromise their ability to go into failsafe mode when sensors are missing. |
I'm not sure if it's a bug or not. |
A pluggable board should trigger some auto-detection, perhaps using entity-manager JSON probe/exposes files, so that the PID loops are only instantiated when the board is plugged in. There is no easy solution for this case. Failsafe mode is bad, because there's no need to have the fans spin fast, to cool a board that isn't there. Ignoring missing sensors is bad, because the board could overheat if any sensors on it go bad and disappear. As for static JSON versus D-Bus, that is a strange behavior. It sounds like a bug to me. It would be great if we could address both of the code paths and clean up any differences in behavior like this, that we find. |
Isn't that essentially what
Sounds reminiscent of this... |
@zevweiss but the sensors we are talking about is the sensors on plugable board |
@Krellan , as you suggested , missing sensors should be devided into two types |
how did you get that result? |
When using phosphor-hwmon + statis JSON before, I added DIMM sensors to an input list. "inputs": [ But when one of dimm sensors in this list doesn't exist on dbus(didn't insert DIMM device on board), pid service will start to fail and restart. |
@harveyquta we use enity-manager +dbus-sensors with phosphor-pid-control , so it's the different story |
As we found that we have to define sensor reading path in config.json
Then buildSensors() function will search all the sensor reading paths defined in config.json
Once they are found , they are added to SensorManager (mgmr.addSensor())
(https://github.com/openbmc/phosphor-pid-control/blob/master/sensors/builder.cpp#L184)
Once pid service started , the added sensors reading are updated if the property changed
=>void DbusPidZone::updateSensors(void)
However, buildSensors() process will fail when any of the defined sensor reading paths is not found.
It will restart over and over
take following case as example : the sensors on a pluggable board
The Sensors path need to be predefined in config.json
But they wont exist , if the board is not plugged
It will let buildSensors() process repeat over and over
=>
(https://github.com/openbmc/phosphor-pid-control/blob/master/sensors/builder.cpp#L99)
We’d like propose a method called valid sensor detection
The main idea is to create validSensors map recording if the sensor reading path is found within a certain time,
Only the valid ones will be added to sensor manager
(https://github.com/openbmc/phosphor-pid-control/blob/master/sensors/builder.cpp#L184)
And this validSensors map will be passed on to buildZones() process
Because only the valid sensors can be added to thermal_inputs of each zone (we use this map to identify the valid sensors)
Through this way , only the valid sensors reading will be updated =>
=========================================
void DbusPidZone::updateSensors(void)
{
for (const auto& t : _thermalInputs) <== only the valid sensors in _thermalInputs
{
auto sensor = _mgr.getSensor(t); <=== only the valid sensors in mgr
ReadReturn r = sensor->read();
}
}
please share your ideas, we are open to further discussion
The text was updated successfully, but these errors were encountered: