Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Modified JEventProcessorPODIO::Process to acquire lock only when need… #1531

Draft
wants to merge 2 commits into
base: pr/JEventProcessorPODIO_avoid_fixing_collections
Choose a base branch
from
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 6 additions & 3 deletions src/services/io/podio/JEventProcessorPODIO.cc
Original file line number Diff line number Diff line change
Expand Up @@ -371,10 +371,13 @@ void JEventProcessorPODIO::FindCollectionsToWrite(const std::shared_ptr<const JE

void JEventProcessorPODIO::Process(const std::shared_ptr<const JEvent> &event) {

std::lock_guard<std::mutex> lock(m_mutex);
std::unique_lock<std::mutex> lock(m_mutex, std::defer_lock);
lock.lock();
if (m_is_first_event) {
FindCollectionsToWrite(event);
m_is_first_event = false;
}
lock.unlock();
Comment on lines +375 to +380
Copy link
Contributor

@wdconinc wdconinc Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly (m_is_first_event monotonically changing from true to false), the lock can be inside the if, avoiding a lock/unlock after the first event. I'd want to avoid an unnecessary lock since it potentially introduces a wait for the write operation.

The (new) race condition to avoid then is m_is_first_event changing from true to false while the condition is evaluated, so we'd have to do a second locked if inside the unlocked if, but it would virtually never be used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're describing the ABA problem. I think std::call_once is the tool for this purpose. (Just make sure that any code inside call_once is exception free because we've been bitten by bad standard library implementations before)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any code inside call_once is exception free

The code inside is an entire event processing in EICrecon :=)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explanation of double-checked locking and ABA problem for anyone who is interested: https://en.wikipedia.org/wiki/Double-checked_locking#Motivation_and_original_pattern

However, this is an optimization detail that I would avoid for now in favor of the simple, obvious solution. We'd like to verify:

  • Consistent results in the output file
  • A sensible cores vs throughput curve
  • No data races uncovered by valgrind/tsan/whatever

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any code inside call_once is exception free

The code inside is an entire event processing in EICrecon :=)

Nope, just FindCollectionsToWrite() thankfully :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see. At some point this ran through the whole first event to figure out which output names were activated, right? Or do I remember that wrong?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It did some wild things in the past for sure. Right now it just returns all available collection names (ignoring whether the corresponding factories have been run or not) and applies inclusions and exclusions.


for (const std::string& coll : m_collections_to_write) {
try {
Expand All @@ -395,9 +398,9 @@ void JEventProcessorPODIO::Process(const std::shared_ptr<const JEvent> &event) {
// Note that collections MUST be present in frame. If a collection is null, the writer will segfault.
const auto* frame = event->GetSingle<podio::Frame>();

lock.lock();
m_writer->writeFrame(*frame, "events", m_collections_to_write);
m_is_first_event = false;

lock.unlock();
// Print the contents of some collections, just for debugging purposes
// Do this before writing just in case writing crashes
if (!m_collections_to_print.empty()) {
Expand Down
Loading