Skip to content

Commit

Permalink
[BBPBGLIB-1139] Missing exception logging on configuration errors (#142)
Browse files Browse the repository at this point in the history
## Context

Sometimes, when there is an error (typo) in sonata config file, the run
was just failing silently, without any exception logged.

## Scope

There is a hack in commands.py, which synchronizes exception logging by
allowing only a single node to log the exception to stderr (otherwise,
production runs would be flooded by exception logs). It requires a file
to write all MPI ranks for all failed runs, and choosing the first rank
for reporting the exception.

This temp file was supposed to be deleted on startup, but if the
exception happens early enough (e.g. if there is a typo in the config
file), the temp file removal code was not reached, and it messed with
subsequent runs and their exception logging.

This attempt at leader election is a hack, to say the least, but for now
it should work with this fix. We might want to think how to implement
this properly later; temp files are fragile and non-atomic.

The code removing the temp file has been moved to earlier position in
the execution, to commands.py.

## Testing

It is hard to test for this in the multi-node environment. The change is
minor. We might want to write a unit test that checks if this file is
actually removed after starting commands.py, but does it make sense?

## Review
* [X] PR description is complete
* [X] Coding style (imports, function length, New functions, classes or
files) are good
* [?] Unit/Scientific test added
* [X] Updated Readme, in-code, developer documentation
  • Loading branch information
atemerev authored Mar 21, 2024
1 parent b978f40 commit 20be81f
Show file tree
Hide file tree
Showing 2 changed files with 5 additions and 5 deletions.
5 changes: 5 additions & 0 deletions neurodamus/commands.py
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,11 @@ def neurodamus(args=None):
# Warning control before starting the process
_filter_warnings()

# Some previous executions may have left a bad exception node file
# This is done now so it's a very early stage and we know the mpi rank
if MPI.rank == 0 and os.path.exists(EXCEPTION_NODE_FILENAME):
os.remove(EXCEPTION_NODE_FILENAME)

try:
Neurodamus(config_file, True, logging_level=log_level, **options).run()
except ConfigurationError as e: # Common, only show error in Rank 0
Expand Down
5 changes: 0 additions & 5 deletions neurodamus/core/_neurodamus.py
Original file line number Diff line number Diff line change
Expand Up @@ -44,11 +44,6 @@ def _init(cls, **kwargs):
setup_logging(GlobalConfig.verbosity, log_name, MPI.rank)
log_stage("Initializing Neurodamus... Logfile: " + log_name)

# Some previous executions may have left a bad exception node file
# This is done now so it's a very early stage and we know the mpi rank
if MPI.rank == 0 and os.path.exists(EXCEPTION_NODE_FILENAME):
os.remove(EXCEPTION_NODE_FILENAME)

# Load mods if not available
cls._load_nrnmechlibs()
log_verbose("Mechanisms (mod) library(s) successfully loaded")
Expand Down

0 comments on commit 20be81f

Please sign in to comment.