build_ascent.sh failures #1379
@mlohry for the Cray compiler wrappers, we have to tell CMake that MPI will magically work and that it shouldn't look for it. I am working on a path that uses the Cray compiler wrappers as well as the MPICH MPI compiler wrappers for Frontier with the new modules. Can you confirm which modules would be ideal for your case? |
In the short term, really any CPU build with a working replay_mpi on Frontier would be of use to look at some large blueprint saves. The above error was from the default Frontier modules. The modules used for our solver case on Frontier are these:
so ideally those (gcc 13), if that prevents issues. Running with that set,
HDF5 hits a linker error during the build,
The (recently updated) Trying to minimize the modules a bit,
I still hit the original build error of |
Thanks for the details - try this branch: https://github.com/Alpine-DAV/ascent/tree/task/2024_09_frontier run: It's not the same set of modules you need for the integrated case, but I was able to run the Ascent MPI tests (I tried two ranks) successfully. Here are the modules that need to be loaded to run (from the top of the Frontier build script)
|
If you see BLT_SOURCE_DIR missing, that means you missed You can fix that with:
|
Looking again -- I think the MPI issue confused me -- seeing that |
@cyrush thanks that built, but back to the original issue I hit -- when I execute
That file exists and rank 0 sees it, but the MPI broadcast of the bool seems to leave the other ranks seeing false. Looks like that code is fairly recent: Are you able to successfully run |
Ok - sounds like a new bug & system MPI is ok. We did have a change, looking into it. |
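For readers following along, the pattern under discussion (rank 0 checks for the file, then broadcasts the result) can be sketched roughly as below. This is a minimal illustration and not the actual relay or replay.cpp code: the function name `file_exists_everywhere` and the macro `EXAMPLE_PARALLEL` are hypothetical. It broadcasts an `int` rather than a `bool` to sidestep datatype-size questions. If the preprocessor guard around the MPI calls does not match the macro the build really defines, the broadcast is silently compiled out.

```cpp
#include <fstream>
#include <string>

// EXAMPLE_PARALLEL is a hypothetical macro standing in for whatever
// the real build system defines for MPI-enabled builds.
#if defined(EXAMPLE_PARALLEL)
#include <mpi.h>
#endif

// Hypothetical helper: rank 0 checks the file, every rank gets the answer.
bool file_exists_everywhere(const std::string &path)
{
    int rank = 0;
#if defined(EXAMPLE_PARALLEL)
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
#endif

    // Only the root touches the file system.
    int exists = 0;
    if (rank == 0)
    {
        std::ifstream ifs(path);
        exists = ifs.good() ? 1 : 0;
    }

#if defined(EXAMPLE_PARALLEL)
    // Share the root's answer; without this call the other ranks keep 0.
    MPI_Bcast(&exists, 1, MPI_INT, 0, MPI_COMM_WORLD);
#endif

    return exists == 1;
}
```

Built without `EXAMPLE_PARALLEL`, the function degrades to a plain serial file check, which is why a mistyped guard produces no compiler diagnostics.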
@mlohry On the /task/2024_09_frontier branch -- I changed the actions checking logic in relay to match another implementation we have. Can you see if this resolves your issue? |
Sitting in queues, will let you know. What is |
The other code uses |
I was wondering where The latest branch looks like it might have worked, but the post I expected to take 10 minutes on 471 nodes timed out after 60 minutes and didn't produce any images, so I can't tell if it was hanging or not. I'll try it again on a smaller dataset. |
That is a great question. The ifdef was wrong, so |
ascent/src/utilities/replay/replay.cpp Line 235 in 239d4d6
missing
|
pushed fix -- checked compile and it worked. |
aside -- trying to build a past working version commit a5f51b,
this check ascent/scripts/build_ascent/build_ascent.sh Line 869 in ead8637
ascent-develop and ends up building that, not the checked out branch.
|
Overall, develop (or the frontier branch) should be the best -- but sorry for the bumps in the road with the recent replay bugs. We are planning to add extensive replay testing. Since the ifdef typos have happened twice now, we will need to think of a good way to protect against those errors, which the compiler doesn't help us with :-) |
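One common way to get the compiler to help with mistyped guards, offered here only as a sketch and not as what Ascent does or plans to do: always define the feature macro to 0 or 1 and test it with `#if` instead of `#ifdef`. A misspelled name then either trips an explicit `#error` check or, with GCC/Clang's `-Wundef` (optionally `-Werror=undef`), produces a diagnostic instead of silently selecting the wrong branch. The macro name `EXAMPLE_MPI_ENABLED` and the function `active_path` are hypothetical.

```cpp
#include <string>

// Normally this line would live in a generated config header and be
// set to 0 or 1 by the build system (hypothetical macro name).
#define EXAMPLE_MPI_ENABLED 0

// Fail loudly if the config header was not included at all.
#if !defined(EXAMPLE_MPI_ENABLED)
#error "EXAMPLE_MPI_ENABLED must be defined to 0 or 1"
#endif

// Value test instead of #ifdef: with -Wundef, a typo such as
// EXAMPLE_MPI_ENABLD is reported rather than quietly treated as 0.
#if EXAMPLE_MPI_ENABLED
constexpr bool k_use_mpi = true;
#else
constexpr bool k_use_mpi = false;
#endif

// Hypothetical helper that reports which code path was compiled in.
std::string active_path()
{
    return k_use_mpi ? "mpi" : "serial";
}
```

The trade-off is that every translation unit must see the config header, but that requirement is exactly what the `#error` check enforces.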
On the latest develop branch (e0100bf5), running
env enable_mpi=ON install_dir=/path/to/install build_jobs=10 ./scripts/build_ascent/build_ascent.sh
ends with the following error in the ascent configure step: