Need OMPI_MCA_osc=sm,pt2pt when using libyt #86

Open
1 task
cindytsai opened this issue Jun 8, 2023 · 0 comments
Labels
enhancement New feature or request


cindytsai commented Jun 8, 2023


When I was developing the features that use RMA (remote memory access), all I cared about was making them work on HPC systems, and I did not look into why this parameter is needed to run there. It is not needed on a single machine, e.g. my laptop.

TODO

  • What is OMPI_MCA_osc=sm,pt2pt, and why is it needed?

Problems

  • It is slow and isn't recommended.
    • Even so, in a strong scaling test run with this parameter, it was still faster than post-processing (just for reference).

When do we need this?

Attaching the same pointer multiple times

When I was testing a particle array with an example like this:

int temp[1] = {myrank};  /* one-element array holding this rank's id */
grids_local[index_local].particle_data[0][3].data_ptr = temp;

I get this error:

[xps:25522] *** An error occurred in MPI_Win_attach
[xps:25522] *** reported by process [3353411585,1]
[xps:25522] *** on win rdma window 3
[xps:25522] *** MPI_ERR_RMA_ATTACH: Could not attach RMA segment
[xps:25522] *** MPI_ERRORS_ARE_FATAL (processes in this win will now abort,
[xps:25522] ***    and potentially your MPI job)
[xps:25513] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[xps:25513] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

This is probably caused by attaching the same memory to the window more than once.
Strangely, though, the error goes away when running with:

OMPI_MCA_osc=sm,pt2pt mpirun -np 4 ./example

When running on Taiwania 3 and Eureka

The parameter needs to be set:

OMPI_MCA_osc=sm,pt2pt mpirun -np 4 ./example

Otherwise I get an error:

(something related to attaching...)
cindytsai added the enhancement (New feature or request) label Jun 8, 2023