Your task is now to combine the MPI parallelization as described for the CPU-only code with the OpenMP offloading.
You can base your work on the hybrid MPI + OpenMP code and the previous work on offloading with a single GPU.
In order to achieve a working multi-GPU code, you should:
- Assign MPI tasks to devices
- Either
  - a) Copy the data between host and device before and after the MPI communication, or
  - b) Pass device pointers to the MPI routines (this requires a GPU-aware MPI library)
- Use OpenMP offload constructs in the `evolve` routine
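
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the reference solution: `device_for_rank` and `evolve_sketch` are hypothetical names, and the grid layout (row-major storage with one ghost layer on each side) and argument list are assumptions that may differ from the exercise code. The MPI calls are left out so that the block compiles on its own; the comments name the standard MPI and OpenMP routines that would typically surround these helpers. For option a), `#pragma omp target update from(...)` and `to(...)` around the halo exchange is one way to move the boundary data. For option b), a `#pragma omp target data use_device_ptr(...)` region makes the device address visible to the MPI call.

```c
#include <stddef.h>

/* Hypothetical helper: map a node-local MPI rank to a device id.
 * Uses round-robin assignment when there are more ranks than devices
 * on a node, and returns -1 if no device is available.
 *
 * In a real code, node_rank would typically come from MPI_Comm_rank on a
 * communicator created with MPI_Comm_split_type(..., MPI_COMM_TYPE_SHARED, ...),
 * num_devices from omp_get_num_devices(), and the result would be passed
 * to omp_set_default_device(). */
int device_for_rank(int node_rank, int num_devices)
{
    if (num_devices <= 0 || node_rank < 0) {
        return -1;
    }
    return node_rank % num_devices;
}

/* Hypothetical evolve routine: one explicit time step of the 2D heat
 * equation on an (nx + 2) x (ny + 2) grid stored row-major, where the
 * outermost layer holds ghost/boundary values.
 *
 * For simplicity the arrays are mapped on every call. A real code would
 * more likely keep them resident on the device with an enclosing
 * "target data" (or "target enter data") region and move only the halos. */
void evolve_sketch(const double *curr, double *next, int nx, int ny,
                   double a, double dt, double dx2, double dy2)
{
    size_t n = (size_t)(nx + 2) * (size_t)(ny + 2);
    int w = ny + 2;   /* row width including ghost columns */
    (void)n;          /* avoids an unused warning when built without OpenMP */

    #pragma omp target teams distribute parallel for collapse(2) \
        map(to: curr[0:n]) map(tofrom: next[0:n])
    for (int i = 1; i <= nx; i++) {
        for (int j = 1; j <= ny; j++) {
            int c = i * w + j;
            next[c] = curr[c] + a * dt * (
                (curr[c + w] - 2.0 * curr[c] + curr[c - w]) / dx2 +
                (curr[c + 1] - 2.0 * curr[c] + curr[c - 1]) / dy2);
        }
    }
}
```

When compiled without OpenMP support the pragma is ignored and the loop runs on the host, which can be handy for checking the stencil before enabling offloading.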