3.9. GPU implementation

The principles for the GPU implementation are consistent with the general principles of DISPATCH as a code framework:

  • The implementation should be largely invisible to the solver code, requiring only a classification of array variables into to:, from:, tofrom:, and (most importantly) alloc:.
  • Parallelization is achieved by running a large number of tasks simultaneously on the device, analogous to how a number of tasks are run simultaneously on multi-core CPUs.
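
The two principles above can be illustrated with a minimal sketch. It assumes that the categories to:, from:, tofrom:, and alloc: correspond to the map types of OpenMP target directives, which is an assumption and is not stated in the text. The sketch is written in C for illustration only; the type task_t, the functions task_update and run_all_tasks, the array names, and the sizes are all hypothetical and are not taken from DISPATCH. Compiled without OpenMP support, the pragmas are ignored and the loops run serially on the host.

```c
#include <assert.h>
#include <stdlib.h>

enum { N_TASKS = 4, N_CELLS = 1024 };   /* hypothetical sizes */

/* Hypothetical task: each array is classified by how it moves to/from the device. */
typedef struct {
    double *old_state;  /* to:     read on the device, never copied back   */
    double *new_state;  /* from:   written on the device, copied back      */
    double *accum;      /* tofrom: copied in, updated, and copied back     */
    double *scratch;    /* alloc:  device-only temporary, never transferred */
    int n;
} task_t;

/* The solver loop itself is unchanged; only the map clauses are GPU-specific. */
static void task_update(task_t *t, double dt)
{
    const double *old_state = t->old_state;
    double *new_state = t->new_state;
    double *accum = t->accum;
    double *scratch = t->scratch;
    int n = t->n;

    #pragma omp target teams distribute parallel for \
        map(to: old_state[0:n]) map(from: new_state[0:n]) \
        map(tofrom: accum[0:n]) map(alloc: scratch[0:n])
    for (int i = 0; i < n; i++) {
        scratch[i] = 2.0 * old_state[i];              /* temporary result */
        new_state[i] = old_state[i] + dt * scratch[i];
        accum[i] += new_state[i];
    }
}

/* Run all tasks, one host thread per task, each launching its own device region. */
double run_all_tasks(double dt)
{
    task_t tasks[N_TASKS];
    for (int k = 0; k < N_TASKS; k++) {
        task_t *t = &tasks[k];
        t->n = N_CELLS;
        t->old_state = malloc(N_CELLS * sizeof(double));
        t->new_state = malloc(N_CELLS * sizeof(double));
        t->accum     = calloc(N_CELLS, sizeof(double));
        t->scratch   = malloc(N_CELLS * sizeof(double));
        assert(t->old_state && t->new_state && t->accum && t->scratch);
        for (int i = 0; i < N_CELLS; i++)
            t->old_state[i] = (double)(k + 1);
    }

    /* Many tasks in flight at once, as on a multi-core CPU. */
    #pragma omp parallel for
    for (int k = 0; k < N_TASKS; k++)
        task_update(&tasks[k], dt);

    double total = 0.0;
    for (int k = 0; k < N_TASKS; k++) {
        for (int i = 0; i < N_CELLS; i++)
            total += tasks[k].accum[i];
        free(tasks[k].old_state);
        free(tasks[k].new_state);
        free(tasks[k].accum);
        free(tasks[k].scratch);
    }
    return total;
}
```

In this sketch the alloc: arrays are the ones that never cross the host-device boundary, which is one plausible reason why that category matters most: temporaries classified this way incur no transfer cost at all.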

Details are given below.