3.1.3.3. multi-GPUs in node

template<class T> class chase::mpi::ChaseMpiDLAMultiGPU : public chase::mpi::ChaseMpiDLAInterface<T>

A derived class of ChaseMpiDLAInterface which implements the intra-node computation for a multi-GPU, MPI-based implementation of ChASE.

Public Functions

void preApplication(T *V, std::size_t locked, std::size_t block) override

void apply(T alpha, T beta, std::size_t offset, std::size_t block, std::size_t locked) override

bool postApplication(T *V, std::size_t block, std::size_t locked) override

void shiftMatrix(T c, bool isunshift = false) override

This function shifts the diagonal of the global matrix.

The global matrix is already distributed across the GPUs, so each GPU applies the shift only to its local block of the global matrix.
The operation is therefore naturally parallel across all MPI processes and across the GPUs within each process.
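
As an illustration of shifting only the locally owned part of the diagonal, the following is a minimal CPU-side sketch. It is not the ChASE implementation: the helper name `shift_local_diagonal`, the column-major layout, and the `row_off`/`col_off` parameters (the global position of the block's top-left entry) are all assumptions made for the example, and the `isunshift` flag is not modelled.

```cpp
#include <cstddef>

// Hypothetical helper, not part of ChASE: adds c to every entry of the
// global diagonal that lies inside one local block of a distributed matrix.
// Assumptions: the block is m x n, stored column-major with leading
// dimension m, and its top-left entry sits at global (row_off, col_off).
template <class T>
void shift_local_diagonal(T* block, std::size_t m, std::size_t n,
                          std::size_t row_off, std::size_t col_off, T c)
{
    for (std::size_t j = 0; j < n; ++j) {
        const std::size_t global_col = col_off + j;
        // A diagonal entry has global row == global column.
        if (global_col < row_off) continue;  // diagonal is above this block
        const std::size_t i = global_col - row_off;
        if (i >= m) continue;                // diagonal is below this block
        block[i + j * m] += c;
    }
}
```

Each block touches at most `min(m, n)` entries and needs no communication, which is why the shift can run independently on every process and device.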

void applyVec(T *B, T *C) override

All operations required by this function are already performed in ChaseMpiDLA::applyVec(), so the implementation in this class is empty.

void axpy(std::size_t N, T *alpha, T *x, std::size_t incx, T *y, std::size_t incy) override

An interface to BLAS ?axpy.

void scal(std::size_t N, T *a, T *x, std::size_t incx) override

An interface to BLAS ?scal.

Base<T> nrm2(std::size_t n, T *x, std::size_t incx) override

An interface to BLAS ?nrm2.

T dot(std::size_t n, T *x, std::size_t incx, T *y, std::size_t incy) override

An interface to BLAS ?dot.
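
For readers unfamiliar with the Level 1 BLAS routines that the four wrappers above forward to, the sketch below spells out their standard semantics as plain loops. The `ref_*` names are hypothetical and are not part of ChASE or of any BLAS library. The sketch assumes a real-valued `T`; for complex types, this section does not say whether `dot` conjugates its first argument, so that case is left out.

```cpp
#include <cmath>
#include <cstddef>

// Reference semantics only (hypothetical names, real-valued T assumed).

// axpy: y <- alpha * x + y
template <class T>
void ref_axpy(std::size_t n, const T* alpha, const T* x, std::size_t incx,
              T* y, std::size_t incy)
{
    for (std::size_t i = 0; i < n; ++i) y[i * incy] += (*alpha) * x[i * incx];
}

// scal: x <- a * x
template <class T>
void ref_scal(std::size_t n, const T* a, T* x, std::size_t incx)
{
    for (std::size_t i = 0; i < n; ++i) x[i * incx] *= *a;
}

// nrm2: Euclidean norm of x (no overflow-safe scaling in this sketch)
template <class T>
T ref_nrm2(std::size_t n, const T* x, std::size_t incx)
{
    T sum = T(0);
    for (std::size_t i = 0; i < n; ++i) sum += x[i * incx] * x[i * incx];
    return std::sqrt(sum);
}

// dot: sum over i of x[i] * y[i]
template <class T>
T ref_dot(std::size_t n, const T* x, std::size_t incx,
          const T* y, std::size_t incy)
{
    T sum = T(0);
    for (std::size_t i = 0; i < n; ++i) sum += x[i * incx] * y[i * incy];
    return sum;
}
```

As in the signatures above, the scalars for `axpy` and `scal` are passed by pointer, and `incx`/`incy` give the stride between consecutive vector elements.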