ChaseMpiDLAInterface

template<class T>
class chase::mpi::ChaseMpiDLAInterface
A class that sets up an interface to all the Dense Linear Algebra (DLA) operations required by ChASE.

In the class ChaseMpiDLAInterface, the DLA functions are only declared as a series of virtual functions without direct implementation. The implementations are provided by a set of derived classes targeting different computing architectures. Currently, ChASE provides the following derived classes:
- chase::mpi::ChaseMpiDLABlaslapackSeq: implements ChASE for shared-memory architectures with only CPUs available.
- chase::mpi::ChaseMpiDLABlaslapackSeqInplace: implements ChASE for shared-memory architectures with only CPUs available, using an in-place mode in which the buffers of the rectangular matrices are swapped and reused. This reduces the amount of memory that must be allocated.
- chase::mpi::ChaseMpiDLACudaSeq: implements ChASE for shared-memory architectures, with most computation tasks offloaded to a single GPU.
- chase::mpi::ChaseMpiDLA: implements mostly the MPI collective communication part of distributed-memory ChASE, targeting systems with or without GPUs.
- chase::mpi::ChaseMpiDLABlaslapack: implements the inter-node computation for a pure-CPU MPI-based implementation of ChASE.
- chase::mpi::ChaseMpiDLAMultiGPU: implements the inter-node computation for a multi-GPU MPI-based implementation of ChASE.

Template Parameters
- T: the scalar type used for the application. ChASE is templated for real and complex numbers in both single and double precision, so T can be one of float, double, std::complex<float>, and std::complex<double>.
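The pattern described above, an abstract interface whose operations are supplied by architecture-specific derived classes, can be sketched as follows. The names DlaSketch, CpuSeqSketch, and scaled_copy are hypothetical and far simpler than the real ChASE classes; only the use of pure virtual functions and of the template parameter T mirrors the description.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for an abstract DLA interface.
// This is NOT the ChASE class; it only illustrates the pattern.
template <class T>
class DlaSketch {
public:
    virtual ~DlaSketch() = default;
    // Pure virtual: every backend must supply its own implementation.
    virtual void scal(std::size_t n, T a, T* x) = 0;
};

// Hypothetical sequential CPU backend.
template <class T>
class CpuSeqSketch : public DlaSketch<T> {
public:
    void scal(std::size_t n, T a, T* x) override {
        for (std::size_t i = 0; i < n; ++i) x[i] *= a;
    }
};

// Caller code depends only on the interface, not on a concrete backend.
template <class T>
std::vector<T> scaled_copy(DlaSketch<T>& dla, std::vector<T> v, T a) {
    dla.scal(v.size(), a, v.data());
    return v;
}
```

Under this design, supporting another architecture would presumably mean adding one more derived class while the calling code stays unchanged.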
 
 
Subclassed by chase::mpi::ChaseMpiDLA< T >, chase::mpi::ChaseMpiDLABlaslapack< T >, chase::mpi::ChaseMpiDLABlaslapackSeq< T >, chase::mpi::ChaseMpiDLABlaslapackSeqInplace< T >, chase::mpi::ChaseMpiDLACudaSeq< T >, chase::mpi::ChaseMpiDLAMultiGPU< T >

Public Functions
void shiftMatrix(T c, bool isunshift = false) = 0
This function shifts the diagonal of the global matrix by a constant value c.
Parameters
- c: shift value
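As a minimal sketch of a diagonal shift, the hypothetical helper below adds c to each diagonal entry of a dense matrix. It assumes a single, non-distributed n x n matrix in column-major storage, and it does not model the isunshift flag, whose semantics are not described here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: H <- H + c * I for an n x n column-major matrix.
template <class T>
void shift_diagonal(std::vector<T>& H, std::size_t n, T c) {
    for (std::size_t i = 0; i < n; ++i) {
        H[i + i * n] += c;  // entry (i, i) in column-major storage
    }
}
```

In this sketch, calling the helper again with -c restores the original diagonal.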
 
 
void preApplication(T *V1, std::size_t locked, std::size_t block) = 0
This function performs the pre-application steps for the distributed HEMM in ChASE. These steps may vary between implementations targeting different architectures; examples include backing up some buffers and copying data from the CPU to the GPU.
Parameters
- V1: a pointer to a matrix
- locked: an integer indicating the number of locked (converged) eigenvectors
- block: an integer indicating the number of non-locked (non-converged) eigenvectors
 
 
void apply(T alpha, T beta, std::size_t offset, std::size_t block, std::size_t locked) = 0
Performs \(V_2 \leftarrow \alpha V_1 H + \beta V_2\) and swap\((V_1, V_2)\).

The first offset vectors of V1 and V2 are not part of the HEMM. The number of vectors of V1 and V2 involved in the operation is block. In MATLAB notation, this operation performs:

V2[:,start:end] <- alpha*V1[:,start:end]*H + beta*V2[:,start:end],

in which start = locked + offset and end = locked + offset + block.
Parameters
- alpha: the scalar that multiplies V1*H in the HEMM operation.
- beta: the scalar that multiplies V2 in the HEMM operation.
- offset: the number of leading vectors skipped before the HEMM starts.
- block: the number of non-converged eigenvectors, which is the number of vectors in V1 and V2 on which the HEMM is performed.
- locked: number of converged eigenvectors.
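The column window used by this operation can be illustrated with a small sequential sketch. The function apply_sketch is hypothetical and rests on several assumptions that the text does not confirm: real-valued data, column-major storage with the vectors stored as columns, H applied from the left (equivalent up to conjugate transposition for Hermitian H), and end treated as exclusive so that exactly block columns are updated.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of the windowed update followed by the buffer swap.
// H is n x n; V1 and V2 are n x m; all matrices are column-major.
void apply_sketch(const std::vector<double>& H, std::vector<double>& V1,
                  std::vector<double>& V2, std::size_t n, double alpha,
                  double beta, std::size_t offset, std::size_t block,
                  std::size_t locked) {
    const std::size_t start = locked + offset;
    const std::size_t end = start + block;  // exclusive (assumption)
    for (std::size_t j = start; j < end; ++j) {
        for (std::size_t i = 0; i < n; ++i) {
            double acc = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                acc += H[i + k * n] * V1[k + j * n];
            }
            V2[i + j * n] = alpha * acc + beta * V2[i + j * n];
        }
    }
    V1.swap(V2);  // swap(V1, V2): the result now lives in V1
}
```

Columns outside the window are left untouched, so after the swap V1 holds the updated columns only inside [start, end).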
 
 
bool postApplication(T *V, std::size_t block, std::size_t locked) = 0
Copies the rectangular matrix buffer v1 to v2. In the distributed-memory implementation of ChASE, this operation copies a matrix that is distributed within each column communicator and redundant among different column communicators to a matrix that is redundantly distributed across all MPI processes. Then, in the next iteration of ChASE-MPI, this operation takes place in the row communicator…
Parameters
- V: the target buffer
- block: number of columns to copy from v1 to v2
- locked: number of converged eigenvectors.
 
 
void applyVec(T *B, T *C) = 0
Performs a general matrix-vector multiplication (GEMV) with alpha=1.0 and beta=0.0.

The operation is C = H*B.
Parameters
- B: the vector to be multiplied by H.
- C: the vector that stores the product of H and B.
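With alpha=1.0 and beta=0.0 the GEMV reduces to a plain matrix-vector product. The hypothetical function apply_vec_sketch below shows this for a non-distributed, column-major n x n matrix; unlike the interface function, it takes H and n explicitly, because this sketch has no class state to hold them.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: C <- H * B for an n x n column-major matrix H.
template <class T>
void apply_vec_sketch(const std::vector<T>& H, std::size_t n, const T* B, T* C) {
    for (std::size_t i = 0; i < n; ++i) {
        C[i] = T(0);  // beta = 0: previous contents of C are discarded
        for (std::size_t k = 0; k < n; ++k) {
            C[i] += H[i + k * n] * B[k];  // alpha = 1: no extra scaling
        }
    }
}
```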
 
 
void axpy(std::size_t N, T *alpha, T *x, std::size_t incx, T *y, std::size_t incy) = 0
A BLAS-like function that computes a constant times a vector plus a vector.
Parameters
- [in] N: number of elements in the input vector(s).
- [in] alpha: the scalar that multiplies x in the AXPY operation.
- [in] x: an array of type T, dimension ( 1 + ( N - 1 )*abs( incx ) ).
- [in] incx: storage spacing between elements of x.
- [in/out] y: an array of type T, dimension ( 1 + ( N - 1 )*abs( incy ) ).
- [in] incy: storage spacing between elements of y.
 
 
void scal(std::size_t N, T *a, T *x, std::size_t incx) = 0
A BLAS-like function that scales a vector by a constant.
Parameters
- [in] N: number of elements in the input vector(s).
- [in] a: the scalar of type T that multiplies vector x.
- [in/out] x: an array of type T, dimension ( 1 + ( N - 1 )*abs( incx ) ).
- [in] incx: storage spacing between elements of x.
 
 
Base<T> nrm2(std::size_t n, T *x, std::size_t incx) = 0
A BLAS-like function that returns the Euclidean norm of a vector.
Return
- the Euclidean norm of vector x.
Parameters
- [in] n: number of elements in the input vector(s).
- [in] x: an array of type T, dimension ( 1 + ( n - 1 )*abs( incx ) ).
- [in] incx: storage spacing between elements of x.
 
 
T dot(std::size_t n, T *x, std::size_t incx, T *y, std::size_t incy) = 0
A BLAS-like function that forms the dot product of two vectors.
Return
- the dot product of vectors x and y.
Parameters
- [in] n: number of elements in the input vector(s).
- [in] x: an array of type T, dimension ( 1 + ( n - 1 )*abs( incx ) ).
- [in] incx: storage spacing between elements of x.
- [in] y: an array of type T, dimension ( 1 + ( n - 1 )*abs( incy ) ).
- [in] incy: storage spacing between elements of y.