3.1.2.3. Single GPU
-
template<class T>
class chase::mpi::ChaseMpiDLACudaSeq : public chase::mpi::ChaseMpiDLAInterface<T>
A derived class of ChaseMpiDLAInterface that implements ChASE for shared-memory architectures, with selected computational tasks offloaded to a single GPU.
Public Functions
-
void preApplication(T *V, std::size_t locked, std::size_t block) override
Performs the pre-application steps for the distributed HEMM in ChASE. These steps may vary between implementations targeting different architectures; they can include backing up buffers, copying data from CPU to GPU, etc.
- Parameters
V1: a pointer to a matrix
locked: the number of locked (converged) eigenvectors
block: the number of non-locked (non-converged) eigenvectors
-
void apply(T alpha, T beta, std::size_t offset, std::size_t block, std::size_t locked) override
Performs \(V_2 \leftarrow \alpha V_1 H + \beta V_2\) and swaps \((V_1, V_2)\).
The first offset vectors of V1 and V2 are not part of the HEMM. The number of vectors processed in V1 and V2 is block.
In MATLAB notation, this operation performs: V2[:,start:end] <- alpha*V1[:,start:end]*H + beta*V2[:,start:end], in which start = locked + offset and end = locked + offset + block.
- Parameters
alpha: the scalar that multiplies V1*H in the HEMM operation.
beta: the scalar that multiplies V2 in the HEMM operation.
offset: the number of leading vectors skipped; the HEMM starts from this offset.
block: number of non-converged eigenvectors; it gives the number of vectors in V1 and V2 on which the HEMM is performed.
locked: number of converged eigenvectors.
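The index arithmetic of apply can be illustrated with a plain CPU reference routine. This is only a sketch under stated assumptions, not the GPU implementation of this class: the name hemm_reference is hypothetical, all matrices are assumed to be column-major, the vectors are assumed to be stored as columns of length n (so H is applied to each column from the left), and the column range is treated as half-open, start <= j < end.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical reference routine, not part of ChASE.
// For every column j with start <= j < end it computes
//   V2[:, j] <- alpha * H * V1[:, j] + beta * V2[:, j]
// and then swaps the two buffers, as described for apply().
// Assumptions: column-major storage, H is n x n.
template <class T>
void hemm_reference(std::size_t n, const std::vector<T>& H,
                    std::vector<T>& V1, std::vector<T>& V2,
                    T alpha, T beta,
                    std::size_t offset, std::size_t block, std::size_t locked)
{
    const std::size_t start = locked + offset;
    const std::size_t end = locked + offset + block;
    assert(H.size() == n * n);
    assert(V1.size() >= n * end && V2.size() >= n * end);

    for (std::size_t j = start; j < end; ++j) {
        for (std::size_t i = 0; i < n; ++i) {
            T acc = T(0);
            for (std::size_t k = 0; k < n; ++k)
                acc += H[i + k * n] * V1[k + j * n];  // (H * V1)(i, j)
            V2[i + j * n] = alpha * acc + beta * V2[i + j * n];
        }
    }
    std::swap(V1, V2);  // the updated vectors now live in V1
}
```

Columns before start are left untouched, which is how the locked (converged) vectors are excluded from the product in this sketch.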
-
bool postApplication(T *V, std::size_t block, std::size_t locked) override
Copies the rectangular matrix v1 from the buffer to v2. For the implementation of distributed-memory ChASE, this operation performs a copy from a matrix distributed within each column communicator and redundant among different column communicators to a matrix redundantly distributed across all MPI procs. Then in the next iteration of ChASE-MPI, this operation takes place in the row communicator…
- Parameters
V: the target buffer
block: number of columns to copy from v1 to v2
locked: number of converged eigenvectors.
-
void shiftMatrix(T c, bool isunshift = false) override
Shifts the diagonal of the global matrix by a constant value c.
- Parameters
c: shift value
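A diagonal shift can be sketched on the CPU as follows. The name shift_diagonal is hypothetical, the matrix is assumed to be a dense column-major n x n array, and the shift is assumed to be additive (H <- H + c*I). The isunshift flag is not described in this section, so the sketch does not model it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference routine, not part of ChASE.
// Adds c to every diagonal entry of the column-major n x n matrix H.
template <class T>
void shift_diagonal(std::vector<T>& H, std::size_t n, T c)
{
    assert(H.size() == n * n);
    for (std::size_t i = 0; i < n; ++i)
        H[i + i * n] += c;  // entry (i, i) in column-major storage
}
```

Under the additive assumption, calling the routine a second time with -c restores the original matrix up to rounding.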
-
void applyVec(T *B, T *C) override
Performs a generalized matrix-vector multiplication (GEMV) with alpha=1.0 and beta=0.0.
The operation is C=H*B.
- Parameters
B: the vector to be multiplied by H.
C: the vector that stores the product of H and B.
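With alpha=1.0 and beta=0.0 the GEMV reduces to a plain matrix-vector product, and the previous contents of C are ignored. A minimal CPU sketch, assuming a dense column-major n x n matrix H and a hypothetical helper name gemv_reference:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference routine, not part of ChASE.
// Computes C = H * B for a column-major n x n matrix H.
// B and C must each hold n elements and must not overlap.
template <class T>
void gemv_reference(std::size_t n, const std::vector<T>& H, const T* B, T* C)
{
    assert(H.size() == n * n);
    for (std::size_t i = 0; i < n; ++i) {
        T acc = T(0);
        for (std::size_t k = 0; k < n; ++k)
            acc += H[i + k * n] * B[k];
        C[i] = acc;  // beta = 0: the old value of C[i] is overwritten
    }
}
```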
-
void axpy(std::size_t N, T *alpha, T *x, std::size_t incx, T *y, std::size_t incy) override
A BLAS-like function which computes a constant times a vector plus a vector.
- Parameters
[in] N: number of elements in the input vector(s).
[in] alpha: the scalar that multiplies x in the AXPY operation.
[in] x: an array of type T, dimension ( 1 + ( N - 1 )*abs( incx ) ).
[in] incx: storage spacing between elements of x.
[in/out] y: an array of type T, dimension ( 1 + ( N - 1 )*abs( incy ) ).
[in] incy: storage spacing between elements of y.
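The role of the strides incx and incy is easiest to see in a scalar loop. The sketch below mirrors the argument order of axpy (including the scalar passed by pointer), but axpy_reference is a hypothetical CPU helper and not the routine this class dispatches to.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reference routine, not part of ChASE.
// Computes y[i * incy] <- (*alpha) * x[i * incx] + y[i * incy] for i < N.
template <class T>
void axpy_reference(std::size_t N, const T* alpha, const T* x,
                    std::size_t incx, T* y, std::size_t incy)
{
    assert(incx > 0 && incy > 0);
    for (std::size_t i = 0; i < N; ++i)
        y[i * incy] += (*alpha) * x[i * incx];
}
```

With incx = 2, for example, only every second element of x takes part, which is why x must hold 1 + (N - 1)*incx elements.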
-
void scal(std::size_t N, T *a, T *x, std::size_t incx) override
A BLAS-like function which scales a vector by a constant.
- Parameters
[in] N: number of elements in the input vector.
[in] a: a scalar of type T that multiplies the vector x.
[in/out] x: an array of type T, dimension ( 1 + ( N - 1 )*abs( incx ) ).
[in] incx: storage spacing between elements of x.
-
Base<T> nrm2(std::size_t n, T *x, std::size_t incx) override
A BLAS-like function which returns the Euclidean norm of a vector.
- Return
the Euclidean norm of vector x.
- Parameters
[in] n: number of elements in the input vector.
[in] x: an array of type T, dimension ( 1 + ( n - 1 )*abs( incx ) ).
[in] incx: storage spacing between elements of x.
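The return type Base<T> differs from T because the norm of a complex vector is a real number. The sketch below assumes that Base<T> denotes the underlying real type of T (for example double for std::complex<double>); the trait BaseType and the function nrm2_reference are hypothetical stand-ins written for this illustration, and the naive summation does not guard against overflow the way production BLAS routines do.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>

// Hypothetical stand-in for Base<T>: maps std::complex<R> to R and
// leaves real types unchanged.
template <class T> struct BaseType { using type = T; };
template <class R> struct BaseType<std::complex<R>> { using type = R; };

// Hypothetical reference routine, not part of ChASE.
// Returns sqrt(sum_i |x[i * incx]|^2) over n elements.
template <class T>
typename BaseType<T>::type nrm2_reference(std::size_t n, const T* x,
                                          std::size_t incx)
{
    assert(incx > 0);
    typename BaseType<T>::type sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += std::norm(x[i * incx]);  // squared magnitude of the element
    return std::sqrt(sum);
}
```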
-
T dot(std::size_t n, T *x, std::size_t incx, T *y, std::size_t incy) override
A BLAS-like function which forms the dot product of two vectors.
- Return
the dot product of vectors x and y.
- Parameters
[in] n: number of elements in the input vector(s).
[in] x: an array of type T, dimension ( 1 + ( n - 1 )*abs( incx ) ).
[in] incx: storage spacing between elements of x.
[in] y: an array of type T, dimension ( 1 + ( n - 1 )*abs( incy ) ).
[in] incy: storage spacing between elements of y.
-
void