Performance
class chase::ChasePerfData
ChASE class for collecting data related to FLOPs, timings, etc.
The ChasePerfData class collects and handles information related to the execution of the eigensolver. It collects information about:
Number of subspace iterations
Number of filtered vectors
Timings of each main algorithmic procedure (Lanczos, Filter, etc.)
Number of FLOPs executed
The number of iterations and filtered vectors can be used to monitor the behavior of the algorithm as it attempts to converge all the desired eigenpairs. The timings and the number of FLOPs are used to measure performance, especially parallel performance. The timings are stored in a vector of objects derived from the class template std::chrono::duration.
Public Functions
std::size_t get_iter_count()
Returns the total number of subspace iterations executed by ChASE.
The S in ChASE stands for Subspace iteration. The main engine under the hood of ChASE is a loop enveloping all the main routines executed by the code. Because of this structure, ChASE is a truly iterative algorithm based on subspace filtering. Counting the number of times this loop is repeated gives a measure of the effectiveness of the algorithm; the count is usually a non-linear function of the spectral distribution. For example, when using the flag approximate_ = 'true' to solve a sequence of eigenproblems, one can observe that the number of subspace iterations decreases as a function of the sequence index.
- Return
The total number of subspace iterations.
std::size_t get_filtered_vecs()
Returns the cumulative number of times each column vector is filtered by one degree.
The most computationally expensive routine of ChASE is the Chebyshev filter. Within the filter, a matrix of vectors V is filtered with a varying degree each time a subspace iteration is executed. This counter returns the total number of times each vector in V goes through a filtering step. For instance, when the flag optim_ = false, this number roughly corresponds to rank(V) x degree x iter_count. When optim_ is set to true, the calculation is considerably more complicated. Roughly speaking, this counter is useful to monitor the convergence ratio of the filtered vectors and, together with get_iter_count, conveys the effectiveness of the algorithm.
- Return
Cumulative number of filtered vectors.
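As a worked instance of the rough estimate rank(V) x degree x iter_count for the optim_ = false case, the following minimal sketch may help. The helper name estimate_filtered_vecs and the sample numbers are hypothetical and not part of ChASE; the sketch assumes every vector is filtered with the same degree in every subspace iteration.

```cpp
#include <cstddef>

// Hypothetical helper, not part of ChASE: rough estimate of the filtered-vector
// count, assuming all vectors are filtered with one fixed degree in every
// subspace iteration (the optim_ = false case described above).
constexpr std::size_t estimate_filtered_vecs(std::size_t num_vecs,
                                             std::size_t degree,
                                             std::size_t iter_count) {
    return num_vecs * degree * iter_count;
}

// Illustrative numbers only: 100 vectors, degree 20, 5 subspace iterations.
static_assert(estimate_filtered_vecs(100, 20, 5) == 10000,
              "100 vectors x degree 20 x 5 iterations");
```

Comparing such an estimate with the value reported by get_filtered_vecs gives a rough idea of how far a run departs from the fixed-degree case.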
std::size_t get_flops(std::size_t N)
Returns the total number of FLOPs executed by ChASE.
When measuring performance, it is fundamental to understand how many operations a routine executes relative to the total time to solution. This counter returns the total number of operations executed by ChASE and can be used to compute the performance of ChASE and compare it with the theoretical peak performance of the platform where the code is executed.
- Return
The total number of operations executed by ChASE.
- Parameters
N: Size of the eigenproblem matrix
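To illustrate how an operation count and a time-to-solution combine into a performance figure, here is a minimal sketch. The helpers gflops_per_second and fraction_of_peak are hypothetical and not part of ChASE; the operation count is assumed to come from a call such as get_flops(N), and the peak value must be supplied by the user for the platform at hand.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper, not part of ChASE: sustained rate in GFLOP/s given an
// operation count and the measured time-to-solution.
inline double gflops_per_second(std::size_t flops,
                                std::chrono::duration<double> elapsed) {
    assert(elapsed.count() > 0.0);  // a zero or negative time is meaningless
    return static_cast<double>(flops) / elapsed.count() / 1e9;
}

// Hypothetical helper: ratio of the sustained rate to the theoretical peak of
// the platform, both expressed in GFLOP/s.
inline double fraction_of_peak(double sustained_gflops, double peak_gflops) {
    assert(peak_gflops > 0.0);
    return sustained_gflops / peak_gflops;
}
```

For example, 4e9 operations completed in 2 seconds correspond to 2 GFLOP/s, which is one quarter of an 8 GFLOP/s peak.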
std::size_t get_filter_flops(std::size_t N)
Returns the total number of FLOPs of the Chebyshev filter.
Similar to get_flops, this counter returns the total number of operations executed by the Chebyshev filter alone. Since the filter executes, on average, 80% of the total FLOPs of ChASE, this counter is a good indicator of the performance of the entire algorithm. Because the filter executes almost exclusively BLAS-3 operations, this counter is useful to monitor how close the filter is to the peak performance of the platform where ChASE is executed. This can help fine-tune the use of computational resources.
- Return
The total number of operations executed by the polynomial filter.
- Parameters
N: Size of the eigenproblem matrix
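The share of the work done by the filter in a given run can be checked directly from the two counters. The sketch below uses a hypothetical helper, filter_share, that is not part of ChASE; its arguments are assumed to be the values returned by get_filter_flops(N) and get_flops(N).

```cpp
#include <cstddef>

// Hypothetical helper, not part of ChASE: fraction of the total operation
// count that is attributable to the Chebyshev filter.
inline double filter_share(std::size_t filter_flops, std::size_t total_flops) {
    if (total_flops == 0) {
        return 0.0;  // nothing was executed, so report an empty share
    }
    return static_cast<double>(filter_flops) /
           static_cast<double>(total_flops);
}
```

A result close to 0.8 would be in line with the average figure quoted above; individual runs may differ.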
void print(std::size_t N = 0)
Print function outputting counters and timings for all routines.
By default (for N = 0), it prints, in order:
size of the eigenproblem
total number of subspace iterations executed
total number of filtered vectors
time-to-solution of the following 6 main sections of the ChASE algorithm:
    Total time-to-solution
    Estimates of the spectral bounds based on Lanczos
    Chebyshev filter
    QR decomposition
    Rayleigh-Ritz procedure, including the solution of the reduced dense problem
    Computation of the eigenpair residuals
When the parameter N is set to a number other than zero, the function outputs the total FLOPs and the filter FLOPs, in that order.
- Parameters
N: Control parameter. By default equal to 0.
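The ordering of the printed fields can be sketched as follows. This is only an illustration: the struct PerfSummary, the function format_summary, and the " | " separator are hypothetical, and the actual layout produced by print may differ.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-in for the values reported by print(); not a ChASE type.
struct PerfSummary {
    std::size_t size;           // size of the eigenproblem
    std::size_t iterations;     // subspace iterations executed
    std::size_t filtered_vecs;  // cumulative number of filtered vectors
    double t_all;               // total time-to-solution, in seconds
    double t_lanczos;           // spectral bound estimates (Lanczos)
    double t_filter;            // Chebyshev filter
    double t_qr;                // QR decomposition
    double t_rr;                // Rayleigh-Ritz procedure
    double t_resid;             // residual computation
};

// Formats the fields in the order listed above, separated by " | ".
// The separator is an assumption made for this sketch.
inline std::string format_summary(const PerfSummary& s) {
    std::ostringstream os;
    os << s.size << " | " << s.iterations << " | " << s.filtered_vecs << " | "
       << s.t_all << " | " << s.t_lanczos << " | " << s.t_filter << " | "
       << s.t_qr << " | " << s.t_rr << " | " << s.t_resid;
    return os.str();
}
```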
template<class T>
class chase::PerformanceDecoratorChase : public chase::Chase<T>
A derived class used to extract performance and configuration data.
This class is derived from the Chase class, which serves as the interface to the kernels used by the library. All members of the Chase class are virtual functions, and they are re-implemented in the PerformanceDecoratorChase class. All derived members that provide an interface to computational kernels are re-implemented by decorating the original function with time pointers which are members of the ChasePerfData class. All derived members that provide an interface to input or output data are called without any specific decoration. In addition to the virtual members of the Chase class, the PerformanceDecoratorChase class also has among its public members a reference to an object of type ChasePerfData. When using Chase to solve an eigenvalue problem, the members of PerformanceDecoratorChase are called instead of the virtual function members of the Chase class. In this way, all parameters and counters are automatically invoked and returned in the correct order.
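The decoration idea can be sketched with simplified stand-ins. Everything below (Solver, PerfData, TimedSolver, CountingSolver) is hypothetical and much smaller than the real ChASE classes. The sketch assumes the decorator wraps another object behind the same virtual interface and accumulates timings as std::chrono::duration values.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical, simplified interface standing in for the kernel interface.
class Solver {
public:
    virtual ~Solver() = default;
    virtual void Filter() = 0;
};

// Hypothetical timing store: one accumulated duration per algorithmic section.
class PerfData {
public:
    using Clock = std::chrono::steady_clock;
    enum Section : std::size_t { kFilter = 0, kNumSections };

    void start() { start_ = Clock::now(); }
    void stop(Section s) { timings_[s] += Clock::now() - start_; }
    std::chrono::duration<double> time(Section s) const { return timings_[s]; }

private:
    Clock::time_point start_{};
    std::vector<std::chrono::duration<double>> timings_ =
        std::vector<std::chrono::duration<double>>(kNumSections);
};

// Decorator: same interface as Solver, but each kernel call is timed.
class TimedSolver : public Solver {
public:
    explicit TimedSolver(Solver& inner) : inner_(inner) {}

    void Filter() override {
        perf_.start();                  // start the clock
        inner_.Filter();                // forward to the wrapped kernel
        perf_.stop(PerfData::kFilter);  // accumulate the elapsed time
    }

    PerfData& GetPerfData() { return perf_; }

private:
    Solver& inner_;
    PerfData perf_;
};

// Trivial kernel used only to exercise the decorator.
class CountingSolver : public Solver {
public:
    void Filter() override { ++calls; }
    int calls = 0;
};
```

Calling Filter on a TimedSolver forwards to the wrapped object and records the elapsed time, so the calling code does not need any timing logic of its own.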
Public Functions
ChaseConfig<T> &GetConfig()
Returns a class which contains the configuration parameters.
ChasePerfData &GetPerfData()