Assignment 3: Linear Algebra HPC libraries

 

 Krister Dackland, Erik Elmroth, Robert Granat and Bo Kågström


 
 
 
In this assignment you will practice and learn how to use the ScaLAPACK library.
You will develop a program that solves an overdetermined system by computing the least squares solution.
You will implement the solution in Fortran 77 (take some time to learn it), measure the performance, and check the correctness of the computed results.

Notice: The assignment will be carried out as a joint exercise together with one or two of the teachers on Monday, April 23. More information about the time and location will follow.

Remember to use the account TDBD08-VT07 in your submit file, and do not try to allocate more than 64 nodes in it.

Assignment 3.1: Introduction to Seth at HPC2N
Log on, compile, and run a ScaLAPACK program. If you already know how to do this, skip this section and go on to 3.2.

  • log on to Seth at HPC2N
  • make a directory (make sure you are at /kfs$HOME): mkdir ScaLAPACK
    • change directory: cd ScaLAPACK
    • download a makefile: cp ~granat/Public/ScaLAPACK/myfirst/Makefile .
    • download the source code: cp ~granat/Public/ScaLAPACK/myfirst/myfirst.f .
    • download the submit file (you might need to change it to suit your directories): cp ~granat/Public/ScaLAPACK/myfirst/submit .
  • make the executable: make
  • run the program: qsub submit
  • check the batch queue: showq | more
  • the result shows up in your PBS_O_WORKDIR (that is, the directory from which you submitted the job).

Assignment 3.2: Write a main program that initializes ScaLAPACK
Write a program, TryScaLAPACK (here is a template Fortran 77 code with make and submit files), that performs the following tasks:

  • initialize BLACS (see the BLACS QRef)
  • node 0 reads an input file containing three matrix dimensions (M, N, K) and a block size (NB)
  • node 0 distributes the information from the input file
  • all nodes initialize a descriptor for a matrix A of size MxN (M > N); use the ScaLAPACK routine DESCINIT
  • all nodes initialize a descriptor for a matrix B of size MxK (M > K); use the ScaLAPACK routine DESCINIT
  • all nodes generate their part of the distributed random matrices A and B; use the Fortran routine DRAND48():
      DOUBLE PRECISION DRAND48, X
      EXTERNAL         DRAND48
      X = DRAND48()
  • to compute the local number of rows/columns of a distributed matrix, use for example the ScaLAPACK tool routine NUMROC (numroc.f)
  • release the process grid and terminate BLACS
  • For further information about the descriptor and other related issues study this
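The local sizes needed for DESCINIT and for generating the local parts of A and B follow from the block-cyclic layout. The sketch below is a serial Python transcription of the arithmetic performed by NUMROC, meant only for checking your own numbers by hand; it is not part of the Fortran 77 solution, and the function name and argument order are chosen here to mirror the Fortran routine.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Number of rows (or columns) of an n-long dimension owned by process
    iproc, when blocks of size nb are dealt out cyclically over nprocs
    processes starting at process isrcproc."""
    # Distance of this process from the one holding the first block.
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb                    # number of complete blocks
    count = (nblocks // nprocs) * nb     # every process gets this many
    extrablks = nblocks % nprocs         # complete blocks left over
    if mydist < extrablks:
        count += nb                      # one more complete block
    elif mydist == extrablks:
        count += n % nb                  # the trailing partial block
    return count


# Example: 10 rows, block size 3, 2 process rows, first block on process 0.
print([numroc(10, 3, p, 0, 2) for p in range(2)])  # [6, 4]
```

Summing the result over all processes in a grid row or column should always give back the global dimension, which is a cheap sanity check before allocating local arrays.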

Assignment 3.3: Solve the overdetermined system AX = B
Assumptions: A is MxN, B is MxK, where K is the number of right hand sides.
Therefore, the requested solution X is NxK.

One way to solve an overdetermined system is to compute the least squares solution, that is, the X that minimizes ||AX - B|| (2-norm).

Perform the following steps to solve the overdetermined system:

  • QR factorize the A matrix: A = QR, where Q is an orthogonal matrix of size MxM and R is an upper-trapezoidal matrix of size MxN with non-zero elements only in the top NxN submatrix (use the ScaLAPACK Users' Guide (SLUG) or the list of double precision ScaLAPACK routines to find an appropriate QR factorization routine).
  • Apply Q' (the transpose of Q) to B (use PDORMQR).
  • Now you may use the PBLAS routine PDTRSM to compute X <- inv(R)*B, where R is the top NxN part of the factored A and B is the top NxK part of Q'B. Since X is NxK, it is stored in the top NxK part of B.
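The three steps above can be checked on a small problem with a serial sketch. The Python code below is not the ScaLAPACK solution and calls no library routine; it assumes A has full column rank, uses Householder reflections for the factorization, applies each reflection to B as it goes (which yields Q'B), and finishes with back substitution on the top NxN triangle. The function name is hypothetical.

```python
import math


def least_squares_qr(a, b):
    """Return the N x K matrix X minimizing ||A X - B|| for an M x N matrix A
    (M >= N, full column rank assumed) and an M x K matrix B."""
    m, n, k = len(a), len(a[0]), len(b[0])
    r = [row[:] for row in a]  # becomes R; the inputs are left untouched
    c = [row[:] for row in b]  # becomes Q'B
    for j in range(n):
        # Householder vector v that zeroes column j below the diagonal.
        norm = math.sqrt(sum(r[i][j] ** 2 for i in range(j, m)))
        if norm == 0.0:
            raise ValueError("A is rank deficient")
        alpha = -math.copysign(norm, r[j][j])
        v = [r[i][j] for i in range(j, m)]
        v[0] -= alpha
        vtv = sum(x * x for x in v)
        # Apply H = I - 2 v v' / (v'v) to the trailing columns of R
        # and to every column of C.
        for mat, cols in ((r, range(j, n)), (c, range(k))):
            for col in cols:
                s = sum(v[i - j] * mat[i][col] for i in range(j, m))
                f = 2.0 * s / vtv
                for i in range(j, m):
                    mat[i][col] -= f * v[i - j]
    # Back substitution: solve R(0:N, 0:N) X = C(0:N, 0:K).
    x = [[0.0] * k for _ in range(n)]
    for col in range(k):
        for i in range(n - 1, -1, -1):
            s = c[i][col] - sum(r[i][p] * x[p][col] for p in range(i + 1, n))
            x[i][col] = s / r[i][i]
    return x


# Fit a straight line through four points, with two right-hand sides (K = 2).
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
B = [[1.0, 0.0], [3.0, 1.0], [5.0, 1.0], [7.0, 3.0]]
X = least_squares_qr(A, B)
```

Note that only the top N rows of Q'B enter the triangular solve; the remaining M - N rows hold the part of B that cannot be matched by any X.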

Assignment 3.4: Write your own "ScaLAPACK" routine
Write a routine, PDFROB, that computes the Frobenius norm (F-norm) of a distributed matrix A.
Frobenius norm = sqrt(sum(abs(a(i,j))^2)), i = 1..M, j = 1..N

Assignment 3.5: Check result and measure performance
Use all building blocks from earlier assignments to:

  • measure the performance of the QR factorization routine in Mflops for different matrix and grid sizes (#flops = 4N^3/3 for a square NxN matrix; for a general MxN matrix with M >= N, Householder QR costs about 2N^2(M - N/3) flops),
  • use your PDFROB and the PBLAS routine PDGEMM to compute the norm of the residual AX - B (we use the F-norm since it is easy to compute),
  • report the performance and the residual norms for the different grid and problem sizes.
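The arithmetic behind the reported numbers can be sketched as follows. This Python fragment assumes the standard Householder QR operation count 2N^2(M - N/3), which reduces to the 4N^3/3 quoted above when M = N, and that the wall-clock time of the factorization has already been measured; the function names and the sample timing are hypothetical.

```python
import math


def qr_flops(m, n):
    """Approximate flop count of Householder QR on an M x N matrix (M >= N)."""
    return 2.0 * n * n * (m - n / 3.0)


def mflops(flops, seconds):
    """Convert a flop count and an elapsed time in seconds to Mflops."""
    return flops / seconds / 1.0e6


def residual_fnorm(a, x, b):
    """Serial reference for ||A X - B||_F, with matrices as lists of rows."""
    total = 0.0
    for i in range(len(a)):
        for col in range(len(b[0])):
            r = sum(a[i][p] * x[p][col] for p in range(len(x))) - b[i][col]
            total += r * r
    return math.sqrt(total)


# Example with a made-up timing: a 4000 x 2000 factorization taking 12.5 s.
rate = mflops(qr_flops(4000, 2000), 12.5)
```

When comparing grid sizes, it helps to report the rate per process as well as the total, since the total alone hides how well the run scales.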

Tips, Tricks and Links


HPC2N and Department of Computing Science


Umeå University, S-901 87 Umeå, Sweden
Email: larsk@cs.umu.se
Last updated 060314 by Lars Karlsson