Let A be a general real m-by-n matrix. The singular value decomposition (SVD) of A is the factorization , where U and V are orthogonal, and , , with . If A is complex, then its SVD is where U and V are unitary, and is as before with real diagonal elements. The are called the singular values, the first r columns of V the right singular vectors and the first r columns of U the left singular vectors.
The routines described in this section, and listed in Table 2.12, are used to compute this decomposition. The computation proceeds in the following stages:
The reduction to bidiagonal form is performed by the subroutine xGEBRD, or by xGBBRD for a band matrix.
The routine xGEBRD represents U1 and V1 in factored form as products of elementary reflectors, as described in section 5.4. If A is real, the matrices U1 and V1 may be computed explicitly using routine xORGBR, or multiplied by other matrices without forming U1 and V1 using routine xORMBR. If A is complex, one instead uses xUNGBR and xUNMBR, respectively.
If A is banded and xGBBRD is used to reduce it to bidiagonal form, U1 and V1 are determined as products of Givens rotations , rather than as products of elementary reflectors. If U1 or V1 is required, it must be formed explicitly by xGBBRD. xGBBRD uses a vectorizable algorithm, similar to that used by xSBTRD (see Kaufman [77]). xGBBRD may be much faster than xGEBRD when the bandwidth is narrow.
The SVD of the bidiagonal matrix is computed either by subroutine xBDSQR or by subroutine xBDSDC. xBDSQR uses QR iteration when singular vectors are desired [32,23], and otherwise uses the dqds algorithm [51]. xBDSQR is more accurate than its counterparts in LINPACK and EISPACK: barring underflow and overflow, it computes all the singular values of B to nearly full relative precision, independent of their magnitudes. It also computes the singular vectors much more accurately. See section 4.9 and [32,23,51] for details.
xBDSDC uses a variation of Cuppen's divide and conquer algorithm to find singular values and singular vectors [69,58]. It is much faster than xBDSQR if singular vectors of large matrices are desired. When singular values only are desired, it uses the same dqds algorithm as xBDSQR [51]. Divide-and-conquer is not guaranteed to compute singular values to nearly full relative precision, but in practice xBDSDC is often at least as accurate as xBDSQR. xBDSDC represents the singular vector matrices U2 and V2 in a compressed format requiring only space instead of n2. U2 and V2 may subsequently be generated explicitly using routine xLASDQ, or multiplied by vectors without forming them explicitly using routine xLASD0.
If , it may be more efficient to first perform a QR factorization of A, using the routine xGEQRF , and then to compute the SVD of the n-by-n matrix R, since if A = QR and , then the SVD of A is given by . Similarly, if , it may be more efficient to first perform an LQ factorization of A, using xGELQF. These preliminary QR and LQ factorizations are performed by the drivers xGESVD and xGESDD.
The SVD may be used to find a minimum norm solution to a (possibly)
rank-deficient linear least squares
problem (2.1). The effective rank, k, of A can be determined as the
number of singular values which exceed a suitable threshold.
Let
be the leading k-by-k submatrix of ,
and
be the matrix consisting of the first k columns of V.
Then the solution is given by: