

Computing $s_i$, $l_{\cal I}$, $r_{\cal I}$ and ${\rm Dif}_u$, ${\rm Dif}_l$

To explain $s_i$, $l_{\cal I}$, $r_{\cal I}$, ${\rm Dif}_u$ and ${\rm Dif}_l$ in Table 4.7 and Table 4.8, we need to introduce a condition number for an individual eigenvalue, the block diagonalization of a matrix pair, and the separation of two matrix pairs.

Let $(\alpha_i, \beta_i) \neq (0, 0)$ be a simple eigenvalue of (A, B) with left and right eigenvectors $y_i$ and $x_i$, respectively. Then $s_i$ is the reciprocal condition number for a simple eigenvalue of (A, B) [95]:

\begin{displaymath}
s_i = \frac{\sqrt{\vert y^H_iAx_i\vert^2 + \vert y^H_iBx_i\vert^2}}{\Vert x_i\Vert _2\Vert y_i\Vert _2}.
\end{displaymath} (4.11)

Notice that $y_i^H A x_i / y_i^H B x_i$ is equal to $\lambda_i = \alpha_i / \beta_i$. The condition number $s_i$ in (4.11) is independent of the normalization of the eigenvectors. In the error bound of Table 4.7 for a simple eigenvalue and in (4.10), $s_i$ is returned as RCONDE(i) by xGGEVX (as S(i) by xTGSNA).
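For illustration only (this is not how xTGSNA computes the quantity internally), (4.11) can be evaluated directly from a left/right eigenvector pair. The sketch below assumes NumPy and small dense matrices; the function name is our own.

\begin{verbatim}
import numpy as np

def s_i(A, B, x, y):
    """Reciprocal condition number (4.11) of a simple eigenvalue of (A, B),
    given a right eigenvector x and a left eigenvector y; the value does not
    depend on how x and y are normalized."""
    yAx = y.conj() @ A @ x          # y^H A x
    yBx = y.conj() @ B @ x          # y^H B x
    return np.sqrt(abs(yAx)**2 + abs(yBx)**2) / (np.linalg.norm(x) * np.linalg.norm(y))
\end{verbatim}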

We assume that the matrix pair (A, B) is in generalized Schur form. Consider a cluster of m eigenvalues, counting multiplicities, and assume that the n-by-n matrix pair (A, B) is partitioned as

\begin{displaymath}
(A, B) = \left(
\left( \begin{array}{cc} A_{11} & A_{12} \\ 0 & A_{22} \end{array} \right) ,
\left( \begin{array}{cc} B_{11} & B_{12} \\ 0 & B_{22} \end{array} \right)
\right) ,
\end{displaymath} (4.12)

where the eigenvalues of the m-by-m matrix pair (A11, B11) are exactly those in which we are interested. In practice, if the eigenvalues on the (block) diagonal of (A, B) are not in the desired order, routine xTGEXC can be used to put the desired ones in the upper left corner as shown [73].
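As a rough illustration of this reordering step (under the assumption that SciPy's scipy.linalg.ordqz, which wraps the corresponding LAPACK functionality, is available), one might write the following; the selection rule 'iuc' (eigenvalues inside the unit circle) is only an example.

\begin{verbatim}
import numpy as np
from scipy.linalg import ordqz

A = np.array([[2.0, 1.0],
              [0.0, 0.5]])
B = np.eye(2)

# Compute a generalized Schur form of (A, B) with the selected eigenvalues
# (here: those inside the unit circle) moved to the upper left corner.
AA, BB, alpha, beta, Q, Z = ordqz(A, B, sort='iuc', output='real')
\end{verbatim}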

An equivalence transformation that block-diagonalizes (A, B) can be expressed as

\begin{displaymath}
\left( \begin{array}{cc} I_m & -L \\ 0 & I_{n - m} \end{array} \right)
(A, B)
\left( \begin{array}{cc} I_m & R \\ 0 & I_{n - m} \end{array} \right)
= \left(
\left( \begin{array}{cc} A_{11} & 0 \\ 0 & A_{22} \end{array} \right) ,
\left( \begin{array}{cc} B_{11} & 0 \\ 0 & B_{22} \end{array} \right)
\right).
\end{displaymath} (4.13)

Solving for (L,R) in (4.13) is equivalent to solving the system of linear equations
\begin{displaymath}
\begin{array}{c}
A_{11} R - L A_{22} = - A_{12} \\
B_{11} R - L B_{22} = - B_{12}
\end{array}.
\end{displaymath} (4.14)

Equation (4.14) is called a generalized Sylvester equation [71,75]. Given the generalized Schur form (4.12), we solve equation (4.14) for L and R using the subroutine xTGSYL.
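LAPACK solves (4.14) with xTGSYL. Purely as a small-scale sketch of what is being solved, one can form the Kronecker-product matrix of the Sylvester operator (this is the matrix $Z_u$ that appears in (4.18) below) and solve a dense linear system; this assumes NumPy and is practical only for small blocks.

\begin{verbatim}
import numpy as np

def solve_gen_sylvester(A11, A22, B11, B22, C, F):
    """Solve A11*R - L*A22 = C, B11*R - L*B22 = F for (L, R) by forming the
    Kronecker-product matrix of the operator explicitly (small blocks only;
    xTGSYL avoids this).  For the block diagonalization (4.13), take
    C = -A12 and F = -B12."""
    m, nm = A11.shape[0], A22.shape[0]
    Im, Inm = np.eye(m), np.eye(nm)
    Zu = np.block([[np.kron(Inm, A11), -np.kron(A22.T, Im)],
                   [np.kron(Inm, B11), -np.kron(B22.T, Im)]])
    rhs = np.concatenate([C.flatten('F'), F.flatten('F')])
    sol = np.linalg.solve(Zu, rhs)
    R = sol[:m*nm].reshape((m, nm), order='F')
    L = sol[m*nm:].reshape((m, nm), order='F')
    return L, R
\end{verbatim}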

$l_{\cal I}$ and $r_{\cal I}$ for the eigenvalues of (A11, B11) are defined as

\begin{displaymath}
l_{\cal I} = \frac{1}{\sqrt{1 + \Vert L\Vert _F^2}}, \quad
r_{\cal I} = \frac{1}{\sqrt{1 + \Vert R\Vert _F^2}}.
\end{displaymath} (4.15)

In the perturbation theory for the generalized eigenvalue problem, $p = l^{-1}_{\cal I}$ and $q = r^{-1}_{\cal I}$ play the same role as the norm of the spectral projector $\Vert P \Vert$ does for the standard eigenvalue problem in section 4.8.1.3. Indeed, if B = I, then p = q and p equals the norm of the projection onto an invariant subspace of A. For the generalized eigenvalue problem we need both a left and a right projection norm, since the left and right deflating subspaces are (usually) different. In Table 4.8, $l_i$ and $l_{\cal I}$ denote the reciprocals of the left projector norms corresponding to an individual eigenvalue pair $({\hat{\alpha}}_i, {\hat{\beta}}_i)$ and to a cluster of eigenvalues defined by the subset $\cal I$, respectively. Similar notation is used for $r_i$ and $r_{\cal I}$. The values of $l_{\cal I}$ and $r_{\cal I}$ are returned as RCONDE(1) and RCONDE(2) by xGGESX (as PL and PR by xTGSEN).
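Given the (L, R) that block-diagonalizes (A, B), formula (4.15) is straightforward to evaluate. The following sketch assumes NumPy (for example with (L, R) from the solve_gen_sylvester sketch above); it is not the code path used by xTGSEN, which computes PL and PR internally.

\begin{verbatim}
import numpy as np

def projector_weights(L, R):
    """l_I and r_I of (4.15) for the (L, R) that block-diagonalizes (A, B)
    as in (4.13)."""
    l = 1.0 / np.sqrt(1.0 + np.linalg.norm(L, 'fro')**2)
    r = 1.0 / np.sqrt(1.0 + np.linalg.norm(R, 'fro')**2)
    return l, r
\end{verbatim}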

The separation of two matrix pairs (A11, B11) and (A22, B22) is defined as the smallest singular value of the linear map in (4.14) which takes (L, R) to (A11 R - L A22, B11 R - L B22) [94]:

\begin{displaymath}
{\rm Dif}_u[(A_{11}, B_{11}),(A_{22}, B_{22})] =
\inf_{{\Vert (L, R) \Vert}_F = 1} {\Vert(A_{11} R - L A_{22}, B_{11} R - L B_{22})\Vert _F} .
\end{displaymath} (4.16)

${\rm Dif}_u$ is a generalization of the separation between two matrices ( ${\rm sep}(A_{11},A_{22})$ in (4.6)) to two matrix pairs, and it measures the separation of their spectra in the following sense. If (A11, B11) and (A22, B22) have a common eigenvalue, then ${\rm Dif}_u$ is zero, and it is small if there is a small perturbation of either (A11, B11) or (A22, B22) that makes them have a common eigenvalue.

Notice that ${\rm Dif}_u[(A_{22}, B_{22}),(A_{11}, B_{11})]$ does not generally equal ${\rm Dif}_u[(A_{11}, B_{11}),(A_{22}, B_{22})]$ (unless Aii and Bii are symmetric for i = 1, 2). Accordingly, the ordering of the arguments plays a role for the separation of two matrix pairs, while it does not for the separation of two matrices ( ${\rm sep}(A_{11}, A_{22}) = {\rm sep}(A_{22}, A_{11})$). Therefore, we introduce the notation

\begin{displaymath}
{\rm Dif}_l[(A_{11}, B_{11}),(A_{22}, B_{22})] =
{\rm Dif}_u[(A_{22}, B_{22}),(A_{11}, B_{11})] .
\end{displaymath} (4.17)

An associated generalized Sylvester operator (A22 R - L A11, B22 R - L B11) in the definition of ${\rm Dif}_l$ is obtained from block-diagonalizing a regular matrix pair in lower block triangular form, just as the operator (A11 R - L A22, B11 R - L B22) in the definition of ${\rm Dif}_u$ arises from block-diagonalizing a regular matrix pair (4.12) in upper block triangular form.

In the error bounds of Tables 4.7 and 4.8, ${\rm Dif}_l(i)$ and ${\rm Dif}_l({\cal I})$ denote ${\rm Dif}_l[(A_{11},
B_{11}),(A_{22}, B_{22})]$, where (A11, B11) corresponds to an individual eigenvalue pair $({\hat{\alpha}}_i, {\hat{\beta}}_i)$ and a cluster of eigenvalues defined by the subset $\cal I$, respectively. Similar notation is used for ${\rm Dif}_u(i)$ and ${\rm Dif}_u({\cal I})$. xGGESX reports estimates of ${\rm Dif}_u({\cal I})$ and ${\rm Dif}_l({\cal I})$ in RCONDV(1) and RCONDV(2) (DIF(1) and DIF(2) in xTGSEN), respectively.

From a matrix representation of (4.14) it is possible to formulate an exact expression for ${\rm Dif}_u$ as

\begin{displaymath}
{\rm Dif}_u[(A_{11}, B_{11}),(A_{22}, B_{22})]
= \sigma_{\min} (Z_u)
= {\rm Dif}_u,
\end{displaymath} (4.18)

where $Z_u$ is the 2m(n - m)-by-2m(n - m) matrix

\begin{displaymath}
Z_u = \left( \begin{array}{cc} I_{n-m} \otimes A_{11} & -A_{22}^T \otimes I_{m} \\
I_{n-m} \otimes B_{11} & -B_{22}^T \otimes I_{m} \end{array} \right) ,
\end{displaymath}

and $\otimes$ is the Kronecker product. A method based directly on forming $Z_u$ is generally impractical, since $Z_u$ can be as large as $n^2/2$-by-$n^2/2$. Thus we would require as much as $O(n^4)$ extra workspace and $O(n^6)$ operations, much more than the estimation methods that we now describe.
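For small blocks, where forming $Z_u$ is affordable, (4.18) can be evaluated directly. The NumPy sketch below (our own function name) is useful for checking the estimates discussed next, not as a production method. If (A11, B11) and (A22, B22) share an eigenvalue, the value returned is zero up to roundoff.

\begin{verbatim}
import numpy as np

def dif_u(A11, B11, A22, B22):
    """Dif_u[(A11,B11),(A22,B22)] = sigma_min(Z_u) as in (4.18), by forming
    Z_u explicitly (only sensible for small blocks)."""
    m, nm = A11.shape[0], A22.shape[0]
    Im, Inm = np.eye(m), np.eye(nm)
    Zu = np.block([[np.kron(Inm, A11), -np.kron(A22.T, Im)],
                   [np.kron(Inm, B11), -np.kron(B22.T, Im)]])
    return np.linalg.svd(Zu, compute_uv=False).min()
\end{verbatim}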

We instead compute an estimate of ${\rm Dif}_u$ as the reciprocal of an estimate of ${\rm Dif}_u^{-1} = {\Vert Z_u^{-1}\Vert}_2 = 1 / \sigma_{\min} (Z_u)$, where $Z_u$ is the matrix representation of the generalized Sylvester operator. It is possible to estimate ${\Vert Z_u^{-1}\Vert}_2$ by solving generalized Sylvester equations in triangular form. We provide both Frobenius norm and one norm ${\rm Dif}_u$ estimates [74]. The one norm estimate makes the condition estimation uniform with that for the nonsymmetric eigenvalue problem, but is a factor of 3 to 10 more expensive [74]; the Frobenius norm estimate is a low-cost and equally reliable alternative. From the definition of ${\rm Dif}_l$ (4.17) we see that ${\rm Dif}_l$ estimates can be computed using the same algorithms as for ${\rm Dif}_u$.

Frobenius norm estimate: From the $Z_u x = b$ representation of the generalized Sylvester equation (4.14) we get a lower bound on ${\rm Dif}_u^{-1}$:

\begin{displaymath}
{\Vert(L, R)\Vert}_F / {\Vert(C, F)\Vert}_F = {\Vert x\Vert}_2 / {\Vert b\Vert}_2 \leq {\Vert Z_u^{-1}\Vert}_2 .
\end{displaymath} (4.19)

To get an improved estimate we try to choose right hand sides (C, F) such that the associated solution (L, R) has as large a norm as possible, giving the ${\rm Dif}_u$ estimator
\begin{displaymath}
{\tt DIF(1)} \equiv {\Vert(C, F)\Vert}_F / {\Vert(L, R)\Vert}_F .
\end{displaymath} (4.20)

Methods for computing such (C, F) are described in [75,74]. The work to compute DIF(1) is comparable to solving one generalized Sylvester equation, which costs only $2m^2(n-m) + 2m(n-m)^2$ operations if the matrix pairs are in generalized Schur form. DIF(2) is the Frobenius norm ${\rm Dif}_l$ estimate.
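A minimal sketch of the estimator (4.20), assuming NumPy and reusing the solve_gen_sylvester sketch from above: it solves (4.14) for a caller-supplied right hand side (C, F) and returns the ratio in (4.20). LAPACK chooses (C, F) carefully so that (L, R) has large norm; a casually chosen right hand side gives a looser estimate, but by (4.19) the ratio can only overestimate ${\rm Dif}_u$.

\begin{verbatim}
import numpy as np

def dif_u_frobenius_estimate(A11, B11, A22, B22, C, F):
    """Estimate (4.20): solve the generalized Sylvester equation (4.14) for a
    given right hand side (C, F) and return ||(C,F)||_F / ||(L,R)||_F.
    By (4.19) this can only overestimate Dif_u."""
    L, R = solve_gen_sylvester(A11, A22, B11, B22, C, F)   # sketch above
    num = np.hypot(np.linalg.norm(C, 'fro'), np.linalg.norm(F, 'fro'))
    den = np.hypot(np.linalg.norm(L, 'fro'), np.linalg.norm(R, 'fro'))
    return num / den
\end{verbatim}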

One norm estimate: From the relationship

\begin{displaymath}
\frac{1}{\sqrt{2m(n-m)}} {\Vert Z_u^{-1}\Vert}_{1} \leq {\Vert Z_u^{-1}\Vert}_2 \leq \sqrt{2m(n-m)}
{\Vert Z_u^{-1}\Vert}_{1},
\end{displaymath} (4.21)

we know that ${\Vert Z_u^{-1}\Vert}_{1}$ can never differ by more than a factor of $\sqrt{2m(n -m)}$ from ${\Vert Z_u^{-1}\Vert}_2$, so it makes sense to compute a one norm estimate of ${\rm Dif}_u$. xLACON implements a method for estimating the one norm of a square matrix, using reverse communication for evaluating matrix-vector products [59,64]. We apply this method to estimate ${\Vert Z_u^{-1}\Vert}_{1}$ by providing the solution vectors x and y of $Z_u x = z$ and of the transposed system $Z_u^T y = z$, where z is determined by xLACON. In each step only one of these generalized Sylvester equations is solved, using blocked algorithms [74]. xLACON returns v and ${\tt EST}$ such that $Z_u^{-1} w = v$ and ${\tt EST}
= {\Vert v\Vert}_{1}/ {\Vert w\Vert}_{1} \leq {\Vert Z_u^{-1}\Vert}_{1}$, resulting in the one-norm-based estimate
\begin{displaymath}
{\tt DIF(1)} \equiv 1 / {\tt EST} .
\end{displaymath} (4.22)

The cost for computing this bound is roughly equal to the number of steps in the reverse communication times the cost for one generalized Sylvester solve. DIF(2) is the one norm ${\rm Dif}_l$ estimate.
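For small blocks the one-norm-based quantity in (4.22) can be checked directly against (4.21): form $Z_u$, compute ${\Vert Z_u^{-1}\Vert}_1$ exactly, and take its reciprocal. This NumPy sketch is for verification only; LAPACK never forms $Z_u^{-1}$, it estimates the norm via xLACON as described above.

\begin{verbatim}
import numpy as np

def dif_u_one_norm_small(A11, B11, A22, B22):
    """Reciprocal of ||Z_u^{-1}||_1, the quantity estimated in (4.22),
    computed exactly by forming Z_u and its inverse (small blocks only).
    By (4.21) it differs from Dif_u = sigma_min(Z_u) by at most a factor
    of sqrt(2m(n-m))."""
    m, nm = A11.shape[0], A22.shape[0]
    Im, Inm = np.eye(m), np.eye(nm)
    Zu = np.block([[np.kron(Inm, A11), -np.kron(A22.T, Im)],
                   [np.kron(Inm, B11), -np.kron(B22.T, Im)]])
    return 1.0 / np.linalg.norm(np.linalg.inv(Zu), 1)
\end{verbatim}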

The expert driver routines xGGEVX and xGGESX compute the Frobenius norm estimate (4.20). The routine xTGSNA also computes the Frobenius norm estimate (4.20) of ${\rm Dif}_u$ and ${\rm Dif}_l$. The routine xTGSEN optionally computes the Frobenius norm estimate (4.20) or the one norm estimate (4.22). The choice of estimate is controlled by the input parameter IJOB.

