lsd_mkb/lsd_mkb_superlu/superlu_seq/sgsisx.c File Reference

Computes an approximate solutions of linear equations A*X=B or A'*X=B. More...

#include "slu_sdefs.h"

Include dependency graph for sgsisx.c:

Functions
void	sgsisx (superlu_options_t options, SuperMatrix A, int perm_c, int perm_r, int etree, char equed, float R, float C, SuperMatrix L, SuperMatrix U, void work, int lwork, SuperMatrix B, SuperMatrix X, float recip_pivot_growth, float rcond, GlobalLU_t Glu, mem_usage_t mem_usage, SuperLUStat_t stat, int *info)

Detailed Description

Computes an approximate solutions of linear equations A*X=B or A'*X=B.

 -- SuperLU routine (version 4.2) --
 Lawrence Berkeley National Laboratory.
 November, 2010
 August, 2011

Function Documentation

void sgsisx	(	superlu_options_t *	options,
		SuperMatrix *	A,
		int *	perm_c,
		int *	perm_r,
		int *	etree,
		char *	equed,
		float *	R,
		float *	C,
		SuperMatrix *	L,
		SuperMatrix *	U,
		void *	work,
		int	lwork,
		SuperMatrix *	B,
		SuperMatrix *	X,
		float *	recip_pivot_growth,
		float *	rcond,
		GlobalLU_t *	Glu,
		mem_usage_t *	mem_usage,
		SuperLUStat_t *	stat,
		int *	info
	)

 Purpose
 =======

 SGSISX computes an approximate solutions of linear equations
 A*X=B or A'*X=B, using the ILU factorization from sgsitrf().
 An estimation of the condition number is provided. 
 The routine performs the following steps:

   1. If A is stored column-wise (A->Stype = SLU_NC):

	1.1. If options->Equil = YES or options->RowPerm = LargeDiag, scaling
	     factors are computed to equilibrate the system:
	     options->Trans = NOTRANS:
		 diag(R)*A*diag(C) *inv(diag(C))*X = diag(R)*B
	     options->Trans = TRANS:
		 (diag(R)*A*diag(C))**T *inv(diag(R))*X = diag(C)*B
	     options->Trans = CONJ:
		 (diag(R)*A*diag(C))**H *inv(diag(R))*X = diag(C)*B
	     Whether or not the system will be equilibrated depends on the
	     scaling of the matrix A, but if equilibration is used, A is
	     overwritten by diag(R)*A*diag(C) and B by diag(R)*B
	     (if options->Trans=NOTRANS) or diag(C)*B (if options->Trans
	     = TRANS or CONJ).

	1.2. Permute columns of A, forming A*Pc, where Pc is a permutation
	     matrix that usually preserves sparsity.
	     For more details of this step, see sp_preorder.c.

	1.3. If options->Fact != FACTORED, the LU decomposition is used to
	     factor the matrix A (after equilibration if options->Equil = YES)
	     as Pr*A*Pc = L*U, with Pr determined by partial pivoting.

	1.4. Compute the reciprocal pivot growth factor.

	1.5. If some U(i,i) = 0, so that U is exactly singular, then the
	     routine fills a small number on the diagonal entry, that is
		U(i,i) = ||A(:,i)||_oo * options->ILU_FillTol ** (1 - i / n),
	     and info will be increased by 1. The factored form of A is used
	     to estimate the condition number of the preconditioner. If the
	     reciprocal of the condition number is less than machine precision,
	     info = A->ncol+1 is returned as a warning, but the routine still
	     goes on to solve for X.

	1.6. The system of equations is solved for X using the factored form
	     of A.

	1.7. options->IterRefine is not used

	1.8. If equilibration was used, the matrix X is premultiplied by
	     diag(C) (if options->Trans = NOTRANS) or diag(R)
	     (if options->Trans = TRANS or CONJ) so that it solves the
	     original system before equilibration.

	1.9. options for ILU only
	     1) If options->RowPerm = LargeDiag, MC64 is used to scale and
		permute the matrix to an I-matrix, that is Pr*Dr*A*Dc has
		entries of modulus 1 on the diagonal and off-diagonal entries
		of modulus at most 1. If MC64 fails, dgsequ() is used to
		equilibrate the system.
              ( Default: LargeDiag )
	     2) options->ILU_DropTol = tau is the threshold for dropping.
		For L, it is used directly (for the whole row in a supernode);
		For U, ||A(:,i)||_oo * tau is used as the threshold
	        for the	i-th column.
		If a secondary dropping rule is required, tau will
	        also be used to compute the second threshold.
              ( Default: 1e-4 )
	     3) options->ILU_FillFactor = gamma, used as the initial guess
		of memory growth.
		If a secondary dropping rule is required, it will also
              be used as an upper bound of the memory.
              ( Default: 10 )
	     4) options->ILU_DropRule specifies the dropping rule.
		Option	      Meaning
		======	      ===========
		DROP_BASIC:   Basic dropping rule, supernodal based ILUTP(tau).
		DROP_PROWS:   Supernodal based ILUTP(p,tau), p = gamma*nnz(A)/n.
		DROP_COLUMN:  Variant of ILUTP(p,tau), for j-th column,
			      p = gamma * nnz(A(:,j)).
		DROP_AREA:    Variation of ILUTP, for j-th column, use
			      nnz(F(:,1:j)) / nnz(A(:,1:j)) to control memory.
		DROP_DYNAMIC: Modify the threshold tau during factorizaion:
			      If nnz(L(:,1:j)) / nnz(A(:,1:j)) > gamma
				  tau_L(j) := MIN(tau_0, tau_L(j-1) * 2);
			      Otherwise
				  tau_L(j) := MAX(tau_0, tau_L(j-1) / 2);
			      tau_U(j) uses the similar rule.
			      NOTE: the thresholds used by L and U are separate.
		DROP_INTERP:  Compute the second dropping threshold by
			      interpolation instead of sorting (default).
			      In this case, the actual fill ratio is not
			      guaranteed smaller than gamma.
		DROP_PROWS, DROP_COLUMN and DROP_AREA are mutually exclusive.
		( Default: DROP_BASIC | DROP_AREA )
	     5) options->ILU_Norm is the criterion of measuring the magnitude
		of a row in a supernode of L. ( Default is INF_NORM )
		options->ILU_Norm	RowSize(x[1:n])
		=================	===============
		ONE_NORM		||x||_1 / n
		TWO_NORM		||x||_2 / sqrt(n)
		INF_NORM		max{|x[i]|}
	     6) options->ILU_MILU specifies the type of MILU's variation.
		= SILU: do not perform Modified ILU;
		= SMILU_1 (not recommended):
		    U(i,i) := U(i,i) + sum(dropped entries);
		= SMILU_2:
		    U(i,i) := U(i,i) + SGN(U(i,i)) * sum(dropped entries);
		= SMILU_3:
		    U(i,i) := U(i,i) + SGN(U(i,i)) * sum(|dropped entries|);
		NOTE: Even SMILU_1 does not preserve the column sum because of
		late dropping.
              ( Default: SILU )
	     7) options->ILU_FillTol is used as the perturbation when
		encountering zero pivots. If some U(i,i) = 0, so that U is
		exactly singular, then
		   U(i,i) := ||A(:,i)|| * options->ILU_FillTol ** (1 - i / n).
              ( Default: 1e-2 )

   2. If A is stored row-wise (A->Stype = SLU_NR), apply the above algorithm
	to the transpose of A:

	2.1. If options->Equil = YES or options->RowPerm = LargeDiag, scaling
	     factors are computed to equilibrate the system:
	     options->Trans = NOTRANS:
		 diag(R)*A*diag(C) *inv(diag(C))*X = diag(R)*B
	     options->Trans = TRANS:
		 (diag(R)*A*diag(C))**T *inv(diag(R))*X = diag(C)*B
	     options->Trans = CONJ:
		 (diag(R)*A*diag(C))**H *inv(diag(R))*X = diag(C)*B
	     Whether or not the system will be equilibrated depends on the
	     scaling of the matrix A, but if equilibration is used, A' is
	     overwritten by diag(R)*A'*diag(C) and B by diag(R)*B
	     (if trans='N') or diag(C)*B (if trans = 'T' or 'C').

	2.2. Permute columns of transpose(A) (rows of A),
	     forming transpose(A)*Pc, where Pc is a permutation matrix that
	     usually preserves sparsity.
	     For more details of this step, see sp_preorder.c.

	2.3. If options->Fact != FACTORED, the LU decomposition is used to
	     factor the transpose(A) (after equilibration if
	     options->Fact = YES) as Pr*transpose(A)*Pc = L*U with the
	     permutation Pr determined by partial pivoting.

	2.4. Compute the reciprocal pivot growth factor.

	2.5. If some U(i,i) = 0, so that U is exactly singular, then the
	     routine fills a small number on the diagonal entry, that is
		 U(i,i) = ||A(:,i)||_oo * options->ILU_FillTol ** (1 - i / n).
	     And info will be increased by 1. The factored form of A is used
	     to estimate the condition number of the preconditioner. If the
	     reciprocal of the condition number is less than machine precision,
	     info = A->ncol+1 is returned as a warning, but the routine still
	     goes on to solve for X.

	2.6. The system of equations is solved for X using the factored form
	     of transpose(A).

	2.7. If options->IterRefine is not used.

	2.8. If equilibration was used, the matrix X is premultiplied by
	     diag(C) (if options->Trans = NOTRANS) or diag(R)
	     (if options->Trans = TRANS or CONJ) so that it solves the
	     original system before equilibration.

   See supermatrix.h for the definition of 'SuperMatrix' structure.

 Arguments
 =========

 options (input) superlu_options_t*
	   The structure defines the input parameters to control
	   how the LU decomposition will be performed and how the
	   system will be solved.

 A	   (input/output) SuperMatrix*
	   Matrix A in A*X=B, of dimension (A->nrow, A->ncol). The number
	   of the linear equations is A->nrow. Currently, the type of A can be:
	   Stype = SLU_NC or SLU_NR, Dtype = SLU_S, Mtype = SLU_GE.
	   In the future, more general A may be handled.

	   On entry, If options->Fact = FACTORED and equed is not 'N',
	   then A must have been equilibrated by the scaling factors in
	   R and/or C.
	   On exit, A is not modified
         if options->Equil = NO, or
         if options->Equil = YES but equed = 'N' on exit, or
         if options->RowPerm = NO.

	   Otherwise, if options->Equil = YES and equed is not 'N',
	   A is scaled as follows:
	   If A->Stype = SLU_NC:
	     equed = 'R':  A := diag(R) * A
	     equed = 'C':  A := A * diag(C)
	     equed = 'B':  A := diag(R) * A * diag(C).
	   If A->Stype = SLU_NR:
	     equed = 'R':  transpose(A) := diag(R) * transpose(A)
	     equed = 'C':  transpose(A) := transpose(A) * diag(C)
	     equed = 'B':  transpose(A) := diag(R) * transpose(A) * diag(C).

         If options->RowPerm = LargeDiag, MC64 is used to scale and permute
            the matrix to an I-matrix, that is A is modified as follows:
            P*Dr*A*Dc has entries of modulus 1 on the diagonal and 
            off-diagonal entries of modulus at most 1. P is a permutation
            obtained from MC64.
            If MC64 fails, sgsequ() is used to equilibrate the system,
            and A is scaled as above, but no permutation is involved.
            On exit, A is restored to the orginal row numbering, so
            Dr*A*Dc is returned.

 perm_c  (input/output) int*
	   If A->Stype = SLU_NC, Column permutation vector of size A->ncol,
	   which defines the permutation matrix Pc; perm_c[i] = j means
	   column i of A is in position j in A*Pc.
	   On exit, perm_c may be overwritten by the product of the input
	   perm_c and a permutation that postorders the elimination tree
	   of Pc'*A'*A*Pc; perm_c is not changed if the elimination tree
	   is already in postorder.

	   If A->Stype = SLU_NR, column permutation vector of size A->nrow,
	   which describes permutation of columns of transpose(A) 
	   (rows of A) as described above.

 perm_r  (input/output) int*
	   If A->Stype = SLU_NC, row permutation vector of size A->nrow, 
	   which defines the permutation matrix Pr, and is determined
	   by MC64 first then followed by partial pivoting.
         perm_r[i] = j means row i of A is in position j in Pr*A.

	   If A->Stype = SLU_NR, permutation vector of size A->ncol, which
	   determines permutation of rows of transpose(A)
	   (columns of A) as described above.

	   If options->Fact = SamePattern_SameRowPerm, the pivoting routine
	   will try to use the input perm_r, unless a certain threshold
	   criterion is violated. In that case, perm_r is overwritten by a
	   new permutation determined by partial pivoting or diagonal
	   threshold pivoting.
	   Otherwise, perm_r is output argument.

 etree   (input/output) int*,  dimension (A->ncol)
	   Elimination tree of Pc'*A'*A*Pc.
	   If options->Fact != FACTORED and options->Fact != DOFACT,
	   etree is an input argument, otherwise it is an output argument.
	   Note: etree is a vector of parent pointers for a forest whose
	   vertices are the integers 0 to A->ncol-1; etree[root]==A->ncol.

 equed   (input/output) char*
	   Specifies the form of equilibration that was done.
	   = 'N': No equilibration.
	   = 'R': Row equilibration, i.e., A was premultiplied by diag(R).
	   = 'C': Column equilibration, i.e., A was postmultiplied by diag(C).
	   = 'B': Both row and column equilibration, i.e., A was replaced 
		  by diag(R)*A*diag(C).
	   If options->Fact = FACTORED, equed is an input argument,
	   otherwise it is an output argument.

 R	   (input/output) float*, dimension (A->nrow)
	   The row scale factors for A or transpose(A).
	   If equed = 'R' or 'B', A (if A->Stype = SLU_NC) or transpose(A)
	       (if A->Stype = SLU_NR) is multiplied on the left by diag(R).
	   If equed = 'N' or 'C', R is not accessed.
	   If options->Fact = FACTORED, R is an input argument,
	       otherwise, R is output.
	   If options->Fact = FACTORED and equed = 'R' or 'B', each element
	       of R must be positive.

 C	   (input/output) float*, dimension (A->ncol)
	   The column scale factors for A or transpose(A).
	   If equed = 'C' or 'B', A (if A->Stype = SLU_NC) or transpose(A)
	       (if A->Stype = SLU_NR) is multiplied on the right by diag(C).
	   If equed = 'N' or 'R', C is not accessed.
	   If options->Fact = FACTORED, C is an input argument,
	       otherwise, C is output.
	   If options->Fact = FACTORED and equed = 'C' or 'B', each element
	       of C must be positive.

 L	   (output) SuperMatrix*
	   The factor L from the factorization
	       Pr*A*Pc=L*U		(if A->Stype SLU_= NC) or
	       Pr*transpose(A)*Pc=L*U	(if A->Stype = SLU_NR).
	   Uses compressed row subscripts storage for supernodes, i.e.,
	   L has types: Stype = SLU_SC, Dtype = SLU_S, Mtype = SLU_TRLU.

 U	   (output) SuperMatrix*
	   The factor U from the factorization
	       Pr*A*Pc=L*U		(if A->Stype = SLU_NC) or
	       Pr*transpose(A)*Pc=L*U	(if A->Stype = SLU_NR).
	   Uses column-wise storage scheme, i.e., U has types:
	   Stype = SLU_NC, Dtype = SLU_S, Mtype = SLU_TRU.

 work    (workspace/output) void*, size (lwork) (in bytes)
	   User supplied workspace, should be large enough
	   to hold data structures for factors L and U.
	   On exit, if fact is not 'F', L and U point to this array.

 lwork   (input) int
	   Specifies the size of work array in bytes.
	   = 0:  allocate space internally by system malloc;
	   > 0:  use user-supplied work array of length lwork in bytes,
		 returns error if space runs out.
	   = -1: the routine guesses the amount of space needed without
		 performing the factorization, and returns it in
		 mem_usage->total_needed; no other side effects.

	   See argument 'mem_usage' for memory usage statistics.

 B	   (input/output) SuperMatrix*
	   B has types: Stype = SLU_DN, Dtype = SLU_S, Mtype = SLU_GE.
	   On entry, the right hand side matrix.
	   If B->ncol = 0, only LU decomposition is performed, the triangular
			   solve is skipped.
	   On exit,
	      if equed = 'N', B is not modified; otherwise
	      if A->Stype = SLU_NC:
		 if options->Trans = NOTRANS and equed = 'R' or 'B',
		    B is overwritten by diag(R)*B;
		 if options->Trans = TRANS or CONJ and equed = 'C' of 'B',
		    B is overwritten by diag(C)*B;
	      if A->Stype = SLU_NR:
		 if options->Trans = NOTRANS and equed = 'C' or 'B',
		    B is overwritten by diag(C)*B;
		 if options->Trans = TRANS or CONJ and equed = 'R' of 'B',
		    B is overwritten by diag(R)*B.

 X	   (output) SuperMatrix*
	   X has types: Stype = SLU_DN, Dtype = SLU_S, Mtype = SLU_GE.
	   If info = 0 or info = A->ncol+1, X contains the solution matrix
	   to the original system of equations. Note that A and B are modified
	   on exit if equed is not 'N', and the solution to the equilibrated
	   system is inv(diag(C))*X if options->Trans = NOTRANS and
	   equed = 'C' or 'B', or inv(diag(R))*X if options->Trans = 'T' or 'C'
	   and equed = 'R' or 'B'.

 recip_pivot_growth (output) float*
	   The reciprocal pivot growth factor max_j( norm(A_j)/norm(U_j) ).
	   The infinity norm is used. If recip_pivot_growth is much less
	   than 1, the stability of the LU factorization could be poor.

 rcond   (output) float*
	   The estimate of the reciprocal condition number of the matrix A
	   after equilibration (if done). If rcond is less than the machine
	   precision (in particular, if rcond = 0), the matrix is singular
	   to working precision. This condition is indicated by a return
	   code of info > 0.

mem_usage (output) mem_usage_t* Record the memory usage statistics, consisting of following fields:

for_lu (float) The amount of space used in bytes for L data structures.
total_needed (float) The amount of space needed in bytes to perform factorization.
expansions (int) The number of memory expansions during the LU factorization.

 stat   (output) SuperLUStat_t*
	  Record the statistics on runtime and floating-point operation count.
	  See slu_util.h for the definition of 'SuperLUStat_t'.

 info    (output) int*
	   = 0: successful exit
	   < 0: if info = -i, the i-th argument had an illegal value
	   > 0: if info = i, and i is
		<= A->ncol: number of zero pivots. They are replaced by small
		      entries due to options->ILU_FillTol.
		= A->ncol+1: U is nonsingular, but RCOND is less than machine
		      precision, meaning that the matrix is singular to
		      working precision. Nevertheless, the solution and
		      error bounds are computed because there are a number
		      of situations where the computed solution can be more
		      accurate than the value of RCOND would suggest.
		> A->ncol+1: number of bytes allocated when memory allocation
		      failure occurred, plus A->ncol.

Here is the call graph for this function:

lsd_mkb/lsd_mkb_superlu/superlu_seq/sgsisx.c File Reference

Functions

Detailed Description

Function Documentation