2. Institute of Automation, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, P. R. China
This paper is concerned with the solution of the least-squares problem
$ \mathop {\min }\limits_{\mathit{\boldsymbol{x}} \in {{\bf{R}}^n}} \left\| {\mathit{\boldsymbol{Ax}} - \mathit{\boldsymbol{b}}} \right\|\;\;\;\mathit{\boldsymbol{A}} \in {{\bf{R}}^{n \times n}},\mathit{\boldsymbol{x}} \in {{\bf{R}}^n},b \in {{\bf{R}}^n} $ | (1) |
with a large square matrix A of ill-determined rank. In particular, such a matrix is severely ill-conditioned and may be singular; its singular values decay to zero gradually, without an obvious gap. The vector b represents the available data, which is usually contaminated by a discretization or measurement error e∈Rn, i.e.
$ \mathit{\boldsymbol{b}} = \mathit{\boldsymbol{\hat b}} + \mathit{\boldsymbol{e}} $ | (2) |
where
$ \mathit{\boldsymbol{Ax}} = \mathit{\boldsymbol{\hat b}} $ | (3) |
is consistent and
In view of the ill-conditioning of A and the error e in b, a straightforward solution generally yields a meaningless approximation, so it is essential to stabilize the computation by regularization. Tikhonov regularization is one of the most popular regularization methods owing to its properties and wide applicability. Based on Tikhonov regularization, we consider the penalized least-squares problem
$ \mathop {\min }\limits_{\mathit{\boldsymbol{x}} \in {{\bf{R}}^n}} \left\{ {{{\left\| {\mathit{\boldsymbol{Ax}} - \mathit{\boldsymbol{b}}} \right\|}^2} + {\mu ^{ - 1}}{{\left\| {\mathit{\boldsymbol{Lx}}} \right\|}^2}} \right\} $ | (4) |
where the scalar μ>0 is referred to as the regularization parameter and the matrix L∈Rl×n is the regularization operator[2-3]. The method of this paper requires L to be a square matrix. Calvetti et al.[4] and Hansen et al.[5] described a variety of square regularization operators. For the purpose of obtaining an accurate approximate solution of Eq.(1), we assume that
$ N\left( \mathit{\boldsymbol{A}} \right) \cap N\left( \mathit{\boldsymbol{L}} \right) = \left\{ 0 \right\} $ |
Then the Tikhonov minimization problem (4) has the unique solution
$ {\mathit{\boldsymbol{x}}_\mu }: = {\left( {{\mathit{\boldsymbol{A}}^{\rm{T}}}\mathit{\boldsymbol{A}} + {\mu ^{ - 1}}{\mathit{\boldsymbol{L}}^{\rm{T}}}\mathit{\boldsymbol{L}}} \right)^{ - 1}}{\mathit{\boldsymbol{A}}^{\rm{T}}}\mathit{\boldsymbol{b}} $ | (5) |
for any μ>0, where the superscript "T" denotes transposition of the matrix[6].
This paper solves the minimization problem (4) by simplifying it to standard form and by using a fractional power of the matrix
In this section, we discuss a method which combines fractional matrices with orthogonal projection operators. Projection fractional Tikhonov regularization simplifies the penalized least-squares problem (4) to standard form and uses a fractional power as a weighting matrix to measure the residual error of the standard form in a semi-norm.
1.1 Form Simplification
The penalized least-squares problem (4) can be simplified to standard form with the orthogonal projection
$ \mathit{\boldsymbol{L}}: = \mathit{\boldsymbol{I}} - \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{P}}^{\rm{T}}}\;\;\;\;\mathit{\boldsymbol{P}} \in {{\bf{R}}^{n \times l}},{\mathit{\boldsymbol{P}}^{\rm{T}}}\mathit{\boldsymbol{P}} = \mathit{\boldsymbol{I}} $ | (6) |
which is well suited for use in Tikhonov regularization. In Eq.(6), L is used as the regularization operator. It is convenient to consider the relation between the choice of the matrix L and that of the matrix P; in fact, the choice of P determines the choice of L. Moreover, the choice of P can be carried out in many different ways, some of which yield regularization operators that give more accurate approximations of the desired solution.
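The construction in Eq.(6) is easy to verify numerically. The following NumPy sketch (illustrative only; the paper's own experiments use MATLAB) builds L from the rank-one P used later in the numerical examples and checks that it is an orthogonal projector:

```python
import numpy as np

n = 6
# P spans the constant vector, as in the numerical examples: P = n^{-1/2}[1,...,1]^T
P = np.ones((n, 1)) / np.sqrt(n)

# Orthogonal projection regularization operator, Eq.(6): L = I - P P^T
L = np.eye(n) - P @ P.T

# L is symmetric and idempotent (an orthogonal projector) ...
assert np.allclose(L, L.T)
assert np.allclose(L @ L, L)
# ... and annihilates exactly the columns of P: N(L) = range(P)
assert np.allclose(L @ P, 0)
```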
The A-weighted pseudo-inverse of L is given by
$ \mathit{\boldsymbol{L}}_A^\dagger : = \left( {\mathit{\boldsymbol{I}} - {{\left( {\mathit{\boldsymbol{A}}\left( {\mathit{\boldsymbol{I}} - {\mathit{\boldsymbol{L}}^\dagger }\mathit{\boldsymbol{L}}} \right)} \right)}^\dagger }\mathit{\boldsymbol{A}}} \right){\mathit{\boldsymbol{L}}^\dagger } \in {{\bf{R}}^{n \times l}} $ | (7) |
where L†∈Rn×l denotes the Moore-Penrose pseudoinverse of the regularization operator L, and I is the identity matrix.
Suppose that Eq.(6) holds and introduce the QR-factorization shown as
$ \mathit{\boldsymbol{AP}} = \mathit{\boldsymbol{QR}} $ | (8) |
where R∈Rl×l is upper triangular and Q∈Rn×l has orthonormal columns. Using the properties of the Moore-Penrose pseudo-inverse and of orthogonal projections, we have the following identities for L
$ \mathit{\boldsymbol{I}} - {\mathit{\boldsymbol{L}}^\dagger }\mathit{\boldsymbol{L}} = \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{P}}^{\rm{T}}},{\mathit{\boldsymbol{L}}^\dagger } = \mathit{\boldsymbol{L}} $ | (9) |
It follows that
$ {\left( {\mathit{\boldsymbol{A}}\left( {\mathit{\boldsymbol{I}} - {\mathit{\boldsymbol{L}}^\dagger }\mathit{\boldsymbol{L}}} \right)} \right)^\dagger } = {\left( {\mathit{\boldsymbol{AP}}{\mathit{\boldsymbol{P}}^{\rm{T}}}} \right)^\dagger } = \mathit{\boldsymbol{P}}{\left( {\mathit{\boldsymbol{AP}}} \right)^\dagger } = \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{R}}^{ - 1}}{\mathit{\boldsymbol{Q}}^{\rm{T}}} $ | (10) |
Substituting Eqs.(8), (10) into Eq.(7), we get
$ \mathit{\boldsymbol{L}}_A^\dagger = \left( {\mathit{\boldsymbol{I}} - \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{R}}^{ - 1}}{\mathit{\boldsymbol{Q}}^{\rm{T}}}\mathit{\boldsymbol{A}}} \right)\mathit{\boldsymbol{L}} $ | (11) |
which simplifies to
$ \mathit{\boldsymbol{L}}_A^\dagger = \mathit{\boldsymbol{I}} - \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{R}}^{ - 1}}{\mathit{\boldsymbol{Q}}^{\rm{T}}}\mathit{\boldsymbol{A}} $ | (12) |
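Eqs.(8)-(12) can be checked on a small random example. In the hedged NumPy sketch below, the matrix sizes and the random P are illustrative assumptions, not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, l = 6, 2
A = rng.standard_normal((n, n))

# Orthonormal P and the projector L = I - P P^T, Eq.(6)
P, _ = np.linalg.qr(rng.standard_normal((n, l)))
L = np.eye(n) - P @ P.T

# QR-factorization of AP, Eq.(8)
Q, R = np.linalg.qr(A @ P)

# A-weighted pseudo-inverse, simplified form of Eq.(12)
LA = np.eye(n) - P @ np.linalg.solve(R, Q.T @ A)

# Agreement with the unsimplified definition, Eq.(7)
LA_def = (np.eye(n)
          - np.linalg.pinv(A @ (np.eye(n) - np.linalg.pinv(L) @ L)) @ A) @ np.linalg.pinv(L)
assert np.allclose(LA, LA_def)
```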
We transform the matrix and vectors of the Tikhonov minimization problem (4) by the following substitutions
$ \mathit{\boldsymbol{\tilde A}}: = \mathit{\boldsymbol{AL}}_A^\dagger $ | (13) |
$ \mathit{\boldsymbol{\tilde b}}: = \mathit{\boldsymbol{b}} - \mathit{\boldsymbol{A}}{\mathit{\boldsymbol{x}}_0} $ | (14) |
where
$ {\mathit{\boldsymbol{x}}_0}: = {\left( {\mathit{\boldsymbol{A}}\left( {\mathit{\boldsymbol{I}} - {\mathit{\boldsymbol{L}}^\dagger }\mathit{\boldsymbol{L}}} \right)} \right)^\dagger }\mathit{\boldsymbol{b}} $ | (15) |
When L is an orthogonal projection operator, Eqs.(13) and (14) can be expressed simply as
$ \mathit{\boldsymbol{\tilde A}}: = \mathit{\boldsymbol{AL}}_A^\dagger = \left( {\mathit{\boldsymbol{I}} - \mathit{\boldsymbol{Q}}{\mathit{\boldsymbol{Q}}^{\rm{T}}}} \right)\mathit{\boldsymbol{A}} $ | (16) |
$ \mathit{\boldsymbol{\tilde b}}: = \mathit{\boldsymbol{b}} - \mathit{\boldsymbol{A}}{\mathit{\boldsymbol{x}}_0} = \left( {\mathit{\boldsymbol{I}} - \mathit{\boldsymbol{Q}}{\mathit{\boldsymbol{Q}}^{\rm{T}}}} \right)\mathit{\boldsymbol{b}} $ | (17) |
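The simplified expressions (16) and (17) agree with the definitions (13)-(15), as the following NumPy sketch confirms on a small random problem (sizes and data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, l = 6, 2
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
P, _ = np.linalg.qr(rng.standard_normal((n, l)))
Q, R = np.linalg.qr(A @ P)
LA = np.eye(n) - P @ np.linalg.solve(R, Q.T @ A)   # Eq.(12)

# Standard-form quantities, Eqs.(13)-(15); by Eq.(10), x0 = P R^{-1} Q^T b
x0 = P @ np.linalg.solve(R, Q.T @ b)
A_t = A @ LA
b_t = b - A @ x0

# Simplified expressions, Eqs.(16)-(17)
Proj = np.eye(n) - Q @ Q.T
assert np.allclose(A_t, Proj @ A)
assert np.allclose(b_t, Proj @ b)
```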
Then the problem (4) reduces to the Tikhonov minimization problem in standard form
$ \mathop {\min }\limits_{\mathit{\boldsymbol{\tilde x}} \in {{\bf{R}}^l}} \left\{ {{{\left\| {\mathit{\boldsymbol{\tilde A\tilde x}} - \mathit{\boldsymbol{\tilde b}}} \right\|}^2} + {\mu ^{ - 1}}{{\left\| {\mathit{\boldsymbol{\tilde x}}} \right\|}^2}} \right\} $ | (18) |
An attractive property of this transformation is that the LA† defined by Eq.(7) has a simple form, which makes the orthogonal projection (6) easy to use. For any μ>0, let λ=1/μ; then the minimum value of Eq.(18) is
$ F\left( \lambda \right): = {\left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right) - \mathit{\boldsymbol{\tilde b}}} \right\|^2} + \lambda {\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|^2} = \min {J_\lambda }\left( {\mathit{\boldsymbol{\tilde x}}} \right) $ | (19) |
Given any λ>0, the solution x̃(λ) is uniquely determined and satisfies
$ \lambda \mathit{\boldsymbol{\tilde x}}\left( \lambda \right) + {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right) = {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} $ | (20) |
Then
$ F\left( 0 \right): = \inf {\left\| {\mathit{\boldsymbol{\tilde A\tilde x}} - \mathit{\boldsymbol{\tilde b}}} \right\|^2}: = {\omega ^2} $ | (21) |
is defined. Consequently, F(λ) is continuous on [0, ∞), and some of its properties are given in the following.
Proposition 1 F(λ) is infinitely differentiable, and has the following properties:
(1) $ \mathop {\lim }\limits_{\lambda \to \infty } F\left( \lambda \right) = {\left\| {\mathit{\boldsymbol{\tilde b}}} \right\|^2} $
(2) For any λ>0, the first and second order derivatives of F(λ) are as follows
$ F'\left( \lambda \right) = {\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|^2},F''\left( \lambda \right) = 2\left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right). $ |
Proof:
(1) Computing the inner product of Eq.(20) with x̃(λ) yields
$ \begin{array}{*{20}{c}} {\lambda {{\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|}^2} \le \lambda {{\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|}^2} + {{\left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right\|}^2} = }\\ {\left( {{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}},\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right) \le \left\| {{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}}} \right\|\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|} \end{array} $ | (22) |
which implies that
$ \mathop {\lim }\limits_{\lambda \to \infty } \left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\| = 0 $ |
According to this estimate and Eq.(22), we obtain that
$ \mathop {\lim }\limits_{\lambda \to \infty } \lambda {\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|^2} = 0,\mathop {\lim }\limits_{\lambda \to \infty } \left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right\| = 0 $
Thus the conclusion (1) can be drawn from the definition of F(λ).
(2) Implicit differentiation of Eq.(19) with respect to λ, combined with Eq.(20), yields
$ \begin{array}{*{20}{c}} {F'\left( \lambda \right) = 2\left( {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right) - \mathit{\boldsymbol{\tilde b}},\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right)} \right) + }\\ {2\lambda \left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right) + {{\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|}^2} = {{\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|}^2}}\\ {F''\left( \lambda \right) = 2\left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right)} \end{array} $
thus the conclusion (2) is proved.
Proposition 2 If ${{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} \ne 0$, then
$ F'\left( \lambda \right) > 0,F''\left( \lambda \right) < 0\;\;\;\;\forall \lambda > 0 $ |
Proof: Implicit differentiation of Eq.(20) with respect to λ yields
$ \lambda \mathit{\boldsymbol{\tilde x'}}\left( \lambda \right) + {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right) = - \mathit{\boldsymbol{\tilde x}}\left( \lambda \right) $ | (23) |
Computing the inner product of Eq.(23) with x̃′(λ) yields
$ {\left\| {\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right)} \right\|^2} + \lambda {\left\| {\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right\|^2} = - \left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right) $ |
Combining this with the expressions in Proposition 1, we obtain
$ \begin{array}{*{20}{c}} {F''\left( \lambda \right) = - 2{{\left\| {\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right)} \right\|}^2} - 2\lambda {{\left\| {\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right\|}^2} \le 0}\\ {F'\left( \lambda \right) \ge 0} \end{array} $ |
We now show that equality cannot hold in either case. If F′(λ)=0 for some λ>0, then x̃(λ)=0, and Eq.(20) gives
$ {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} = 0 $ |
which contradicts the assumption that ${{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} \ne 0$. The case F″(λ)=0 leads to the same contradiction, since x̃′(λ)=0 forces x̃(λ)=0 by Eq.(23).
Proposition 3 F(λ) satisfies the differential relationship
$ \frac{{\rm{d}}}{{{\rm{d}}\lambda }}\left\{ {\lambda F'\left( \lambda \right) + F\left( \lambda \right) + {{\left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right\|}^2}} \right\} = 0\;\;\;\;\forall \lambda > 0 $ |
Proof: Implicit differentiation of Eq.(20) with respect to λ yields
$ \mathit{\boldsymbol{\tilde x}}\left( \lambda \right) + \lambda \mathit{\boldsymbol{\tilde x'}}\left( \lambda \right) + {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right) = 0 $ |
Computing the inner product of the above equation with x̃(λ) yields
$ \left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right) + \lambda \left( {\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right) + \left( {\mathit{\boldsymbol{\tilde A \tilde x'}}\left( \lambda \right),\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right) = 0 $ |
and combining with Eq.(20) yields
$ F'\left( \lambda \right) + \frac{\lambda }{2}F''\left( \lambda \right) + \frac{1}{2}\frac{{\rm{d}}}{{{\rm{d}}\lambda }}\left( {\mathit{\boldsymbol{\tilde A \tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right) = 0 $ |
i.e.
$ \frac{{\rm{d}}}{{{\rm{d}}\lambda }}\left\{ {\frac{\lambda }{2}F'\left( \lambda \right) + \frac{1}{2}F\left( \lambda \right) + \frac{1}{2}{{\left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right\|}^2}} \right\} = 0 $ |
Therefore, Proposition 3 has been proved.
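Propositions 1-3 can also be verified numerically. The sketch below uses an illustrative random Ã and b̃ (not the paper's data) and finite-difference checks of F′(λ)=‖x̃(λ)‖², the sign conditions of Proposition 2, and the conserved quantity of Proposition 3:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
A_t = rng.standard_normal((n, n))   # plays the role of A~ (standard form)
b_t = rng.standard_normal(n)        # plays the role of b~

def x_tilde(lam):
    # Regularized solution from the normal equation (20)
    return np.linalg.solve(lam * np.eye(n) + A_t.T @ A_t, A_t.T @ b_t)

def F(lam):
    x = x_tilde(lam)
    return np.linalg.norm(A_t @ x - b_t) ** 2 + lam * np.linalg.norm(x) ** 2

lam, h = 0.5, 1e-4
# Proposition 1(2): F'(lam) = ||x~(lam)||^2, checked by a central difference
fd1 = (F(lam + h) - F(lam - h)) / (2 * h)
assert abs(fd1 - np.linalg.norm(x_tilde(lam)) ** 2) < 1e-6

# Proposition 2: F is strictly increasing and concave for lam > 0
fd2 = (F(lam + h) - 2 * F(lam) + F(lam - h)) / h ** 2
assert fd1 > 0 and fd2 < 0

# Proposition 3: lam*F'(lam) + F(lam) + ||A~ x~(lam)||^2 is constant in lam
def G(lam):
    x = x_tilde(lam)
    return lam * np.linalg.norm(x) ** 2 + F(lam) + np.linalg.norm(A_t @ x) ** 2

assert np.isclose(G(0.5), G(1.5))
```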
1.2 Fractional Tikhonov
In this section, we use a fractional power of the matrix $\mathit{\boldsymbol{\tilde A}}{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}$ as the weighting matrix and consider the penalized least-squares problem
$ \mathop {\min }\limits_{\mathit{\boldsymbol{\tilde x}} \in {{\bf{R}}^l}} \left\{ {\left\| {\mathit{\boldsymbol{\tilde A\tilde x}} - \mathit{\boldsymbol{\tilde b}}} \right\|_\mathit{\boldsymbol{H}}^2 + {\mu ^{ - 1}}{{\left\| {\mathit{\boldsymbol{\tilde x}}} \right\|}^2}} \right\} $ | (24) |
where the matrix H is symmetric positive semi-definite and
$ {\left\| \mathit{\boldsymbol{M}} \right\|_\mathit{\boldsymbol{H}}} = {\left( {{\mathit{\boldsymbol{M}}^{\rm{T}}}\mathit{\boldsymbol{HM}}} \right)^{1/2}} $ | (25) |
for any M. It is quite natural that the value of μ matters a great deal, since it determines how sensitive the solution of Eq.(24) is to the error e in b.
Assuming that
$ \mathit{\boldsymbol{H}} = {\left( {\mathit{\boldsymbol{\tilde A}}{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}} \right)^{\left( {a - 1} \right)/2}} $ | (26) |
for a>0. When a < 1, H is defined with the Moore-Penrose pseudo-inverse of $\mathit{\boldsymbol{\tilde A}}{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}$ in case this matrix is singular.
The normal equation associated with the penalized least-squares problem (24) is given by
$ \left( {{{\left( {{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A}}} \right)}^{\left( {a + 1} \right)/2}} + {\mu ^{ - 1}}\mathit{\boldsymbol{I}}} \right)\mathit{\boldsymbol{\tilde x}} = {\left( {{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A}}} \right)^{\left( {a - 1} \right)/2}}{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} $ | (27) |
Then introduce the singular value decomposition (SVD) of
$ \mathit{\boldsymbol{\tilde A}} = \mathit{\boldsymbol{U \boldsymbol{\varSigma} }}{\mathit{\boldsymbol{V}}^{\rm{T}}} $ | (28) |
where
$ \mathit{\boldsymbol{V}} = \left[ {{v_1},{v_2}, \cdots ,{v_n}} \right] \in {{\bf{R}}^{n \times n}} $ |
and
$ \mathit{\boldsymbol{U}} = \left[ {{u_1},{u_2}, \cdots ,{u_m}} \right] \in {{\bf{R}}^{m \times m}} $ |
are orthogonal matrices and
$ \mathit{\boldsymbol{ \boldsymbol{\varSigma} }} = {\rm{diag}}\left[ {{\sigma _1},{\sigma _2}, \cdots ,{\sigma _n}} \right] \in {{\bf{R}}^{m \times n}} $ | (29) |
whose diagonal elements are arranged in the following order
$ {\sigma _1} \ge {\sigma _2} \ge \cdots \ge {\sigma _r} > {\sigma _{r + 1}} = \cdots = {\sigma _n} = 0 $ | (30) |
where the index r is the rank of $\mathit{\boldsymbol{\tilde A}}$.
Substituting the singular value decomposition (Eq.(28)) into Eq.(27) yields
$ \left( {{\mathit{\boldsymbol{ \boldsymbol{\varSigma} }}^{a + 1}} + {\mu ^{ - 1}}\mathit{\boldsymbol{I}}} \right){\mathit{\boldsymbol{V}}^{\rm{T}}}\mathit{\boldsymbol{\tilde x}} = {\mathit{\boldsymbol{ \boldsymbol{\varSigma} }}^a}{\mathit{\boldsymbol{U}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} $ | (31) |
Then the solution of Eq.(27) can be written as
$ \mathit{\boldsymbol{\tilde x}} = \mathit{\boldsymbol{V}}{\left( {{\mathit{\boldsymbol{ \boldsymbol{\varSigma} }}^{a + 1}} + {\mu ^{ - 1}}\mathit{\boldsymbol{I}}} \right)^{ - 1}}{\mathit{\boldsymbol{ \boldsymbol{\varSigma} }}^a}{\mathit{\boldsymbol{U}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} $ | (32) |
which is equivalent to
$ {{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}} = \sum\limits_{i = 1}^n {\varphi \left( {{\sigma _i}} \right)\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right){v_i}} $ | (33) |
where
$ \varphi \left( {{\sigma _i}} \right) = \frac{{\sigma _i^a}}{{\sigma _i^{a + 1} + \mu }} $ | (34) |
The solution xμ of Eq.(5) can be recovered from the solution of Eq.(33) according to
$ {\mathit{\boldsymbol{x}}_\mu } = \mathit{\boldsymbol{L}}_A^\dagger {{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}} + {\mathit{\boldsymbol{x}}_0} $ | (35) |
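The filtered SVD solution (32)-(34) is straightforward to implement. The following NumPy sketch uses the filter exactly as written in Eq.(34) on illustrative random data (the paper's experiments use MATLAB); setting a = 1 recovers the standard Tikhonov filter, which provides a consistency check:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
A_t = rng.standard_normal((n, n))   # plays the role of A~
b_t = rng.standard_normal(n)
mu, a = 1e-2, 0.8                   # illustrative parameter values

U, s, Vt = np.linalg.svd(A_t)

# Fractional filtered solution, Eqs.(33)-(34): phi(sigma) = sigma^a/(sigma^(a+1) + mu)
phi = s ** a / (s ** (a + 1) + mu)
x_frac = Vt.T @ (phi * (U.T @ b_t))

# For a = 1 the filter reduces to standard Tikhonov, x = (A^T A + mu I)^{-1} A^T b
phi1 = s / (s ** 2 + mu)
x_tik = Vt.T @ (phi1 * (U.T @ b_t))
assert np.allclose(x_tik, np.linalg.solve(A_t.T @ A_t + mu * np.eye(n), A_t.T @ b_t))
```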
In addition, for a>0 the filter function given by Eq.(34) has the following asymptotic behavior
$ \varphi \left( \sigma \right) = \frac{{{\sigma ^a}}}{{{\sigma ^{a + 1}} + \mu }} = \frac{{{\sigma ^a}}}{\mu } + O\left( {{\sigma ^{2a + 1}}} \right)\;\;\;\;\sigma \to 0 $ | (36) |
and
$ \varphi \left( \sigma \right) = \frac{{{\sigma ^a}}}{{{\sigma ^{a + 1}} + \mu }} = {\sigma ^{ - 1}} + O\left( {{\sigma ^{ - \left( {a + 2} \right)}}} \right)\;\;\;\;\sigma \to \infty $ | (37) |
Then we consider the filter function of standard Tikhonov regularization shown as
$ \tilde \varphi \left( \sigma \right) = \frac{\sigma }{{{\sigma ^2} + \mu }} $ |
It is easy to show that the filter function (34) is less smoothing than the standard filter φ̃(σ) when 0 < a < 1.
The regularization method above is based on the singular value decomposition of the coefficient matrix. However, the singular value decomposition requires a very large amount of computation for large-scale matrices. Therefore, we project the large-scale problem onto a low-dimensional Krylov subspace. Lewis and Reichel proposed the Arnoldi-Tikhonov regularization method[11] in 2009 and described it in detail. Moreover, global Arnoldi-Tikhonov and augmented Arnoldi-Tikhonov regularization methods were subsequently proposed[12-13].
We reduce the problem (3) to a problem of smaller size by applying the Arnoldi process to A with initial vector ${\mathit{\boldsymbol{v}}_1} = \mathit{\boldsymbol{b}}/\left\| \mathit{\boldsymbol{b}} \right\|$, which after k steps yields the decomposition
$ \mathit{\boldsymbol{A}}{\mathit{\boldsymbol{V}}_k} = {\mathit{\boldsymbol{V}}_{k + 1}}{{\mathit{\boldsymbol{\bar H}}}_k} $ | (38) |
where Vk=[v1, v2, …, vk]∈Rn×k is the first k columns of Vk+1, and Vk+1∈Rn×(k+1) has orthonormal columns, which span the Krylov subspace
$ {K_k}\left( {\mathit{\boldsymbol{A}},\mathit{\boldsymbol{b}}} \right) = {\rm{span}}\left( {\mathit{\boldsymbol{b}},\mathit{\boldsymbol{Ab}}, \cdots ,{\mathit{\boldsymbol{A}}^{k - 1}}\mathit{\boldsymbol{b}}} \right) $ | (39) |
We assume that k is chosen sufficiently small so that the upper Hessenberg matrix $\mathit{\boldsymbol{\bar H}}_k$ has nonvanishing subdiagonal entries; then $\mathit{\boldsymbol{\bar H}}_k$ is of rank k. We seek to determine an approximate solution xμ, k of Eq.(4) in the Krylov subspace (39).
Substituting
$ \mathit{\boldsymbol{x}} = {\mathit{\boldsymbol{V}}_k}\mathit{\boldsymbol{y}}\;\;\;\mathit{\boldsymbol{y}} \in {{\bf{R}}^k} $
into Eq.(4) and using Eq.(38) yields the reduced minimization problem
$ \mathop {\min }\limits_{\mathit{\boldsymbol{y}} \in {{\bf{R}}^k}} \left\{ {{{\left\| {{{\mathit{\boldsymbol{\bar H}}}_k}\mathit{\boldsymbol{y}} - \mathit{\boldsymbol{V}}_{k + 1}^{\rm{T}}\mathit{\boldsymbol{b}}} \right\|}^2} + {\mu ^{ - 1}}{{\left\| \mathit{\boldsymbol{y}} \right\|}^2}} \right\} $ | (40) |
whose solution is denoted by yμ, k. The reduced minimization problem (40) is solved by the projection fractional Tikhonov regularization method described in Section 1; then
$ {\mathit{\boldsymbol{x}}_{\mu ,k}} = {\mathit{\boldsymbol{V}}_k}{\mathit{\boldsymbol{y}}_{\mu ,k}} $ | (41) |
is an approximate solution of Eq.(4).
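A minimal end-to-end sketch of the reduction (38)-(41) follows; for simplicity it solves the reduced problem (40) with a plain Tikhonov term rather than the projection fractional variant, and all sizes and data are illustrative. Because b and AVky both lie in the range of Vk+1, the full and reduced objectives coincide at the computed solution:

```python
import numpy as np

rng = np.random.default_rng(5)
n, k, mu = 50, 6, 10.0
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

# k steps of the Arnoldi process: A V_k = V_{k+1} Hbar_k, Eq.(38)
V = np.zeros((n, k + 1))
H = np.zeros((k + 1, k))
V[:, 0] = b / np.linalg.norm(b)
for j in range(k):
    w = A @ V[:, j]
    for i in range(j + 1):          # modified Gram-Schmidt orthogonalization
        H[i, j] = V[:, i] @ w
        w -= H[i, j] * V[:, i]
    H[j + 1, j] = np.linalg.norm(w)
    V[:, j + 1] = w / H[j + 1, j]
assert np.allclose(A @ V[:, :k], V @ H)

# Reduced Tikhonov problem (40): min ||Hbar y - V_{k+1}^T b||^2 + mu^{-1} ||y||^2
beta = V.T @ b                      # equals ||b|| e_1
y = np.linalg.solve(H.T @ H + np.eye(k) / mu, H.T @ beta)
x = V[:, :k] @ y                    # lifted approximate solution, Eq.(41)

# Full and reduced objectives agree, since b and A V_k y lie in range(V_{k+1})
full = np.linalg.norm(A @ x - b) ** 2 + np.linalg.norm(x) ** 2 / mu
red = np.linalg.norm(H @ y - beta) ** 2 + np.linalg.norm(y) ** 2 / mu
assert np.isclose(full, red)
```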
3 Parameter Selection
This section discusses the determination of the regularization parameter. We first consider the effects of the parameters a and μ on ${\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}} \right\|^2}$:
$ \frac{\partial }{{\partial a}}{\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}} \right\|^2} = 2\mu \sum\limits_{i = 1}^r {\frac{{\log \left( {{\sigma _i}} \right)\sigma _i^{ - a}}}{{{{\left( {{\sigma _i} + \mu \sigma _i^{ - a}} \right)}^3}}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} $ | (42) |
and
$ \frac{\partial }{{\partial \mu }}{\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}} \right\|^2} = - 2\sum\limits_{i = 1}^r {\frac{{\sigma _i^{2a}}}{{{{\left( {\sigma _i^{a + 1} + \mu } \right)}^3}}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} < 0 $ | (43) |
i.e., $\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}} \right\|$ is monotonically decreasing in μ.
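The monotonicity expressed by Eq.(43) is easy to confirm numerically with a small synthetic spectrum (the singular values and coefficients below are illustrative, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(7)
s = np.logspace(0, -4, 10)            # illustrative decaying singular values
beta = rng.standard_normal(10)        # illustrative coefficients u_i^T b

def xnorm2(mu, a):
    # ||x_{mu,a}||^2 for the filter of Eq.(34)
    return np.sum((s ** a / (s ** (a + 1) + mu) * beta) ** 2)

# Eq.(43): the norm of the regularized solution decreases as mu grows
mus = np.logspace(-6, 2, 30)
vals = [xnorm2(m, 0.8) for m in mus]
assert all(v1 > v2 for v1, v2 in zip(vals, vals[1:]))
```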
Combining Eqs.(13) and (14) with
$ {\mathit{\boldsymbol{x}}_\mu } = \mathit{\boldsymbol{L}}_A^\dagger {{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}} + {\mathit{\boldsymbol{x}}_0} $ |
we have
$ \mathit{\boldsymbol{\tilde A\tilde x}} - \mathit{\boldsymbol{\tilde b}} = \mathit{\boldsymbol{Ax}} - \mathit{\boldsymbol{b}} $ | (44) |
and we define the discrepancy
$ {\mathit{\boldsymbol{d}}_\mu }: = \mathit{\boldsymbol{b}} - \mathit{\boldsymbol{A}}{\mathit{\boldsymbol{x}}_\mu } $ | (45) |
and we assume that an estimate of the norm of the error e is available:
$ \varepsilon : = \left\| \mathit{\boldsymbol{e}} \right\| $ | (46) |
Then we can apply the discrepancy principle to determine a suitable value of the regularization parameter μ. Let a>0 be fixed and define
$ \delta = \varepsilon \eta $ | (47) |
where η>1 is a user-supplied constant independent of ε.We determine μ>0, so that the solution xμ of Eq.(4) satisfies
$ \left\| {\mathit{\boldsymbol{\tilde b}} - \mathit{\boldsymbol{\tilde A}}{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}} \right\| = \varepsilon \eta = \delta $ | (48) |
The vector xμ is then said to satisfy the discrepancy principle[15]. Solving Eq.(48) for μ is equivalent to finding the positive zero of the function
$ {\varphi _a}\left( \mu \right) = \sum\limits_{i = 1}^r {{{\left( {\mu \sigma _i^{a + 1} + 1} \right)}^{ - 2}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} + \sum\limits_{i = r + 1}^m {{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} - {\delta ^2} $ | (49) |
where r is the rank of $\mathit{\boldsymbol{\tilde A}}$. Thus
$ {{\varphi '}_a}\left( \mu \right) = - 2\sum\limits_{i = 1}^r {\sigma _i^{a + 1}{{\left( {\mu \sigma _i^{a + 1} + 1} \right)}^{ - 3}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} $ | (50) |
and
$ {{\varphi ''}_a}\left( \mu \right) = 6\sum\limits_{i = 1}^r {\sigma _i^{2a + 2}{{\left( {\mu \sigma _i^{a + 1} + 1} \right)}^{ - 4}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} $ | (51) |
We take the initial approximation μ0:=0 for Newton's method, with iteration ${\mu _{j + 1}} = {\mu _j} - {\varphi _a}\left( {{\mu _j}} \right)/{{\varphi '}_a}\left( {{\mu _j}} \right)$, to compute the positive zero of the function φa(μ). The Newton iterations are terminated as soon as a value of μ such that
$ {\varphi _a}\left( \mu \right) \le \frac{1}{{100}}\left( {{\eta ^2} - 1} \right){\varepsilon ^2} $ | (52) |
has been determined. The factor 1/100 in Eq.(52) is the one used in our implementation, but other positive factors strictly smaller than 1 can also be used.
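The zero-finding procedure of Eqs.(49)-(52) can be sketched as follows; the singular values, noise level, and parameter values are illustrative assumptions, not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(6)
m = 20
s = np.logspace(0, -6, m)              # illustrative decaying singular values
beta = s * rng.standard_normal(m)      # coefficients u_i^T b of smooth exact data
eps = 1e-3
beta += eps / np.sqrt(m) * rng.standard_normal(m)   # noise with norm close to eps
eta, a = 1.1, 0.8
delta = eta * eps                      # Eq.(47)

def phi(mu):
    # Discrepancy function of Eq.(49) (full rank here, so the second sum is empty)
    return np.sum(beta ** 2 / (mu * s ** (a + 1) + 1) ** 2) - delta ** 2

def dphi(mu):
    # Its derivative, Eq.(50)
    return -2 * np.sum(s ** (a + 1) * beta ** 2 / (mu * s ** (a + 1) + 1) ** 3)

# Newton's method from mu_0 = 0; phi is decreasing and convex by Eqs.(50)-(51),
# so the iterates increase monotonically toward the positive zero
mu, tol = 0.0, (eta ** 2 - 1) * eps ** 2 / 100     # stopping rule, Eq.(52)
for _ in range(200):
    if phi(mu) <= tol:
        break
    mu -= phi(mu) / dphi(mu)
assert mu > 0 and phi(mu) <= tol
```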
4 Numerical Examples
We use three test examples to illustrate the performance of the Arnoldi projection fractional Tikhonov (APFT) regularization method and compare it with the Arnoldi fractional Tikhonov (AFT) and Arnoldi Tikhonov (AT) methods for large-scale linear discrete ill-posed problems. The orthogonal projection with
$ P: = 1/{n^{1/2}}{\left[ {1,1, \cdots ,1} \right]^{\rm{T}}} \in {{\bf{R}}^n} $ |
has the same null space as the regularization operator
$ \mathit{\boldsymbol{L}} = \left[ {\begin{array}{*{20}{c}} 1&{ - 1}&{}&0\\ {}&1&{ - 1}&{}\\ {}&{}& \ddots&\ddots \\ 0&{}&1&{ - 1} \end{array}} \right] \in {{\bf{R}}^{\left( {n - 1} \right) \times n}} $ |
which will be applied in the following examples.All computations were carried out in MATLAB with about 16 significant decimal digits.
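As a quick sanity check (in NumPy here, although the paper's computations are in MATLAB), the vector P and the difference operator L above do share the same null space, namely the constant vectors:

```python
import numpy as np

n = 8
# First-difference regularization operator of size (n-1) x n, as displayed above
L = np.eye(n - 1, n) - np.eye(n - 1, n, k=1)
# Projection direction used in the examples: normalized vector of ones
P = np.ones(n) / np.sqrt(n)

# L annihilates constant vectors, so N(L) = span{P} matches the projector choice
assert np.allclose(L @ P, 0)
assert np.linalg.matrix_rank(L) == n - 1
```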
Example 1 Consider the Fredholm integral equation of the first kind
$ \int_{ - \frac{{\rm{ \mathsf{ π} }}}{2}}^{\frac{{\rm{ \mathsf{ π} }}}{2}} {k\left( {\omega ,\rho } \right)x\left( \rho \right){\rm{d}}\rho } = g\left( \omega \right)\;\;\;\;\; - \frac{{\rm{ \mathsf{ π} }}}{2} \le \omega \le \frac{{\rm{ \mathsf{ π} }}}{2} $ |
the MATLAB code Shaw produces a discretization A∈R1 000×1 000 and the right-hand side
Fig. 1 illustrates that the approximate solution obtained by the APFT method can approximate the exact solution well, which means that APFT regularization method is effective.
Example 2 The Fredholm integral equation of the first kind is
$ \int_a^b {k\left( {s,t} \right)f\left( t \right){\rm{d}}t} = g\left( s \right)\;\;\;\;\;c \le s \le d $ |
and the MATLAB code discretizes the test problems Baart, Shaw, Phillips, Gravity, Foxgood and Deriv2 by a Galerkin method with orthonormal box functions, with matrix order n=1 000. The noise level λ is defined by
$ \lambda = \frac{{\left\| \mathit{\boldsymbol{e}} \right\|}}{{\left\| {\mathit{\boldsymbol{\hat b}}} \right\|}} $ |
The regularization parameter μ is determined by the discrepancy principle.The tables report relative errors
Tables 1 and 2 show the qualities of the AT, AFT and APFT solutions for the various examples (n=1 000). The results show that APFT usually produces solutions of higher quality; in other words, APFT is superior to AFT and AT.
Example 3 We show the performance of the method on the restoration of a discrete image contaminated by blur and noise. Our task is to deblur two-dimensional images degraded by additive noise and spatially invariant blur. The restoration problems were proposed by the US Air Force Phillips Laboratory. The two-dimensional image restoration problem can be modeled by a linear system of equations Ax=b. The matrix A is a discrete blurring operator referred to as a discrete point spread function. Then the components of the vectors b and
Fig. 2 displays the noise- and blur-free image, the contaminated image, and the restored images of Lena determined by the AFT and APFT methods. These images illustrate that APFT gives better reconstructions than AFT.
Fig. 3 displays the noise- and blur-free image, the contaminated image, and the restored images of "MATH" determined by the AT and APFT methods. The approximate solutions obtained by the APFT method are nearly optimal for this example; the computed solutions are close to the orthogonal projection of the exact solution into the range-restricted subspace. The AT method, however, produces approximate solutions of lower quality than the APFT method.
5 Conclusions
In this paper, we propose the APFT regularization method for solving large-scale linear discrete ill-posed problems. Our method is easy to implement, and the numerical examples show that it is effective and gives more accurate approximations than the AT and AFT methods.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 11571171 and 61473148).
[1] HANSEN P C. Regularization tools: A Matlab package for analysis and solution of discrete ill-posed problems[J]. Numerical Algorithms, 1994, 6(1): 1-35. DOI:10.1007/BF02149761
[2] FUHRY M, REICHEL L. A new Tikhonov regularization method[J]. Numerical Algorithms, 2012, 59(3): 433-445. DOI:10.1007/s11075-011-9498-x
[3] HANKE M, HANSEN P C. Regularization methods for large-scale problems[J]. Surveys on Mathematics for Industry, 1993, 3(4): 253-315.
[4] CALVETTI D, REICHEL L, SHUIBI A. Invertible smoothing preconditioners for linear discrete ill-posed problems[J]. Applied Numerical Mathematics, 2005, 54(2): 135-149. DOI:10.1016/j.apnum.2004.09.027
[5] HANSEN P C, JENSEN T K. Smoothing-norm preconditioning for regularizing minimum-residual methods[J]. SIAM Journal on Matrix Analysis and Applications, 2006, 29(1): 1-14.
[6] MILLER K. Least squares methods for ill-posed problems with a prescribed bound[J]. SIAM Journal on Mathematical Analysis, 1970, 1(1): 52-74.
[7] MORIGI S, REICHEL L, SGALLARI F. Orthogonal projection regularization operators[J]. Numerical Algorithms, 2007, 44(2): 99-114. DOI:10.1007/s11075-007-9080-8
[8] HOCHSTENBACH M E, REICHEL L. Fractional Tikhonov regularization for linear discrete ill-posed problems[J]. BIT Numerical Mathematics, 2011, 51(1): 197-215. DOI:10.1007/s10543-011-0313-9
[9] HOCHSTENBACH M E, NOSCHESE S, REICHEL L. Fractional regularization matrices for linear discrete ill-posed problems[J]. Journal of Engineering Mathematics, 2015, 93(1): 1-17. DOI:10.1007/s10665-014-9780-8
[10] REICHEL L, RODRIGUEZ G. Old and new parameter choice rules for discrete ill-posed problems[J]. Numerical Algorithms, 2013, 63(1): 65-87. DOI:10.1007/s11075-012-9612-8
[11] LEWIS B, REICHEL L. Arnoldi-Tikhonov regularization methods[J]. Journal of Computational and Applied Mathematics, 2009, 226(1): 92-102. DOI:10.1016/j.cam.2008.05.003
[12] SIADAT M, AGHAZADE H, OKTEM N. Reordering for improving global Arnoldi-Tikhonov method in image restoration problems[J]. Signal Image and Video Processing, 2017(2): 1-8.
[13] LIN Y, BAO L, CAO Y. Augmented Arnoldi-Tikhonov regularization methods for solving large-scale linear ill-posed systems[J]. Mathematical Problems in Engineering, 2013, 5: 87-118.
[14] KLANN E, RAMLAU R. Regularization by fractional filter methods and data smoothing[J]. Inverse Problems, 2008, 24(2): 025018. DOI:10.1088/0266-5611/24/2/025018
[15] ENGL H W. Discrepancy principles for Tikhonov regularization of ill-posed problems leading to optimal convergence rates[J]. Journal of Optimization Theory and Applications, 1987, 52(2): 209-215. DOI:10.1007/BF00941281
[16] HANSEN P C. Regularization tools version 4.0 for Matlab 7.3[J]. Numerical Algorithms, 2007, 46(2): 189-194. DOI:10.1007/s11075-007-9136-9