Transactions of Nanjing University of Aeronautics and Astronautics  2018, Vol. 35 Issue (3): 395-402
Arnoldi Projection Fractional Tikhonov for Large Scale Ill-Posed Problems
Wang Zhengsheng1, Mu Liming1, Liu Rongrong1, Xu Guili2     
1. College of Science, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, P. R. China;
2. Institute of Automation, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, P. R. China
Abstract: It is well known that Tikhonov regularization in standard form may determine approximate solutions that are too smooth for ill-posed problems, so fractional Tikhonov methods have been introduced to remedy this shortcoming. Tikhonov regularization for large-scale linear ill-posed problems is commonly implemented by determining a partial Arnoldi decomposition of the given matrix. In this paper, we propose a new method for computing an approximate solution of large scale linear discrete ill-posed problems, which applies projection fractional Tikhonov regularization in a Krylov subspace via the Arnoldi process. Projection fractional Tikhonov regularization combines fractional matrices and orthogonal projection operators. A suitable value of the regularization parameter is determined by the discrepancy principle. Numerical examples, with application to image restoration, are carried out to examine the performance of the method.
Key words: ill-posed problems; fractional matrix; Tikhonov regularization; orthogonal projection operator; image restoration
0 Introduction

This paper is concerned with the solution of the least-squares problem

$ \mathop {\min }\limits_{\mathit{\boldsymbol{x}} \in {{\bf{R}}^n}} \left\| {\mathit{\boldsymbol{Ax}} - \mathit{\boldsymbol{b}}} \right\|\;\;\;\mathit{\boldsymbol{A}} \in {{\bf{R}}^{n \times n}},\mathit{\boldsymbol{x}} \in {{\bf{R}}^n},\mathit{\boldsymbol{b}} \in {{\bf{R}}^n} $ (1)

with a large square matrix A of ill-determined rank. In particular, such a matrix is severely ill-conditioned and may be singular; its singular values decay gradually to zero without a significant gap. The vector b represents the available data, which is typically contaminated by a discretization or measurement error $\mathit{\boldsymbol{e}} \in {{\bf{R}}^n}$, i.e.

$ \mathit{\boldsymbol{b}} = \mathit{\boldsymbol{\hat b}} + \mathit{\boldsymbol{e}} $ (2)

where ${\mathit{\boldsymbol{\hat b}}}$ denotes the unknown error-free vector associated with b. Throughout this paper we refer to the error $\mathit{\boldsymbol{e}} \in {{\bf{R}}^n}$ as "noise" and assume that the linear system of equations with the unknown error-free right-hand side

$ \mathit{\boldsymbol{Ax}} = \mathit{\boldsymbol{\hat b}} $ (3)

is consistent, and ${\mathit{\boldsymbol{\hat x}}}$ denotes its solution of minimal Euclidean norm. We then determine an approximation of ${\mathit{\boldsymbol{\hat x}}}$ by computing an approximate solution of the large scale linear discrete ill-posed problem (1)[1].

In view of the ill-conditioning of A and the error e in b, straightforward solution of Eq.(1) generally yields a meaningless approximation, so it is essential that the computation be stabilized by regularization. Tikhonov regularization is one of the most popular regularization methods because of its favorable properties and wide applicability. Based on Tikhonov regularization, we consider the penalized least-squares problem

$ \mathop {\min }\limits_{\mathit{\boldsymbol{x}} \in {{\bf{R}}^n}} \left\{ {{{\left\| {\mathit{\boldsymbol{Ax}} - \mathit{\boldsymbol{b}}} \right\|}^2} + {\mu ^{ - 1}}{{\left\| {\mathit{\boldsymbol{Lx}}} \right\|}^2}} \right\} $ (4)

where the scalar μ>0 is referred to as the regularization parameter and the matrix $\mathit{\boldsymbol{L}} \in {{\bf{R}}^{l \times n}}$ is the regularization operator[2-3]. The method of this paper requires L to be a square matrix. Calvetti et al.[4] and Hansen et al.[5] described a variety of square regularization operators. For the purpose of obtaining an accurate approximation of ${\mathit{\boldsymbol{\hat x}}}$, the least-squares problem (1) is replaced by the minimization problem (4). The number of rows of L usually satisfies l≤n, but regularization matrices with l>n have also been applied. Let R(K) and N(K) denote the range and null space of a matrix K, respectively. The matrices A and L are assumed to satisfy

$ N\left( \mathit{\boldsymbol{A}} \right) \cap N\left( \mathit{\boldsymbol{L}} \right) = \left\{ 0 \right\} $

Then the Tikhonov minimization problem (4) has the unique solution

$ {\mathit{\boldsymbol{x}}_\mu }: = {\left( {{\mathit{\boldsymbol{A}}^{\rm{T}}}\mathit{\boldsymbol{A}} + {\mu ^{ - 1}}{\mathit{\boldsymbol{L}}^{\rm{T}}}\mathit{\boldsymbol{L}}} \right)^{ - 1}}{\mathit{\boldsymbol{A}}^{\rm{T}}}\mathit{\boldsymbol{b}} $ (5)

for any μ>0, where the superscript "Τ" denotes transposition of the matrix[6].
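For small problems, the solution (5) can be evaluated directly from the normal equations. The following Python sketch (our illustration only, impractical at large scale, which is what motivates the Krylov subspace methods below) computes xμ:

```python
import numpy as np

def tikhonov_direct(A, L, b, mu):
    """Direct evaluation of Eq. (5): x_mu = (A^T A + mu^{-1} L^T L)^{-1} A^T b.
    Only sensible for small n; shown for illustration."""
    return np.linalg.solve(A.T @ A + (1.0 / mu) * (L.T @ L), A.T @ b)
```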

This paper solves the minimization problem (4) by simplifying it to standard form and by using a fractional power of the matrix $\mathit{\boldsymbol{\tilde A}}{{\mathit{\boldsymbol{\tilde A}}}^{\text{T}}}$ as a weighting matrix to measure the residual error in the standard form with a semi-norm. Then, using a few steps of the Arnoldi process, the problem (3) is reduced to a problem of smaller size, which is solved by projection fractional Tikhonov regularization, and the regularization parameters a and μ are determined. Finally, illustrative numerical examples are reported and concluding remarks are given.

1 Projection Fractional Tikhonov

In this section, we discuss the method, which combines fractional matrices and orthogonal projection operators. Projection fractional Tikhonov regularization simplifies the penalized least-squares problem (4) to standard form and uses a fractional power as a weighting matrix to measure the residual error in the standard form with a semi-norm.

1.1 Form simplification

The penalized least-squares problem (4) can be simplified to standard form with the orthogonal projection

$ \mathit{\boldsymbol{L}}: = \mathit{\boldsymbol{I}} - \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{P}}^{\rm{T}}}\;\;\;\;\mathit{\boldsymbol{P}} \in {{\bf{R}}^{n \times l}},{\mathit{\boldsymbol{P}}^{\rm{T}}}\mathit{\boldsymbol{P}} = \mathit{\boldsymbol{I}} $ (6)

which is well suited for use in Tikhonov regularization. In Eq.(6), L is used as the regularization operator. The choice of the matrix P determines the choice of L. Moreover, the matrix P can be chosen in many different ways, some of which yield regularization operators that give more accurate approximations of ${\mathit{\boldsymbol{\hat x}}}$ than the common finite difference-based regularization operators[7].

Define the A-weighted pseudo-inverse of L as

$ \mathit{\boldsymbol{L}}_A^\dagger : = \left( {\mathit{\boldsymbol{I}} - {{\left( {\mathit{\boldsymbol{A}}\left( {\mathit{\boldsymbol{I}} - {\mathit{\boldsymbol{L}}^\dagger }\mathit{\boldsymbol{L}}} \right)} \right)}^\dagger }\mathit{\boldsymbol{A}}} \right){\mathit{\boldsymbol{L}}^\dagger } \in {{\bf{R}}^{n \times l}} $ (7)

where $\mathit{\boldsymbol{L}}^\dagger \in {{\bf{R}}^{n \times l}}$ denotes the Moore-Penrose pseudo-inverse of the regularization operator L, and I is the identity matrix.

Suppose that Eq.(6) holds and introduce the QR-factorization shown as

$ \mathit{\boldsymbol{AP}} = \mathit{\boldsymbol{QR}} $ (8)

where $\mathit{\boldsymbol{R}} \in {{\bf{R}}^{l \times l}}$ is upper triangular and $\mathit{\boldsymbol{Q}} \in {{\bf{R}}^{n \times l}}$ has orthonormal columns. Using the properties of the Moore-Penrose pseudo-inverse and of orthogonal projections, we have the following identities for L

$ \mathit{\boldsymbol{I}} - {\mathit{\boldsymbol{L}}^\dagger }\mathit{\boldsymbol{L}} = \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{P}}^{\rm{T}}},{\mathit{\boldsymbol{L}}^\dagger } = \mathit{\boldsymbol{L}} $ (9)

It follows that

$ {\left( {\mathit{\boldsymbol{A}}\left( {\mathit{\boldsymbol{I}} - {\mathit{\boldsymbol{L}}^\dagger }\mathit{\boldsymbol{L}}} \right)} \right)^\dagger } = {\left( {\mathit{\boldsymbol{AP}}{\mathit{\boldsymbol{P}}^{\rm{T}}}} \right)^\dagger } = \mathit{\boldsymbol{P}}{\left( {\mathit{\boldsymbol{AP}}} \right)^\dagger } = \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{R}}^{ - 1}}{\mathit{\boldsymbol{Q}}^{\rm{T}}} $ (10)

Substituting Eqs.(8), (10) into Eq.(7), we get

$ \mathit{\boldsymbol{L}}_A^\dagger = \left( {\mathit{\boldsymbol{I}} - \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{R}}^{ - 1}}{\mathit{\boldsymbol{Q}}^{\rm{T}}}\mathit{\boldsymbol{A}}} \right)\mathit{\boldsymbol{L}} $ (11)

which simplifies to

$ \mathit{\boldsymbol{L}}_A^\dagger = \mathit{\boldsymbol{I}} - \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{R}}^{ - 1}}{\mathit{\boldsymbol{Q}}^{\rm{T}}}\mathit{\boldsymbol{A}} $ (12)

Transform the matrix and vectors of the Tikhonov minimization problem (4) by the substitutions

$ \mathit{\boldsymbol{\tilde A}}: = \mathit{\boldsymbol{AL}}_A^\dagger $ (13)
$ \mathit{\boldsymbol{\tilde b}}: = \mathit{\boldsymbol{b}} - \mathit{\boldsymbol{A}}{\mathit{\boldsymbol{x}}_0} $ (14)

where

$ {\mathit{\boldsymbol{x}}_0}: = {\left( {\mathit{\boldsymbol{A}}\left( {\mathit{\boldsymbol{I}} - {\mathit{\boldsymbol{L}}^\dagger }\mathit{\boldsymbol{L}}} \right)} \right)^\dagger }\mathit{\boldsymbol{b}} $ (15)

When L is an orthogonal projection operator, Eqs.(13), (14) can be expressed simply as

$ \mathit{\boldsymbol{\tilde A}}: = \mathit{\boldsymbol{AL}}_A^\dagger = \left( {\mathit{\boldsymbol{I}} - \mathit{\boldsymbol{Q}}{\mathit{\boldsymbol{Q}}^{\rm{T}}}} \right)\mathit{\boldsymbol{A}} $ (16)
$ \mathit{\boldsymbol{\tilde b}}: = \mathit{\boldsymbol{b}} - \mathit{\boldsymbol{A}}{\mathit{\boldsymbol{x}}_0} = \left( {\mathit{\boldsymbol{I}} - \mathit{\boldsymbol{Q}}{\mathit{\boldsymbol{Q}}^{\rm{T}}}} \right)\mathit{\boldsymbol{b}} $ (17)

Let $\mathit{\boldsymbol{\tilde x}}: = \mathit{\boldsymbol{Lx}}$; then the penalized least-squares problem (4) can be translated into the standard form

$ \mathop {\min }\limits_{\mathit{\boldsymbol{\tilde x}} \in {{\bf{R}}^l}} \left\{ {{{\left\| {\mathit{\boldsymbol{\tilde A\tilde x}} - \mathit{\boldsymbol{\tilde b}}} \right\|}^2} + {\mu ^{ - 1}}{{\left\| {\mathit{\boldsymbol{\tilde x}}} \right\|}^2}} \right\} $ (18)
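As a concrete illustration of the transformation (8), (13)-(17), the following Python sketch (a minimal version with our own function names, not the authors' code) forms ${\mathit{\boldsymbol{\tilde A}}}$, ${\mathit{\boldsymbol{\tilde b}}}$ and x0, and maps a standard-form solution back to x via the recovery formula $\mathit{\boldsymbol{x}} = \mathit{\boldsymbol{L}}_A^\dagger \mathit{\boldsymbol{\tilde x}} + {\mathit{\boldsymbol{x}}_0}$, which appears later as Eq.(35):

```python
import numpy as np

def to_standard_form(A, P, b):
    """Standard-form transformation for L = I - P P^T with P^T P = I."""
    Q, R = np.linalg.qr(A @ P)             # thin QR factorization A P = Q R, Eq. (8)
    At = A - Q @ (Q.T @ A)                 # A_tilde = (I - Q Q^T) A, Eq. (16)
    bt = b - Q @ (Q.T @ b)                 # b_tilde = (I - Q Q^T) b, Eq. (17)
    x0 = P @ np.linalg.solve(R, Q.T @ b)   # x0 = P R^{-1} Q^T b, Eqs. (10), (15)
    return At, bt, x0, Q, R

def from_standard_form(xt, A, P, Q, R, x0):
    """Recover x = L_A^dagger xt + x0, with L_A^dagger from Eq. (12)."""
    return xt - P @ np.linalg.solve(R, Q.T @ (A @ xt)) + x0
```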

An attractive property of this transformation is that the $\mathit{\boldsymbol{L}}_A^\dagger$ defined by Eq.(7) has a simple form, which makes the orthogonal projection (6) easy to use. For any μ>0, let λ=1/μ; then the minimization problem of Eq.(18) is

$ F\left( \lambda \right): = {\left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right) - \mathit{\boldsymbol{\tilde b}}} \right\|^2} + \lambda {\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|^2} = \min {J_\lambda }\left( {\mathit{\boldsymbol{\tilde x}}} \right) $ (19)

For any given λ>0, $\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)$ is uniquely determined and satisfies

$ \lambda \mathit{\boldsymbol{\tilde x}}\left( \lambda \right) + {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right) = {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} $ (20)

Then

$ F\left( 0 \right): = \inf {\left\| {\mathit{\boldsymbol{\tilde A\tilde x}} - \mathit{\boldsymbol{\tilde b}}} \right\|^2}: = {\omega ^2} $ (21)

is defined. Consequently, F(λ) is continuous on [0, ∞), and some properties of F(λ) are given in the following.

Proposition 1 F(λ) is infinitely differentiable, and has the following properties:

(1) $\mathop {\lim }\limits_{\lambda \to \infty } F\left( \lambda \right) = {\left\| {\mathit{\boldsymbol{\tilde b}}} \right\|^2}$

(2) For any λ>0, the first and second order derivatives of F(λ) are as follows

$ F'\left( \lambda \right) = {\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|^2},F''\left( \lambda \right) = 2\left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right). $

Proof:

(1) Computing the inner product of Eq.(20) with $\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)$ yields

$ \begin{array}{*{20}{c}} {\lambda {{\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|}^2} \le \lambda {{\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|}^2} + {{\left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right\|}^2} = }\\ {\left( {{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}},\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right) \le \left\| {{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}}} \right\|\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|} \end{array} $ (22)

which implies that

$ \mathop {\lim }\limits_{\lambda \to \infty } \left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\| = 0 $

According to this estimate and Eq.(22), we obtain that

$ \mathop {\lim }\limits_{\lambda \to \infty } \lambda {\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|^2} = 0,\mathop {\lim }\limits_{\lambda \to \infty } \left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right\| = 0 $

Thus the conclusion (1) can be drawn from the definition of F(λ).

(2) Implicit differentiation of Eq.(19) with respect to λ, combined with Eq.(20), yields

$ \begin{array}{*{20}{c}} {F'\left( \lambda \right) = 2\left( {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right) - \mathit{\boldsymbol{\tilde b}},\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right)} \right) + }\\ {2\lambda \left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right) + {{\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|}^2} = {{\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|}^2}}\\ {F''\left( \lambda \right) = 2\left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right)} \end{array} $

thus the conclusion (2) is proved.

Proposition 2 For $\mathit{\boldsymbol{\tilde b}} \notin {\text{Ker}}\left( {{{\mathit{\boldsymbol{\tilde A}}}^{\text{T}}}} \right)$, the function F(λ) is nonnegative, strictly monotonically increasing and strictly concave, i.e.

$ F'\left( \lambda \right) > 0,F''\left( \lambda \right) < 0\;\;\;\;\forall \lambda > 0 $

Proof: Implicit differentiation of Eq.(20) with respect to λ gives

$ \lambda \mathit{\boldsymbol{\tilde x'}}\left( \lambda \right) + {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right) = - \mathit{\boldsymbol{\tilde x}}\left( \lambda \right) $ (23)

Computing the inner product of Eq.(23) with $\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)$ yields

$ {\left\| {\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right)} \right\|^2} + \lambda {\left\| {\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right\|^2} = - \left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right) $

In view of Proposition 1(2), we obtain

$ \begin{array}{*{20}{c}} {F''\left( \lambda \right) = - 2{{\left\| {\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right)} \right\|}^2} - 2\lambda {{\left\| {\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right\|}^2} \le 0}\\ {F'\left( \lambda \right) \ge 0} \end{array} $

Next we show that equality cannot hold in the above expressions. Assume that $\hat \lambda > 0$ satisfies $F''\left( {\hat \lambda } \right) = 0$; then $\mathit{\boldsymbol{\tilde x'}}\left( {\hat \lambda } \right) = 0$. By Eq.(23), $\mathit{\boldsymbol{\tilde x}}\left( {\hat \lambda } \right) = 0$, and Eq.(20) then yields

$ {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} = 0 $

which contradicts $\mathit{\boldsymbol{\tilde b}} \notin {\text{Ker}}\left( {{{\mathit{\boldsymbol{\tilde A}}}^{\text{T}}}} \right)$; therefore Proposition 2 is proved.

Proposition 3 F(λ) satisfies the differential relationship

$ \frac{{\rm{d}}}{{{\rm{d}}\lambda }}\left\{ {\lambda F'\left( \lambda \right) + F\left( \lambda \right) + {{\left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right\|}^2}} \right\} = 0\;\;\;\;\forall \lambda > 0 $

Proof: Implicit differentiation of Eq.(20) with respect to λ yields

$ \mathit{\boldsymbol{\tilde x}}\left( \lambda \right) + \lambda \mathit{\boldsymbol{\tilde x'}}\left( \lambda \right) + {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right) = 0 $

Computing the inner product of the above equation with $\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)$ yields

$ \left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right) + \lambda \left( {\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right) + \left( {\mathit{\boldsymbol{\tilde A \tilde x'}}\left( \lambda \right),\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right) = 0 $

and combining with Proposition 1(2) yields

$ F'\left( \lambda \right) + \frac{\lambda }{2}F''\left( \lambda \right) + \frac{1}{2}\frac{{\rm{d}}}{{{\rm{d}}\lambda }}\left( {\mathit{\boldsymbol{\tilde A \tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right) = 0 $

i.e.

$ \frac{{\rm{d}}}{{{\rm{d}}\lambda }}\left\{ {\frac{\lambda }{2}F'\left( \lambda \right) + \frac{1}{2}F\left( \lambda \right) + \frac{1}{2}{{\left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right\|}^2}} \right\} = 0 $

Therefore, Proposition 3 has been proved.

1.2 Fractional Tikhonov

In this section, we use a fractional power of the matrix $\mathit{\boldsymbol{\tilde A}}{{\mathit{\boldsymbol{\tilde A}}}^{\text{T}}}$ as a weighting matrix to measure the residual error in the standard form (18) with a semi-norm[8]. We replace the penalized least-squares problem (18) by a minimization problem of the form

$ \mathop {\min }\limits_{\mathit{\boldsymbol{\tilde x}} \in {{\bf{R}}^l}} \left\{ {\left\| {\mathit{\boldsymbol{\tilde A\tilde x}} - \mathit{\boldsymbol{\tilde b}}} \right\|_\mathit{\boldsymbol{H}}^2 + {\mu ^{ - 1}}{{\left\| {\mathit{\boldsymbol{\tilde x}}} \right\|}^2}} \right\} $ (24)

where the matrix H is symmetric positive semi-definite and

$ {\left\| \mathit{\boldsymbol{M}} \right\|_\mathit{\boldsymbol{H}}} = {\left( {{\mathit{\boldsymbol{M}}^{\rm{T}}}\mathit{\boldsymbol{HM}}} \right)^{1/2}} $ (25)

for any vector M. The value of μ is crucial: it determines how sensitive the solution of Eq.(24) is to the error e in ${\mathit{\boldsymbol{\tilde b}}}$. Like the penalized least-squares problem (18), the minimization problem (24) has a unique solution ${{\mathit{\boldsymbol{\tilde x}}}_\mu }$ for any μ>0.

Assuming that

$ \mathit{\boldsymbol{H}} = {\left( {\mathit{\boldsymbol{\tilde A}}{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}} \right)^{\left( {a - 1} \right)/2}} $ (26)

for a>0. When a<1, the fractional power is defined in terms of the Moore-Penrose pseudo-inverse of $\mathit{\boldsymbol{\tilde A}}{{\mathit{\boldsymbol{\tilde A}}}^{\text{T}}}$. The choice of a is the key to determining ${{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}$ and can make the approximate solution more accurate. We refer to the minimization problem (24) as the fractional Tikhonov method (also known as the weighted Tikhonov method)[9]. When a=1, standard Tikhonov regularization is recovered.

The normal equation associated with the penalized least-squares problem (24) is given by

$ \left( {{{\left( {{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A}}} \right)}^{\left( {a + 1} \right)/2}} + {\mu ^{ - 1}}\mathit{\boldsymbol{I}}} \right)\mathit{\boldsymbol{\tilde x}} = {\left( {{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A}}} \right)^{\left( {a - 1} \right)/2}}{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} $ (27)

Then introduce the singular value decomposition (SVD) of ${\mathit{\boldsymbol{\tilde A}}}$, shown as

$ \mathit{\boldsymbol{\tilde A}} = \mathit{\boldsymbol{U \boldsymbol{\varSigma} }}{\mathit{\boldsymbol{V}}^{\rm{T}}} $ (28)

where

$ \mathit{\boldsymbol{V}} = \left[ {{v_1},{v_2}, \cdots ,{v_n}} \right] \in {{\bf{R}}^{n \times n}} $

and

$ \mathit{\boldsymbol{U}} = \left[ {{u_1},{u_2}, \cdots ,{u_m}} \right] \in {{\bf{R}}^{m \times m}} $

are orthogonal matrices and

$ \mathit{\boldsymbol{ \boldsymbol{\varSigma} }} = {\rm{diag}}\left[ {{\sigma _1},{\sigma _2}, \cdots ,{\sigma _n}} \right] \in {{\bf{R}}^{m \times n}} $ (29)

whose diagonal elements are arranged in the following order

$ {\sigma _1} \ge {\sigma _2} \ge \cdots \ge {\sigma _r} > {\sigma _{r + 1}} = \cdots = {\sigma _n} = 0 $ (30)

where the index r is the rank of ${\mathit{\boldsymbol{\tilde A}}}$[10].

Substituting the singular value decomposition (Eq.(28)) into Eq.(27) yields

$ \left( {{\mathit{\boldsymbol{ \boldsymbol{\varSigma} }}^{a + 1}} + {\mu ^{ - 1}}\mathit{\boldsymbol{I}}} \right){\mathit{\boldsymbol{V}}^{\rm{T}}}\mathit{\boldsymbol{\tilde x}} = {\mathit{\boldsymbol{ \boldsymbol{\varSigma} }}^a}{\mathit{\boldsymbol{U}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} $ (31)

Then the solution of Eq.(27) can be written as

$ \mathit{\boldsymbol{\tilde x}} = \mathit{\boldsymbol{V}}{\left( {{\mathit{\boldsymbol{ \boldsymbol{\varSigma} }}^{a + 1}} + {\mu ^{ - 1}}\mathit{\boldsymbol{I}}} \right)^{ - 1}}{\mathit{\boldsymbol{ \boldsymbol{\varSigma} }}^a}{\mathit{\boldsymbol{U}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} $ (32)

which is equivalent to

$ {{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}} = \sum\limits_{i = 1}^n {\varphi \left( {{\sigma _i}} \right)\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right){v_i}} $ (33)

where

$ \varphi \left( {{\sigma _i}} \right) = \frac{{\sigma _i^a}}{{\sigma _i^{a + 1} + \mu }} $ (34)

The solution xμ of Eq.(5) can be recovered from the solution of Eq.(33) according to

$ {\mathit{\boldsymbol{x}}_\mu } = \mathit{\boldsymbol{L}}_A^\dagger {{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}} + {\mathit{\boldsymbol{x}}_0} $ (35)
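In code, the filtered solution (33) can be evaluated directly from the SVD. A minimal Python sketch (our naming; μ here denotes the additive parameter appearing in the filter (34)):

```python
import numpy as np

def fractional_solution(At, bt, mu, a):
    """x_tilde_{mu,a} of Eq. (33) with filter factors phi of Eq. (34)."""
    U, s, Vt = np.linalg.svd(At, full_matrices=False)     # SVD, Eq. (28)
    phi = np.where(s > 0, s**a / (s**(a + 1) + mu), 0.0)  # Eq. (34); zero at sigma = 0
    return Vt.T @ (phi * (U.T @ bt))                      # Eq. (33)
    # x_mu is then recovered via Eq. (35): x_mu = L_A^dagger x_tilde + x0
```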

In addition, the filter function given by Eq.(34) for a>0 has the following asymptotic behavior

$ \varphi \left( \sigma \right) = \frac{{{\sigma ^a}}}{{{\sigma ^{a + 1}} + \mu }} = \frac{{{\sigma ^a}}}{\mu } + O\left( {{\sigma ^{2a + 1}}} \right)\;\;\;\;\sigma \to 0 $ (36)

and

$ \varphi \left( \sigma \right) = \frac{{{\sigma ^a}}}{{{\sigma ^{a + 1}} + \mu }} = {\sigma ^{ - 1}} + O\left( {{\sigma ^{ - \left( {a + 2} \right)}}} \right)\;\;\;\;\sigma \to \infty $ (37)

Then we consider the filter function of standard Tikhonov regularization shown as

$ \tilde \varphi \left( \sigma \right) = \frac{\sigma }{{{\sigma ^2} + \mu }} $

It is easy to show that the filter function (34) is less smoothing than $\tilde \varphi \left( \sigma \right)$ for 0 < a < 1: the singular values are damped less by the filter function (34) than by $\tilde \varphi \left( \sigma \right)$, so the approximate solution (35) can be closer to the exact solution than the standard Tikhonov solution.

2 Arnoldi-Projection Fractional Tikhonov

The above regularization method is based on the singular value decomposition of the coefficient matrix. However, the SVD requires a very large amount of computation for large-scale matrices. Therefore, we project the large-scale problem onto a low-dimensional Krylov subspace. Lewis and Reichel proposed the Arnoldi-Tikhonov regularization method[11] in 2009 and described it in detail. Subsequently, global Arnoldi-Tikhonov and augmented Arnoldi-Tikhonov regularization methods were proposed[12-13].

We reduce the problem (3) to a problem of smaller size by applying the Arnoldi process to A with initial vector ${\mathit{\boldsymbol{v}}_1} = \mathit{\boldsymbol{b/}}\left\| \mathit{\boldsymbol{b}} \right\|$. This yields the decomposition

$ \mathit{\boldsymbol{A}}{\mathit{\boldsymbol{V}}_k} = {\mathit{\boldsymbol{V}}_{k + 1}}{{\mathit{\boldsymbol{\bar H}}}_k} $ (38)

where $\mathit{\boldsymbol{V}}_{k+1} \in {{\bf{R}}^{n \times \left( {k + 1} \right)}}$ has orthonormal columns, $\mathit{\boldsymbol{V}}_k = \left[ {{\mathit{\boldsymbol{v}}_1},{\mathit{\boldsymbol{v}}_2}, \cdots ,{\mathit{\boldsymbol{v}}_k}} \right] \in {{\bf{R}}^{n \times k}}$ consists of its first k columns, and the columns of $\mathit{\boldsymbol{V}}_k$ span the Krylov subspace

$ {K_k}\left( {\mathit{\boldsymbol{A}},\mathit{\boldsymbol{b}}} \right) = {\rm{span}}\left( {\mathit{\boldsymbol{b}},\mathit{\boldsymbol{Ab}}, \cdots ,{\mathit{\boldsymbol{A}}^{k - 1}}\mathit{\boldsymbol{b}}} \right) $ (39)

We assume that k is chosen sufficiently small so that ${{\mathit{\boldsymbol{\bar H}}}_k} \in {{\bf{R}}^{\left( {k + 1} \right) \times k}}$ is an upper Hessenberg matrix with nonvanishing subdiagonal entries. Then ${{\mathit{\boldsymbol{\bar H}}}_k}$ is of rank k. We seek to determine an approximate solution xμ, k of Eq.(4) in the Krylov subspace (39).

Substituting

$ \mathit{\boldsymbol{x}} = {\mathit{\boldsymbol{V}}_k}\mathit{\boldsymbol{y}}\;\;\;\mathit{\boldsymbol{y}} \in {{\bf{R}}^k} $

into Eq.(4) and using Eq.(38) yields the reduced minimization problem

$ \mathop {\min }\limits_{\mathit{\boldsymbol{y}} \in {{\bf{R}}^k}} \left\{ {{{\left\| {{{\mathit{\boldsymbol{\bar H}}}_k}\mathit{\boldsymbol{y}} - \mathit{\boldsymbol{V}}_{k + 1}^{\rm{T}}\mathit{\boldsymbol{b}}} \right\|}^2} + {\mu ^{ - 1}}{{\left\| \mathit{\boldsymbol{y}} \right\|}^2}} \right\} $ (40)

whose solution is denoted by yμ, k. The reduced minimization problem (40) is solved using the projection fractional Tikhonov regularization method described in Section 1; then

$ {\mathit{\boldsymbol{x}}_{\mu ,k}} = {\mathit{\boldsymbol{V}}_k}{\mathit{\boldsymbol{y}}_{\mu ,k}} $ (41)

is an approximate solution of Eq.(4).
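To make the reduction concrete, the following Python sketch implements the Arnoldi process (38) (a simplified version without reorthogonalization, which practical implementations usually add):

```python
import numpy as np

def arnoldi(A, b, k):
    """k steps of the Arnoldi process, Eq. (38): A V_k = V_{k+1} H_bar_k."""
    n = b.size
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    V[:, 0] = b / np.linalg.norm(b)       # v_1 = b / ||b||
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):            # modified Gram-Schmidt orthogonalization
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] == 0.0:            # breakdown: the Krylov subspace is invariant
            return V[:, :j + 1], H[:j + 1, :j]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

# The reduced problem (40) has right-hand side V_{k+1}^T b = ||b|| e_1; it is
# solved by the projection fractional Tikhonov method of Section 1, and the
# approximate solution of Eq. (4) is x_{mu,k} = V[:, :k] @ y_{mu,k}, Eq. (41).
```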

3 Parameters Selection

This section discusses the determination of the regularization parameters. We first consider the effects of the parameters a and μ on ${{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}$. It follows from the solution (33) that

$ \frac{\partial }{{\partial a}}{\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}} \right\|^2} = 2\mu \sum\limits_{i = 1}^r {\frac{{\log \left( {{\sigma _i}} \right)\sigma _i^{ - a}}}{{{{\left( {{\sigma _i} + \mu \sigma _i^{ - a}} \right)}^3}}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} $ (42)

and

$ \frac{\partial }{{\partial \mu }}{\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}} \right\|^2} = - 2\sum\limits_{i = 1}^r {\frac{{\sigma _i^{2a}}}{{{{\left( {\sigma _i^{a + 1} + \mu } \right)}^3}}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} < 0 $ (43)

i.e., ${\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,\mathit{a}}}} \right\|^2}$ is a monotonically decreasing function of μ, and ${\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,\mathit{a}}}} \right\|^2}$ is a monotonically increasing function of a when log(σi)>0[14]. The amount of error e determines the choice of the regularization parameter μ, so different values of μ are obtained for different e. Generally, the larger e is, the larger μ should be. Since the norm of the computed solution decreases as μ increases, ${\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,\mathit{a}}}} \right\|}$ may be smaller than the norm of the desired solution ${\mathit{\boldsymbol{\hat x}}}$. Interestingly, a value a < 1 increases ${\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,\mathit{a}}}} \right\|}$, i.e., an appropriate value of a can yield a more accurate approximation of ${\mathit{\boldsymbol{\hat x}}}$.

Recalling Eq.(35), i.e.

$ {\mathit{\boldsymbol{x}}_\mu } = \mathit{\boldsymbol{L}}_A^\dagger {{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}} + {\mathit{\boldsymbol{x}}_0} $

we have

$ \mathit{\boldsymbol{\tilde A\tilde x}} - \mathit{\boldsymbol{\tilde b}} = \mathit{\boldsymbol{Ax}} - \mathit{\boldsymbol{b}} $ (44)

where ${\mathit{\boldsymbol{\tilde A}}}$ and ${\mathit{\boldsymbol{\tilde b}}}$ are given by Eqs.(16), (17). The discrepancy associated with xμ is defined by

$ {\mathit{\boldsymbol{d}}_\mu }: = \mathit{\boldsymbol{b}} - \mathit{\boldsymbol{A}}{\mathit{\boldsymbol{x}}_\mu } $ (45)

and we assume that an estimate of the norm of the error is available

$ \varepsilon : = \left\| \mathit{\boldsymbol{e}} \right\| $ (46)

Then we can apply the discrepancy principle to determine a suitable value of the regularization parameter μ. Let a>0 be fixed and define

$ \delta = \varepsilon \eta $ (47)

where η>1 is a user-supplied constant independent of ε.We determine μ>0, so that the solution xμ of Eq.(4) satisfies

$ \left\| {\mathit{\boldsymbol{\tilde b}} - \mathit{\boldsymbol{\tilde A}}{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}} \right\| = \varepsilon \eta = \delta $ (48)

The vector xμ is then said to satisfy the discrepancy principle[15]. Solving Eq.(48) for μ is equivalent to computing the positive zero of the function

$ {\varphi _a}\left( \mu \right) = \sum\limits_{i = 1}^r {{{\left( {\mu \sigma _i^{a + 1} + 1} \right)}^{ - 2}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} + \sum\limits_{i = r + 1}^m {{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} - {\delta ^2} $ (49)

where r is the rank of ${\mathit{\boldsymbol{\tilde A}}}$. Thus

$ {{\varphi '}_a}\left( \mu \right) = - 2\sum\limits_{i = 1}^r {\sigma _i^{a + 1}{{\left( {\mu \sigma _i^{a + 1} + 1} \right)}^{ - 3}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} $ (50)

and

$ {{\varphi ''}_a}\left( \mu \right) = 6\sum\limits_{i = 1}^r {\sigma _i^{2a + 2}{{\left( {\mu \sigma _i^{a + 1} + 1} \right)}^{ - 4}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} $ (51)

We take the initial value μ0:=0 for Newton's method, ${\mu _{k + 1}} = {\mu _k} - {\varphi _a}\left( {{\mu _k}} \right)/{{\varphi '}_a}\left( {{\mu _k}} \right)$, to compute the positive zero of the function φa(μ). The Newton iterations are terminated as soon as a value of μ such that

$ {\varphi _a}\left( \mu \right) \le \frac{1}{{100}}\left( {{\eta ^2} - 1} \right){\varepsilon ^2} $ (52)

has been determined. The factor 1/100 in Eq.(52) is used in our implementation, but other positive factors strictly smaller than one can also be used.
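The Newton iteration can be sketched as follows (Python; our transcription of Eqs.(49), (50), (52), where s holds the singular values and beta the squared coefficients $(\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}})^2$):

```python
import numpy as np

def discrepancy_newton(s, beta, r, a, eps, eta, max_iter=50):
    """Newton's method for the positive zero of phi_a(mu), Eqs. (49)-(52)."""
    delta2 = (eps * eta) ** 2
    tail = beta[r:].sum()                       # sum for i = r+1, ..., m in Eq. (49)
    mu = 0.0                                    # initial value mu_0 := 0
    for _ in range(max_iter):
        t = mu * s[:r] ** (a + 1) + 1.0
        phi = (beta[:r] / t**2).sum() + tail - delta2        # Eq. (49)
        if phi <= (eta**2 - 1.0) * eps**2 / 100.0:           # stopping rule, Eq. (52)
            return mu
        dphi = -2.0 * (s[:r] ** (a + 1) * beta[:r] / t**3).sum()  # Eq. (50)
        mu = mu - phi / dphi                                 # Newton step
    return mu
```

Since φa(μ) is decreasing and convex by Eqs.(50), (51), the Newton iterates starting from μ0=0 increase monotonically toward the positive zero.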

4 Numerical Examples

We use three test examples to illustrate the performance of the Arnoldi projection fractional Tikhonov (APFT) regularization method and compare it with the Arnoldi fractional Tikhonov (AFT) and Arnoldi Tikhonov (AT) methods for large scale linear discrete ill-posed problems. The orthogonal projection with

$ P: = 1/{n^{1/2}}{\left[ {1,1, \cdots ,1} \right]^{\rm{T}}} \in {{\bf{R}}^n} $

has the same null space as the regularization operator

$ \mathit{\boldsymbol{L}} = \left[ {\begin{array}{*{20}{c}} 1&{ - 1}&{}&0\\ {}&1&{ - 1}&{}\\ {}&{}& \ddots&\ddots \\ 0&{}&1&{ - 1} \end{array}} \right] \in {{\bf{R}}^{\left( {n - 1} \right) \times n}} $

which will be applied in the following examples. All computations were carried out in MATLAB with about 16 significant decimal digits.
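As a quick sanity check (our illustration), both the orthogonal projection I−PPT and the difference operator L above annihilate the constant vector, confirming that they share the null space span{[1, 1, …, 1]T}:

```python
import numpy as np

n = 6
P = np.ones((n, 1)) / np.sqrt(n)                   # P in R^{n x 1}, P^T P = 1
Lproj = np.eye(n) - P @ P.T                        # orthogonal projection, Eq. (6)
Ldiff = np.eye(n - 1, n) - np.eye(n - 1, n, k=1)   # forward differences, (n-1) x n

ones = np.ones(n)
print(np.linalg.norm(Lproj @ ones), np.linalg.norm(Ldiff @ ones))  # both ~0
```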

Example 1 Consider the Fredholm integral equation of the first kind

$ \int_{ - \frac{\pi }{2}}^{\frac{\pi }{2}} {k\left( {\omega ,\rho } \right)x\left( \rho \right){\rm{d}}\rho } = g\left( \omega \right)\;\;\;\;\; - \frac{\pi }{2} \le \omega \le \frac{\pi }{2} $

the MATLAB code shaw produces a discretization $\mathit{\boldsymbol{A}} \in {{\bf{R}}^{1\;000 \times 1\;000}}$ and the right-hand side ${\mathit{\boldsymbol{\hat b}}} \in {{\bf{R}}^{1\;000}}$ by a Galerkin method with orthonormal box functions[16]. The noise-level λ is defined by $\lambda = \left\| \mathit{\boldsymbol{e}} \right\|/\left\| {\mathit{\boldsymbol{\hat b}}} \right\|$. We then compare the approximate solutions computed by the APFT regularization method with the exact solution for different values of the error vector e.
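For readers without Regularization Tools, the following rough Python transcription of the shaw quadrature discretization (our reading of Ref.[16]; details such as the treatment of the removable singularity may differ from the MATLAB code) sets up the test problem and the noisy data (2):

```python
import numpy as np

def shaw(n):
    """Quadrature discretization of the shaw test problem (after Hansen's
    Regularization Tools; our Python transcription, details may differ)."""
    h = np.pi / n
    t = -np.pi / 2 + (np.arange(n) + 0.5) * h        # midpoint quadrature nodes
    co = np.cos(t)
    psi = np.pi * np.sin(t)
    ss = psi[:, None] + psi[None, :]
    A = h * ((co[:, None] + co[None, :]) * np.sinc(ss / np.pi)) ** 2
    x = 2 * np.exp(-6 * (t - 0.8) ** 2) + np.exp(-2 * (t + 0.5) ** 2)
    return A, x, A @ x                               # matrix, exact solution, exact data

A, x_exact, b_hat = shaw(1000)
rng = np.random.default_rng(0)
e = rng.standard_normal(b_hat.size)
e *= 0.01 * np.linalg.norm(b_hat) / np.linalg.norm(e)   # noise-level lambda = 1%
b = b_hat + e                                           # Eq. (2)
```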

Fig. 1 illustrates that the approximate solution obtained by the APFT method approximates the exact solution well, which means that the APFT regularization method is effective.

Fig. 1 Recovery results of Phillips with diverse noise-level

Example 2 The Fredholm integral equation of the first kind is

$ \int_a^b {k\left( {s,t} \right)f\left( t \right){\rm{d}}t} = g\left( s \right)\;\;\;\;\;c \le s \le d $

and the MATLAB codes baart, shaw, phillips, gravity, foxgood and deriv2 discretize it by a Galerkin method with orthonormal box functions, with matrix order n=1 000. The noise-level λ is defined by

$ \lambda = \frac{{\left\| \mathit{\boldsymbol{e}} \right\|}}{{\left\| {\mathit{\boldsymbol{\hat b}}} \right\|}} $

The regularization parameter μ is determined by the discrepancy principle. The tables report the relative errors $\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,\mathit{a}}} - \mathit{\boldsymbol{\hat x}}} \right\|/\left\| {\mathit{\boldsymbol{\hat x}}} \right\|$ for several noise-levels and show that the proposed method improves the accuracy of the computed solutions.

Tables 1 and 2 show the qualities of the AT, AFT and APFT methods for various examples (n=1 000). The results show that APFT usually yields solutions of high quality; in other words, APFT is superior to AFT and AT.

Table 1 Qualities of these methods with the error-level (λ=1%)

Table 2 Qualities of these methods with the error-level (λ=10%)

Example 3 We show the performance of the method on the restoration of a discrete image contaminated by blur and noise. Our task is to deblur two-dimensional images degraded by additive noise and spatially invariant blur. The restoration problems were proposed by the US Air Force Phillips Laboratory. The two-dimensional image restoration problem can be modeled by a linear system of equations Ax=b. The matrix A is a discrete blurring operator defined by a discrete point spread function. The components of the vectors b and ${\mathit{\boldsymbol{\hat x}}}$ are the lexicographically-ordered pixel values of the distorted image and the exact image, respectively. We efficiently compute matrix-vector products without explicitly forming A by using the fast discrete Fourier transform and the discrete point spread function.
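The matrix-free evaluation of such matrix-vector products can be sketched as follows (Python; the Gaussian point spread function and periodic boundary conditions are assumptions of this illustration, not specifications of the test problem):

```python
import numpy as np

def gaussian_psf(shape, sigma=2.0):
    """Gaussian point spread function, centered and normalized (our choice)."""
    m, n = shape
    y = np.arange(m) - m // 2
    x = np.arange(n) - n // 2
    P = np.exp(-(y[:, None]**2 + x[None, :]**2) / (2 * sigma**2))
    return P / P.sum()

def blur_operator(psf):
    """Return a matvec x -> A x for periodic (circulant) blur; A is never formed."""
    S = np.fft.fft2(np.fft.ifftshift(psf))   # eigenvalues of the blurring matrix
    def matvec(x_img):
        return np.real(np.fft.ifft2(S * np.fft.fft2(x_img)))
    return matvec

# Usage: A = blur_operator(gaussian_psf(img.shape)); blurred = A(img) + noise.
# For the Arnoldi process, images are reshaped to vectors of length m*n.
```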

Fig. 2 displays the noise- and blur-free image, the contaminated image, as well as the restored images of Lena determined by the AFT and APFT methods. The images illustrate that APFT gives better reconstructions than AFT.

Fig. 2 Original, blurred, and restored Lena images

Fig. 3 displays the noise- and blur-free image, the contaminated image, as well as the restored images of "MATH" determined by the AT and APFT methods. The approximate solutions obtained by the APFT method are nearly optimal for this example. Indeed, the computed solutions are close to the orthogonal projection of the exact solution onto the range-restricted subspace. The AT method, however, produces approximate solutions of lower quality than the APFT method.

Fig. 3 Original, blurred, and restored MATH images

5 Conclusions

In this paper, we propose the APFT regularization method for solving large scale linear discrete ill-posed problems. Our method is easy to implement, and numerical examples show that it is effective and gives more accurate approximations than the AT and AFT methods.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 11571171 and 61473148).

References
[1]
HANSEN P C. Regularization tools:A Matlab package for analysis and solution of discrete ill-posed problems[J]. Numerical Algorithms, 1994, 6(1): 1-35. DOI:10.1007/BF02149761
[2]
FUHRY M, REICHEL L. A new Tikhonov regularization method[J]. Numerical Algorithms, 2012, 59(3): 433-445. DOI:10.1007/s11075-011-9498-x
[3]
HANKE M, HANSEN P C. Regularization methods for large-scale problems[J]. Surveys on Mathematics for Industry, 1993, 3(4): 253-315.
[4]
CALVETTI D, REICHEL L, SHUIBI A. Invertible smoothing preconditioners for linear discrete ill-posed problems[J]. Applied Numerical Mathematics, 2005, 54(2): 135-149. DOI:10.1016/j.apnum.2004.09.027
[5]
HANSEN P C, JENSEN T K. Smoothing-norm preconditioning for regularizing minimum-residual methods[J]. SIAM Journal on Matrix Analysis and Applications, 2006, 29(1): 1-14.
[6]
MILLER K. Least squares methods for ill-posed problems with a prescribed bound[J]. SIAM Journal on Mathematical Analysis, 1970, 1(1): 52-74.
[7]
MORIGI S, REICHEL L, SGALLARI F. Orthogonal projection regularization operators[J]. Numerical Algorithms, 2007, 44(2): 99-114. DOI:10.1007/s11075-007-9080-8
[8]
HOCHSTENBACH M E, REICHEL L. Fractional Tikhonov regularization for linear discrete ill-posed problems[J]. BIT Numerical Mathematics, 2011, 51(1): 197-215. DOI:10.1007/s10543-011-0313-9
[9]
HOCHSTENBACH M E, NOSCHESE S, REICHEL L. Fractional regularization matrices for linear discrete ill-posed problems[J]. Journal of Engineering Mathematics, 2015, 93(1): 1-17. DOI:10.1007/s10665-014-9780-8
[10]
REICHEL L, RODRIGUEZ G. Old and new parameter choice rules for discrete ill-posed problems[J]. Numerical Algorithms, 2013, 63(1): 65-87. DOI:10.1007/s11075-012-9612-8
[11]
LEWIS B, REICHEL L. Arnoldi-Tikhonov regularization methods[J]. Journal of Computational and Applied Mathematics, 2009, 226(1): 92-102. DOI:10.1016/j.cam.2008.05.003
[12]
SIADAT M, AGHAZADE H, OKTEM N. Reordering for improving global Arnoldi-Tikhonov method in image restoration problems[J]. Signal Image and Video Processing, 2017(2): 1-8.
[13]
LIN Y, BAO L, CAO Y. Augmented Arnoldi-Tikhonov regularization methods for solving large-scale linear ill-posed systems[J]. Mathematical Problems in Engineering, 2013, 5: 87-118.
[14]
KLANN E, RAMLAU R. Regularization by fractional filter methods and data smoothing[J]. Inverse Problems, 2008, 24(2): 025018. DOI:10.1088/0266-5611/24/2/025018
[15]
ENGL H W. Discrepancy principles for Tikhonov regularization of ill-posed problems leading to optimal convergence rates[J]. Journal of Optimization Theory and Applications, 1987, 52(2): 209-215. DOI:10.1007/BF00941281
[16]
HANSEN P C. Regularization tools version 4.0 for Matlab 7.3[J]. Numerical Algorithms, 2007, 46(2): 189-194. DOI:10.1007/s11075-007-9136-9