2. Institute of Automation, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, P. R. China
This paper is concerned with the solution of the least-squares problem
$ \mathop {\min }\limits_{\mathit{\boldsymbol{x}} \in {{\bf{R}}^n}} \left\| {\mathit{\boldsymbol{Ax}} - \mathit{\boldsymbol{b}}} \right\|\;\;\;\mathit{\boldsymbol{A}} \in {{\bf{R}}^{n \times n}},\mathit{\boldsymbol{x}} \in {{\bf{R}}^n},b \in {{\bf{R}}^n} $ | (1) |
with a large square matrix A of ill-determined rank. In particular, such a matrix is severely ill-conditioned and may be singular; its singular values decay to zero gradually, without an obvious gap. The vector b represents the available data, which is usually contaminated by a discretization or measurement error e∈Rn, i.e.
$ \mathit{\boldsymbol{b}} = \mathit{\boldsymbol{\hat b}} + \mathit{\boldsymbol{e}} $ | (2) |
where
$ \mathit{\boldsymbol{Ax}} = \mathit{\boldsymbol{\hat b}} $ | (3) |
is consistent and
In view of the ill-conditioning of A and the error e in b, a straightforward solution generally yields a meaningless approximation, so it is essential to stabilize the computation by regularization. Tikhonov regularization is one of the most popular regularization methods owing to its properties and wide applicability. Based on Tikhonov regularization, we consider the penalized least-squares problem
$ \mathop {\min }\limits_{\mathit{\boldsymbol{x}} \in {{\bf{R}}^n}} \left\{ {{{\left\| {\mathit{\boldsymbol{Ax}} - \mathit{\boldsymbol{b}}} \right\|}^2} + {\mu ^{ - 1}}{{\left\| {\mathit{\boldsymbol{Lx}}} \right\|}^2}} \right\} $ | (4) |
where the scalar μ>0 is referred to as the regularization parameter and the matrix L∈Rl×n is the regularization operator[2-3]. The method of this paper requires L to be a square matrix. Calvetti et al.[4] and Hansen et al.[5] described a variety of square regularization operators. For the purpose of obtaining an accurate approximate solution of Eq.(1), we assume that
$ N\left( \mathit{\boldsymbol{A}} \right) \cap N\left( \mathit{\boldsymbol{L}} \right) = \left\{ 0 \right\} $ |
Then the Tikhonov minimization problem (4) has the unique solution
$ {\mathit{\boldsymbol{x}}_\mu }: = {\left( {{\mathit{\boldsymbol{A}}^{\rm{T}}}\mathit{\boldsymbol{A}} + {\mu ^{ - 1}}{\mathit{\boldsymbol{L}}^{\rm{T}}}\mathit{\boldsymbol{L}}} \right)^{ - 1}}{\mathit{\boldsymbol{A}}^{\rm{T}}}\mathit{\boldsymbol{b}} $ | (5) |
for any μ>0, where the superscript "T" denotes transposition of the matrix[6].
This paper solves the minimization problem (4) by simplifying it to standard form and by using a fractional power of the matrix
In this section, we discuss a method which combines fractional matrices with orthogonal projection operators. Projection fractional Tikhonov regularization simplifies the penalized least-squares problem (4) to standard form and uses a fractional power as a weighting matrix to measure the residual error of the standard form in a semi-norm.
1.1 Form Simplification
The penalized least-squares problem (4) can be simplified to standard form with the orthogonal projection
$ \mathit{\boldsymbol{L}}: = \mathit{\boldsymbol{I}} - \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{P}}^{\rm{T}}}\;\;\;\;\mathit{\boldsymbol{P}} \in {{\bf{R}}^{n \times l}},{\mathit{\boldsymbol{P}}^{\rm{T}}}\mathit{\boldsymbol{P}} = \mathit{\boldsymbol{I}} $ | (6) |
which is well suited for use in Tikhonov regularization. In Eq.(6), L is used as the regularization operator. It is convenient to consider the relation between the choice of the matrix L and that of the matrix P; in fact, the choice of P determines the choice of L. Moreover, the choice of P can be carried out in many different ways, some of which yield regularization operators that give more accurate approximations of the desired solution.
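The construction in Eq.(6) is easy to verify numerically. The following NumPy sketch (illustrative only; the paper's own experiments use MATLAB) builds L from the rank-one P used later in the numerical examples and checks that it is an orthogonal projector:

```python
import numpy as np

n = 6
# P spans the constant vector, as in the numerical examples: P = n^{-1/2}[1,...,1]^T
P = np.ones((n, 1)) / np.sqrt(n)

# Orthogonal projection regularization operator, Eq.(6): L = I - P P^T
L = np.eye(n) - P @ P.T

# L is symmetric and idempotent (an orthogonal projector) ...
assert np.allclose(L, L.T)
assert np.allclose(L @ L, L)
# ... and annihilates exactly the columns of P: N(L) = range(P)
assert np.allclose(L @ P, 0)
```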
The A-weighted pseudo-inverse of L is given by
$ \mathit{\boldsymbol{L}}_A^\dagger : = \left( {\mathit{\boldsymbol{I}} - {{\left( {\mathit{\boldsymbol{A}}\left( {\mathit{\boldsymbol{I}} - {\mathit{\boldsymbol{L}}^\dagger }\mathit{\boldsymbol{L}}} \right)} \right)}^\dagger }\mathit{\boldsymbol{A}}} \right){\mathit{\boldsymbol{L}}^\dagger } \in {{\bf{R}}^{n \times l}} $ | (7) |
where L†∈Rn×l denotes the Moore-Penrose pseudoinverse of the regularization operator L, and I is the identity matrix.
Suppose that Eq.(6) holds and introduce the QR-factorization shown as
$ \mathit{\boldsymbol{AP}} = \mathit{\boldsymbol{QR}} $ | (8) |
where R∈Rl×l is upper triangular and Q∈Rn×l has orthonormal columns. Using the properties of the Moore-Penrose pseudo-inverse and of orthogonal projections, we have the following identities for L
$ \mathit{\boldsymbol{I}} - {\mathit{\boldsymbol{L}}^\dagger }\mathit{\boldsymbol{L}} = \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{P}}^{\rm{T}}},{\mathit{\boldsymbol{L}}^\dagger } = \mathit{\boldsymbol{L}} $ | (9) |
It follows that
$ {\left( {\mathit{\boldsymbol{A}}\left( {\mathit{\boldsymbol{I}} - {\mathit{\boldsymbol{L}}^\dagger }\mathit{\boldsymbol{L}}} \right)} \right)^\dagger } = {\left( {\mathit{\boldsymbol{AP}}{\mathit{\boldsymbol{P}}^{\rm{T}}}} \right)^\dagger } = \mathit{\boldsymbol{P}}{\left( {\mathit{\boldsymbol{AP}}} \right)^\dagger } = \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{R}}^{ - 1}}{\mathit{\boldsymbol{Q}}^{\rm{T}}} $ | (10) |
Substituting Eqs.(8), (10) into Eq.(7), we get
$ \mathit{\boldsymbol{L}}_A^\dagger = \left( {\mathit{\boldsymbol{I}} - \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{R}}^{ - 1}}{\mathit{\boldsymbol{Q}}^{\rm{T}}}\mathit{\boldsymbol{A}}} \right)\mathit{\boldsymbol{L}} $ | (11) |
which simplifies to
$ \mathit{\boldsymbol{L}}_A^\dagger = \mathit{\boldsymbol{I}} - \mathit{\boldsymbol{P}}{\mathit{\boldsymbol{R}}^{ - 1}}{\mathit{\boldsymbol{Q}}^{\rm{T}}}\mathit{\boldsymbol{A}} $ | (12) |
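Eqs.(8)-(12) can be checked on a small random example. In the hedged NumPy sketch below, the matrix sizes and the random P are illustrative assumptions, not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, l = 6, 2
A = rng.standard_normal((n, n))

# Orthonormal P and the projector L = I - P P^T, Eq.(6)
P, _ = np.linalg.qr(rng.standard_normal((n, l)))
L = np.eye(n) - P @ P.T

# QR-factorization of AP, Eq.(8)
Q, R = np.linalg.qr(A @ P)

# A-weighted pseudo-inverse, simplified form of Eq.(12)
LA = np.eye(n) - P @ np.linalg.solve(R, Q.T @ A)

# Agreement with the unsimplified definition, Eq.(7)
LA_def = (np.eye(n)
          - np.linalg.pinv(A @ (np.eye(n) - np.linalg.pinv(L) @ L)) @ A) @ np.linalg.pinv(L)
assert np.allclose(LA, LA_def)
```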
We transform the matrix and vectors of the Tikhonov minimization problem (4) by the following substitutions
$ \mathit{\boldsymbol{\tilde A}}: = \mathit{\boldsymbol{AL}}_A^\dagger $ | (13) |
$ \mathit{\boldsymbol{\tilde b}}: = \mathit{\boldsymbol{b}} - \mathit{\boldsymbol{A}}{\mathit{\boldsymbol{x}}_0} $ | (14) |
where
$ {\mathit{\boldsymbol{x}}_0}: = {\left( {\mathit{\boldsymbol{A}}\left( {\mathit{\boldsymbol{I}} - {\mathit{\boldsymbol{L}}^\dagger }\mathit{\boldsymbol{L}}} \right)} \right)^\dagger }\mathit{\boldsymbol{b}} $ | (15) |
When L is an orthogonal projection operator, Eqs.(13) and (14) can be expressed simply as
$ \mathit{\boldsymbol{\tilde A}}: = \mathit{\boldsymbol{AL}}_A^\dagger = \left( {\mathit{\boldsymbol{I}} - \mathit{\boldsymbol{Q}}{\mathit{\boldsymbol{Q}}^{\rm{T}}}} \right)\mathit{\boldsymbol{A}} $ | (16) |
$ \mathit{\boldsymbol{\tilde b}}: = \mathit{\boldsymbol{b}} - \mathit{\boldsymbol{A}}{\mathit{\boldsymbol{x}}_0} = \left( {\mathit{\boldsymbol{I}} - \mathit{\boldsymbol{Q}}{\mathit{\boldsymbol{Q}}^{\rm{T}}}} \right)\mathit{\boldsymbol{b}} $ | (17) |
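The simplified expressions (16) and (17) agree with the definitions (13)-(15), as the following NumPy sketch confirms on a small random problem (sizes and data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, l = 6, 2
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
P, _ = np.linalg.qr(rng.standard_normal((n, l)))
Q, R = np.linalg.qr(A @ P)
LA = np.eye(n) - P @ np.linalg.solve(R, Q.T @ A)   # Eq.(12)

# Standard-form quantities, Eqs.(13)-(15); by Eq.(10), x0 = P R^{-1} Q^T b
x0 = P @ np.linalg.solve(R, Q.T @ b)
A_t = A @ LA
b_t = b - A @ x0

# Simplified expressions, Eqs.(16)-(17)
Proj = np.eye(n) - Q @ Q.T
assert np.allclose(A_t, Proj @ A)
assert np.allclose(b_t, Proj @ b)
```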
Then the problem (4) reduces to the Tikhonov minimization problem in standard form
$ \mathop {\min }\limits_{\mathit{\boldsymbol{\tilde x}} \in {{\bf{R}}^l}} \left\{ {{{\left\| {\mathit{\boldsymbol{\tilde A\tilde x}} - \mathit{\boldsymbol{\tilde b}}} \right\|}^2} + {\mu ^{ - 1}}{{\left\| {\mathit{\boldsymbol{\tilde x}}} \right\|}^2}} \right\} $ | (18) |
An attractive property of this transformation is that the LA† defined by Eq.(7) has a simple form, which makes the orthogonal projection (6) easy to use. For any μ>0, let λ=1/μ; then the minimum value of Eq.(18) is
$ F\left( \lambda \right): = {\left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right) - \mathit{\boldsymbol{\tilde b}}} \right\|^2} + \lambda {\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|^2} = \min {J_\lambda }\left( {\mathit{\boldsymbol{\tilde x}}} \right) $ | (19) |
Given any λ>0, the solution x̃(λ) is uniquely determined and satisfies
$ \lambda \mathit{\boldsymbol{\tilde x}}\left( \lambda \right) + {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right) = {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} $ | (20) |
Then
$ F\left( 0 \right): = \inf {\left\| {\mathit{\boldsymbol{\tilde A\tilde x}} - \mathit{\boldsymbol{\tilde b}}} \right\|^2}: = {\omega ^2} $ | (21) |
is defined. Consequently, F(λ) is continuous on [0, ∞), and some of its properties are given in the following.
Proposition 1 F(λ) is infinitely differentiable, and has the following properties:
(1) $ \mathop {\lim }\limits_{\lambda \to \infty } F\left( \lambda \right) = {\left\| {\mathit{\boldsymbol{\tilde b}}} \right\|^2} $
(2) For any λ>0, the first and second order derivatives of F(λ) are as follows
$ F'\left( \lambda \right) = {\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|^2},F''\left( \lambda \right) = 2\left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right). $ |
Proof:
(1) Computing the inner product of Eq.(20) with x̃(λ) yields
$ \begin{array}{*{20}{c}} {\lambda {{\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|}^2} \le \lambda {{\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|}^2} + {{\left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right\|}^2} = }\\ {\left( {{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}},\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right) \le \left\| {{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}}} \right\|\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|} \end{array} $ | (22) |
which implies that
$ \mathop {\lim }\limits_{\lambda \to \infty } \left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\| = 0 $ |
According to this estimate and Eq.(22), we obtain that
$ \mathop {\lim }\limits_{\lambda \to \infty } \lambda {\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|^2} = 0,\mathop {\lim }\limits_{\lambda \to \infty } \left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right\| = 0 $
Thus the conclusion (1) can be drawn from the definition of F(λ).
(2) Implicit differentiation of Eq.(19) with respect to λ, combined with Eq.(20), yields
$ \begin{array}{*{20}{c}} {F'\left( \lambda \right) = 2\left( {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right) - \mathit{\boldsymbol{\tilde b}},\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right)} \right) + }\\ {2\lambda \left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right) + {{\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|}^2} = {{\left\| {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right\|}^2}}\\ {F''\left( \lambda \right) = 2\left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right)} \end{array} $
thus the conclusion (2) is proved.
Proposition 2 If ${{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} \ne 0$, then
$ F'\left( \lambda \right) > 0,F''\left( \lambda \right) < 0\;\;\;\;\forall \lambda > 0 $ |
Proof: Implicit differentiation of Eq.(20) with respect to λ yields
$ \lambda \mathit{\boldsymbol{\tilde x'}}\left( \lambda \right) + {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right) = - \mathit{\boldsymbol{\tilde x}}\left( \lambda \right) $ | (23) |
Computing the inner product of Eq.(23) with x̃′(λ) yields
$ {\left\| {\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right)} \right\|^2} + \lambda {\left\| {\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right\|^2} = - \left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right) $ |
Combining this with the expressions in Proposition 1, we obtain
$ \begin{array}{*{20}{c}} {F''\left( \lambda \right) = - 2{{\left\| {\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right)} \right\|}^2} - 2\lambda {{\left\| {\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right)} \right\|}^2} \le 0}\\ {F'\left( \lambda \right) \ge 0} \end{array} $ |
We now show that equality cannot hold in either case. If F′(λ)=0 for some λ>0, then x̃(λ)=0, and Eq.(20) gives
$ {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} = 0 $ |
which contradicts the assumption that ${{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} \ne 0$. The case F″(λ)=0 leads to the same contradiction, since x̃′(λ)=0 forces x̃(λ)=0 by Eq.(23).
Proposition 3 F(λ) satisfies the differential relationship
$ \frac{{\rm{d}}}{{{\rm{d}}\lambda }}\left\{ {\lambda F'\left( \lambda \right) + F\left( \lambda \right) + {{\left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right\|}^2}} \right\} = 0\;\;\;\;\forall \lambda > 0 $ |
Proof: Implicit differentiation of Eq.(20) with respect to λ yields
$ \mathit{\boldsymbol{\tilde x}}\left( \lambda \right) + \lambda \mathit{\boldsymbol{\tilde x'}}\left( \lambda \right) + {{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A\tilde x'}}\left( \lambda \right) = 0 $ |
Computing the inner product of the above equation with x̃(λ) yields
$ \left( {\mathit{\boldsymbol{\tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right) + \lambda \left( {\mathit{\boldsymbol{\tilde x'}}\left( \lambda \right),\mathit{\boldsymbol{\tilde x}}\left( \lambda \right)} \right) + \left( {\mathit{\boldsymbol{\tilde A \tilde x'}}\left( \lambda \right),\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right) = 0 $ |
and combining with Eq.(20) yields
$ F'\left( \lambda \right) + \frac{\lambda }{2}F''\left( \lambda \right) + \frac{1}{2}\frac{{\rm{d}}}{{{\rm{d}}\lambda }}\left( {\mathit{\boldsymbol{\tilde A \tilde x}}\left( \lambda \right),\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right) = 0 $ |
i.e.
$ \frac{{\rm{d}}}{{{\rm{d}}\lambda }}\left\{ {\frac{\lambda }{2}F'\left( \lambda \right) + \frac{1}{2}F\left( \lambda \right) + \frac{1}{2}{{\left\| {\mathit{\boldsymbol{\tilde A\tilde x}}\left( \lambda \right)} \right\|}^2}} \right\} = 0 $ |
Therefore, Proposition 3 has been proved.
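Propositions 1-3 can also be verified numerically. The sketch below uses an illustrative random Ã and b̃ (not the paper's data) and finite-difference checks of F′(λ)=‖x̃(λ)‖², the sign conditions of Proposition 2, and the conserved quantity of Proposition 3:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
A_t = rng.standard_normal((n, n))   # plays the role of A~ (standard form)
b_t = rng.standard_normal(n)        # plays the role of b~

def x_tilde(lam):
    # Regularized solution from the normal equation (20)
    return np.linalg.solve(lam * np.eye(n) + A_t.T @ A_t, A_t.T @ b_t)

def F(lam):
    x = x_tilde(lam)
    return np.linalg.norm(A_t @ x - b_t) ** 2 + lam * np.linalg.norm(x) ** 2

lam, h = 0.5, 1e-4
# Proposition 1(2): F'(lam) = ||x~(lam)||^2, checked by a central difference
fd1 = (F(lam + h) - F(lam - h)) / (2 * h)
assert abs(fd1 - np.linalg.norm(x_tilde(lam)) ** 2) < 1e-6

# Proposition 2: F is strictly increasing and concave for lam > 0
fd2 = (F(lam + h) - 2 * F(lam) + F(lam - h)) / h ** 2
assert fd1 > 0 and fd2 < 0

# Proposition 3: lam*F'(lam) + F(lam) + ||A~ x~(lam)||^2 is constant in lam
def G(lam):
    x = x_tilde(lam)
    return lam * np.linalg.norm(x) ** 2 + F(lam) + np.linalg.norm(A_t @ x) ** 2

assert np.isclose(G(0.5), G(1.5))
```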
1.2 Fractional Tikhonov
In this section, we use a fractional power of the matrix $\mathit{\boldsymbol{\tilde A}}{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}$ as the weighting matrix and consider the penalized least-squares problem
$ \mathop {\min }\limits_{\mathit{\boldsymbol{\tilde x}} \in {{\bf{R}}^l}} \left\{ {\left\| {\mathit{\boldsymbol{\tilde A\tilde x}} - \mathit{\boldsymbol{\tilde b}}} \right\|_\mathit{\boldsymbol{H}}^2 + {\mu ^{ - 1}}{{\left\| {\mathit{\boldsymbol{\tilde x}}} \right\|}^2}} \right\} $ | (24) |
where the matrix H is symmetric positive semi-definite and
$ {\left\| \mathit{\boldsymbol{M}} \right\|_\mathit{\boldsymbol{H}}} = {\left( {{\mathit{\boldsymbol{M}}^{\rm{T}}}\mathit{\boldsymbol{HM}}} \right)^{1/2}} $ | (25) |
for any M. It is quite natural that the value of μ matters a great deal, since it determines how sensitive the solution of Eq.(24) is to the error e in b.
Assuming that
$ \mathit{\boldsymbol{H}} = {\left( {\mathit{\boldsymbol{\tilde A}}{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}} \right)^{\left( {a - 1} \right)/2}} $ | (26) |
for a>0. When a < 1, H is defined with the Moore-Penrose pseudo-inverse of $\mathit{\boldsymbol{\tilde A}}{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}$ in case this matrix is singular.
The normal equation associated with the penalized least-squares problem (24) is given by
$ \left( {{{\left( {{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A}}} \right)}^{\left( {a + 1} \right)/2}} + {\mu ^{ - 1}}\mathit{\boldsymbol{I}}} \right)\mathit{\boldsymbol{\tilde x}} = {\left( {{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde A}}} \right)^{\left( {a - 1} \right)/2}}{{\mathit{\boldsymbol{\tilde A}}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} $ | (27) |
Then introduce the singular value decomposition (SVD) of
$ \mathit{\boldsymbol{\tilde A}} = \mathit{\boldsymbol{U \boldsymbol{\varSigma} }}{\mathit{\boldsymbol{V}}^{\rm{T}}} $ | (28) |
where
$ \mathit{\boldsymbol{V}} = \left[ {{v_1},{v_2}, \cdots ,{v_n}} \right] \in {{\bf{R}}^{n \times n}} $ |
and
$ \mathit{\boldsymbol{U}} = \left[ {{u_1},{u_2}, \cdots ,{u_m}} \right] \in {{\bf{R}}^{m \times m}} $ |
are orthogonal matrices and
$ \mathit{\boldsymbol{ \boldsymbol{\varSigma} }} = {\rm{diag}}\left[ {{\sigma _1},{\sigma _2}, \cdots ,{\sigma _n}} \right] \in {{\bf{R}}^{m \times n}} $ | (29) |
whose diagonal elements are arranged in the following order
$ {\sigma _1} \ge {\sigma _2} \ge \cdots \ge {\sigma _r} > {\sigma _{r + 1}} = \cdots = {\sigma _n} = 0 $ | (30) |
where the index r is the rank of $\mathit{\boldsymbol{\tilde A}}$.
Substituting the singular value decomposition (Eq.(28)) into Eq.(27) yields
$ \left( {{\mathit{\boldsymbol{ \boldsymbol{\varSigma} }}^{a + 1}} + {\mu ^{ - 1}}\mathit{\boldsymbol{I}}} \right){\mathit{\boldsymbol{V}}^{\rm{T}}}\mathit{\boldsymbol{\tilde x}} = {\mathit{\boldsymbol{ \boldsymbol{\varSigma} }}^a}{\mathit{\boldsymbol{U}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} $ | (31) |
Then the solution of Eq.(27) can be written as
$ \mathit{\boldsymbol{\tilde x}} = \mathit{\boldsymbol{V}}{\left( {{\mathit{\boldsymbol{ \boldsymbol{\varSigma} }}^{a + 1}} + {\mu ^{ - 1}}\mathit{\boldsymbol{I}}} \right)^{ - 1}}{\mathit{\boldsymbol{ \boldsymbol{\varSigma} }}^a}{\mathit{\boldsymbol{U}}^{\rm{T}}}\mathit{\boldsymbol{\tilde b}} $ | (32) |
which is equivalent to
$ {{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}} = \sum\limits_{i = 1}^n {\varphi \left( {{\sigma _i}} \right)\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right){v_i}} $ | (33) |
where
$ \varphi \left( {{\sigma _i}} \right) = \frac{{\sigma _i^a}}{{\sigma _i^{a + 1} + \mu }} $ | (34) |
The solution xμ of Eq.(5) can be recovered from the solution of Eq.(33) according to
$ {\mathit{\boldsymbol{x}}_\mu } = \mathit{\boldsymbol{L}}_A^\dagger {{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}} + {\mathit{\boldsymbol{x}}_0} $ | (35) |
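The filtered SVD solution (32)-(34) is straightforward to implement. The following NumPy sketch uses the filter exactly as written in Eq.(34) on illustrative random data (the paper's experiments use MATLAB); setting a = 1 recovers the standard Tikhonov filter, which provides a consistency check:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
A_t = rng.standard_normal((n, n))   # plays the role of A~
b_t = rng.standard_normal(n)
mu, a = 1e-2, 0.8                   # illustrative parameter values

U, s, Vt = np.linalg.svd(A_t)

# Fractional filtered solution, Eqs.(33)-(34): phi(sigma) = sigma^a/(sigma^(a+1) + mu)
phi = s ** a / (s ** (a + 1) + mu)
x_frac = Vt.T @ (phi * (U.T @ b_t))

# For a = 1 the filter reduces to standard Tikhonov, x = (A^T A + mu I)^{-1} A^T b
phi1 = s / (s ** 2 + mu)
x_tik = Vt.T @ (phi1 * (U.T @ b_t))
assert np.allclose(x_tik, np.linalg.solve(A_t.T @ A_t + mu * np.eye(n), A_t.T @ b_t))
```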
In addition, for a>0 the filter function given by Eq.(34) has the following asymptotic behavior
$ \varphi \left( \sigma \right) = \frac{{{\sigma ^a}}}{{{\sigma ^{a + 1}} + \mu }} = \frac{{{\sigma ^a}}}{\mu } + O\left( {{\sigma ^{2a + 1}}} \right)\;\;\;\;\sigma \to 0 $ | (36) |
and
$ \varphi \left( \sigma \right) = \frac{{{\sigma ^a}}}{{{\sigma ^{a + 1}} + \mu }} = {\sigma ^{ - 1}} + O\left( {{\sigma ^{ - \left( {a + 2} \right)}}} \right)\;\;\;\;\sigma \to \infty $ | (37) |
Then we consider the filter function of standard Tikhonov regularization shown as
$ \tilde \varphi \left( \sigma \right) = \frac{\sigma }{{{\sigma ^2} + \mu }} $ |
It is easy to show that the filter function (34) is less smoothing than the standard filter φ̃(σ) when 0 < a < 1.
The regularization method above is based on the singular value decomposition of the coefficient matrix. However, the singular value decomposition requires a very large amount of computation for large-scale matrices. Therefore, we project the large-scale problem onto a low-dimensional Krylov subspace. Lewis and Reichel proposed the Arnoldi-Tikhonov regularization method[11] in 2009 and described it in detail. Moreover, global Arnoldi-Tikhonov and augmented Arnoldi-Tikhonov regularization methods were subsequently proposed[12-13].
We reduce the problem (3) to a problem of smaller size by applying the Arnoldi process to A with initial vector ${\mathit{\boldsymbol{v}}_1} = \mathit{\boldsymbol{b}}/\left\| \mathit{\boldsymbol{b}} \right\|$, which after k steps yields the decomposition
$ \mathit{\boldsymbol{A}}{\mathit{\boldsymbol{V}}_k} = {\mathit{\boldsymbol{V}}_{k + 1}}{{\mathit{\boldsymbol{\bar H}}}_k} $ | (38) |
where Vk=[v1, v2, …, vk]∈Rn×k is the first k columns of Vk+1, and Vk+1∈Rn×(k+1) has orthonormal columns, which span the Krylov subspace
$ {K_k}\left( {\mathit{\boldsymbol{A}},\mathit{\boldsymbol{b}}} \right) = {\rm{span}}\left( {\mathit{\boldsymbol{b}},\mathit{\boldsymbol{Ab}}, \cdots ,{\mathit{\boldsymbol{A}}^{k - 1}}\mathit{\boldsymbol{b}}} \right) $ | (39) |
We assume that k is chosen sufficiently small so that the upper Hessenberg matrix $\mathit{\boldsymbol{\bar H}}_k$ has nonvanishing subdiagonal entries; then $\mathit{\boldsymbol{\bar H}}_k$ is of rank k. We seek to determine an approximate solution xμ, k of Eq.(4) in the Krylov subspace (39).
Substituting
$ \mathit{\boldsymbol{x}} = {\mathit{\boldsymbol{V}}_k}\mathit{\boldsymbol{y}}\;\;\;\mathit{\boldsymbol{y}} \in {{\bf{R}}^k} $
into Eq.(4) and using Eq.(38) yields the reduced minimization problem
$ \mathop {\min }\limits_{\mathit{\boldsymbol{y}} \in {{\bf{R}}^k}} \left\{ {{{\left\| {{{\mathit{\boldsymbol{\bar H}}}_k}\mathit{\boldsymbol{y}} - \mathit{\boldsymbol{V}}_{k + 1}^{\rm{T}}\mathit{\boldsymbol{b}}} \right\|}^2} + {\mu ^{ - 1}}{{\left\| \mathit{\boldsymbol{y}} \right\|}^2}} \right\} $ | (40) |
whose solution is denoted by yμ, k. The reduced minimization problem (40) is solved by the projection fractional Tikhonov regularization method described in Section 1; then
$ {\mathit{\boldsymbol{x}}_{\mu ,k}} = {\mathit{\boldsymbol{V}}_k}{\mathit{\boldsymbol{y}}_{\mu ,k}} $ | (41) |
is an approximate solution of Eq.(4).
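A minimal end-to-end sketch of the reduction (38)-(41) follows; for simplicity it solves the reduced problem (40) with a plain Tikhonov term rather than the projection fractional variant, and all sizes and data are illustrative. Because b and AVky both lie in the range of Vk+1, the full and reduced objectives coincide at the computed solution:

```python
import numpy as np

rng = np.random.default_rng(5)
n, k, mu = 50, 6, 10.0
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

# k steps of the Arnoldi process: A V_k = V_{k+1} Hbar_k, Eq.(38)
V = np.zeros((n, k + 1))
H = np.zeros((k + 1, k))
V[:, 0] = b / np.linalg.norm(b)
for j in range(k):
    w = A @ V[:, j]
    for i in range(j + 1):          # modified Gram-Schmidt orthogonalization
        H[i, j] = V[:, i] @ w
        w -= H[i, j] * V[:, i]
    H[j + 1, j] = np.linalg.norm(w)
    V[:, j + 1] = w / H[j + 1, j]
assert np.allclose(A @ V[:, :k], V @ H)

# Reduced Tikhonov problem (40): min ||Hbar y - V_{k+1}^T b||^2 + mu^{-1} ||y||^2
beta = V.T @ b                      # equals ||b|| e_1
y = np.linalg.solve(H.T @ H + np.eye(k) / mu, H.T @ beta)
x = V[:, :k] @ y                    # lifted approximate solution, Eq.(41)

# Full and reduced objectives agree, since b and A V_k y lie in range(V_{k+1})
full = np.linalg.norm(A @ x - b) ** 2 + np.linalg.norm(x) ** 2 / mu
red = np.linalg.norm(H @ y - beta) ** 2 + np.linalg.norm(y) ** 2 / mu
assert np.isclose(full, red)
```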
3 Parameter Selection
This section discusses the determination of the regularization parameter. We first consider the effects of the parameters a and μ on ${\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}} \right\|^2}$:
$ \frac{\partial }{{\partial a}}{\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}} \right\|^2} = 2\mu \sum\limits_{i = 1}^r {\frac{{\log \left( {{\sigma _i}} \right)\sigma _i^{ - a}}}{{{{\left( {{\sigma _i} + \mu \sigma _i^{ - a}} \right)}^3}}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} $ | (42) |
and
$ \frac{\partial }{{\partial \mu }}{\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}} \right\|^2} = - 2\sum\limits_{i = 1}^r {\frac{{\sigma _i^{2a}}}{{{{\left( {\sigma _i^{a + 1} + \mu } \right)}^3}}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} < 0 $ | (43) |
i.e., $\left\| {{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}} \right\|$ is monotonically decreasing in μ.
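The monotonicity expressed by Eq.(43) is easy to confirm numerically with a small synthetic spectrum (the singular values and coefficients below are illustrative, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(7)
s = np.logspace(0, -4, 10)            # illustrative decaying singular values
beta = rng.standard_normal(10)        # illustrative coefficients u_i^T b

def xnorm2(mu, a):
    # ||x_{mu,a}||^2 for the filter of Eq.(34)
    return np.sum((s ** a / (s ** (a + 1) + mu) * beta) ** 2)

# Eq.(43): the norm of the regularized solution decreases as mu grows
mus = np.logspace(-6, 2, 30)
vals = [xnorm2(m, 0.8) for m in mus]
assert all(v1 > v2 for v1, v2 in zip(vals, vals[1:]))
```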
Combining Eqs.(13) and (14) with
$ {\mathit{\boldsymbol{x}}_\mu } = \mathit{\boldsymbol{L}}_A^\dagger {{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}} + {\mathit{\boldsymbol{x}}_0} $ |
we have
$ \mathit{\boldsymbol{\tilde A\tilde x}} - \mathit{\boldsymbol{\tilde b}} = \mathit{\boldsymbol{Ax}} - \mathit{\boldsymbol{b}} $ | (44) |
and we define the discrepancy
$ {\mathit{\boldsymbol{d}}_\mu }: = \mathit{\boldsymbol{b}} - \mathit{\boldsymbol{A}}{\mathit{\boldsymbol{x}}_\mu } $ | (45) |
and we assume that an estimate of the norm of the error e is available:
$ \varepsilon : = \left\| \mathit{\boldsymbol{e}} \right\| $ | (46) |
Then we can apply the discrepancy principle to determine a suitable value of the regularization parameter μ. Let a>0 be fixed and define
$ \delta = \varepsilon \eta $ | (47) |
where η>1 is a user-supplied constant independent of ε.We determine μ>0, so that the solution xμ of Eq.(4) satisfies
$ \left\| {\mathit{\boldsymbol{\tilde b}} - \mathit{\boldsymbol{\tilde A}}{{\mathit{\boldsymbol{\tilde x}}}_{\mu ,a}}} \right\| = \varepsilon \eta = \delta $ | (48) |
The vector xμ is then said to satisfy the discrepancy principle[15]. Solving Eq.(48) for μ is equivalent to finding the positive zero of the function
$ {\varphi _a}\left( \mu \right) = \sum\limits_{i = 1}^r {{{\left( {\mu \sigma _i^{a + 1} + 1} \right)}^{ - 2}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} + \sum\limits_{i = r + 1}^m {{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} - {\delta ^2} $ | (49) |
where r is the rank of $\mathit{\boldsymbol{\tilde A}}$. Thus
$ {{\varphi '}_a}\left( \mu \right) = - 2\sum\limits_{i = 1}^r {\sigma _i^{a + 1}{{\left( {\mu \sigma _i^{a + 1} + 1} \right)}^{ - 3}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} $ | (50) |
and
$ {{\varphi ''}_a}\left( \mu \right) = 6\sum\limits_{i = 1}^r {\sigma _i^{2a + 2}{{\left( {\mu \sigma _i^{a + 1} + 1} \right)}^{ - 4}}{{\left( {\mathit{\boldsymbol{u}}_i^{\rm{T}}\mathit{\boldsymbol{\tilde b}}} \right)}^2}} $ | (51) |
We take the initial approximation μ0:=0 for Newton's method, with iteration ${\mu _{j + 1}} = {\mu _j} - {\varphi _a}\left( {{\mu _j}} \right)/{{\varphi '}_a}\left( {{\mu _j}} \right)$, to compute the positive zero of the function φa(μ). The Newton iterations are terminated as soon as a value of μ such that
$ {\varphi _a}\left( \mu \right) \le \frac{1}{{100}}\left( {{\eta ^2} - 1} \right){\varepsilon ^2} $ | (52) |
has been determined. The factor 1/100 in Eq.(52) is the one used in our implementation, but other positive factors strictly smaller than 1 can also be used.
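The zero-finding procedure of Eqs.(49)-(52) can be sketched as follows; the singular values, noise level, and parameter values are illustrative assumptions, not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(6)
m = 20
s = np.logspace(0, -6, m)              # illustrative decaying singular values
beta = s * rng.standard_normal(m)      # coefficients u_i^T b of smooth exact data
eps = 1e-3
beta += eps / np.sqrt(m) * rng.standard_normal(m)   # noise with norm close to eps
eta, a = 1.1, 0.8
delta = eta * eps                      # Eq.(47)

def phi(mu):
    # Discrepancy function of Eq.(49) (full rank here, so the second sum is empty)
    return np.sum(beta ** 2 / (mu * s ** (a + 1) + 1) ** 2) - delta ** 2

def dphi(mu):
    # Its derivative, Eq.(50)
    return -2 * np.sum(s ** (a + 1) * beta ** 2 / (mu * s ** (a + 1) + 1) ** 3)

# Newton's method from mu_0 = 0; phi is decreasing and convex by Eqs.(50)-(51),
# so the iterates increase monotonically toward the positive zero
mu, tol = 0.0, (eta ** 2 - 1) * eps ** 2 / 100     # stopping rule, Eq.(52)
for _ in range(200):
    if phi(mu) <= tol:
        break
    mu -= phi(mu) / dphi(mu)
assert mu > 0 and phi(mu) <= tol
```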
4 Numerical Examples
We use three test examples to illustrate the performance of the Arnoldi projection fractional Tikhonov (APFT) regularization method and compare it with the Arnoldi fractional Tikhonov (AFT) and Arnoldi Tikhonov (AT) methods for large-scale linear discrete ill-posed problems. The orthogonal projection with
$ P: = 1/{n^{1/2}}{\left[ {1,1, \cdots ,1} \right]^{\rm{T}}} \in {{\bf{R}}^n} $ |
has the same null space as the regularization operator
$ \mathit{\boldsymbol{L}} = \left[ {\begin{array}{*{20}{c}} 1&{ - 1}&{}&0\\ {}&1&{ - 1}&{}\\ {}&{}& \ddots&\ddots \\ 0&{}&1&{ - 1} \end{array}} \right] \in {{\bf{R}}^{\left( {n - 1} \right) \times n}} $ |
which will be applied in the following examples.All computations were carried out in MATLAB with about 16 significant decimal digits.
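As a quick sanity check (in NumPy here, although the paper's computations are in MATLAB), the vector P and the difference operator L above do share the same null space, namely the constant vectors:

```python
import numpy as np

n = 8
# First-difference regularization operator of size (n-1) x n, as displayed above
L = np.eye(n - 1, n) - np.eye(n - 1, n, k=1)
# Projection direction used in the examples: normalized vector of ones
P = np.ones(n) / np.sqrt(n)

# L annihilates constant vectors, so N(L) = span{P} matches the projector choice
assert np.allclose(L @ P, 0)
assert np.linalg.matrix_rank(L) == n - 1
```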
Example 1 Consider the Fredholm integral equation of the first kind
$ \int_{ - \frac{{\rm{ \mathsf{ π} }}}{2}}^{\frac{{\rm{ \mathsf{ π} }}}{2}} {k\left( {\omega ,\rho } \right)x\left( \rho \right){\rm{d}}\rho } = g\left( \omega \right)\;\;\;\;\; - \frac{{\rm{ \mathsf{ π} }}}{2} \le \omega \le \frac{{\rm{ \mathsf{ π} }}}{2} $ |
the MATLAB code Shaw produces a discretization A∈R1 000×1 000 and the right-hand side
Fig. 1 illustrates that the approximate solution obtained by the APFT method can approximate the exact solution well, which means that APFT regularization method is effective.
Example 2 The Fredholm integral equation of the first kind is
$ \int_a^b {k\left( {s,t} \right)f\left( t \right){\rm{d}}t} = g\left( s \right)\;\;\;\;\;c \le s \le d $ |
and the MATLAB code discretizes the test problems Baart, Shaw, Phillips, Gravity, Foxgood and Deriv2 by a Galerkin method with orthonormal box functions, with matrix order n=1 000. The noise level λ is defined by
$ \lambda = \frac{{\left\| \mathit{\boldsymbol{e}} \right\|}}{{\left\| {\mathit{\boldsymbol{\hat b}}} \right\|}} $ |
The regularization parameter μ is determined by the discrepancy principle.The tables report relative errors
Tables 1 and 2 show the qualities of the AT, AFT and APFT solutions for the various examples (n=1 000). The results show that APFT usually produces solutions of higher quality; in other words, APFT is superior to AFT and AT.
Example 3 We show the performance of the method on the restoration of a discrete image contaminated by blur and noise. Our task is to deblur two-dimensional images degraded by additive noise and spatially invariant blur. The restoration problems were proposed by the US Air Force Phillips Laboratory. The two-dimensional image restoration problem can be modeled by a linear system of equations Ax=b. The matrix A is a discrete blurring operator referred to as a discrete point spread function. Then the components of the vectors b and
Fig. 2 displays the noise- and blur-free image, the contaminated image, and the restored images of Lena determined by the AFT and APFT methods. These images illustrate that APFT gives better reconstructions than AFT.
Fig. 3 displays the noise- and blur-free image, the contaminated image, and the restored images of "MATH" determined by the AT and APFT methods. The approximate solutions obtained by the APFT method are nearly optimal for this example; the computed solutions are close to the orthogonal projection of the exact solution into the range-restricted subspace. The AT method, however, produces approximate solutions of lower quality than the APFT method.
5 Conclusions
In this paper, we propose the APFT regularization method for solving large-scale linear discrete ill-posed problems. Our method is easy to implement, and the numerical examples show that it is effective and gives more accurate approximations than the AT and AFT methods.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 11571171 and 61473148).
[1] HANSEN P C. Regularization tools: A Matlab package for analysis and solution of discrete ill-posed problems[J]. Numerical Algorithms, 1994, 6(1): 1-35. DOI:10.1007/BF02149761
[2] FUHRY M, REICHEL L. A new Tikhonov regularization method[J]. Numerical Algorithms, 2012, 59(3): 433-445. DOI:10.1007/s11075-011-9498-x
[3] HANKE M, HANSEN P C. Regularization methods for large-scale problems[J]. Surveys on Mathematics for Industry, 1993, 3(4): 253-315.
[4] CALVETTI D, REICHEL L, SHUIBI A. Invertible smoothing preconditioners for linear discrete ill-posed problems[J]. Applied Numerical Mathematics, 2005, 54(2): 135-149. DOI:10.1016/j.apnum.2004.09.027
[5] HANSEN P C, JENSEN T K. Smoothing-norm preconditioning for regularizing minimum-residual methods[J]. SIAM Journal on Matrix Analysis and Applications, 2006, 29(1): 1-14.
[6] MILLER K. Least squares methods for ill-posed problems with a prescribed bound[J]. SIAM Journal on Mathematical Analysis, 1970, 1(1): 52-74.
[7] MORIGI S, REICHEL L, SGALLARI F. Orthogonal projection regularization operators[J]. Numerical Algorithms, 2007, 44(2): 99-114. DOI:10.1007/s11075-007-9080-8
[8] HOCHSTENBACH M E, REICHEL L. Fractional Tikhonov regularization for linear discrete ill-posed problems[J]. BIT Numerical Mathematics, 2011, 51(1): 197-215. DOI:10.1007/s10543-011-0313-9
[9] HOCHSTENBACH M E, NOSCHESE S, REICHEL L. Fractional regularization matrices for linear discrete ill-posed problems[J]. Journal of Engineering Mathematics, 2015, 93(1): 1-17. DOI:10.1007/s10665-014-9780-8
[10] REICHEL L, RODRIGUEZ G. Old and new parameter choice rules for discrete ill-posed problems[J]. Numerical Algorithms, 2013, 63(1): 65-87. DOI:10.1007/s11075-012-9612-8
[11] LEWIS B, REICHEL L. Arnoldi-Tikhonov regularization methods[J]. Journal of Computational and Applied Mathematics, 2009, 226(1): 92-102. DOI:10.1016/j.cam.2008.05.003
[12] SIADAT M, AGHAZADE H, OKTEM N. Reordering for improving global Arnoldi-Tikhonov method in image restoration problems[J]. Signal Image and Video Processing, 2017(2): 1-8.
[13] LIN Y, BAO L, CAO Y. Augmented Arnoldi-Tikhonov regularization methods for solving large-scale linear ill-posed systems[J]. Mathematical Problems in Engineering, 2013, 5: 87-118.
[14] KLANN E, RAMLAU R. Regularization by fractional filter methods and data smoothing[J]. Inverse Problems, 2008, 24(2): 025018. DOI:10.1088/0266-5611/24/2/025018
[15] ENGL H W. Discrepancy principles for Tikhonov regularization of ill-posed problems leading to optimal convergence rates[J]. Journal of Optimization Theory and Applications, 1987, 52(2): 209-215. DOI:10.1007/BF00941281
[16] HANSEN P C. Regularization tools version 4.0 for Matlab 7.3[J]. Numerical Algorithms, 2007, 46(2): 189-194. DOI:10.1007/s11075-007-9136-9