Article

Error Estimations for Total Variation Type Regularization

Kuan Li, Chun Huang and Ziyang Yuan
1 School of Cyberspace Security, Dongguan University of Technology, Dongguan 523808, China
2 Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen 518060, China
3 College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China
4 Department of Mathematics, National University of Defense Technology, Changsha 410073, China
* Author to whom correspondence should be addressed.
Submission received: 1 June 2021 / Revised: 8 June 2021 / Accepted: 11 June 2021 / Published: 13 June 2021

Abstract

This paper provides several error estimations for total variation (TV) type regularization, which arises in a range of areas, for instance, signal and image processing and machine learning. Some basic properties of the minimizer of the TV regularization problem, such as stability, consistency and convergence rate, are investigated. Both a priori and a posteriori parameter choice rules are considered. Furthermore, an improved convergence rate is derived under a sparsity assumption. The non-sparse case, which is common in practice, is also discussed, and the corresponding convergence rate results are presented under certain mild conditions.

1. Introduction

Compressed sensing [1,2] has gained increasing attention in recent years; it plays an important role in signal processing [3,4], imaging science [5,6] and machine learning [7]. Compressed sensing focuses on signals with a sparse representation. Let $H_1$ be a Hilbert space and $\{e_i \in H_1 \mid i \in \mathbb{N}\}$ an orthonormal basis of $H_1$. For any $x \in H_1$, let $x_i := \langle x, e_i \rangle$. If the operator $K$ satisfies certain conditions, it is possible to recover a sparse signal $x \in \mathbb{C}^n$ of length $n$ by Basis Pursuit (BP) [8], i.e.,
$$\min \|x\|_1 \quad \text{s.t.} \quad y = Kx,$$
from the samples $y = Kx$, even when $K$ is ill-posed [2,9,10]. However, in most cases noise is inevitable, and the literature has turned to the noisy BP model
$$\min \|x\|_1 \quad \text{s.t.} \quad \|Kx - y\|_2 \le \delta,$$
where $\delta$ is the allowed error level. In fact, the unconstrained form of the noisy BP model, i.e., sparse regularization, which is the focus of [11,12,13,14,15,16], is even more attractive. While the success of compressed sensing greatly inspired the development of sparse regularization, it is interesting to note that sparse regularization appeared much earlier than compressed sensing [11,12]. As an inverse problem, the error theory of sparse regularization is well studied in the literature [17,18,19].
In practice, many signals are not sparse unless they are transformed by some (possibly ill-posed) operator. Thus, many studies have been devoted to the analysis of such regularized optimization problems [20]. A typical example is a signal with a sparse gradient, which arises frequently in imaging (natural images are usually piecewise constant, i.e., they have a sparse gradient). Total Variation (TV) has been used extensively in imaging science for decades, and a series of techniques have been dedicated to the choice of its regularization parameter [21,22,23,24,25,26,27,28,29,30,31]; other methods [32,33] are developed based on this observation. Similar to [34], Total Variation can also smooth the signal of interest. Let $H_2$ be another Hilbert space. For any $x \in H_1$, define $T : H_1 \to H_1$ by
$$(Tx)_i := x_i - x_{i+1}.$$
Under the above definition, $T$ is an ill-posed linear operator. Given a linear map $K : H_1 \to H_2$ and $y^\delta \in H_2$, the total variation regularization problem can be written as
$$\Psi_\alpha(x) = \frac{1}{2}\|Kx - y^\delta\|_2^2 + \alpha \sum_i |(Tx)_i|,$$
where $\alpha > 0$ is the regularization parameter. The regularization term $\sum_i |(Tx)_i|$ is exactly the total variation (TV) of $x$. TV type regularization has a form similar to sparse regularization. However, the perfect reconstruction results established for sparse regularization cannot be applied to the TV type directly, especially when $T$ is ill-posed ($T$ has a nontrivial null space).
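As a concrete illustration of the problem $\Psi_\alpha$, the following is a minimal numerical sketch in finite dimensions; the forward operator $K$, the piecewise-constant signal, the noise level and the use of the convex solver cvxpy are illustrative assumptions, not part of the analysis in this paper.

```python
# A minimal finite-dimensional sketch of the TV-regularized problem Psi_alpha.
# K, x_true, delta and alpha are illustrative assumptions; cvxpy is assumed.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
N, M = 100, 60
K = rng.standard_normal((M, N)) / np.sqrt(M)        # underdetermined forward map
x_true = np.zeros(N)
x_true[20:55], x_true[55:80] = 1.0, -0.5            # piecewise constant => sparse gradient
T = np.eye(N)[:-1] - np.eye(N)[1:]                  # (Tx)_i = x_i - x_{i+1}

delta = 1e-2
noise = rng.standard_normal(M)
y_delta = K @ x_true + delta * noise / np.linalg.norm(noise)   # ||y - y_delta||_2 = delta

alpha = 1e-2
x = cp.Variable(N)
prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(K @ x - y_delta)
                              + alpha * cp.norm1(T @ x)))
prob.solve()
print("relative error:", np.linalg.norm(x.value - x_true) / np.linalg.norm(x_true))
```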
In this paper, we first discuss the stability and consistency of the minimizers of $\Psi_\alpha$. Besides these basic properties, we are also interested in the convergence rate for the TV problem. Under the source conditions [19,35,36], convergence rates are obtained for both a priori and a posteriori parameter choice rules. However, the linear convergence rate requires $K$ to be injective, which is usually a strict requirement. In the latter part, a linear convergence rate is also derived under a sparsity assumption on the gradient of the true signal together with suitable conditions on $K$; this derivation does not depend on the injectivity of $K$. Meanwhile, this paper also considers the case where that sparsity assumption fails. Finally, based on some recent works [37,38,39] that also allow a non-sparse gradient, a convergence rate is given in this case as well.
The rest of this paper is organized as follows. Section 2 provides a brief summary of the notations. Section 3 presents some basic properties and gives the convergence rate of the minimizer. Section 4 proves the improved convergence rate. Finally, Section 5 concludes the whole paper.

2. Notation

The notations described in this section are adopted throughout this paper. Let $H_1, H_2$ be two Hilbert spaces and $\{e_i \in H_1 \mid i \in \mathbb{N}\}$, $\{\xi_i \in H_2 \mid i \in \mathbb{N}\}$ be orthonormal bases of $H_1$ and $H_2$, respectively. For any $x \in H_1$ and $y \in H_2$, $x_i := \langle x, e_i \rangle$ and $y_i := \langle y, \xi_i \rangle$. The $\ell^1$ and $\ell^2$ norms of $x$ and $y$ are denoted by $\|x\|_1 := \sum_i |x_i|$, $\|x\|_2 := (\sum_i |x_i|^2)^{1/2}$ and $\|y\|_1 := \sum_i |y_i|$, $\|y\|_2 := (\sum_i |y_i|^2)^{1/2}$, respectively. In this paper, if not specified otherwise, for any $x \in H_1$ and $y \in H_2$ we assume that $x, y \in \ell^2$, i.e., $\|x\|_2 < +\infty$ and $\|y\|_2 < +\infty$. We write $x^n \rightharpoonup x$ if $x^n$ converges weakly to $x$, and $x^n \to x$ if $x^n$ converges strongly to $x$. The operator norm of a linear operator $K : H_1 \to H_2$ is defined as
$$\|K\| := \max_{\|x\|_2 = 1} \|Kx\|_2.$$
Throughout the paper, $x^\dagger$ denotes the signal of interest and $y := Kx^\dagger$ the measurements; $y^\delta$ denotes an element of $H_2$ satisfying $\|y - y^\delta\|_2 \le \delta$. Under these notations, the TV regularization can be expressed as
$$\Psi_\alpha(x) = \frac{1}{2}\|Kx - y^\delta\|_2^2 + \alpha \|Tx\|_1.$$
Denote by $x_\alpha^\delta$ one of the minimizers of $\Psi_\alpha$.
Remark 1.
Consider the set $L = \{x^n\}_{n=1,2,\dots} \subset H_1$, where
$$x_i^n := \begin{cases} 1/\sqrt{n} & \text{if } i \le n, \\ 0 & \text{if } i > n. \end{cases}$$
Obviously, for any $n$, $x^n \in \ell^2$ and $\|T(x^n)\|_2 = 1/\sqrt{n}$. As $n \to +\infty$, $\|x^n\|_2 = 1$ while $\|T(x^n)\|_2 \to 0$. That means $T$ is ill-posed.
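The following is a quick numerical sketch of this remark, with the infinite sequence truncated to a long finite vector (an illustrative simplification):

```python
# Sketch of Remark 1: ||x^n||_2 = 1 while ||T x^n||_2 = 1/sqrt(n) -> 0,
# so no constant c > 0 can satisfy ||Tx||_2 >= c ||x||_2 for all x.
import numpy as np

def x_n(n, length=10_000):          # truncated stand-in for the l^2 sequence
    x = np.zeros(length)
    x[:n] = 1.0 / np.sqrt(n)
    return x

T = lambda v: v[:-1] - v[1:]        # (Tv)_i = v_i - v_{i+1}
for n in (10, 100, 1000):
    xn = x_n(n)
    print(n, np.linalg.norm(xn), np.linalg.norm(T(xn)))   # 1.0 and 1/sqrt(n)
```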
Remark 2.
Let $D = T - \mathrm{Id}$, where $\mathrm{Id}$ is the identity operator on $H_1$. Then $(Dx)_i = -x_{i+1}$ for any $i \in \mathbb{N}$. It is easy to verify that $D$ is continuous. Hence $T$ is continuous on $H_1$ and
$$\|T - \mathrm{Id}\| = \|D\| = 1.$$
In practice, the ill-posedness of $T$ complicates the analysis. To overcome this problem, we consider a condition that plays an important role in the subsequent derivations.
Condition 1.
There exist two constants $c, m > 0$ such that
$$c\|Kx\|_2 + m\|Tx\|_2 \ge \|x\|_2$$
for any $x \in H_1$.
We present a finite-dimensional interpretation of this condition. Let $\dim(H_1) = N$ and $\dim(H_2) = M$. Then $K \in \mathbb{R}^{M \times N}$ may satisfy $\mathrm{null}(K) \ne \{0\}$, and $T \in \mathbb{R}^{(N-1) \times N}$. In the finite-dimensional case, $T$ has the form
$$T = \begin{pmatrix} 1 & -1 & & & \\ & 1 & -1 & & \\ & & \ddots & \ddots & \\ & & & 1 & -1 \end{pmatrix} \in \mathbb{R}^{(N-1)\times N}.$$
The definition of $T$ gives $\mathrm{null}(T) = \mathrm{span}(\mathbf{1})$. If $K\mathbf{1} \ne 0$, then $\mathrm{null}(K) \cap \mathrm{null}(T) = \{0\}$, so that $\mathrm{null}\begin{pmatrix}\hat{c}K \\ \hat{m}T\end{pmatrix} = \{0\}$ for any $\hat{c}, \hat{m} > 0$. Hence, there exists $\iota > 0$ such that $\iota \left\| \begin{pmatrix}\hat{c}K \\ \hat{m}T\end{pmatrix} x \right\|_2 \ge \|x\|_2$ for any $x \in \mathbb{R}^N$. Noting that $\left\| \begin{pmatrix}\hat{c}K \\ \hat{m}T\end{pmatrix} x \right\|_2 \le \hat{c}\|Kx\|_2 + \hat{m}\|Tx\|_2$, we then have $\iota\hat{c}\|Kx\|_2 + \iota\hat{m}\|Tx\|_2 \ge \|x\|_2$, i.e., Condition 1 holds with $c = \iota\hat{c}$ and $m = \iota\hat{m}$.
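The argument above can be checked numerically; the sketch below uses an illustrative random $K$ and estimates an admissible pair $(c, m)$ from the smallest singular value of the stacked matrix.

```python
# Sketch of the finite-dimensional discussion of Condition 1: build T, confirm
# null(T) = span(1), and estimate constants c, m from the stacked matrix.
# K and the problem sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
N, M = 50, 20
K = rng.standard_normal((M, N))
T = np.eye(N)[:-1] - np.eye(N)[1:]             # (N-1) x N difference matrix

ones = np.ones(N)
print(np.allclose(T @ ones, 0.0))              # True: the constant vector spans null(T)
print(np.linalg.norm(K @ ones) > 1e-8)         # True (generically): K 1 != 0

c_hat, m_hat = 1.0, 1.0
A = np.vstack([c_hat * K, m_hat * T])
iota = 1.0 / np.linalg.svd(A, compute_uv=False)[-1]   # iota * ||A x|| >= ||x|| for all x

x = rng.standard_normal(N)
lhs = iota * c_hat * np.linalg.norm(K @ x) + iota * m_hat * np.linalg.norm(T @ x)
print(lhs >= np.linalg.norm(x))                # Condition 1 with c = iota*c_hat, m = iota*m_hat
```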

3. Basic Error Estimations

The properties of TV type regularization are investigated in this section. First, we introduce a lemma that will be used frequently throughout the section.
Lemma 1.
Let $y^\delta$ be bounded, $\alpha$ be fixed and $\{x^n\}_{n=1,2,\dots}$ be a sequence. Assume that Condition 1 holds and $\{\Psi_\alpha(x^n)\}_{n=1,2,\dots}$ is bounded. Then $\{x^n\}_{n=1,2,\dots}$ is also bounded.
Proof. 
It is trivial to show that $\{\|Kx^n - y^\delta\|_2\}_{n=1,2,\dots}$ and $\{\|Tx^n\|_1\}_{n=1,2,\dots}$ are bounded. Note that
$$\|Tx^n\|_2 \le \|Tx^n\|_1 \quad \text{and} \quad \|Kx^n\|_2 \le \|Kx^n - y^\delta\|_2 + \|y^\delta\|_2,$$
which implies that $\{\|Kx^n\|_2\}_{n=1,2,\dots}$ and $\{\|Tx^n\|_2\}_{n=1,2,\dots}$ are bounded. From Condition 1, we derive that
$$\|x^n\|_2 \le c\|Kx^n\|_2 + m\|Tx^n\|_2,$$
which implies the boundedness of $\{x^n\}_{n=1,2,\dots}$. □

3.1. Stability

In this subsection, we investigate the behavior of $x_{\hat{\alpha}}^\delta$ as $\hat{\alpha} \to \alpha$, with $y^\delta$ fixed. A lemma from convex optimization is introduced first.
Lemma 2
([40,41]). Let $\chi^*$ be the solution set of the convex minimization problem
$$\min_x \Psi_\alpha(x).$$
Then, $Kx$ and $\|Tx\|_1$ are constant over $\chi^*$.
Theorem 1.
Assume that $K, T$ satisfy Condition 1. For any fixed $\alpha > 0$ and $y^\delta \in H_2$, we have
$$\lim_{\alpha_n \to \alpha} Kx_{\alpha_n}^\delta = Kx_\alpha^\delta.$$
Proof. 
The minimizing property of $x_{\alpha_n}^\delta$ gives that $\frac{1}{2}\|Kx_{\alpha_n}^\delta - y^\delta\|_2^2 + \alpha_n\|Tx_{\alpha_n}^\delta\|_1 \le \Psi_{\alpha_n}(0)$. Then, Lemma 1 indicates that there exists a subsequence of $\{x_{\alpha_n}^\delta\}$ converging weakly to some $x^* \in \ell^2$. For simplicity, we also denote this subsequence by $\{x_{\alpha_n}^\delta\}$. By the weak lower semi-continuity of the norms, we have
$$\|Kx^* - y^\delta\|_2 \le \liminf_n \|Kx_{\alpha_n}^\delta - y^\delta\|_2 \quad \text{and} \quad \|Tx^*\|_1 \le \liminf_n \|Tx_{\alpha_n}^\delta\|_1.$$
Therefore, we have that
$$\Psi_\alpha(x^*) = \frac{1}{2}\|Kx^* - y^\delta\|_2^2 + \alpha\|Tx^*\|_1 \le \liminf_n \Big\{ \frac{1}{2}\|Kx_{\alpha_n}^\delta - y^\delta\|_2^2 + \alpha_n\|Tx_{\alpha_n}^\delta\|_1 \Big\} = \liminf_n \Psi_{\alpha_n}(x_{\alpha_n}^\delta).$$
On the other hand, by the minimizing property of $x_{\alpha_n}^\delta$,
$$\limsup_n \Psi_{\alpha_n}(x_{\alpha_n}^\delta) \le \limsup_n \Psi_{\alpha_n}(x_\alpha^\delta) = \lim_n \Psi_{\alpha_n}(x_\alpha^\delta) = \Psi_\alpha(x_\alpha^\delta).$$
Obviously, it holds that
$$\limsup_n \Psi_{\alpha_n}(x_{\alpha_n}^\delta) \le \Psi_\alpha(x_\alpha^\delta) \le \Psi_\alpha(x^*) \le \liminf_n \Psi_{\alpha_n}(x_{\alpha_n}^\delta).$$
This means that $x^*$ minimizes $\Psi_\alpha(x)$. From Lemma 2, $Kx^* = Kx_\alpha^\delta$ and $\|Tx^*\|_1 = \|Tx_\alpha^\delta\|_1$. Consequently, we have $Kx_{\alpha_n}^\delta \rightharpoonup Kx_\alpha^\delta$, $\Psi_{\alpha_n}(x_{\alpha_n}^\delta) \to \Psi_\alpha(x_\alpha^\delta)$ and $\|Tx_\alpha^\delta\|_1 \le \liminf_n \|Tx_{\alpha_n}^\delta\|_1$. In the following, we argue by contradiction. Assume that $t := \limsup_n \|Kx_{\alpha_n}^\delta - y^\delta\|_2^2 > \|Kx_\alpha^\delta - y^\delta\|_2^2$. We can obtain that
$$\alpha\|Tx_\alpha^\delta\|_1 \le \liminf_n \big\{ \alpha_n\|Tx_{\alpha_n}^\delta\|_1 \big\} = \liminf_n \Big\{ \Psi_{\alpha_n}(x_{\alpha_n}^\delta) - \frac{1}{2}\|Kx_{\alpha_n}^\delta - y^\delta\|_2^2 \Big\} = \Psi_\alpha(x_\alpha^\delta) - \frac{1}{2}t = \alpha\|Tx_\alpha^\delta\|_1 + \frac{1}{2}\big( \|Kx_\alpha^\delta - y^\delta\|_2^2 - t \big) < \alpha\|Tx_\alpha^\delta\|_1.$$
This is a contradiction. Then, we have
$$\limsup_n \|Kx_{\alpha_n}^\delta - y^\delta\|_2 \le \|Kx_\alpha^\delta - y^\delta\|_2.$$
Combining this with the weak convergence $Kx_{\alpha_n}^\delta \rightharpoonup Kx_\alpha^\delta$ and the weak lower semi-continuity relation above, we obtain $Kx_{\alpha_n}^\delta \to Kx_\alpha^\delta$. □
If $K$ is injective, we further have $\lim_{\alpha_n \to \alpha} x_{\alpha_n}^\delta = x_\alpha^\delta$. The theorem above indicates that $\Psi_\alpha(x_\alpha^\delta)$ and $\|Tx_\alpha^\delta\|_1$ are continuous in $\alpha$. In fact, we can obtain a stronger result: the value function is differentiable with respect to $\alpha$.
Theorem 2.
Let $F(\alpha) := \Psi_\alpha(x_\alpha^\delta)$; then $F(\alpha)$ is differentiable with respect to $\alpha$, and $F'(\alpha) = \|Tx_\alpha^\delta\|_1$.
Proof. 
For $\alpha > \hat{\alpha}$, we have
$$F(\alpha) - F(\hat{\alpha}) = \frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|_2^2 + \alpha\|Tx_\alpha^\delta\|_1 - \frac{1}{2}\|Kx_{\hat{\alpha}}^\delta - y^\delta\|_2^2 - \alpha\|Tx_{\hat{\alpha}}^\delta\|_1 + (\alpha - \hat{\alpha})\|Tx_{\hat{\alpha}}^\delta\|_1.$$
Since $x_\alpha^\delta$ minimizes $\Psi_\alpha$, we have
$$\frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|_2^2 + \alpha\|Tx_\alpha^\delta\|_1 - \frac{1}{2}\|Kx_{\hat{\alpha}}^\delta - y^\delta\|_2^2 - \alpha\|Tx_{\hat{\alpha}}^\delta\|_1 \le 0.$$
It follows that $F(\alpha) - F(\hat{\alpha}) \le (\alpha - \hat{\alpha})\|Tx_{\hat{\alpha}}^\delta\|_1$. On the other hand, $F(\alpha) - F(\hat{\alpha})$ can be written as
$$F(\alpha) - F(\hat{\alpha}) = \frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|_2^2 + \hat{\alpha}\|Tx_\alpha^\delta\|_1 - \frac{1}{2}\|Kx_{\hat{\alpha}}^\delta - y^\delta\|_2^2 - \hat{\alpha}\|Tx_{\hat{\alpha}}^\delta\|_1 + (\alpha - \hat{\alpha})\|Tx_\alpha^\delta\|_1.$$
Similarly, since $x_{\hat{\alpha}}^\delta$ minimizes $\Psi_{\hat{\alpha}}$, we have $F(\alpha) - F(\hat{\alpha}) \ge (\alpha - \hat{\alpha})\|Tx_\alpha^\delta\|_1$. Combining the two inequalities above, we have
$$\|Tx_\alpha^\delta\|_1 \le \frac{F(\alpha) - F(\hat{\alpha})}{\alpha - \hat{\alpha}} \le \|Tx_{\hat{\alpha}}^\delta\|_1.$$
When $\alpha < \hat{\alpha}$, similar results can be obtained. The continuity of $\|Tx_\alpha^\delta\|_1$ in $\alpha$ gives that $\frac{\mathrm{d}F(\alpha)}{\mathrm{d}\alpha} = \|Tx_\alpha^\delta\|_1$. □
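Theorem 2 can be checked numerically by comparing a finite difference of the value function with $\|Tx_\alpha^\delta\|_1$; the data below are illustrative, and the availability of cvxpy is assumed.

```python
# Finite-difference sanity check of Theorem 2 (a sketch):
# (F(alpha+h) - F(alpha)) / h should be close to ||T x_alpha^delta||_1.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(2)
N, M = 60, 40
K = rng.standard_normal((M, N)) / np.sqrt(M)
T = np.eye(N)[:-1] - np.eye(N)[1:]
x_true = np.repeat([0.0, 1.0, -0.5], [20, 20, 20])
y_delta = K @ x_true + 1e-2 * rng.standard_normal(M)

def solve_tv(alpha):
    x = cp.Variable(N)
    prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(K @ x - y_delta)
                                  + alpha * cp.norm1(T @ x)))
    prob.solve()
    return x.value, prob.value          # minimizer and F(alpha)

alpha, h = 0.05, 1e-3
x_a, F_a = solve_tv(alpha)
_, F_ah = solve_tv(alpha + h)
print((F_ah - F_a) / h, np.linalg.norm(T @ x_a, 1))   # the two values should roughly agree
```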

3.2. Consistency

The behavior of $x_\alpha^\delta$ is investigated under an a priori parameter choice rule as $\delta \to 0$. In the analysis, we assume that the following condition holds.
Condition 2.
For any $x \in H_1$ obeying $Kx = y$, $x^\dagger$ satisfies
$$\|Tx^\dagger\|_1 \le \|Tx\|_1.$$
The equality holds if and only if $x = x^\dagger$.
Lemma 3.
Let $\{x^n\} \rightharpoonup x^*$, $\|Kx^n - y\|_2^2 \to \|Kx^* - y\|_2^2$ and $\|Tx^n\|_1 \to \|Tx^*\|_1$. Then, we have $\|T(x^n - x^*)\|_1 \to 0$ and $\|K(x^n - x^*)\|_2 \to 0$.
Proof. 
We can obtain that
$$\limsup_n \|T(x^n - x^*)\|_1 = \limsup_n \Big( (\|Tx^n\|_1 + \|Tx^*\|_1) - \big( (\|Tx^n\|_1 + \|Tx^*\|_1) - \|T(x^n - x^*)\|_1 \big) \Big) = 2\|Tx^*\|_1 - \liminf_n \Big( (\|Tx^n\|_1 + \|Tx^*\|_1) - \|T(x^n - x^*)\|_1 \Big).$$
The triangle inequality gives that $(\|Tx^n\|_1 + \|Tx^*\|_1) - \|T(x^n - x^*)\|_1 \ge 0$. Fatou's lemma gives that
$$\liminf_n \Big( (\|Tx^n\|_1 + \|Tx^*\|_1) - \|T(x^n - x^*)\|_1 \Big) = \liminf_n \sum_i \Big( |(Tx^n)_i| + |(Tx^*)_i| - |(T(x^n - x^*))_i| \Big) \ge \sum_i \liminf_n \Big( |(Tx^n)_i| + |(Tx^*)_i| - |(T(x^n - x^*))_i| \Big).$$
Note that $x^n - x^* \rightharpoonup 0$; then $T(x^n - x^*) \rightharpoonup T0 = 0$. Hence $[T(x^n - x^*)]_i \to 0$ for each $i$. Similarly, we can obtain $(Tx^n)_i \to (Tx^*)_i$. Therefore,
$$\sum_i \liminf_n \Big( |(Tx^n)_i| + |(Tx^*)_i| - |(T(x^n - x^*))_i| \Big) = \sum_i 2|(Tx^*)_i| = 2\|Tx^*\|_1.$$
Thus, we have
$$\limsup_n \|T(x^n - x^*)\|_1 = 0.$$
By the same method, we can also obtain that $\|K(x^n - x^*)\|_2 \to 0$. □
Theorem 3.
Assume that $K, T$ satisfy Conditions 1 and 2. Let the parameters satisfy
$$\alpha(\delta) \to 0, \quad \frac{\delta^2}{\alpha(\delta)} \to 0 \quad \text{as } \delta \to 0.$$
Then the sequence $\{x_\alpha^\delta\}_\delta$ converges to $x^\dagger$.
Proof. 
By the definition of $x_\alpha^\delta$, we have
$$\frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|_2^2 + \alpha\|Tx_\alpha^\delta\|_1 \le \frac{1}{2}\|Kx^\dagger - y^\delta\|_2^2 + \alpha\|Tx^\dagger\|_1 \le \frac{1}{2}\delta^2 + \alpha\|Tx^\dagger\|_1.$$
From the parameter choice rule for $\alpha$ and $\delta$, we see that $\{\Psi_\alpha(x_\alpha^\delta)\}$ is bounded. Then, from Lemma 1, there exist a subsequence, also denoted by $\{x_\alpha^\delta\}_\delta$, and some point $x^*$ such that $x_\alpha^\delta \rightharpoonup x^*$. We then have
$$\|Kx^* - y\|_2^2 \le \liminf_\delta \|Kx_\alpha^\delta - y\|_2^2 \le 2\liminf_\delta \Big( \|Kx_\alpha^\delta - y^\delta\|_2^2 + \|y^\delta - y\|_2^2 \Big) \le 2\liminf_\delta \Big( \delta^2 + 2\alpha(\delta)\|Tx^\dagger\|_1 + \delta^2 \Big) = 0.$$
This means $Kx^* = y$. It is also easy to see that $\lim_\delta \|Kx_\alpha^\delta - y\|_2^2 = 0$. On the other hand, we obtain that
$$\|Tx^*\|_1 \le \liminf_\delta \|Tx_\alpha^\delta\|_1 \le \liminf_\delta \Big\{ \|Tx^\dagger\|_1 + \frac{\delta^2}{2\alpha(\delta)} \Big\} = \|Tx^\dagger\|_1.$$
Condition 2 gives that $x^* = x^\dagger$. From the inequality above, we see that $\lim_\delta \|Tx_\alpha^\delta\|_1 = \|Tx^\dagger\|_1$. By Lemma 3, we have $\|T(x_\alpha^\delta - x^*)\|_1 \to 0$ and $\|K(x_\alpha^\delta - x^*)\|_2 \to 0$. Consequently, from Condition 1, it holds that
$$\lim_\delta \|x_\alpha^\delta - x^*\|_2 \le \lim_\delta \Big( m\|T(x_\alpha^\delta - x^*)\|_2 + c\|K(x_\alpha^\delta - x^*)\|_2 \Big) \le \lim_\delta \Big( m\|T(x_\alpha^\delta - x^*)\|_1 + c\|K(x_\alpha^\delta - x^*)\|_2 \Big) = 0.$$
 □

3.3. Convergence Rate

This subsection concerns the convergence rate under different parameter choice rules (a priori and a posteriori). First, we discuss the a priori one. Like the classical Tikhonov regularization method [19,35,36], we introduce a source condition.
Condition 3.
Let $x^\dagger$ satisfy the source condition: there exists $w \in H_2$ such that
$$K^*w \in T^*\partial\|\cdot\|_1(Tx^\dagger),$$
i.e., $K^*w = T^*v$ for some $v \in \partial\|\cdot\|_1(Tx^\dagger)$.
Theorem 4.
If $x^\dagger$ satisfies the source condition, it holds that
$$\|Kx_\alpha^\delta - y^\delta\|_2 \le 2\alpha\|w\|_2 + \delta.$$
If $K$ is injective, there exists $\gamma > 0$ such that
$$\|x_\alpha^\delta - x^\dagger\|_2 \le 2\gamma\alpha\|w\|_2 + 2\gamma\delta.$$
Proof. 
The definition of $x_\alpha^\delta$ gives that
$$\frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|_2^2 + \alpha\|Tx_\alpha^\delta\|_1 \le \frac{1}{2}\|Kx^\dagger - y^\delta\|_2^2 + \alpha\|Tx^\dagger\|_1.$$
Using the notation $C(x) = \|Tx\|_1$, we obtain that
$$\frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|_2^2 + \alpha C(x_\alpha^\delta) \le \frac{1}{2}\|Kx^\dagger - y^\delta\|_2^2 + \alpha C(x^\dagger).$$
For any $v \in \partial C(x^\dagger)$, the convexity of $C$ indicates $C(x_\alpha^\delta) \ge C(x^\dagger) + \langle v, x_\alpha^\delta - x^\dagger \rangle$. Then, we have that
$$\frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|_2^2 + \alpha C(x^\dagger) + \alpha\langle v, x_\alpha^\delta - x^\dagger \rangle \le \frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|_2^2 + \alpha C(x_\alpha^\delta) \le \frac{1}{2}\|Kx^\dagger - y^\delta\|_2^2 + \alpha C(x^\dagger).$$
Choose $v = K^*w$ according to the source condition; after simplification, we derive that
$$\frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|_2^2 + \alpha\langle w, Kx_\alpha^\delta - y^\delta \rangle \le \frac{1}{2}\|Kx^\dagger - y^\delta\|_2^2 + \alpha\langle w, Kx^\dagger - y^\delta \rangle.$$
Adding $\frac{\alpha^2\|w\|_2^2}{2}$ to both sides, we obtain that
$$\|Kx_\alpha^\delta - y^\delta + \alpha w\|_2 \le \|Kx^\dagger - y^\delta + \alpha w\|_2.$$
This means
$$\|Kx_\alpha^\delta - y^\delta\|_2 \le 2\alpha\|w\|_2 + \|Kx^\dagger - y^\delta\|_2 \le 2\alpha\|w\|_2 + \delta.$$
If $K$ is injective, there exists $\gamma > 0$ such that $\|x\|_2 \le \gamma\|Kx\|_2$. Then, we derive that
$$\|x_\alpha^\delta - x^\dagger\|_2 \le \gamma\|Kx_\alpha^\delta - y\|_2 \le \gamma\big( \|Kx_\alpha^\delta - y^\delta\|_2 + \|y^\delta - y\|_2 \big) \le \gamma\big( 2\alpha\|w\|_2 + 2\delta \big).$$
 □
Remark 3.
In fact, the first result in Theorem 4 has been proved in [42] for general convex regularization. The proof here is included for completeness.
The following part investigates the a posteriori parameter choice rule. The analysis is motivated by the work in [43,44]. For simplicity of presentation, the parameter $\alpha$ is chosen such that
$$\|Kx_\alpha^\delta - y^\delta\|_2 = \delta. \qquad (5)$$
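In practice, rule (5) can be implemented by a one-dimensional search over $\alpha$, since the residual is non-decreasing in $\alpha$. The following is a minimal bisection sketch; the data, the bracketing interval and the use of cvxpy are illustrative assumptions.

```python
# Sketch of the a posteriori rule (5): log-scale bisection on alpha until the
# residual ||K x_alpha^delta - y^delta||_2 matches delta.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(5)
N, M = 60, 40
K = rng.standard_normal((M, N)) / np.sqrt(M)
T = np.eye(N)[:-1] - np.eye(N)[1:]
x_dag = np.repeat([0.0, 1.0, -0.5], [20, 20, 20])
delta = 1e-2
noise = rng.standard_normal(M)
y_delta = K @ x_dag + delta * noise / np.linalg.norm(noise)

def residual(alpha):
    x = cp.Variable(N)
    cp.Problem(cp.Minimize(0.5 * cp.sum_squares(K @ x - y_delta)
                           + alpha * cp.norm1(T @ x))).solve()
    return np.linalg.norm(K @ x.value - y_delta)

lo, hi = 1e-6, 1.0                       # assume residual(lo) < delta < residual(hi)
for _ in range(25):
    mid = np.sqrt(lo * hi)               # bisection on a logarithmic scale
    lo, hi = (mid, hi) if residual(mid) < delta else (lo, mid)
alpha_star = np.sqrt(lo * hi)
print(alpha_star, residual(alpha_star), delta)   # residual(alpha_star) ~ delta
```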
Theorem 5.
Assume that Conditions 1 and 2 hold and that $\alpha$ is chosen according to rule (5). It then holds that
$$\lim_{\delta \to 0} x_\alpha^\delta = x^\dagger.$$
If $K$ is injective, there exists $\theta > 0$ such that
$$\|x_\alpha^\delta - x^\dagger\|_2 \le 2\theta\delta.$$
Proof. 
It is trivial to prove that
$$\frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|_2^2 + \alpha\|Tx_\alpha^\delta\|_1 \le \frac{1}{2}\|Kx^\dagger - y^\delta\|_2^2 + \alpha\|Tx^\dagger\|_1.$$
Lemma 1 indicates that $\{x_\alpha^\delta\}_\delta$ is bounded. Note that $\|Kx_\alpha^\delta - y^\delta\|_2 = \delta$ and $\|Kx^\dagger - y^\delta\|_2 \le \delta$. It then follows that
$$\|Tx_\alpha^\delta\|_1 \le \|Tx^\dagger\|_1. \qquad (6)$$
Then, the sequence has a subsequence, also denoted by $\{x_\alpha^\delta\}_\delta$, converging weakly to some $x^*$. We can easily see that
$$\|Kx^* - y\|_2 \le \liminf_{\delta \to 0} \|Kx_\alpha^\delta - y\|_2 \le \liminf_{\delta \to 0} \big( \|Kx_\alpha^\delta - y^\delta\|_2 + \|y^\delta - y\|_2 \big) \le \liminf_{\delta \to 0} 2\delta = 0.$$
That is to say, $Kx^* = y$. Moreover, it is easy to see that $\lim_\delta \|Kx_\alpha^\delta - y\|_2^2 = 0$. Using relation (6), we have that
$$\|Tx^*\|_1 \le \liminf_{\delta \to 0} \|Tx_\alpha^\delta\|_1 \le \|Tx^\dagger\|_1.$$
Condition 2 gives that $x^* = x^\dagger$; hence, the whole sequence converges weakly to $x^\dagger$ and
$$\|Tx^\dagger\|_1 \le \liminf_{\delta \to 0} \|Tx_\alpha^\delta\|_1 \le \limsup_{\delta \to 0} \|Tx_\alpha^\delta\|_1 \le \|Tx^\dagger\|_1.$$
Thus, we have $\|Tx_\alpha^\delta\|_1 \to \|Tx^\dagger\|_1$. From Lemma 3, we have $\|T(x_\alpha^\delta - x^*)\|_1 \to 0$ and $\|K(x_\alpha^\delta - x^*)\|_2 \to 0$, which leads to
$$\|x_\alpha^\delta - x^*\|_2 \le m\|T(x_\alpha^\delta - x^*)\|_2 + c\|K(x_\alpha^\delta - x^*)\|_2 \le m\|T(x_\alpha^\delta - x^*)\|_1 + c\|K(x_\alpha^\delta - x^*)\|_2 \to 0.$$
If $K$ is injective, there exists $\theta > 0$ such that $\|x\|_2 \le \theta\|Kx\|_2$. Then, we derive that
$$\|x_\alpha^\delta - x^\dagger\|_2 \le \theta\|Kx_\alpha^\delta - y\|_2 \le \theta\big( \|Kx_\alpha^\delta - y^\delta\|_2 + \|y^\delta - y\|_2 \big) \le 2\theta\delta.$$
 □

4. Improved Convergence Rate

In this section, we investigate the convergence rate when $K$ may not be injective. The first part presents the analysis under the sparsity assumption, while the second one deals with the case when the sparsity assumption fails.

4.1. Performance under Sparsity Assumption

The analysis in this subsection assumes that $Tx^\dagger$ is sparse. To prove the convergence rate, we need the finite injectivity property [45].
Condition 4.
The operator K satisfies the uniform finite injectivity property, i.e., for any finite subset $S \subset \mathbb{N}$, the restriction $K|_S$ of $K$ to $\mathrm{span}\{e_i : i \in S\}$ is injective.
Remark 4.
In the finite-dimensional case, if $S$ is small, it is easy to see that the finite injectivity property is essentially the restricted isometry property [2,46].
Let $z := Tx$ and $z^\dagger := Tx^\dagger$. Denote by $S$ the set $S := \{ i \in \mathbb{N} : |v_i| > \frac{1}{2} \}$, where $v \in \partial\|\cdot\|_1(z^\dagger)$ satisfies the source condition. Let $m' := \sup_{i \notin S} |v_i|$. Since $v \in \ell^2$, $S$ is finite, and it contains the support of $z^\dagger$. Let $P$ be the identity projection onto $S$ and $P^\perp$ the one onto $\mathbb{N} \setminus S$. From Condition 4, there exists some $d > 0$ such that
$$d\|KPz\|_2 \ge \|Pz\|_2.$$
Lemma 4.
Assume that $x^\dagger$ satisfies the source condition and Condition 1 holds. If $md\|K\| < 1$, there exist $c_1 > 0$ and $c_2 > 0$ such that
$$\|Tx\|_1 - \|Tx^\dagger\|_1 \ge c_1\|x - x^\dagger\|_2 - c_2\|K(x - x^\dagger)\|_2.$$
Proof. 
Assume that the conditions of Lemma 4 hold. Then, we can obtain that
$$\|z - z^\dagger\|_2 \le \|P(z - z^\dagger)\|_2 + \|P^\perp(z - z^\dagger)\|_2 \le d\|KP(z - z^\dagger)\|_2 + \|P^\perp z\|_2 \le d\|K(z - z^\dagger)\|_2 + (1 + d\|K\|)\|P^\perp z\|_2.$$
Moreover, we derive that
$$\|K(z - z^\dagger)\|_2 = \|KT(x - x^\dagger)\|_2 \le \|K(x - x^\dagger)\|_2 + \|K(T - \mathrm{Id})(x - x^\dagger)\|_2 \le \|K(x - x^\dagger)\|_2 + \|K\| \cdot \|T - \mathrm{Id}\| \cdot \|x - x^\dagger\|_2 \le \|K(x - x^\dagger)\|_2 + \|K\| \cdot \|x - x^\dagger\|_2.$$
We now turn to estimating $\|P^\perp z\|_2$. Recall that $m' = \sup_{i \notin S} |v_i|$; obviously, $m' \le \frac{1}{2}$. We then have that
$$\|P^\perp z\|_2 \le \sum_{i \notin S} |z_i| \le 2\sum_{i \notin S} (1 - m')|z_i| \le 2\sum_{i \notin S} \big( |z_i| - v_i z_i \big) = 2\sum_{i \notin S} \big( |z_i| - |z_i^\dagger| - v_i(z_i - z_i^\dagger) \big) \le 2\big( \|z\|_1 - \|z^\dagger\|_1 - \langle v, z - z^\dagger \rangle \big) = 2\big( \|Tx\|_1 - \|Tx^\dagger\|_1 - \langle v, z - z^\dagger \rangle \big).$$
The source condition gives that
$$\langle v, z - z^\dagger \rangle = \langle v, Tx - Tx^\dagger \rangle = \langle T^*v, x - x^\dagger \rangle = \langle K^*w, x - x^\dagger \rangle = \langle w, Kx - Kx^\dagger \rangle \ge -\|w\|_2 \cdot \|Kx - Kx^\dagger\|_2.$$
Therefore, we have that
$$\|z - z^\dagger\|_2 \le d\|K\| \cdot \|x - x^\dagger\|_2 + 2(1 + d\|K\|)\big( \|Tx\|_1 - \|Tx^\dagger\|_1 \big) + \big( 2\|w\|_2 + 2d\|K\|\|w\|_2 + d \big)\|K(x - x^\dagger)\|_2.$$
From Condition 1, we have that
$$\|x - x^\dagger\|_2 \le c\|K(x - x^\dagger)\|_2 + m\|Tx - Tx^\dagger\|_2 = c\|K(x - x^\dagger)\|_2 + m\|z - z^\dagger\|_2 \le md\|K\| \cdot \|x - x^\dagger\|_2 + 2(m + md\|K\|)\big( \|Tx\|_1 - \|Tx^\dagger\|_1 \big) + \big( 2m\|w\|_2 + 2dm\|K\|\|w\|_2 + md + c \big)\|K(x - x^\dagger)\|_2.$$
Note that $md\|K\| < 1$; let $q = \frac{1}{1 - md\|K\|}$; we have that
$$\|x - x^\dagger\|_2 \le 2q(m + md\|K\|)\big( \|Tx\|_1 - \|Tx^\dagger\|_1 \big) + q\big( 2m\|w\|_2 + 2md\|K\|\|w\|_2 + md + c \big)\|K(x - x^\dagger)\|_2.$$
Rearranging the last inequality gives the claimed estimate.
 □
With the lemma above, we can obtain the following result. The proofs can be found in [44,47,48,49].
Theorem 6.
Let the regularization parameter be chosen a priori as $\alpha(\delta) = O(\delta)$ or a posteriori according to the strong discrepancy principle (5). Then we have the convergence rate
$$\|x_\alpha^\delta - x^\dagger\|_2 = O(\delta).$$

4.2. Performance if Sparsity Assumption Fails

In this subsection, we focus on the case where $Tx^\dagger$ is not sparse. As presented in the previous subsection, Lemma 4 is critical for the convergence rate analysis. Here, a similar lemma is established first, and then the convergence rate is proved. The first lemma is motivated by [37].
Lemma 5.
For any $x \in H_1$ and $n \in \mathbb{N}$, it holds that
$$\|T(x - x^\dagger)\|_1 - \|Tx\|_1 + \|Tx^\dagger\|_1 \le 2\Big( \sum_{k=n+1}^{\infty} |(Tx^\dagger)_k| + \sum_{k=1}^{n} |(Tx)_k - (Tx^\dagger)_k| \Big).$$
Proof. 
Denote by $P_n$ the projection $P_n(x) = (x_1, x_2, \dots, x_n, 0, \dots)$ for any $x \in H_1$. Hence, we have
$$\|Tx\|_1 = \|P_n Tx\|_1 + \|(\mathrm{Id} - P_n)Tx\|_1.$$
Algebraic computation gives that
$$\|T(x - x^\dagger)\|_1 - \|Tx\|_1 + \|Tx^\dagger\|_1 = \|P_n T(x - x^\dagger)\|_1 + \|(\mathrm{Id} - P_n)Tx^\dagger\|_1 + \|(\mathrm{Id} - P_n)(Tx - Tx^\dagger)\|_1 - \|(\mathrm{Id} - P_n)Tx\|_1 + \|P_n Tx^\dagger\|_1 - \|P_n Tx\|_1.$$
Note that
$$\|(\mathrm{Id} - P_n)(Tx - Tx^\dagger)\|_1 \le \|(\mathrm{Id} - P_n)Tx\|_1 + \|(\mathrm{Id} - P_n)Tx^\dagger\|_1$$
and
$$\|P_n Tx^\dagger\|_1 \le \|P_n T(x - x^\dagger)\|_1 + \|P_n Tx\|_1.$$
Combining the relations above, we obtain
$$\|T(x - x^\dagger)\|_1 - \|Tx\|_1 + \|Tx^\dagger\|_1 \le 2\|P_n T(x - x^\dagger)\|_1 + 2\|(\mathrm{Id} - P_n)Tx^\dagger\|_1.$$
 □
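A quick numerical sanity check of Lemma 5 (a sketch with random finite vectors standing in for elements of $H_1$):

```python
# Sketch: verify the Lemma 5 inequality for random x, x_dagger and several n.
import numpy as np

rng = np.random.default_rng(4)
T = lambda v: v[:-1] - v[1:]                     # forward difference operator
x, x_dag = rng.standard_normal(30), rng.standard_normal(30)
Tx, Txd = T(x), T(x_dag)

lhs = np.abs(T(x - x_dag)).sum() - np.abs(Tx).sum() + np.abs(Txd).sum()
for n in (0, 5, 15, 29):
    rhs = 2 * (np.abs(Txd[n:]).sum() + np.abs(Tx[:n] - Txd[:n]).sum())
    print(n, lhs <= rhs + 1e-12)                 # True for every n
```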
Condition 5.
For all $k \in \mathbb{N}$ there exists $f_k \in H_2$ such that $T^*e_k = K^*f_k$, and $\lim_{k \to \infty} \|f_k\|_2 = +\infty$.
Lemma 6.
Let $\varphi(t) := 2\inf_n \Big\{ \sum_{k=n+1}^{\infty} |(Tx^\dagger)_k| + t\sum_{k=1}^{n} \|f_k\|_2 \Big\}$; then $\varphi(t)$ is a concave index function. Assume that $x^\dagger$ satisfies the source condition and that Conditions 1 and 5 hold. If $c\|K\| < 1$, it holds that
$$\|x - x^\dagger\|_2 \le c_1\|Tx\|_1 - c_2\|Tx^\dagger\|_1 + c_3\varphi\big( \|K(x - x^\dagger)\|_2 \big)$$
for some positive $c_1, c_2, c_3$.
Proof. 
$\varphi$ is concave and upper semi-continuous since it is an infimum of affine functions. For any $t > 0$, $\varphi$ is finite and continuous. Note that $\varphi(0) = 0$; the upper semi-continuity at $t = 0$ gives the continuity of $\varphi$ at $t = 0$. We now turn to the strict monotonicity of $\varphi$. Condition 5 implies that the infimum defining $\varphi(t)$ is attained at some $n \in \mathbb{N}$. Considering $0 < t_1 < t_2 < +\infty$ with corresponding minimizers $n_1, n_2$, we have
$$\varphi(t_1) = 2\Big( \sum_{k=n_1+1}^{\infty} |(Tx^\dagger)_k| + t_1\sum_{k=1}^{n_1} \|f_k\|_2 \Big) \le 2\Big( \sum_{k=n_2+1}^{\infty} |(Tx^\dagger)_k| + t_1\sum_{k=1}^{n_2} \|f_k\|_2 \Big) < 2\Big( \sum_{k=n_2+1}^{\infty} |(Tx^\dagger)_k| + t_2\sum_{k=1}^{n_2} \|f_k\|_2 \Big) = \varphi(t_2).$$
From Condition 5, we have that
$$\sum_{k=1}^{n} |(Tx - Tx^\dagger)_k| = \sum_{k=1}^{n} |\langle Tx - Tx^\dagger, e_k \rangle| = \sum_{k=1}^{n} |\langle x - x^\dagger, T^*e_k \rangle| = \sum_{k=1}^{n} |\langle x - x^\dagger, K^*f_k \rangle| \le \|K(x - x^\dagger)\|_2 \sum_{k=1}^{n} \|f_k\|_2.$$
Therefore, combining this with Lemma 5, we obtain that
$$\|T(x - x^\dagger)\|_2 \le \|Tx\|_1 - \|Tx^\dagger\|_1 + 2\varphi\big( \|K(x - x^\dagger)\|_2 \big).$$
From Condition 1, we have that
$$\|x - x^\dagger\|_2 \le c\|K(x - x^\dagger)\|_2 + m\|T(x - x^\dagger)\|_2 \le c\|K\|\|x - x^\dagger\|_2 + m\|Tx\|_1 - m\|Tx^\dagger\|_1 + 2m\varphi\big( \|K(x - x^\dagger)\|_2 \big).$$
Letting $q = \frac{1}{1 - c\|K\|}$, we obtain that
$$\|x - x^\dagger\|_2 \le qm\|Tx\|_1 - qm\|Tx^\dagger\|_1 + 2qm\varphi\big( \|K(x - x^\dagger)\|_2 \big).$$
 □
Theorem 7.
Let the regularization parameter be chosen a priori as $\alpha(\delta) = O\big( \frac{\delta^2}{\varphi(\delta)} \big)$ or a posteriori according to the strong discrepancy principle (5). Then we have the convergence rate
$$\|x_\alpha^\delta - x^\dagger\|_2 = O\big( \varphi(\delta) \big).$$
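To make the index function $\varphi$ more concrete, the sketch below evaluates it for an illustrative non-sparse gradient; the decay $|(Tx^\dagger)_k| = k^{-2}$ and the growth $\|f_k\|_2 = k$ are purely hypothetical choices, since in practice both depend on $x^\dagger$ and $K$.

```python
# Sketch of the index function phi(t) from Lemma 6 under illustrative assumptions.
import numpy as np

k = np.arange(1, 2001)
tail_terms = 1.0 / k**2            # assumed |(T x_dagger)_k|: summable but not sparse
f_norms = k.astype(float)          # assumed ||f_k||_2, growing with k (Condition 5)

def phi(t):
    # phi(t) = 2 * inf_n ( sum_{k>n} |(T x_dagger)_k| + t * sum_{k<=n} ||f_k||_2 )
    tails = np.concatenate((tail_terms[::-1].cumsum()[::-1], [0.0]))  # tails[n] = sum_{k>n}
    heads = np.concatenate(([0.0], f_norms.cumsum()))                 # heads[n] = sum_{k<=n}
    return 2.0 * np.min(tails + t * heads)

for t in (1e-1, 1e-2, 1e-3, 1e-4):
    print(t, phi(t))               # phi(t) -> 0 as t -> 0, typically slower than O(t)
```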

5. Conclusions

In this paper, we study some problems in total variation type regularization. Although it has a form similar to that of sparse regularization, the TV type is harder to analyze because of the ill-posedness of $T$. A group of regularization conditions has been given in this paper. Under these conditions, we study several theoretical properties of the minimizer of TV type regularization, such as stability, consistency and convergence rates. The convergence rate analysis is then sharpened under a sparsity assumption. In the non-sparse case, which is common in practice, we also present a more conservative result based on some recent works. Regularizers learned from data are currently an active research topic; in future work, we will develop error estimations for this type of regularization problem.

Author Contributions

Conceptualization, K.L. and Z.Y.; methodology, C.H.; validation, K.L., C.H. and Z.Y.; formal analysis, K.L.; writing—original draft preparation, K.L. and Z.Y.; writing—review and editing, K.L. and Z.Y.; supervision, C.H.; project administration, Z.Y.; funding acquisition, K.L. and C.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program of China under Grant 2020YFA0709803, 173 Program under Grant 2020-JCJQ-ZD-029, the Science Challenge Project under Grant TZ2016002, and Dongguan Science and Technology of Social Development Program under Grant 2020507140146.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Donoho, D.L. Compressed sensing. IEEE Trans. Inf. Theory 2006, 52, 1289–1306.
2. Candes, E.J.; Tao, T. Decoding by linear programming. IEEE Trans. Inf. Theory 2005, 51, 4203–4215.
3. Lustig, M.; Donoho, D.L.; Santos, J.M.; Pauly, J.M. Compressed Sensing MRI. IEEE Signal Proc. Mag. 2008, 25, 72–82.
4. Haupt, J.; Bajwa, W.U.; Rabbat, M.; Nowak, R. Compressed Sensing for Networked Data. IEEE Signal Proc. Mag. 2008, 25, 92–101.
5. Yin, W.; Osher, S.; Goldfarb, D.; Darbon, J. Bregman iterative algorithms for ℓ1-minimization with applications to compressed sensing. SIAM J. Imaging Sci. 2008, 1, 143–168.
6. Cai, J.F.; Osher, S.; Shen, Z. Split Bregman methods and frame based image restoration. Multiscale Model. Sim. 2009, 8, 337–369.
7. Van den Berg, E.; Friedlander, M.P. Probing the Pareto Frontier for Basis Pursuit Solutions. SIAM J. Sci. Comput. 2008, 31, 890–912.
8. Chen, S.S.; Donoho, D.L.; Saunders, M.A. Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 1998, 20, 33–61.
9. Candès, E.J.; Romberg, J.; Tao, T. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 2006, 52, 489–509.
10. Candès, E.J. The restricted isometry property and its implications for compressed sensing. CR Math. 2008, 346, 589–592.
11. Tibshirani, R. Regression Shrinkage and Selection via the Lasso. J. R. Stat. Soc. B 1996, 58, 267–288.
12. Daubechies, I.; Defrise, M.; Mol, C.D. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun. Pure Appl. Math. 2004, 57, 1413–1457.
13. Daubechies, I.; Devore, R.; Fornasier, M.; Güntürk, C.S. Iteratively reweighted least squares minimization for sparse recovery. Commun. Pure Appl. Math. 2010, 38, 1–38.
14. Bot, R.I.; Hofmann, B. The impact of a curious type of smoothness conditions on convergence rates in ℓ1-regularization. Eur. J. Math. Comput. Appl. 2013, 1, 29–40.
15. Jin, B.; Maass, P. Sparsity regularization for parameter identification problems. Inverse Probl. 2012, 28, 123001.
16. Grasmair, M.; Haltmeier, M.; Scherzer, O. Sparse regularization with ℓq penalty term. Inverse Probl. 2008, 24, 055020.
17. Hans, E.; Raasch, T. Global convergence of damped semismooth Newton methods for ℓ1 Tikhonov regularization. Inverse Probl. 2015, 31, 025005.
18. Lorenz, D.A.; Schiffler, S.; Trede, D. Beyond convergence rates: Exact recovery with the Tikhonov regularization with sparsity constraints. Inverse Probl. 2011, 27, 085009.
19. Lorenz, D.A. Convergence rates and source conditions for Tikhonov regularization with sparsity constraints. J. Inverse Ill-Posed Probl. 2008, 16, 463–478.
20. Rubinov, A.M.; Yang, X.Q.; Bagirov, A.M. Penalty functions with a small penalty parameter. Optim. Methods Softw. 2002, 17, 931–964.
21. Cai, J.F.; Dong, B.; Osher, S.; Shen, Z. Image Restoration: Total Variation, Wavelet Frames and Beyond. J. Am. Math. Soc. 2012, 25, 1033–1089.
22. Rudin, L.I.; Osher, S.; Fatemi, E. Nonlinear total variation based noise removal algorithms. Phys. D 1992, 60, 259–268.
23. Jun, L.; Huang, T.Z.; Gang, L.; Wang, S.; Lv, X.-G. Total variation with overlapping group sparsity for speckle noise reduction. Neurocomputing 2019, 216, 502–513.
24. Van den Berg, P.M.; Kleinman, R.E. A total variation enhanced modified gradient algorithm for profile reconstruction. Inverse Probl. 1995, 11, L5.
25. Clason, C.; Jin, B.; Kunisch, K. A Duality-Based Splitting Method for ℓ1-TV Image Restoration with Automatic Regularization Parameter Choice. SIAM J. Sci. Comput. 2010, 32, 1484–1505.
26. Cai, J.F.; Xu, W.Y. Guarantees of total variation minimization for signal recovery. Inform. Infer. 2015, 4, 328–353.
27. Needell, D.; Ward, R. Stable Image Reconstruction Using Total Variation Minimization. SIAM J. Imaging Sci. 2012, 6, 1035–1058.
28. Sun, T.; Yin, P.; Cheng, L.; Jiang, H. Alternating direction method of multipliers with difference of convex functions. Adv. Comput. Math. 2018, 44, 723–744.
29. Sun, T.; Jiang, H.; Cheng, L.; Zhu, W. Iteratively Linearized Reweighted Alternating Direction Method of Multipliers for a Class of Nonconvex Problems. IEEE Trans. Signal Proc. 2018, 66, 5380–5391.
30. Leonov, A.S. On the total-variation convergence of regularizing algorithms for ill-posed problems. Comput. Math. Math. Phys. 2007, 47, 732–747.
31. Tikhonov, A.; Leonov, A.; Yagola, A. Nonlinear Ill-Posed Problems; De Gruyter: Berlin, Germany, 2011; pp. 505–512.
32. Hào, D.N.; Quyen, T.N.T. Convergence rates for total variation regularization of coefficient identification problems in elliptic equations I. Inverse Probl. 2011, 27, 075008.
33. Hào, D.N.; Quyen, T.N.T. Convergence rates for total variation regularization of coefficient identification problems in elliptic equations II. J. Math. Anal. Appl. 2012, 388, 593–616.
34. Ciegis, R.; Sev, A.J. Nonlinear Diffusion Problems in Image Smoothing. Math. Model. Anal. 2005, 1, 381–388.
35. Grasmair, M.; Scherzer, O.; Haltmeier, M. Necessary and sufficient conditions for linear convergence of ℓ1-regularization. Commun. Pure Appl. Math. 2011, 64, 161–182.
36. Burger, M.; Osher, S. Convergence rates of convex variational regularization. Inverse Probl. 2004, 20, 1411.
37. Burger, M.; Flemming, J.; Hofmann, B. Convergence rates in ℓ1-regularization if the sparsity assumption fails. Inverse Probl. 2013, 29, 025013.
38. Flemming, J.; Hofmann, B.; Veselić, I. A unified approach to convergence rates for ℓ1-regularization and lacking sparsity. J. Inverse Ill-Posed Probl. 2015, 24, 139–148.
39. Anzengruber, S.W.; Hofmann, B.; Mathé, P. Regularization properties of the sequential discrepancy principle for Tikhonov regularization in Banach spaces. Appl. Anal. 2014, 93, 1382–1400.
40. Luo, Z.Q.; Tseng, P. On the convergence of the coordinate descent method for convex differentiable minimization. J. Optim. Theory Appl. 1992, 72, 7–35.
41. Luo, Z.Q.; Tseng, P. Error bound and convergence analysis of matrix splitting algorithms for the affine variational inequality problem. SIAM J. Optim. 1992, 2, 43–54.
42. Jin, B.; Lorenz, D.A. Heuristic parameter-choice rules for convex variational regularization based on error estimates. SIAM J. Numer. Anal. 2010, 48, 1208–1229.
43. Jin, B.; Zou, J. Iterative parameter choice by discrepancy principle. IMA J. Numer. Anal. 2012, 32, 1714–1732.
44. Jin, B.; Lorenz, D.A.; Schiffler, S. Elastic-Net Regularization: Error estimates and Active Set Methods. Inverse Probl. 2009, 25, 115022.
45. Bredies, K.; Lorenz, D.A. Linear convergence of iterative soft-thresholding. J. Fourier Anal. Appl. 2008, 14, 813–837.
46. Foucart, S.; Rauhut, H. A Mathematical Introduction to Compressive Sensing; Springer: New York, NY, USA, 2013.
47. Anzengruber, S.W.; Ramlau, R. Convergence rates for Morozov's discrepancy principle using variational inequalities. Inverse Probl. 2011, 27, 105007–105024.
48. Flemming, J. Generalized Tikhonov Regularization and Modern Convergence Rate Theory in Banach Spaces; Shaker Verlag GmbH: Düren, Germany, 2012.
49. Hofmann, B.; Mathé, P. Parameter choice in Banach space regularization under variational inequalities. Inverse Probl. 2012, 6, 1035–1058.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
