# A CS Recovery Algorithm for Model and Time Delay Identification of MISO-FIR Systems


Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China

School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China

Author to whom correspondence should be addressed.

Academic Editor: Paul M. Goggans

Received: 29 June 2015 / Revised: 6 August 2015 / Accepted: 2 September 2015 / Published: 10 September 2015

This paper considers the identification of multiple input single output finite impulse response (MISO-FIR) systems with unknown time delays and orders. Generally, the parameters, orders and time delays of an MISO system are identified separately by different algorithms. In this paper, we aim to perform model identification and time delay estimation simultaneously from a limited number of observations. For an MISO-FIR system with many inputs and unknown input time delays, the corresponding identification model contains a large number of parameters, requiring a great number of observations for identification and leading to a heavy computational burden. Inspired by compressed sensing (CS) recovery theory, a threshold orthogonal matching pursuit (TH-OMP) algorithm is presented to simultaneously identify the parameters, orders and time delays of MISO-FIR systems. The proposed algorithm requires only a small number of sampled data compared to conventional identification methods, such as the least squares method. The effectiveness of the proposed algorithm is verified by simulation results.

In many industrial processes, time delay is an unavoidable feature of the dynamical model. For example, in the chemical industry, analyzers need a certain amount of time to process an analysis that is used as part of the control loop [1]. Analyzing and designing a controller for a system with time delays depend on an effective dynamical system model. System identification is an effective way to build mathematical models of dynamical systems from observed input-output data. Conventional identification methods, such as the least squares methods [2], the stochastic gradient methods [3] and the maximum likelihood estimation methods [4], usually require a large number of sampled data and assume that the model structures, including time delays, are known a priori [5,6]. However, in many practical situations this is not the case, and the time delays and system orders have to be estimated together with the parameters. Most process industries, such as distillation columns, heat exchangers and reactors, can be modeled as multiple input multiple output (MIMO) or multiple input single output (MISO) systems with long time delays [1]. For a multiple input single output finite impulse response (MISO-FIR) system with many inputs and unknown input time delays, the parameterized model contains a large number of parameters, requiring a great number of observations and a heavy computational burden for identification. Furthermore, the measuring cost is an issue: especially for chemical processes, it can take a long time to collect enough data because of the long analysis time. Therefore, it is challenging to find new methods that identify the time delays and parameters of multivariable systems with less computational burden from a small number of sampled data.

System models of practical interest, especially control-oriented models, often have low order, which implies that the systems contain only a few non-zero parameters. Because the time delays and orders are unknown, we can parameterize such an MISO-FIR system by a high-dimensional parameter vector with only a few non-zero elements. Such systems can be termed sparse systems. Equivalently, a sparse MISO-FIR system has a long impulse response in which only a few terms are non-zero. Sparse systems are widely discussed in many fields [7,8,9,10,11]. For such systems, the identification objective is to estimate the sparse parameter vector and to read the time delays and orders from the structure of the estimated parameter vector.

In this paper, we consider the identification of sparse MISO-FIR systems by using the compressed sensing (CS) recovery theory [12,13,14,15]. CS has been widely applied in signal processing and communication systems; it enables the recovery of an unknown vector from a small set of measurements under the assumptions that the signal is sparse and that the measurement matrix satisfies certain conditions [16,17,18,19]. The identification of a sparse system can therefore be viewed as a CS recovery problem. The sparse recovery problem can be solved by the ${l}_{1}$-norm convex relaxation. However, the ${l}_{1}$ regularization schemes are complex and require a heavy computational burden [20,21,22]. The greedy methods have speed advantages and are easily implemented. In the literature, a block orthogonal matching pursuit (BOMP) algorithm has been applied to identify deterministic linear time-invariant (LTI) and linear time-variant (LTV) autoregressive with exogenous input (ARX) models without disturbances [9]. In this paper, we consider identifying the parameters, orders and time delays of MISO-FIR systems from a small number of noisy measured data.

Briefly, the structure of this paper is as follows. Section 2 gives the model description of MISO-FIR systems and describes the identification problem. Section 3 presents a threshold orthogonal matching pursuit (TH-OMP) algorithm based on the compressed sensing theory. Section 4 provides a simulation example to show the effectiveness of the proposed algorithm. Finally, some concluding remarks are given in Section 5.

Consider an MISO-FIR system with time delays:

$$y\left(t\right)=\sum _{i=1}^{r}{z}^{-{d}_{i}}{B}_{i}\left(z\right){u}_{i}\left(t\right)+v\left(t\right)$$

where ${u}_{i}\left(t\right)$ is the i-th input, $y\left(t\right)$ the output, ${d}_{i}$ the time delay of the i-th input channel and $v\left(t\right)$ a white noise with zero mean and variance ${\sigma}^{2}$; ${B}_{i}\left(z\right)$, $i=1,2,\cdots ,r$, are polynomials in the unit backward shift operator ${z}^{-1}$ (i.e., ${z}^{-1}y\left(t\right)=y(t-1)$):

$${B}_{i}\left(z\right):={b}_{i1}{z}^{-1}+{b}_{i2}{z}^{-2}+\cdots +{b}_{i{n}_{bi}}{z}^{-{n}_{bi}}$$

Assume that $y\left(t\right)=0$, ${u}_{i}\left(t\right)=0$ and $v\left(t\right)=0$ for $t\leqslant 0$, and that the orders ${n}_{bi}$, the time delays ${d}_{i}$ and the parameters ${b}_{ij}$ are unknown. Thus, the identification goal is to estimate the unknown parameters ${b}_{ij}$, the orders ${n}_{bi}$ and the time delays ${d}_{i}$ from the observations.
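As a concrete sketch, the model above can be simulated in a few lines. The coefficients, delays and noise level below are illustrative choices of ours, not those of the paper, and the function name is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_miso_fir(U, b, d, sigma=0.0):
    """Simulate y(t) = sum_i z^{-d_i} B_i(z) u_i(t) + v(t).

    U: (T, r) array of inputs; b[i]: coefficients b_{i1}, ..., b_{i,nbi};
    d[i]: time delay of channel i; sigma: noise standard deviation.
    Signals are assumed zero for t <= 0, as in the paper.
    """
    T, r = U.shape
    y = np.zeros(T)
    for i in range(r):
        for j, bij in enumerate(b[i], start=1):
            lag = d[i] + j                    # b_{ij} multiplies u_i(t - d_i - j)
            y[lag:] += bij * U[:T - lag, i]
    return y + sigma * rng.standard_normal(T)

U = rng.standard_normal((200, 2))
y = simulate_miso_fir(U, b=[[0.5, 0.3], [-0.4]], d=[3, 1], sigma=0.1)
```

With `sigma=0` the output reduces to the pure delayed FIR convolution, which is convenient for checking the indexing.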

Define the information vector:

$$\begin{array}{ccc}\hfill \mathit{\phi}\left(t\right)& :=& [{u}_{1}(t-1),\cdots ,{u}_{1}(t-{d}_{1}),{u}_{1}(t-{d}_{1}-1),\cdots ,{u}_{1}(t-{d}_{1}-{n}_{{b}_{1}}),\cdots ,{u}_{1}(t-l),\cdots ,\hfill \\ & & \phantom{\rule{4pt}{0ex}}{u}_{r}(t-1),\cdots ,{u}_{r}(t-{d}_{r}),{u}_{r}(t-{d}_{r}-1),\cdots ,{u}_{r}(t-{d}_{r}-{n}_{{b}_{r}}),\cdots ,{u}_{r}(t-l){]}^{\mathrm{T}}\in {\mathbb{R}}^{N}\hfill \end{array}$$

where l is the maximum input regression length; l must be chosen large enough to capture all of the time delays and the degrees of the polynomials ${B}_{i}\left(z\right)$, i.e., $l\geqslant \underset{i}{max}({d}_{i}+{n}_{bi})$. The dimension of $\mathit{\phi}\left(t\right)$ is $N=lr$. The corresponding parameter vector **θ** is:

$$\begin{array}{ccc}\hfill \mathit{\theta}& :=& {[0,\cdots ,0,{b}_{11},\cdots ,{b}_{1{n}_{{b}_{1}}},0,\cdots ,0,{b}_{i1},\cdots ,{b}_{i{n}_{{b}_{i}}},0,\cdots ,0,{b}_{r1},\cdots ,{b}_{r{n}_{{b}_{r}}},0,\cdots ,0]}^{\mathrm{T}}\in {\mathbb{R}}^{N}\hfill \end{array}$$
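To make the block structure of **θ** concrete, here is a small sketch for $r=2$; the regression length, delays and coefficient values are made up for illustration:

```python
import numpy as np

l, r = 8, 2                            # regression length per input, number of inputs
d = [2, 1]                             # time delays d_1, d_2 (illustrative)
b = [[0.9, 0.5], [-0.6, -0.4, -0.2]]   # coefficients of B_1(z) and B_2(z)

theta = np.zeros(l * r)
for i in range(r):
    start = i * l + d[i]               # channel i occupies entries i*l .. (i+1)*l - 1
    theta[start:start + len(b[i])] = b[i]

K = int(np.count_nonzero(theta))       # sparsity level K = n_b1 + n_b2
```

The non-zero entries appear in one contiguous block per channel, preceded by $d_i$ zeros, exactly the pattern exploited later when reading off the orders and delays.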

Then, Equation (1) can be written as:

$$y\left(t\right)={\mathit{\phi}}^{\mathrm{T}}\left(t\right)\mathit{\theta}+v\left(t\right)$$

Because of the unknown orders and time delays, the parameter vector **θ** contains many zeros and only a few non-zeros. According to the CS theory, the parameter vector **θ** can be viewed as a sparse signal, and the number of non-zero entries $K={\sum}_{i=1}^{r}{n}_{bi}$ is the sparsity level of **θ** [8]. Let ${\parallel \mathit{\theta}\parallel}_{0}$ denote the number of non-zero entries in **θ**. Then, the identification problem can be described as:

$$\widehat{\mathit{\theta}}=arg\underset{\mathit{\theta}}{min}{\parallel \mathit{\theta}\parallel}_{0},\phantom{\rule{1.em}{0ex}}\mathrm{s}.\mathrm{t}.\phantom{\rule{1.em}{0ex}}{\parallel \mathit{y}-{\mathit{\phi}}^{\mathrm{T}}\left(t\right)\mathit{\theta}\parallel}_{2}<\epsilon$$

where $\widehat{\mathit{\theta}}$ is the estimate of **θ** and $\epsilon >0$ is the error tolerance.

For m observations, define the stacked matrix and vectors as:

$$\begin{array}{ccc}\hfill {\mathbf{\Phi}}_{m}& :=& \left[\begin{array}{c}{\mathit{\phi}}^{\mathrm{T}}\left(1\right)\\ {\mathit{\phi}}^{\mathrm{T}}\left(2\right)\\ \vdots \\ {\mathit{\phi}}^{\mathrm{T}}\left(m\right)\end{array}\right]\in {\mathbb{R}}^{m\times N},\phantom{\rule{4pt}{0ex}}{\mathit{y}}_{m}:=\left[\begin{array}{c}y\left(1\right)\\ y\left(2\right)\\ \vdots \\ y\left(m\right)\end{array}\right]\in {\mathbb{R}}^{m},\phantom{\rule{4pt}{0ex}}{\mathit{v}}_{m}:=\left[\begin{array}{c}v\left(1\right)\\ v\left(2\right)\\ \vdots \\ v\left(m\right)\end{array}\right]\in {\mathbb{R}}^{m}\hfill \end{array}$$

Then, we have:

$${\mathit{y}}_{m}={\mathbf{\Phi}}_{m}\mathit{\theta}+{\mathit{v}}_{m}$$

If there are enough measurements, i.e., $m\gg N$, according to the least squares principle, we can get the least squares estimate of **θ**,

$${\widehat{\mathit{\theta}}}_{\mathrm{LS}}={\left({\mathbf{\Phi}}_{m}^{\mathrm{T}}{\mathbf{\Phi}}_{m}\right)}^{-1}{\mathbf{\Phi}}_{m}^{\mathrm{T}}{\mathit{y}}_{m}$$
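For reference, the least squares baseline can be sketched as follows. The dimensions and the sparse vector are illustrative, the data are generated noise-free so that the LS solution is exact, and we use a numerically safer solver in place of the explicit normal-equation inverse:

```python
import numpy as np

rng = np.random.default_rng(1)
m, N = 300, 100                            # m >> N, as the LS estimate requires
Phi_m = rng.standard_normal((m, N))        # stacked information matrix
theta = np.zeros(N)
theta[[10, 11, 40]] = [1.0, 0.5, -0.7]     # a sparse "true" parameter vector
y_m = Phi_m @ theta                        # noise-free outputs for clarity

# theta_LS = (Phi^T Phi)^{-1} Phi^T y, computed via least squares rather than
# forming the inverse explicitly
theta_ls, *_ = np.linalg.lstsq(Phi_m, y_m, rcond=None)
```

With $m \gg N$ and a well-conditioned $\mathbf{\Phi}_m$ this recovers **θ** exactly; the point of the paper is to avoid collecting that many measurements.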

However, from Equation (2), it is obvious that N is large, especially for systems with many inputs. Therefore, it takes considerable time and effort to collect enough measurements. In order to improve the identification efficiency, in this paper, we aim to identify the parameter vector **θ** using fewer observations ($K<m<N$) based on the CS recovery theory.

The CS theory, which was introduced by Candès, Romberg and Tao [12,13] and Donoho [14], has been widely used in signal processing. It shows that an unknown signal vector can be recovered from a small set of measurements under the assumptions that the signal vector is sparse and that the measurement matrix satisfies certain conditions. The identification problem can be viewed as a CS recovery problem because the parameter vector **θ** is sparse.

From Equation (4), we can see that the output vector ${\mathit{y}}_{m}$ can be written as a linear combination of the columns of ${\mathbf{\Phi}}_{m}$ plus the noise vector ${\mathit{v}}_{m}$, i.e.,

$${\mathit{y}}_{m}={\varphi}_{1}{\theta}_{1}+{\varphi}_{2}{\theta}_{2}+\cdots +{\varphi}_{i}{\theta}_{i}+\cdots +{\varphi}_{N}{\theta}_{N}+{\mathit{v}}_{m}$$

where ${\theta}_{i}$ is the i-th element of **θ** and ${\varphi}_{i}$ is the i-th column of ${\mathbf{\Phi}}_{m}$. Because there are only K non-zero parameters in **θ**, the main idea is to find the K non-zero terms on the right-hand side of Equation (5).

In this paper, we propose a TH-OMP algorithm to perform the identification. First, we give some notation. Let $k=1,2,\cdots $ denote the iteration number; ${\lambda}_{k}$ the solution support found at the k-th iteration; ${\Lambda}_{k}$ the set composed of ${\lambda}_{i}$, $i=1,2,\cdots ,k$; ${\mathit{r}}_{k}$ the residual at the k-th iteration; ${\mathbf{\Phi}}_{{\Lambda}_{k}}$ the sub-information matrix composed of the k columns of ${\mathbf{\Phi}}_{m}$ indexed by ${\Lambda}_{k}$; and ${\widehat{\mathit{\theta}}}_{k}$ the parameter estimate at the k-th iteration. The TH-OMP algorithm is initialized as ${\mathit{r}}_{0}={\mathit{y}}_{m}$, ${\Lambda}_{0}=\varnothing$ and ${\widehat{\mathit{\theta}}}_{{\Lambda}_{0}}=\mathbf{0}$. At the k-th iteration, minimizing the criterion function:

$$\epsilon \left(i\right)=\underset{{\theta}_{i}}{min}{\parallel {\mathit{r}}_{k-1}-{\varphi}_{i}{\theta}_{i}\parallel}^{2},\phantom{\rule{4pt}{0ex}}i=1,2,\cdots ,N$$

with respect to ${\theta}_{i}$ yields:

$${\theta}_{i}=\frac{{\varphi}_{i}^{\mathrm{T}}{\mathit{r}}_{k-1}}{{\parallel {\varphi}_{i}\parallel}^{2}}$$

Plugging it back into $\epsilon \left(i\right)$, we have:

$$\begin{array}{ccc}\hfill \epsilon \left(i\right)& =& \underset{{\text{\theta}}_{i}}{min}{\parallel {\varphi}_{i}{\text{\theta}}_{i}-{\mathit{r}}_{k-1}\parallel}^{2}\hfill \\ & =& \parallel \frac{{\varphi}_{i}^{\mathrm{T}}{\mathit{r}}_{k-1}}{\parallel {\varphi}_{i}{\parallel}^{2}}{\varphi}_{i}-{\mathit{r}}_{k-1}{\parallel}^{2}\hfill \\ & =& \parallel {\mathit{r}}_{k-1}{\parallel}^{2}-2\frac{{\left({\varphi}_{i}^{\mathrm{T}}{\mathit{r}}_{k-1}\right)}^{2}}{\parallel {\varphi}_{i}{\parallel}^{2}}+\frac{{\left({\varphi}_{i}^{\mathrm{T}}{\mathit{r}}_{k-1}\right)}^{2}}{\parallel {\varphi}_{i}{\parallel}^{2}}\hfill \\ & =& \parallel {\mathit{r}}_{k-1}{\parallel}^{2}-\frac{{\left({\varphi}_{i}^{\mathrm{T}}{\mathit{r}}_{k-1}\right)}^{2}}{\parallel {\varphi}_{i}{\parallel}^{2}}\hfill \\ & =& \parallel {\mathit{r}}_{k-1}{\parallel}^{2}-{\left(\frac{1}{\parallel {\varphi}_{i}\parallel}{\varphi}_{i}^{\mathrm{T}}{\mathit{r}}_{k-1}\right)}^{2}\hfill \end{array}$$

From the above equation, we can see that the quest for the smallest error is equivalent to the quest for the largest inner product between the residual ${\mathbf{r}}_{k-1}$ and the normalized column vectors of ${\mathbf{\Phi}}_{m}$. Thus, the k-th solution support can be obtained by:

$$\begin{array}{c}\hfill {\lambda}_{k}=arg\underset{i=1,2,\cdots ,N}{max}|\langle {\mathit{r}}_{k-1},\frac{{\varphi}_{i}}{\parallel {\varphi}_{i}\parallel}\rangle |\end{array}$$
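Equation (7) amounts to one line of linear algebra; a minimal sketch follows, with a function name of our own choosing:

```python
import numpy as np

def select_support(Phi, r_prev):
    """Return the column index of Phi whose normalized inner product with the
    residual r_prev is largest, as in Equation (7)."""
    norms = np.linalg.norm(Phi, axis=0)        # column norms ||phi_i||
    scores = np.abs(Phi.T @ r_prev) / norms    # |<r_{k-1}, phi_i / ||phi_i||>|
    return int(np.argmax(scores))
```

For example, with `Phi = np.eye(4)` and a residual aligned with the third column, the function returns index 2.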

Update the support set ${\text{\Lambda}}_{k}$ and the sub-information matrix ${\mathbf{\Phi}}_{{\text{\Lambda}}_{k}}$ by:

$${\text{\Lambda}}_{k}:={\text{\Lambda}}_{k-1}\cup {\lambda}_{k},\phantom{\rule{1.em}{0ex}}{\mathbf{\Phi}}_{{\text{\Lambda}}_{k}}:={\mathbf{\Phi}}_{{\text{\Lambda}}_{k-1}}\cup {\varphi}_{{\lambda}_{k}}$$

The next step is to find the k-th provisional parameter estimate from the sub-information matrix ${\mathbf{\Phi}}_{{\Lambda}_{k}}$ and the measurement vector ${\mathit{y}}_{m}$. Minimizing the cost function:

$$J\left({\mathit{\theta}}_{{\Lambda}_{k}}\right)={\parallel {\mathit{y}}_{m}-{\mathbf{\Phi}}_{{\Lambda}_{k}}{\mathit{\theta}}_{{\Lambda}_{k}}\parallel}^{2}$$

leads to the k-th parameter estimate:

$${\widehat{\mathit{\theta}}}_{{\Lambda}_{k}}={\left({\mathbf{\Phi}}_{{\Lambda}_{k}}^{\mathrm{T}}{\mathbf{\Phi}}_{{\Lambda}_{k}}\right)}^{-1}{\mathbf{\Phi}}_{{\Lambda}_{k}}^{\mathrm{T}}{\mathit{y}}_{m}$$

Then, the k-th residual can be computed by:

$${\mathit{r}}_{k}={\mathit{y}}_{m}-{\mathbf{\Phi}}_{{\text{\Lambda}}_{k}}{\widehat{\mathit{\theta}}}_{{\text{\Lambda}}_{k}}$$
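Equations (10) and (11) are an ordinary least squares fit restricted to the selected columns. A sketch, with names of our own choosing, on a small random problem:

```python
import numpy as np

def ls_on_support(Phi, y, support):
    """Least squares over the selected columns (Eq. (10)) and the
    corresponding residual (Eq. (11))."""
    Phi_s = Phi[:, support]
    theta_s, *_ = np.linalg.lstsq(Phi_s, y, rcond=None)
    r = y - Phi_s @ theta_s
    return theta_s, r

rng = np.random.default_rng(2)
Phi = rng.standard_normal((20, 10))
y = rng.standard_normal(20)
theta_s, r = ls_on_support(Phi, y, [1, 4, 7])
```

A useful sanity check: by construction, the residual of this fit is orthogonal to every selected column, i.e., ${\mathbf{\Phi}}_{{\Lambda}_{k}}^{\mathrm{T}}{\mathit{r}}_{k}=\mathbf{0}$.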

If the system is noise free, the residual will be zero after K iterations, and the non-zero parameters, as well as their positions, can be exactly determined. This is the classical orthogonal matching pursuit (OMP) algorithm. However, for systems disturbed by noise, the OMP algorithm cannot give the sparsest solution, because the residual ${\mathit{r}}_{K}$ is non-zero after K iterations, and the iteration continues. In order to solve this problem, an appropriately small threshold ϵ can be set to filter the parameter estimate ${\widehat{\mathit{\theta}}}_{{\Lambda}_{k}}$: if $|{\widehat{\theta}}_{{\Lambda}_{k},i}|<\epsilon$, set ${\widehat{\theta}}_{{\Lambda}_{k},i}=0$, where ${\widehat{\theta}}_{{\Lambda}_{k},i}$ is the i-th element of ${\widehat{\mathit{\theta}}}_{{\Lambda}_{k}}$. The choice of ϵ will be discussed in the simulation section. Denote ${\widehat{\mathit{\theta}}}_{{\Lambda}_{k}\epsilon}$ as the parameter estimate after filtering. Then, the residual can be computed by:

$${\mathit{r}}_{k}={\mathit{y}}_{m}-{\mathbf{\Phi}}_{{\Lambda}_{k}}{\widehat{\mathit{\theta}}}_{{\Lambda}_{k}\epsilon}$$

If $\parallel {\mathit{r}}_{k}\parallel <\epsilon$, stop the iteration; otherwise, apply another iteration. If the iteration stops at k, we can recover the parameter estimate $\widehat{\mathit{\theta}}\in {\mathbb{R}}^{N}$ from ${\widehat{\mathit{\theta}}}_{{\Lambda}_{k}\epsilon}\in {\mathbb{R}}^{k}$ based on the support set ${\Lambda}_{k}$.

It is worth noting that taking the derivative of $J\left({\mathit{\theta}}_{{\text{\Lambda}}_{k}}\right)$ with respect to ${\widehat{\mathit{\theta}}}_{{\text{\Lambda}}_{k}}$ gives:

$$\begin{array}{ccc}\hfill \frac{\partial J\left({\widehat{\mathit{\theta}}}_{{\text{\Lambda}}_{k}}\right)}{\partial {\widehat{\mathit{\theta}}}_{{\text{\Lambda}}_{k}}}& =& -2{\mathbf{\Phi}}_{{\text{\Lambda}}_{k}}^{\mathrm{T}}({\mathit{y}}_{m}-{\mathbf{\Phi}}_{{\text{\Lambda}}_{k}}{\widehat{\mathit{\theta}}}_{{\text{\Lambda}}_{k}})\hfill \\ & =& -2{\mathbf{\Phi}}_{{\text{\Lambda}}_{k}}^{\mathrm{T}}{\mathit{r}}_{k}=\mathbf{0}\hfill \end{array}$$

The above equation shows that the residual ${\mathit{r}}_{k}$ is orthogonal to the columns of the sub-information matrix ${\mathbf{\Phi}}_{{\text{\Lambda}}_{k}}$. Therefore, at the $(k+1)$-th iteration, the inner products between ${\mathit{r}}_{k}$ and the columns already included in ${\mathbf{\Phi}}_{{\text{\Lambda}}_{k}}$ need not be computed. Modify Equation (7) as:

$$\begin{array}{c}\hfill {\lambda}_{k}=arg\underset{i\in \text{\Omega}\backslash {\text{\Lambda}}_{k-1}}{max}|\langle {\mathit{r}}_{k-1},\frac{{\varphi}_{i}}{\parallel {\varphi}_{i}\parallel}\rangle |,\phantom{\rule{1.em}{0ex}}\text{\Omega}=\{1,2,\cdots ,N\}\end{array}$$

Then, the computational burden can be reduced.

From the recovered parameter estimate $\widehat{\mathit{\theta}}$, the orders and time delays of each input channel can be estimated. The order ${n}_{bi}$ is the number of elements in the i-th non-zero block. The time delay estimates can then be computed from the orders ${n}_{bi}$ and the maximum input regression length l. It can be seen from Equation (2) that **θ** contains $(r+1)$ zero blocks. Denote the number of zeros in each zero block as ${n}_{i}$, $i=1,2,\cdots ,r+1$. Then, the time delay estimates can be computed by:

$$\begin{array}{c}\hfill \begin{array}{c}{\widehat{d}}_{1}={n}_{1},\hfill \\ {\widehat{d}}_{i}={n}_{i}-(l-{\widehat{d}}_{i-1}-{n}_{{b}_{i-1}}),\phantom{\rule{4pt}{0ex}}i=2,3,\cdots ,r\hfill \end{array}\end{array}$$
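The recursion above can be evaluated directly once the zero-block lengths and orders have been read off the recovered vector. A sketch (function name ours), checked against the values used later in the paper's example ($l=50$, ${n}_{1}=42$, ${n}_{2}=25$, ${n}_{b1}={n}_{b2}=3$):

```python
def estimate_delays(n, nb, l):
    """Delay estimates from zero-block lengths.

    n[i]: number of zeros in the (i+1)-th zero block;
    nb[i]: order n_b(i+1); l: regression length per input.
    """
    d = [n[0]]                                   # d_1 = n_1
    for i in range(1, len(nb)):
        # zeros before block i = trailing zeros of channel i-1 plus d_i
        d.append(n[i] - (l - d[i - 1] - nb[i - 1]))
    return d
```

For instance, `estimate_delays([42, 25], [3, 3], 50)` yields `[42, 20]`.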

The proposed TH-OMP algorithm can be implemented as the following steps.

- Set the initial values: let $k=0$, ${\mathit{r}}_{0}={\mathit{y}}_{m}$, ${\text{\Lambda}}_{0}=\varnothing$ and ${\widehat{\mathit{\theta}}}_{{\text{\Lambda}}_{0}}=\mathbf{0}$. Give the error tolerance ε.
- Update the support set ${\text{\Lambda}}_{k}$ and the sub-information matrix ${\mathbf{\Phi}}_{{\text{\Lambda}}_{k}}$ using Equation (8).
- Compute the parameter estimate ${\widehat{\mathit{\theta}}}_{{\text{\Lambda}}_{k}}$ by using Equation (10).
- Use the threshold ϵ to filter ${\widehat{\mathit{\theta}}}_{{\text{\Lambda}}_{k}}$ and obtain ${\widehat{\mathit{\theta}}}_{{\text{\Lambda}}_{k\u03f5}}$.
- Compute the residual using Equation (12). If $\parallel {\mathbf{r}}_{k}\parallel <\epsilon $, stop the iteration; otherwise, go back to Step 2 to apply another iteration.
- Recover the parameter estimate $\widehat{\mathit{\theta}}\in {\mathbb{R}}^{N}$ from ${\widehat{\mathit{\theta}}}_{{\text{\Lambda}}_{k\u03f5}}\in {\mathbb{R}}^{k}$ based on the support set ${\text{\Lambda}}_{k}$.
- Read the orders ${n}_{bi}$ from $\widehat{\mathit{\theta}}\in {\mathbb{R}}^{N}$ and estimate the time delays using Equation (15).
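The steps above can be collected into a short routine. The following is a sketch of the TH-OMP iteration under our own naming, exercised on a random test problem rather than the paper's heat exchanger example:

```python
import numpy as np

def th_omp(Phi, y, threshold, tol, max_iter=None):
    """Threshold orthogonal matching pursuit (sketch of the listed steps)."""
    m, N = Phi.shape
    norms = np.linalg.norm(Phi, axis=0)
    support, r = [], y.copy()
    theta_s = np.zeros(0)
    for _ in range(max_iter or m):
        scores = np.abs(Phi.T @ r) / norms                    # Eq. (7)
        scores[support] = -1.0                                # Eq. (14): skip chosen columns
        support.append(int(np.argmax(scores)))
        Phi_s = Phi[:, support]
        theta_s, *_ = np.linalg.lstsq(Phi_s, y, rcond=None)   # Eq. (10)
        theta_s[np.abs(theta_s) < threshold] = 0.0            # thresholding step
        r = y - Phi_s @ theta_s                               # Eq. (12)
        if np.linalg.norm(r) < tol:                           # stopping rule
            break
    theta = np.zeros(N)
    theta[support] = theta_s                                  # recover full-length estimate
    return theta

rng = np.random.default_rng(3)
Phi = rng.standard_normal((40, 120))
theta_true = np.zeros(120)
theta_true[[5, 50, 90]] = [1.0, -0.8, 0.6]
y = Phi @ theta_true                                          # noise-free for this sketch
theta_hat = th_omp(Phi, y, threshold=0.05, tol=1e-8)
```

With noise-free data the residual should drop to machine precision once the true support is selected, so the loop terminates after about $K$ iterations.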

Consider the steam-water heat exchanger system in Figure 1, where the steam flow ${u}_{1}$ and the water flow ${u}_{2}$ are the two manipulated variables used to control the water temperature y. The discrete input-output representation of the process is given by the following equations.

$$\begin{array}{ccc}\hfill y\left(t\right)& =& {z}^{-{d}_{1}}{B}_{1}\left(z\right){u}_{1}\left(t\right)+{z}^{-{d}_{2}}{B}_{2}\left(z\right){u}_{2}\left(t\right)+v\left(t\right)\hfill \end{array}$$

$$\begin{array}{ccc}\hfill {B}_{1}\left(z\right)& =& 0.95{z}^{-1}+0.72{z}^{-2}+0.40{z}^{-3},\hfill \end{array}$$

$$\begin{array}{ccc}\hfill {B}_{2}\left(z\right)& =& -0.63{z}^{-1}-0.47{z}^{-2}-0.25{z}^{-3},\hfill \end{array}$$

$$\begin{array}{ccc}\hfill {d}_{1}& =& 42,{d}_{2}=20\hfill \end{array}$$

Let $l=50$; then the parameter vector to be identified is:

$$\begin{array}{ccc}\hfill \mathit{\theta}& =& {[{\mathbf{0}}_{42},0.95,0.72,0.40,{\mathbf{0}}_{25},-0.63,-0.47,-0.25,{\mathbf{0}}_{27}]}^{\mathrm{T}}\in {\mathbb{R}}^{N}\hfill \end{array}$$

It can be seen from the above equation that the dimension of the parameter vector **θ** is $N=100$, but only six of the parameters are non-zeros.

In the identification, the inputs $\left\{{u}_{i}\left(t\right)\right\}$, $i=1,2$, are taken as independent persistently exciting signal sequences with zero mean and unit variance, and $\left\{v\left(t\right)\right\}$ as a white noise sequence with zero mean and variance ${\sigma}^{2}=0.10^{2}$. Let $\u03f5=0.05$. Applying the proposed TH-OMP algorithm and the least squares (LS) algorithm to estimate the system with m measurements, the parameter estimation errors $\delta :=\parallel \widehat{\mathit{\theta}}-\mathit{\theta}\parallel /\parallel \mathit{\theta}\parallel $ versus m are shown in Figure 2.

From Figure 2, we can see that even with a limited dataset ($m<100$), the parameter estimation error of the TH-OMP algorithm approaches zero, and its estimation accuracy is much higher than that of the LS algorithm.

Let $m=80$. Using the TH-OMP algorithm to estimate **θ**, the parameter estimation errors δ versus iteration k are shown in Figure 3, and the non-zero parameter estimates versus iteration k are shown in Table 1 and Figure 4, where the dash-dot lines are the true parameters and the solid lines are the parameter estimates.

| k | ${b}_{11}$ | ${b}_{12}$ | ${b}_{13}$ | ${b}_{21}$ | ${b}_{22}$ | ${b}_{23}$ | $\delta (\%)$ |
|---|---|---|---|---|---|---|---|
| 1 | 1.0567 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 77.8404 |
| 2 | 1.0056 | 0.7357 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 61.0813 |
| 3 | 1.0274 | 0.7349 | 0.0000 | −0.6982 | 0.0000 | 0.0000 | 44.8210 |
| 4 | 0.9422 | 0.7635 | 0.0000 | −0.7006 | −0.4604 | 0.0000 | 31.8601 |
| 5 | 0.9748 | 0.7429 | 0.3734 | −0.6551 | −0.4605 | 0.0000 | 16.9639 |
| 6 | 0.9436 | 0.7041 | 0.3936 | −0.6397 | −0.4727 | −0.2522 | 1.3977 |
| 7 | 0.9495 | 0.7071 | 0.3973 | −0.6408 | −0.4715 | −0.2537 | 1.1650 |
| 8 | 0.9466 | 0.7127 | 0.3986 | −0.6422 | −0.4707 | −0.2509 | 0.9755 |
| 9 | 0.9487 | 0.7120 | 0.4022 | −0.6386 | −0.4722 | −0.2490 | 0.8137 |
| 10 | 0.9458 | 0.7129 | 0.3988 | −0.6380 | −0.4766 | −0.2498 | 0.8841 |
| True values | 0.9500 | 0.7200 | 0.4000 | −0.6300 | −0.4700 | −0.2500 | |

After 10 iterations, the estimation error is $\text{\delta}=0.8841\%$, and the estimated parameter vector is:

$$\begin{array}{ccc}\hfill {\widehat{\mathit{\theta}}}_{\u03f5=0.05}& =& {[{\mathbf{0}}_{42},0.9472,0.7105,0.3921,{\mathbf{0}}_{25},-0.6246,-0.4751,-0.2524,{\mathbf{0}}_{27}]}^{\mathrm{T}}\in {\mathbb{R}}^{N}\hfill \end{array}$$

If we choose $\u03f5=0.01$, then we can get the estimated parameter vector as:

$$\begin{array}{ccc}\hfill {\widehat{\mathit{\theta}}}_{\u03f5=0.01}& =& [{\mathbf{0}}_{4},0.0217,{\mathbf{0}}_{25},0.0303,{\mathbf{0}}_{11},0.9458,0.7129,0.3988,{\mathbf{0}}_{25},\hfill \\ & & \phantom{\rule{4pt}{0ex}}-0.6220,-0.4634,-0.2502,{\mathbf{0}}_{13},-0.0244,{\mathbf{0}}_{13}{]}^{\mathrm{T}}\in {\mathbb{R}}^{N}\hfill \end{array}$$

It is obvious that the parameter vector **θ** has only three zero blocks in this example. However, we can see from the above equation that there exist three undesirable parameter estimates, ${\widehat{\theta}}_{5}$, ${\widehat{\theta}}_{31}$ and ${\widehat{\theta}}_{87}$, whose magnitudes are much smaller than those of the other parameter estimates. Therefore, according to the structure of **θ**, the estimates ${\widehat{\theta}}_{5}$, ${\widehat{\theta}}_{31}$ and ${\widehat{\theta}}_{87}$ can be set to zero. The estimation result is then the same as in the case $\u03f5=0.05$. This implies that even if ϵ is chosen too small, we can still obtain effective parameter estimates based on the model structure.

From Equations (15) and (16), we can obtain the orders ${n}_{{b}_{1}}={n}_{{b}_{2}}=3$ and get the time delay estimates as:

$$\begin{array}{ccc}\hfill {\widehat{d}}_{1}& =& 42,\phantom{\rule{1.em}{0ex}}{\widehat{d}}_{2}={n}_{2}-(l-{\widehat{d}}_{1}-{n}_{{b}_{1}})=25-(50-42-3)=20\hfill \end{array}$$

From Figure 2, Figure 3 and Figure 4, Table 1 and Equations (16) and (17), we can conclude that the TH-OMP algorithm converges faster than the LS algorithm, can achieve a high estimation accuracy by K iterations with finite measurements ($m<N$) and can effectively estimate the time delay of each channel.

This paper presents a TH-OMP algorithm for MISO-FIR systems with unknown orders and time delays. The parameters, orders and time delays can be estimated simultaneously from a small number of observations. The proposed method is simple to implement, reduces the measuring cost and improves the identification efficiency. Simulation results demonstrate the performance of the proposed method.

This work was supported by the National Natural Science Foundation of China (Nos. 61203111, 61304138) and the Natural Science Foundation of Jiangsu Province, China (BK20130163).

Yanjun Liu worked on the main details of the algorithm and was responsible for writing the manuscript. Taiyang Tao helped with the MATLAB simulation.

The authors declare no conflict of interest.

- Normey-Rico, J.E.; Camacho, E.F. Control of Dead-Time Processes; Springer: Berlin, Germany, 2007.
- Liu, Y.J.; Ding, F. Convergence properties of the least squares estimation algorithm for multivariable systems. Appl. Math. Model. 2013, 37, 476–483.
- Liu, Y.J.; Ding, R. Consistency of the extended gradient identification algorithm for multi-input multi-output systems with moving average noises. Int. J. Comput. Math. 2013, 90, 1840–1852.
- Li, J.H.; Ding, F.; Hua, L. Maximum likelihood Newton recursive and the Newton iterative estimation algorithms for Hammerstein CARAR systems. Nonlinear Dyn. 2014, 75, 235–245.
- Ljung, L. System Identification: Theory for the User, 2nd ed.; Prentice Hall: Englewood Cliffs, NJ, USA, 1999.
- Ding, F. System Identification: New Theory and Methods; Science Press: Beijing, China, 2013.
- Haupt, J.; Bajwa, W.U.; Raz, G.; Nowak, R. Toeplitz compressed sensing matrices with applications to sparse channel estimation. IEEE Trans. Inf. Theory 2010, 56, 5862–5875.
- Sanandaji, B.M.; Vincent, T.L.; Wakin, M.B. Exact topology identification of large-scale interconnected dynamical systems from compressive observations. In Proceedings of the 2011 American Control Conference, San Francisco, CA, USA, 29 June–1 July 2011; pp. 649–656.
- Sanandaji, B.M.; Vincent, T.L.; Wakin, M.B.; Tóth, R. Compressive system identification of LTI and LTV ARX models. In Proceedings of the 50th IEEE Conference on Decision and Control and European Control Conference, Orlando, FL, USA, 12–15 December 2011; pp. 791–798.
- Gui, G.; Adachi, F. Stable adaptive sparse filtering algorithms for estimating multiple-input multiple-output channels. IET Commun. 2014, 8, 1032–1040.
- Gui, G.; Xu, L.; Adachi, F. Normalized least mean square-based adaptive sparse filtering algorithms for estimating multiple-input multiple-output channels. Wirel. Commun. Mob. Comput. 2015, 15, 1079–1088.
- Candès, E.J.; Romberg, J.; Tao, T. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 2006, 52, 489–509.
- Candès, E.J. Compressive sampling. In Proceedings of the 2006 International Congress of Mathematicians; European Mathematical Society: Madrid, Spain, 2006; pp. 1433–1452.
- Donoho, D.L. Compressed sensing. IEEE Trans. Inf. Theory 2006, 52, 1289–1306.
- Elad, M. Optimized projections for compressed sensing. IEEE Trans. Signal Process. 2007, 55, 5695–5702.
- Baraniuk, R. A lecture on compressive sensing. IEEE Signal Process. Mag. 2007, 24, 118–121.
- Candès, E.J.; Wakin, M.B. An introduction to compressive sampling. IEEE Signal Process. Mag. 2008, 25, 21–30.
- Tropp, J.A. Just relax: Convex programming methods for identifying sparse signals in noise. IEEE Trans. Inf. Theory 2006, 52, 1030–1051.
- Candès, E.J.; Romberg, J.; Tao, T. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 2006, 52, 489–509.
- Tóth, R.; Sanandaji, B.M.; Poolla, K.; Vincent, T.L. Compressive system identification in the linear time-invariant framework. In Proceedings of the 50th IEEE Conference on Decision and Control and European Control Conference, Orlando, FL, USA, 12–15 December 2011; pp. 783–790.
- Yang, Z.; Zhang, C.S.; Deng, J.; Lu, W.M. Orthonormal expansion ℓ1-minimization algorithms for compressed sensing. IEEE Trans. Signal Process. 2011, 59, 6285–6290.
- Le, V.L.; Lauer, F.; Bloch, G. Selective ℓ1 minimization for sparse recovery. IEEE Trans. Autom. Control 2014, 59, 3008–3013.

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).