1. INTRODUCTION

Since minimizing the number of samples required in estimation problems is crucial, compressive sensing (CS) plays an indispensable role in signal processing [1-3]. Many existing problems can be formulated in terms of CS, including sparse channel estimation [4], beam pattern synthesis [5], direction-of-arrival (DOA) estimation [6,7], and range-Doppler estimation [8]. Recovery is accomplished by finding a sparse representation of the time-domain signal over a dictionary matrix, which is often a row-sampled discrete Fourier transform (DFT) matrix. Since this kind of problem is ill-posed, various methods have been proposed. One well-known ℓ1-norm regularization technique is the iterative shrinkage-thresholding algorithm (ISTA) [9], which is used in a wide range of applications [10,11]. Since ISTA solves a convex optimization problem, it is guaranteed to converge under appropriate conditions. As an iterative solver, however, ISTA often takes too long to converge, which makes it difficult to apply in real-time applications. Moreover, pre-defined optimization parameters such as the step size and the regularization parameter must be set carefully based on prior knowledge, which can be quite challenging in some cases [12]. To address this problem, Gregor and LeCun [13] proposed a recurrent neural network (RNN) for solving sparse coding problems, named learned ISTA (LISTA). Building on the iterative structure of ISTA, they relax the traditional ISTA parameters into data-driven variables and unfold the ISTA iterations into an RNN structure. LISTA networks show improved convergence speed in both theoretical analysis and empirical results [14-16]. Many works have modified LISTA to increase its adaptability by embedding the dictionary in the network architecture. Reference [17] couples the learned matrices in LISTA by embedding the dictionary matrix into the network structure. Building on Reference [17], robust-ALISTA was proposed in Reference [18]; it explicitly computes the learned matrices by solving a coherence minimization problem. Another adaptive variant of LISTA, named Adaptive-LISTA (AdaLISTA) [19], generalizes robust-ALISTA to various model scenarios. Beyond enjoying accelerated convergence, AdaLISTA can serve different dictionaries with the same weight matrix, without retraining the whole network. However, all these LISTA-type networks are sensitive to the sparsity level of the signal. As the number of nonzero elements in the sparse signal of interest increases, LISTA and its variants suffer decreased recovery accuracy. In practical applications, the signal to be recovered may not be sufficiently sparse; an important case is the block-sparse problem, in which the nonzero elements of the unknown signal are distributed in blocks. Based on AdaLISTA, a block-sparse reconstruction network named Ada-BlockLISTA was proposed in Reference [20]; it exploits the natural structure of a block-sparse signal and recovers its nonzero blocks correctly where the original AdaLISTA fails, especially when there are many nonzero blocks. In this paper, we formulate the signal model with block structure and introduce relevant iterative algorithms for solving the block-sparse recovery problem in Section 2. Exploiting the block structure of the signal model, we introduce the Ada-BlockLISTA network in Section 3 and analyze its convergence rate as well as its support recovery guarantee. In Section 4, numerical simulations demonstrate its accelerated convergence and its significant advantage in recovery performance.
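Before ISTA is reviewed formally in Section 2, a minimal sketch may help fix ideas. The following Python/NumPy implementation of the standard ISTA iteration [9] is illustrative only: the choice of step size as the inverse of the Lipschitz constant is the standard one, and the function and variable names are ours.

```python
import numpy as np

def soft_threshold(v, theta):
    """Element-wise soft thresholding (complex-safe): shrink magnitudes by theta."""
    mag = np.abs(v)
    return np.where(mag > theta, (1 - theta / np.maximum(mag, 1e-12)) * v, 0)

def ista(Phi, y, lam, num_iters=200):
    """Standard ISTA for min_x 0.5*||y - Phi x||_2^2 + lam*||x||_1."""
    L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1], dtype=Phi.dtype)
    for _ in range(num_iters):
        grad = Phi.conj().T @ (Phi @ x - y)  # gradient of the data-fidelity term
        x = soft_threshold(x - grad / L, lam / L)
    return x
```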
2. SYSTEM MODEL AND TRADITIONAL ALGORITHMS

In this section, we introduce the signal model with block sparsity and review traditional algorithms such as ISTA and LISTA, which leverage the assumption of sparsity in the ground-truth signal.

2.1 System model

We aim to recover an unknown, sparse signal from noisy observations taken through a known, under-determined dictionary Φ ∈ ℂ^{N×M}, with N < M. In this ill-conditioned problem, the noisy observation y ∈ ℂ^N can be formulated as

y = Φx* + w,

where x* is the ground truth and w is additive random noise. Next, we define block sparsity according to Reference [21]. Suppose that a block-sparse signal x* is divided into Q blocks of length P. If there are at most K nonzero blocks (K ≪ Q) in x*, it is called K-block-sparse. Sharing the same structure as x*, the dictionary matrix Φ is also divided into Q blocks. A visualization of the block-sparse model is shown in Figure 1.

2.2 Brief review of traditional algorithms

This subsection briefly reviews traditional algorithms, such as ISTA and LISTA, designed to solve sparse problems. In the CS framework, ℓ1-regularized regression is a well-known technique for harnessing prior knowledge of sparsity. Standard ISTA solves the ℓ1-regularized optimization problem by iteratively performing Lipschitz gradient descent on the measurement error [9,22]. When the signal of interest x possesses a block-sparse structure, specialized algorithms can exploit that structure, such as Block-ISTA [20], block-OMP [23], block-based CoSaMP [24], and block-sparse Bayesian learning [25]. To compute an estimate that maintains a block-sparse structure, the original ℓ1-norm optimization problem turns into the following mixed-norm optimization problem:

x̂ = argmin_x (1/2)‖y − Φx‖₂² + λ‖x‖₂,₁, with ‖x‖₂,₁ = Σ_{q=1}^{Q} ‖x_q‖₂,

where x̂ is the estimated signal and λ is a regularization parameter controlling the sparsity penalty imposed by minimizing the ℓ2,1 norm. ISTA achieves considerable accuracy in recovering sparse signals but takes too long to converge [26]. As an unfolded version of the ISTA iterations, LISTA is an RNN containing only T layers, where each layer corresponds to one ISTA iteration [13]. To build a network that generalizes across dictionary choices, AdaLISTA, a variant of LISTA, was proposed to increase the adaptability of the deep unfolding network [19]. One layer of AdaLISTA is defined as

x(k+1) = η_θ(k)( x(k) + γ(k) Φ^H W^H (y − Φ x(k)) ),

where x(k) is the output signal of the k-th layer, η_θ denotes element-wise soft thresholding with threshold θ, W ∈ ℂ^{N×N} is a learned weight shared across layers, and θ(k), γ(k) are the learned soft threshold and step size of the k-th layer.

3. BLOCK-SPARSE RECOVERY NETWORK AND CONVERGENCE ANALYSIS

Motivated by ℓ2,1 minimization methods such as Block-ISTA, we explore the block structure of the signal model and propose our Ada-BlockLISTA network, which is derived from AdaLISTA. As Φ and x* share the same block structure, i.e., x* = [x*_1^T, x*_2^T, …, x*_Q^T]^T and Φ = [Φ_1, Φ_2, …, Φ_Q], we learn an individual weight matrix {W_q}_{1≤q≤Q} for each block and encourage block sparsity in the nonlinear shrinkage step by applying soft thresholding to each block individually. At the k-th layer of Ada-BlockLISTA, the update rule for the q-th block of the block-sparse signal is

z_q(k) = x_q(k) + γ(k) Φ_q^H W_q^H (y − Φ x(k)),
x_q(k+1) = z_q(k) · max(1 − θ(k)/‖z_q(k)‖₂, 0),

where z_q(k) is the q-th block after block-wise gradient descent, and x_q(k+1) is the q-th block of the output signal x(k+1) after block-wise soft thresholding.
In Ada-BlockLISTA, we learn an individual weight W_q for each block q, together with the layer-wise soft threshold θ(k) and step size γ(k). As shown above, each Ada-BlockLISTA layer consists of two main steps: a block-wise gradient step followed by block-wise soft thresholding.
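To make the two steps concrete, here is a minimal Python/NumPy sketch of one Ada-BlockLISTA layer implementing the update rule above. The function name, the list-of-matrices representation of {W_q}, and the scalar arguments theta and gamma are our illustrative choices, not the authors' implementation.

```python
import numpy as np

def ada_blocklista_layer(x, y, Phi, W_blocks, theta, gamma, P):
    """One Ada-BlockLISTA layer: block-wise gradient step, then block-wise
    soft thresholding. Phi has shape (N, Q*P); W_blocks is a list of Q
    learned (N, N) matrices; theta and gamma are this layer's scalars."""
    Q = Phi.shape[1] // P
    residual = y - Phi @ x                   # shared residual y - Phi x^(k)
    x_next = np.zeros_like(x)
    for q in range(Q):
        sl = slice(q * P, (q + 1) * P)
        Phi_q = Phi[:, sl]
        # Block-wise gradient step with the block's own learned weight W_q.
        z_q = x[sl] + gamma * Phi_q.conj().T @ (W_blocks[q].conj().T @ residual)
        # Block-wise soft thresholding on the l2 norm of the block;
        # blocks with norm below theta are set to zero.
        norm_q = np.linalg.norm(z_q)
        if norm_q > theta:
            x_next[sl] = (1 - theta / norm_q) * z_q
    return x_next
```

Unrolling T such layers and treating {W_q}, {θ(k)}, and {γ(k)} as trainable parameters in a deep learning framework yields the full network.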
Comparing Figures 2 and 3, we can see that Ada-BlockLISTA updates each block of x independently, whereas AdaLISTA applies the same weight across all blocks. Furthermore, AdaLISTA and Ada-BlockLISTA use the soft-threshold parameters {θ(k)}_{1≤k≤T} differently. AdaLISTA performs element-wise soft thresholding, setting every element of x(k) whose absolute value is below θ(k) to zero. Ada-BlockLISTA instead uses θ(k) as a block-wise threshold, compared against the ℓ2 norm of each block rather than the absolute values of individual elements. We now present our main theoretical results: the linear convergence rate and the support recovery guarantee of Ada-BlockLISTA. We generalize two basic notions of coherence [21], namely the mutual coherence within a block, also referred to as sub-coherence, and the block-coherence of the dictionary, which describe its local and global properties, respectively. The linear convergence rate of Ada-BlockLISTA is stated in the following theorem (more details on the convergence analysis of Ada-BlockLISTA can be found in the arXiv article: https://arxiv.org/abs/2111.09801). The proof of Theorem 1 mimics the corresponding proof steps in References [18,19] and can be viewed as an extension to the block-sparse case.

Theorem 1 (Convergence rate of Ada-BlockLISTA). Suppose the number of nonzero blocks K in x* is sufficiently small relative to the block-coherence and sub-coherence of the dictionary, and that the soft thresholds θ(k) are set in proportion to σ, the standard deviation of the noise. Then Ada-BlockLISTA converges linearly, i.e., its recovery error decreases exponentially with the number of layers.

The theorem can be interpreted in two aspects: first, the recovery error contracts at a linear rate across layers; second, with the thresholds chosen as above, the estimated support stays within the true block support, which yields the support recovery guarantee.
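For reference, the two coherence measures used above can be computed directly from their definitions in Reference [21]. The sketch below assumes unit-norm dictionary columns and equal block length P, and the function names are ours; the normalization of the block-coherence by P follows [21].

```python
import numpy as np

def block_coherence(Phi, P):
    """Block-coherence mu_B = max_{q != r} (1/P) * ||Phi_q^H Phi_r||_2,
    where ||.||_2 is the spectral norm (largest singular value)."""
    Q = Phi.shape[1] // P
    mu_B = 0.0
    for q in range(Q):
        for r in range(Q):
            if q == r:
                continue
            M = Phi[:, q*P:(q+1)*P].conj().T @ Phi[:, r*P:(r+1)*P]
            mu_B = max(mu_B, np.linalg.norm(M, 2) / P)
    return mu_B

def sub_coherence(Phi, P):
    """Sub-coherence nu: the largest inner product between distinct
    (unit-norm) columns within the same block, maximized over blocks."""
    Q = Phi.shape[1] // P
    nu = 0.0
    for q in range(Q):
        B = Phi[:, q*P:(q+1)*P]
        G = np.abs(B.conj().T @ B)   # within-block Gram matrix (magnitudes)
        np.fill_diagonal(G, 0.0)     # ignore self inner products
        nu = max(nu, G.max())
    return nu
```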
4. NUMERICAL RESULTS

We simulate experiments with various signal dimensions and numbers of nonzero blocks. In our simulations, we set P = 16 and Q = 64, and we generate observed signals according to Reference [27]. In Figures 4 and 5, we compare the convergence speed and the recovered 2D block-sparse signal planes of four methods (ISTA, Block-ISTA, AdaLISTA, and our Ada-BlockLISTA) when the block sparsity K is 2. The results show that Block-ISTA and Ada-BlockLISTA successfully reconstruct both blocks, while ISTA and AdaLISTA fail. This demonstrates that block-sparse recovery methods can recover a larger number of blocks than their non-block counterparts, as is also established by Theorem 1.

5. CONCLUSION

In this work, we considered the block-sparse signal recovery problem, in which the nonzero entries of the recovered signal occur in clusters. Ada-BlockLISTA, a block-sparse reconstruction network, exploits the block structure of the signal model. We proved that this unrolled block-sparse reconstruction network enjoys a linear convergence rate, and we provided a sufficient condition on the block sparsity that guarantees successful recovery of block-sparse signals. We performed extensive simulations to verify the recovery performance of the proposed network. The numerical results show that Ada-BlockLISTA yields better reconstruction than the original AdaLISTA network, especially when the number of nonzero blocks is large. Through theoretical and experimental analyses, we demonstrated that the explicit use of block sparsity in unfolded deep learning networks improves block-sparse recovery performance.

REFERENCES
[1] Xu, W. and Hassibi, B., "Efficient compressive sensing with deterministic guarantees using expander graphs," 2007 IEEE Information Theory Workshop, 414–419 (2007). https://doi.org/10.1109/ITW.2007.4313110
[2] Indyk, P., "Sparse recovery using sparse random matrices," [LATIN: Theoretical Informatics], 157, Springer, Berlin, Heidelberg (2010).
[3] Eldar, Y. C. and Kutyniok, G., [Compressed Sensing: Theory and Applications], Cambridge University Press (2012). https://doi.org/10.1017/CBO9780511794308
[4] Berger, C. R., Wang, Z., Huang, J. and Zhou, S., "Application of compressive sensing to sparse channel estimation," IEEE Communications Magazine, 48(11), 164–174 (2010). https://doi.org/10.1109/MCOM.2010.5621984
[5] Xiong, J. and Wang, W., "Sparse reconstruction-based beampattern synthesis for multi-carrier frequency diverse array antenna," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 3395–3398 (2017).
[6] Balakrishnan, R. D. and Kwon, H. M., "A new inverse problem based approach for azimuthal DOA estimation," GLOBECOM'03 Global Telecommunications Conference, 2187–2191 (2003).
[7] Xenaki, A., Gerstoft, P. and Mosegaard, K., "Compressive beamforming," The Journal of the Acoustical Society of America, 136(1), 260 (2014). https://doi.org/10.1121/1.4883360
[8] Huang, T. and Liu, Y., "Compressed sensing for a frequency agile radar with performance guarantees," IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP), 1057–1061 (2015).
[9] Beck, A. and Teboulle, M., "A fast iterative shrinkage-thresholding algorithm for linear inverse problems," SIAM J. Imaging Sciences, 2(1), 183–202 (2009). https://doi.org/10.1137/080716542
[10] Combettes, P. L. and Wajs, V. R., "Signal recovery by proximal forward-backward splitting," Multiscale Model. Simul., 4(4), 1168–1200 (2006). https://doi.org/10.1137/050626090
[11] Antonello, N., Stella, L., Patrinos, P. and van Waterschoot, T., "Proximal gradient algorithms: Applications in signal processing," arXiv preprint arXiv:1803.01621 (2018).
[12] Zhang, J. and Ghanem, B., "ISTA-Net: Iterative shrinkage-thresholding algorithm inspired deep network for image compressive sensing," arXiv:1706.07929 (2017).
[13] Gregor, K. and LeCun, Y., "Learning fast approximations of sparse coding," International Conference on Machine Learning, 399–406 (2010).
[14] Borgerding, M., Schniter, P. and Rangan, S., "AMP-inspired deep networks for sparse linear inverse problems," IEEE Transactions on Signal Processing, 65(16), 4293–4308 (2017). https://doi.org/10.1109/TSP.2017.2708040
[15] Borgerding, M. and Schniter, P., "Onsager-corrected deep learning for sparse linear inverse problems," 2016 IEEE Global Conference on Signal and Information Processing, 227–231 (2016).
[16] Fu, R., Huang, T., Liu, Y. and Eldar, Y. C., "Compressed LISTA exploiting Toeplitz structure," 2019 IEEE Radar Conference (RadarConf), 1–6 (2019).
[17] Chen, X., Liu, J., Wang, Z. and Yin, W., "Theoretical linear convergence of unfolded ISTA and its practical weights and thresholds," [Advances in Neural Information Processing Systems], Curran Associates, Inc. (2018).
[18] Liu, J., Chen, X., Wang, Z. and Yin, W., "ALISTA: Analytic weights are as good as learned weights in LISTA," International Conference on Learning Representations (2019).
[19] Aberdam, A., Golts, A. and Elad, M., "Ada-LISTA: Learned solvers adaptive to varying models," arXiv:2001.08456 (2020).
[20] Fu, R., Monardo, V., Huang, T. and Liu, Y., "Deep unfolding network for block-sparse signal recovery," 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2880–2884 (2021).
[21] Eldar, Y. C., Kuppinger, P. and Bolcskei, H., "Block-sparse signals: Uncertainty relations and efficient recovery," IEEE Transactions on Signal Processing, 58(6), 3042–3054 (2010). https://doi.org/10.1109/TSP.2010.2044837
[22] Wu, K., Guo, Y., Li, Z. and Zhang, C., "Sparse coding with gated learned ISTA," International Conference on Learning Representations (2020).
[23] Eldar, Y. C., Kuppinger, P. and Bolcskei, H., "Block-sparse signals: Uncertainty relations and efficient recovery," IEEE Transactions on Signal Processing, 58(6), 3042–3054 (2010). https://doi.org/10.1109/TSP.2010.2044837
[24] Baraniuk, R. G., Cevher, V., Duarte, M. F. and Hegde, C., "Model-based compressive sensing," IEEE Transactions on Information Theory, 56(4), 1982–2001 (2010). https://doi.org/10.1109/TIT.2010.2040894
[25] Zhang, Z. and Rao, B. D., "Sparse signal recovery with temporally correlated source vectors using sparse Bayesian learning," IEEE Journal of Selected Topics in Signal Processing, 5(5), 912–926 (2011). https://doi.org/10.1109/JSTSP.2011.2159773
[26] Draganic, A., Orovic, I. and Stankovic, S., "On some common compressive sensing recovery algorithms and applications—Review paper," Facta Universitatis, Series: Electronics and Energetics, 30, 477–510 (2017).
[27] Chi, Y. and Chen, Y., "Compressive two-dimensional harmonic retrieval via atomic norm minimization," IEEE Transactions on Signal Processing, 63(4), 1030–1042 (2015). https://doi.org/10.1109/TSP.2014.2386283