Wang R, Xiu N, Zhou S, 2021, An extended Newton-type algorithm for ℓ2-regularized sparse logistic regression and its efficiency for classifying large-scale datasets, Journal of Computational and Applied Mathematics, Vol: 397, Pages: 113656-113656, ISSN: 0377-0427
Sun J, Kong L, Zhou S, 2021, Gradient projection newton algorithm for sparse collaborative learning, Publisher: arXiv
Exploring the relationship among multiple sets of data from the same group enables practitioners to make better decisions in medical science and engineering. In this paper, we propose a sparse collaborative learning (SCL) model, an optimization problem with double-sparsity constraints, to handle problems with two sets of data and a shared response variable. It is capable of dealing with classification or regression problems, depending on the discreteness of the response variable, while simultaneously exploring the relationship between the two datasets. To solve SCL, we first present some necessary and sufficient optimality conditions and then design a gradient projection Newton algorithm, which is proven to converge globally to a unique locally optimal solution with at least a quadratic convergence rate. Finally, the reported numerical experiments illustrate the efficiency of the proposed method.
Wang H, Shao Y, Zhou S, et al., 2021, Support vector machine classifier via L0/1 soft-margin loss, IEEE Transactions on Pattern Analysis and Machine Intelligence, Pages: 1-12, ISSN: 0162-8828
Support vector machines (SVMs) have drawn wide attention over the last two decades due to their extensive applications, and a vast body of work has developed optimization algorithms to solve SVMs with various soft-margin losses. In this paper, we aim at solving an ideal soft-margin loss SVM: the L0/1 soft-margin loss SVM (dubbed L0/1-SVM). Many existing (non)convex soft-margin losses can be viewed as surrogates of the L0/1 soft-margin loss. Despite its discrete nature, we manage to establish the optimality theory for the L0/1-SVM, including the existence of optimal solutions and their relationship with P-stationary points. These results not only enable us to deliver a rigorous definition of L0/1 support vectors but also allow us to define a working set. Integrating such a working set, a fast alternating direction method of multipliers is then proposed whose limit point is a locally optimal solution to the L0/1-SVM. Finally, numerical experiments demonstrate that our proposed method outperforms some leading classification solvers from the SVM community in terms of faster computational speed and fewer support vectors. The larger the data size, the more evident its advantage.
Zhou S, Luo Z, Xiu N, 2021, Computing one-bit compressive sensing via double-sparsity constrained optimization, Publisher: arXiv
One-bit compressive sensing is popular in signal processing and communications due to its low storage cost and hardware complexity. However, recovering the signal is a challenging task since only the one-bit (sign) information is available. In this paper, we formulate one-bit compressive sensing as a double-sparsity constrained optimization problem. The first-order optimality conditions via the newly introduced τ-stationarity for this nonconvex and discontinuous problem are established, based on which a gradient projection subspace pursuit (GPSP) approach with global convergence and a fast convergence rate is proposed. Numerical experiments against other leading solvers illustrate the high efficiency of our proposed algorithm in terms of computation time and quality of signal recovery.
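As a rough illustration of the one-bit measurement model described in the abstract above, the sketch below simulates sign-only observations and recovers the signal direction with a simple back-projection-plus-hard-thresholding baseline. This is an illustrative stand-in, not the paper's GPSP algorithm; all names and constants are our own.

```python
import numpy as np

# Simulate one-bit (sign-only) compressive measurements of a sparse signal.
rng = np.random.default_rng(1)
n, m, s = 50, 200, 3
x = np.zeros(n)
x[[5, 17, 33]] = [1.0, -0.7, 0.5]
x /= np.linalg.norm(x)            # sign measurements lose scale: recover direction only
Phi = rng.standard_normal((m, n))
y = np.sign(Phi @ x)              # only the sign bits are observed

# Naive baseline: back-project the signs, keep the s largest entries, renormalize.
z = Phi.T @ y / m
x_hat = np.zeros(n)
keep = np.argsort(np.abs(z))[-s:]
x_hat[keep] = z[keep]
x_hat /= np.linalg.norm(x_hat)

print(float(x_hat @ x))           # correlation with the true direction
```

With enough Gaussian measurements, the back-projection estimate is well correlated with the true direction, which is all one can hope to recover from sign bits alone.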
Zhou S, 2021, Sparse SVM for sufficient data reduction, IEEE Transactions on Pattern Analysis and Machine Intelligence, Pages: 1-11, ISSN: 0162-8828
Kernel-based methods for support vector machines (SVM) have shown highly advantageous performance in various applications. However, they may incur prohibitive computational costs for large-scale sample datasets. Therefore, data reduction (reducing the number of support vectors) appears to be necessary, which gives rise to the topic of the sparse SVM. Motivated by this problem, this paper considers sparsity-constrained kernel SVM optimization in order to control the number of support vectors. Based on the established optimality conditions associated with the stationary equations, a Newton-type method is developed to handle the sparsity-constrained optimization. This method is found to enjoy the one-step convergence property if the starting point is chosen close to a local region of a stationary point, thereby leading to very high computational speed. Numerical comparisons with several powerful solvers demonstrate that the proposed method performs exceptionally well, particularly for large-scale datasets, in terms of far fewer support vectors and shorter computation time.
Zhou S, Pan L, Xiu N, et al., 2021, Quadratic convergence of smoothing Newton's method for 0/1 loss optimization
It has been widely recognized that the 0/1 loss function is one of the most natural choices for modeling classification errors, and it has a wide range of applications including support vector machines and 1-bit compressed sensing. Due to the combinatorial nature of the 0/1 loss function, methods based on convex relaxations or smoothing approximations have dominated the existing research and are often able to provide approximate solutions of good quality. However, those methods do not optimize the 0/1 loss function directly, and hence no optimality has been established for the original problem. This paper aims to study the optimality conditions of 0/1 loss minimization and, for the first time, to develop a Newton's method that directly optimizes the 0/1 loss with local quadratic convergence under reasonable conditions. Extensive numerical experiments demonstrate its superior performance as one would expect from Newton-type methods.
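For concreteness, here is a minimal example of the 0/1 loss alongside the hinge loss, one of the common convex surrogates mentioned in the abstract above (our own illustrative snippet, not code from the paper):

```python
import numpy as np

def zero_one_loss(margins):
    """0/1 loss: 1 when the margin y*f(x) is non-positive (a misclassification), else 0."""
    return (margins <= 0).astype(float)

def hinge_loss(margins):
    """Hinge loss, a convex surrogate that upper-bounds the 0/1 loss."""
    return np.maximum(0.0, 1.0 - margins)

margins = np.array([-0.5, 0.2, 1.5])
print(zero_one_loss(margins))  # [1. 0. 0.]
print(hinge_loss(margins))     # [1.5 0.8 0. ]
```

The 0/1 loss is piecewise constant (hence combinatorial), which is exactly why its direct optimization is hard and surrogates like the hinge have dominated practice.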
Zhou S, Pan L, Xiu N, 2021, Newton method for ℓ0-regularized optimization, Numerical Algorithms, Vol: 88, Pages: 1541-1570, ISSN: 1017-1398
As a tractable approach, regularization is frequently adopted in sparse optimization. This gives rise to regularized optimization, which aims to minimize the ℓ0 norm or its continuous surrogates that characterize sparsity. From the continuity of the surrogates to the discreteness of the ℓ0 norm, the most challenging model is ℓ0-regularized optimization. There is an impressive body of work on the development of numerical algorithms to overcome this challenge. However, most of the developed methods only ensure either that the (sub)sequence converges to a stationary point from the deterministic optimization perspective, or that the distance between each iterate and any given sparse reference point is bounded by an error bound in the sense of probability. In this paper, we develop a Newton-type method for ℓ0-regularized optimization and prove that the generated sequence converges to a stationary point globally and quadratically under standard assumptions, theoretically explaining why our method can perform surprisingly well.
Zhou S, Shang M, Pan L, et al., 2021, Newton hard-thresholding pursuit for sparse linear complementarity problem via a new merit function, SIAM Journal on Scientific Computing, Vol: 43, Pages: A772-A799, ISSN: 1064-8275
Zemkoho AB, Zhou S, 2021, Theoretical and numerical comparison of the Karush–Kuhn–Tucker and value function reformulations in bilevel optimization, Computational Optimization and Applications, Vol: 78, Pages: 625-674, ISSN: 0926-6003
Zhou S, Xiu N, Qi H-D, et al., 2021, Global and quadratic convergence of Newton hard-thresholding pursuit, Journal of Machine Learning Research, Vol: 22, ISSN: 1532-4435
Algorithms based on the hard-thresholding principle have been well studied, with sound theoretical guarantees, in compressed sensing and, more generally, sparsity-constrained optimization. It is widely observed in existing empirical studies that when a restricted Newton step is used (as the debiasing step), hard-thresholding algorithms tend to meet halting conditions in significantly fewer iterations and are very efficient. Hence, the resulting Newton hard-thresholding algorithms call for stronger theoretical guarantees than their simple hard-thresholding counterparts. This paper provides a theoretical justification for the use of the restricted Newton step. We build our theory and algorithm, Newton Hard-Thresholding Pursuit (NHTP), for sparsity-constrained optimization. Our main result shows that NHTP is quadratically convergent under the standard assumption of restricted strong convexity and smoothness. We also establish its global convergence to a stationary point under a weaker assumption. In the special case of compressive sensing, NHTP effectively reduces to some of the existing hard-thresholding algorithms with a Newton step. Consequently, our fast convergence result justifies why those algorithms perform better than their counterparts without the Newton step. The efficiency of NHTP was demonstrated on both synthetic and real data in compressed sensing and sparse logistic regression.
Zhou S, Xiu N, Qi H-D, 2020, Robust Euclidean embedding via EDM optimization, Mathematical Programming Computation, Vol: 12, Pages: 337-387, ISSN: 1867-2949
Zhou S, Pan L, Xiu N, 2020, Heaviside Set Constrained Optimization: Optimality and Newton Method, arXiv
Data in the real world frequently involve binary status: truth or falsehood, positiveness or negativeness, similarity or dissimilarity, spam or non-spam, to name a few, with applications in regression, classification problems, and so on. To characterize binary status, one of the ideal functions is the Heaviside step function, which returns one for one status and zero for the other; hence it is discontinuous. Because of this, conventional approaches to dealing with binary status benefit tremendously from its continuous surrogates. In this paper, we target the Heaviside step function directly and study Heaviside set constrained optimization: calculating the tangent and normal cones of the feasible set, establishing several first-order sufficient and necessary optimality conditions, and developing a Newton-type method that enjoys locally quadratic convergence and excellent numerical performance.
Li X, Xiu N, Zhou S, 2020, Matrix Optimization Over Low-Rank Spectral Sets: Stationary Points and Local and Global Minimizers, Journal of Optimization Theory and Applications, Vol: 184, Pages: 895-930, ISSN: 0022-3239
Zhou S, Xiu N, Qi H-D, 2018, A Fast Matrix Majorization-Projection Method for Penalized Stress Minimization With Box Constraints, IEEE Transactions on Signal Processing, Vol: 66, Pages: 4331-4346, ISSN: 1053-587X
Pan L, Zhou S, Xiu N, et al., 2017, A convergent iterative hard thresholding for sparsity and nonnegativity constrained optimization, Pacific Journal of Optimization, Vol: 13, Pages: 325-353, ISSN: 1348-9151
Zhang L, Kong L, Li Y, et al., 2017, A smoothing iterative method for quantile regression with nonconvex Lp penalty, Journal of Industrial & Management Optimization, Vol: 13, Pages: 93-112, ISSN: 1553-166X
Zhou S, Xiu N, Wang Y, et al., 2016, A null-space-based weighted ℓ1 minimization approach to compressed sensing, Information and Inference, Vol: 5, Pages: 76-102
It has become an established fact that constrained ℓ1 minimization is capable of recovering the sparse solution from a small number of linear observations, and that the reweighted version can significantly improve its numerical performance. The recoverability is closely related to the Restricted Isometry Constant (RIC) of order s (s an integer), often denoted δs. A class of sufficient conditions for successful k-sparse signal recovery often takes the form δtk < c for some t ≥ 1 and a constant c. When t > 1, such a bound is often called a RIC bound of high order. There exist a number of such bounds on RICs, high order or not. For example, a high-order bound was recently given by Cai & Zhang (2014, IEEE Trans. Inform. Theory, 60, 122-132): δtk < √((t-1)/t), and this bound is known to be sharp for t ≥ 4/3. In this paper, we propose a new weighted ℓ1 minimization which only requires the following RIC bound that is more relaxed (i.e., bigger) than the above-mentioned bound: (Equation Presented), where t > 1 and 0 < ω ≤ 1 is determined by two optimizations of a similar type over the null space of the linear observation operator. In tackling the combinatorial nature of the two optimization problems, we develop a reweighted ℓ1 minimization that yields a sequence of approximate solutions enjoying strong convergence properties. Moreover, the numerical performance of the proposed method is very satisfactory when compared with some of the state-of-the-art methods in compressed sensing.
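As background for the abstract above, the unweighted constrained ℓ1 minimization it builds on can be posed as a linear program via the standard positive/negative split. The sketch below is our own illustration using SciPy's `linprog`, not the paper's weighted or reweighted scheme.

```python
import numpy as np
from scipy.optimize import linprog

def l1_min(A, b):
    """Solve min ||x||_1 s.t. Ax = b via the LP split x = xp - xn with xp, xn >= 0."""
    m, n = A.shape
    c = np.ones(2 * n)                        # objective: sum(xp) + sum(xn) = ||x||_1
    A_eq = np.hstack([A, -A])                 # A @ (xp - xn) = b
    res = linprog(c, A_eq=A_eq, b_eq=b,
                  bounds=[(0, None)] * (2 * n), method="highs")
    return res.x[:n] - res.x[n:]

rng = np.random.default_rng(7)
A = rng.standard_normal((30, 60))
x_true = np.zeros(60)
x_true[[2, 25, 50]] = [1.0, -2.0, 0.5]
x_hat = l1_min(A, A @ x_true)
print(np.max(np.abs(x_hat - x_true)))   # recovery error, up to solver tolerance
```

With 30 Gaussian measurements of a 3-sparse vector in dimension 60, the RIC-type conditions discussed above are comfortably satisfied with high probability, so the LP recovers the sparse solution exactly.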
Pan L-L, Xiu N-H, Zhou S-L, 2015, On Solutions of Sparsity Constrained Optimization, Journal of the Operations Research Society of China, Vol: 3, Pages: 421-439, ISSN: 2194-668X
Shang M, Zhang C, Peng D, et al., 2015, A half thresholding projection algorithm for sparse solutions of LCPs, Optimization Letters, Vol: 9, Pages: 1231-1245, ISSN: 1862-4472
Zhou S, Kong L, Xiu N, 2013, New Bounds for RIC in Compressed Sensing, Journal of the Operations Research Society of China, Vol: 1, Pages: 227-237, ISSN: 2194-668X
This paper gives new bounds for the restricted isometry constant (RIC) in compressed sensing. Let Φ be an m×n real matrix and k a positive integer with k ≤ n. The main results of this paper show that if the restricted isometry constant of Φ satisfies δ8ak < 1 and (Formula presented.) for a > 3/8, then the k-sparse solution can be recovered exactly via ℓ1 minimization in the noiseless case. In particular, when a = 1, 1.5, 2, and 3, we have δ2k < 0.5746 and δ8k < 1, or δ2.5k < 0.7046 and δ12k < 1, or δ3k < 0.7731 and δ16k < 1, or δ4k < 0.8445 and δ24k < 1.
Zhou S, Li GY, Communication-Efficient ADMM-based Federated Learning
Federated learning has advanced over the last few years but is facing many challenges, such as how algorithms save communication resources, how they reduce computational costs, and whether they converge. To address these issues, this paper proposes exact and inexact ADMM-based federated learning. These methods are not only communication-efficient but also converge linearly under very mild conditions, requiring neither convexity nor any assumption on data distributions. Moreover, the inexact version has low computational complexity, thereby alleviating the computational burden significantly.
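As a generic illustration of the ADMM mechanics underlying such schemes, here is a standard consensus-ADMM sketch for distributed least squares. This is not the paper's exact or inexact algorithm; all names and parameters are our own.

```python
import numpy as np

def consensus_admm(datasets, rho=1.0, iters=500):
    """Consensus ADMM for min_x sum_i ||A_i x - b_i||^2 split across clients.

    Each client updates a local copy x_i, the server averages the copies,
    and scaled dual variables u_i push the copies toward agreement.
    """
    d = datasets[0][0].shape[1]
    x = [np.zeros(d) for _ in datasets]              # local primal variables
    u = [np.zeros(d) for _ in datasets]              # scaled dual variables
    z = np.zeros(d)                                  # global consensus variable
    for _ in range(iters):
        for i, (A, b) in enumerate(datasets):
            # local step: argmin ||A v - b||^2 + (rho/2)||v - z + u_i||^2
            x[i] = np.linalg.solve(2 * A.T @ A + rho * np.eye(d),
                                   2 * A.T @ b + rho * (z - u[i]))
        z = np.mean([xi + ui for xi, ui in zip(x, u)], axis=0)   # server average
        for i in range(len(datasets)):
            u[i] += x[i] - z                         # dual ascent step
    return z

rng = np.random.default_rng(3)
x_true = rng.standard_normal(5)
datasets = []
for _ in range(3):
    A = rng.standard_normal((20, 5))
    datasets.append((A, A @ x_true))
z = consensus_admm(datasets)
print(np.linalg.norm(z - x_true))   # consensus error against the global solution
```

Only the local copies, the average, and the dual variables cross the network, which is the source of the communication efficiency that ADMM-style federated methods exploit.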
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.