Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (18): 78-89. DOI: 10.3778/j.issn.1002-8331.2203-0037

• Theory, Research and Development •

Robust Least Squares Twin Support Vector Machine and Its Sparse Algorithm

JIN Qifan, CHEN Li, XU Mingliang, JIANG Xiaoheng   

  1. College of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
  2. Institute of Physical Education (Main Campus), Zhengzhou University, Zhengzhou 450001, China
  • Online: 2022-09-15    Published: 2022-09-15


Abstract: The least squares twin support vector machine (LSTSVM) solves two linear programming problems instead of complex quadratic programming problems, which makes it simple to compute and fast to train. However, the hyperplanes obtained by LSTSVM are easily affected by outliers, and its solution lacks sparsity. To address this problem, a robust least squares twin support vector machine (R-LSTSVM) based on a truncated least squares loss is proposed, and the robustness of the new model to outliers is verified theoretically. Furthermore, to handle large-scale datasets, a sparse solution of the R-LSTSVM is derived from the representation theorem and incomplete Cholesky decomposition, yielding a sparse R-LSTSVM algorithm suitable for large-scale datasets containing outliers. Numerical experiments show that, compared with existing algorithms, the classification accuracy, sparsity, and convergence speed of the new algorithm are improved by 1.97%-37.7%, 26-199 times, and 6.6-2,027.4 times, respectively.
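The paper's exact formulation is not reproduced on this page, but a common form of truncated (capped) least squares loss, sketched here for illustration with a hypothetical truncation parameter c > 0, caps the penalty that any single sample can contribute, which is what keeps a far-away outlier from dominating the objective:

$$
L_c(r) \;=\; \min\{r^2,\, c^2\} \;=\;
\begin{cases}
r^2, & |r| \le c,\\
c^2, & |r| > c,
\end{cases}
$$

where r denotes the residual of a sample with respect to one of the two hyperplanes. For |r| ≤ c the loss coincides with the ordinary squared loss used by LSTSVM; beyond c it is constant, so the influence of an outlier on the fitted hyperplane is bounded.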

Key words: robust least squares twin support vector machine, truncated least squares loss function, incomplete Cholesky decomposition, representation theorem, sparse solution
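As an illustration of the low-rank machinery named in the key words (not the authors' implementation, which is not given on this page), the sketch below computes a pivoted incomplete Cholesky factorization of a kernel matrix, K ≈ G Gᵀ, evaluating only one kernel column per step; combined with the representation theorem, such a factorization is what allows a kernel model to be expressed on a small set of pivot samples and hence admit a sparse solution. All function and variable names are hypothetical.

import numpy as np

def rbf_kernel_column(X, x, gamma=1.0):
    # One column of the RBF kernel matrix: k(x_i, x) for every row x_i of X.
    return np.exp(-gamma * np.sum((X - x) ** 2, axis=1))

def incomplete_cholesky(X, kernel_column, max_rank, tol=1e-6):
    # Pivoted incomplete Cholesky factorization of the kernel matrix K.
    # Returns G (n x r, r <= max_rank) with K ~= G @ G.T and the pivot indices
    # (the samples on which a sparse kernel model can be expressed).
    # The full n x n kernel matrix is never formed.
    n = X.shape[0]
    G = np.zeros((n, max_rank))
    diag = np.array([kernel_column(X[i:i + 1], X[i])[0] for i in range(n)])  # k(x_i, x_i)
    pivots = []
    for j in range(max_rank):
        i = int(np.argmax(diag))
        if diag[i] <= tol:                 # remaining approximation error is small enough
            return G[:, :j], pivots
        pivots.append(i)
        col = kernel_column(X, X[i])       # i-th column of K, computed on demand
        G[:, j] = (col - G[:, :j] @ G[i, :j]) / np.sqrt(diag[i])
        diag -= G[:, j] ** 2
        diag[diag < 0] = 0.0               # guard against round-off
    return G, pivots

# Toy usage on synthetic data: approximate a 1000 x 1000 RBF kernel matrix by rank <= 50.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
G, pivots = incomplete_cholesky(X, rbf_kernel_column, max_rank=50)

In this sketch the model would only need the 50 pivot samples rather than all 1,000 training points, which is the sense in which the factorization supports a sparse solution and makes training on large-scale data tractable.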
