
WXGCB: A Clustering Prior Weighting Semi-Supervised Learning Method Based on Space Level Constraint and Mixed Variable Metrics


DOI: 10.23977/acss.2023.070806

Author(s)

Xinguang Wang 1

Affiliation(s)

1 School of Information and Electronic Technology, Key Laboratory of Autonomous Intelligence and Information Processing in Heilongjiang Province, Jiamusi University, Jiamusi, China

Corresponding Author

Xinguang Wang

ABSTRACT

This paper proposes WXGCB, a clustering-prior-weighted semi-supervised learning method that combines the characteristics of cluster-then-label and space-level-constraint semi-supervised methods. WXGCB uses mixed-variable information, data prior information, and clustering prior information obtained from different clustering algorithms to adjust the distance matrix, thereby transforming supervised learning algorithms into semi-supervised learning algorithms and improving their prediction accuracy. Because WXGCB requires no internal modification of the clustering or supervised learning algorithms it employs, it can flexibly combine different clustering and supervised learning algorithms to find pairings that compensate for each other's shortcomings, and it can easily convert a wide range of supervised learning algorithms into semi-supervised ones. To verify its effectiveness, WXGCB was used to transform two supervised learning algorithms, KSNN and DBGLM, into semi-supervised mixed-variable learning algorithms, SMKSNN and SMGLM, which were compared against two other semi-supervised learning algorithms on six benchmark datasets.
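The core idea described above, shrinking entries of a mixed-variable distance matrix for point pairs that a clustering assigns to the same cluster, so that any distance-based supervised learner then benefits from the unlabeled data, can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the Gower-style distance, the single shrink factor `w`, and the helper names (`gower_distance`, `weighted_distance_matrix`, `knn_predict`) are assumptions made for the example.

```python
# Illustrative sketch of the cluster-then-weight idea: a clustering prior
# shrinks within-cluster distances (a space-level constraint), after which
# an ordinary distance-based learner (here, k-NN) is run unchanged.
# The distance, shrink factor, and function names are assumptions.

def gower_distance(a, b, numeric_ranges):
    """Mixed-variable distance: range-normalised difference for numeric
    features, 0/1 mismatch for categorical ones (in the spirit of Gower)."""
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_ranges:                  # numeric feature
            rng = numeric_ranges[i]
            total += abs(x - y) / rng if rng else 0.0
        else:                                    # categorical feature
            total += 0.0 if x == y else 1.0
    return total / len(a)

def weighted_distance_matrix(X, clusters, numeric_ranges, w=0.5):
    """Build the pairwise distance matrix, shrinking distances between
    points the clustering prior places in the same cluster."""
    n = len(X)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = gower_distance(X[i], X[j], numeric_ranges)
            if clusters[i] == clusters[j]:
                d *= w                           # clustering-prior weighting
            D[i][j] = D[j][i] = d
    return D

def knn_predict(D, labels, k=3):
    """Each unlabeled point (label None) takes the majority label of its
    k nearest *labeled* neighbours under the adjusted matrix."""
    preds = list(labels)
    labeled = [i for i, y in enumerate(labels) if y is not None]
    for i, y in enumerate(labels):
        if y is None:
            nearest = sorted(labeled, key=lambda j: D[i][j])[:k]
            votes = [labels[j] for j in nearest]
            preds[i] = max(set(votes), key=votes.count)
    return preds

# Tiny mixed-variable example: one numeric feature, one categorical.
X = [(0.1, 'a'), (0.2, 'a'), (0.9, 'b'), (1.0, 'b')]
numeric_ranges = {0: 0.9}        # feature 0 is numeric with range 0.9
clusters = [0, 0, 1, 1]          # output of any clustering algorithm
labels = ['pos', None, None, 'neg']
D = weighted_distance_matrix(X, clusters, numeric_ranges)
print(knn_predict(D, labels, k=1))   # → ['pos', 'pos', 'neg', 'neg']
```

Because the weighting happens entirely inside the distance matrix, neither the clustering step nor the downstream learner is modified internally, which is what lets the approach pair arbitrary clustering and supervised algorithms.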

KEYWORDS

Semi-supervised learning, mixed variable, space-level constraints, clustering prior

CITE THIS PAPER

Xinguang Wang, WXGCB: A Clustering Prior Weighting Semi-Supervised Learning Method Based on Space Level Constraint and Mixed Variable Metrics. Advances in Computer, Signals and Systems (2023) Vol. 7: 42-52. DOI: http://dx.doi.org/10.23977/acss.2023.070806.





All published work is licensed under a Creative Commons Attribution 4.0 International License.
