SVM-Based Prediction of Protein Methylation Sites: A Comprehensive Analysis of 553 Properties from the AAindex Database
DOI: 10.23977/medbm.2024.020218
Author(s)
Xi Su 1, Mingjun Tang 1, Zipin Zhao 1, Ning Zhang 1
Affiliation(s)
1 Tianjin Key Laboratory of Brain Science and Neuroengineering, Department of Biomedical Engineering, Medical School of Tianjin University, Tianjin, 300072, China
Corresponding Author
Ning Zhang
ABSTRACT
Identifying protein methylation sites experimentally is a challenging and costly task, leading to increased reliance on machine learning-based computational predictors to enhance efficiency. This study aims to improve these predictors through a comprehensive analysis of 553 properties from the AAindex database. We employed support vector machine (SVM) models and utilized 10-fold cross-validation for model evaluation to identify optimal feature combinations for predicting lysine and arginine methylation. The results indicate that the feature set "RACS820104+FUKS010109" yielded the highest performance for lysine methylation, with a Recall (Re) of 71.11%, Precision (Pre) of 75.68%, Accuracy (Acc) of 74.12%, and a Matthews Correlation Coefficient (MCC) of 0.48. For arginine methylation, the feature set "BAEK050101+CHAM810101" achieved a Recall (Re) of 74.60%, Precision (Pre) of 81.08%, Accuracy (Acc) of 78.60%, and an MCC of 0.57. Furthermore, this study explores hydrophobicity as a potentially valuable property for distinguishing methylation from malonylation. This thorough analysis enhances our understanding of the available physicochemical properties, which could lead to the development of more accurate and reliable prediction models.
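The evaluation described in the abstract can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper: the two property scales are random placeholders rather than real AAindex entries, the peptides and labels are synthetic, the window length of 11 residues is arbitrary, and the RBF kernel with feature standardization is only one plausible SVM configuration. The sketch encodes each peptide window with a pair of per-residue scales, runs 10-fold cross-validation with scikit-learn, and derives Re, Pre, Acc, and MCC from the pooled confusion matrix.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def make_placeholder_scale(seed):
    """Hypothetical stand-in for one AAindex entry: one value per residue.

    These numbers are random and are NOT real AAindex values.
    """
    rng = np.random.default_rng(seed)
    return dict(zip(AMINO_ACIDS, rng.normal(size=len(AMINO_ACIDS))))


def encode(peptides, scales):
    """Map each residue to its value under every scale and concatenate."""
    return np.array([[s[aa] for s in scales for aa in pep] for pep in peptides])


def evaluate(X, y, n_splits=10, seed=0):
    """Pool 10-fold cross-validated SVM predictions and compute four metrics."""
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))  # assumed setup
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    pred = cross_val_predict(model, X, y, cv=cv)
    tn, fp, fn, tp = (float(v) for v in
                      confusion_matrix(y, pred, labels=[0, 1]).ravel())
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Re": tp / (tp + fn) if tp + fn else 0.0,
        "Pre": tp / (tp + fp) if tp + fp else 0.0,
        "Acc": (tp + tn) / (tp + tn + fp + fn),
        "MCC": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Synthetic data: 200 windows of 11 residues centred on lysine (K),
# with arbitrary alternating labels (1 = modified, 0 = unmodified).
rng = np.random.default_rng(0)
residues = list(AMINO_ACIDS)
peptides = [
    "".join(rng.choice(residues, size=5)) + "K" + "".join(rng.choice(residues, size=5))
    for _ in range(200)
]
labels = np.array([i % 2 for i in range(200)])

# A two-scale feature set, analogous in form to "RACS820104+FUKS010109".
scales = [make_placeholder_scale(1), make_placeholder_scale(2)]
X = encode(peptides, scales)
metrics = evaluate(X, labels)
print(metrics)
```

Because the labels here are arbitrary, the printed scores should hover around chance level. Searching for a good feature combination would amount to repeating `evaluate` over candidate pairs of real AAindex scales and comparing the resulting metrics.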
KEYWORDS
Protein Methylation; Support Vector Machine (SVM); AAindex Database; Feature Selection
CITE THIS PAPER
Xi Su, Mingjun Tang, Zipin Zhao, Ning Zhang, SVM-Based Prediction of Protein Methylation Sites: A Comprehensive Analysis of 553 Properties from the AAindex Database. MEDS Basic Medicine (2024) Vol. 2: 126-135. DOI: http://dx.doi.org/10.23977/medbm.2024.020218.