Indexed in SCIE, Scopus, PubMed & PMC
pISSN 1226-4512 eISSN 2093-3827

Original Article

Korean J Physiol Pharmacol 2024; 28(6): 527-537

Published online November 1, 2024 https://doi.org/10.4196/kjpp.2024.28.6.527

Copyright © Korean J Physiol Pharmacol.

Predicting antioxidant activity of compounds based on chemical structure using machine learning methods

Jinwoo Jung1,2, Jeon-Ok Moon1, Song Ih Ahn2,*, and Haeseung Lee1,*

1Department of Pharmacy, College of Pharmacy and Research Institute for Drug Development, 2School of Mechanical Engineering, Pusan National University, Busan 46241, Korea

Correspondence to: Song Ih Ahn
E-mail: songihahn@pusan.ac.kr
Haeseung Lee
E-mail: haeseung@pusan.ac.kr

Author contributions: J.J., Investigation, Data curation, Methodology, Visualization, Writing - Original Draft; J.O.M., Conceptualization, Writing - Original Draft; S.I.A., Supervision; H.L., Conceptualization, Supervision, Funding acquisition, Writing - Review & Editing.

Received: May 7, 2024; Revised: July 12, 2024; Accepted: July 12, 2024

Abstract

Oxidative stress is a well-established risk factor for numerous chronic diseases, which underscores the need to identify potent antioxidants efficiently. Conventional methods for assessing antioxidant properties are often time-consuming and resource-intensive, typically relying on laborious biochemical assays. In this study, we investigated the applicability of machine learning (ML) algorithms for predicting the antioxidant activity of compounds based solely on their molecular structure. We evaluated the performance of five ML algorithms: Support Vector Machine (SVM), Logistic Regression (LR), XGBoost, Random Forest (RF), and Deep Neural Network (DNN), using a dataset of over 1,900 compounds with experimentally determined antioxidant activity. RF and SVM achieved the best overall performance, with accuracy above 0.9, and effectively distinguished active from inactive compounds even when these were structurally highly similar. External validation using natural product data from the BATMAN database confirmed the generalizability of the RF and SVM models. Our results suggest that ML models can serve as powerful tools to expedite the discovery of novel antioxidant candidates, potentially streamlining the development of future therapeutic interventions.

Keywords: Antioxidants, Artificial intelligence, Data mining, Machine learning, Quantitative structure-activity relationship
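
To make the evaluation idea in the abstract concrete, the following minimal Python sketch shows one way to find compound pairs that are structurally similar but differ in activity label, and to score a classifier's accuracy. It is an illustration under stated assumptions, not the authors' pipeline: fingerprints are represented as sets of on-bit indices, structural similarity is measured with the Tanimoto coefficient (an assumed choice; the abstract does not name the metric), the 0.6 threshold is arbitrary, and the toy fingerprints, labels, and predictions are invented for the example.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def accuracy(y_true, y_pred):
    """Fraction of compounds whose predicted label matches the measured label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def similar_opposite_pairs(fingerprints, labels, threshold=0.6):
    """Index pairs that are structurally similar but carry opposite activity labels.

    The threshold is a hypothetical cutoff chosen for this toy example.
    """
    pairs = []
    for i, j in combinations(range(len(fingerprints)), 2):
        if labels[i] == labels[j]:
            continue
        if tanimoto(fingerprints[i], fingerprints[j]) >= threshold:
            pairs.append((i, j))
    return pairs


# Toy data, invented for illustration (1 = active, 0 = inactive).
fingerprints = [
    {1, 2, 3, 4, 5},
    {1, 2, 3, 4, 6},
    {10, 11, 12},
    {1, 2, 3, 4, 5, 7},
]
labels = [1, 0, 0, 1]
predictions = [1, 0, 0, 0]  # hypothetical model output

hard_pairs = similar_opposite_pairs(fingerprints, labels)
overall_accuracy = accuracy(labels, predictions)

# Accuracy restricted to compounds that take part in at least one hard pair.
hard_idx = sorted({k for pair in hard_pairs for k in pair})
hard_accuracy = accuracy(
    [labels[k] for k in hard_idx], [predictions[k] for k in hard_idx]
)

print(hard_pairs, overall_accuracy, hard_accuracy)
```

In practice the fingerprints would come from a cheminformatics toolkit and the predictions from a trained model such as RF or SVM; restricting accuracy to the similar-but-opposite pairs indicates whether a model separates close structural analogues rather than only dissimilar compounds.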