Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (13): 120-128. DOI: 10.3778/j.issn.1002-8331.2203-0288

• Pattern Recognition and Artificial Intelligence •

Predicting Bioactivities of Ligands Acting with G Protein-Coupled Receptors via Deep Transfer Learning

TANG Lihua, LU Ning, LAN Chuangchuang, CHEN Ronghua, WU Jiansheng   

  1. School of Geographic and Biological Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  2. School of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  3. Department of Neurosurgery, the First People’s Hospital of Changzhou, Changzhou, Jiangsu 213003, China
  • Online: 2023-07-01 Published: 2023-07-01

利用深度迁移学习靶向GPCRs的配体活性预测

汤丽华,卢宁,兰闯闯,陈荣华,吴建盛   

  1. 南京邮电大学 地理与生物信息学院,南京 210023
  2. 南京邮电大学 通信与信息工程学院,南京 210023
  3. 常州市第一人民医院 神经外科,江苏 常州 213003

Abstract: G protein-coupled receptors (GPCRs) are among the most important drug targets, accounting for approximately 34% of the drug targets on the market. Accurate modelling and interpretation of ligand bioactivities are essential for screening hit compounds against GPCR targets in modern drug discovery. Previous studies have shown that information from homologous GPCRs can improve both the prediction and the interpretation of ligand bioactivities. This paper proposes a new method, GLEM, which models ligand bioactivities via multi-task deep transfer learning and identifies key substructures via group sparse learning, relying on homologous information. GLEM is tested on 30 representative GPCR datasets arranged in 9 groups, which cover most subfamilies of human GPCRs and each contain 60 to 3,000 ligand associations. The results show that GLEM performs best on most datasets, with an average improvement in r² of 31.72% over single-task learning methods and of 22.45% over deep learning methods. In addition, the influence of training set size on model performance is evaluated, and GLEM is found to perform best on most small datasets.

Key words: G protein-coupled receptors (GPCRs), extended-connectivity fingerprints, bioactivities of ligands, multi-task learning, deep transfer learning
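
The abstract states that key substructures are identified via group sparse learning across homologous GPCRs, but it does not give the penalty itself. As a rough illustration only, the sketch below assumes a common choice, the L2,1 norm on a weight matrix whose rows are fingerprint substructure features and whose columns are GPCR tasks. The function names, the toy matrix, and the threshold are hypothetical and are not taken from GLEM.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (features x tasks)."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def prox_l21(W, threshold):
    """Row-wise soft-thresholding, the proximal operator of threshold * ||W||_{2,1}.

    A row whose norm is at most `threshold` is set to zero for all tasks at
    once, which is what makes the penalty select features jointly.
    """
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - threshold / np.maximum(row_norms, 1e-12))
    return W * scale

# Toy weights (assumed): 3 substructure features, 2 homologous GPCR tasks.
W = np.array([[3.0, 4.0],
              [0.1, 0.1],
              [0.0, 2.0]])

W_sparse = prox_l21(W, threshold=0.5)

# Indices of substructures that keep a nonzero weight for at least one task.
selected = np.flatnonzero(np.linalg.norm(W_sparse, axis=1) > 0)
print(selected)
```

In this toy case the second feature has a small weight for both tasks, so its whole row is zeroed, while the other two rows are only shrunk. Under the stated assumption, the surviving rows would be read as the substructures shared as important across the homologous receptors.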

摘要: G蛋白偶联受体(GPCRs)是最重要的药物靶标之一,约占市场上药物靶标的34%。药物发现过程中,配体生物活性的准确建模和解释对于筛选苗头化合物至关重要。研究表明,同源的G蛋白偶联受体能提升配体分子生物活性的预测性能和可解释性。提出了一种新的方法GLEM,用多任务下的深度迁移学习来预测配体的生物活性,并通过组稀疏来识别相关的关键子结构。GLEM方法在9组30个具有代表性的人类GPCR数据集上进行了实验,这些GPCRs涵盖了大部分人类GPCRs的子家族,每个GPCR数据集都包含60~3 000个配体。实验结果表明,GLEM方法在绝大多数数据集中都获得了最好的性能。与单任务学习方法相比,GLEM方法在r²上平均提升了31.72%;与深度学习方法相比,GLEM方法在r²上平均提升了22.45%。此外,还评估了不同数量的训练样本对模型性能的影响,实验发现GLEM方法在小样本情况下表现最好。

关键词: G蛋白偶联受体(GPCRs), 扩展连通性指纹, 配体活性, 多任务学习, 深度迁移学习