qiangbo1222/TransLearn_NP

TransferLearning_NP

A large proportion of lead compounds are derived from natural products. However, most natural products have not been fully tested against their potential targets. To help address this problem, we built a transfer learning model that predicts targets for natural products. The target prediction model can be applied in natural product-based drug discovery and has the potential to find more lead compounds or to assist researchers in drug repurposing. This repository contains the code to reproduce the results from our published paper 'Target Prediction Model for Natural Products Using Transfer Learning'. Only academic or non-commercial use is permitted.

Data

The bioactivity data used for training can be obtained from the official ChEMBL website, and the structures of natural products can be downloaded from COCONUT. The code needed for cleaning and processing the data is provided.

Models

The model was pre-trained on a processed ChEMBL dataset and then fine-tuned on a natural product dataset. With this pre-training and fine-tuning scheme, the model achieved an area under the receiver operating characteristic curve (AUROC) of 0.910 with limited task-related training samples. The improvement in the model's AUROC is shown in the figure below.
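For readers unfamiliar with the metric, the following is a minimal sketch of how an AUROC value can be computed from binary labels and predicted scores, using the rank-sum (Mann-Whitney U) identity. The function name `auroc` and the toy inputs are illustrative assumptions and are not taken from this repository's code.

```python
def auroc(labels, scores):
    """AUROC via the rank-sum identity (illustrative helper, not repository code).

    labels: iterable of 0/1 ground-truth values
    scores: iterable of predicted scores, higher means "more likely positive"
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at index i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied items occupy 1-based ranks i+1 .. j and share their average.
        avg_rank = (i + 1 + j) / 2.0
        positives_in_run = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_run
        i = j

    # U statistic of the positive class, normalised by the number of pos/neg pairs.
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Toy example: four compounds for one target, two active and two inactive.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))
```

A value of 1.0 means every active compound is scored above every inactive one, and 0.5 corresponds to random ranking.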

All model definitions can be found in pretrain.py and finetune.py.
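The pre-train then fine-tune idea can be sketched on a toy problem as follows. This is not the model defined in pretrain.py or finetune.py: it is a small logistic regression in plain Python, and the datasets, function names (`train`, `mean_log_loss`), learning rates, and epoch counts are all invented for illustration. It only shows the mechanism of initialising training on a small target dataset from weights learned on a larger source dataset.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_log_loss(weights, data):
    """Average binary cross-entropy of a logistic model on (features, label) pairs."""
    total = 0.0
    for x, y in data:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)


def train(weights, data, lr, epochs):
    """Full-batch gradient descent starting from the given weights."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for k, xi in enumerate(x):
                grad[k] += err * xi / len(data)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


# Hypothetical toy data; the first feature is a constant bias term.
# "Source" stands in for a large general dataset, "target" for a small task-specific one.
source_data = [
    ([1.0, 2.0, 1.0], 1), ([1.0, 1.0, 2.0], 1), ([1.0, 1.5, 0.5], 1), ([1.0, 0.5, 1.0], 1),
    ([1.0, -1.0, -2.0], 0), ([1.0, -2.0, -1.0], 0), ([1.0, -0.5, -1.5], 0), ([1.0, -1.0, -0.5], 0),
]
target_data = [
    ([1.0, 1.0, 0.5], 1), ([1.0, -0.5, -1.0], 0),
    ([1.0, 0.2, 0.6], 1), ([1.0, -0.3, -0.2], 0),
]

zeros = [0.0, 0.0, 0.0]

# Step 1: pre-train on the source data from scratch.
pretrained = train(zeros, source_data, lr=0.5, epochs=200)

# Step 2: fine-tune on the target data, starting from the pre-trained weights
# and using a smaller learning rate.
finetuned = train(pretrained, target_data, lr=0.1, epochs=50)

# Baseline: the same target-data budget, but starting from zero weights.
scratch = train(zeros, target_data, lr=0.1, epochs=50)

pretrained_loss = mean_log_loss(pretrained, target_data)
finetuned_loss = mean_log_loss(finetuned, target_data)
scratch_loss = mean_log_loss(scratch, target_data)

print("target loss, pre-trained only:", pretrained_loss)
print("target loss, fine-tuned:      ", finetuned_loss)
print("target loss, from scratch:    ", scratch_loss)
```

In this toy setting the source and target tasks are deliberately similar, so the pre-trained weights give fine-tuning a better starting point than zero initialisation. How much benefit transfers in practice depends on how related the two datasets are.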

Contributing

Bo Qiang, School of Pharmaceutical Sciences, Peking University
