Developing new and effective drugs is expensive and slow, and it demands patience. Many drug discoveries have owed much to serendipity, because the effect of a compound in the body is hard to predict. Once a molecule in the body has been identified as a possible target, a compound must be found that interacts with that target without causing many side effects. The number of candidate compounds is often too large to screen in the laboratory. For this reason, virtual screening is used to predict promising compounds computationally.

If the target is known, there are two major approaches to virtual screening for possible drugs: structure-based drug design, which uses docking models, and quantitative structure–activity relationship (QSAR) models. The latter use physico-chemical properties or theoretical molecular descriptors of chemicals to design a new compound. Machine learning can be used in both approaches to reduce laboratory work and development costs and to improve predictive performance.


Structure-based drug design

Docking programs use characteristics of a molecule's 3D structure to predict how effectively a compound binds to its target. These programs can be used to virtually screen a large library of possible drugs. The program tries many rotations of the molecule to find the orientation that results in the strongest binding to the active site of the target. For each orientation, a scoring function is evaluated to quantify the strength of the binding. This process involves a lot of data processing. Deep learning can efficiently extract features from the output of docking software and use them to improve screening performance and the scoring function.

© Hernández-Santoyo et al. (2013). Protein-Protein and Protein-Ligand Docking, Protein Engineering – Technology and Application, Dr. Tomohisa Ogawa (Ed.), InTech
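The pose search described above can be illustrated with a very small sketch. It is not how any particular docking program works: the coordinates are made up, the rotation is restricted to the z-axis, and `toy_score`, with its contact and clash distances, is a hypothetical scoring function chosen only to show the idea of "rotate, score, keep the best orientation".

```python
import math

def rotate_z(points, angle):
    """Rotate 3D points about the z-axis through the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def toy_score(ligand, pocket, contact=4.0, clash=2.0):
    """Hypothetical scoring function (lower is better).

    Each ligand-pocket atom pair closer than `clash` is penalised,
    each pair between `clash` and `contact` counts as a favourable contact.
    The thresholds are illustrative assumptions, not real force-field values.
    """
    score = 0.0
    for a in ligand:
        for b in pocket:
            d = math.dist(a, b)
            if d < clash:
                score += 10.0   # steric clash penalty
            elif d < contact:
                score -= 1.0    # favourable contact
    return score

def best_orientation(ligand, pocket, n_steps=36):
    """Try n_steps evenly spaced rotations and keep the best-scoring pose."""
    best = None
    for k in range(n_steps):
        angle = 2 * math.pi * k / n_steps
        pose = rotate_z(ligand, angle)
        s = toy_score(pose, pocket)
        if best is None or s < best[0]:
            best = (s, angle, pose)
    return best

# Made-up coordinates: a two-atom "ligand" and a one-atom "pocket".
ligand = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
pocket = [(0.0, 5.0, 0.0)]

score, angle, pose = best_orientation(ligand, pocket)
print("best score:", score, "at", round(math.degrees(angle)), "degrees")
```

Real docking software also samples translations and ligand flexibility and uses far richer scoring terms. Evaluating many poses for every compound in a library is what makes the data volume large, and that is where learned features can help.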


Quantitative Structure–Activity Relationship Models (QSAR)

A QSAR model is a statistical model that describes the biological, physical, chemical and pharmacological effect of a molecule on a certain target. To build one, molecular descriptors derived from different theories (e.g. quantum chemistry, information theory) are related to the measured biological activity. Computational algorithms based on different statistical methods then predict the interaction between a molecule and the target. The number of molecular descriptors is high and the datasets are large, which is why deep neural networks can improve the workflow.


© Abuhammad et al. (2016). QSAR studies in the discovery of novel type-II diabetic therapies, Expert Opinion on Drug Discovery
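As a minimal sketch of the QSAR idea, the simplest possible statistical model is a straight line fitted between one descriptor and a measured activity. The descriptor and activity values below are invented for illustration only, and the names `descriptor`, `activity` and `fit_line` are assumptions of this example. A real QSAR study would use many descriptors and, as noted above, possibly a deep neural network in place of the linear fit.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one descriptor: y ≈ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predict the activity of a new compound from its descriptor value."""
    return slope * x + intercept

# Invented toy data: one descriptor value and one activity value per compound.
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [5.1, 5.9, 7.1, 7.9]

slope, intercept = fit_line(descriptor, activity)
print("predicted activity for descriptor 2.5:", predict(2.5, slope, intercept))
```

The fitted model can then rank untested compounds by predicted activity, so that only the most promising ones go to the laboratory.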

More information about drug discovery with machine learning:

Pereira, J.A., Caffarena, E.R., Nogueira dos Santos, C. (2016). Boosting Docking-Based Virtual Screening with Deep Learning. J. Chem. Inf. Model. 56, pp. 2495–2506

Zhang, L., Tan, J., Han, D., Zhu, H. (2017). From machine learning to deep learning: progress in machine intelligence for rational drug discovery. Drug Discov. Today