{"id":7598,"date":"2021-11-23T14:51:16","date_gmt":"2021-11-23T14:51:16","guid":{"rendered":"https:\/\/www.blopig.com\/blog\/?p=7598"},"modified":"2021-12-01T15:13:37","modified_gmt":"2021-12-01T15:13:37","slug":"benchmarks-in-de-novo-drug-design","status":"publish","type":"post","link":"https:\/\/www.blopig.com\/blog\/2021\/11\/benchmarks-in-de-novo-drug-design\/","title":{"rendered":"Benchmarks in De Novo Drug Design"},"content":{"rendered":"\n<p>I recently came across the review \u201c<em>De novo<\/em> molecular drug design benchmarking\u201d by Lauren L. Grant and Clarissa S. Sit. It highlights recently proposed benchmarking methods, including Fr\u00e9chet ChemNet Distance [1], GuacaMol [2], and Molecular Sets (MOSES) [3], together with their current and potential future applications and the next steps for validating benchmarking methods [4].<\/p>\n\n\n\n<p>From this review, I particularly want to note the issues with current benchmarking methods and the points we should be aware of when using them to benchmark our own <em>de novo<\/em> molecular design methods. Here, goal-directed models are <em>de novo<\/em> molecular design methods that optimize for a particular scoring function [2].<\/p>\n\n\n\n<!--more-->\n\n\n\n<ul class=\"wp-block-list\"><li>The usefulness of the GuacaMol benchmark is questionable because of the copy problem [5]: it is possible to achieve a perfect score on novelty, validity, and uniqueness, and a high score on KL divergence, with a model that only adds a single carbon to a randomly selected molecule from the training set (<em>i.e.<\/em>, the AddCarbon model). 
Hence, better metrics for quantifying novelty would be beneficial, ones that do more than check whether a generated SMILES string already exists in the training set.<\/li><li>A potential issue with GuacaMol\u2019s goal-directed benchmarks [5]: it is challenging to incorporate all desired molecular qualities into one score, so optimizing goal-directed models against that score can generate molecules that are unstable, synthetically unrealistic, or contain highly uncommon substructures.<\/li><li>High scores on GuacaMol and MOSES do not necessarily imply synthetically accessible molecules, especially for goal-directed models [6].<\/li><li>Lack of evaluation of efficacy for GuacaMol and MOSES [4]: we are unsure whether medicinal chemists would agree with the scores assigned by these benchmarking methods.<\/li><\/ul>\n\n\n\n<p><em>De novo<\/em> molecular design is a growing field, and so are its benchmarking methods. It is apparent that metrics such as synthetic accessibility and ratings by medicinal chemists could further improve future benchmarking methods.<\/p>\n\n\n\n<p><strong>References:<br><\/strong>[1] K. Preuer, P. Renz, T. Unterthiner, S. Hochreiter and G. Klambauer, Fr\u00e9chet ChemNet Distance: A Metric for Generative Models for Molecules in Drug Discovery, <em>J. Chem. Inf. Model.<\/em>, 2018, <strong>58<\/strong>(9), 1736\u20131741, <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/30118593\/\">DOI: 10.1021\/acs.jcim.8b00234<\/a>.<br>[2] N. Brown, M. Fiscato, M. H. S. Segler and A. C. Vaucher, GuacaMol: Benchmarking Models for de Novo Molecular Design, <em>J. Chem. Inf. Model.<\/em>, 2019, <strong>59<\/strong>(3), 1096\u20131108, <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/30887799\/\">DOI: 10.1021\/acs.jcim.8b00839<\/a>.<br>[3] D. Polykovskiy, A. Zhebrak, B. Sanchez-Lengeling, S. Golovanov, O. Tatanov, S. Belyaev, R. Kurbanov, A. Artamonov, V. Aladinskiy, M. Veselov, A. Kadurin, S. Johansson, H. Chen, S. Nikolenko, A. Aspuru-Guzik and A. 
Zhavoronkov, Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models, <em>Front. Pharmacol.<\/em>, 2020, <strong>11<\/strong>, 565644, <a href=\"https:\/\/www.frontiersin.org\/articles\/10.3389\/fphar.2020.565644\/full\">DOI: 10.3389\/fphar.2020.565644<\/a>.<br>[4] L. L. Grant and C. S. Sit, <em>De novo<\/em> molecular drug design benchmarking, <em>RSC Med. Chem.<\/em>, 2021, <strong>12<\/strong>, 1273, <a href=\"https:\/\/doi.org\/10.1039\/D1MD00074H\">DOI: 10.1039\/D1MD00074H<\/a>.<br>[5] P. Renz, D. Van Rompaey, J. K. Wegner, S. Hochreiter and G. Klambauer, On Failure Modes in Molecule Generation and Optimization, <em>Drug Discovery Today: Technol.<\/em>, 2019, <strong>32\u201333<\/strong>, 55\u201363, <a href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S1740674920300159?via%3Dihub\">DOI: 10.1016\/j.ddtec.2020.09.003<\/a>.<br>[6] W. Gao and C. W. Coley, The Synthesizability of Molecules Proposed by Generative Models, <em>J. Chem. Inf. Model.<\/em>, 2020, <strong>60<\/strong>(12), 5714\u20135723, <a href=\"https:\/\/pubs.acs.org\/doi\/10.1021\/acs.jcim.0c00174\">DOI: 10.1021\/acs.jcim.0c00174<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I recently came across a review of \u201cDe novo molecular drug design benchmarking\u201d by Lauren L. Grant and Clarissa S. 
Sit where they highlighted the recently proposed benchmarking methods including Fr\u00e9chet ChemNet Distance [1], GuacaMol [2], and Molecular Sets (MOSES) [3] together with its current and future potential applications as well as the steps moving [&hellip;]<\/p>\n","protected":false},"author":63,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","wikipediapreview_detectlinks":true,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"ngg_post_thumbnail":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[187,447,31,201],"tags":[],"ppma_author":[494],"class_list":["post-7598","post","type-post","status-publish","format-standard","hentry","category-cheminformatics","category-molecular-design","category-notes","category-small-molecules"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"authors":[{"term_id":494,"user_id":63,"is_guest":0,"slug":"an","display_name":"An 
Goto","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/fa2cef36889bddc3093900adbfca8e73fbb9b9340436d143d7ae087cbffa1f3c?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/7598","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/users\/63"}],"replies":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/comments?post=7598"}],"version-history":[{"count":4,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/7598\/revisions"}],"predecessor-version":[{"id":7625,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/7598\/revisions\/7625"}],"wp:attachment":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/media?parent=7598"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/categories?post=7598"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/tags?post=7598"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=7598"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}