{"id":9570,"date":"2023-04-04T11:46:35","date_gmt":"2023-04-04T10:46:35","guid":{"rendered":"https:\/\/www.blopig.com\/blog\/?p=9570"},"modified":"2023-04-18T16:28:12","modified_gmt":"2023-04-18T15:28:12","slug":"molecular-conformation-generation-with-a-dl-based-force-field","status":"publish","type":"post","link":"https:\/\/www.blopig.com\/blog\/2023\/04\/molecular-conformation-generation-with-a-dl-based-force-field\/","title":{"rendered":"Molecular conformation generation with a DL-based force field"},"content":{"rendered":"\n<p>Deep learning (DL) methods in structural modelling are outcompeting force fields because they overcome the two main limitations of force field methods &#8211; the prohibitively large search space for large systems and the limited accuracy of the description of the physics [4].<\/p>\n\n\n\n<p>However, the two methods are also compatible. DL methods are helping to close the gap between the applications of force fields and <em>ab initio<\/em> methods [3]. The advantage of DL-based force fields is that the functional form does not have to be specified explicitly, and they can be much more accurate. Say goodbye to the 12-6 Lennard-Jones potential function.<\/p>\n\n\n\n<p>In principle, DL-based force fields can be applied anywhere regular force fields have been applied, for example in conformation generation [2]. The flip side of DL-based methods is commonly poor generalization, but it seems that DL-based force fields, when properly trained, generalize well. ANI, trained on molecules with up to 8 heavy atoms, is able to generalize to molecules with up to 54 atoms [1]. 
Excitingly for my research, ANI-2 [2] can replace UFF or MMFF as the energy minimization step for conformation generation in RDKit [5].<\/p>\n\n\n\n<p>So let&#8217;s use Auto3D [2] to generate low-energy conformations for four molecules: caffeine, ibuprofen, an experimental hybrid peptide, and imatinib:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"raw\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">CN1C=NC2=C1C(=O)N(C(=O)N2C)C CFF\nCC(C)Cc1ccc(cc1)C(C)C(O)=O IBP\nCc1ccccc1CNC(=O)[C@@H]2C(SCN2C(=O)[C@H]([C@H](Cc3ccccc3)NC(=O)c4cccc(c4C)O)O)(C)C JE2\nCc1ccc(cc1Nc2nccc(n2)c3cccnc3)NC(=O)c4ccc(cc4)CN5CCN(CC5)C STI<\/pre>\n\n\n\n<!--more-->\n\n\n\n<p>The first attempt was sobering. UFF runs on my laptop in seconds, but without a GPU Auto3D reports it would take hours to finish &#8211; I stopped the program after 28 minutes.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"bash\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">~\/Projects\/auto3d > python auto3D.py \"example\/files\/mymols.smi\" --use_gpu=False --k=1\n\n         _              _             _____   ____  \n        \/ \\     _   _  | |_    ___   |___ \/  |  _ \\ \n       \/ _ \\   | | | | | __|  \/ _ \\    |_ \\  | | | |\n      \/ ___ \\  | |_| | | |_  | (_) |  ___) | | |_| |\n     \/_\/   \\_\\  \\__,_|  \\__|  \\___\/  |____\/  |____\/  2.0\n        \/\/ Automatic generation of the low-energy 3D structures                                      \n    \nChecking input file...\n        There are 4 SMILES in the input file example\/files\/mymols.smi. 
\n        All SMILES and IDs are valid.\nSuggestions for choosing isomer_engine and optimizing_engine: \n        Isomer engine options: RDKit and Omega.\n        Optimizing engine options: ANI2x, ANI2xt and AIMNET.\nThe available memory is 16 GB.\nThe task will be divided into 1 jobs.\nJob1, number of inputs: 4\n\n\nIsomer generation for job1\nEnumerating cis\/tran isomers for unspecified double bonds...\nEnumerating R\/S isomers for unspecified atomic centers...\nRemoving enantiomers...\nEnumerating conformers\/rotamers, removing duplicates...\n100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 4\/4 [00:01&amp;lt;00:00,  2.04it\/s]\n\n\nOptimizing on job1\nPreparing for parallel optimizing... 
(Max optimization steps: 5000)\nTotal 3D conformers: 81\n 10%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258d                                                                                    | 499\/5000 [22:21&amp;lt;2:44:24,  2.19s\/it]Total 3D structures: 81  Converged: 5   Dropped(Oscillating): 0    Active: 76\n 13%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f                                                                                 | 650\/5000 [28:13&amp;lt;2:48:11,  2.32s\/it]<\/pre>\n\n\n\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<p>Next, I tried on a beefy GPU and the runtime was just about 1 minute.<\/p>\n\n\n\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"bash\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">bash-5.2$ python auto3D.py \"example\/files\/mymols.smi\" --k=1\n\n         _              _             _____   ____  \n        \/ \\     _   _  | |_    ___   |___ \/  |  _ \\ \n       \/ _ \\   | | | | | __|  \/ _ \\    |_ \\  | | | |\n      \/ ___ \\  | |_| | | |_  | (_) |  ___) | | |_| |\n     \/_\/   \\_\\  \\__,_|  \\__|  \\___\/  |____\/  |____\/  2.0\n        \/\/ Automatic generation of the low-energy 3D structures                                      \n    \nChecking input file...\n        There are 4 SMILES in the input file example\/files\/mymols.smi. 
\n        All SMILES and IDs are valid.\nSuggestions for choosing isomer_engine and optimizing_engine: \n        Isomer engine options: RDKit and Omega.\n        Optimizing engine options: ANI2x, ANI2xt and AIMNET.\nThe available memory is 80 GB.\nThe task will be divided into 1 jobs.\nJob1, number of inputs: 4\n\n\nIsomer generation for job1\nEnumerating cis\/tran isomers for unspecified double bonds...\nEnumerating R\/S isomers for unspecified atomic centers...\nRemoving enantiomers...\nEnumerating conformers\/rotamers, removing duplicates...\n100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 4\/4 [00:03&amp;lt;00:00,  1.21it\/s]\n\n\nOptimizing on job1\nPreparing for parallel optimizing... 
(Max optimization steps: 5000)\nTotal 3D conformers: 84\n 10%|\u2588\u2588\u2588\u2588\u2588\u2588\u258a                                                              | 496\/5000 [00:11&amp;lt;01:35, 46.92it\/s]Total 3D structures: 84  Converged: 3   Dropped(Oscillating): 0    Active: 81\n 20%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a                                                       | 999\/5000 [00:21&amp;lt;01:17, 51.58it\/s]Total 3D structures: 84  Converged: 18   Dropped(Oscillating): 0    Active: 66\n 30%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e                                               | 1498\/5000 [00:29&amp;lt;00:50, 68.81it\/s]Total 3D structures: 84  Converged: 38   Dropped(Oscillating): 2    Active: 44\n 40%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258f                                        | 1999\/5000 [00:36&amp;lt;00:36, 82.38it\/s]Total 3D structures: 84  Converged: 49   Dropped(Oscillating): 3    Active: 32\n 50%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2589                                  | 2496\/5000 [00:42&amp;lt;00:27, 90.32it\/s]Total 3D structures: 84  Converged: 57   Dropped(Oscillating): 5    Active: 22\n 60%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258b                           | 2996\/5000 [00:47&amp;lt;00:21, 93.65it\/s]Total 3D structures: 84  Converged: 68   Dropped(Oscillating): 8    Active: 8\n 
70%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258c                    | 3497\/5000 [00:52&amp;lt;00:15, 94.99it\/s]Total 3D structures: 84  Converged: 70   Dropped(Oscillating): 9    Active: 5\n 80%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258e             | 3990\/5000 [00:57&amp;lt;00:10, 92.39it\/s]Total 3D structures: 84  Converged: 72   Dropped(Oscillating): 10    Active: 2\n 88%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588        | 4420\/5000 [01:02&amp;lt;00:08, 70.75it\/s]\nOptimization finished at step 4421:   Total 3D structures: 84  Converged: 73   Dropped(Oscillating): 11    Active: 0\nBeggining to select structures that satisfy the requirements...\nEnergy unit: Hartree if implicit.\nProgram running time: 1 minutes<\/pre>\n<\/div><\/div>\n<\/div><\/div>\n\n\n\n<p>And there we have it: UFF and Auto3D optimized molecule conformations. 
<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-style-default\"><a href=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/04\/rendered.png?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" width=\"625\" height=\"294\" loading=\"lazy\" src=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/04\/rendered.png?resize=625%2C294&#038;ssl=1\" alt=\"\" class=\"wp-image-9572\" srcset=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/04\/rendered.png?resize=1024%2C481&amp;ssl=1 1024w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/04\/rendered.png?resize=300%2C141&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/04\/rendered.png?resize=768%2C361&amp;ssl=1 768w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/04\/rendered.png?resize=1536%2C721&amp;ssl=1 1536w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/04\/rendered.png?resize=624%2C293&amp;ssl=1 624w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/04\/rendered.png?w=1700&amp;ssl=1 1700w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/04\/rendered.png?w=1250&amp;ssl=1 1250w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><figcaption class=\"wp-element-caption\">Figure: UFF minimized conformations in blue, Auto3D minimized conformations in yellow<\/figcaption><\/figure>\n\n\n\n<p>We could evaluate the energies of the generated conformations next but staring at the caffeine molecule has made me realize it is coffee break time. <\/p>\n\n\n\n<h4 class=\"wp-block-heading\">References<\/h4>\n\n\n\n<p>[1] J. S. Smith, O. Isayev, and A. E. Roitberg, \u201cANI-1: an extensible neural network potential with DFT accuracy at force field computational cost,\u201d <em>Chem Sci<\/em>, vol. 8, no. 4, pp. 
3192\u20133203, 2017, doi: <a href=\"https:\/\/doi.org\/10.1039\/C6SC05720A\">10.1039\/C6SC05720A<\/a>.<\/p>\n\n\n\n<p>[2] Z. Liu, T. Zubatiuk, A. Roitberg, and O. Isayev, \u201cAuto3D: automatic generation of the low-energy 3D structures with ANI neural network potentials,\u201d <em>J Chem Inf Model<\/em>, vol. 62, no. 22, pp. 5373\u20135382, Nov. 2022, doi: <a href=\"https:\/\/doi.org\/10.1021\/acs.jcim.2c00817\">10.1021\/acs.jcim.2c00817<\/a>.<\/p>\n\n\n\n<p>[3] O. T. Unke <em>et al.<\/em>, \u201cMachine learning force fields,\u201d <em>Chem Rev<\/em>, vol. 121, no. 16, pp. 10142\u201310186, Aug. 2021, doi: <a href=\"https:\/\/doi.org\/10.1021\/acs.chemrev.0c01111\">10.1021\/acs.chemrev.0c01111<\/a>.<\/p>\n\n\n\n<p>[4] M. Baek and D. Baker, \u201cDeep learning and protein structure modeling,\u201d <em>Nat Methods<\/em>, vol. 19, no. 1, pp. 13\u201314, Jan. 2022, doi: <a href=\"https:\/\/doi.org\/10.1038\/s41592-021-01360-8\">10.1038\/s41592-021-01360-8<\/a>.<\/p>\n\n\n\n<p>[5] G. Landrum <em>et al.<\/em>, \u201cRDKit Q3 2022 Release.\u201d Zenodo, Feb. 23, 2023. doi: <a href=\"https:\/\/doi.org\/10.5281\/ZENODO.7671152\">10.5281\/ZENODO.7671152<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Deep learning (DL) methods in structural modelling are outcompeting force fields because they overcome the two main limitations of force field methods &#8211; the prohibitively large search space for large systems and the limited accuracy of the description of the physics [4]. However, the two methods are also compatible. 
DL methods are helping to close [&hellip;]<\/p>\n","protected":false},"author":92,"featured_media":9572,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","wikipediapreview_detectlinks":true,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"ngg_post_thumbnail":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[187,189,258,201],"tags":[152,129,134],"ppma_author":[487],"class_list":["post-9570","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cheminformatics","category-machine-learning","category-optimization","category-small-molecules","tag-python","tag-rdkit","tag-small-molecules"],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/04\/rendered.png?fit=1700%2C798&ssl=1","jetpack_sharing_enabled":true,"authors":[{"term_id":487,"user_id":92,"is_guest":0,"slug":"martin","display_name":"Martin 
Buttenschoen","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/766a8e998df1df02635f3d2411a8526569f394d114b2fc9ebb896d84bb37484f?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/9570","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/users\/92"}],"replies":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/comments?post=9570"}],"version-history":[{"count":5,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/9570\/revisions"}],"predecessor-version":[{"id":9664,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/9570\/revisions\/9664"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/media\/9572"}],"wp:attachment":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/media?parent=9570"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/categories?post=9570"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/tags?post=9570"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=9570"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}