{"id":8916,"date":"2023-03-16T12:21:01","date_gmt":"2023-03-16T12:21:01","guid":{"rendered":"https:\/\/www.blopig.com\/blog\/?p=8916"},"modified":"2023-08-08T16:27:36","modified_gmt":"2023-08-08T15:27:36","slug":"__trashed","status":"publish","type":"post","link":"https:\/\/www.blopig.com\/blog\/2023\/03\/__trashed\/","title":{"rendered":"Ligands of CASF-2016"},"content":{"rendered":"\n<p>CASF-2016 is a commonly used benchmark for docking tools. Unfortunately, some of the provided ligand files cannot be loaded using RDKit (version 2022.09.1) but there is an easy remedy.<\/p>\n\n\n\n<!--more-->\n\n\n\n<p>The ligands are provided in two file formats &#8211; <code>MOL2<\/code> and <code>SDF<\/code>. Let us try reading the provided <code>SDF<\/code> files first.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code data-enlighter-language=\"python\" class=\"EnlighterJSRAW\"># load CASF-2016 SDF files with RDKit\n\nfrom pathlib import Path\nfrom rdkit.Chem.rdmolfiles import SDMolSupplier\n\npath_casf = Path('.\/CASF-2016\/coreset')\nnames = sorted(&#091;d.stem for d in path_casf.iterdir() if d.is_dir()])\nsuccess = set()\nfailed = set()\nfor name in names:\n    path_sdf = path_casf \/ name \/ f\"{name}_ligand.sdf\"\n    mols = SDMolSupplier(str(path_sdf), sanitize=True)\n    if len(mols) &gt; 0 and mols&#091;0] is not None:\n        success.add(name)\n    else:\n        failed.add(name)\nprint(\"Success:\", len(success))\nprint(\"Failed:\", len(failed))<\/code><\/code><\/pre>\n\n\n\n<p>Running the above we get 86 failures for 285 files.<\/p>\n\n\n\n<p>Let us try the provided <code>MOL2<\/code> files next.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code data-enlighter-language=\"python\" class=\"EnlighterJSRAW\"># load CASF-2016 MOL2 files with RDKit\nfrom rdkit.Chem.rdmolfiles import MolFromMol2File\n\nsuccess = set()\nfailed = set()\nfor name in names:\n    path_mol2 = path_casf \/ name \/ f\"{name}_ligand.mol2\"\n    mol = MolFromMol2File(str(path_mol2), 
sanitize=True)\n    if mol is not None:\n        success.add(name)\n    else:\n        failed.add(name)\nprint(\"Success:\", len(success))\nprint(\"Failed:\", len(failed))\nprint(sorted(failed))<\/code><\/code><\/pre>\n\n\n\n<p>This time we get only 12 failures.<\/p>\n\n\n\n<p>If we try the <code>MOL2<\/code> file first and fall back to the <code>SDF<\/code> file, 6 ligands still cannot be read properly. They are the ligands for complexes 1BZC, 1VSO, 2ZCQ, 2ZCR, 4TMN, and 5TMN.<\/p>\n\n\n\n<p>To see what is going on, we spot-check 5TMN. The <code>SDF<\/code> sanitization error reads &#8220;explicit valence for atom # 25 C, 6, is greater than permitted&#8221;.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><a href=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/5tmn_ligand-1.png?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" loading=\"lazy\" src=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/5tmn_ligand-1.png?resize=381%2C286&#038;ssl=1\" alt=\"\" class=\"wp-image-8927\" width=\"381\" height=\"286\" srcset=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/5tmn_ligand-1.png?w=640&amp;ssl=1 640w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/5tmn_ligand-1.png?resize=300%2C225&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/5tmn_ligand-1.png?resize=624%2C468&amp;ssl=1 624w\" sizes=\"auto, (max-width: 381px) 100vw, 381px\" \/><\/a><figcaption class=\"wp-element-caption\">CASF-2016 ligand 0PJ of entry 5TMN loaded from the SDF file in PyMOL<\/figcaption><\/figure>\n\n\n\n<p>Reading the <code>.mol2<\/code> file instead gives the error message &#8220;warning &#8211; O.co2 with non C.2 or S.o2 neighbor.&#8221;<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><a href=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/5tmn_mol2.png?ssl=1\"><img data-recalc-dims=\"1\" 
decoding=\"async\" loading=\"lazy\" src=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/5tmn_mol2.png?resize=378%2C284&#038;ssl=1\" alt=\"\" class=\"wp-image-8928\" width=\"378\" height=\"284\" srcset=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/5tmn_mol2.png?w=640&amp;ssl=1 640w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/5tmn_mol2.png?resize=300%2C225&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/5tmn_mol2.png?resize=624%2C468&amp;ssl=1 624w\" sizes=\"auto, (max-width: 378px) 100vw, 378px\" \/><\/a><figcaption class=\"wp-element-caption\">CASF-2016 ligand 0PJ of entry 5TMN loaded from the .mol2 file in PyMOL<\/figcaption><\/figure>\n\n\n\n<p>The easiest way to solve these errors is to look up the ligand in the PDB and download a new SDF file from there. Voil&#224;, this time the file can be read, and we get a nice ligand.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><a href=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/image-1.png?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" loading=\"lazy\" src=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/image-1.png?resize=450%2C150&#038;ssl=1\" alt=\"\" class=\"wp-image-8918\" width=\"450\" height=\"150\" srcset=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/image-1.png?w=450&amp;ssl=1 450w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2022\/11\/image-1.png?resize=300%2C100&amp;ssl=1 300w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><figcaption class=\"wp-element-caption\">Ligand 0PJ of PDB entry 5TMN loaded from the SDF file provided by the PDB<\/figcaption><\/figure>\n\n\n\n<p>Luckily, we only have to download a new file 6 times.
<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>CASF-2016 is a commonly used benchmark for docking tools. Unfortunately, some of the provided ligand files cannot be loaded using RDKit (version 2022.09.1) but there is an easy remedy.<\/p>\n","protected":false},"author":92,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","wikipediapreview_detectlinks":true,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"ngg_post_thumbnail":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[291,272],"tags":[679,152,129],"ppma_author":[487],"class_list":["post-8916","post","type-post","status-publish","format-standard","hentry","category-protein-ligand-docking","category-software-services","tag-casf-2016","tag-python","tag-rdkit"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"authors":[{"term_id":487,"user_id":92,"is_guest":0,"slug":"martin","display_name":"Martin 
Buttenschoen","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/766a8e998df1df02635f3d2411a8526569f394d114b2fc9ebb896d84bb37484f?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/8916","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/users\/92"}],"replies":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/comments?post=8916"}],"version-history":[{"count":5,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/8916\/revisions"}],"predecessor-version":[{"id":10219,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/8916\/revisions\/10219"}],"wp:attachment":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/media?parent=8916"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/categories?post=8916"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/tags?post=8916"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=8916"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}