{"id":3357,"date":"2017-02-21T10:25:24","date_gmt":"2017-02-21T10:25:24","guid":{"rendered":"http:\/\/www.blopig.com\/blog\/?p=3357"},"modified":"2017-02-21T10:25:58","modified_gmt":"2017-02-21T10:25:58","slug":"parallel-computing-gnu-parallel","status":"publish","type":"post","link":"https:\/\/www.blopig.com\/blog\/2017\/02\/parallel-computing-gnu-parallel\/","title":{"rendered":"Parallel Computing: GNU Parallel"},"content":{"rendered":"<p>Recently I started using the OPIG servers to run the algorithm I have developed (CRANkS) on datasets from DUDE (Database of Useful Decoys Enhanced).<\/p>\n<p>This required learning how to run jobs in parallel. Previously I had been using computer clusters with their own queuing system (Torque\/PBS), which allowed me to submit each molecule to be scored by the algorithm as a separate job. The queuing system would then automatically allocate nodes to jobs and execute them accordingly. As a side note, I learnt how to submit these jobs as an array, which was preferable to submitting ~150,000 separate jobs:<\/p>\n<p><code>qsub -t 1-X array_submit.sh<\/code><\/p>\n<p>where the contents of <em>array_submit.sh<\/em> would be:<br \/>\n<code><br \/>\n#!\/bin\/bash<br \/>\n.\/$PBS_ARRAYID.sh<br \/>\n<\/code><\/p>\n<p>which would submit jobs 1.sh to X.sh, where X is the total number of jobs.<\/p>\n<p>However, the OPIG servers do not have a global queuing system. I needed a way to run my existing code in parallel with minimal changes to the workflow or the code itself. 
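(As an aside, the per-job scripts 1.sh to X.sh referred to above can be listed into a single submit file with a short shell loop. This is just a sketch of the convention described in the post; X=5 is purely illustrative, since the real run involved ~150,000 jobs.)

```shell
#!/bin/bash
# Sketch: build a submit.sh that lists every per-job script (./1.sh .. ./X.sh).
# X is the total number of jobs; 5 here is purely illustrative.
X=5
for i in $(seq 1 "$X"); do
    echo "./$i.sh"
done > submit.sh
```

The resulting submit.sh can then be consumed line by line by whichever job runner is available, without touching the job scripts themselves.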
There are many ways to run jobs in parallel, but to minimise work for myself, I decided to use <a href=\"https:\/\/www.gnu.org\/software\/parallel\/\">GNU parallel<\/a> [1].<\/p>\n<p>This is an easy-to-use shell tool that was quick to install onto my home server, allowing me to access it on each of the OPIG servers.<\/p>\n<p>To use it, I simply run:<br \/>\n<code><br \/>\ncat submit.sh | parallel -j Y<br \/>\n<\/code><\/p>\n<p>where Y is the maximum number of jobs to run at once (typically the number of available cores), and <em>submit.sh<\/em> contains:<br \/>\n<code><br \/>\n.\/1.sh<br \/>\n.\/2.sh<br \/>\n...<br \/>\n.\/X.sh<br \/>\n<\/code><\/p>\n<p>This runs the jobs in parallel, executing at most Y at a time as slots become available.<\/p>\n<p>Quick, simple, and with minimal modifications needed! Thanks to Jin for introducing me to GNU Parallel!<\/p>\n<p>[1] O. Tange (2011): GNU Parallel &#8211; The Command-Line Power Tool, The USENIX Magazine, February 2011:42-47.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Recently I started using the OPIG servers to run the algorithm I have developed (CRANkS) on datasets from DUDE (Database of Useful Decoys Enhanced). This required learning how to run jobs in parallel. 
Previously I had been using computer clusters with their own queuing system (Torque\/PBS) which allowed me to submit each molecule to be [&hellip;]<\/p>\n","protected":false},"author":33,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","wikipediapreview_detectlinks":true,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"ngg_post_thumbnail":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[29,14],"tags":[],"ppma_author":[521],"class_list":["post-3357","post","type-post","status-publish","format-standard","hentry","category-code","category-howto"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"authors":[{"term_id":521,"user_id":33,"is_guest":0,"slug":"hannahpatel","display_name":"Hannah Patel","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/475fce49714d9af7623be7dafe4a36b34bbcc765845293c995c7def372b15846?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/3357","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/users\/33"}],"replies":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/comments?post=3357"}],"version-history":[{"count":2,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/3357\/revisions"}],"predecessor-version":[{"id":3359,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/3357\/revisions\/3359"}],"wp:attachment":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/media?parent=3357"}],"wp:term":[{"taxonomy":"cat
egory","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/categories?post=3357"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/tags?post=3357"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=3357"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}