{"id":1795,"date":"2014-04-13T19:49:35","date_gmt":"2014-04-13T18:49:35","guid":{"rendered":"http:\/\/www.blopig.com\/blog\/?p=1795"},"modified":"2014-04-22T11:00:27","modified_gmt":"2014-04-22T10:00:27","slug":"quick-standalone-blast-setup-for-ubuntu-linux","status":"publish","type":"post","link":"https:\/\/www.blopig.com\/blog\/2014\/04\/quick-standalone-blast-setup-for-ubuntu-linux\/","title":{"rendered":"Quick Standalone BLAST Setup for Ubuntu Linux"},"content":{"rendered":"<p>Some people run into trouble trying to set up a standalone version of BLAST using the NCBI <a href=\"http:\/\/www.ncbi.nlm.nih.gov\/books\/NBK52640\/\">instructions<\/a>. Here is a streamlined process, targeted at Ubuntu.<\/p>\n<div>I assume that you are aware of the paradigms of BLAST, meaning that there are several executables for searching nucleic acids or proteins and there are different databases you can blast against. If not, you should <a href=\"http:\/\/www.ncbi.nlm.nih.gov\/books\/NBK1763\/\">read up<\/a> on the\u00a0<a id=\"\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/BLAST\/blast_program.shtml\" target=\"_blank\">available search tools<\/a>\u00a0and <a href=\"http:\/\/blast.ncbi.nlm.nih.gov\/Blast.cgi?CMD=Web&amp;PAGE_TYPE=BlastDocs&amp;DOC_TYPE=ProgSelectionGuide\">databases<\/a>\u00a0before you attempt to install BLAST.\u00a0NB, throughout this document, I am using protein BLAST and protein input &#8211; changing to nucleotide sequences is trivial, as you just change blastp to blastn and &#8216;prot&#8217; to &#8216;nucl&#8217; in the obvious places (and of course you use different queries and target databases).<\/div>\n<div><\/div>\n<div>Without further ado, BLAST setup for UNIX.<\/div>\n<div><\/div>\n<div>There are two components for the installation:<\/div>\n<div><\/div>\n<div>\n<ol>\n<li>Executables (blastn, blastp etc.)<\/li>\n<li>Databases
(nr, nt etc.)<\/li>\n<\/ol>\n<\/div>\n<div>Both are described below with follow-up examples of usage.<\/div>\n<div><\/div>\n<div><strong>Ad.1<\/strong> The executables can be downloaded and compiled from\u00a0<a id=\"\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/books\/NBK52640\/\" target=\"_blank\">here<\/a>\u00a0(download the source, run .\/configure, then make and finally make install in the directory of the untarred file). However, a much easier way to do it under Ubuntu is:<\/div>\n<div><\/div>\n<div>\n<pre class=\"lang:sh decode:true\">sudo apt-get install ncbi-blast+<\/pre>\n<\/div>\n<div>This automatically installs everything. In both cases, to check that all went OK, type:<\/div>\n<div>\n<pre class=\"lang:sh decode:true\">which blastp<\/pre>\n<\/div>\n<div>If you get a path under a directory such as \/usr\/local\/bin, then all went well and that&#8217;s where your executables are.<\/div>\n<div><\/div>\n<div><strong>Ad.2<\/strong>\u00a0First, you need to decide where to store the databases. Do this by setting the environment variable:<\/div>\n<div><\/div>\n<div>export BLASTDB=\/path\/to\/blastdbs\/of\/your\/choosing<\/div>\n<div><\/div>\n<div>Now, we can either use one of the NCBI-curated databases or create our own. We will do both.<\/div>\n<div><\/div>\n<div>A) Downloading and using an NCBI-curated database.<\/div>\n<div><\/div>\n<div>The databases can be downloaded using the\u00a0<a id=\"\" href=\"http:\/\/www.ncbi.nlm.nih.gov\/BLAST\/docs\/update_blastdb.pl\" target=\"_blank\">update_blastdb<\/a>\u00a0script.
As an example, I will download the non-redundant protein database, which is referred to as &#8216;nr&#8217;:<\/div>\n<div><\/div>\n<div>\n<pre class=\"lang:sh decode:true\">cd $BLASTDB\r\nsudo update_blastdb --passive --timeout 300 --force --verbose nr\r\nls *.gz |xargs -n1 tar -xzvf\r\nrm *.gz*<\/pre>\n<p>The penultimate command extracts all the files you have downloaded and the last one removes the downloaded archives (and any accompanying checksum files).<\/p>\n<p>Now you should be able to use your new database by executing (where somesequence.fasta is your sample query):<\/p>\n<pre>blastp -db nr -query somesequence.fasta<\/pre>\n<p>Done.<\/p>\n<p><span style=\"line-height: 1.714285714;font-size: 1rem\">B) Creating your own database.<\/span><\/p>\n<p>Firstly, put a bunch of FASTA protein sequences into a file called sample.fa.<\/p>\n<p>Next, execute the following:<\/p>\n<pre>makeblastdb -in sample.fa -dbtype 'prot' -out NewDb\r\nmv NewDb* $BLASTDB\/<\/pre>\n<p>We have now created a BLAST protein database, called NewDb, from your FASTA file. The last line simply moves all the BLAST files to the database directory.<\/p>\n<p>Now you should be able to use your new database by executing (where somesequence.fasta is your sample query):<\/p>\n<pre class=\"lang:sh decode:true\">blastp -db NewDb -query somesequence.fasta<\/pre>\n<p>Done.<\/p>\n<p><strong>Afterword<\/strong><\/p>\n<p>These instructions are the shortest way I could find to get a working standalone BLAST application. If you require more info, you can look <a href=\"http:\/\/www.rocksclusters.org\/roll-documentation\/bio\/5.4\/blast_usage.html\">here<\/a>.<\/p>\n<p>&nbsp;<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Some people run into trouble trying to set up a standalone version of BLAST using the NCBI instructions. Here is a streamlined process, targeted at Ubuntu.
I assume that you are aware of the paradigms of blast, meaning that there are several executables for searching nucleic acids or proteins and there are different databases [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","wikipediapreview_detectlinks":true,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"ngg_post_thumbnail":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[14],"tags":[],"ppma_author":[482],"class_list":["post-1795","post","type-post","status-publish","format-standard","hentry","category-howto"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"authors":[{"term_id":482,"user_id":4,"is_guest":0,"slug":"konrad","display_name":"Konrad Krawczyk","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/fdb224fe7b0775e3c9a6956ae2a5ffd7c35ab8ce3ff99c5f6e0a51d45557cdd6?s=96&d=mm&r=g","author_category":"","user_url":"","last_name":"Scientist","first_name":"Lucky","job_title":"","description":""}],"_links":{"self":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/1795","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/comments?post=1795"}],"version-history":[{"count":8,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/1795\/revisions"}],"predecessor-version":[{"id":1805,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/1795\/revisions\/1805"}],"wp:attachment":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/
wp\/v2\/media?parent=1795"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/categories?post=1795"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/tags?post=1795"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1795"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}