[ Team LiB ] |
10.2 WU-BLAST Installation

Obtaining WU-BLAST software is slightly more complicated than obtaining NCBI-BLAST because it requires a license from Washington University in St. Louis. If you are affiliated with an academic institution or a nonprofit organization, the license is free. If you are part of a for-profit enterprise, you must pay a licensing fee. The price is high by shrink-wrapped software standards, but is similar to that of other bioinformatics software packages available from universities. If you find the cost prohibitive, an earlier version of WU-BLAST is available for free. The free version contains fewer features and is available for a limited number of operating systems, but for most people, it works just fine. If your operating system isn't supported and your specific use doesn't require gapped alignment, a free version of the classic, ungapped BLAST with public domain source code also exists. This older version, 1.4.9, is nearly identical to NCBI-BLAST Version 1.4, which is no longer available from the NCBI.

Should you wish to license WU-BLAST or download the free versions, visit the official site for the WU-BLAST software at http://blast.wustl.edu. The free versions can be downloaded with a couple of clicks, but more patience is required for the licensed version. After the license is issued, you will be sent a user-specific URL from which to download the software. It's a good idea to save this information because you will use it again to download the free updates. Licensed users are notified by email as new features are added (usually a few times per year).

WU-BLAST is available only for Unix operating systems. If you don't have access to a Unix computer, you can run Linux or FreeBSD under a virtual machine with products such as VMWare (http://www.vmware.com) or VirtualPC (http://www.connectix.com).

10.2.1 Expanding the tarball

The software comes as a compressed Unix archive, or tarball.
First, create a directory such as /usr/local/pkg/wu-blast; if you don't have root access, create a wu-blast directory inside your home directory. Next, download the tarball to that directory. If you do this from a browser, the files may be extracted automatically. If not, use the following command, where your_platform_name will be something like linuxi686.tar.Z:

    tar -xzf blast2.your_platform_name

Not all versions of tar support the -z option, in which case you can use the following command line instead:

    zcat blast2.your_platform_name | tar -xf -

Before you continue with the rest of the installation procedures, look at what's inside the tarball.

10.2.2 Files and Directories

The tarball contains a number of files and two subdirectories. The most important items are described very briefly in Table 10-2 in logical, rather than alphabetical, order. See the WU-BLAST reference in Chapter 14 for more information.
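The unpacking step from Section 10.2.1 can be wrapped in a small shell function that tries tar's -z option first and falls back to the zcat pipeline. This is only a sketch: unpack_tarball is a hypothetical helper name, not something shipped with WU-BLAST, and the archive and directory names are whatever you pass in. It assumes that your tar accepts "-f -" to read the archive from standard input.

```shell
#!/bin/sh
# Sketch only: unpack_tarball is a hypothetical helper, not part of WU-BLAST.
# Usage: unpack_tarball ARCHIVE DEST_DIR
unpack_tarball() {
    tarball=$1   # path to the downloaded archive (relative or absolute)
    dest=$2      # directory that should receive the files

    mkdir -p "$dest" || return 1

    # The archive is fed on standard input, so a relative path still
    # works after the cd into the destination directory.
    # First attempt: let tar decompress the archive itself (-z).
    if (cd "$dest" && tar -xzf -) < "$tarball" 2>/dev/null; then
        return 0
    fi

    # Fallback for tar versions without -z: decompress with zcat and
    # pipe the result into tar.
    (cd "$dest" && zcat | tar -xf -) < "$tarball"
}
```

A call might look like `unpack_tarball blast2.your_platform_name "$HOME/wu-blast"`, with the placeholder replaced by the name of the file you actually downloaded.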
10.2.3 Executables

Let's assume the tarball has been downloaded to /usr/pkg/wu-blast, and you normally keep your executables in /usr/local/bin. Issue the following commands to put the executables in your path.

    ln -s /usr/pkg/wu-blast/blasta /usr/local/bin/blastn
    ln -s /usr/pkg/wu-blast/blasta /usr/local/bin/blastp
    ln -s /usr/pkg/wu-blast/blasta /usr/local/bin/blastx
    ln -s /usr/pkg/wu-blast/blasta /usr/local/bin/tblastn
    ln -s /usr/pkg/wu-blast/blasta /usr/local/bin/tblastx
    ln -s /usr/pkg/wu-blast/xdformat /usr/local/bin
    ln -s /usr/pkg/wu-blast/xdget /usr/local/bin

Note that, unlike the NCBI program blastall, blasta cannot be executed by its own name, but only through aliases.

10.2.4 Environment Variables

You'll need to set three environment variables: BLASTDB, BLASTMAT, and BLASTFILTER. These variables correspond to the locations of the databases, scoring matrices, and complexity filters.

WU-BLAST environment variables use a colon-delimited list of locations, like the PATH variable. This is especially useful for database files, which can be placed in several locations in the filesystem and then be accessed by name rather than by explicit path. It allows computers to access databases on a networked server or on their local disks in a way that is invisible to the user.

blasta looks for databases in the locations listed in the BLASTDB environment variable. If BLASTDB isn't set, blasta looks in the current directory and in /usr/ncbi/blast/db. In those cases, FASTA databases of the same name (or symbolic links to such databases) must be present. It's generally a better idea to use the BLASTDB variable because this strategy uses less disk space and is much less confusing.

Two environment variables, BLASTMAT and BLASTFILTER, must be set so blasta can find the scoring matrices and complexity filters.
These variables also use colon-delimited lists, but there's little reason to have them in more than one location. Now set the BLASTMAT and BLASTFILTER environment variables to the explicit paths of the matrix and filter directories (we'll assume that the software was unpacked in /usr/local/wu-blast). Here's how to do so in csh and its derivatives:

    setenv BLASTMAT /usr/local/wu-blast/matrix
    setenv BLASTFILTER /usr/local/wu-blast/filter

And in sh and its derivatives:

    BLASTMAT=/usr/local/wu-blast/matrix
    BLASTFILTER=/usr/local/wu-blast/filter
    export BLASTMAT BLASTFILTER

10.2.5 Setting Resource Limits with /etc/sysblast

WU-BLAST has a special file called /etc/sysblast that sets systemwide resource limitations for each machine running BLAST jobs. The /etc/sysblast file currently supports three commands: nice, cpus, and cpusmax. The nice value gives BLAST processes a lower priority (nice values are generally in the range of 1 to 20, with 20 being the least demanding). If the computer is also used for other jobs, such as a workstation, setting this to 5 makes the workstation more responsive, while the BLAST job still takes over at idle times. The cpus value is the default number of CPUs to use, and cpusmax defines the maximum number of CPUs allowed. These two should be set on any large, multiprocessor machine. Here is a sample /etc/sysblast file:

    nice = 5
    cpus = 1
    cpusmax = 4

The behavior of WU-BLAST on multiprocessor systems is worth discussing, and if you're one of the lucky people who have access to a computer with 16 processors or more, /etc/sysblast will definitely help you. WU-BLAST lets users control the number of CPUs with the -cpus command-line parameter. If this parameter isn't given an explicit value, the programs use all the processors in the computer (except for BLASTN, which reins itself in at four processors). While this may be good for BLAST users, your other users may not be so happy.
This is where the /etc/sysblast file is critical because it allows you to modify the default behavior and set limits for CPU usage.
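As a rough illustration of the sample file above, the following shell function writes the three settings in the same "name = value" layout. It is a sketch under stated assumptions: write_sysblast is a hypothetical helper, not a WU-BLAST tool; the check that cpus does not exceed cpusmax is a sanity test added here rather than documented WU-BLAST behavior; and the output format simply mirrors the sample, since the exact parsing rules are not described in this section.

```shell
#!/bin/sh
# Sketch only: write_sysblast is a hypothetical helper, not part of WU-BLAST.
# Usage: write_sysblast FILE NICE CPUS CPUSMAX
write_sysblast() {
    file=$1
    nice=$2
    cpus=$3
    cpusmax=$4

    # Assumption: a default CPU count above the allowed maximum is
    # treated as a configuration mistake and rejected.
    if [ "$cpus" -gt "$cpusmax" ]; then
        echo "cpus ($cpus) exceeds cpusmax ($cpusmax)" >&2
        return 1
    fi

    # Same "name = value" layout as the sample /etc/sysblast file.
    printf 'nice = %s\ncpus = %s\ncpusmax = %s\n' \
        "$nice" "$cpus" "$cpusmax" > "$file"
}
```

Writing to a scratch file first, for example `write_sysblast ./sysblast.sample 5 1 4`, lets you review the result before copying it to /etc/sysblast as root.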