Adding a BLAST Database to Cartwheel

Problem: You want to be able to BLAST against sequences in a specific database -- either a home-grown database of sequences, or something you downloaded from NCBI or EBI, or something else. This database isn't available through your Cartwheel server.

Solution: Add a BLAST database to your Cartwheel server.

Note: you must have PostgreSQL administrator access to do this.

This is a three-step process.

1. Obtain the sequence database & format it.

Download, construct, or otherwise obtain the database of sequences. The sequences must be in FASTA format.

Then format them for BLAST using formatdb. If they're protein sequences, do

formatdb -i dbname -o T -p T

If they're DNA sequences, do

formatdb -i dbname -o T -p F

NCBI BLAST does not support databases containing both protein and DNA sequences.

Check 'formatdb.log' for errors and warnings.

Note that after this step, you can delete the original 'dbname' FASTA file; the formatted 'dbname.*' files are what you need from here on.
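As a rough sketch of step 1, the helpers below check that a file looks like FASTA, print the formatdb command for the chosen sequence type, and pull suspicious lines out of formatdb.log. The function names are made up for illustration, and the log check assumes that problems show up on lines containing the words "error" or "warning", which may not cover every message formatdb writes.

```shell
# Hypothetical helpers for step 1; none of these names come from Cartwheel.

# Succeeds if the first non-blank line of the file starts with '>',
# which is what a FASTA header line looks like.
looks_like_fasta() {
    first=$(grep -v '^[[:space:]]*$' "$1" | head -n 1)
    case "$first" in
        '>'*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Prints the formatdb command for a FASTA file and a sequence type
# ('protein' or 'DNA'), using the same flags as the text above.
formatdb_command() {
    case "$2" in
        protein) pflag=T ;;
        DNA)     pflag=F ;;
        *) echo "unknown sequence type: $2" >&2; return 1 ;;
    esac
    printf 'formatdb -i %s -o T -p %s\n' "$1" "$pflag"
}

# Prints log lines mentioning an error or warning (assumed wording).
# Returns non-zero when no such line is found.
formatdb_log_problems() {
    grep -i -E 'error|warning' "$1"
}
```

For example, `looks_like_fasta dbname && formatdb_command dbname DNA` prints the command to run, and `formatdb_log_problems formatdb.log` can be run afterwards to review the log.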

2. Put it in the right place.

formatdb will leave you with a bunch of 'dbname.*' files. You need to move them (or link them) into the directory where Cartwheel expects to find them. That directory is given by the blastdb= setting in the [blast] section of your Cartwheel config.rc file. If that setting is empty, check the $BLASTDB environment variable.
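A minimal sketch of the move, assuming you already know the target directory (from config.rc or $BLASTDB). The function name install_blast_files is hypothetical; it moves every 'dbname.*' file and refuses to continue if the target directory is missing or no matching files exist.

```shell
# Hypothetical helper: install_blast_files SRC_DIR DBNAME DEST_DIR
# Moves SRC_DIR/DBNAME.* into DEST_DIR.
install_blast_files() {
    src=$1
    db=$2
    dest=$3
    if [ ! -d "$dest" ]; then
        echo "no such directory: $dest" >&2
        return 1
    fi
    found=0
    for f in "$src/$db".*; do
        # If the glob matched nothing, it is left as a literal string; skip it.
        [ -e "$f" ] || continue
        mv "$f" "$dest/" || return 1
        found=1
    done
    if [ "$found" -ne 1 ]; then
        echo "no $db.* files found in $src" >&2
        return 1
    fi
}
```

It could be called as `install_blast_files . dbname "$BLASTDB"`. Replace mv with `ln -s` if you would rather link the files than move them; in that case pass an absolute source directory so the links resolve.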

3. Add it into the Cartwheel database.

You now need to tell Cartwheel that this BLAST database exists by inserting it into the blast_database_entries table. Use 'psql <postgresql database name>' to get a command prompt for your PostgreSQL database. Then do something like

INSERT INTO blast_database_entries (db_type, filename, name) VALUES ('DNA', 'dbname', 'A New Database');

Replace 'DNA' with 'protein' as appropriate. 'dbname' should be the filename you ran 'formatdb' on, and 'A New Database' is the name that will appear in the pull-down menu in the Web interface.
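If you add databases often, the statement can be generated rather than typed. The sketch below builds the INSERT from three arguments and doubles any single quotes in them so that a display name like "Bob's genomes" does not break the SQL. The function names are hypothetical; the table and column names are the ones given above.

```shell
# Hypothetical helpers for step 3.

# Wraps a value in single quotes for SQL, doubling embedded single quotes.
sql_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

# Prints the INSERT statement for one BLAST database.
# Usage: blast_insert_sql DB_TYPE FILENAME DISPLAY_NAME
blast_insert_sql() {
    case "$1" in
        DNA|protein) ;;
        *) echo "db_type must be 'DNA' or 'protein'" >&2; return 1 ;;
    esac
    printf 'INSERT INTO blast_database_entries (db_type, filename, name) VALUES (%s, %s, %s);\n' \
        "$(sql_quote "$1")" "$(sql_quote "$2")" "$(sql_quote "$3")"
}
```

The output can be reviewed first and then piped into psql, for example `blast_insert_sql DNA dbname 'A New Database' | psql <postgresql database name>`.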

Ta-da!

Note that you may have to restart the Web server in order to get it to recognize the new database, but the batchqueue system should recognize it immediately.

--titus Mar 25, 2006.