Blastn output format 6
See also:
- Blast documentation
- Blast on Metagenomics Wiki
Run blastn -help and check the -outfmt formatting options.
Prepare the database
Only needed if the reference sequences will be searched repeatedly; for a one-off comparison, blastn can read a FASTA file directly via -subject.
makeblastdb \
-in myRefSeqs.fasta \
-out "My/Folder/myRefSeqs_blastdb" \
-dbtype "nucl" \
-parse_seqids \
-logfile My/Folder/logFile.log
Run Blast
-num_alignments defaults to 250. Increase it if more alignments are expected
-outfmt sets the output format
blastn \
-task megablast \
-query InputSequence.fasta \
-db "My/Folder/myRefSeqs_blastdb" \
-out My/Output/Folder/InputSeq_vs_RefSeqs.res \
-outfmt 6 \
-num_alignments 10000000
Run blastn directly on two FASTA files (no database required):
blastn \
-query genes.fasta \
-subject genome.fasta \
-outfmt 6
Plain -outfmt 6 is equivalent to -outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore"
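The column list can also be spelled out and extended. The sketch below appends qlen, slen and qcovs (specifiers listed by blastn -help) to the 12 default columns. The variable names are illustrative, and the file names are reused from the example above on the assumption that they exist; the search only runs if blastn and both files are present.

```shell
# Default 12 columns plus three extra specifiers: query length, subject
# length, and query coverage per subject.
fields="qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen qcovs"
outfmt="6 $fields"

# Assumption: genes.fasta and genome.fasta are the files from the example above.
if command -v blastn >/dev/null 2>&1 && [ -f genes.fasta ] && [ -f genome.fasta ]; then
  blastn \
    -query genes.fasta \
    -subject genome.fasta \
    -outfmt "$outfmt"
fi
```

The format number and the column names must be passed together as one quoted argument, which is why the string is held in a single variable.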
Output
Column headers:
qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
In R:
c("qseqid","sseqid", "pident", "length",
"mismatch", "gapopen", "qstart", "qend",
"sstart", "send", "evalue", "bitscore")
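Format 6 output contains no header line, so the column names have to be added by hand if a downstream tool expects them. A minimal shell sketch, using a made-up one-line result file (example.res) in place of real blastn output:

```shell
# Hypothetical result file standing in for real blastn -outfmt 6 output.
printf 'geneA\tchr1\t98.50\t200\t3\t0\t1\t200\t1000\t1199\t1e-100\t360\n' > example.res

# Build a tab-separated header from the 12 default column names.
header=$(echo "qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore" | tr ' ' '\t')

# Write the header followed by the unchanged hits to a new file.
{ echo "$header"; cat example.res; } > example_with_header.tsv
```

Writing to a new file leaves the original result untouched, which is useful if other tools expect the headerless form.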
| Col | name | Description |
|---|---|---|
| 1. | qseqid | query (e.g., unknown gene) sequence id |
| 2. | sseqid | subject (e.g., reference genome) sequence id |
| 3. | pident | percentage of identical matches |
| 4. | length | alignment length (sequence overlap) |
| 5. | mismatch | number of mismatches |
| 6. | gapopen | number of gap openings |
| 7. | qstart | start of alignment in query |
| 8. | qend | end of alignment in query |
| 9. | sstart | start of alignment in subject |
| 10. | send | end of alignment in subject |
| 11. | evalue | expect value |
| 12. | bitscore | bit score |
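Because the columns are tab-separated and in a fixed order, they can be addressed by position with awk. The sketch below assumes the default 12 columns and uses made-up sample hits. The helper names filter_hits and add_strand are hypothetical, and the thresholds are only examples. For blastn, a hit on the minus strand of the subject is reported with sstart greater than send, which the second helper uses.

```shell
# Made-up sample hits in the default 12-column layout.
printf 'geneA\tchr1\t98.50\t200\t3\t0\t1\t200\t1000\t1199\t1e-100\t360\n' > example_hits.tsv
printf 'geneA\tchr2\t85.00\t120\t18\t0\t1\t120\t500\t619\t1e-20\t100\n' >> example_hits.tsv
printf 'geneB\tchr1\t99.00\t300\t3\t0\t1\t300\t5300\t5001\t1e-150\t540\n' >> example_hits.tsv

# Keep hits with pident (column 3) >= $2 and evalue (column 11) <= $3.
filter_hits() {
  awk -F'\t' -v min_id="$2" -v max_e="$3" \
    '($3 + 0) >= (min_id + 0) && ($11 + 0) <= (max_e + 0)' "$1"
}

# Append a 13th column: "plus" if sstart <= send, otherwise "minus".
add_strand() {
  awk -F'\t' -v OFS='\t' \
    '{ print $0, (($9 + 0) <= ($10 + 0) ? "plus" : "minus") }' "$1"
}

filter_hits example_hits.tsv 90 1e-50
add_strand example_hits.tsv
```

Adding 0 to each field forces a numeric comparison, so e-values written in scientific notation are compared as numbers rather than as strings.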