Blastn output format 6

Prepare the database

Building a database is only needed if the reference sequences will be searched repeatedly.

makeblastdb \
  -in myRefSeqs.fasta \
  -out "My/Folder/myRefSeqs_blastdb" \
  -dbtype "nucl" \
  -parse_seqids \
  -logfile My/Folder/logFile.log

Run Blast

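-num_alignments defaults to 250; raise it if more alignments are expected. For tabular output (-outfmt > 4), the BLAST+ documentation recommends -max_target_seqs for this purpose instead.
-outfmt sets the output format (6 = tabular).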

blastn \
  -task megablast \
  -query InputSequence.fasta \
  -db "My/Folder/myRefSeqs_blastdb" \
  -out My/Output/Folder/InputSeq_vs_RefSeqs.res \
  -outfmt 6 \
  -num_alignments 10000000

Run BLAST directly on two FASTA files (query against subject, no database needed):

blastn \
  -query genes.fasta \
  -subject genome.fasta \
  -outfmt 6

-outfmt 6 without a column list is equivalent to -outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore"

Output

Column headers:
qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
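
Format 6 output has no header line, so it can be convenient to prepend one before opening the file elsewhere. The following is a minimal shell sketch under stated assumptions: the file name reuses the one from the blastn command above, and the two hits written at the top are made-up rows that stand in for real output.

```shell
# Assumed results file; the two rows below are made-up stand-ins for blastn output.
res="InputSeq_vs_RefSeqs.res"
printf 'gene1\tchr1\t98.50\t200\t3\t0\t1\t200\t1000\t1199\t1e-100\t350\n' > "$res"
printf 'gene2\tchr2\t96.00\t150\t6\t0\t5\t154\t800\t651\t2e-60\t240\n' >> "$res"

header="qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore"

# Convert the spaces in the header to tabs, then write header + hits to a new file.
{ printf '%s\n' "$header" | tr ' ' '\t'; cat "$res"; } > "${res%.res}.tsv"

# Show the first line to confirm the header is in place.
head -n 1 "${res%.res}.tsv"
```

Writing to a new file (here with a .tsv extension) leaves the raw blastn output untouched.
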
In R:

c("qseqid","sseqid", "pident", "length", 
  "mismatch", "gapopen", "qstart", "qend", 
  "sstart", "send", "evalue", "bitscore")

Col  Name      Description
1    qseqid    query (e.g., unknown gene) sequence ID
2    sseqid    subject (e.g., reference genome) sequence ID
3    pident    percentage of identical matches
4    length    alignment length (sequence overlap)
5    mismatch  number of mismatches
6    gapopen   number of gap openings
7    qstart    start of alignment in query
8    qend      end of alignment in query
9    sstart    start of alignment in subject
10   send      end of alignment in subject
11   evalue    expect value
12   bitscore  bit score

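
Because the columns have fixed positions, the results can be filtered with standard shell tools. The sketch below is one possible approach, not a prescribed workflow: the file names, the thresholds (95 % identity, 100 bp) and the three hits are made up for illustration. It relies on the column order listed above and on the blastn convention that a minus-strand hit is reported with sstart greater than send.

```shell
# Made-up hits in the format 6 layout (tab-separated), for illustration only.
hits="example_hits.tsv"
printf 'gene1\tchr1\t98.50\t200\t3\t0\t1\t200\t1000\t1199\t1e-100\t350\n' > "$hits"
printf 'gene1\tchr2\t90.00\t100\t10\t0\t1\t100\t500\t599\t1e-30\t120\n' >> "$hits"
printf 'gene2\tchr2\t96.00\t150\t6\t0\t5\t154\t800\t651\t2e-60\t240\n' >> "$hits"

# Keep hits with pident (column 3) >= 95 and alignment length (column 4) >= 100.
awk -F'\t' '$3 >= 95 && $4 >= 100' "$hits" > filtered.tsv

# Best hit per query: sort by qseqid, then by bitscore (column 12) descending,
# and keep only the first line seen for each qseqid.
sort -t "$(printf '\t')" -k1,1 -k12,12nr "$hits" | awk -F'\t' '!seen[$1]++' > best_hits.tsv

# Strand of each hit: sstart (column 9) > send (column 10) means minus strand.
awk -F'\t' -v OFS='\t' '{ print $1, $2, ($9 > $10 ? "minus" : "plus") }' "$hits" > strands.tsv
```

Sorting on the bit score rather than the e-value avoids having to compare numbers in exponent notation, which plain numeric sort does not handle.
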
Pascal GP Martin
Senior Research Specialist (IR1) at INRAE

Research Specialist in Genomics and Bioinformatics