Blastn output format 6

Prepare the database

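This step is only needed if the reference sequences will be searched repeatedly. For a one-off comparison, the FASTA file can be passed directly with -subject (see "Run blast with 2 fasta sequences").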

makeblastdb \
  -in myRefSeqs.fasta \
  -out "My/Folder/myRefSeqs_blastdb" \
  -dbtype "nucl" \
  -parse_seqids \
  -logfile My/Folder/logFile.log

Run Blast

-num_alignments defaults to 250. Increase it if more alignments are expected.
-outfmt sets the output format; 6 is tab-separated text without a header line.

blastn \
  -task megablast \
  -query InputSequence.fasta \
  -db "My/Folder/myRefSeqs_blastdb" \
  -out My/Output/Folder/InputSeq_vs_RefSeqs.res \
  -outfmt 6 \
  -num_alignments 10000000

Run blast with 2 fasta sequences, without building a database:

blastn \
  -query genes.fasta \
  -subject genome.fasta \
  -outfmt 6

When no column list is given, -outfmt 6 is equivalent to -outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore"

Output

Column headers:
qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore

In R:

c("qseqid","sseqid", "pident", "length", 
  "mismatch", "gapopen", "qstart", "qend", 
  "sstart", "send", "evalue", "bitscore")
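
Because format 6 output carries no header line, the column names have to be added by hand if a self-describing file is wanted. The following is a minimal shell sketch of one way to do that. The file names hits.tsv and hits_with_header.tsv and the two sample hits are hypothetical stand-ins for real blastn output.

```shell
# The 12 default columns of -outfmt 6, in order.
header="qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore"

# Hypothetical two-hit result file standing in for real blastn output.
printf 'gene1\tchr1\t98.50\t200\t3\t0\t1\t200\t1001\t1200\t1e-100\t350\n' > hits.tsv
printf 'gene2\tchr2\t85.00\t120\t18\t0\t5\t124\t500\t619\t2e-30\t110\n' >> hits.tsv

# Write the header (spaces converted to tabs), then append the hits.
{ printf '%s\n' "$header" | tr ' ' '\t'; cat hits.tsv; } > hits_with_header.tsv
```

The resulting file can then be read by any tool that expects a tab-separated table with a header row.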

Column  Name      Description
1       qseqid    query (e.g., unknown gene) sequence id
2       sseqid    subject (e.g., reference genome) sequence id
3       pident    percentage of identical matches
4       length    alignment length (sequence overlap)
5       mismatch  number of mismatches
6       gapopen   number of gap openings
7       qstart    start of alignment in query
8       qend      end of alignment in query
9       sstart    start of alignment in subject
10      send      end of alignment in subject
11      evalue    expect value
12      bitscore  bit score

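
The column positions above are enough to post-process a result file with standard tools. The following is a minimal sketch using awk that keeps hits above an identity threshold and below an e-value threshold, then retains the highest-bitscore hit per query. The thresholds (90 and 1e-10), the file names, and the sample hits are illustrative assumptions, not recommended values.

```shell
# Hypothetical hits in the default -outfmt 6 layout (tab-separated, no header).
printf 'gene1\tchr1\t98.50\t200\t3\t0\t1\t200\t1001\t1200\t1e-100\t350\n' > hits.tsv
printf 'gene1\tchr3\t91.00\t150\t13\t0\t1\t150\t40\t189\t1e-50\t180\n' >> hits.tsv
printf 'gene2\tchr2\t85.00\t120\t18\t0\t5\t124\t500\t619\t2e-30\t110\n' >> hits.tsv

# Keep hits with pident (column 3) >= 90 and evalue (column 11) <= 1e-10.
awk -F '\t' '$3 >= 90 && $11 <= 1e-10' hits.tsv > filtered.tsv

# For each query (column 1), keep the hit with the highest bitscore (column 12).
# awk's for-in order is unspecified, so the output is sorted afterwards.
awk -F '\t' '
  !($1 in best) || $12 + 0 > best[$1] { best[$1] = $12 + 0; line[$1] = $0 }
  END { for (q in line) print line[q] }
' filtered.tsv | sort > best_hits.tsv
```

Ties on bitscore keep the first hit encountered. A different tie-breaking rule, such as the lowest e-value, would need an extra condition.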
Pascal GP Martin
Senior Research Specialist (IR1) at INRAE

Research Specialist in Genomics and Bioinformatics
