BLAST Search parameter E-value

Expect value

The E-value, or Expect value, is a measure of the statistical significance of an alignment found in a homology search. Each BLAST hit is assigned an E-value that indicates how significant the hit is: the lower the E-value, the more significant the hit.

The Expect value, as the name suggests, is the number of hits with a score at least as good that you would expect to find purely by chance, given the query sequence and the size of the sequence database being searched. An Expect value of 1.0 therefore means that you would, on average, get one such match in the database for the submitted query simply by chance. In other words, a hit with an E-value around 1.0 does not necessarily indicate a specific homology with the hit sequence; it could well be a chance hit.

In general, hits should have E-values well below 1.0, in the range of 0.01 to 0.1 or lower, in order to be taken seriously. However, for very short sequences such as oligonucleotides, it may be necessary to increase the E-value threshold to 10 or 100 in order to find imperfect homologies.

The Expect value is calculated in this way:

E = K * m * n * e^(-lambda * S)

where K and lambda are constants that depend on the scoring matrix used, m is the query length, n is the database length (the total number of residues in the database), and S is the alignment score.
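The formula can be evaluated directly. The following is a minimal Python sketch, not BLAST's own implementation: the function name expect_value is hypothetical, and the example values of K and lambda are illustrative assumptions only. The constants that apply to a real search are reported in the BLAST output and should be taken from there.

```python
import math


def expect_value(score, query_len, db_len, k, lam):
    """Return E = K * m * n * e^(-lambda * S).

    score     -- raw alignment score S
    query_len -- query length m
    db_len    -- database length n (total number of residues)
    k, lam    -- the constants K and lambda for the scoring system
    """
    return k * query_len * db_len * math.exp(-lam * score)


# Illustrative assumptions, not values taken from a real search.
K_EXAMPLE = 0.041
LAMBDA_EXAMPLE = 0.267

e = expect_value(score=100, query_len=250, db_len=100_000_000,
                 k=K_EXAMPLE, lam=LAMBDA_EXAMPLE)
print(f"E-value: {e:.3g}")
```

Because the score sits in the exponent, a modest increase in S lowers the E-value far more than a comparable change in m or n raises it.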

The E-value is directly proportional to both the query length and the database length, so the same alignment score automatically gets a larger E-value in a larger database. It is therefore very important to adjust the E-value for the database length if you want to compare E-values between searches against different databases.
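Since the E-value scales with n, an E-value obtained against one database can be converted to the value the same hit would have against a database of a different length. The sketch below assumes the same query and the same scoring system (so K, lambda, m and S are unchanged) and uses the simple formula given above; the function name rescale_e_value is hypothetical.

```python
def rescale_e_value(e_value, old_db_len, new_db_len):
    """Convert an E-value to what it would be for another database length.

    Assumes the same query, score and scoring system, so that only n
    changes in E = K * m * n * e^(-lambda * S).
    """
    if old_db_len <= 0 or new_db_len <= 0:
        raise ValueError("database lengths must be positive")
    return e_value * new_db_len / old_db_len


# A hit found in a database of 1 million residues, re-expressed for
# a database of 1 billion residues: the E-value grows 1000-fold.
print(rescale_e_value(0.001, 1_000_000, 1_000_000_000))
```

After rescaling both searches to a common database length, the E-values can be compared directly.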

BLAST uses a threshold to remove hits from the output whose E-value is above a certain value. The default threshold is 10.0. Note that this may remove perfect matches if the database is very large and the query sequence is short (such as oligos), since the maximum score for a short sequence is limited.
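The effect on short queries can be estimated by solving the E-value formula for the score needed to stay at or below a given threshold: S = ln(K * m * n / E) / lambda. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, the constants K and lambda are example values, the scoring is assumed to give +1 per matching base (so a perfect 20-mer match scores 20), and the database length of 10^12 bases is made up for the example.

```python
import math


def min_score_for_threshold(threshold, query_len, db_len, k, lam):
    """Smallest raw score S whose E-value does not exceed the threshold.

    Obtained by solving E = K * m * n * e^(-lambda * S) for S.
    """
    return math.log(k * query_len * db_len / threshold) / lam


# Illustrative assumptions, not values taken from a real search.
K_EXAMPLE = 0.711
LAMBDA_EXAMPLE = 1.374
OLIGO_LEN = 20                  # query length m
DB_LEN = 10**12                 # hypothetical, very large database
MAX_SCORE = OLIGO_LEN * 1       # perfect match at +1 per base (assumed)

needed = min_score_for_threshold(10.0, OLIGO_LEN, DB_LEN,
                                 K_EXAMPLE, LAMBDA_EXAMPLE)
print(f"score needed: {needed:.2f}, best possible score: {MAX_SCORE}")
if MAX_SCORE < needed:
    print("Even a perfect match would be removed at threshold 10.0")
```

Under these assumptions the required score is slightly higher than the best score the oligo can reach, which is why raising the threshold can be necessary for short queries.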