Low complexity filtering

The server filters your query sequence for regions of low compositional complexity by default. Such regions commonly give spuriously high scores that reflect compositional bias rather than significant position-by-position alignment. Filtering can eliminate these potentially confounding matches (e.g., hits against proline-rich regions or poly-A tails) from the BLAST reports, leaving regions whose BLAST statistics reflect the specificity of their pairwise alignment. Queries searched with the blastn program are filtered with DUST; all other programs use SEG.
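
To give a rough feel for what "low compositional complexity" means, the sketch below scores a stretch of sequence by its Shannon entropy: a run of one repeated letter scores 0 bits, while an even mix of four letters scores 2 bits. This is only an illustration and is not the DUST or SEG algorithm; the function name shannon_entropy is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Return the Shannon entropy (in bits) of the letters in `window`.

    Illustrative measure only; DUST and SEG use their own scoring schemes.
    """
    counts = Counter(window)
    total = len(window)
    # Sum -p * log2(p) over the observed letter frequencies.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A poly-A run has a single letter, so its entropy is zero.
poly_a = shannon_entropy("AAAAAAAA")

# An even mix of four nucleotides reaches the 2-bit maximum for DNA.
mixed = shannon_entropy("ACGT")
```

A real filter would apply a score like this over sliding windows and flag the windows that fall below some threshold; the threshold and window size are choices this sketch does not attempt to make.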

Low complexity sequence found by a filter program is replaced with the letter "N" in nucleotide sequences (e.g., "NNNNNNNNNNNNN") and the letter "X" in protein sequences (e.g., "XXXXXXXXX"). Users may turn off filtering by using the "Filter" option on the "Advanced options for the BLAST server" page.
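
The substitution step can be sketched as follows. This is a minimal illustration, assuming the low complexity regions have already been located by some filter and are supplied as half-open (start, end) index pairs; the function name mask_regions and its parameters are hypothetical and do not correspond to any server option.

```python
def mask_regions(sequence, regions, is_protein=False):
    """Replace each flagged region with "X" (protein) or "N" (nucleotide).

    `regions` is a list of (start, end) pairs, 0-based, end exclusive.
    The regions themselves are assumed to come from a separate filter.
    """
    mask_char = "X" if is_protein else "N"
    chars = list(sequence)
    for start, end in regions:
        for i in range(start, end):
            chars[i] = mask_char
    return "".join(chars)


# Nucleotide query with a poly-A run at positions 4..12.
masked_dna = mask_regions("ACGTAAAAAAAAACGT", [(4, 13)])
# masked_dna == "ACGTNNNNNNNNNCGT"

# Protein query with a proline-rich stretch at positions 2..7.
masked_protein = mask_regions("MKPPPPPPLV", [(2, 8)], is_protein=True)
# masked_protein == "MKXXXXXXLV"
```

The masked sequence keeps its original length, so alignment coordinates in the report still refer to positions in the unfiltered query.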

