Description of Graphs
Below is a detailed description of each graph. Note that the characters A, C, G, and T in the formulas represent the number of corresponding nucleotides in a window.
DNA Flexibility — Identifies regions of high DNA helix flexibility in a DNA sequence. The average flexibility value (Threshold) in a window is calculated using the following formula:
[ \frac{\text{sum of flexibility angles in the window}}{\text{window size} - 1} ]
For more detailed information, see the DNA Flexibility section.
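As a rough illustration of the averaging step only, here is a minimal Python sketch. It assumes that each flexibility angle belongs to a dinucleotide step, so a window of n bases contributes n - 1 angles, which would explain the denominator above. The function name `average_flexibility` and the `angles` lookup table are hypothetical, and the numbers in the usage lines are placeholders, not real flexibility angles.

```python
def average_flexibility(window: str, angles: dict) -> float:
    """Average angle over the dinucleotide steps of a window.

    `angles` maps a dinucleotide such as "AT" to its flexibility angle.
    Assumption: steps missing from the table contribute 0.
    """
    w = window.upper()
    if len(w) < 2:
        return 0.0  # assumption: a window with no steps reports 0
    total = sum(angles.get(w[i:i + 2], 0.0) for i in range(len(w) - 1))
    return total / (len(w) - 1)


# Placeholder values for illustration only, NOT real flexibility angles.
toy_angles = {"AA": 1.0, "AT": 2.0, "TA": 3.0}
toy_average = average_flexibility("AATA", toy_angles)  # steps: AA, AT, TA
```

The real angle table is described in the DNA Flexibility section and would replace `toy_angles`.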
GC Content (%) — Shows the percentage of nitrogenous bases in a window that are either guanine or cytosine. It is calculated using the formula:
[ \frac{(G+C)}{(A+G+C+T)} \times 100 ]
AG Content (%) — Shows the percentage of nitrogenous bases in a window that are either adenine or guanine. It is calculated using the formula:
[ \frac{(A+G)}{(A+G+C+T)} \times 100 ]
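The GC and AG content formulas differ only in which bases are counted in the numerator, so one helper can serve both. The following Python sketch is an illustration rather than the tool's actual code: the names `base_percentage` and `sliding_windows`, the window and step sizes, and the choice to return 0 for a window without any A, C, G, or T are all assumptions.

```python
from collections import Counter


def base_percentage(window: str, bases: str) -> float:
    """Percentage of A/C/G/T characters in `window` that are in `bases`."""
    counts = Counter(window.upper())
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        return 0.0  # assumption: no countable bases means 0 %
    return 100.0 * sum(counts[b] for b in bases) / total


def sliding_windows(seq: str, window: int, step: int):
    """Yield (start, subsequence) for each full window along `seq`."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


seq = "ATGCGCGCTTAAGGCC"
gc_content = [(s, base_percentage(w, "GC")) for s, w in sliding_windows(seq, 8, 4)]
ag_content = [(s, base_percentage(w, "AG")) for s, w in sliding_windows(seq, 8, 4)]
```

Each list holds one (window start, percentage) pair per window, which is the data a graph of this kind would plot.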
GC Frame Plot — This graph is similar to the GC content graph but shows the GC content of the first, second, and third codon positions independently. It is most effective for organisms with GC-rich genomic sequences but works on any microbial sequence.
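A frame plot of this kind can be sketched in Python by splitting the bases of a window into three groups according to their position modulo 3. This is a minimal sketch under stated assumptions: the function name `gc_by_codon_position` is hypothetical, positions are counted from the start of the whole sequence (index 0 is position one), and an empty group reports 0.

```python
def gc_by_codon_position(seq: str, start: int, window: int) -> list:
    """GC percentage for each of the three positions inside one window.

    Assumption: the frame is fixed by the absolute index in `seq`,
    so index i belongs to position (i % 3) + 1.
    """
    end = min(start + window, len(seq))
    result = []
    for frame in range(3):
        bases = [seq[i].upper() for i in range(start, end) if i % 3 == frame]
        gc = sum(1 for b in bases if b in "GC")
        result.append(100.0 * gc / len(bases) if bases else 0.0)
    return result


# In "GATGATGAT" every first position is G, so only that frame is GC-rich.
frames = gc_by_codon_position("GATGATGAT", 0, 9)
```

Plotting the three values per window as separate curves gives the three lines of the graph.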
GC Deviation (G-C)/(G+C) — Shows the difference between the “G” content of the forward strand and that of the reverse strand. GC Deviation is calculated using the formula:
[ \frac{(G-C)}{(G+C)} ]
AT Deviation (A-T)/(A+T) — Shows the difference between the “A” content of the forward strand and that of the reverse strand. AT Deviation is calculated using the formula:
[ \frac{(A-T)}{(A+T)} ]
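Both deviation formulas have the same shape, (X - Y)/(X + Y), applied to the counts in a window. The Python sketch below illustrates this; the function name `deviation` is hypothetical, and returning 0 when neither base occurs in the window is an assumption made to avoid division by zero.

```python
def deviation(window: str, x: str, y: str) -> float:
    """Compute (X - Y) / (X + Y) from the counts of bases x and y."""
    w = window.upper()
    nx, ny = w.count(x), w.count(y)
    if nx + ny == 0:
        return 0.0  # assumption: undefined ratio is reported as 0
    return (nx - ny) / (nx + ny)


gc_dev = deviation("GGGC", "G", "C")  # (3 - 1) / (3 + 1)
at_dev = deviation("AATT", "A", "T")  # (2 - 2) / (2 + 2)
```

The result ranges from -1 (only Y present) to +1 (only X present).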
Karlin Signature Difference — Represents the dinucleotide absolute relative abundance difference between the whole sequence and a sliding window. Let:
[ f(XY) = \text{frequency of the dinucleotide XY} ]
[ f(X) = \text{frequency of the nucleotide X} ]
[ p(XY) = \frac{f(XY)}{f(X) \times f(Y)} ]
[ p_{seq}(XY) = p(XY) \text{ for the whole sequence} ]
[ p_{win}(XY) = p(XY) \text{ for a window} ]
The Karlin Signature Difference for a window is taken over all 16 dinucleotides XY and is calculated using the formula:
[ \frac{\sum_{XY} |p_{seq}(XY) - p_{win}(XY)|}{16} ]
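The definitions above can be sketched in Python as follows. This is an illustrative sketch, not the tool's implementation: the names `relative_abundance` and `karlin_difference` are hypothetical, dinucleotides are counted on the given strand only, each difference is taken as an absolute value (as the word "absolute" in the description suggests), and p(XY) is set to 0 when either nucleotide is absent.

```python
from itertools import product

BASES = "ACGT"


def relative_abundance(seq: str) -> dict:
    """Return p(XY) = f(XY) / (f(X) * f(Y)) for all 16 dinucleotides."""
    s = seq.upper()
    n = len(s)
    if n < 2:
        return {x + y: 0.0 for x, y in product(BASES, repeat=2)}
    mono = {b: s.count(b) / n for b in BASES}
    di_counts = {}
    for i in range(n - 1):  # overlapping dinucleotides
        di = s[i:i + 2]
        di_counts[di] = di_counts.get(di, 0) + 1
    p = {}
    for x, y in product(BASES, repeat=2):
        denom = mono[x] * mono[y]
        f_xy = di_counts.get(x + y, 0) / (n - 1)
        p[x + y] = f_xy / denom if denom > 0 else 0.0  # assumption
    return p


def karlin_difference(seq: str, window: str) -> float:
    """Mean absolute difference of p(XY) between sequence and window."""
    p_seq = relative_abundance(seq)
    p_win = relative_abundance(window)
    return sum(abs(p_seq[k] - p_win[k]) for k in p_seq) / 16


whole = "ATGCGCGCTTAAGGCCATAT"
value = karlin_difference(whole, whole[:10])
```

A window whose dinucleotide usage matches the whole sequence gives a value near 0, and larger values mark regions with an unusual composition.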
Informational Entropy — Calculated from a table of overlapping DNA triplet frequencies. The use of overlapping triplets smooths the frame effect. Informational Entropy is calculated using the formula:
[ -\sum_{\text{triplets}} (\text{triplet frequency}) \times \frac{\log_{10}(\text{triplet frequency})}{\log_{10}(2)} ]
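A minimal Python sketch of this calculation is shown below. It assumes that the per-triplet term is summed over every distinct triplet observed in the window, which yields a single entropy value in bits per window. The function name `triplet_entropy` is hypothetical, and triplets containing characters other than A, C, G, and T are not filtered out.

```python
import math
from collections import Counter


def triplet_entropy(window: str) -> float:
    """Entropy in bits of the overlapping triplet frequencies in a window."""
    w = window.upper()
    triplets = [w[i:i + 3] for i in range(len(w) - 2)]  # overlapping triplets
    total = len(triplets)
    if total == 0:
        return 0.0  # assumption: windows shorter than 3 bases report 0
    entropy = 0.0
    for count in Counter(triplets).values():
        f = count / total
        # log10(f) / log10(2) is the base-2 logarithm of f
        entropy -= f * math.log10(f) / math.log10(2)
    return entropy


low = triplet_entropy("AAAAAA")   # a single repeated triplet
high = triplet_entropy("ACGTA")   # three distinct triplets, equally frequent
```

A window dominated by one triplet scores close to 0, while a window with many equally frequent triplets scores higher.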