|
Aligning the genomic DNA of C. elegans and C. briggsae identifies conserved promoter
elements. While not all promoter elements are conserved, we can identify highly
conserved sequences 5' of genes as likely promoter elements. The function of promoter
elements cannot be accurately predicted computationally. Instead, we use the size
and number of identified promoter elements as a measure of promoter complexity.
We identify potential promoter sequences using several local sequence alignment methods.
The total conserved promoter sequence for each gene gives us a measure of promoter
complexity. Monte Carlo random sampling is used to identify Gene Ontology and KEGG
Pathway annotated gene groups that appear to have significantly low or high complexity.
For instance, we find that ribosomal genes have low complexity while growth genes have
high complexity. Genes associated with the extracellular region score high in complexity,
while basal transcription factors score low.
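
The Monte Carlo comparison of an annotated gene group against the genome background can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's released software: the function name `monte_carlo_group_test`, the use of the group mean as the test statistic, and the add-one p-value correction are all choices made here for illustration.

```python
import random


def monte_carlo_group_test(scores, group, n_samples=10000, seed=0):
    """Estimate how unusual a gene group's mean complexity is.

    scores: dict mapping gene id -> complexity score (for example, total
            conserved promoter length); hypothetical input format.
    group:  iterable of gene ids sharing an annotation (GO term, pathway).
    Returns (observed_mean, p_low, p_high), where p_low is the estimated
    probability that a random gene set of the same size has a mean at or
    below the observed one, and p_high the probability of a mean at or above.
    """
    rng = random.Random(seed)
    members = [g for g in group if g in scores]
    if not members:
        raise ValueError("no group member has a complexity score")

    k = len(members)
    observed = sum(scores[g] for g in members) / k
    background = list(scores.values())

    low = high = 0
    for _ in range(n_samples):
        # Draw k genes without replacement from the whole scored gene set.
        sample_mean = sum(rng.sample(background, k)) / k
        if sample_mean <= observed:
            low += 1
        if sample_mean >= observed:
            high += 1

    # Add-one correction keeps the estimated p-values strictly above zero.
    p_low = (low + 1) / (n_samples + 1)
    p_high = (high + 1) / (n_samples + 1)
    return observed, p_low, p_high


if __name__ == "__main__":
    # Toy data: 100 genes with scores 0..99; the group holds the 5 lowest.
    toy_scores = {f"gene{i}": i for i in range(100)}
    toy_group = [f"gene{i}" for i in range(5)]
    print(monte_carlo_group_test(toy_scores, toy_group, n_samples=2000))
```

A small `p_low` would flag a group as significantly low in complexity and a small `p_high` as significantly high; testing many annotation groups this way would also call for a multiple-testing correction.
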
Finally, we calculate a dynamic regulation score for each gene from expression changes
measured in microarray experiments; this score is then correlated with promoter
complexity. Software written for this project is publicly available and easily adapted
for similar experiments with other organisms.
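
As a rough sketch of the final step, the code below correlates a per-gene dynamic regulation score with promoter complexity. The abstract does not specify how the score or the correlation is computed, so both choices here are assumptions: the score is taken to be the standard deviation of a gene's log expression ratios across experiments, and the correlation is Spearman's rank correlation. All function names are hypothetical.

```python
from statistics import pstdev


def dynamic_regulation_score(log_ratios):
    """Assumed score: spread of a gene's log expression ratios across arrays."""
    return pstdev(log_ratios)


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Toy inputs: per-gene log ratios and conserved promoter lengths.
    expression = {
        "geneA": [0.1, -0.1, 0.0],
        "geneB": [1.0, -1.2, 0.4],
        "geneC": [2.0, -2.5, 1.5],
    }
    complexity = {"geneA": 120, "geneB": 480, "geneC": 900}
    genes = sorted(expression)
    regulation = [dynamic_regulation_score(expression[g]) for g in genes]
    print(spearman(regulation, [complexity[g] for g in genes]))
```

A rank-based correlation is used in this sketch because promoter lengths and expression variability are unlikely to be linearly related or normally distributed; a different measure could be substituted without changing the overall structure.
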
|