Gene Expression Data from Correlation to Cluster Processing Using
Low-Dose IR Data for Illustration
 

As microarrays become cheaper and more genes are included on a single array, the ease of generating data quickly outpaces the ease of analyzing it. Many fundamental questions that microarray data help explore involve co-expression, that is, identifying groups of genes with correlated RNA transcript levels. With tens of thousands of genes leading to hundreds of millions of pairwise correlations, high-performance computational tools are needed to analyze so much data. While there are many clustering algorithms to do just that, we are employing novel combinatorial tools to isolate cliques and other forms of extremely dense clusters. With new methods come questions about data preparation and analysis: How much data is required? How should it be normalized? How many conditions are needed? What type of correlation computation is most appropriate? And what can we conclude from our results? This poster demonstrates how the steps prior to cluster extraction can affect results, and how the resultant clusters can be analyzed to answer biologically meaningful questions, using low-dose ionizing radiation expression data for illustration.
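
The pipeline described above (correlate, threshold into a graph, extract cliques) can be sketched roughly as follows. This is a minimal illustration, not the authors' tools: the choice of Pearson correlation, the use of absolute values, the 0.9 threshold, the basic Bron-Kerbosch enumeration, and the toy expression values and gene names are all assumptions made for the example. The `pair_count` helper shows why tens of thousands of genes yield hundreds of millions of pairwise correlations.

```python
from itertools import combinations
from math import sqrt


def pair_count(n_genes):
    """Number of distinct gene pairs, i.e. correlations to compute."""
    return n_genes * (n_genes - 1) // 2


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_graph(expr, threshold):
    """Connect two genes when |r| >= threshold (threshold is an assumption)."""
    adj = {gene: set() for gene in expr}
    for g, h in combinations(expr, 2):
        if abs(pearson(expr[g], expr[h])) >= threshold:
            adj[g].add(h)
            adj[h].add(g)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Hypothetical toy data: four genes measured under four conditions.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.1, 5.9, 8.0],
    "g3": [1.1, 1.9, 3.2, 3.9],
    "g4": [4.0, 1.0, 3.5, 0.5],
}

graph = correlation_graph(expr, threshold=0.9)
cliques = sorted(maximal_cliques(graph))
print(cliques)
print(pair_count(20000))
```

On this toy input, g1, g2, and g3 form one clique and g4 is left isolated; changing the threshold or the correlation measure changes which edges exist, which is one way the steps before cluster extraction can affect the resulting clusters. Exhaustive clique enumeration is exponential in the worst case, so a sketch like this is only practical for small graphs.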

-------------------------------------------------------------------------------

Michael Langston, Arnold Saxton, Jon Scharff, Brynn Voy
Oak Ridge National Laboratory