What are the spot numbers for the three highest ranking spots?
What are the "R/G Normalized (Mean)" values for the three highest ranking spots?
Does it change the ranking of the spots to use the log transformations of the ratios instead of the ratios?
To see a different method of visualizing and analyzing microarray data, take a look at the MIT Cancer Genomics Microarray Data Sets. Scroll down to the section entitled "Gene Expression Correlates of Clinical Prostate Cancer Behavior". Click on the first data set, for prostate tumor and normal samples, entitled "Prostate_TN_final0701_allmeanScale.res". This data set originates from Affymetrix chips. In this case, the signal is recorded as "A" for absent, "P" for present, and "M" for marginal, as determined by the Affymetrix GeneChip software. The numerical values are scaled average difference units for tumor vs. normal prediction, and these values are also generated by the Affymetrix software. A more complete discussion of gene expression data analysis for Affymetrix GeneChip Arrays can be found at the Affymetrix web site.
So far, the discussion has been primarily about visualizing and quantifying the fluorescence signal from a microarray experiment. However, analysis of gene expression under experimental conditions versus reference conditions requires determining whether the observed differences are significant. There are many sources of noise and variability in microarray data. These include experimental sources such as image scanning inconsistencies, issues in the computer interpretation and quantification of spots, hybridization variables such as temperature and time discrepancies between experiments, and experimental errors caused by differential probe labeling and by variation in the efficacy of RNA extraction. In addition, as the size of the sample increases, so does the probability of finding some large differences due to chance. Therefore, statistical analysis is required to show that gene expression differences are real.
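The effect of chance can be made concrete with a little arithmetic. The sketch below is only an illustration: it assumes each gene is tested independently at a fixed significance threshold, and the function names, the threshold `alpha`, and the gene counts are all hypothetical choices, not values from the data sets discussed here.

```python
def expected_false_positives(num_genes, alpha):
    """Expected number of genes flagged purely by chance.

    Assumes no gene is truly differentially expressed and that each
    gene is tested at significance threshold alpha.
    """
    return num_genes * alpha


def prob_at_least_one_false_positive(num_genes, alpha):
    """Probability that at least one gene is flagged purely by chance.

    Assumes the tests are independent, which is a simplification for
    real microarray data.
    """
    return 1.0 - (1.0 - alpha) ** num_genes


if __name__ == "__main__":
    alpha = 0.05  # hypothetical per-gene significance threshold
    for num_genes in (1, 10, 100, 10000):  # hypothetical array sizes
        print(
            num_genes,
            expected_false_positives(num_genes, alpha),
            prob_at_least_one_false_positive(num_genes, alpha),
        )
```

Under these assumptions, both quantities grow with the number of genes tested, which is why a difference that looks large on one spot is not by itself evidence of a real change in expression.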
There are some complex problems underlying statistical analysis of microarray data, primarily related to the fact that the number of samples is very large, but the number of times that each measurement is repeated is comparatively very small. (This is due mostly to cost and time issues.) Also, the simplest statistical techniques commonly assume a normal distribution, which cannot necessarily be assumed in microarray experiments. For a detailed discussion, see the review by D. K. Slonim (3) of current approaches to gene expression data analysis.
This tutorial will provide an oversimplified example of the type of statistical analysis that needs to be applied to microarray data, using the t-test. For a given gene, A, the gene will have two associated vectors: {a(ref)1, ..., a(ref)n} and {a(exp)1, ..., a(exp)n}, where a(ref) contains n measurements of expression levels under reference conditions and a(exp) contains n measurements of expression levels under experimental conditions.
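As a minimal sketch of how the two vectors for one gene could be compared, the code below computes a two-sample t statistic. It assumes the unpaired, equal-variance (Student's) form of the t-test, which may not be the exact variant this tutorial goes on to use. The function name `two_sample_t` and the measurement values are made up for illustration.

```python
import math
import statistics


def two_sample_t(ref, exp):
    """Return (t, degrees_of_freedom) for an unpaired, pooled-variance t-test.

    ref and exp are the expression measurements for one gene under
    reference and experimental conditions.
    """
    n_ref, n_exp = len(ref), len(exp)
    if n_ref < 2 or n_exp < 2:
        raise ValueError("each condition needs at least two measurements")

    mean_ref = statistics.mean(ref)
    mean_exp = statistics.mean(exp)
    var_ref = statistics.variance(ref)  # sample variance, n - 1 denominator
    var_exp = statistics.variance(exp)

    df = n_ref + n_exp - 2
    pooled_var = ((n_ref - 1) * var_ref + (n_exp - 1) * var_exp) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_ref + 1.0 / n_exp))
    if std_err == 0:
        raise ValueError("no variability in the measurements; t is undefined")

    return (mean_exp - mean_ref) / std_err, df


if __name__ == "__main__":
    # Hypothetical measurements for gene A (not real data).
    a_ref = [1.0, 2.0, 3.0]
    a_exp = [3.0, 4.0, 5.0]
    t, df = two_sample_t(a_ref, a_exp)
    print(f"t = {t:.3f} with {df} degrees of freedom")
```

The t statistic is then compared against the t distribution with the given degrees of freedom to obtain a p-value. With only a few replicates per gene, as is typical for microarrays, the estimate of variance is itself noisy, so the result should be read with caution.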