Analysis of transcript abundance between two cDNA libraries

Information about mRNA transcript abundance under different experimental conditions can be obtained from the analysis of EST sequences. If we assume that ESTs are randomly selected from non-normalised libraries, then the number of EST sequences representing each gene is expected to be proportional to the abundance of that mRNA in the tissue from which the library was constructed. Comparing the EST frequency of a gene across a number of different libraries builds up a transcript profile of that gene. Alternatively, comparing the ESTs sequenced from two cDNA libraries can identify genes that are differentially regulated between the two conditions or tissue types from which the libraries were constructed. A statistical method has been developed to test whether the difference in EST abundance between two libraries is significant (Audic and Claverie, 1997). It is based on the probability that a difference in EST frequency as large as the one observed could arise from random sampling alone, rather than from a difference in expression levels. The theory links the selection threshold for putatively regulated genes to the fraction of false positives one is willing to accept.
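
As a rough illustration of the test, Audic and Claverie (1997) give the probability of observing y ESTs for a gene in a library of N2 ESTs, given x ESTs for the same gene in a library of N1 ESTs, as p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)]. The Python sketch below evaluates this formula. The function names, the example counts and the two-sided tail convention are assumptions made for illustration, and this is not necessarily how the comparisons on this site are computed.

```python
from math import exp, lgamma, log


def ac_prob(x, y, n1, n2):
    """p(y | x): probability of y ESTs in library 2 (n2 ESTs in total),
    given x ESTs for the same gene in library 1 (n1 ESTs in total)."""
    r = n2 / n1
    # Work with log-factorials (lgamma) to avoid overflow for large counts.
    log_p = (y * log(r)
             + lgamma(x + y + 1) - lgamma(x + 1) - lgamma(y + 1)
             - (x + y + 1) * log(1 + r))
    return exp(log_p)


def ac_pvalue(x, y, n1, n2):
    """Two-sided p-value: twice the smaller cumulative tail, capped at 1.
    This tail convention is an assumption, not taken from the text."""
    lower = sum(ac_prob(x, k, n1, n2) for k in range(y + 1))   # P(Y <= y)
    upper = max(0.0, 1.0 - lower + ac_prob(x, y, n1, n2))      # P(Y >= y)
    return min(1.0, 2.0 * min(lower, upper))


# Hypothetical counts for one gene: 12 ESTs in Mag02 (2553 ESTs in total)
# and 1 EST in Mag03 (1466 ESTs in total).
p = ac_pvalue(12, 1, 2553, 1466)
print(f"p = {p:.4g}")
```

A gene would be reported as putatively regulated when this value falls below a chosen threshold. Lowering the threshold reduces the expected fraction of false positives.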

There are 53,247 Magnaporthe grisea ESTs deposited in the dbEST database at NCBI. These have been sequenced from 20 different cDNA libraries. For this analysis we used only ESTs from non-subtracted libraries from which at least 1000 ESTs had been sequenced (listed below, with the total number of ESTs sequenced from each library in brackets; click on a library name for further details).

Mag02: 70-15 appressorium (2553).
Mag03: 70-15 mycelium grown in minimal medium (1466).
Mag04: CP987 mycelium grown in medium containing rice cell walls (3924).
Mag06: Guy11 mycelium grown in complete medium (3193).
Mag07: Guy11 conidia (3405).
Mag08: Guy11 mycelium grown in nitrogen starvation medium (4310).
Mag10: pmk1 germinated conidia (4421).
Mag15: Mixed mated culture (6521).

Only unisequences constructed from at least five ESTs from the libraries listed above were used in this analysis. Generally, one unisequence corresponds to one gene, but in some cases more than one unisequence corresponds to the same gene. Unisequences were compared to the M. grisea genome sequence to identify those corresponding to the same gene, and the EST transcript profiles of such unisequences were summed. Transcript profiles are available for 958 unique genes (unigenes), made up of 1070 unisequences. These can be accessed from the information pages for the individual unisequences. Pairwise comparisons of cDNA libraries can also be made, listing the unigenes with significant differences in EST transcript abundance between the two libraries.
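
The filtering and summing steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name, the dictionary layout and the unisequence and gene identifiers are all hypothetical, and only the library names and the five-EST threshold come from the text.

```python
from collections import defaultdict

LIBRARIES = ["Mag02", "Mag03", "Mag04", "Mag06",
             "Mag07", "Mag08", "Mag10", "Mag15"]
MIN_ESTS = 5  # minimum ESTs per unisequence, as stated in the text


def unigene_profiles(unisequence_counts, unisequence_to_gene):
    """Sum per-library EST counts of unisequences that map to the same gene.

    unisequence_counts: {unisequence_id: {library: EST count}}
    unisequence_to_gene: {unisequence_id: gene_id}
    Unisequences with fewer than MIN_ESTS ESTs in total are skipped.
    """
    profiles = defaultdict(lambda: dict.fromkeys(LIBRARIES, 0))
    for useq, counts in unisequence_counts.items():
        if sum(counts.get(lib, 0) for lib in LIBRARIES) < MIN_ESTS:
            continue
        gene = unisequence_to_gene[useq]
        for lib in LIBRARIES:
            profiles[gene][lib] += counts.get(lib, 0)
    return dict(profiles)


# Hypothetical example: two unisequences from one gene, one from another.
counts = {
    "useq_a": {"Mag02": 4, "Mag07": 3},
    "useq_b": {"Mag02": 2, "Mag10": 5},
    "useq_c": {"Mag04": 2},  # only 2 ESTs in total, so it is filtered out
}
genes = {"useq_a": "gene_1", "useq_b": "gene_1", "useq_c": "gene_2"}
profiles = unigene_profiles(counts, genes)
print(profiles["gene_1"])
```

Each resulting profile holds one EST count per library, so any two of its entries, together with the corresponding library totals, can be used in a pairwise comparison.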