Molecular Human Reproduction, Vol. 7, No. 6, 545-552, June 2001
© 2001 European Society of Human Reproduction and Embryology
Meta-analysis of gene expression in mouse preimplantation embryo development
Department of Anatomy and Structural Biology, University of Otago, Medical School, P.O.Box 913, Dunedin, New Zealand
Abstract
Mammalian preimplantation development is characterized by a number of major events. These potentially involve significant but transient changes in early embryonic gene expression. We have undertaken a meta-analysis of gene expression in mouse preimplantation development using a set of 71 346 expressed sequence tags (EST) derived from 15 non-normalized cDNA libraries. These libraries span seven stages of development from the unfertilized oocyte to the blastocyst stage. EST were clustered using UniGene. The 71 346 EST identified 11 483 separate genes, of which 1585 are not found elsewhere in the mouse. Aggregate sets of EST for each of the seven stages were analysed for differences in gene expression using Fisher's exact test. This analysis identified 109 genes that were differentially expressed. Some of these genes were associated with degradation of transcripts at the 1-cell stage whereas other genes underwent increased expression at the blastocyst stage. The set of 11 483 genes identified in mouse preimplantation embryo development provides the starting point for the design of DNA microarrays targeted at early mammalian embryogenesis. By anchoring the analysis of mouse preimplantation development in UniGene, it will be possible to identify homologous genes that are likely to be involved in human preimplantation embryo development.
expressed sequence tags/gene expression/mouse/preimplantation development/UniGene clusters
Introduction
Mammalian preimplantation embryo development passes through four important stages following fertilization: degradation of oocyte transcripts, transcriptional activation of the zygotic genome, compaction, and differentiation into inner cell mass (ICM) and trophectoderm. In mouse, the first two events occur at the 1-cell stage, compaction occurs at the 8-cell stage, and differentiation at the blastocyst stage (Piko and Clegg, 1982; Schultz, 1993; Nothias et al., 1995). These steps are likely to be associated with marked changes in the profile of embryonic gene expression. This paper describes a meta-analysis of gene expression profiles ranging from unfertilized mouse oocyte to mouse blastocyst.
Gene expression profiling of cells and tissues identifies both the transcripts that are being expressed and their expression levels. Expression profiling can be approached in two ways: (i) through methods that hybridize transcripts to arrayed DNA or oligonucleotide libraries; or (ii) through methods that generate expressed sequence tags (EST) from a representative sample of transcripts. The first approach is restricted to the constituent genes of the libraries that are arrayed whereas the second makes no prior assumption about gene identities.
The databases of the National Institutes of Health (NIH) hold a large number of EST deposited from high-throughput sequencing of cDNA libraries. These cDNA libraries include 15 non-normalized libraries derived from seven stages of mouse preimplantation embryo development: the unfertilized and fertilized mouse oocyte, 2-cell, 4-cell, 8-cell and 16-cell mouse embryos and the mouse blastocyst. The EST for these libraries were generated by the RIKEN, ERATO-Doi and Washington University-Merck sequencing projects. The 15 libraries collectively hold 71 346 mouse preimplantation EST.
A meta-analysis of the distribution of these EST using methods based on UniGene clustering identified 11 483 genes that are expressed in preimplantation embryo development, including 1585 that are not expressed elsewhere in the mouse. Over 100 genes are expressed in a stage-specific manner.
Materials and methods
The 71 346 EST used in this analysis were downloaded from lists of GenBank Accession Numbers aggregated by NIH. The lists were accessed as UniGene libraries at www.ncbi.nlm.nih.gov/UniGene/Mm_DATA/lib_report.html. The catalogue numbers of the UniGene libraries used in this analysis were: unfertilized oocyte, 89 (403 EST), 151 (3096 EST); fertilized oocyte, 106 (3314 EST), 319 (7664 EST); 2-cell embryo, 88 (14 813 EST), 149 (3687 EST), 414 (6315 EST); 4-cell embryo, 175 (3011 EST); 8-cell embryo, 91 (98 EST), 150 (3443 EST); 16-cell embryo, 176 (3195 EST); blastocyst, 85 (12 955 EST), 94 (2499 EST), 102 (5692 EST), 134 (1161 EST). Detailed descriptions of cDNA library construction are available from the UniGene website. The cDNA libraries were non-normalized.
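The library sizes listed above can be tallied as a consistency check. The following Python sketch is illustrative only (the study itself used FileMaker Pro); the dictionary layout and the function name are assumptions made for illustration, while the catalogue numbers and EST counts are those given in the text. The ERATO-Doi library IDs are the ones quoted in the Results.

```python
# EST counts per UniGene library, grouped by stage, as listed in the text.
# The data structure itself is an illustrative assumption.
LIBRARIES = {
    "unfertilized oocyte": {89: 403, 151: 3096},
    "fertilized oocyte": {106: 3314, 319: 7664},
    "2-cell embryo": {88: 14813, 149: 3687, 414: 6315},
    "4-cell embryo": {175: 3011},
    "8-cell embryo": {91: 98, 150: 3443},
    "16-cell embryo": {176: 3195},
    "blastocyst": {85: 12955, 94: 2499, 102: 5692, 134: 1161},
}

# Library IDs of the ERATO-Doi subset, as quoted in the Results.
ERATO_DOI_IDS = (151, 106, 149, 175, 150, 176, 102)


def stage_totals(libraries):
    """Sum the EST counts of all libraries belonging to each stage."""
    return {stage: sum(libs.values()) for stage, libs in libraries.items()}


totals = stage_totals(LIBRARIES)
grand_total = sum(totals.values())
n_libraries = sum(len(libs) for libs in LIBRARIES.values())

# Flatten to library ID -> EST count to total the ERATO-Doi subset.
size_by_id = {lib: n for libs in LIBRARIES.values() for lib, n in libs.items()}
erato_total = sum(size_by_id[lib] for lib in ERATO_DOI_IDS)

print(n_libraries, grand_total, erato_total)
```

The per-stage totals produced this way are the denominators one would use when comparing aggregate stage profiles.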
UniGene libraries were imported into FileMaker Pro software (Claris Corporation, Santa Clara, CA, USA) for construction of seven databases. The corresponding UniGene Cluster Number was identified for each GenBank Accession Number. Digital gene expression profiles were generated by summing UniGene cluster abundances. Differences between digital gene expression profiles were identified using Fisher's exact test. Gene assignments to UniGene cluster numbers were based on mouse UniGene build #81.
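The two computational steps described above, summing cluster abundances and comparing them with Fisher's exact test, can be sketched in Python as follows. This is a minimal illustration, not the FileMaker Pro procedure used in the study. The text does not specify how the 2×2 table was laid out, whether unassigned EST were included in the totals, or which significance threshold was applied; the layout below (EST in the cluster versus all other EST, stage A versus stage B) and all function names are assumptions.

```python
from collections import Counter
from math import comb


def digital_profile(est_to_cluster):
    """Count EST per UniGene cluster; EST with no cluster (None) are skipped."""
    return Counter(c for c in est_to_cluster.values() if c is not None)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))  # small tolerance for float ties
    return min(1.0, total)


def compare_gene(count_a, total_a, count_b, total_b):
    """P-value for one cluster: its EST vs. all other EST, stage A vs. stage B."""
    return fisher_exact_two_sided(count_a, total_a - count_a,
                                  count_b, total_b - count_b)


# Hypothetical accession numbers and cluster IDs, for illustration only.
profile = digital_profile(
    {"AA000001": "Mm.1", "AA000002": "Mm.1", "AA000003": None, "AA000004": "Mm.2"}
)
print(profile)
```

A call such as `compare_gene(30, 10978, 5, 22307)` would test a hypothetical cluster with 30 EST among the 10 978 fertilized-oocyte EST against 5 EST among the 22 307 blastocyst EST. Because thousands of clusters are tested, some correction for multiple comparisons would normally be considered; the text does not say whether one was applied.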
Results
Digital gene expression profiles were constructed in FileMaker Pro. Profiles ('digital Northerns') were obtained by summing separately the UniGene abundances for each of the 15 UniGene libraries used in this study. About 20% of EST in each library do not map to a UniGene cluster.
Most stages of mouse preimplantation embryo development had two or more UniGene libraries available for analysis (see Materials and methods). Libraries at each stage were first compared one with another to assess internal, library-related differences in gene expression. The stages of development for which the largest number of EST were available from more than one library are the fertilized oocyte, the 2-cell embryo and the blastocyst (see Materials and methods). Application of Fisher's exact test identified four genes showing differences in gene expression between libraries at the fertilized oocyte stage, 24 genes between libraries at the 2-cell embryo stage, and five genes between libraries at the blastocyst stage.
Having examined differences between cDNA libraries at the same stage of embryonic development, we undertook a meta-analysis of changes of gene expression during development. EST from individual libraries at the same developmental stage were aggregated to produce single data sets representing all EST from that stage. These aggregate sets were then compared using Fisher's exact test. A total of 109 differentially expressed genes were identified by this method. The percentage abundances of these genes are shown diagrammatically in Figure 1 (scales are 0–0.05%; 0.05–0.1%; 0.1–0.2%; 0.2–0.4%; 0.4–0.8%; 0.8–1.6%; 1.6–3.2%).
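To make the Figure 1 scale concrete, the sketch below converts an EST count into a percentage abundance and assigns it to one of seven classes, reading the scale as ranges with upper edges of 0.05, 0.1, 0.2, 0.4, 0.8, 1.6 and 3.2%. The function names and the handling of values that fall exactly on a boundary (assigned here to the lower class) are assumptions; the text does not state the convention used.

```python
from bisect import bisect_left

# Upper edges (in percent) of the seven abundance classes of the Figure 1 scale.
UPPER_EDGES = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2]


def percent_abundance(count, total_est):
    """EST count for one cluster as a percentage of all EST at that stage."""
    return 100.0 * count / total_est


def scale_class(percent):
    """Return a label such as '0.2-0.4%' for a percentage abundance."""
    if percent < 0 or percent > UPPER_EDGES[-1]:
        raise ValueError("abundance outside the 0-3.2% scale")
    i = bisect_left(UPPER_EDGES, percent)  # boundary values go to the lower class
    lower = 0 if i == 0 else UPPER_EDGES[i - 1]
    return f"{lower:g}-{UPPER_EDGES[i]:g}%"


# Hypothetical cluster with 67 EST among the 22 307 aggregated blastocyst EST.
print(scale_class(percent_abundance(67, 22307)))
```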
We also undertook a separate analysis of the set of cDNA libraries used in the ERATO-Doi sequencing project (UniGene library ID 151, 106, 149, 175, 150, 176 and 102 respectively). These formed the basis of a previous study of mouse preimplantation gene expression (Ko et al., 2000).
We also ranked the UniGenes in order of abundance for each developmental stage and normalized these abundances so that direct comparisons between libraries could be made. A composite list of the most highly expressed genes for each stage is shown in Table I.
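Ranking and normalization of this kind can be sketched as below. The text does not define the normalization; the sketch assumes the simplest reading, in which each cluster's EST count is expressed as a percentage of the total EST for that stage or library. The function name and the cluster identifiers are hypothetical.

```python
def top_genes(counts, total_est, n=10):
    """Rank clusters by EST count and express each as a percentage of total_est.

    Ties are broken by cluster identifier so that the ordering is reproducible.
    """
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(cluster, 100.0 * count / total_est) for cluster, count in ranked[:n]]


# Hypothetical counts for three clusters in a library of 10 000 EST.
example = {"Mm.a": 50, "Mm.b": 200, "Mm.c": 50}
print(top_genes(example, 10000, n=2))
```

Expressing abundances as percentages puts libraries of very different sizes (for example 98 and 14 813 EST) on a common scale, although estimates from the smallest libraries remain imprecise.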
In addition, we analysed the aggregate set of 71 346 EST obtained by amalgamating all 15 cDNA libraries. From this, we were able to identify 11 483 distinct UniGenes while a further 15 067 EST could not be assigned to UniGene clusters. A set of 1585 of the 11 483 UniGenes are currently unique to preimplantation embryo development and not found in other tissues contributing to the mouse UniGene database.
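The proportions implied by these totals can be worked out directly. The short Python sketch below uses only the four figures quoted in the paragraph above; the variable names are illustrative.

```python
TOTAL_EST = 71346             # all EST from the 15 libraries
UNASSIGNED_EST = 15067        # EST not assigned to a UniGene cluster
DISTINCT_UNIGENES = 11483     # distinct UniGenes identified
PREIMPLANTATION_ONLY = 1585   # UniGenes found only in preimplantation libraries

assigned_est = TOTAL_EST - UNASSIGNED_EST
unassigned_pct = 100.0 * UNASSIGNED_EST / TOTAL_EST
unique_pct = 100.0 * PREIMPLANTATION_ONLY / DISTINCT_UNIGENES
mean_est_per_unigene = assigned_est / DISTINCT_UNIGENES

print(assigned_est)                    # EST assigned to a cluster
print(round(unassigned_pct, 1))        # percent of EST left unassigned
print(round(unique_pct, 1))            # percent of UniGenes unique to these libraries
print(round(mean_est_per_unigene, 1))  # average EST per UniGene
```

The unassigned share of about 21% agrees with the figure of about 20% per library given earlier in the Results.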
In a comparative exercise, we performed the same aggregate analysis on the subset of 25 438 EST sequenced in the ERATO-Doi project (UniGene library ID 151, 106, 149, 175, 150, 176 and 102). We found that 76.9% of these EST could be assigned to a UniGene cluster. Aggregation of these UniGene clusters identified 6752 distinct UniGenes. By subtraction, some 5833 EST from this subset were not assignable to a UniGene cluster.
Discussion
GenBank contains a large number of EST obtained from high-throughput sequencing of cDNA libraries. These include 71 346 EST obtained from non-normalized mouse preimplantation embryo cDNA libraries. We used these data to construct and analyse digital gene expression profiles for unfertilized and fertilized mouse eggs, 2-cell, 4-cell, 8-cell, 16-cell mouse embryos and mouse blastocysts.
First, digital expression profiles were compared for individual libraries at the same developmental stage. The purpose of this part of the exercise was to establish whether there were gross differences in gene transcript representation arising from independent library construction. This is a matter of some importance, for two reasons. First, it was assumed in the previous analysis of the ERATO-Doi libraries (Ko et al., 2000) that all expression profiles were captured accurately and that differences between developmental stages could be assigned to true differences in gene expression. No independent proof of this assumption was available but, given that the ERATO-Doi libraries were constructed using polymerase chain reaction, it remains an important assumption. Second, it is important to show that purportedly significant differences in stage-related gene expression are not overshadowed by large differences in libraries taken from the same stage of development. Comparison of the digital expression profiles of the two fertilized oocyte libraries using Fisher's exact test showed significant differences in the levels of only four genes and comparison of the four blastocyst libraries showed significant differences in only five genes. However, the three libraries from the 2-cell embryos show greater internal differences (24 genes). One possible reason for the differences at the 2-cell stage may lie in the rapid changes in gene expression that occur after fertilization. Fertilization triggers large-scale mRNA degradation (Piko and Clegg, 1982) and the late 1-cell stage is associated in mouse with transcriptional activation of the zygotic genome (Schultz, 1993; Nothias et al., 1995). The precise time-point at which the material for the 2-cell embryonic cDNA libraries is captured may therefore have an effect on gene expression profiles, as may stress on the embryo.
Having established the extent of differences between libraries at the same embryonic stage of development, we aggregated EST into seven databases for a meta-analysis of gene expression from the unfertilized oocyte through to the blastocyst stage. The stage-aggregated expression profiles were analysed for differences in gene expression using Fisher's exact test. A total of 109 genes were identified as differentially expressed in mouse preimplantation embryo development (shown in Figure 1). Those candidates marked with an asterisk represent genes which were shown previously to have significant differences in expression between libraries at one of the same stages of development. This does not, in itself, eliminate these genes as candidates for differential gene expression across preimplantation embryo development, but it does identify one possible cause of the difference.
The 109 differentially expressed genes showed a range of expression patterns. We examined the patterns for their possible association with the four major preimplantation events: degradation of oocyte mRNA post-fertilization, transcriptional activation of the zygotic genome at the late 1-cell stage, compaction of the embryo at the 8-cell stage, and differentiation of the blastocyst into trophectoderm and ICM. There are genes whose differential expression parallels one or more of these events but the significance of these associations is currently unclear.
There are other major uses for these data. First, the list of genes expressed at each stage of development becomes the starting point for a better description of the cell biology. For example, a preliminary examination of the genes associated with the unfertilized oocyte indicates that our knowledge of this cell is limited. Many of its genes are known only through EST, with insufficient DNA sequence available to identify the possible coding regions. This applies both to highly expressed genes and to low-copy-number transcripts. Other stages in mouse preimplantation embryo development also contain poorly characterized gene products. Therefore a major challenge for the future is to obtain a fuller-length DNA sequence for the several thousand preimplantation genes whose existence is only known through limited EST data. Although highly expressed genes of unknown function will undoubtedly attract attention, cells are not democracies and transcript abundance is not an infallible indicator of importance. Table II shows the abundances of genes that are lethal in mouse early embryonic development (Michaud et al., 1993; DeGregori et al., 1994; Larue et al., 1994; Miller et al., 1994; Deng and Behringer, 1995; Tsuzuki et al., 1996; Xanthoudakis et al., 1996; Murphy et al., 1997; Schrank et al., 1997; Arman et al., 1998; Gallicano et al., 1998; Nichols et al., 1998; Rassoulzadegan et al., 1998; Wang et al., 1998; Dealy et al., 1999; Koutsourakis et al., 1999; Smyth et al., 1999; Tudor et al., 1999; Zizioli et al., 1999; Pu et al., 2000). Most of these genes were found to be expressed at low levels.
A second major use of the data is in identifying pathways within cells. We show data on ubiquitination as an example. Table III
In undertaking our analysis of over 71 000 preimplantation embryo EST, we were interested in comparing our results with the analysis undertaken previously on the set of 25 438 EST generated by the ERATO-Doi project (Ko et al., 2000).
There is some evidence that the number of genes identified in the ERATO-Doi catalogues was overestimated. Table I in this paper lists the most highly expressed preimplantation genes identified by UniGene clustering. The rank orders of gene expression at each stage are similar to those that we obtained by UniGene clustering of the subset of 25 438 EST used in the ERATO-Doi project. It is clear from these data that the most highly expressed genes identified in Table I are almost entirely absent from the tables published earlier (Ko et al., 2000). We sought to identify the source of the discrepancy. We compared expression levels for those genes that can be identified unambiguously in both reports (the genes identified in common are named genes and can be identified rigorously). There is close agreement in expression levels for these genes in both cases. Therefore the difference between the two analyses lies in the absence of highly expressed genes from the set identified previously (Ko et al., 2000). We can only speculate why the previous analysis failed to identify these genes. The most likely explanation appears to be a defect in the clustering algorithm.
The smaller data sets that make up the ERATO-Doi sequence project (25 438 EST) limit the number of genes that show statistically significant differences in expression during preimplantation embryo development. Of the 15 genes in this category, eight show significant differences when the ERATO-Doi libraries are compared with other libraries constructed at the same developmental stage. This indicates that the earlier conclusion of widespread stage-specific gene expression in preimplantation embryo development (Ko et al., 2000) was not supported strongly by the then available evidence. Even though expansion of the analysed data set to over 71 000 EST goes some way to rectifying this defect, analysis based on EST generation is always limited by the number of EST available. The most effective analysis of gene expression in future will undoubtedly come from DNA microarrays.
Anchoring the digital analysis of gene expression in UniGene also has many advantages. These derive from the interconnections of the NIH databases, of which UniGene is one, and the frequent upgrading of these databases. Contingent on availability of data, a corresponding UniGene can be found for human and other species, the protein sequence corresponding to a UniGene cluster can be identified, the UniGene cluster and its contributing EST can be used to identify IMAGE clones (http://image.llnl.gov) for assembling DNA microarrays, the chromosomal location of the UniGene can be mapped, and the UniGene cluster can be linked to Online Mendelian Inheritance in Man (OMIM).
These advantages have important implications in the present circumstances. The set of mouse preimplantation EST used previously (Ko et al., 2000) and EST for mouse postimplantation embryos form the basis of a 15K set of clones that have been made available for DNA microarray construction by the National Institute on Aging. The distribution of these clones is limited to ten laboratories (see the website at http://lgsun.grc.nia.nih.gov). The ostensible reason for this restriction is the desire to encourage archiving of microarray studies in a shared database. While this objective is laudable, it is unclear that the interests of the research community are best served by using an encapsulated database that is divorced from the major NIH databases, particularly UniGene. The EST from which the 15K set is derived are all identifiable in public databases using the methods described in this paper. In addition, the majority of the 71 346 EST identified in this paper have at least one corresponding IMAGE clone (obtainable from the IMAGE consortium). We suggest that the optimum archiving strategy for understanding mouse preimplantation embryo development is to use the integrated NIH databases, including UniGene. Despite the more conservative approach to clustering imposed by UniGene, the aggregate set of 71 346 EST identifies 11 483 genes. This is the largest set of genes expressed in mammalian preimplantation embryos currently known.
Early mammalian embryogenesis is currently the focus of intense interest because of its potential for therapeutic and reproductive cloning. Embryo quality is also a major concern in human assisted reproduction and following embryo manipulation (e.g. enucleation/renucleation). Knowledge of the gene expression profile in mouse preimplantation embryo development provides the basis for designing DNA microarrays that are targeted at high-throughput screening of mammalian embryos. For example, the low marginal cost of DNA microarrays will make it possible to screen mouse embryos subjected to a wide range of insults. These screens may help to identify predictors of embryo quality both in mouse and other species. The existence of a list of UniGenes expressed in mouse preimplantation embryo development makes it possible to build DNA microarrays targeted at human preimplantation embryo development and human embryonic stem cells.
Appendix
Acknowledgements
We wish to thank Andrew MacGregor for his invaluable assistance in helping to construct databases in FileMaker Pro. This work was supported by grants from the Marsden Fund and the New Economy Research Fund.
Notes
1 To whom correspondence should be addressed. E-mail: david.green{at}stonebow.otago.ac.nz
References
Arman, E., Haffner-Krausz, R., Chen, Y. et al. (1998) Targeted disruption of fibroblast growth factor (FGF) receptor 2 suggests a role for FGF signaling in pregastrulation mammalian development. Proc. Natl. Acad. Sci. USA, 95, 5082–5087.
Dealy, M.J., Nguyen, K.V., Lo, J. et al. (1999) Loss of Cul1 results in early embryonic lethality and dysregulation of cyclin E. Nat. Genet., 23, 245–248.
DeGregori, J., Russ, A., von Melchner, H. et al. (1994) A murine homolog of the yeast RNA1 gene is required for postimplantation development. Genes Dev., 8, 265–276.
Deng, J.M. and Behringer, R.R. (1995) An insertional mutation in the BTF3 transcription factor gene leads to an early postimplantation lethality in mice. Transgenic Res., 4, 264–269.
Gallicano, G.I., Kouklis, P., Bauer, C. et al. (1998) Desmoplakin is required early in development for assembly of desmosomes and cytoskeletal linkage. J. Cell Biol., 143, 2009–2022.
Joazeiro, C.A.P. and Weissman, A.M. (2000) RING finger proteins: mediators of ubiquitin ligase activity. Cell, 102, 549–552.
Ko, S.H.M., Kitchen, J.R., Wang, X. et al. (2000) Large-scale cDNA analysis reveals phased gene expression patterns during preimplantation mouse development. Development, 127, 1737–1749.
Koutsourakis, M., Langeveld, A., Patient, R. et al. (1999) The transcription factor GATA6 is essential for early extraembryonic development. Development, 126, 723–732.
Larue, L., Ohsugi, M., Hirchenhain, J. et al. (1994) E-cadherin null mutant embryos fail to form a trophectoderm epithelium. Proc. Natl. Acad. Sci. USA, 91, 8263–8267.
Michaud, E.J., Bultman, S.J., Stubbs, L.J. et al. (1993) The embryonic lethality of homozygous lethal yellow mice (Ay/Ay) is associated with the disruption of a novel RNA-binding protein. Genes Dev., 7, 1203–1213.
Miller, M.W., Duhl, D.M., Winkes, B.M. et al. (1994) The mouse lethal nonagouti (a(x)) mutation deletes the S-adenosylhomocysteine hydrolase (Ahcy) gene. EMBO J., 13, 1806–1816.
Murphy, M., Stinnakre, M.G., Senamund-Beaufort, C. et al. (1997) Delayed early embryonic lethality following disruption of the murine cyclin A2 gene. Nat. Genet., 15, 83–86.
Nichols, J., Zevnik, B., Anastassiadis, K. et al. (1998) Formation of pluripotent stem cells in the mammalian embryo depends on the POU transcription factor Oct 4. Cell, 95, 379–391.
Nothias, J.Y., Majumder, S., Kaneko, K.J. and DePamphilis, M.L. (1995) Regulation of gene expression at the beginning of mammalian development. J. Biol. Chem., 270, 22077–22080.
Piko, L. and Clegg, K.B. (1982) Quantitative changes in total RNA, total poly(A), and ribosomes in early mouse embryos. Dev. Biol., 89, 362–378.
Pu, W.T., Wickman, K. and Clapham, D.E. (2000) ICln is essential for cellular and early embryonic viability. J. Biol. Chem., 275, 12363–12366.
Rassoulzadegan, M., Yang, Y. and Cuzin, F. (1998) APLP2, a member of the Alzheimer precursor protein family, is required for correct genomic segregation in dividing mouse cells. EMBO J., 17, 4647–4656.
Schrank, B., Götz, R., Gunnersen, J.M. et al. (1997) Inactivation of the survival motor neuron gene, a candidate gene for human spinal muscular atrophy, leads to massive cell death in early mouse embryos. Proc. Natl. Acad. Sci. USA, 94, 9920–9925.
Schultz, R.M. (1993) Regulation of zygotic gene activation in the mouse. BioEssays, 15, 531–538.
Smyth, N., Vatansever, H.S., Murray, P. et al. (1999) Absence of basement membranes after targeting the LAMC1 gene results in embryonic lethality due to failure of endoderm differentiation. J. Cell Biol., 144, 151–160.
Tsuzuki, T., Fujii, Y., Sakumi, K. et al. (1996) Targeted disruption of the Rad51 gene leads to lethality in embryonic mice. Proc. Natl. Acad. Sci. USA, 93, 6236–6240.
Tudor, M., Murray, P.J., Onufryk, C. et al. (1999) Ubiquitous expression and embryonic requirement for RNA polymerase II coactivator subunit Srb7 in mice. Genes Dev., 13, 2365–2368.
Wang, S., Gebre-Medhin, S., Betsholtz, C. et al. (1998) Targeted disruption of the mouse phospholipase C beta3 gene results in early embryonic lethality. FEBS Lett., 441, 261–265.
Xanthoudakis, S., Smeyne, R.J., Wallace, J.D. et al. (1996) The redox/DNA repair protein, Ref-1, is essential for early embryonic development in mice. Proc. Natl. Acad. Sci. USA, 93, 8919–8923.
Zizioli, D., Meyer, C., Guhde, G. et al. (1999) Early embryonic death of mice deficient in gamma-adaptin. J. Biol. Chem., 274, 5385–5390.
Submitted on November 6, 2000; accepted on March 15, 2001.