The genomic 'inner fish' and a regulatory enigma in the vertebrates

Until now, our picture of how genomes from fish to human encode the same tissues has emerged one gene at a time. The study published in this issue provides lists of genes and their expression levels for 20 vertebrate tissues spanning 450 million years of vertebrate evolution. It reveals a core set of genes with similar tissue-expression patterns yet no common regulatory signatures, a gene-expression paradox.

Even before the origin of species by descent from a common ancestor was posited, it was realized that groups of animals had related morphologies. Georges Cuvier, the father of comparative anatomy, viewed anatomical structures through the lens of form and function. Similar-looking anatomical structures should have similar function, and anatomy could be used diagnostically to group organisms, a theory he termed "the correlation of parts" [1]. A famous story illustrates the idea. One of Cuvier's students, dressed as the Devil with horns on his head and hoof-shaped shoes, burst into Cuvier's bedroom when he was asleep and said, "I am the Devil. I have come to devour you!" Cuvier woke up and replied, "I doubt whether you can. You have horns and hooves. You eat only plants." The relationship between form and function during evolution is a classic problem in biology. Yet to fully understand form and function at the level of anatomy and how those anatomical features change over time, it is important to probe the proximate mechanisms that create adult anatomy. It was therefore only natural that embryology became such an important tool. Karl Ernst von Baer showed that nearly all organs and tissues were derived from the same embryonic layers in practically all animals. This similarity, an instructional 'inner fish', implies that core processes shared among all vertebrates shape much of ontogeny and ultimately adult anatomy [2].
DNA sequencing, and the understanding that changes in sequence can also be used to deduce the relatedness of species, mark another important landmark in the history of science. Conservative and non-conservative changes to codons have been a boon in understanding the influence of selection and chance on evolution. The form and function of vertebrate tissues and organ systems offer a thought-provoking tour of the inner workings of organisms. Understanding which genes are deployed in a tissue- or organ-specific manner across a variety of divergent species will be just as valuable to our understanding of evolutionary processes, and will tell us which parts of the gene-expression networks in an organism of particular interest, such as humans, have core functionality.
In this issue, Chan et al. [3] dissect the inner workings of vertebrate tissues and organs with a genomic scalpel and show that gene-expression profiles of orthologs are correlated.
These data strongly suggest that both genes and gene-expression networks are derived from common ancestral genes and gene-expression networks. While perhaps not surprising, this is important genomic confirmation of what comparative anatomists and embryologists have long described and believed. More important, these datasets are harbingers of the more comprehensive quantitative and qualitative description of morphology, and the more theoretically grounded understanding of evolutionary processes, that will follow. As pointed out by the authors, the contribution of selection and random drift to gene-expression profiles remains unclear. The theoretical framework for finding meaning in comparative gene-expression and network topology data is still in its infancy [4]. These studies will provide the basis for a revolution in our understanding of the evolution of gene-expression networks that is likely to thematically recapitulate the study of DNA sequence change in protein-coding regions.
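To make the idea of correlated ortholog expression profiles concrete, the following is a minimal sketch rather than the method of Chan et al. [3]. It assumes each gene is described by a list of expression values measured in the same tissue order in two species, and it uses the Pearson correlation, which is an assumption made here for illustration; the study's own metric and normalization may differ. The function names and the toy values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2 tissues)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


def ortholog_profile_correlations(species_a, species_b):
    """Correlate tissue profiles for every ortholog present in both species.

    Each argument maps an ortholog identifier to a list of expression
    values, one per tissue, with tissues in the same order in both species.
    """
    shared = sorted(set(species_a) & set(species_b))
    return {gene: pearson(species_a[gene], species_b[gene]) for gene in shared}


# Toy values only (not data from the study): four tissues per gene.
toy_species_a = {"geneA": [1.0, 2.0, 3.0, 4.0], "geneB": [4.0, 3.0, 2.0, 1.0]}
toy_species_b = {"geneA": [2.0, 4.0, 6.0, 8.0], "geneB": [1.0, 2.0, 3.0, 4.0]}
toy_result = ortholog_profile_correlations(toy_species_a, toy_species_b)
```

In this toy input, geneA has the same relative tissue pattern in both species while geneB has the opposite pattern, so a profile-based measure separates conserved from diverged expression regardless of absolute expression level.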
Chan et al. [3] sampled a range of different tissues, which should probably become standard in comparative expression analyses, as not all organs tell the same story. For example, eyes must be very well adapted, as they show impressively conserved morphologies within the vertebrates [5]. Chan et al. show that they also express a set of core genes that have been highly conserved (Figure 1). As a counterexample, Darwin suggested that sexual selection is an important driving force in evolution [6], and there are many studies showing that genes preferentially expressed in males, and in the testis in particular, are rapidly evolving in the vertebrates: in frogs [7], birds [8], rodents [9], and primates [10]. Chan et al. also find that testis gene expression is rapidly evolving in the vertebrates.
Curiously, they find that the kidney gene-expression profile may also be rapidly evolving. This could be related to the changes in water homeostasis in freshwater-dwelling organisms compared with those that inhabit drier or saltier environments, and the related excretion of urea or uric acid. Alternatively, it could be related to the closely linked embryonic origins and development of gonads and kidneys, both of which make products that are passed from the body. The mesodermal urogenital ridge in the vertebrate embryo gives rise to both kidney and gonad, and the development of the kidney and the reproductive tract shows a remarkable development of functional embryonic nephritic tissue and an array of used, reused, and discarded plumbing arrangements (for example, Müllerian and Wolffian ducts) that connect the gonad and the kidney to the outside world [11]. It would be interesting to explore the idea that fast changes in testis expression profiles drive changes in the kidney as well.
Given that organs and organ expression profiles are derived from a common ancestor, one might expect that the regulation of gene expression should also be conserved. This is not very different from the idea that similar organs should express similar genes. The logical idea that coexpression and coregulation are linked was one of the early driving forces behind DNA microarray analysis, but links between coexpression of batteries of genes and coregulation have not been as clear-cut as initially expected. Indeed, Chan et al. [3] make the point that they fail to find the conserved non-genic sequences that would be expected to drive the core organ-specific expression patterns. Failing to find something is a negative result, but if results like this continue to accumulate, it will be important to fully explore why. Although it is possible that we are not yet good enough at finding cis-regulatory sites, this negative result is becoming a common theme in array studies. Genomic features with highly conserved functions, such as core promoters and enhancers, clearly can be swapped between genes and species (where they function as expected, as judged by the endogenous patterns) but show remarkable diversity in nucleotide sequence. For example, vertebrate transgenes expressed in hepatocytes of different species show similar expression even though the transcription factor occupancy profiles differ, and divergent enhancers from different species of Drosophila drive the same patterns of expression in Drosophila melanogaster embryos [12,13].
Remarkably, there have been natural wholesale exchanges of regulatory sequences to drive the expression of highly conserved ribosomal-protein-encoding genes in yeasts [14], suggesting that different transcription factors can coregulate large groups of genes in different species. It is beginning to look as if there is a more profound explanation than technical limitations for our inability to find conserved cis-regulatory patterns among orthologs with similar expression patterns. Maybe conservation in cis-regulatory regions is difficult to find because these regions are highly malleable and transient.
It is perhaps worthwhile to step back and think about the unit of selection. For an animal to reproduce and pass on its genome, it needs to develop and use organs and organ systems. We know that early errors in organ development are catastrophic for adult viability and reproductive success. So, intuitively, the initial steps in a genetic pathway or the first committed step in a series of enzymatic reactions must be critical, but this is only true if there are few ways to generate a pattern or product. If there are multiple mechanisms, how an organism bootstraps to an acceptable outcome is less important. In a 'Christmas tree' model of evolution, this represents changing the branches on which ornaments are placed while maintaining the same decorative appearance [15]. Sexual reproduction is a good example of this model. Sex results in remarkably similar gametes in a wide range of species, but the genetic pathways that govern the early steps of sex differentiation show astounding differences in theme and gene [16]. The malleability of sexual mode can be seen in the nematodes, where hermaphrodite and separate sexes have evolved multiple times using different underlying mechanisms, and within Caenorhabditis elegans the prime sex-determination signal can be experimentally switched from chromosomal to temperature [17].
If the females and males of the same species can be built using such different basal genetic hierarchies while maintaining the expression of critical well-adapted 'terminal' functions like sperm and eggs, then maybe organ gene-expression patterns can also be maintained with different underlying sets of transcription factors. This really boils down to asking how many solutions exist for a given expression-pattern problem. If more than 10% of genes in a genome encode transcription factors and some substantial fraction of those genes are expressed in a given cell type, then there may be many ways to achieve the same transcriptional output. In these circumstances, a rather fluid exchange of regulatory mechanisms might be expected during the evolution of the vertebrates (Figure 2). De novo evolution of transcription factor binding sites should be relatively simple, as these are short (usually less than 10 base pairs) and degenerate. The combination of conserved factor function and site turnover might result in multiple functionally equivalent cis-regulatory elements. Indeed, the exchange of one cis-regulatory sequence for another can occasionally be spotted [18]. If this proves to be generally true, then the implications for evolution and for using phylogeny to discover cis-regulatory regions are significant.
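The claim that short, degenerate binding sites are easy to evolve de novo can be illustrated with a back-of-the-envelope calculation. The sketch below is not taken from the source: it assumes a random sequence with equal, independent base frequencies, writes the motif in standard IUPAC nucleotide codes, and simply doubles the count for the two strands (which overcounts palindromic motifs). The function names, the example motif string, and the region length are all illustrative choices.

```python
# Standard IUPAC nucleotide codes mapped to the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def match_probability(motif):
    """Chance that a random position matches the motif.

    Assumes each base is drawn independently with probability 1/4.
    """
    probability = 1.0
    for symbol in motif.upper():
        probability *= len(IUPAC[symbol]) / 4
    return probability


def expected_chance_sites(motif, region_length, both_strands=True):
    """Expected number of chance matches in a random region.

    Counts every start position where the motif fits, and doubles the
    result when both strands are scanned (an approximation that
    overcounts palindromic motifs).
    """
    positions = region_length - len(motif) + 1
    if positions <= 0:
        return 0.0
    strands = 2 if both_strands else 1
    return positions * strands * match_probability(motif)


# Illustrative inputs: a degenerate 6-bp motif in a 10,000-bp region.
example_expected = expected_chance_sites("WGATAR", 10_000)
```

Under these simplifying assumptions, a fully specified 6-bp site matches about one position in 4,096, and each twofold-degenerate position doubles that chance, so chance matches (and near-matches one mutation away) are common in any long non-coding region. This is consistent with the argument that sites can turn over while the factors that bind them stay conserved.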
Since Cuvier, careful cataloging of anatomy in the context of phylogeny and development has had a major impact on our understanding of how living organisms evolve. While there are occasional examples of convergent evolution, which has, for example, resulted in wings and thermal homeostasis in both mammals and birds, the vast bulk of comparative anatomy data reveals the deep roots of tissues and organ systems. Morphology indicates that the basic sensory, digestive, reproductive and excretory functions in animals are conserved. Although we do not have a rigorous understanding of the role of selection and drift in the evolution of gene expression, preserving form and function has probably required the conservation of much of the core organ-specific expression network in the vertebrates. The lack of a relationship between coexpression and coregulation at evolutionary timescales indicates that either we still do not understand how to find cis-regulatory modules, or time has erased the vestiges of the intermediates in the vertebrates sampled.