Phenotypic variation is frequently determined by regulatory regions of DNA sequence that affect mRNA abundance. By subjecting variation in mRNA abundance observed in individuals from meiotic mapping populations to genetic analysis, each mRNA can be treated as a heritable, quantitative trait. Subsequently, the chromosomal loci that regulate the abundance of specific mRNAs in a given biological sample can be identified as expression Quantitative Trait Loci (eQTLs).
The methodology to dissect complex traits by simultaneously measuring the expression of many genes across a genetically defined population was suggested and experimentally tested by Damerval and colleagues in 1994. Since then, the combination of large-scale mRNA profiling platforms, full genome sequences, high-throughput genotyping technologies and precision phenotyping has made it possible to model the integrated interaction networks underlying complex traits.
To date, many traits in a number of species have been addressed using genetic analysis of gene expression. Behavioural and neuro-anatomical traits, hematopoietic stem cell turnover, susceptibility to obesity, hypertension, type 2 diabetes mellitus and carcinogenesis in animals, somatic embryogenesis and stress response in Arabidopsis, lignification in trees, digestibility in maize, seed development in wheat and resistance to the stem rust fungus in barley are some examples of the phenotypic traits analysed by combining large-scale mRNA profiling with classical genetic analyses.
Genetic analysis can be used to determine whether the observed variation is heritable. If it is, the analysis can reveal which genetic factors influence the variation and to what degree. For quantitative traits such factors are defined as Quantitative Trait Loci (QTLs). They are chromosomal regions harbouring a number of genes that is determined by the size of the QTL and the gene density distribution across the chromosome. Thus, any given QTL can comprise dozens or even hundreds of genes (called candidate genes), but only one or a few account for any given phenotype. Very often a single trait is determined by several interacting QTLs, each influencing a relatively small proportion of the overall observed phenotypic variation. This adds even more complexity to the process of inheritance and consequently to the analysis. The conventional approach to candidate gene identification uses genetic dissection of a population at a certain resolution and/or populations of different composition (for example, more individuals and different crosses). This approach becomes impractical because of the prohibitively large number of individuals required to achieve satisfactory resolution. Therefore, alternative, more efficient approaches are needed that allow the gene content of any given QTL to be estimated and provide some information about the function of the underlying genes. Large-scale, parallel mRNA profiling is well suited to this purpose. The mRNA or transcript abundance (also often referred to as gene expression level) can be analysed as a trait, no different from any other conventional (higher-order) trait in terms of QTL detection.
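To illustrate how a transcript abundance can be treated like any other quantitative trait, the following minimal sketch computes a single-marker LOD score by comparing a model with one overall mean against a model with one mean per genotype class. It is not the analysis pipeline used in the project: the function name, the 'A'/'B' allele coding and the toy data are assumptions made for illustration, and real QTL software uses interval mapping and permutation-based significance thresholds rather than this simple marker regression.

```python
import math

def single_marker_lod(genotypes, abundances):
    """LOD score for one marker against one transcript-abundance trait.

    genotypes  -- allele call per line, e.g. 'A' or 'B' (hypothetical coding)
    abundances -- matching mRNA abundance value per line
    """
    n = len(abundances)
    grand_mean = sum(abundances) / n
    # Residual sum of squares under the null model (no marker effect).
    rss_null = sum((y - grand_mean) ** 2 for y in abundances)
    if rss_null == 0:
        return 0.0  # no variation in the trait, so nothing to explain

    # Residual sum of squares when each genotype class has its own mean.
    rss_marker = 0.0
    for allele in set(genotypes):
        group = [y for g, y in zip(genotypes, abundances) if g == allele]
        group_mean = sum(group) / len(group)
        rss_marker += sum((y - group_mean) ** 2 for y in group)

    if rss_marker == 0:
        return math.inf  # the marker explains all of the variation
    return (n / 2) * math.log10(rss_null / rss_marker)

# Toy data: four lines, one linked and one unlinked marker (made-up values).
expression = [1.0, 1.2, 3.0, 3.2]
lod_linked = single_marker_lod(["A", "A", "B", "B"], expression)
lod_unlinked = single_marker_lod(["A", "B", "A", "B"], expression)
```

Repeating such a test for every marker and every transcript gives one LOD profile per gene, which is what makes the approach highly parallel.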
The major difference between the mRNA abundance trait and classical higher-order traits is that available genomics tools make it possible to obtain mRNA abundance values for tens of thousands of genes in a highly parallel fashion, which can then be used for QTL mapping. There is also a profound difference in the interpretation of mRNA abundance QTLs and higher-order trait QTLs. For higher-order trait QTLs the underlying genes are either unknown, or we are testing hypotheses about genes inferred from other experiments. For an mRNA abundance trait, the gene is known by definition. Therefore, the first question when addressing transcript abundance QTLs (also called eQTLs) is whether the gene's location coincides with its transcript abundance QTL. A positive answer would mean that the factors regulating mRNA abundance, or gene expression, are in the proximity of the gene (called cis-regulation). On the other hand, if the QTL location differs from that of the gene, we have detected loci that encode factors that act at a distance (trans-regulation).
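The cis/trans distinction described above amounts to a comparison of two map positions. The sketch below shows one way it could be expressed; the `Locus` type, the function name and the 10 cM window are assumptions chosen for illustration, not values taken from the project.

```python
from dataclasses import dataclass

@dataclass
class Locus:
    chromosome: str      # e.g. "2H"
    position_cm: float   # genetic map position in centimorgans

def classify_eqtl(gene: Locus, eqtl: Locus, window_cm: float = 10.0) -> str:
    """Label an eQTL 'cis' if it coincides with the gene's own map position.

    window_cm is a hypothetical tolerance; a sensible value depends on the
    resolution of the linkage map and the size of the QTL confidence interval.
    """
    same_chromosome = gene.chromosome == eqtl.chromosome
    if same_chromosome and abs(gene.position_cm - eqtl.position_cm) <= window_cm:
        return "cis"
    return "trans"

# Made-up positions: the same gene with a nearby and a distant eQTL.
gene = Locus("2H", 50.0)
nearby = classify_eqtl(gene, Locus("2H", 54.0))
distant = classify_eqtl(gene, Locus("5H", 50.0))
```

A gene with several eQTLs can receive both labels, one per eQTL, which matches the observation that cis- and trans-effects can act on the same transcript.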
For unsequenced species like barley, where the majority of genes are not mapped, identification of cis-QTLs for thousands of genes has significant added value by enabling the construction of a high-density gene map. To characterise mRNA abundance as a quantitative trait and identify candidate genes for barley higher-order trait QTLs, we performed an mRNA profiling experiment using a population of 150 recombinant lines. This population was developed over a decade ago by crossing the high-yielding feed barley cultivar Steptoe with the cultivar Morex, which yields less than Steptoe but has excellent malting traits. The population has been extensively phenotyped, mostly for malting and yield-related traits, resulting in the identification of over thirty QTLs. As an mRNA profiling platform we used the Barley1 Affymetrix GeneChip, which measures transcript abundance for at least 21,000 barley genes. The resolution of any given genetic linkage map is determined by the number of recombinant lines used to generate it. Our current genetic linkage map used for QTL mapping consisted of about 500 loci. Detection of over 23,000 QTLs inevitably led to the identification of QTL clusters (overlapping QTLs). Two hypotheses can be put forward to explain a QTL cluster: co-localisation and co-regulation (see figure). Co-localisation implies a number of independently acting genes at any given locus, each linked to a cis-regulated mRNA abundance QTL. The co-regulation hypothesis implies that a single regulatory gene can affect mRNA accumulation of other genes from the QTL cluster. Such trans-regulatory loci are of great interest because they determine the basic building blocks of regulatory networks: regulatory hubs or modules.
Figure: Inference of the trans-factor (master regulator) controlling expression of multiple mRNAs.
The approach is based on genetic dissection of the eQTL cluster. If all the genes underlying such a cluster map to the same locus, it is likely that their mRNA abundance is regulated by independent cis-elements. Otherwise, if the genes map to different loci, there is a possibility of a common trans-regulator. The challenge is to differentiate cis- and trans-regulation, because any given eQTL cluster appears to contain a mixture of both types of regulation. The first step in testing which hypothesis to follow is to map the underlying genes using the conventional approach to genetic linkage mapping. This analysis could either assist in finding regulatory genes involved in the processes or pinpoint more precise gene locations. If the map positions of the genes involved differ from their transcript QTL position, it is possible that the region of the QTL contains a regulatory gene or gene cluster that influences all the genes in question. Otherwise, co-localisation would simply imply no direct functional interaction between the genes in the QTL cluster. In practice, about two thirds of all genes in any given cluster seem to be cis-regulated, and therefore a direct functional link is unlikely. However, the remaining one third of the genes in any given QTL cluster could potentially have a functional relationship. To investigate this, we are planning to perform a detailed experimental analysis of one of the loci we identified on chromosome 2H and named Required for Puccinia resistance 2 (Rpr2), which putatively encodes a trans-acting factor controlling programmed cell death (PCD) in barley. We reached this conclusion because many of the individual genes whose eQTLs map to this locus are physically located in different regions of the genome, and a large number (but not all) are known to be involved in PCD.
In the case of pathogen attack, PCD activation manifests as a hypersensitive reaction followed by localised necrosis, one of the defence mechanisms plants employ to prevent pathogen invasion. PCD can also be constitutive, as in disease lesion mimic mutants, or it can take place throughout normal plant development as a spatially and temporally tightly controlled biological process.
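The dissection of an eQTL cluster into cis- and trans-regulated genes, as described above for loci such as Rpr2, can be sketched as a simple grouping step. In this hedged example the record keys, the 10 cM bin size and the 10 cM cis window are all hypothetical; clusters in a real analysis would be defined from overlapping QTL confidence intervals rather than fixed bins.

```python
from collections import defaultdict

def summarise_clusters(eqtls, bin_cm=10.0, window_cm=10.0):
    """Group eQTLs into map bins and split each bin into cis and trans genes.

    eqtls -- iterable of dicts with the (hypothetical) keys
             'gene', 'gene_chr', 'gene_cm', 'qtl_chr', 'qtl_cm'
    """
    clusters = defaultdict(lambda: {"cis": [], "trans": []})
    for rec in eqtls:
        # A cluster is approximated here as a fixed-width bin on the map.
        bin_key = (rec["qtl_chr"], int(rec["qtl_cm"] // bin_cm))
        is_cis = (rec["gene_chr"] == rec["qtl_chr"]
                  and abs(rec["gene_cm"] - rec["qtl_cm"]) <= window_cm)
        clusters[bin_key]["cis" if is_cis else "trans"].append(rec["gene"])
    return dict(clusters)

def trans_fraction(cluster):
    """Share of genes in a cluster whose own map position lies elsewhere."""
    total = len(cluster["cis"]) + len(cluster["trans"])
    return len(cluster["trans"]) / total if total else 0.0

# Made-up records: three eQTLs falling into the same bin on chromosome 2H.
records = [
    {"gene": "geneA", "gene_chr": "2H", "gene_cm": 53.0, "qtl_chr": "2H", "qtl_cm": 52.0},
    {"gene": "geneB", "gene_chr": "5H", "gene_cm": 10.0, "qtl_chr": "2H", "qtl_cm": 55.0},
    {"gene": "geneC", "gene_chr": "2H", "gene_cm": 54.0, "qtl_chr": "2H", "qtl_cm": 58.0},
]
clusters = summarise_clusters(records)
fraction = trans_fraction(clusters[("2H", 5)])
```

Clusters with an unusually high trans fraction would be the natural candidates for harbouring a common regulator, whereas clusters dominated by cis genes are more simply explained by co-localisation.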
Figure: One of the strategies for identification of the candidate genes for complex traits
Three different components are employed:
- an experimental population in which the trait segregates
- lines containing induced mutations that either have an obvious phenotype related to that of the segregating population (forward genetics approach) or carry mutations in the candidate genes (reverse genetics, TILLING)
- as a third component, mRNA abundance phenotypes of thousands of genes, which are used to link the first two.
In this example, we specifically address a group of PCD-related genes that have eQTLs associated with those of partial resistance to the wheat stem rust fungus in barley. We also mapped the phenotype of one of the disease lesion mimic mutants to the same locus. Detailed genetic analysis of this locus, employing all three components mentioned above, should reveal whether a single gene controls all three groups of phenotypes.
Projects at SCRI
Genetics of gene expression in barley (2004-2007)
This BBSRC/RERAD-funded project finished in November 2007, but the mRNA profiling data set still has considerable potential for data mining and is being exploited in several other projects. The project initiated a fruitful, complementary and still ongoing collaboration between the SCRI group, which has expertise in setting up large-scale mRNA profiling experiments, and the University of Birmingham's quantitative genetics group. Another important collaboration has been established with Robert Williams' group at the University of Tennessee, Memphis, USA, to provide public access to the barley trait and mRNA datasets through GeneNetwork's online analytical environment. The project also opened an opportunity for SCRI to be involved in the co-ordination of the GeneSys project.
The following points summarise results of the project.
1) The principal technological design was based on Affymetrix' GeneChip technology, representing about 16,000 barley genes. Using this mRNA profiling platform we identified 23,738 significant eQTLs that affected expression of 12,987 genes. Over a third of these genes with expression variation had only one identified eQTL. The number of QTLs linked to the expression of the rest of the genes varied from two to six. Both cis- and trans-effects can be observed in a large proportion of the quantitatively controlled transcripts. In this population more than half of the quantitatively controlled transcripts appear to be primarily regulated by cis-eQTLs.
2) We explored the potential of using these mRNA abundance QTL traits as surrogates for the identification of candidate genes underlying higher-order traits. We used the well-studied interaction between barley and the wheat stem rust fungus Puccinia graminis f. sp. tritici as a model. We showed that the approach based on transcript abundance QTLs can complement or even replace more traditional gene isolation methods.
3) The existing barley genetic linkage map was supplemented with several thousand transcript-derived markers (TDMs).
The integrated EU project 'Exploitation of natural plant biodiversity for the pesticide-free production of food' (BIOEXPLOIT) is supported by the European Commission through the Sixth Framework Programme. The aim of this project is to force a breakthrough by developing efficient and rational breeding strategies using genomics and post-genomics tools to exploit natural host plant resistance. BIOEXPLOIT's focus is on wheat and potato, the two most important staple crops for all consumers in the EU, for which pesticides, mainly fungicides, are indispensable at the moment.
Within this project, using barley as a model for wheat, we are exploiting the principle of genetic analysis of mRNA abundance to address a more specific question: the interaction of the barley plant with the leaf rust fungus. The project brought together SCRI's expertise in QTL analysis and the knowledge of leaf rust biology at the University of Wageningen.
The following points summarise SCRI barley-related activities within the project.
1) The mRNA profiling platform for barley (Barley2) based on Agilent's 8x15K format was developed and used for the mRNA profiling. Barley2 represents about the same number of genes as Barley1, but the microarray is customisable and costs significantly less than the Barley1 microarray.
2) The mRNA abundance data sets representing several barley plant - leaf rust fungus interaction combinations have been generated. They included:
- a time course study of two cultivars, Steptoe and Morex, that were either infected or mock-inoculated with the leaf rust fungus
- profiling of infected/mock-infected barley reciprocal near-isogenic lines carrying either the susceptibility allele or the resistance allele of one of the barley-leaf rust fungus interaction QTLs
- profiling across a leaf rust fungus-challenged doubled haploid line population that was generated by crossing cultivar Morex with Steptoe.
3) The data sets above have been analysed and follow-up experiments are currently being carried out.