Studying mobile element-host interactions in a compact genome
The Dictyostelium genome: an inhospitable environment for mobile elements
Published in Mob. Genet. Elements 1(2), 145-150 (2011) (PubMed)
Since the discovery of social amoebae almost 150 years ago, generations of researchers have been interested in uncovering the molecular mechanisms underlying the formation of multicellular fruiting bodies from aggregated single cells. Dictyostelium discoideum has become their favorite model species. Consequently, the genome of D. discoideum was the first social amoeba genome to be fully sequenced. Not only has this sequence provided valuable information on the general genome structure of this organism, but it has also uncovered an unexpected diversity of mobile genetic elements in this species.
Roughly 65% of the 34 Mb D. discoideum genome is composed of genes. The intergenic regions accommodate gene regulatory sequences and are generally less than 1,000 bp in length. Molecular parasites invading such a compact genome face the problem of avoiding the disruption of genes or their regulatory elements to become fixed in the cell population. We hypothesize that D. discoideum has invented two prominent mechanisms to limit the expansion of mobile elements in its gene-dense genome and to maintain genome stability. First, D. discoideum cells are haploid. In a diploid cell, the constant threat of insertional mutagenesis caused by the activity of mobile elements would lead to the accumulation of heterozygous, potentially lethal mutations. Second, D. discoideum cells have an extended G2 phase of the cell cycle, which could make an unaffected template DNA strand available for the repair of de novo mobile element integrations before cell division.
Dictyostelium tRNA genes: landmarks for "safe" integration sites
We predict either that D. discoideum cells do not tolerate extensive amplification of mobile elements in their compact genomes or that some kind of integration preference was invented by the mobile elements, allowing them to spread without causing extensive damage to the host genome. In favor of the latter assumption, we note that the D. discoideum genome contains an unexpected diversity of mobile elements, which contribute about 10% of the genome size. Analysis of the D. discoideum genome sequence, that is, of integration events that putatively occurred thousands or even millions of years ago, roughly divides the population of mobile elements into two groups: (i) elements that seem to actively recognize tRNA genes as integration sites (or more generally, genes transcribed by RNA polymerase III; discussed below) and (ii) elements that do not seem to possess mechanisms of active targeting.
The two groups of mobile elements show dramatically different distributions in the genome, which may be a direct consequence of the presence or absence of active target site recognition. On the one hand, the Dictyostelium gypsy-like transposable element A (DGLT-A) and tRNA gene-associated retrotransposons (TREs), which together make up 39% of the total copy number of mobile elements, contribute more than 80% of mobile element loci in euchromatic regions of the D. discoideum genome. In other words, these retrotransposons have been exceptionally successful in capturing space between protein-coding regions, apparently without causing insertional mutagenesis, by targeting one of the approximately 390 tRNA genes distributed on all chromosomes. On the other hand, the majority of mobile elements in the D. discoideum genome form large clusters in which mobile elements seem to favor integration into pre-existing copies of themselves or other mobile elements. The most prominent examples are the centromeres of the D. discoideum chromosomes, which consist of 86% repetitive elements. In the centromeric repetitive element clusters, all mobile elements that do not follow an obvious tRNA gene targeting strategy are highly overrepresented. For instance, approximately 48% of centromere DNA consists of DIRS-1, which is virtually absent from the other parts of the chromosomes.
Should we consider the clustering of mobile elements in centromeres and other cluster hotspots in the D. discoideum genome as target site selectivity? Some kind of passive target site preference may be derived from homologous recombination of a mobilized element into pre-existing chromosomal copies of the same element, rather than from targeting mechanisms comparable to the active recognition of tRNA genes as integration sites. However, owing to the high gene density and the haploid state of the D. discoideum genome, the observation that "non-TRE" mobile elements often form clusters by insertion of one transposon species into another transposon species may point to an alternative explanation. Imagine that non-TRE mobile elements have no integration preference and frequently disrupt essential genes by insertional mutagenesis. Affected cells would be lost from the population, leaving only cells with integrants located in genomic regions where they can be tolerated to a certain extent. This process would enrich mobile elements in transposon clusters over evolutionary time and imply an integration preference that does not exist; the size of such clusters may only be limited by positive selection on satisfactory genome stability. In contrast, de novo TRE integrants near tRNA genes in euchromatic regions may be more tolerable because they do not per se disrupt protein-coding genes. This tolerance may allow for the accumulation of TRE integrants in these regions over time, even though cells may occasionally be lost from the population if resident TREs serve as sites for illegitimate recombinations that eliminate essential genes. A major challenge remaining to confirm this hypothesis is to determine whether the tRNA gene-selective integration of TREs is a common phenomenon and off-site integration occurs rarely.
Obviously, this question cannot be solved by analyzing "old" integrants; instead, assays must be developed that allow for the analysis of de novo integration events.
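The selection argument above can be illustrated with a small calculation. The following Python sketch is not a model from the article; it assumes, purely for illustration, that insertions land uniformly at random, that every insertion into a "lethal" fraction of the genome removes the cell from the population, and that pre-existing mobile elements lie entirely outside that lethal fraction. The function name and the choice to equate the lethal fraction with the 65% gene content are hypothetical simplifications.

```python
def surviving_fraction_in_elements(lethal_fraction: float,
                                   element_fraction: float) -> float:
    """Share of surviving insertions that sit inside pre-existing elements.

    Assumption (not from the article): insertions are uniform and random,
    and any insertion into the lethal fraction eliminates the cell.
    """
    if not 0.0 <= lethal_fraction < 1.0:
        raise ValueError("lethal_fraction must be in [0, 1)")
    tolerated = 1.0 - lethal_fraction
    if not 0.0 <= element_fraction <= tolerated:
        raise ValueError("element_fraction must fit into the tolerated space")
    # Survivors are spread over the tolerated space only.
    return element_fraction / tolerated


# Figures quoted in the text: about 65% genes, about 10% mobile elements.
GENE_FRACTION = 0.65
ELEMENT_FRACTION = 0.10

before = ELEMENT_FRACTION  # chance that a random insertion hits an element
after = surviving_fraction_in_elements(GENE_FRACTION, ELEMENT_FRACTION)
print(f"hits in elements before selection: {before:.0%}, after: {after:.0%}")
```

Under these assumptions, an element with no integration preference at all would still be found inside older elements almost three times as often as chance predicts (roughly 29% instead of 10%). The apparent preference grows further if regulatory intergenic sequences are also counted as lethal targets.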
The TRE trap: defining the integration target
The first tRNA gene-associated mobile element in D. discoideum was discovered by the Dingermann group in the late 1980s. While analyzing the upstream-flanking DNA sequences of tRNA genes, the group found that the "Dictyostelium Repetitive Element" (DRE) integrated approximately 50 bp upstream of tRNA genes in a strictly orientation-specific manner. Later, the Dictyostelium genome project uncovered six more tRNA gene-associated retrotransposons with structural similarity to DRE. We now refer to retrotransposons that integrate upstream of tRNA genes as TRE5 and to elements inserting downstream of tRNA genes as TRE3. Preliminary analysis of the complete D. discoideum genome sequence revealed that approximately 61% of the tRNA genes are associated with at least one TRE5 or TRE3, and tRNA genes can be targeted several times by TRE5 or TRE3 elements. Obviously, due to their precise integration distance to tRNA genes, TREs never integrate into each other but rather insert sequentially one before the other, forming tandems.
Analyzing the D. discoideum genome revealed that TRE5-A integrants are located within a window of 37-211 bp upstream of tRNA genes, with 61% of TRE5-As being inserted at a distance of 48±3 bp. D. discoideum tRNA genes are scattered throughout the chromosomes and do not show conserved flanking sequences that would suggest targeting of TRE5-A integration to particular DNA sequences. Thus, it initially remained elusive how TRE5-A recognizes potential integration sites. All tRNA genes are transcribed by RNA polymerase III (pol III) and exclusively use pol III-specific transcription factors. Therefore, TRE5-A-encoded proteins may recognize pol III or its associated transcription factors to select integration sites. The multi-subunit transcription factor TFIIIC recognizes the intragenic promoter elements, named A box and B box, of tRNA genes in a sequence-specific manner. DNA-bound TFIIIC then mediates the binding of TFIIIB, which in turn initiates transcription of the tRNA gene by recruiting pol III. Given the presumed topology of the TFIIIC/TFIIIB complex on tRNA genes and the average distance of approximately 50 bp of TRE5-A integrants to the tRNA gene, we speculated that TFIIIC may be essential for the integration process because it recruits TFIIIB, which is the actual partner for TRE5-A-derived proteins to confer tRNA gene-directed integration.
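The distance windows quoted above can be made concrete with a short sketch. The Python code below is illustrative only: the `TrnaGene` record, the coordinate convention (1-based, inclusive) and the decision to measure distance from the insertion coordinate to the first nucleotide of the tRNA gene are assumptions, not definitions taken from the article. The only values taken from the text are the 48±3 bp preferred distance and the 37-211 bp window.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrnaGene:
    start: int   # leftmost coordinate (1-based, inclusive); hypothetical convention
    end: int     # rightmost coordinate (1-based, inclusive)
    strand: str  # "+" or "-"


def upstream_distance(gene: TrnaGene, insertion_pos: int) -> Optional[int]:
    """Distance in bp from an insertion to the 5' end of the tRNA gene.

    Returns None when the insertion is not upstream of the gene.
    """
    if gene.strand == "+":
        distance = gene.start - insertion_pos
    elif gene.strand == "-":
        distance = insertion_pos - gene.end
    else:
        raise ValueError("strand must be '+' or '-'")
    return distance if distance > 0 else None


def classify(distance: Optional[int]) -> str:
    """Sort a distance into the windows quoted in the text."""
    if distance is None:
        return "not upstream"
    if 45 <= distance <= 51:      # 48 +/- 3 bp
        return "preferred distance"
    if 37 <= distance <= 211:     # window seen for genomic integrants
        return "within genomic window"
    return "outside window"


# Hypothetical coordinates, chosen only to exercise the functions.
gene = TrnaGene(start=1000, end=1072, strand="+")
for pos in (952, 900, 700, 1100):
    print(pos, classify(upstream_distance(gene, pos)))
```

Handling the strand explicitly matters because "upstream" is defined relative to the direction of transcription, so the same arithmetic would give the wrong sign for genes on the minus strand.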
To test this assumption, we developed an in vivo retrotransposition assay that we named "TRE trap". The TRE trap consists of the D. discoideum UMP synthase gene cloned into a plasmid. This gene contains an intron into which a tRNA gene had been inserted as a potential target for TRE5-A integration. An outline of the TRE trap assay is shown in the figure. The TRE trap plasmid is transformed into cells that lack functional UMP synthase (ura- cells) and are therefore resistant to the cytostatic drug 5-fluoroorotic acid (5-FOA). The idea of the TRE trap is that D. discoideum transformants carrying a chromosomal copy of the TRE trap would at first gain uracil prototrophy (ura+) and thus be sensitive to 5-FOA; however, if the UMP synthase gene was disrupted by integration of a TRE5-A mobilized from the endogenous population, the transformant would regain resistance to 5-FOA and its cells would grow clonally under selection with 5-FOA in the presence of uracil. Using this assay, we found that there is a fairly active population of TRE5-As in D. discoideum cells and that, in accordance with our prediction, the B box promoter of tRNA genes is essential for integration site recognition by TRE5-As. We ruled out the possibility that TRE5-A proteins directly recognized the B box by showing that the pol III-transcribed ribosomal 5S gene, which lacks a B box but nevertheless recruits the TFIIIC/TFIIIB complex by interaction with TFIIIA, is well recognized as an integration target in the context of the TRE trap. Additional evidence to support the hypothesis that TRE5-A elements may use protein-protein interactions rather than direct DNA binding to detect target sites comes from in vitro studies, suggesting that the ORF1 protein of TRE5-A is able to bind to TFIIIB subunits.
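The selection logic of the TRE trap described above can be written down as a small truth-table model. This Python sketch is a deliberate simplification and not part of the published assay: the function names and the two medium labels are hypothetical, and the model reduces each cell to two facts, whether it carries the trap and whether the trap's UMP synthase gene has been disrupted.

```python
def ump_synthase_active(has_trap: bool, trap_disrupted: bool) -> bool:
    """ura- host cells only gain UMP synthase activity from an intact trap."""
    return has_trap and not trap_disrupted


def survives(has_trap: bool, trap_disrupted: bool, medium: str) -> bool:
    """Predict growth on one of two hypothetical selection media.

    "no_uracil"    : only uracil prototrophs (ura+) grow.
    "5-FOA+uracil" : only cells without UMP synthase activity grow.
    """
    active = ump_synthase_active(has_trap, trap_disrupted)
    if medium == "no_uracil":
        return active
    if medium == "5-FOA+uracil":
        return not active
    raise ValueError(f"unknown medium: {medium}")


# Walk through the three cell states described in the text.
states = {
    "ura- host": (False, False),
    "transformant, intact trap": (True, False),
    "transformant, trap disrupted": (True, True),
}
for label, (has_trap, disrupted) in states.items():
    growth = {m: survives(has_trap, disrupted, m)
              for m in ("no_uracil", "5-FOA+uracil")}
    print(label, growth)
```

The model cannot tell a TRE5-A insertion apart from any other event that inactivates the UMP synthase gene; it only captures why disruption of the trap converts a 5-FOA-sensitive transformant back into a 5-FOA-resistant one.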
Interestingly, the majority of TRE5-A integrants recovered from the TRE trap were located 48±3 bp upstream of the bait tRNA gene, suggesting that this distance may represent the intrinsic target preference of the element and that the broader window of integrations detected by the analysis of old genomic integrations may be misleading. Thus, the TRE trap experiments strongly supported our general hypothesis that TREs have invented mechanisms for actively targeting tRNA genes. However, this assay could not be used to answer the question of whether TREs may cause insertional mutations despite their a priori integration preferences. A step further in answering this question was the development of genetically tagged TRE5-A retrotransposons capable of amplifying in D. discoideum cell cultures and presenting unbiased integration to natural target sites.
Genetically tagged TRE5-As: defining the retrotransposon structure
It is clear that analysis of genomic integrations of mobile elements that probably date back millions of years is limited by high genome flexibility. Hence, it is difficult to estimate how frequently tRNA gene-specific integration events have occurred. In fact, inserted retrotransposons could have been separated from their target tRNA genes by genomic rearrangements. Therefore, it may be misleading to conclude how frequently TREs may cause insertional mutations (despite their generally strong target site preference) from rare examples of apparent off-site integrations found in the D. discoideum genome sequence data. Thus, although the majority of TREs are found associated with tRNA genes, it cannot be determined whether "tRNA gene specificity" is overestimated because off-site integrations will ultimately cause elimination of the affected cells from the population.
In a recent study, we developed genetically traceable TRE5-A retrotransposons (TRE5-Absr elements) that help provide insight into the retrotransposition mechanism used by this element. TRE5-Absr elements contain a blasticidin selection marker (mbsrI) that is interrupted by a reverse intron. Thus, the mbsrI gene is transcribed as minus strand RNA of the TRE5-Absr retrotransposon, but the intron cannot be removed and no blasticidin resistance is established. On the other hand, the mbsrI intron is in the correct orientation with respect to the retrotransposon-derived plus strand RNA. If the plus strand RNA is spliced, reverse transcribed and integrated, a functional mbsr gene is expressed and confers blasticidin resistance to cells that experienced at least one retrotransposition event. In the TRE5-Absr retrotransposition assay, cells are first co-transformed with plasmids carrying the TRE5-Absr element and a G418 resistance gene, respectively. G418-resistant clones, which contain TRE5-Absr "master" elements inserted into the D. discoideum genome, are pooled and subjected to blasticidin selection. The numbers of blasticidin-resistant clones give an estimate of the retrotransposition activity of the transformed TRE5-Absr "master" elements.
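The orientation logic of the intron-interrupted marker described above can be summarized as a minimal state model. The Python sketch below is an illustration under simplifying assumptions, not the authors' analysis: the `MarkerCopy` record and the function names are hypothetical, and the model ignores efficiency, copy number and expression level. It encodes only two statements from the text: the intron can be removed from the plus strand RNA but not from the minus strand RNA, and resistance requires a spliced copy that has been reverse transcribed and integrated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkerCopy:
    """One genomic copy of the marker; hypothetical record for illustration."""
    has_intron: bool


def intron_removable(rna_strand: str) -> bool:
    """The reverse intron is only in splicing orientation on plus strand RNA."""
    if rna_strand not in ("plus", "minus"):
        raise ValueError("rna_strand must be 'plus' or 'minus'")
    return rna_strand == "plus"


def retrotranspose(master: MarkerCopy) -> MarkerCopy:
    """Plus strand RNA is spliced, reverse transcribed and integrated."""
    spliced = master.has_intron and intron_removable("plus")
    return MarkerCopy(has_intron=master.has_intron and not spliced)


def confers_resistance(copy: MarkerCopy) -> bool:
    """The marker is expressed from minus strand RNA, where the intron persists."""
    if not copy.has_intron:
        return True
    return intron_removable("minus")


master = MarkerCopy(has_intron=True)   # transformed "master" element
new_copy = retrotranspose(master)      # de novo integrant
print("master resistant:", confers_resistance(master))
print("new copy resistant:", confers_resistance(new_copy))
```

In this model a cell carrying only master elements stays blasticidin sensitive, while a single completed retrotransposition event is enough to produce a resistant cell, which is why counting resistant clones can serve as a readout of retrotransposition activity.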
The TRE5-Absr retrotransposition assay has the advantage over the TRE trap assay that de novo integrations of TRE5-A are allowed at any natural target site. The assay could eventually be used to challenge our hypothesis that TREs avoid insertional mutagenesis by highly specific targeting of tRNA genes and gain a selection advantage over other mobile elements to spread into active regions of the D. discoideum genome and maintain populations of retrotransposition-competent elements in euchromatic regions.
tRNA gene-targeted retrotransposition: avoiding insertional mutagenesis?
Since their discovery some 20 years ago, our understanding of tRNA gene-targeting retrotransposons in the D. discoideum genome has been that such elements have invented active integration site selection into "safe" loci to maintain an active, retrotransposition-competent population in the gene-dense host genome. The observation that TREs have successfully spread in euchromatic regions while other ("non-TRE") elements have been largely excluded from these regions seems to favor this perspective. With genetically tagged TRE5-A retrotransposons that mimic the authentic TRE5-A target site preference when amplifying in D. discoideum cells, we will now gain more detailed insight into fundamental questions, such as the strictness of the target site specificity and the strength of its impact on the successful colonization and general shaping of the D. discoideum genome in the past and present. Taking into account that target site recognition by TRE5-A depends primarily on binding of TFIIIC to genomic B boxes, one could argue that there may be more B box-like motifs distributed in the genome than the total number of tRNA genes. This prediction would be supported by results showing that B boxes occur on the extrachromosomal DNA elements that carry the D. discoideum ribosomal RNA genes. Thus, it will be interesting to see if TRE5-Absr elements will integrate at loci not associated with tRNA genes and whether these loci are off-site targets or previously unrecognized functional B boxes occupied by the TFIIIC/TFIIIB complex. If we predict that every authentic TRE5-Absr integration site is occupied by the TFIIIC/TFIIIB complex in vivo, TRE5-Absr elements are valuable tools to investigate genome organization and evolution in D. discoideum.