Systematic chemical studies indicate that the capability of Watson-Crick base-pairing is widespread among potentially natural nucleic acid alternatives taken from RNA's close structural neighborhood. A comparison of RNA and such alternatives with regard to chemical properties that are fundamental to the biological function of RNA provides chemical facts that may contain clues to RNA's origin.
Wir wollen nicht nur wissen wie die Natur ist (und wie ihre Vorgänge ablaufen), sondern wir wollen auch nach Möglichkeit das vielleicht utopisch und anmassend erscheinende Ziel erreichen, zu wissen, warum die Natur so und nicht anders ist. [Albert Einstein (1, p. 126)] [We not only want to know how nature is (and how her transactions are carried through), but we also want to reach, if possible, a goal which may seem utopian and presumptuous, namely, to know why nature is such and not otherwise.]
Chemical etiology (2) of nucleic acid structure refers to systematic experimental studies aimed at narrowing the diversity of possible answers to the question of why nature chose the structure type of ribofuranosyl nucleic acids, rather than some other family of molecular structures, as the molecular basis of life's genetic system. The quest is to uncover the criteria by which nature arrived at this choice; comprehending these criteria in chemical terms would constitute a central element of any theory on the origin of the particular kind of chemical life known today. The strategy is to conceive (through chemical reasoning) potentially natural alternatives to the nucleic acid structure, to synthesize such alternatives by chemical methods, and to compare them with the natural nucleic acids with respect to those chemical properties that are fundamental to the biological function of RNA and DNA. Basic to this research is the supposition that the RNA structure originated through a process that was combinatorial in nature with respect to the assembly and functional selection of an informational system within the domain of sugar-based oligonucleotides. The investigation can be viewed as an attempt to mimic the selectional part of such a hypothetical natural process by chemical means. In principle, the study has no bias with regard to the question of whether RNA first appeared in an abiotic or a biotic environment (3).
The etiological relevance of the strategy depends on the criteria by which the alternatives are chosen. The condition that a given candidate must fulfill in order to be selected for study is that one can deem it a potentially natural type of molecular structure. For this to be the case, one requires the system to be structurally derivable from a (CH2O)n sugar (n = 4, 5, or 6) by the same type of potentially natural chemistry that allows the structure of RNA to be derived from ribose. Whatever the specific chemistry that may have actually resulted in RNA, I hypothesize that the same type of chemistry had the potential to produce not just a single oligonucleotide system, but rather a variety of them, provided that not only ribose, but also other members of the family of aldosugars were part of the environment. That such should have been the case would be consonant with the (CH2O)n-sugars' exceptional chemical status of a family of molecular structures that, despite their apparent constitutional and configurational complexity, are all elementary in the sense that they can form from elementary starting materials.
The first chemical property by which a nucleic acid alternative is to be compared with RNA is the system's capacity for informational base pairing in the Watson-Crick mode. In more extended versions of the study, comparisons may also have to focus on constitutionally different kinds of base pairing, that is, on variants involving recognition partners distinct from the canonical purines and pyrimidines. Although it would seem that the constitutional diversity of potentially natural nucleobase alternatives is much smaller than that of possible backbones, a comprehensive chemical etiology of nucleic acid structure will have to screen nucleobase alternatives analogously to backbone alternatives. In principle, the same would hold for the third component of nucleic acid structure--the phosphodiester junction.
Screening alternative backbones with respect to base-pairing capacity may reveal that a given candidate is, in fact, not a base-pairing system. Such an "alternative" can be dropped from the list of potentially relevant evolutionary competitors of RNA. On the other hand, a system may possess a base-pairing capability comparable or even superior (with regard to pairing strength) to that of RNA. Then, it will have to be subjected to comparisons with respect to chemical properties that are etiologically more demanding than base pairing alone. Heading the list of such properties, in reference to an abiotic scenario, is the capability for (nonenzymatic) self-replication under potentially natural conditions, one of at least two general conditions to be met by any informational molecular system that eventually would have to assume the role of a "genetic system" in a given environment. The second of these conditions is a system's capacity to express a chemical phenotype in that environment. The term "chemical phenotype" is intended to denote the whole of an informational system's sequence-specific, and thus "inheritable," reactivities and catalytic properties that are able to act as selection factors for the system's replication in a given environment. This notion points to one of the directions along which an experimental chemical etiology of nucleic acid structure may have to proceed, should potentially natural alternative base-pairing systems be encountered that cannot be eliminated in early rounds of investigation as rivals of RNA and that do not reveal through their properties why the latter should have excelled in evolution. The structural and functional properties that may gain special relevance in such a study are largely unpredictable, and whether the natural system will ever disclose the reasons for its superiority over alternatives on the level of chemical properties is uncertain. 
It is conceivable that an alternative system may eventually require comparison with RNA on a level that would no longer be deemed "purely chemical," because the comparison would refer to the systems' evolvability. In an extreme case, this could amount to comparing and experimentally studying elements of an artificial alternative biology.
The research program delineated here has its underpinnings in the general views held by
the "geneticists'" (4) school of thought on the
etiology of life, pioneered by Eigen [(5); see also (6)], Orgel [(7); see also (8)],
Kuhn (9), and others, and modified later in consequence of
the discovery of catalytic RNA (10). The literature in the
field of nucleic acid chemistry records contributions that either have
adumbrated or complement the investigations surveyed here. A pioneering study
by Usher [(11); see also (12-16)] dealt
experimentally with the question of why RNA's phosphodiester junction is
positioned between carbons 3' and 5' and not between 2' and 5'. Its conclusion
pointed to the higher susceptibility to hydrolytic strand cleavage of duplexes
containing a unit with a (2'→5')-phosphodiester junction. A markedly lower pairing strength of
the (2'→5')-ribofuranosyl system has
since been documented (17). Recently, the same question,
but now asked of DNA (3), has been studied by Breslow (18) and Switzer (19). Alternative base
pairs were the subject of investigations by Benner (20) and
others [(21); see also (22)]. The
advantages for RNA and DNA to have the sugar units held together by
phosphodiester groups have been discussed in Westheimer's classic paper (23), and physicochemical aspects of the RNA structure's
suitability for its role as a biological informational system have been
summarized by Turner and Bevilacqua (24).
The ETH project of examining alternative nucleic acid systems was initiated in 1986 with the model study "homo-DNA" (25), an artificial oligonucleotide system differing from DNA in that the five-membered furanose ring is expanded to a six-membered pyranose ring by an additional methylene group (26), thus providing the opportunity to investigate whether five-memberedness of the sugar ring in an oligonucleotide system is a structural prerequisite for Watson-Crick base pairing (Fig. 1). Experimentally, homo-DNA was the first example (among the many now known) of a backbone-modified oligonucleotide system that shows much stronger Watson-Crick base pairing than DNA, and moreover, it was the first oligonucleotide system whose base pairing is orthogonal to that of the natural systems (25, 27). The reason for its greater base-pairing strength was postulated to be a consequence of the higher rigidity of pyranose as compared to that of furanose rings, resulting in a preorganization of the single strand's backbone conformation toward the double strand's pairing conformation (28, 29). The system also has a pronounced propensity for adenine and guanine self-pairing in the reverse-Hoogsteen mode.
Fig. 1. Constitution and configuration of the repeating units of
(potentially natural) nucleic acid alternatives investigated in the ETH and TSRI
laboratories. B, nucleobase; thick solid line, substituent above the ring plane; dashed
line, substituent below the ring plane.
Although homo-DNA was rewarding from a chemical point of view, it was never supposed to
be a potentially natural nucleic acid alternative (30), but
it was considered as a model system for the family of hexopyranosyl-(4'→6') oligonucleotides, whose
structures differ from that of homo-DNA by two additional hydroxy groups per
repetitive unit and relate to the natural hexoses in the same way that RNA
relates to ribose. Three systems within this family have been studied:
allo-, altro-, and glucopyranosyl (31) (Fig. 1). None of these exhibit discernible Watson-Crick
base pairing between adenine and uracil up to dodecamer strands under standard
conditions. Instead, purine-purine self-pairing in the reverse-Hoogsteen mode
occurs in the allo- and altropyranosyl series (but not in the glucopyranosyl series), although again much
less efficiently than in homo-DNA. Guanine-cytosine pairing in the
allo and altro systems is far weaker than in RNA; moreover, it is
strongly sequence dependent with regard to the pairing mode. The data given in Table 1 demonstrate the incompetence of the
allo- and altropyranosyl systems for informational base pairing.
The contrast between the base-pairing behavior of homo-DNA and its fully hydroxylated
counterparts was interpreted as the consequence of intrastrand steric hindrance
in the pairing conformation of the bulkier fully hydroxylated systems (Fig. 2, top left). Models, based on a nuclear magnetic resonance
(NMR) structure analysis of the homo-DNA duplex (6'-A5T5-4')2
(32), indicated that the insertion of an additional hydroxy
group at the equatorial 2' position of the pyranose ring in homo-DNA results
in a steric clash between this hydroxy group and the neighboring nucleobase.
In monodeoxy-allopyranosyl model systems,
studies of the temperature at which ~50% of duplex molecules are dissociated
into single strands (Tm) corroborated this hypothesis: adenine
self-pairing in the 2'-deoxy series is comparable in strength to that of
homo-DNA, whereas in the 3'-deoxy system it is as weak as in the
allopyranosyl series (33).
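The Tm values used throughout as a measure of pairing strength can be related to duplex thermodynamics by a simple two-state (all-or-none) model. What follows is a minimal sketch under that assumption, not the procedure used in the cited studies; the function names and the numerical values of the enthalpy, entropy, and strand concentration are hypothetical and serve only as illustration.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def melting_temperature(dH, dS, c_total, self_complementary=False):
    """Two-state Tm in kelvin.

    dH: enthalpy of duplex formation in kJ/mol (negative)
    dS: entropy of duplex formation in kJ/(mol*K) (negative)
    c_total: total strand concentration in mol/L
    """
    # At Tm half of the strands are paired. The equilibrium constant is then
    # 4/c_total for two different strands mixed 1:1 and 1/c_total for a
    # self-complementary strand.
    c_eff = c_total if self_complementary else c_total / 4.0
    return dH / (dS + R * math.log(c_eff))


def fraction_duplex(T, dH, dS, c_total, self_complementary=False):
    """Fraction of strands in the duplex state at temperature T (kelvin)."""
    K = math.exp(-(dH - T * dS) / (R * T))
    # Mass action gives alpha / (1 - alpha)**2 = q for both cases.
    q = K * c_total * (2.0 if self_complementary else 0.5)
    return ((2.0 * q + 1.0) - math.sqrt(4.0 * q + 1.0)) / (2.0 * q)


# Hypothetical parameters, chosen only to exercise the functions.
dH, dS, c = -250.0, -0.70, 1.0e-5
tm = melting_temperature(dH, dS, c)
print(f"Tm = {tm - 273.15:.1f} deg C, duplex fraction at Tm = "
      f"{fraction_duplex(tm, dH, dS, c):.3f}")
```

In this model a more negative enthalpy at fixed entropy raises Tm, and a smaller entropic penalty at fixed enthalpy does the same, which is one way to picture the backbone preorganization argument made for homo-DNA.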
Fig. 2. Idealized pairing conformations of sugar-phosphodiester
units in hexopyranosyl-(4' ...
The outcome of these studies has led to the conclusion that, for functional reasons,
the three hexopyranosyl-(4'→6')
oligonucleotide systems investigated (34) could not have
acted as viable competitors of RNA in the emergence of nature's genetic system.
Because steric bulk of the hexopyranosyl sugar units ("too many atoms")
was deemed responsible, studies were refocused on the sterically less bulky
pentose-derived nucleic acid alternatives, no longer inquiring "Why
pentose and not hexose nucleic acids?"--the question that had led to the
start of the ETH project in the first place--but rather "Why ribose and
not another pentose?" and "If ribose, why ribofuranose and not
ribopyranose?"
The primary object of our studies has been the pyranosyl isomer of RNA (p-RNA), a
system derived from the pyranose form of ribose with the nucleobases placed
equatorially on the pyranose chairs and with phosphodiester junctions between
(equatorial) positions 2' and 4' (Figs. 1 and 2).
Conformational analysis on the level of idealized conformations (35) predicted p-RNA to be a Watson-Crick base-pairing system
that would form duplexes that are quasi-linear or at least much less
helical than RNA (Fig. 2). Experimentally, p-RNA was not only
a much stronger Watson-Crick pairing system than RNA, but also a more selective
one with respect to pairing modes (no self-pairing of guanine-rich sequences in
the Hoogsteen or reverse-Hoogsteen mode). After an extensive investigation of
the system's structural and chemical properties, the study was extended to
the whole family of diastereoisomeric pentopyranosyl-(2'→4') oligonucleotide systems containing
ribo-, xylo-, lyxo-, and arabinopyranosyl units as building blocks,
always with the nucleobases equatorially positioned on the pyranosyl chairs.
Surprisingly, all members of this family of RNA isomers were stronger
Watson-Crick base-pairing systems than RNA itself (36); the
arabinopyranosyl isomer
ranks among the strongest oligonucleotide-type base-pairing systems encountered
(37). Promiscuous cross-pairing between all members of the
pentopyranosyl family was observed (Fig. 3) (36,
38), suggesting a remarkable capacity of their
diastereoisomeric backbones for adopting a common type of pairing conformation.
Nevertheless, differences are discernible: Cross-pairing is smoother between
members (between ribo and xylo and between lyxo and arabino) that have
phosphodiester bridges attached in the same conformation (4'-equatorial versus
4'-axial; see Fig. 2). The pentopyranosyl-(2'→4') oligonucleotides do not cross-pair with
RNA. However, a somewhat capriciously sequence-dependent cross-pairing
capability with DNA was observed in the case of the (L)-lyxopyranosyl-(3'→4') system (see Fig. 1, bottom left),
which, remarkably, was found to constitute a base-pairing system by itself, in
contrast to the corresponding (3'→4')
isomer of the ribopyranosyl series (39). The
lyxopyranosyl-(3'→4') system is the first duplex-forming
oligonucleotide system that has only five (instead of the usual six) covalent
bonds in the repetitive unit of its backbone.
Fig. 3. Pairing, cross-pairing, and self-pairing in the
pentopyranosyl-(2' ...
An NMR structure analysis by Schlönvogt et al. (40) of the p-RNA duplex derived from the self-complementary base sequence (4'-CGAATTCG-2') confirmed the type of pairing conformation predicted by conformational analysis, and modeling (40) based on molecular dynamics indicates that its duplex ladder structure has a weak left-handed twist (duplex derived from D-ribose) (Fig. 4). Base pairing in p-RNA is strictly antiparallel, and base stacking is overwhelmingly interstrand as opposed to intrastrand, the latter being the stacking mode common to the natural systems and most clearly displayed in a B-form DNA helix. Interstrand stacking correlates conformationally with the pronounced inclination of the p-RNA backbone toward the base-pair axes. It is also interpreted as the major controlling factor (other than base-pair constitution) in the sequence dependence of p-RNA duplex stability and of the relative rates of both template-directed ligations and self-templated oligomerizations, as well as the major factor determining the regioselectivity of the effect of dangling bases on duplex stability (41).
Fig. 4. Time-averaged and refined structure of an octamer duplex
derived from self-complementary pyranosyl-RNA sequence 4'-CGAATTCG-2', based on a 1000-ps
Molecular Dynamics calculation (Amber program) and an NMR structure analysis (40). The ball-and-stick representation and the excerpts of
purine-pyrimidine and purine-purine stacking (top right) and pyrimidine-pyrimidine
"nonstacking" (bottom left) illustrate the interstrand base-stacking that is
characteristic of pentopyranosyl-(2'
Strands of p-RNA form hairpin structures with an ease comparable to that of RNA strands (41). Replicative copying of p-RNA sequences is achieved by
template-directed ligations of short oligomers with 2',3'-cyclophosphate
derivatives, possibly the simplest mode of phosphate activation (35, 42). Ligation is regioselective; it
exclusively affords the correct (2'→4')-phosphodiester bridges, whereas the analogous cyclophosphate-mediated
ligation in the RNA series fails to produce the natural (3'→5') connection, resulting instead in the
isomeric (2'→5') junction (42, 43). Copying guanine-rich template sequences
in the p-RNA series is not hampered by guanine-guanine self-pairing.
Replication experiments, including ones with templates that form hairpin
structures, have had marked success in terms of sequence copying but have
failed to demonstrate (under the conditions tested) template-catalysis turnover
numbers greater than one (44).
The 2',3'-cyclophosphate derivatives of p-RNA tetramers containing hemi-self-complementary base sequences such as (4'-ATCG-2') or (4'-GCCG-2') were shown to undergo self-templating oligomerization and co-oligomerization to duplexes of higher oligomers (45). The process is highly chiroselective and offers a pathway for the constitutional self-assembly of complex libraries of (largely) homochiral p-RNA strands starting from D- and L-ribose-derived diastereoisomeric tetramers. In principle, such chiroselective self-templating co-oligomerizations of activated short oligonucleotides, when starting from racemic mixtures of diastereoisomers, have the potential to break molecular mirror symmetry by deracemization, provided the sequence diversity of the product libraries exceeds a critical level of complexity (45). This capacity, intrinsic to any informational polymer family that can chiroselectively self-assemble to form libraries of critical constitutional diversity, deserves special attention in the context of the problem of the origin of biomolecular homochirality. Homochiral p-RNA strands with an opposite sense of chirality can also pair with each other, but much less strongly, and by pairing rules that are the inverse of the canonical ones (for example, cytosine pairs with isoguanine, and guanine pairs with isocytosine) (22).
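The sequence terms used above (strictly antiparallel pairing, self-complementary, hemi-self-complementary) can be made concrete with a few lines of code. This is an illustrative sketch only: the function names are hypothetical, and the test for "hemi-self-complementary" encodes one reading of the term, namely that each half of an even-length sequence is self-complementary on its own.

```python
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def antiparallel_complement(seq):
    """Strand that pairs with seq in antiparallel Watson-Crick mode,
    written in the same direction (here 4' to 2') as seq itself."""
    return "".join(WATSON_CRICK[base] for base in reversed(seq))


def is_self_complementary(seq):
    """True if two copies of seq can form a fully paired antiparallel duplex."""
    return seq == antiparallel_complement(seq)


def is_hemi_self_complementary(seq):
    """Assumed reading: both halves of seq are self-complementary on their own."""
    half = len(seq) // 2
    return (
        len(seq) % 2 == 0
        and is_self_complementary(seq[:half])
        and is_self_complementary(seq[half:])
    )


for s in ("CGAATTCG", "ATCG", "GCCG"):
    print(s, is_self_complementary(s), is_hemi_self_complementary(s))
```

Under this reading, a tetramer such as ATCG can pair through one half with one partner and through the other half with a second partner, so that each strand bridges the junction between two strands on the opposite side.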
The base-pairing strength landscape of Fig. 5 surveys relative base-pairing capabilities of potentially natural nucleic acid alternatives, giving melting temperatures (Tm values) of A8/T8 and A12/T12 duplexes (46) determined under standard conditions (31, 36, 38, 39, 47-49). The profile of this landscape in the pentopyranosyl region is confirmed by Tm data of duplexes with irregular AT sequences (Fig. 3) as well as with sequences containing GC base pairs. These findings falsify one of the possible hypotheses about the criteria that nature may have followed in selecting the RNA structure, namely, that maximization of base-pairing strength within the domain of pentose-derived oligonucleotide systems was such a criterion. On the other hand, in conjunction with the observations made in the hexopyranosyl series, these data lend support to the notion that optimization, not maximization, of base-pairing strength was a determinant of RNA's selection. Biological reasoning would emphasize that moderate base-pairing strength, as encountered in RNA and resulting from the high conformational flexibility of the ribofuranose backbone, was essential for the evolution of a rich diversity of nucleic acid-related biological functions. The observed base-pairing strength in the family of the conformationally more rigid pentopyranosyl oligonucleotide systems provides a background of chemical facts in support of such an argument.
Fig. 5. Pairing-strength landscape of hexose-, pentose-, and
tetrose-derived oligonucleotide systems, showing the range of the constitutional and
configurational diversity of (potentially natural) alternatives of the RNA structure and
giving Tm values of duplexes A8 · T8 (red
level) and A12 · T12 (black level) investigated in the ETH and
TSRI laboratories. For conditions, see the caption of Fig. 3. (A)
Column height unit is 10°C above 0°C (see also Fig. 3); (B)
observed Tm values; (C) estimated Tm values,
purple for A8 · T8 and gray for A12 · T12;
and (D) not investigated. Data are as follows: hexopyranosyl series (31), pentopyranosyl series (36, 38,
39), pentofuranosyl series (48), and
tetrofuranosyl series (47).
Also, other observed chemical properties related to high base-pairing strength could be of etiological relevance. One is overtolerance to base-pair mismatches. Such behavior is exemplified in the pentopyranosyl series by the amazingly strong self-pairing of non-self-complementary base sequences such as (4')-TATTTTAA-(2') and (4')-TTAAAATA-(2') (see Fig. 3). The central issue, however, is the capability of nonenzymatic autocatalytic replication (50). Whereas high base-pairing strength can be expected to facilitate the selective recognition of template sequences by activated ligands and to accelerate sequence copying, it will concomitantly strengthen product inhibition of template sequences. Such inhibition has been observed to dominate strand replication in the p-RNA series. The issue is not specific to strongly pairing nucleic acid alternatives, however. Although great strides have been made in achieving template-directed copying of base sequences in the RNA (50-52), 2',5'-RNA (15, 53), DNA (54), and p-RNA (42, 44) series and genuinely auto- and cross-catalytic replication has been accomplished for constitutionally modified, short deoxyribonucleotide sequences (55), it has not been demonstrated that any oligonucleotide system possesses the capacity for efficient and reliable nonenzymatic replication under potentially natural conditions (56). Whether, and under what conditions, such a capacity can be demonstrated experimentally remains one of the crucial criteria for judging, from a chemical point of view, whether RNA (or, for that matter, any nucleic acid alternative) could have initiated an evolutionary process. Ongoing research probing RNA's chemical phenotype by running enzyme-assisted in vitro evolution of RNA sequences deserves to be mentioned specifically in this context, particularly the demanding attempts to discover (in this way) ribozymes that would be able to catalyze RNA replication (57).
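The link between pairing strength and product inhibition can be illustrated with a deliberately simplified equilibrium model. The sketch below assumes a self-complementary template T that is in equilibrium with its own duplex T2 and is active as a template only when single-stranded; the function name, the dissociation constants, and the concentration are hypothetical and are not taken from the cited experiments.

```python
import math


def free_template(c_total, kd):
    """Concentration of single-stranded template for 2 T <-> T2.

    kd = [T]**2 / [T2] is the duplex dissociation constant (mol/L) and
    c_total = [T] + 2*[T2] is the total strand concentration (mol/L).
    """
    # Mass balance: 2*[T]**2/kd + [T] - c_total = 0, solved for [T] > 0.
    return kd * (math.sqrt(1.0 + 8.0 * c_total / kd) - 1.0) / 4.0


c_total = 1.0e-5  # hypothetical total strand concentration, mol/L
for kd in (1.0e-5, 1.0e-7, 1.0e-9):  # weaker to stronger pairing
    t = free_template(c_total, kd)
    print(f"kd = {kd:.0e} M: free template fraction = {t / c_total:.4f}")
```

In the strong-pairing limit (kd much smaller than c_total) the expression reduces to the square root of kd times c_total divided by two, so the amount of available template grows only with the square root of the total amount formed. Under these assumptions, the stronger the pairing, the smaller the fraction of product strands that can serve as templates in the next round.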
Not only RNA's (or an alternative's) potential for nonenzymatic replication, but also its chances for formation in an abiotic natural environment remain open to question. Whereas there is a consensus on the notion that the building blocks of RNA (sugars, purines, and pyrimidines) potentially are of prebiotic origin (51, 58) and whereas the broad chemical contours of an assembly of the RNA structure from such building blocks seem clear, convincing experimental evidence that such a process can in fact occur under potentially natural conditions is still lacking (3, 59); this is particularly true with regard to such crucial steps as nucleotide formation and phosphate activation (60). Considering the etiological importance of the issue together with the chemical experience of a vast diversity of possible paths and conditions in the generation of almost any complex organic molecule, much more experimental work on this problem is needed before a chemically reliable assessment of RNA's (or an alternative's) chances for an abiotic origin can or should be made.
In principle, a chemical etiology of nucleic acid structure has to reckon with the possibility that the RNA structure might have originated as a consequence of synthetic contingency, not as a result of synthetic variation and functional selection. It is conceivable that circumstances could have favored the selective formation of the RNA structure in preference to alternatives, be it in an abiotic or a biotic environment (61). This would imply a synthetic rather than a functional selection as the primary determinant in RNA's emergence. If this were the case, then the chemical study of nucleic acid alternatives and the discovery of efficient informational base-pairing systems would acquire an altered importance, namely, that it would provide chemical facts in questioning the uniqueness of RNA and, implicitly, of the kind of life that is known.
Asking the central question regarding the criteria for RNA's natural selection and extending the inquiry to whether its emergence was dominated by combinatorial generation and functional selection or by synthetic contingency could mean embarking on a program of much more comprehensive chemical screening of potentially natural nucleic acid alternatives. In such a program, the selection of candidates would not be limited to RNA's structural neighborhood (62) and, moreover, would include the systematic exploration of the candidates' potential for being generated under natural conditions. Eventually, the course and the outcome of such studies would emphasize the point that the aim of an experimental etiological chemistry must be not primarily to show how life on Earth could have originated, but to provide decisive experimental evidence--through the realization of model systems in the laboratory ("artificial chemical life")--that life can arise as a result of the organization of organic matter.