for DNA Testing of the Human Remains from Columbia Park, Kennewick,
At the request of the Department of Justice and Dr. Francis P. McManamon, Departmental Consulting Archaeologist of the National Park Service, Department of the Interior, we supply this discussion of the potential for DNA analysis of the human skeletal remains from Kennewick, Washington that are the objects of the lawsuit now pending (Bonnichsen et al. vs. United States of America, Civil No. 9601481-JE). The purpose of such an analysis would be to determine the genetic affinity of the above individual by isolating DNA from bone, and comparing any data generated with the known range and variation in human mitochondrial DNA.
The following is a synopsis of the potential results of a mitochondrial DNA analysis of a human skeleton found at Columbia Park in Kennewick, Washington. The possible results are listed in the order that we deem most likely based on our own experience and the data available in the literature. In the text of this document, we have detailed both the reasoning and the published support for these positions. We would emphasize that DNA testing of skeletalized human remains at this time depth is not a routine matter. To our knowledge, DNA data have never been used in whole or in part by the courts or as part of a NAGPRA (Native American Graves Protection and Repatriation Act) request to resolve the identification of an individual skeleton as either American Indian or a tribal member.
Should DNA analysis of the skeleton in question be undertaken, the following are the likely outcomes:
1. No DNA of suitable size or integrity remains in the skeleton.
2. A DNA type indicative of an American Indian will be found.
3. A DNA type of ambiguous origin will be found.
4. Contaminating DNA from contemporary sources will prevent an ancient DNA analysis.
None of the outcomes described above will unambiguously allow the remains
of the skeleton in question to be assigned to a given tribal authority.
Mitochondrial DNA analyses of known American Indian genetic groupings
will not resolve questions of cultural affiliation at the tribal or regional level. It is possible
that mitochondrial DNA analyses of the skeleton will allow assignment
of the skeleton to the greater cultural and biological grouping of American
Indian. On January 13, 2000 the Department of Interior determined "it
is reasonable to conclude that the human remains from Columbia Park
in Kennewick, WA are "Native American" as defined by the Native American
Graves Protection and Repatriation Act" (memorandum
from the Departmental Consulting Archaeologist to the Assistant Secretary,
Fish and Wildlife and Parks). As bioscientists, we consider that
the prudent and ethical course of action is to seek the advice and consent
of interested American Indian descendants before any genetic testing
of this skeleton is done.
Ancient DNA Background and a Consideration of Finding No DNA in a Subfossil Bone
Standard molecular genetic analyses of humans and other organisms utilize a technological innovation that was developed in the mid-1980s (Mullis and Faloona 1987). The polymerase chain reaction, or PCR (see Figure 1), enables the specific targeting and amplification of a discrete region of the genome while the remaining bulk of the genomic DNA is excluded from the reaction and effectively stays in the background. Multiple cycles of amplification result in the exponential production of PCR product since product from each cycle serves as a template for additional product in each subsequent cycle. In a typical 30-cycle reaction, one billion copies are made of a single initial DNA template.
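The arithmetic behind the "one billion copies" figure can be sketched as follows. This is a minimal illustration, assuming the simple textbook model in which each cycle multiplies the number of templates by (1 + efficiency); the function name and the efficiency parameter are our own labels for this sketch and do not come from any cited protocol.

```python
def pcr_copies(initial_templates: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after a given number of PCR cycles.

    Assumed model: every cycle copies a fraction `efficiency` of the
    templates present, so the pool grows by a factor of (1 + efficiency).
    An efficiency of 1.0 corresponds to perfect doubling.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_templates * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Perfect doubling from a single template over 30 cycles: 2**30 copies.
    print(f"{pcr_copies(1, 30):,.0f}")
    # An illustrative, lower efficiency yields far fewer copies.
    print(f"{pcr_copies(1, 30, efficiency=0.8):,.0f}")
```

With perfect doubling, a single template gives 2^30 (about 1.07 billion) copies after 30 cycles, which is the figure quoted above. Real reactions fall short of perfect doubling, particularly with damaged templates.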
The exponential amplification of a specific region of DNA from only a few molecules has permitted the investigation of "ancient" DNA samples that are too degraded or damaged for analysis by traditional cloning methods, which require a much higher quantity and quality of DNA. "Ancient" samples are generally those that were not collected for the purpose of immediate DNA or RNA analysis and include archaeological, clinical, and natural history specimens. Since these specimens were not originally collected or preserved for nucleic acid analysis, endogenous DNA is typically damaged to an extent that enzymatic amplification can be quite difficult, if not impossible, to achieve. The types of DNA damage that are primarily encountered include modifications of pyrimidines and sugar residues as well as baseless sites and intermolecular crosslinks (Pääbo 1989). Only limited research has been conducted on the chemistry of ancient DNA (aDNA) damage and possible methods of in vitro repair. Various protocols have been developed to determine the utility of specific ancient specimens for aDNA analysis (Handt et al., 1994, 1996, Richards et al., 1995, Poinar et al., 1996, Hummel et al., 1999) and these methods have recently been evaluated in a comparative study (Kolman and Tuross 2000).
Although the study of variability in the human genome has long been an area of research, both the ability to retrieve DNA from human skeletal remains and a substantial database with which to compare the results are fairly new scientific developments. The first Ancient DNA meeting was held in Nottingham, England in 1991. Since that time, the development in the field has been slow, and not without controversy. Early, spectacular claims of successful DNA extraction and amplification from extremely old specimens, such as 17-20 million year old (Myr old) Magnolia leaf fossils (Golenberg et al., 1990), 25-135 Myr old specimens preserved in amber (DeSalle et al., 1992, Cano et al., 1993), and 80 Myr old dinosaur bones (Woodward et al., 1994), generally have been disproved or cast into serious doubt (Sidow et al., 1991, DeSalle et al., 1993, Hedges and Schweitzer 1995, Austin et al., 1997, Walden and Robertson, 1997, Austin et al., 1998). Later authors, using relatively simple methods, were able to detect contamination in the early studies, such as Hedges and Schweitzer's (1995) phylogenetic analysis of proposed dinosaur DNA that identified it as modern human contamination.
There are relatively few published ancient DNA studies of bone of an age that approaches or exceeds the 9000 years reported for the human skeleton in this case (Taylor et al., 1998; see also Table 3). In one study, fossil and subfossil horse bones were subjected to PCR amplification, and of fifty-two specimens, only two yielded DNA data: one from Kent's Cavern in Britain (approximately 12,000 BP), and another about 100 years old (Lister et al., 1998). When nine ancient cattle bones were subjected to PCR of a coding region of mitochondrial DNA, no specimen more than 2,000 years old was successfully amplified (Turner et al., 1998). The bones of Pleistocene megafauna have produced DNA that was successfully PCR amplified (Yang et al., 1996; Greenwood et al., 1999; Hanni et al., 1994) from a ground sloth, a cave bear, two mastodons and three mammoths. The most recent of these studies (Greenwood et al., 1999) reported the retrieval of multi- and single-copy nuclear genes, opening up the possibility that permafrost-stored fossils may be a good source of a wider range of ancient DNA data.
In the Americas, one thorough ancient DNA study of a human skeleton comes close to the temporal range ascribed to the skeleton in question (Stone and Stoneking 1996). The skeleton from Hourglass Cave, estimated to be 8000 years old by radiocarbon dating, was found in 1988 in western Colorado at an altitude of 3000 meters. Stone and Stoneking (1996) attributed the DNA preservation to "the cold and constant environment of the high altitude cave in which the skeletal remains were found", and determined that this individual belonged to the B haplogroup (see Table 1 and the next section for a definition of haplogroups A, B, C and D) based on length polymorphism, restriction analysis and direct and cloned d-loop DNA sequencing. A partial genetic typing of a skeleton from the Pyramid Lake region of western Nevada, with an associated radiocarbon date of 9515 ± 60 BP, identified it as belonging to the C haplogroup by the lack of a restriction digestion (HincII) at nucleotide position bp 13259 (Kaestle, 1997).
The only large genetic study of ancient American Indians (n=108) examined a population that lived approximately 700 years ago (Stone and Stoneking 1998). Seventy percent of the bone samples produced mtDNA results in this study (Stone and Stoneking 1999) and mitochondrial haplogroups that are found in contemporary American Indians make up 95% of the ancient genetic types (Stone and Stoneking, 1998). The other genetic types found associated with the human skeletal remains in this study were ascribed to contemporary human DNA contamination or were of ambiguous origin (Stone and Stoneking 1998), although further analysis of a fifth haplogroup found at Norris Farm associates it with Mongolian sequences (Stone and Stoneking 1999). In general, when a high-resolution genetic analysis, such as the sequencing of multiple PCR clones, is used (Handt et al., 1996; Kolman and Tuross 2000), contamination from contemporary human DNA seems to be a persistent, albeit not always fatal, problem. Specifically, the high-resolution analyses revealed the presence of multiple DNA sequences in several specimens, implying multiple sources of DNA, only one of which could represent ancient, endogenous DNA. The difficulty lies in determining which sequence, if any, is derived from the ancient specimen and is not modern contamination. Where only lower-resolution data are available (Parr et al., 1996; Fox 1996; Merriwether et al., 1997; Kaestle 1997) it is difficult, and at times impossible, to determine the impact of contemporary human DNA contamination in the form of airborne products, handling or PCR products. In other parts of the world, a robust analysis of one Neanderthal skeleton has been reported (with accompanying contemporary contamination) (Krings et al., 1997). In this case, the Neanderthal sequence could be discriminated from the contamination because the ancient sequence was completely novel and highly divergent from all previously reported modern human sequences.
Recent information regarding the general organic preservation of the skeleton in question (see below) further strengthens the possibility that no DNA remains in a state that is useful for genetic analysis.
Defining a Choice of Genetic Markers With an Emphasis on the New World
Human beings carry two types of DNA, mitochondrial and nuclear, both of which are suitable for genetic analysis. Traditionally, mitochondrial DNA (mtDNA) has been studied much more extensively although the analysis of nuclear markers has become increasingly common in recent years. There are several advantages to the analysis of mtDNA that account for its early popularity in genetic and evolutionary studies. The mitochondrial genome has a higher mutation rate compared to the nuclear genome (although specific loci exist within both genomes that provide exceptions to this statement) such that mutations are generated sufficiently rapidly in the mitochondrial genome that the process of evolution can be detected and investigated. Furthermore, the region of the mitochondrial genome involved in replication of the genome, the control region, appears to have a mutation rate that is approximately ten times the rate of the mitochondrial genome as a whole. For this reason, many researchers have focused on the control region for evolutionary or population studies. Thousands of mitochondria are present in each cell meaning that thousands of copies of mtDNA are present in contrast to a single copy of each nuclear genome per cell. Due to the high copy number of mitochondrial genomes, mtDNA is relatively easy to isolate in the laboratory although technological advances over the past decade have minimized this technical difference between mitochondrial and nuclear DNA. Finally, mtDNA is characterized by strict maternal inheritance (offspring receive mtDNA only from their mother) and lack of recombination between different regions of the genome in contrast to nuclear DNA, which is biparentally inherited and subject to extensive recombination. The significance of the mitochondrion's simpler mode of transmission from parent to offspring is the ease with which any particular region of the mitochondrial genome can be traced through time and through the maternal lineage. 
Recent reports suggesting the contribution of paternal mitochondria or nuclear copies of mtDNA to mtDNA (e.g., Awadalla et al., 1999; Hagelberg et al., 1999) are unlikely to obscure the utility of the markers discussed in this report. It can, therefore, be a straightforward matter to reconstruct evolutionary relationships of populations or individuals through an analysis of mtDNA.
When choosing a particular region of the genome, or locus, to investigate in a genetic study, there are two criteria that must be met. First, the locus must have a level of variability among the individuals or populations being studied such that the individuals or populations can be differentiated from one another. Second, a comparative database for relevant populations on the same genetic region must be available in order to determine the relationship or identity of the individual or population under study with respect to other populations.
Currently, all genetic analyses are performed using amplification products derived from the polymerase chain reaction as described above. In order to identify regions with the best levels of variability, researchers are constantly testing new regions and assaying their variability in different populations. This means that the comparative database is spread out over many different genetic loci and populations making comparison between specific loci and particular populations difficult. However, due to the long-term focus on the mitochondrial genome for genetic studies, the comparative database for mitochondrial loci is much more extensive than that for nuclear loci. Furthermore, in human populations, DNA sequence determination of the control region is the most frequently generated type of mitochondrial data. Analysis of restriction fragment length polymorphisms (RFLPs), in which the PCR product is cleaved by a restriction enzyme that recognizes a particular DNA sequence, or analysis of regions with large deletions or insertions also is commonly conducted.
Figure 1 illustrates the three types of genetic markers that typically are assayed in New World indigenous populations. After PCR amplification of the region of interest, the PCR product can be analyzed in a number of ways. First, the order of nucleotides, or DNA sequence, of the PCR product, usually the control region, can be determined and compared. Second, a marker commonly called the 9bp deletion (located between base pairs [bps] 8272 and 8289 [numbering according to Anderson et al., 1981]) can be identified. The 9bp deletion represents a region where a stretch of 9bps has been deleted relative to a reference sequence. Based on the size difference between the deleted and non-deleted alleles that is observed by electrophoresis of the PCR product through an agarose gel matrix, it can be determined which allele a particular individual carries. Third, there are certain RFLPs that are highly informative in New World indigenous populations and that are also assayed based on a size difference between alleles. Any difference between individuals that is detected in the above analyses can be referred to as a marker, a polymorphism, or an allele, and the combination of these markers in one individual is referred to as a haplotype.
New World indigenous groups were first assayed for mtDNA RFLPs and the 9bp deletion by Douglas C. Wallace and coworkers in the 1980s. These investigators used a phylogenetic analysis, which is similar to drawing a family tree, to define four clusters of haplotypes, or haplogroups, that were present at varying frequencies in populations distributed throughout the New World (Torroni et al., 1992). Each haplogroup was defined by a single RFLP or deletion and the four clusters were called haplogroups A, B, C, and D. Briefly, a HaeIII site at bp 663 defined haplogroup A; the 9bp deletion defined haplogroup B; an AluI site at bp 13262 defined haplogroup C; and loss of an AluI site at bp 5176 defined haplogroup D. These haplogroups were proposed to represent the entire mitochondrial diversity of New World indigenous populations and also to correspond to the founding haplotypes present at the initial colonization of the New World. These haplogroups have now also been defined by specific polymorphisms in the mitochondrial control region (Horai et al., 1993). All diagnostic sites are listed in Table 1.
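The four defining markers just described lend themselves to a simple decision rule, sketched below. This is only an illustration of the logic, not a laboratory protocol: the dictionary keys are hypothetical labels invented for this sketch, and the rule uses only the four RFLP/deletion states named in the text (control region sites from Table 1 are not included).

```python
from typing import Dict, Optional


def assign_haplogroup(markers: Dict[str, Optional[bool]]) -> str:
    """Assign a New World founding haplogroup from four marker states.

    Each value is True (site or deletion present), False (absent), or
    None / missing (not typed). Key names are hypothetical labels.
    """
    calls = []
    if markers.get("HaeIII_663_site") is True:    # site present -> A
        calls.append("A")
    if markers.get("9bp_deletion") is True:       # deletion present -> B
        calls.append("B")
    if markers.get("AluI_13262_site") is True:    # site present -> C
        calls.append("C")
    if markers.get("AluI_5176_site") is False:    # site lost -> D
        calls.append("D")

    if len(calls) == 1:
        return calls[0]
    if not calls:
        # No diagnostic state observed: a non-A/B/C/D type, or a sample
        # typed too incompletely to tell.
        return "Other"
    # More than one diagnostic state in one sample: possible mixture.
    return "Ambiguous (" + "/".join(calls) + ")"


if __name__ == "__main__":
    sample = {
        "HaeIII_663_site": False,
        "9bp_deletion": True,
        "AluI_13262_site": False,
        "AluI_5176_site": True,
    }
    print(assign_haplogroup(sample))
```

A sample that shows diagnostic states for two haplogroups at once is returned as ambiguous rather than forced into one group, because such a pattern may reflect a mixture of DNA sources.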
With the advent of PCR technology in the field of aDNA, the analysis of these markers in ancient individuals or populations would seem obvious though perhaps not as straightforward as originally thought. The RFLPs and deletion defined by Wallace and coworkers represent only a subset of all mitochondrial polymorphisms currently assayed in contemporary New World indigenous populations. Furthermore, contemporary New World populations carry only a fraction of the mitochondrial variation present worldwide. Since modern populations may carry less genetic variation than their ancestors or may be more distantly related to ancient populations than is currently recognized, it is a dangerous strategy to assay prehistoric populations for a restricted set of markers that have been culled from contemporary populations. To assay ancient specimens for only a few diagnostic markers with the justification of damaged aDNA and commensurate increase in time required for aDNA analyses is to invite incorrect haplogroup assignments. In other words, more markers, rather than fewer, should be assayed in ancient specimens relative to modern ones in order to increase the probability of an accurate classification of the ancient specimen.
In order to obtain maximal information and comparability of their data, most researchers assay both control region DNA sequence and RFLP/deletion markers in New World populations, both contemporary and ancient (e.g., Ward et al., 1991, Stone and Stoneking 1998, Kolman and Tuross 2000). The assignment of haplogroup using both control region sequence data and RFLP/deletion data provides a quality control check for the accuracy of the data, which is a necessary safeguard in an aDNA study. Furthermore, the existence of databases with only RFLP/deletion data or only control region sequence data means that the comparability of one's data is doubled if both types of loci are analyzed. Comparability of data is essential if the goal of the study is to determine relatedness or identity of an individual since it is only through comparison with other populations that an identity classification can be made.
Genetic Classification of a Single Archaeological Specimen and a Consideration of Ambiguous Genetic Results
In the absence of accompanying cultural artifacts, a single, isolated skeleton can often be classified with respect to other human populations using genetic data with the caveat that detailed classifications are more difficult to resolve than more general ones. Classification of an individual as being more closely affiliated with one population than another is based on a measure of distance of some character between the ancient individual and comparable populations. Physical morphological characters, such as cranial measurements or dental characteristics, can be used although these data may be valid only for divergence times of several thousand years as it appears that there is more plasticity in osteological characters than was previously believed, particularly in the New World (Powell 1998). On the other hand, current data suggest that there is very little genetic change measured by mitochondrial DNA over time throughout New World indigenous populations and substantial continuity between ancient and contemporary populations.
As described above, mitochondrial control region DNA sequence data or RFLP data are most commonly used in human evolutionary or population genetic studies. Table 2 provides a summary of data available on contemporary human populations distributed worldwide that have been published in peer-reviewed scientific journals (see Figure 2 for a map of populations from Table 2). For the purposes of the question currently being considered, i.e. the genetic classification of the skeleton found in Washington State, only a representative listing of studies on African and European populations is presented in Table 2. Because of the increased relevance of populations geographically close to the discovery site, all DNA-based studies on American Indian populations and ancestral Asian populations are listed. Asian populations are considered ancestral to American Indians because it is generally accepted by the scientific community that the New World was colonized by ancient Asian population(s) crossing over the Bering land bridge that was exposed during the last Ice Age. Only populations with sample sizes greater than 20 were included in the table, with the exception of aDNA studies. Most aDNA studies have smaller sample sizes relative to studies of contemporary populations, both because many excavated ancient burial populations number fewer than 20 individuals, and may be only a single individual, as in the Kennewick case, and because of the increased difficulty of analyzing ancient specimens. Twenty individuals are generally considered to be the minimal size of a population to be used in a comparative analysis. The type of data generated, RFLP or DNA sequence, in each study is presented. Also, the frequency of New World founding haplogroups A, B, C, and D determined for each population is listed, with all non-A, B, C, D haplotypes pooled together under "Other".
As can be seen in Table 2, the four New World haplogroups are found only in American Indian populations and ancestral Asian populations. Therefore, a distinction between American Indian ancestry and African or European ancestry can easily be made based on the presence or absence of a New World founding haplogroup in the individual under study. This conclusion assumes that contemporary populations accurately reflect the genetic make-up of their ancestors and that no distinct haplotypes have been lost over time (this point will be discussed more fully below). It is equally clear from Table 2 that contemporary American Indian populations look quite similar to one another from a mitochondrial perspective. All four haplogroups are found in populations distributed throughout the New World with no haplogroup unique to any population, geographic region, or linguistic classification (New World indigenous populations have been divided into three linguistic families, Eskimo-Aleut, Na-Dene, and Amerind [Greenberg et al., 1986]). Furthermore, New World ancient populations also appear similar to contemporary American Indian populations in that the four haplogroups are found throughout the studied ancient populations and throughout the New World. In general, non-A, B, C, D haplogroups make only a minor contribution to the genetic diversity of ancient New World populations, a result that is mirrored in contemporary populations. Note that 8.8% of haplotypes are listed as "Other" in all ancient New World studies relative to 4.5% "Other" haplotypes in all contemporary New World studies listed in Table 2. However, some of the "Other" haplotypes in aDNA studies are likely to be due to modern DNA contamination, thus lowering the number of truly ancient "Other" haplotypes. A comparable continent-wide distribution and high frequency of the four founding haplogroups in ancient and contemporary New World populations suggest that descendant populations accurately represent ancestral populations.
This conclusion implies that no haplotypes have become extinct during the human settlement of the New World and that the four haplogroups represent all founding lineages, although a very low frequency founding haplotype in an ancient population could still be missed given the small sampling of ancient populations at present. Asian populations show higher levels of mitochondrial diversity and more non-A, B, C, D haplotypes. Siberian populations are characterized by a lack of haplogroup B and Southeast Asian populations are characterized by presence of only haplogroup B of the four New World haplogroups. In Asia, all four haplogroups are found only in east central Asian populations. This non-random distribution of the New World haplogroups outside of the New World has been used to support the argument that colonizing populations originated in the greater Mongolia region (Kolman et al., 1996). This interpretation is consistent with the Asian ancestry of American Indians that had been proposed prior to molecular analyses.
Therefore, determination of a haplotype belonging to haplogroup A, B, C, or D in a skeletal specimen would strongly suggest American Indian ancestry. However, because of the ubiquity of haplogroups A, B, C, and D throughout the New World, a more detailed classification of a single A, B, C, or D haplotype to a particular American Indian population or tribal group would be virtually impossible based on a visual inspection of the data. Therefore, mtDNA sequence and RFLP/deletion data such as those presented in Table 2 typically are analyzed using phylogenetic algorithms to determine accurate genetic relationships among the haplotypes.
Phylogenetic analysis of genetic data is a means of determining the most accurate evolutionary relationship of individual haplotypes. The result is generally displayed as a tree, similar to a family tree, with an ancestral root haplotype denoted and branches of related haplotypes referred to as clades. There are basically two types of mathematical models used to derive phylogenetic trees. Cladistic approaches attempt to determine the shortest, most parsimonious, tree needed to accurately represent all of the characters that have been assayed. Phenetic approaches are based on a numerical genetic distance measured between assayed characters, which is reflected in the branch lengths of the tree. Both approaches have advantages and disadvantages depending on the type of data analyzed and its mutation rate, the divergence time of the individuals or populations being studied, and other factors. Due to the difficulty in identifying a superior phylogenetic model for any particular dataset, multiple models and algorithms are typically used so that relationships recovered by several approaches are given greater weight than relationships that are detected using only a single model. However, all of the models depend on the strength of the signal being greater than any "noise." "Noise" consists of random mutational events or multiple mutations at identical sites that either do not reflect evolutionary history or violate assumptions implicit in the phylogenetic models. In other words, all phylogenetic methods assume that tracking DNA mutations through a given data set will reveal the evolutionary history of the populations being studied, and any mutations that violate this assumption will confuse the outcome and compromise the integrity of the phylogenetic analysis. Moreover, any analysis is only as good as the input data.
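As a concrete illustration of the "numerical genetic distance" used by phenetic approaches, the sketch below computes the simplest such measure, the proportion of differing sites (p-distance) between aligned sequences. It is a minimal example under stated assumptions: the sequences are short invented strings rather than real control region data, the names are hypothetical, and a real analysis would use established phylogenetic software with substitution models and tree-building algorithms rather than raw pairwise differences.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has anything other than A, C, G or T
    (e.g. N for an unreadable base, common in damaged DNA) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


if __name__ == "__main__":
    # Invented toy haplotypes, for illustration only.
    haplotypes = {
        "reference": "ACGTACGTAC",
        "hap_1": "ACGTACGTAT",
        "hap_2": "ACTTACNTAT",
    }
    for (name_a, a), (name_b, b) in combinations(haplotypes.items(), 2):
        print(f"{name_a} vs {name_b}: {p_distance(a, b):.3f}")
```

A matrix of such distances is the input to distance-based tree methods. The "noise" discussed above enters here directly: repeated mutations at the same site make two sequences look closer than their true history would imply.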
Although a phylogenetic analysis is quite useful for determining the affinity of one population to another, the classification of a single individual or single haplotype as belonging to one particular group, such as a specific Native American tribe, is most likely beyond the power of phylogenetic analysis and, indeed, any analysis. The exception is an individual and comparable population that are so uniquely similar that their relatedness is obvious, in which case no sophisticated analysis would be necessary for proof of the relationship.
Contamination of Ancient Human Specimens with Modern DNA
As explained above, PCR enables the specific, exponential amplification of a discrete region of the genome. This ability has permitted the investigation of DNA samples from ancient specimens that typically are much more degraded or damaged than DNA samples from fresh or modern samples. However, the damage to aDNA increases the potential for another characteristic of PCR, its susceptibility to contamination, to intrude into the analysis. Since PCR analysis involves the exponential generation of new, synthetic DNA products from a small number of molecules, contamination with exogenous DNA in one of the initial PCR cycles can result in exclusive amplification of the contaminating DNA. This possibility is increased in aDNA analysis, where the contaminant is likely to be undamaged DNA which will be amplified preferentially over the damaged, endogenous DNA. The growing number of aDNA studies published and number of samples and polymorphic sites assayed may give the impression that all technological hurdles associated with aDNA technology have been overcome. However, identification of contamination remains the single most critical issue in aDNA methodology. Standard precautionary measures such as negative extraction and PCR controls, multiple extractions, and "clean" rooms, while necessary, have proven insufficient to identify complex co-occurrence of endogenous ancient DNA and modern contamination in human skeletal remains (Kolman and Tuross 2000).
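The preferential amplification of an undamaged contaminant can be made concrete with a small calculation. The sketch below assumes a simple model in which each template pool grows by a factor of (1 + efficiency) per cycle; the function name, the starting copy numbers, and the efficiency values are all invented for illustration and are not measurements from any study.

```python
def contaminant_fraction(
    endogenous_copies: float,
    contaminant_copies: float,
    cycles: int,
    eff_endogenous: float,
    eff_contaminant: float,
) -> float:
    """Fraction of the final PCR product derived from the contaminant.

    Assumed model: each pool is multiplied by (1 + efficiency) per cycle,
    with damaged endogenous DNA amplifying less efficiently.
    """
    endogenous = endogenous_copies * (1.0 + eff_endogenous) ** cycles
    contaminant = contaminant_copies * (1.0 + eff_contaminant) ** cycles
    return contaminant / (endogenous + contaminant)


if __name__ == "__main__":
    # Illustrative values only: 100 damaged endogenous templates versus a
    # single intact contaminating molecule, over 30 cycles.
    fraction = contaminant_fraction(100, 1, 30, eff_endogenous=0.5, eff_contaminant=0.95)
    print(f"contaminant share of product: {fraction:.1%}")
```

Under these assumed values, a single intact contaminating molecule ends up supplying most of the final product even though it started as one template among a hundred. With equal efficiencies it would remain at its starting share of 1 in 101.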
The determination of DNA sequence from an ancient human source is uniquely sensitive to contamination simply because every person involved in the study represents a potential source of contaminating DNA. Even ancient pathogenic DNA associated with human skeletons may be analyzed with more straightforward controls on possible contamination (Kolman et al., 1999). Numerous cases in the published literature indicate that researchers have encountered contamination of human remains with modern DNA, although many laboratories are reluctant to report examples of contamination. Recent analysis of DNA extracted from the Neanderthal type specimen (Krings et al., 1997) revealed two distinct sets of mitochondrial D-loop sequences, one significantly different from modern humans and proposed to be Neanderthal in origin and one identical to the human reference sequence (Anderson et al., 1981) and presumed to reflect modern human contamination. A second example of contamination is provided by Kaestle (1997), who identified one sample in a collection of western Nevada skeletons as belonging to New World haplogroup B (described above) although the sample also exhibited a second diagnostic site for haplogroup C. Conscientiousness and complete disclosure of results make it possible to assess the types and extent of contamination that may be present in the majority of aDNA studies. Reluctance to report evidence of contamination and/or the use of research strategies that are unlikely to detect contamination, e.g. partial typing of samples, should not be interpreted as absence of contamination or as proof of authenticity of the data.
Richards et al. (1995) reported that approximately 50% of nonhuman bones excavated from a site in England exhibited contamination with human DNA sequences. Similar contamination should be assumed for all human bones and measures to identify contaminants should be integrated into the research design of all human aDNA studies. Furthermore, aDNA investigators should be aware of their own genetic haplotype at the markers being studied and constantly screen out any identical aDNA haplotypes as potential contaminants. Again, examples exist in the literature of researchers identifying themselves as sources of contamination in aDNA studies (Stone and Stoneking 1998; Kolman and Tuross 2000). In short, careful selection of polymorphic markers capable of discriminating between ancient DNA and probable modern DNA contaminants is critical. Research strategies must be designed with a goal of identifying all DNA contaminants in order to differentiate convincingly between contamination and endogenous DNA.
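The screening step recommended above, comparing each ancient result against the haplotypes of everyone who handled the specimen, can be sketched as follows. This is a minimal illustration: the function name, the personnel labels, and the variant labels are hypothetical, and a haplotype is represented simply as the set of sites at which it differs from the reference sequence.

```python
from typing import Dict, FrozenSet, List


def matching_personnel(
    sample_haplotype: FrozenSet[str],
    personnel_haplotypes: Dict[str, FrozenSet[str]],
) -> List[str]:
    """Return the names of personnel whose haplotype is identical to the sample's.

    An empty set stands for a haplotype identical to the reference sequence.
    """
    return sorted(
        name
        for name, haplotype in personnel_haplotypes.items()
        if haplotype == sample_haplotype
    )


if __name__ == "__main__":
    # Hypothetical laboratory personnel and variant labels, for illustration only.
    personnel = {
        "excavator_1": frozenset({"16223T", "16290T"}),
        "analyst_1": frozenset(),
        "analyst_2": frozenset({"16189C"}),
    }
    ancient_result = frozenset({"16223T", "16290T"})
    hits = matching_personnel(ancient_result, personnel)
    if hits:
        print("Result matches:", ", ".join(hits), "(treat as potential contamination)")
    else:
        print("No match to typed personnel (this alone does not prove authenticity)")
```

A match flags a result as a potential contaminant but does not prove contamination, and the absence of a match does not authenticate a result, since contaminating DNA can come from people who were never typed.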
Many laboratories routinely include positive PCR controls to evaluate the effectiveness of the amplification reaction. Although this appears to be an obvious control given the high PCR failure rate of many aDNA samples, use of modern, undamaged DNA as a positive control represents the conscious introduction of a potential DNA contaminant. In the event that identical haplotypes are determined for both the ancient specimen and the control DNA sample, it becomes impossible to prove that the data on the ancient sample do not reflect contamination by the control DNA.
If even a single modern DNA sample used as a positive PCR control is to be avoided in aDNA studies, it follows that aDNA studies should not be conducted in laboratories where studies on genetically similar, contemporary populations are ongoing. Studies on contemporary populations typically involve the analysis of hundreds or thousands of modern DNA samples. The standard solution is to physically separate the rooms in which experiments on ancient and modern DNA samples are conducted and incorporate the use of air locks, "sticky" floor mats, dedicated lab ware, etc. However, locating an aDNA laboratory outside of the main laboratory that is still utilized by the same researchers is unlikely to eliminate contamination since DNA can adhere to clothing worn by the researcher. Previous work performed by our group on natural history specimens of fish provides an example of the pervasiveness of contamination: one year after moving all positive control goldfish DNA to another floor and wing of the building, goldfish contamination was still being detected in ancient fish DNA PCRs performed in the aDNA laboratory. The bottom line is that, despite a decade of aDNA research, contamination by modern DNA remains a significant problem because the many sources and modes of contamination are still not known or understood and, therefore, cannot be controlled or eliminated.
It must be understood from the outset that bone would have to be destroyed in order to proceed with any DNA analysis. The amount of organic matter remaining in the skeleton is quite low based on the available information supplied by the Department of the Interior pursuant to 14C (radiocarbon) dating (memorandum from the Departmental Consulting Archaeologist to the Assistant Secretary, Fish and Wildlife and Parks, and pers. comm., F. McManamon). In one case (Beta Analytic), the amount of organic material produced from the bone was approximately 1.6% of the theoretical yield of modern bone, 200 milligrams protein/gram of bone (Herring 1972). A second laboratory at the University of Arizona (UA AMS Facility) has also reported extremely low yields of carbon from the skeleton in question. Finally, it is not clear, based on information provided by the third radiocarbon laboratory (University of California at Riverside), whether any of the major protein, collagen, still remains in the bone. Amino acid analyses of two bone samples (CENWW.97.L.20b/DOI 2b and CENWW.97.R.24 (Mta)/DOI 1b) reported a "non-collagen amino acid composition." These preliminary results from three separate laboratories are consistent with extensive degradation of the organic matrix in this human skeleton. Furthermore, these data differ significantly from those reported in the widely circulated letter (Taylor et al., 1998) on the radiocarbon dating of this skeleton in which a "collagen-like pattern similar to that which is typically obtained from a modern bone" was found.
The radiocarbon dates from five separate analyses (memorandum from the Departmental Consulting Archaeologist to the Assistant Secretary, Fish and Wildlife and Parks) with their associated delta13C values (pers. comm., F. McManamon) are shown in Table 3. The delta13C values from the materials that were submitted for radiocarbon dating from this one skeleton ranged from -10.3‰ to -21.9‰. The three analyses that produced radiocarbon dates from the right first metatarsal and the fifth left metacarpal in excess of 8,000 BP (Beta-133993, UCR-3807/CAMS-60684 and UCR-3476/CAMS-29578) have delta13C values that range from -10.8‰ to -14.9‰. The variability in the delta13C of the bone extracts suggests that heterogeneous sources of carbon were used in obtaining these radiocarbon dates. Deviations in the biological fidelity of the delta13C and delta15N obtained from bone collagen have been documented when the amount of carbon and nitrogen differs from that found in the protein, collagen (DeNiro 1985).
In sum, these most recent radiocarbon reports do not bode well for DNA testing of the skeleton from Kennewick, WA. The accumulating information regarding the organic preservation of the skeleton suggests the bone has very little, if any, of its original protein remaining, and by inference, one would assume very little, if any, DNA remains in a form adequate for genetic analysis. This assessment must involve some speculation because the professional literature is largely silent on the issue. However, a general consideration of organic preservation in the skeleton is a necessary part of planning any proposed genetic analysis.
The existence of a small amount of bone remaining from the original radiocarbon date (UCR-3476/CAMS-29578) of 8410±60 BP was communicated to us by fax from F. McManamon on January 19, 2000 as follows:
In 1996, a portion (reported by UC-Davis to be 1.5g) of metacarpal bone containing "a collagen-like pattern similar to that which is typical of modern bone" (Taylor et al., 1998, Letter to Science) was submitted to the University of California at Davis. Before the DNA analysis of this bone could be completed, the U.S. Army Corps of Engineers ordered the testing halted and the Kennewick human remains and residues returned to their possession. Presently, the bone material the University of California at Davis began to subject to DNA testing is sealed and in the possession of the U.S. Army Corps of Engineers.
The difference between the apparent organic content of this bone sample (Taylor et al., 1998) and that which was reported in two other skeletal elements from the skeleton in question is both striking and puzzling. We cannot recommend that the remaining bone from the fifth left metacarpal described above serve as the sole source of putative genetic information on this skeleton for the following reasons: first, from the evidence at hand, this bone piece is substantially chemically different from the rest of the skeleton, and, second, the chain of custody, handling and storage of the fifth left metacarpal has a dramatically different history over the past four years from the rest of the skeleton.
In their report to the National Park Service, Drs. Joseph F. Powell and Jerome C. Rose suggested a number of teeth including "the left mandibular first molar, left premolars, right second molar and the premolars" but particularly "the maxillary canines" might be excellent sources of DNA [Report on the Osteological Assessment of the "Kennewick Man" Skeleton (CENWW.97.Kennewick)]. The instincts and observations of these two physical anthropologists should not be ignored, but would need to be coupled with a determination of organic content in these tissues before any DNA extractions were attempted. In general, the smaller quantity of material available from teeth relative to bone renders teeth less useful for DNA analyses.
DNA analyses must destroy calcified tissue in order to remove any DNA trapped in the mineral matrix. Upon decalcification, DNA is released into solution, and is purified from this solution for further testing. The amount of bone that is processed for DNA analyses varies, and the amount of starting material generally relates inversely to the amount of total organic matter remaining in the bone: the lower the amount of original organic matter, the greater the amount of bone that has to be used. The low amounts of protein that seem to be preserved in this skeleton would lead many analysts to request samples of bone on the order of 15-30 grams. The amount of bone requested is based on the assumption that, if DNA still exists in the mineral matrix, many of the molecules will be damaged beyond use for the required testing, and, thus, a larger sample will give the analyst a greater statistical probability of isolating undamaged DNA templates.
The commonly accepted practice for removing DNA from skeletal remains involves dissolving the bone in a calcium-chelating agent. This gentle decalcifying agent will leave any collagen that does exist in the bone in a form that can be used for radiocarbon dating (Bocherens et al., 1995; Tuross et al., 1994). Unfortunately, due to the damage caused by halide acids in the form of depurination, the soluble preparations from the previously obtained radiocarbon dates will not be useful for genetic analyses.
Should DNA analysis of this ancient skeleton be attempted, an important criterion in designing a research plan for the molecular analysis is to ensure that the resultant data are not due to contaminating, exogenous DNA. The research plan must be designed to be capable of discriminating between endogenous, ancient DNA and exogenous, contaminating DNA. This is accomplished by assaying markers that differentiate between the endogenous DNA and all potential sources of contaminating DNA. With the stated caveat that it may be difficult to ensure a distinction between endogenous DNA and all sources of contamination, the minimal number of markers that should be assayed for a complete genetic characterization of the skeleton in question are those listed in Table 1. In terms of the specifics of the analysis, a minimum of six PCRs would be required to assay these markers one time. Four independent amplification reactions would be required to assay the three RFLPs and 9bp deletion. Two PCRs are advisable for the control region so that it could be amplified in segments no larger than 150-200 bps, a necessary precaution when dealing with damaged, fragmented ancient DNA. These markers should be assayed at least two times, starting each time from a fresh amplification reaction. The ideal situation would be to generate two DNA extracts from different tissue samples. No positive PCR controls using modern human DNA should ever be performed. All primer testing and reaction optimization should be performed in an independent, geographically separate laboratory. If there is evidence of contamination, e.g. conflicting results from analysis of two sets of PCRs, the extracted DNA should be cloned into a plasmid vector and multiple clones should be sequenced from each amplification reaction. Ten clones per amplification reaction would be sufficient to identify the contamination and, perhaps, to determine if endogenous DNA could be differentiated from contaminating DNA.
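The reaction counts in the plan above can be tallied with a short calculation. The sketch below uses only the figures stated in the text (three RFLPs, one 9bp deletion, two control-region segments, at least two replicate passes, ten clones per reaction). The class and method names are hypothetical, and multiplying the count across extracts is our assumption about how the replicates would be organized, not something the plan specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayPlan:
    rflp_markers: int = 3             # one PCR per RFLP
    deletion_markers: int = 1         # the 9bp deletion
    control_region_segments: int = 2  # segments of no more than 150-200 bp
    replicate_passes: int = 2         # each pass starts from a fresh amplification
    extracts: int = 1                 # assumption: every extract gets all passes
    clones_per_reaction: int = 10     # only needed if contamination is suspected

    def pcrs_per_pass(self) -> int:
        """PCRs needed to assay every marker once."""
        return (self.rflp_markers
                + self.deletion_markers
                + self.control_region_segments)

    def total_pcrs(self) -> int:
        """PCRs across all replicate passes and extracts."""
        return self.pcrs_per_pass() * self.replicate_passes * self.extracts

    def clones_if_contaminated(self) -> int:
        """Clones to sequence if every amplification reaction must be cloned."""
        return self.total_pcrs() * self.clones_per_reaction


minimal = AssayPlan()
two_extracts = AssayPlan(extracts=2)
print(minimal.pcrs_per_pass(), minimal.total_pcrs(), two_extracts.total_pcrs())
```

Under these assumptions the minimal plan needs six PCRs per pass and twelve in total, and the count doubles again if two extracts are each assayed twice. In practice only reactions showing conflicting results might be cloned, so the clone count here is an upper bound.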
The most important component of the research plan requires that the complete analysis be conducted in two independent laboratories. Neither laboratory should be involved in the analysis of contemporary human populations because the presence of overwhelming amounts of undamaged, potentially contaminating DNA would immediately compromise the results of any analysis. It is difficult to find laboratories that are experienced with the analysis of aDNA, but do not conduct analyses of contemporary populations since, from a scientific perspective, similar questions are addressed with both types of analyses. However, it is essential for the integrity and defensibility of the final results that all possibility of contamination with modern sources of DNA, with the exception of the investigators themselves, be eliminated.
Once data are generated on all of the assayed markers, a haplotype can be constructed that joins all of the polymorphisms. If there is no evidence of contamination, only a single result will have been noted at each marker and only a single haplotype construction will be possible. If contamination has been detected and multiple haplotypes can be constructed, a thorough analysis must be performed in order to determine which haplotype, if any, corresponds to endogenous DNA. Once a single, endogenous haplotype has been determined, its affiliation with published haplotypes will be determined through a hierarchical analysis. Phylogenetic analysis of the haplotype with other American Indian haplotypes can be performed but will very likely be unsuccessful in identifying an affiliation of this individual with a particular American Indian tribe. If the haplotype does not belong to haplogroup A, B, C, or D, the skeleton may be non-American Indian or may represent an American Indian haplotype that has become extinct in modern Native American populations. A phylogenetic analysis of the ancient haplotype against contemporary populations distributed worldwide must be performed in order to attempt a general classification of the skeletal haplotype. However, this analysis likely will not be able to distinguish between the two possibilities listed above, i.e. non-American Indian ancestry vs. American Indian ancestry with an extinct haplotype. In the case where contamination has been detected during the analysis and a single, endogenous haplotype cannot be determined, then the analysis is inconclusive and no assignment to a haplogroup can be made. In all circumstances, the final results and conclusions must agree between the two laboratories in which the analyses were performed, and a genial commitment to work toward an accurate and complete genetic analysis of the skeleton is as important as the independence of the two laboratories.
If the two laboratories determine different haplotypes and the differences cannot be reconciled, the analysis is, again, contradictory and no conclusions can be drawn.
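The interpretive steps described above can be summarized as a small decision procedure. This is a hedged sketch rather than a protocol: the function name, the use of `None` to stand for "no single endogenous haplotype could be determined", and the label `"other"` for a haplotype outside haplogroups A through D are all conventions invented here for illustration.

```python
from typing import Optional

# The four haplogroups named in the text as indicative of American Indian ancestry.
AMERICAN_INDIAN_HAPLOGROUPS = {"A", "B", "C", "D"}


def interpret(lab_1: Optional[str], lab_2: Optional[str]) -> str:
    """Combine the haplogroup calls of two independent laboratories.

    Each argument is the haplogroup of the single endogenous haplotype a
    laboratory settled on, or None if it could not settle on one.
    """
    if lab_1 is None or lab_2 is None:
        return "inconclusive: no single endogenous haplotype determined"
    if lab_1 != lab_2:
        return "inconclusive: laboratories disagree"
    if lab_1 in AMERICAN_INDIAN_HAPLOGROUPS:
        return "likely American Indian; tribal affiliation not resolved"
    return "ambiguous: non-American Indian or extinct American Indian haplotype"


print(interpret("B", "B"))
print(interpret("B", "C"))
print(interpret("other", "other"))
print(interpret(None, "A"))
```

Note that no branch returns a tribal assignment, which reflects the point made throughout this discussion: even the most favorable outcome identifies a broad haplogroup and nothing finer.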
The larger question is what would be done with any genetic
typing (or lack thereof) of this skeleton. If haplogroup A, B, C or
D is found, and a likely determination of American Indian biological
affiliation is made, will this set the standard for all future new finds
of human skeletal remains? Will this type of analysis never have to
be done again, and will all skeletons that predate the arrival of Europeans
to the Americas be assumed to be ancestral to American Indians? If the
results are ambiguous or if no DNA remains in the skeleton, how will
this be interpreted, and what will be the ramifications? It is our considered
opinion that, for all the parties concerned, the genetic analysis of
this skeleton may not yield the resolution that is so dearly sought.
1. Mitochondria are cellular components that contain a separate and distinct type of DNA compared with the nucleus. Mitochondrial DNA (mtDNA) is thought to be exclusively maternally inherited and clonal in nature.