Researcher concerns about the digitized records available on the web sites of NARA’s digitization partners
Recently on NARAtions we heard from researchers who expressed concerns about the digitized records available on the web sites of NARA’s digitization partners. We shared these concerns with NARA’s Access Programs office, NARA’s Digital Strategies and Services Staff, and our digitization partners, and we would like to respond to them here on NARAtions.
Since 2007 NARA has entered into three major digitization partnerships: with Ancestry.com, Footnote.com, and the Genealogical Society of Utah (Family Search). Under these agreements, the partners have created approximately 60 million digital copies of NARA records. An additional 70 million images of NARA records on Ancestry.com were created by Ancestry before it entered into a digitization agreement with NARA. Ancestry.com digitized, indexed, and placed these images online using NARA microfilm publications that are available to anyone by purchase from NARA. This was strictly the work of Ancestry, with no involvement, oversight, or quality assurance work by NARA.
NARA takes the concerns raised by researchers seriously. We are working with our partners to improve their digital products, including those produced before the partnership agreement, as problems come to our attention. Our partners want to rectify errors and are cooperating in doing so, though some of the issues are difficult to resolve in a seamless and timely manner. In these difficult cases, the partners post advisory notices to alert users to the anomalies. In two cases, Ancestry has rectified issues affecting the browse structure at NARA’s request, prompted by researcher input. In the first, the ship name portion of the browse structure for New York passenger arrivals on November 13, 1893, did not include the ship Etruria. That has been corrected. The second involves a misspelled township in the 1920 Pennsylvania census: Throop township in Lackawanna County was spelled “Throap,” so the search function did not work for this township. The search function has been corrected. The browse list of townships still contains the misspelling, and we have asked Ancestry to correct it. The misspelling does not affect the ability to browse the township, but we all want it corrected nonetheless.
Both the partners and NARA are involved in several aspects of quality assurance work for projects under the digitization partnerships. The quality assurance work relates to the images, metadata, content completeness, and final transfer to NARA. There are four main areas of quality control (QC):
- QC of imaging is the responsibility of the partner, following standards reported to, and approved by, NARA. The precise standards are proprietary information.
- QC of metadata is the responsibility of the partner, following standards reported to, and approved by, NARA. The precise standards are proprietary information.
- QC of content is the responsibility of NARA. Specifically, NARA conducts a page-by-page review against a five percent sample of the original records to identify information that might have been left out, such as the back of a document that has only a stamp or small notation. All such information has to be captured. (A larger sample is reviewed if quality concerns surface during the review.) The partner corrects any omissions found in the review. Skipped pages are imaged and inserted into the images folder at the correct location.
- QC relating to transfer of digital materials to NARA – The partners send the digital materials to NARA on hard drives. NARA staff checks a sample of the images and metadata to verify that the metadata on each hard drive is associated with the correct image and that the metadata the partner agreed to provide is delivered. The staff also checks a sample of the unique identifiers associated with each image to verify that the identifiers are correct. If there are problems with the metadata or images sent by the partner, NARA contacts the partner to resolve the problems.
At the March meeting of the Researcher Users Group at the National Archives in the Washington, DC, area, some researchers requested the ability to report partner website problems directly to NARA, so that we can be involved in trying to resolve them. We agreed and indicated that users may report specific matters to us, though this is not a requirement. Anyone choosing to report to us may write to firstname.lastname@example.org. Problem reporting is optional, and we invite you to use it when you come across specific problems. It is not a substitute for quality control, but rather an additional avenue for improving quality. With approximately 130,000,000 digitized images of NARA records on partner websites, some errors are inevitable. We are happy that some researchers want to help everyone by letting us know of problems, and we look forward to helping resolve them.
NARA recently posted a list of the digitized records available on the partners’ web sites at the request of researchers. The “status” column on the list has been deleted in response to a concern that we might be giving a false impression that some digitized publications contain all of the records in a publication. We certainly did not intend to do that. We had included a “status” column to indicate the status of the digitization work because some publications on Footnote were still being digitized even though images were available online. The status of “complete” for publications was simply meant to indicate that the partner had completed its digitization work on the publication.
We agree that the statement in the lengthy Family Search press release from 2007 about access to records held by NARA may not have been as clear as possible. Records held by NARA that are not subject to legal or preservation restrictions are available to anyone. NARA recognizes, however, that not everyone can visit one of our 30 NARA research rooms in person, and that a researcher at any one of these research rooms is limited to the records available at that facility. Providing online access to digitized copies of records gives those who cannot readily visit NARA the ability to research our holdings. NARA’s own press release related to its partnership with Family Search states that “Digitization makes possible unprecedented access to the unique historic documents in the custody of the National Archives… These records [Civil War and later pension files], of great interest to genealogists and others, are currently available only at the National Archives Building in Washington, DC.”
Thank you to all of the researchers who have taken the time to comment about our digitization partnerships.