A multi-part series on the fundamentals eDiscovery practitioners need to know about the processing of electronically stored information
In “Why Understanding Processing is Important,” we discussed understanding processing in the context of lawyers’ duty of technology competence. In “Key Activities and Common Tools,” we discussed the core processing activities and some of the tools used to complete them. In “Common Exceptions and Special Cases,” we discussed scenarios requiring extra work and decisions during core processing activities. In “Objective Culling Options,” we discussed de-NISTing, deduplication, and content filtering. In this final Part, we review final steps and key takeaways.
In addition to the core activities of expansion, extraction, normalization, indexing, and objective culling that we have already discussed, there can be a variety of additional steps required during processing to prepare the materials for subsequent early case assessment, review, and production activities.
Which steps are needed depends on the platform in which the materials will be used and on how they will be used there. For example, we noted in the last Part that it is not uncommon to create and populate a custom master date field that integrates values from the different date/time fields associated with different file types. It is also common to create other custom metadata fields, such as a field that extracts the domain names associated with email addresses, or a field that documents collection source details such as custodian or directory. The specific fields to be created will depend on the materials with which you will be working and what you hope to accomplish with them during ECA and review.
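To make the idea of custom metadata fields concrete, here is a minimal sketch in Python of how a master date field and an email domain field might be derived. The field names (`date_sent`, `from`, and so on), the priority order, and the ISO date format are all assumptions made for illustration; each processing platform has its own schema and its own way of defining such fields.

```python
from datetime import datetime
from email.utils import parseaddr
from typing import List, Optional

# Hypothetical field names and priority order: emails usually have a sent
# date, while loose files may only have file system dates.
DATE_FIELD_PRIORITY = ["date_sent", "date_received", "date_last_modified", "date_created"]
ADDRESS_FIELDS = ["from", "to", "cc", "bcc"]


def master_date(doc: dict) -> Optional[datetime]:
    """Return the first populated date field, checked in priority order."""
    for field in DATE_FIELD_PRIORITY:
        value = doc.get(field)
        if value:
            # Assumes dates were normalized to ISO 8601 strings earlier.
            return datetime.fromisoformat(value)
    return None


def email_domains(doc: dict) -> List[str]:
    """Collect the unique, lower-cased domains found in the address fields."""
    domains = set()
    for field in ADDRESS_FIELDS:
        for raw in doc.get(field, []):
            # parseaddr handles both "name <user@host>" and bare addresses.
            _, address = parseaddr(raw)
            _, separator, domain = address.rpartition("@")
            if separator and domain:
                domains.add(domain.lower())
    return sorted(domains)


email = {
    "date_sent": "2020-03-01T09:30:00",
    "date_created": "2020-02-01T00:00:00",
    "from": ["Jane Doe <jane@Example.com>"],
    "to": ["bob@example.org", "amy@example.com"],
}
print(master_date(email))
print(email_domains(email))
```

A fixed priority list is only one possible rule for a master date. Some teams prefer different rules for different file types, so the order above should be treated as a placeholder for whatever the project actually calls for.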
In addition to custom metadata fields, final preparation activities may also include the preemptive generation of TIFF images of the documents (i.e., PDF-style page images), if there is a desire to review documents in that form (or a need to have them ready for rapid production turnaround later). And, if the subsequent activities are taking place in a different software platform than the processing (which is often the case), some form of load file will also need to be prepared.
Load files are, essentially, enormous tracking spreadsheets that can contain every document, its extracted metadata (and any custom fields), its extracted text content, links to associated native files, links to standalone text files, links to associated TIFF images, and other details. They serve as Rosetta Stones for the ECA and/or review software to understand how all the thousands upon thousands of discrete files and pieces of information you’re loading into it for a given project fit together in a usable way.
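As a rough illustration of that "tracking spreadsheet" idea, the sketch below writes and reads a simple delimited load file using Python's standard `csv` module. The column names are hypothetical, and the delimiter characters are modeled on a convention often seen in DAT-style load files; that choice is an assumption here, and the format your ECA or review platform expects should be confirmed before anything is built.

```python
import csv
import io

# Assumed delimiters, modeled on a common DAT-style convention.
FIELD_DELIMITER = "\x14"        # separates fields
TEXT_QUALIFIER = "\u00fe"       # the thorn character wraps every value
NEWLINE_SUBSTITUTE = "\u00ae"   # stands in for line breaks inside a value

# Hypothetical columns: one row per document, with metadata and file links.
COLUMNS = ["DocID", "Custodian", "Subject", "NativePath", "TextPath", "ImagePath"]


def write_load_file(records, stream):
    """Write one header row, then one delimited row per document record."""
    writer = csv.writer(
        stream,
        delimiter=FIELD_DELIMITER,
        quotechar=TEXT_QUALIFIER,
        quoting=csv.QUOTE_ALL,
        lineterminator="\n",
    )
    writer.writerow(COLUMNS)
    for record in records:
        row = []
        for column in COLUMNS:
            value = record.get(column)
            text = "" if value is None else str(value)
            # Keep each document on one physical line of the load file.
            text = text.replace("\r\n", NEWLINE_SUBSTITUTE).replace("\n", NEWLINE_SUBSTITUTE)
            row.append(text)
        writer.writerow(row)


def read_load_file(stream):
    """Read the load file back into a list of dictionaries keyed by column."""
    reader = csv.DictReader(stream, delimiter=FIELD_DELIMITER, quotechar=TEXT_QUALIFIER)
    return list(reader)


records = [
    {
        "DocID": "DOC0001",
        "Custodian": "Smith, Jane",
        "Subject": "Line one\nLine two",
        "NativePath": "natives/DOC0001.msg",
        "TextPath": "text/DOC0001.txt",
        "ImagePath": "images/DOC0001.tif",
    }
]
buffer = io.StringIO()
write_load_file(records, buffer)
buffer.seek(0)
rows = read_load_file(buffer)
print(rows[0]["DocID"], rows[0]["NativePath"])
```

Unusual delimiter characters are used because ordinary commas and quotation marks appear constantly in document metadata, as in the custodian name above.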
Regardless of the specific steps taken in a given processing project, all processing efforts generally end with some form of quality control validation process prior to the hand-off to ECA and review activities. As we’ve described above, the end product of a processing effort is a complex assemblage of elements that may include hundreds of thousands of native files, image files, text files, load files, and a variety of customizations. Given that enormous volume, diversity, and complexity, a wide range of simple technical issues are possible, including file naming errors, load file field errors, file linking errors, imaging errors, and more.
To identify such issues prior to loading for subsequent activities, processors typically employ some combination of targeted quality control checks for specific issues, random sampling checks to spot any other issues, and software validation tools to backstop the human checks. Once any issues have been identified and remediated, materials can be handed off for ECA and review to begin.
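The targeted and random checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows are dictionaries read from a load file, the column names (`DocID`, `NativePath`, `TextPath`, `ImagePath`) and the document ID pattern are hypothetical, and real validation tools check far more than this.

```python
import random
import re
from pathlib import Path

# Hypothetical naming rule: an upper-case prefix followed by at least four digits.
DOC_ID_PATTERN = re.compile(r"^[A-Z]+\d{4,}$")
LINK_FIELDS = ("NativePath", "TextPath", "ImagePath")


def find_issues(rows, root):
    """Targeted checks for naming errors, duplicate IDs, and broken file links."""
    issues = []
    seen = set()
    for row in rows:
        doc_id = row.get("DocID", "")
        if not DOC_ID_PATTERN.match(doc_id):
            issues.append((doc_id, "malformed DocID"))
        if doc_id in seen:
            issues.append((doc_id, "duplicate DocID"))
        seen.add(doc_id)
        for field in LINK_FIELDS:
            link = row.get(field, "")
            # An empty link is allowed (not every document has an image);
            # a populated link must point to a file that actually exists.
            if link and not (Path(root) / link).is_file():
                issues.append((doc_id, f"{field} points to a missing file: {link}"))
    return issues


def sample_for_review(rows, sample_size, seed=None):
    """Draw a random sample of rows for manual spot checks."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return rng.sample(rows, min(sample_size, len(rows)))
```

In use, `find_issues(rows, "path/to/export")` would return a list of (document ID, problem) pairs to remediate, and `sample_for_review(rows, 50, seed=1)` would pick fifty documents for a human reviewer to open and compare against the load file.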
For Assistance or More Information
Xact Data Discovery (XDD) is a leading international provider of eDiscovery, data management and managed review services for law firms and corporations. XDD helps clients optimize their eDiscovery matters by orchestrating precision communication between people, processes, technology and data. XDD services include forensics, eDiscovery processing, Relativity hosting and managed review.
XDD offers exceptional customer service with a commitment to responsive, transparent and timely communication to ensure clients remain informed throughout the entire discovery life cycle. At XDD, communication is everything – because you need to know. Engage with XDD, we’re ready to listen.