A multi-part series on the fundamentals eDiscovery practitioners need to know about the preparation and production of ESI
In “The Final Countdown,” we discussed the importance of production and the primary production formats. In this Part, we review other important production format considerations.
Beyond deciding on your optimal combination of paper, near-paper, native, and near-native production options, there is a range of more detailed options to consider. Among the most important are options related to load files, metadata, redactions, numbering and endorsements, and paper integration.
As we discussed, many productions are accompanied by a load file that contains information about the various documents and images being produced and that makes it possible for those materials to be imported together into a document review platform. These load files can provide links to native or near-native files, to rendered images, and to extracted text files, and they can contain a variety of fields of metadata and extracted data for each document.
Load files are essentially large spreadsheets, though their specific formatting requirements and field delimiters vary somewhat from system to system. For example, Summation originated the “DII” load file format and Ipro originated the “LFP” load file format. For this reason, it is important for the parties to agree on which load file format(s) will be required.
Decisions will also need to be made (or negotiated) regarding what fields the load file should include, how they should be labeled, and what custom fields – if any – should be created. For example, a field might be included documenting the request number(s) in response to which each document is being produced, or a field might indicate the documents to which a protective order applies.
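To make the idea of a load file as a delimited spreadsheet with negotiated fields more concrete, here is a minimal Python sketch. The field names, the pipe delimiter, and the Bates values are all hypothetical placeholders chosen for illustration; they are not drawn from the DII or LFP formats or from any particular review platform, and real values would come from the parties' agreement.

```python
import csv
import io

# Hypothetical field list: four common link/identifier fields plus two custom
# fields (request numbers and protective order status) like those described above.
FIELDS = ["BegBates", "EndBates", "NativePath", "TextPath", "RequestNos", "ProtectiveOrder"]

# Hypothetical delimiter; the actual delimiter depends on the agreed load file format.
DELIMITER = "|"


def write_load_file(records, delimiter=DELIMITER):
    """Serialize one row per document into delimited load-file text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


def read_load_file(text, delimiter=DELIMITER):
    """Parse delimited load-file text back into one dict per document."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


records = [
    {"BegBates": "ABC0000001", "EndBates": "ABC0000003", "NativePath": "",
     "TextPath": "TEXT/ABC0000001.txt", "RequestNos": "4;7", "ProtectiveOrder": "Y"},
    {"BegBates": "ABC0000004", "EndBates": "ABC0000004", "NativePath": "NATIVES/ABC0000004.xlsx",
     "TextPath": "TEXT/ABC0000004.txt", "RequestNos": "12", "ProtectiveOrder": "N"},
]

load_file_text = write_load_file(records)
round_tripped = read_load_file(load_file_text)
```

Because the producing and receiving sides must use the same delimiter and field names for the round trip to work, the sketch also shows why agreeing on these details in advance matters.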
As we’ve discussed, metadata has tremendous value, both as potential evidence (e.g., revealing when and by whom something was modified) and as the basis of many filtering, sorting, and searching options within document review tools. Thus, deciding (or negotiating) what metadata fields (and other extracted data fields), if any, will be included in your production (when producing non-natively) will have both evidentiary and usability impacts.
The EDRM organization’s model XML load file includes the following standard metadata and extracted data fields:
In addition to deciding what metadata and other extracted data gets included in your production, decisions may also need to be made (or negotiated) about what names will be used for those fields and what formats will be used for the values in them. Custom fields too may need to be discussed. For example, should there be a master date field? If so, what hierarchy of other date fields should be used to generate that value? In what time zone should all dates and times be normalized?
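As one illustration of how a master date field might be generated, the Python sketch below walks a hierarchy of date fields and normalizes the first available value to a single time zone. The field names, the order of the hierarchy, and the choice of UTC are assumptions made for this example only; the actual hierarchy and time zone would be whatever the parties decide or negotiate.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical hierarchy: the first populated field wins.
DATE_HIERARCHY = ["DateSent", "DateReceived", "DateLastModified", "DateCreated"]


def master_date(doc, hierarchy=DATE_HIERARCHY, target_tz=timezone.utc):
    """Return the first available date in the hierarchy, normalized to target_tz."""
    for field in hierarchy:
        value = doc.get(field)
        if value is None:
            continue
        if value.tzinfo is None:
            # A date with no known offset cannot be normalized reliably.
            raise ValueError(f"{field} has no time zone information")
        return value.astimezone(target_tz)
    return None  # no usable date field for this document


eastern = timezone(timedelta(hours=-5))
pacific = timezone(timedelta(hours=-8))

email = {
    "DateSent": datetime(2020, 3, 1, 9, 30, tzinfo=eastern),
    "DateCreated": datetime(2020, 2, 28, 16, 0, tzinfo=eastern),
}
spreadsheet = {
    "DateLastModified": datetime(2019, 12, 31, 23, 0, tzinfo=pacific),
}

email_master = master_date(email)              # 2020-03-01 14:30 UTC
spreadsheet_master = master_date(spreadsheet)  # 2020-01-01 07:00 UTC
```

Note that normalizing the spreadsheet's date moves it into the next calendar day (and year), which is one reason the time zone decision can affect date-based filtering and sorting.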
As we discussed, primary production format affects your ability to perform redactions within documents. Generally speaking, native and near-native files cannot be effectively redacted, while near-paper and paper productions can. The availability of effective redactions is one of the reasons for the continued popularity of near-paper, image-based productions.
When preparing a production that will involve redactions, you will need to consider how redactions should appear on the page, including whether redaction type (privilege, PII, etc.) affects appearance or requires a label. Additionally, if extracted document text is being provided (to facilitate searching), the extracted text for documents bearing redactions will have to be either excluded or replaced. It can be replaced by performing optical character recognition (OCR) on the page images rendered for the document with the redactions applied.
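The text-handling rule for redacted documents can be sketched as a simple selection step. This is an illustrative Python example under stated assumptions: the record keys (`has_redactions`, `extracted_text`, `redacted_ocr_text`) are hypothetical, and the OCR itself is assumed to have been run elsewhere on the page images with redactions applied.

```python
def production_text(doc):
    """Choose the searchable text to produce for a document.

    Unredacted documents keep their original extracted text. Redacted
    documents never do: they get OCR text taken from the redacted page
    images if it exists, and otherwise no text at all (None).
    """
    if not doc["has_redactions"]:
        return doc["extracted_text"]
    # Replace with OCR of the redacted images, or exclude if none is available.
    return doc.get("redacted_ocr_text")


clean_doc = {"has_redactions": False, "extracted_text": "Quarterly results attached."}
redacted_with_ocr = {
    "has_redactions": True,
    "extracted_text": "Client SSN is 000-00-0000.",
    "redacted_ocr_text": "Client SSN is [REDACTED].",
}
redacted_without_ocr = {"has_redactions": True, "extracted_text": "Privileged advice."}

clean_text = production_text(clean_doc)
replaced_text = production_text(redacted_with_ocr)
excluded_text = production_text(redacted_without_ocr)
```

The key design point is that the original extracted text of a redacted document is never returned, since it would still contain the content hidden on the page images.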
Also as we discussed, primary production format affects your options with regard to numbering and endorsements. Paper and near-paper productions allow for per-page Bates numbering, while native and near-native formats generally only allow for per-file numbering to be applied. Combination approaches require coordinating per-page numbering for some documents with per-file numbering for others.
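The coordination required by a combination approach can be illustrated with a short Python sketch that assigns numbers from a single sequence, one per page for imaged documents and one per file for native documents. The prefix, number width, and record keys are hypothetical choices for this example, not a required numbering convention.

```python
def assign_bates(docs, prefix="ABC", width=7, start=1):
    """Assign beginning and ending Bates numbers from one shared sequence."""
    next_number = start
    assigned = []
    for doc in docs:
        # Imaged documents consume one number per page;
        # native files consume a single number for the whole file.
        count = doc["pages"] if doc["format"] == "image" else 1
        first, last = next_number, next_number + count - 1
        assigned.append({
            "id": doc["id"],
            "BegBates": f"{prefix}{first:0{width}d}",
            "EndBates": f"{prefix}{last:0{width}d}",
        })
        next_number = last + 1
    return assigned


docs = [
    {"id": "memo", "format": "image", "pages": 3},
    {"id": "spreadsheet", "format": "native", "pages": 0},
    {"id": "letter", "format": "image", "pages": 2},
]

numbered = assign_bates(docs)
# memo:        ABC0000001 to ABC0000003 (per-page)
# spreadsheet: ABC0000004 to ABC0000004 (per-file)
# letter:      ABC0000005 to ABC0000006 (per-page)
```

Drawing both kinds of numbers from one sequence keeps the production free of gaps and overlaps, whichever format a given document is produced in.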
Other endorsements, such as confidentiality warnings or protective order language, work the same way. Paper and near-paper productions can have consistent endorsements applied in the headers and/or footers of each page, while native and near-native productions cannot. For native and near-native productions (and, often, for near-paper productions too), a custom load file field may be created that documents confidential status, protective order applicability, and other endorsement content for each document.
Another element to consider is how you will handle natively paper materials collected during discovery along with all of your ESI. Rather than producing such materials in paper format, you have the option of incorporating them into your electronic production. This can be accomplished by scanning the documents into page images, performing OCR to extract the available text for searching, and manually entering relevant “metadata” values (e.g., bibliographic and source information). Depending on their volume, you may have already taken these steps prior to review of the paper records.
Upcoming in this Series
In the next Part, “Who Gets to Decide Production Options,” we will continue our review of production fundamentals with a discussion of what the Federal Rules of Civil Procedure have to say about production formats.